Model selection using a human hindbrain dataset¶

In this notebook, we use RegVelo’s ModelCaomparison class to select a model setup on the human hindbrain dataset from Braun, E. et al, 2023 that we previously preprocessed. For model evaluation, we use the cross-boundary correctness (CBC) based on known hindbrain trajectories.

A detailed tutorial on how to use RegVelo’s ModelComparison class is provided in the Model Comparison Tutorial.

Library import¶

import numpy as np
import scanpy as sc

import cellrank as cr
import scvi
import scvelo as scv

from regvelo import REGVELOVI
from regvelo import ModelComparison
import regvelo as rgv

General settings¶

scvi.settings.seed = 0

scv.settings.set_figure_params("scvelo", dpi=80, transparent=True, fontsize=14, color_map="viridis")

%matplotlib inline

Load data¶

In the following, we load the preprocessed embryonic hindbrain dataset from the previous tutorial notebook and define the ground truth transitions used for model evaluation.

adata = rgv.datasets.hindbrain(data_type = "preprocessed")

STATE_TRANSITIONS = [
    ("Hindbrain neuroblast", "Pontocerebellar glutamatergic"),
    ("Hindbrain neuroblast", "Dorsal interneurons"),
    ("Hindbrain neuroblast", "Dorsal sensory neuroblasts"),
    ("Hindbrain radial glia", "Visceral motor neuron"),
]

ModelComparison¶

Object initialization¶

We now initialize the ModelComparison object and sepcify the parameter state_transition.

comp = ModelComparison(adata = adata, state_transition=STATE_TRANSITIONS)

Train¶

We set batch_size to np.min(adata.shape[0], 5000) to avoid out of memory issue.

comp.train(model_list=['soft','hard','soft_regularized'],
           lam2=[1.0],
           n_repeat=1,
           batch_size=min(adata.shape[0], 5000))

Epoch 442/1500:  29%|██▉       | 442/1500 [40:39<1:37:18,  5.52s/it, v_num=1]
Monitored metric elbo_validation did not improve in the last 45 records. Best score: -2095.273. Signaling Trainer to stop.
Epoch 848/1500:  57%|█████▋    | 848/1500 [1:17:29<59:35,  5.48s/it, v_num=1]

Monitored metric elbo_validation did not improve in the last 45 records. Best score: -2043.481. Signaling Trainer to stop.
Epoch 685/1500:  46%|████▌     | 685/1500 [1:03:17<1:15:18,  5.54s/it, v_num=1]
Monitored metric elbo_validation did not improve in the last 45 records. Best score: -2030.300. Signaling Trainer to stop.

['soft_0', 'hard_0', 'soft_regularized\nlam2:1.0_0']

Evaluate and save¶

We use CBC to evaluate if the model’s inferred transitions align with the ground truth state changes and specify the parameter side_key, which contains cell-type annotations stored in adata.obs.

comp.evaluate(side_information='CBC',
              side_key='regvelo_annotation')

('df_CBC',
                         Model      Corr Run
 0                        soft  0.695765   0
 1                        hard  0.686511   0
 2  soft_regularized\nlam2:1.0  0.744029   0)

comp.plot_results(side_information='CBC')

../../_images/45322b2ca212572eecec57355d361a27661ac6e0971533bb5acf8616b2125ef1.png

vae_sr = comp.MODEL_TRAINED['soft_regularized\nlam2:1.0_0']
vae_sr.save('vae_sr')

Trajectory and latent time prediction¶

vae_sr = REGVELOVI.load('vae_sr', adata)
rgv.tl.set_output(adata, vae_sr, n_samples=30, batch_size=5000)

INFO     File vae_sr/model.pt already downloaded

scv.tl.velocity_graph(adata)

computing velocity graph (using 1/128 cores)
WARNING: Unable to create progress bar. Consider installing `tqdm` as `pip install tqdm` and `ipywidgets` as `pip install ipywidgets`,
or disable the progress bar using `show_progress_bar=False`.
    finished (0:01:18) --> added 
    'velocity_graph', sparse matrix with cosine correlations (adata.uns)

scv.settings.set_figure_params("scvelo", dpi=80, transparent=True, fontsize=14, color_map="viridis")

scv.pl.velocity_embedding_stream(adata, basis="umap", title="", color = "CellClass")

../../_images/56c4703d44a37c593e9d2ea15971563a1a410e2a4d5f6e041cdfcc8c89458b64.png

adata.obs["latent_time"] = adata.layers["fit_t"].mean(1)
scv.pl.scatter(adata, basis="umap", title="",color = ["latent_time"],legend_loc = "on data")

../../_images/25f54f628f8966b4c73b27060fe9a07c7731d105ff058a3324ceb2754638eb8c.png