Screening regulatory circut via regulation perturbation

One advantage of RegVelo as a mecahnistic model is it ability to simulate targeted perturbations of the gene regulatory network (GRN). By perturbing specific regulatory circuit—or substructure of the global GRN—we can identify critical downstream interactions that shape cell fate decisions.

In this tutorial, we illustrate this approach using the transcription factor elf1 and its regulatory context as an example. Perturbation analysis enables us to uncover the mechanisms underlying fate determination.

To demonstrate the integration of perturbation analysis with regulatory circuits, we use our single-cell dataset of zebrafish neural crest development spanning four developmenta stages [Wang et al., 2024]. This is the same dataset used in the previous tutorial Dynamic analysis in zebrafish neural crest development, where we trained a RegVelo model with a GRN derived from multiome data as prior knowledge. Here, we extend that analysis by focusing on downstream circuit analysis, highlighting how perturbation experiments can reveal key regulators and decision-making modules in the system.

Note

If you want to run this on you own scRNA-seq dataset, you will need to go through Dynamic analysis in zebrafish neural crest development to make sure you that have trained a RegVelo model the dataset.

Key takeaways

  • RegVelo uses the Jacobian matrix framework to infer both global GRNs and cell-specific GRNs [Qiu et al., 2022].

  • By simulating perturbations of regulatory circuits, RegVelo identifies key regulators and evaluates their downstream effects on cell fate decisions.

Library import

import numpy as np
import pandas as pd
import scanpy as sc
import cellrank as cr
import scvi
import scvelo as scv

import regvelo as rgv

import matplotlib.pyplot as plt
import mplscience
import seaborn as sns

General settings

# Set seed
scvi.settings.seed = 0

Load data and preprocess

adata = rgv.datasets.zebrafish_nc()
prior_net = rgv.datasets.zebrafish_grn()

# Preprocessing
sc.pp.neighbors(adata, n_neighbors=30, n_pcs=50)
scv.pp.moments(adata)

adata = rgv.pp.preprocess_data(adata)
adata = rgv.pp.set_prior_grn(adata, prior_net.T)

adata
computing moments based on connectivities
    finished (0:00:00) --> added 
    'Ms' and 'Mu', moments of un/spliced abundances (adata.layers)
computing velocities
    finished (0:00:00) --> added 
    'velocity', velocity vectors for each individual cell (adata.layers)
AnnData object with n_obs × n_vars = 697 × 1008
    obs: 'initial_size_unspliced', 'initial_size_spliced', 'initial_size', 'n_counts', 'cell_type', 'stage'
    var: 'Accession', 'Chromosome', 'End', 'Start', 'Strand', 'gene_count_corr', 'is_tf', 'velocity_gamma', 'velocity_qreg_ratio', 'velocity_r2', 'velocity_genes'
    uns: 'cell_type_colors', 'neighbors', 'velocity_params', 'regulators', 'targets', 'skeleton', 'network'
    obsm: 'X_pca', 'X_umap'
    layers: 'ambiguous', 'matrix', 'spliced', 'unspliced', 'Ms', 'Mu', 'velocity'
    obsp: 'distances', 'connectivities'

Recap: Perturbing TF regulon to identify key regulators

The following section is a recap from the perturbation analysis introduced in our previous tutorial, where we applied regulon-level perturbations to investigate the role of elf1. Our simulations with RegVelo predict suggest that knocking out the elf1 regulon leads to pigment cell depletion and mesenchymal cell enrichment.

model = 'rgv_model'
vae = rgv.REGVELOVI.load(model, adata)
adata = vae.add_regvelo_outputs_to_adata(adata=adata)

TERMINAL_STATES = ["mNC_head_mesenchymal",
                  "mNC_arch2",
                  "mNC_hox34",
                  "Pigment"]

vk = cr.kernels.VelocityKernel(adata).compute_transition_matrix()
estimator = cr.estimators.GPCCA(vk)

## evaluate the fate probabilities on original space
estimator.compute_macrostates(n_states=7, cluster_key="cell_type")
estimator.set_terminal_states(TERMINAL_STATES)
estimator.compute_fate_probabilities()
estimator.plot_fate_probabilities(same_plot=False)
INFO     File rgv_model/model.pt already downloaded
../../_images/903e6386f47c91e56e6257f9d749f5b083b499f568bce4fb3ca72393c9653c7e.png
terminal_indices = np.where(adata.obs["term_states_fwd"].isin(TERMINAL_STATES))[0]
start_indices = np.where(adata.obs["cell_type"].isin(["NPB_nohox"]))[0]
adata_perturb, _ =  rgv.tl.in_silico_block_simulation(model=model,
                                                      adata=adata,
                                                      TF="elf1",
                                                      cutoff=0)
INFO     File rgv_model/model.pt already downloaded
vk = cr.kernels.VelocityKernel(adata).compute_transition_matrix()
vkt = vk.transition_matrix.A

vk_p = cr.kernels.VelocityKernel(adata_perturb).compute_transition_matrix()
vkt_p = vk_p.transition_matrix.A

total_simulations = rgv.tl.markov_density_simulation(adata, 
                                                     vkt, 
                                                     start_indices, 
                                                     terminal_indices, 
                                                     TERMINAL_STATES,
                                                     n_steps=500)

_ = rgv.tl.markov_density_simulation(adata_perturb, 
                                     vkt_p, 
                                     start_indices, 
                                     terminal_indices, 
                                     TERMINAL_STATES,
                                     n_steps=500)
dd_score, dd_sig = rgv.tl.simulated_visit_diff(adata, adata_perturb, TERMINAL_STATES)
rgv.pl.simulated_visit_diff(adata, 
                            adata_perturb, 
                            TERMINAL_STATES, 
                            total_simulations,
                            title="Density difference (elf1)")
../../_images/91b20ce7437ebe7e47cd6b14097a189780b618e1b3b4f135da05626901e820f1.png

Here we can observe both depletion effects on pigment cell fate and enrichment effects on cranial mesenchyme (arch2 + facial mesenchyme) fate.

We now try to understand the underlying mechanisms.

Regulatory context of elf1

To investigate the regulatory context of elf1, we first extracted all of its downstream target genes. Specifically, we computed the global GRN and determined its downstream target genes. We then use RegVelo’s rgv.tl.regulation_scanning function to study the impact of these regulatroy edges (TF -> target gene) in our inferred GRN by systematically blocking them in silico and checking how much this changes predicted cell fates.

To zoom in the regulatory context, we firstly extracted all downstream targets of elf1. To do so, we compute the global GRN as follows:

GRN = rgv.tl.inferred_grn(vae, adata, label="cell_type", group="all", data_frame=True)
Computing global GRN...
targets = GRN.loc[:, "elf1"]
targets = np.array(targets.index.tolist())[np.array(targets) != 0]

Note

Because most gene expressions are low and noisy in single-cell context, some inferred inhibition might not be actual direct inhibition. Therefore, similar to the previous method pySCENIC, user can specifically focus on activation links via targets = np.array(targets.index.tolist())[np.array(targets) > 0]

RegVelo further screens the key regulatory link by perturbing each downstream link of elf1 and uses RegVelo depletion likelihood to rank the link, associates specific gene regulation with cell fate decision. Here we want to know which link contributes to pigment cell depletion when we remove the elf1 regulon.

perturb_screening = rgv.tl.regulation_scanning(model=model, 
                                               adata=adata, 
                                               n_states=7, 
                                               cluster_label="cell_type", 
                                               terminal_states=TERMINAL_STATES, 
                                               TF=["elf1"], 
                                               target=targets, 
                                               effect=0,
                                               n_samples=50)
INFO     File rgv_model/model.pt already downloaded
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> ENSDARG00000024966
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> ENSDARG00000042329
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> alcama
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> aopep
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> apc
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> atp6v0ca
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> baz1b
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> calr3b
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> ccny
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> cdk1
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> cdkn1ca
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> celf2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> cenpf
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> cpeb4b
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> cxxc5b
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> diaph2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> dlg1
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> dnajb2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> dnmt1
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> dusp5
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> ebf3a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> emc2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> ephb3a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> erbb3b
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> esco2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> eva1ba
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> fam49a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> fhl3a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> fhod1
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> fli1a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> foxo1a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> fzd3a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> glb1l
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> glulb
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> gpd2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> hat1
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> hexb
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> hivep1
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> hivep3b
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> hmgn2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> hnrnpabb
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> hpcal4
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> hsp70.2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> hspa5
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> hspb8
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> id2a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> ildr2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> inka1a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> itga3a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> itga8
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> ivns1abpa
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> klf6a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> kntc1
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> mbnl2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> metrn
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> mibp
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> myo10l1
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> nr6a1b
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> pcdh2g28
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> pdgfba
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> pdlim4
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> pleca
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> plpp3
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> pmp22a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> ppt1
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> prdm1a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> prkceb
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> prkcsh
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> pttg1ipb
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> rabl6b
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> ralgps2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> rgl1
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> rhbdf1a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> rhoca
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> rxraa
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> sash1b
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> sema3d
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> sema4ba
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> sept12
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> sept9a
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> serinc5
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> shroom4
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> si:ch211-199g17.2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> si:ch211-222l21.1
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> si:ch73-335l21.1
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> si:dkey-17m8.1
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> sipa1l2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> slbp
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> slc12a9
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> sox6
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> src
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> stat3
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> tcf12
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> tes
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> tle3b
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> tuba8l4
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> vash2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> vav2
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> zgc:154093
INFO     File rgv_model/model.pt already downloaded
Finished elf1 -> zgc:165573
coef = pd.DataFrame(np.array(perturb_screening["coefficient"]))
coef.index = perturb_screening["links"]
coef.columns = list(perturb_screening["coefficient"][0].keys())

We further list the top links leading to the highest pigment cell depletion.

Pigment = coef.sort_values(by="Pigment", ascending=False)[:15]["Pigment"]
df = pd.DataFrame({"Gene": Pigment.index.tolist(), "Score": np.array(Pigment)})

# Sort DataFrame by -log10(p-value) for ordered plotting
df = df.sort_values(by="Score", ascending=False)

# Highlight specific genes
# Set up the plot
with mplscience.style_context():
    sns.set_style(style="white")
    fig, ax = plt.subplots(figsize=(3, 6))
    sns.scatterplot(data=df, x="Score", y="Gene", palette="purple", s=200, legend=False)

    for _, row in df.iterrows():
        plt.hlines(row["Gene"], xmin=0.5, xmax=row["Score"], colors="grey", linestyles="-", alpha=0.5)

    # Customize plot appearance
    plt.xlabel("Pigment depletion likelihood")
    plt.ylabel("")
    plt.title("Pigment")

    plt.gca().spines["top"].set_visible(False)
    plt.gca().spines["right"].set_visible(False)
    plt.gca().spines["left"].set_color("black")
    plt.gca().spines["bottom"].set_color("black")
    # Show plot
../../_images/e1ad293563f865e2c50cc158af58504b2dea2d31f3ba64ca5652afacb7dde658.png

We see that elf1 -> fli1a is ranked as the top link affecting pigmentation. We further focus on the upstream of elf1 to study which regulators determine the elf1 expression in driving pigmentation.

regulators = GRN.loc["elf1",:]
regulators = np.array(regulators.index.tolist())[np.array(regulators) != 0]
perturb_screening_tf = rgv.tl.regulation_scanning(model=model, 
                                                  adata=adata, 
                                                  n_states=7, 
                                                  cluster_label="cell_type", 
                                                  terminal_states=TERMINAL_STATES, 
                                                  TF=regulators, 
                                                  target=["elf1"], 
                                                  effect=0,
                                                  n_samples=50)
INFO     File rgv_model/model.pt already downloaded
INFO     File rgv_model/model.pt already downloaded
Finished arntl1b -> elf1
INFO     File rgv_model/model.pt already downloaded
Finished fli1a -> elf1
INFO     File rgv_model/model.pt already downloaded
Finished fosl2 -> elf1
INFO     File rgv_model/model.pt already downloaded
Finished hnf1ba -> elf1
INFO     File rgv_model/model.pt already downloaded
Finished hoxa4a -> elf1
INFO     File rgv_model/model.pt already downloaded
Finished hoxb8a -> elf1
INFO     File rgv_model/model.pt already downloaded
Finished hoxc3a -> elf1
INFO     File rgv_model/model.pt already downloaded
Finished sox10 -> elf1
INFO     File rgv_model/model.pt already downloaded
Finished sox9b -> elf1
INFO     File rgv_model/model.pt already downloaded
Finished tfap2a -> elf1
INFO     File rgv_model/model.pt already downloaded
Finished tfec -> elf1
coef = pd.DataFrame(np.array(perturb_screening_tf["coefficient"]))
coef.index = perturb_screening_tf["links"]
coef.columns = list(perturb_screening_tf["coefficient"][0].keys())
Pigment = coef.sort_values(by="Pigment", ascending=False)["Pigment"]
df = pd.DataFrame({"Gene": Pigment.index.tolist(), "Score": np.array(Pigment)})

# Sort DataFrame by -log10(p-value) for ordered plotting
df = df.sort_values(by="Score", ascending=False)

# Highlight specific genes
# Set up the plot
with mplscience.style_context():
    sns.set_style(style="white")
    fig, ax = plt.subplots(figsize=(4, 6))
    sns.scatterplot(data=df, x="Score", y="Gene", palette="purple", s=200, legend=False)

    for _, row in df.iterrows():
        plt.hlines(row["Gene"], xmin=0.45, xmax=row["Score"], colors="grey", linestyles="-", alpha=0.5)

    # Customize plot appearance
    plt.xlabel("Pigment depletion likelihood")
    plt.ylabel("")
    plt.title("Pigment")

    plt.gca().spines["top"].set_visible(False)
    plt.gca().spines["right"].set_visible(False)
    plt.gca().spines["left"].set_color("black")
    plt.gca().spines["bottom"].set_color("black")
    # Show plot
../../_images/dd1aa63cc7aabf2213e6c1df42500175a945c9138da7a1df5412f7f8ed33f8e6.png

We see that hoxb8a and tfec are the top regulators of elf1. We have now found a subnetwork centered around elf1 for the pigment cell fate determination.

Visualization of the genetic circuit

We further visualize this genetic circuit, in which we include its top upstream regulator tfec and downstream target fli1a.

motif = [
    ["tfec","elf1", GRN.loc["elf1","tfec"]],
    ["elf1", "fl1a", GRN.loc["elf1", "fli1a"]],
    ["fl1a", "elf1", GRN.loc["fli1a", "elf1"]],
]

motif = pd.DataFrame(motif)
motif.columns = ["from", "to", "weight"]
motif["weight"] = np.sign(motif["weight"])
with mplscience.style_context():
    rgv.pl.regulatory_network(motif=motif)
../../_images/66cf1bc0e6cfaced4bab4bd14f8ba99900d9793ab0620e108eb8ba9c59c93d1b.png

Visualization genetic circut

Using the Jacobian matrix framework [Qiu et al., 2022], RegVelo also enables the estimation of cell-specific regulatory weights. We can easily visualize this in a 2D embedding space. In the following, we demonstrate this for the regulatory edge tfec -> elf1.

GRN = rgv.tl.inferred_grn(vae, adata, cell_specific_grn=True)
GRN.shape
(697, 1008, 1008)

Note

The estimated cell-specific GRN has three dimensions: the first corresponds to the cells, the second to the target genes, and the third to the regulator genes.

tfec_elf1 = GRN[:,[i == "elf1" for i in adata.var.index], [i == "tfec" for i in adata.var.index]].reshape(-1)
adata.obs["tfec_elf1_weight"] = tfec_elf1
scv.pl.umap(adata,
            color=["tfec","elf1"],
            frameon=False,layer = ["Ms"],title = ["tfec","elf1"])
../../_images/87e9de62e8d72fa8bd0da2ac7f692189f4aeab8f6333bbf825b1ea0d9cc6cf18.png
scv.pl.umap(adata,
            color=["tfec_elf1_weight"],
            cmap="Reds")
../../_images/470cb43d6ac3593c5a41a01004c3d65412f7a5562b5d658d987b83357ce8ddf7.png