regvelo.ModelComparison

class regvelo.ModelComparison(adata, terminal_states=None, state_transition=None, n_states=None)

Compare different types of RegVelo models [Wang2025].

This class compares RegVelo models trained with different optimization modes (soft, hard, soft_regularized) and, for soft_regularized, with different values of the normalization factor lam2. Users can evaluate and visualize how well each model performs against several kinds of side information about the cells: real time, pseudotime, stemness score, terminal state identification (TSI), and cross-boundary correctness (CBC). The result is a bar plot in which the best-performing model is marked and its advantage is annotated by a significance test.

Examples

See notebook.
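A minimal sketch of the workflow, using only the signatures documented on this page. The function name compare_regvelo_models, the lam2 values, and the choice of ‘Pseudo_Time’ are illustrative assumptions; the sketch also assumes that adata is already preprocessed and carries a pseudotime column under the default side_key.

```python
def compare_regvelo_models(adata, terminal_states):
    """Train, evaluate, and plot several RegVelo model variants (sketch)."""
    # Imported lazily so that defining this function does not require regvelo.
    import regvelo

    comparison = regvelo.ModelComparison(adata, terminal_states=terminal_states)

    # Train every model type several times; lam2 applies to 'Soft_regularized'.
    trained_keys = comparison.train(
        model_list=["Soft", "Hard", "Soft_regularized"],
        lam2=[0.3, 0.7],  # illustrative values inside (0, 1)
        n_repeat=5,       # more than 3 repeats, so significance marks can appear
    )

    # Score all trained models against pseudotime, using the default side_key.
    scores = comparison.evaluate(side_information="Pseudo_Time")

    # Draw the bar plot for the side information that was just evaluated.
    comparison.plot_results(side_information="Pseudo_Time")
    return trained_keys, scores
```

Calling the function requires an installed regvelo and a suitable AnnData object; the body only chains train, evaluate, and plot_results in the order the method descriptions imply.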

Parameters:

adata (AnnData)
terminal_states (list)
state_transition (dict)
n_states (int)

Methods

ModelComparison.calculate(adata, side_information, side_key=None)

Parameters:

adata (AnnData)
side_information (str)
side_key (str | None)

ModelComparison.evaluate(side_information, side_key=None)

Evaluate all trained models under one specific side_information mode. For example, if the exact time or stage of each cell is known, choose ‘Real_Time’ as the reference; if pseudotime has already been computed, for instance with CellRank, choose ‘Pseudo_Time’.

Parameters:

side_information (str) – The perspective from which to compare RegVelo models: one of ‘Real_Time’, ‘Pseudo_Time’, ‘Stemness_Score’, ‘TSI’, or ‘CBC’.
side_key (Optional[str]) – Name of the adata.obs column that stores the values for the selected side_information. A default side_key is provided for ‘Pseudo_Time’ and ‘Stemness_Score’, but a custom one can be supplied instead.

Return type:

DataFrame

Returns:

A DataFrame recording the evaluation performance of all models.

ModelComparison.get_significance(pvalue)
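This page does not describe get_significance. Plotting helpers of this kind commonly turn a p-value into a star label, so the sketch below illustrates that convention only. The function name significance_label and the thresholds 0.05, 0.01, and 0.001 are assumptions, not RegVelo's documented behavior.

```python
def significance_label(pvalue: float) -> str:
    """Map a p-value to a conventional star annotation (assumed thresholds)."""
    # Reject values outside [0, 1]; NaN also fails this comparison.
    if not 0.0 <= pvalue <= 1.0:
        raise ValueError(f"p-value must lie in [0, 1], got {pvalue}")
    if pvalue < 0.001:
        return "***"
    if pvalue < 0.01:
        return "**"
    if pvalue < 0.05:
        return "*"
    return "n.s."
```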

ModelComparison.min_max_scaling(x)
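min_max_scaling is not described on this page. Min-max scaling conventionally maps values linearly onto [0, 1] via (x - min) / (max - min). The sketch below shows that standard formula on a plain list; the function name min_max_scale and the handling of constant input are assumptions and may differ from RegVelo's implementation.

```python
def min_max_scale(values):
    """Linearly rescale a non-empty sequence of numbers onto [0, 1]."""
    values = list(values)
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Constant input has no spread; return zeros instead of dividing by 0.
        return [0.0 for _ in values]
    span = highest - lowest
    return [(value - lowest) / span for value in values]
```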

ModelComparison.model_load(pthfilepath)

Load a trained model from a given file path using cloudpickle.

Parameters:

pthfilepath (str) – The file path from which the model will be loaded.

Raises:

FileNotFoundError – If the specified file does not exist.
ValueError – If the loaded object is not a dictionary.
Exception – For other unexpected errors during loading.

Return type:

None

Returns:

None

The function assigns the loaded model dictionary to self.MODEL_TRAINED.

ModelComparison.model_save(pthfilepath='model_dict.pth')

Save the trained model to a given file path using cloudpickle.

Parameters:

pthfilepath (str, optional) – The file path where the model will be saved. Defaults to ‘model_dict.pth’.

Return type:

None

Returns:

None

The function saves the model to disk and prints the status.
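The save and load behavior described for model_save and model_load can be sketched as a cloudpickle round trip of a dictionary. The helper names save_model_dict and load_model_dict are hypothetical, the placeholder string stands in for a trained model, and the fallback to the standard pickle module exists only so the sketch runs where cloudpickle is not installed.

```python
import os
import tempfile

try:
    import cloudpickle as pickler  # the library named in the method descriptions
except ImportError:
    import pickle as pickler       # stdlib fallback so the sketch still runs


def save_model_dict(models: dict, path: str = "model_dict.pth") -> None:
    """Serialize the dictionary of models to disk and report the status."""
    with open(path, "wb") as handle:
        pickler.dump(models, handle)
    print(f"Saved {len(models)} model(s) to {path}")


def load_model_dict(path: str) -> dict:
    """Load a dictionary of models, mirroring the documented error cases."""
    if not os.path.exists(path):
        raise FileNotFoundError(f"No such file: {path}")
    with open(path, "rb") as handle:
        loaded = pickler.load(handle)
    if not isinstance(loaded, dict):
        raise ValueError("Expected the file to contain a dictionary")
    return loaded


# Round trip inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model_dict.pth")
    save_model_dict({"Soft_0": "placeholder"}, path)
    restored = load_model_dict(path)
```

Because pickle-based formats execute code on load, such files should only be loaded from trusted sources.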

ModelComparison.plot_results(side_information, figsize=(6, None), palette='lightpink')

Visualize the comparison result as a bar plot with overlaid scatter points. Significance marks are shown only when the number of repeats (n_repeat) is greater than 3 and p < 0.05.

Parameters:

side_information – The side_information to visualize; it must already have been evaluated with evaluate.
figsize – Figure size. Defaults to (6, None), in which case the height of the plot scales with the number of models.
palette – Color of the bars.

Returns:

None. The figure is plotted.

ModelComparison.result_load(side_information)

Load a CSV file into a DataFrame and assign it to the instance.

Parameters:

side_information (str) – The key identifying the side information to be loaded. The CSV file must be named <side_information>.csv.

Return type:

bool

Returns:

bool

True if the file is successfully loaded and assigned to self.df_<side_information>. False if the file does not exist, the key is invalid, or another error occurs.

ModelComparison.result_save(side_information)

Save a DataFrame associated with the given side information to CSV.

Parameters:

side_information (str) – The key identifying the DataFrame attribute of the instance, expected to be stored as self.df_<side_information>.

Return type:

None

Returns:

None

The function saves the DataFrame to a CSV file named <side_information>.csv in the current directory.

ModelComparison.train(model_list, adata=None, lam2=None, n_repeat=1, batch_size=None)

Train all models requested by the user and store them in a dictionary, so they can be accessed easily and processed in batch. Models that were trained and saved earlier are not removed.

Parameters:

adata (Optional[AnnData]) – The annotated data matrix. If provided, it is stored on the object as an attribute.
model_list (list[str]) – The model types to train; valid values are ‘Soft’, ‘Hard’, and ‘Soft_regularized’.
lam2 (Union[list[float], float, None]) – Normalization factor used in ‘soft_regularized’ mode: a float, or a list of floats, in the range (0, 1).
batch_size – Training batch size, which can be adjusted to the size of the data.
n_repeat (int)

Return type:

list

Returns:

A list of the dictionary keys identifying all models trained in this step.

ModelComparison.validate_input(adata, model_list=None, side_information=None, lam2=None, side_key=None)

Return type:

None

Parameters:

adata (AnnData)
model_list (list[str] | None)
side_information (str | None)
lam2 (list[float] | float | None)
side_key (str | None)
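This page does not state what validate_input checks. The sketch below only illustrates the constraints given elsewhere on this page: the valid model types, the valid side_information values, and lam2 lying in (0, 1). The function name check_inputs and its error messages are hypothetical, and the adata and side_key checks are left out because their rules are not documented here.

```python
VALID_MODELS = {"Soft", "Hard", "Soft_regularized"}
VALID_SIDE_INFORMATION = {"Real_Time", "Pseudo_Time", "Stemness_Score", "TSI", "CBC"}


def check_inputs(model_list=None, side_information=None, lam2=None):
    """Raise ValueError if an argument violates the documented constraints."""
    if model_list is not None:
        unknown = [name for name in model_list if name not in VALID_MODELS]
        if unknown:
            raise ValueError(f"Unknown model type(s): {unknown}")

    if side_information is not None and side_information not in VALID_SIDE_INFORMATION:
        raise ValueError(f"Unknown side_information: {side_information!r}")

    if lam2 is not None:
        # Accept a single float or a list of floats, as in the train signature.
        values = lam2 if isinstance(lam2, (list, tuple)) else [lam2]
        for value in values:
            if not 0.0 < value < 1.0:
                raise ValueError(f"lam2 must lie in (0, 1), got {value}")
```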