squidpy.gr.ligrec

squidpy.gr.ligrec(adata, cluster_key, interactions=None, complex_policy='min', threshold=0.01, corr_method=None, corr_axis='clusters', use_raw=True, copy=False, key_added=None, gene_symbols=None, *, table_key=None, **kwargs)

Perform the permutation test as described in [Efremova et al., 2020].

Parameters:

adata (AnnData | SpatialData) – Annotated data object.
use_raw (bool) – Whether to access anndata.AnnData.raw.
table_key (str | None) – Key in spatialdata.SpatialData.tables where the table is stored. Required when adata is a spatialdata.SpatialData object and ignored otherwise.
interactions (DataFrame | Mapping[str, Sequence[str]] | Sequence[str] | tuple[Sequence[str], Sequence[str]] | Sequence[tuple[str, str]] | None) –
Interaction to test. The type can be one of:
- pandas.DataFrame - must contain at least 2 columns named ‘source’ and ‘target’.
- dict - dictionary with at least 2 keys named ‘source’ and ‘target’.
- typing.Sequence - one of: a sequence of str, in which case all combinations are produced; a sequence of tuples of 2 str; or a tuple of 2 sequences.
If None, the interactions are extracted from omnipath. Protein complexes can be specified by delimiting the components with ‘_’, such as ‘alpha_beta_gamma’.
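The accepted forms of interactions can be sketched as plain Python objects. This is an illustrative sketch only: the gene symbols are hypothetical placeholders, and the expansion of a flat sequence into all ordered pairs (including self-pairs) is an assumption about what "all combinations" means, not something this page confirms.

```python
from itertools import product

# Hypothetical gene symbols; replace them with names present in your data.
sources = ["LIG1", "LIG2"]
targets = ["REC1", "REC2"]

# Mapping with 'source' and 'target' keys
# (a pandas.DataFrame with the same two columns is analogous).
as_mapping = {"source": sources, "target": targets}

# Sequence of (source, target) tuples.
as_pairs = list(zip(sources, targets))

# Tuple of two sequences: sources first, targets second.
as_two_sequences = (sources, targets)

# Flat sequence of names: all combinations are tested.
# Assumption: this corresponds to every ordered pair of the given names.
genes = ["LIG1", "REC1"]
expanded = list(product(genes, repeat=2))

# A protein complex is written by joining its components with '_'.
with_complex = [("LIG1", "REC1_REC2")]

print(as_pairs)
print(expanded)
```

Any of these objects would be passed as the interactions argument. Because '_' delimits complex components, names that themselves contain underscores would be read as complexes.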
complex_policy (Literal['min', 'all']) –
Policy on how to handle complexes. Valid options are:
- ‘min’ - select gene with the minimum average expression. This is the same as in [Efremova et al., 2020].
- ‘all’ - select all possible combinations between ‘source’ and ‘target’ complexes.
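The two policies can be illustrated with a small conceptual sketch. It is not squidpy's implementation: the helper names resolve_min and resolve_all are hypothetical, the expression values are made up, and the set of cells over which the average is taken is left unspecified here.

```python
from itertools import product


def resolve_min(complex_name, mean_expression):
    """Pick the component of a complex with the lowest average expression."""
    components = complex_name.split("_")
    return min(components, key=lambda gene: mean_expression[gene])


def resolve_all(source, target):
    """Pair every component of the source with every component of the target."""
    return list(product(source.split("_"), target.split("_")))


# Made-up average expression values for the components of 'alpha_beta_gamma'.
mean_expression = {"alpha": 2.0, "beta": 0.5, "gamma": 1.2}

print(resolve_min("alpha_beta_gamma", mean_expression))
print(resolve_all("alpha_beta", "delta"))
```

Under ‘min’, a complex is represented by a single gene, so the number of tested interactions stays the same. Under ‘all’, each complex expands into several gene-level interactions.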
interactions_params – Keyword arguments for omnipath.interactions.import_intercell_network() defining the interactions. These datasets from [Türei et al., 2016] are used by default: omnipath, pathwayextra, kinaseextra and ligrecextra.
transmitter_params – Keyword arguments for omnipath.interactions.import_intercell_network() defining the transmitter side of intercellular connections.
receiver_params – Keyword arguments for omnipath.interactions.import_intercell_network() defining the receiver side of intercellular connections.
cluster_key (str) – Key in anndata.AnnData.obs where clustering is stored.
clusters – Clusters from anndata.AnnData.obs['{cluster_key}']. Can be specified either as a sequence of tuples or as a sequence of cluster names, in which case all combinations are considered.
n_perms – Number of permutations for the permutation test.
threshold (float) – Do not perform the permutation test if any of the interacting components is expressed in less than threshold percent of cells within a given cluster.
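The filtering rule can be sketched as follows. Two assumptions go beyond this page: that threshold is a fraction in [0, 1] (suggested by the default of 0.01), and that a cell "expresses" a gene when its value is greater than 0. The helper passes_threshold is hypothetical.

```python
def passes_threshold(expression_in_cluster, threshold=0.01):
    """Return True if enough cells in the cluster express the gene.

    Assumptions: `threshold` is a fraction of cells, and a value > 0
    counts as expressed.
    """
    n_cells = len(expression_in_cluster)
    if n_cells == 0:
        return False
    n_expressing = sum(1 for value in expression_in_cluster if value > 0)
    return n_expressing / n_cells >= threshold


# One expressing cell out of 200 (0.5%) is below the default 1% cutoff.
print(passes_threshold([0.0] * 199 + [3.0]))
# One expressing cell out of 10 (10%) is above it.
print(passes_threshold([0.0] * 9 + [3.0]))
```

Interactions in which either component fails this check for a given cluster pair are skipped by the permutation test and reported with a NaN p-value.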
seed – Random seed for reproducibility.
corr_method (str | None) – Correction method for multiple testing. See statsmodels.stats.multitest.multipletests() for valid options.
corr_axis (Literal['interactions', 'clusters']) –
Axis over which to perform the FDR correction. Only used when corr_method != None. Valid options are:
- ‘interactions’ - correct interactions by performing FDR correction across the clusters.
- ‘clusters’ - correct clusters by performing FDR correction across the interactions.
alpha – Significance level for FDR correction. Only used when corr_method != None.
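The difference between the two correction axes can be sketched with a pure-Python Benjamini-Hochberg adjustment (the procedure that statsmodels exposes as 'fdr_bh'). This is a conceptual sketch, not squidpy's code: it assumes a table whose rows are interactions and whose columns are cluster pairs, it does not handle NaN p-values, and the helper names are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values for one family of tests."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def correct(table, corr_axis):
    """Apply the adjustment along the chosen axis of a rows-by-columns table."""
    if corr_axis == "interactions":
        # Each interaction (row) is corrected across the cluster pairs.
        return [bh_adjust(row) for row in table]
    if corr_axis == "clusters":
        # Each cluster pair (column) is corrected across the interactions.
        columns = [bh_adjust(list(column)) for column in zip(*table)]
        return [list(row) for row in zip(*columns)]
    raise ValueError(f"Unknown corr_axis: {corr_axis!r}")


# Made-up p-values: 2 interactions (rows) x 2 cluster pairs (columns).
pvalues = [[0.01, 0.04], [0.03, 0.02]]
print(correct(pvalues, "interactions"))
print(correct(pvalues, "clusters"))
```

The same raw p-value can therefore end up with a different corrected value depending on which family of tests it is grouped with.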
copy (bool) – If True, return the result, otherwise save it to the adata object.
key_added (str | None) – Key in anndata.AnnData.uns where the result is stored if copy = False. If None, '{cluster_key}_ligrec' will be used.
numba_parallel – Whether to use numba.prange() or not. If None, it is determined automatically. For small datasets or a small number of interactions, it’s recommended to set this to False.
n_jobs – Number of parallel jobs to use. For backend="loky", the number of cores used by numba in each job spawned by the backend is set to 1 to avoid oversubscription when the parallelized function itself uses numba. To cap the total number of numba threads for your Python program, set the environment variable NUMBA_NUM_THREADS before running the program.
backend – Parallelization backend to use. See joblib.Parallel for available options.
show_progress_bar – Whether to show the progress bar or not.
gene_symbols (str | None) – Key in anndata.AnnData.var to use instead of anndata.AnnData.var_names.

Return type:

Mapping[str, DataFrame] | None

Returns:

If copy = True, returns a dict with the following keys:

- ‘means’ - pandas.DataFrame containing the mean expression.
- ‘pvalues’ - pandas.DataFrame containing the possibly corrected p-values.
- ‘metadata’ - pandas.DataFrame containing interaction metadata.

Otherwise, modifies the adata object with the following key:

- anndata.AnnData.uns['{key_added}'] - the above-mentioned dict.

NaN p-values mark combinations for which the mean expression of one of the interacting components was 0, or for which a component did not pass the threshold on the percentage of cells expressing it within a given cluster.
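How the result is stored and retrieved can be sketched as below. The helpers ligrec_key and run_and_fetch are hypothetical, the cluster key "leiden" is a placeholder, and the sketch assumes adata is an anndata.AnnData object. squidpy is imported lazily inside the function, so the stand-in demonstration at the bottom runs without it.

```python
from types import SimpleNamespace


def ligrec_key(cluster_key, key_added=None):
    """Key under which the result is stored when copy = False."""
    return f"{cluster_key}_ligrec" if key_added is None else key_added


def run_and_fetch(adata, cluster_key, **kwargs):
    """Run the test in place and return the stored result dict."""
    import squidpy as sq  # imported lazily; requires squidpy to be installed

    sq.gr.ligrec(adata, cluster_key, copy=False, **kwargs)
    return adata.uns[ligrec_key(cluster_key, kwargs.get("key_added"))]


# Stand-in object that mimics only the `.uns` layout described above.
stand_in = SimpleNamespace(
    uns={"leiden_ligrec": {"means": None, "pvalues": None, "metadata": None}}
)
result = stand_in.uns[ligrec_key("leiden")]
print(sorted(result))
```

With copy = True the same dict is returned directly instead of being written to the object.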