squidpy.gr.ligrec

squidpy.gr.ligrec(adata, cluster_key, interactions=None, complex_policy='min', threshold=0.01, corr_method=None, corr_axis='clusters', use_raw=True, copy=False, key_added=None, gene_symbols=None, **kwargs)

Perform the permutation test as described in [Efremova et al., 2020].
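As a rough illustration of the idea behind the permutation test, the sketch below shuffles the cluster labels and counts how often the shuffled score reaches the observed one. It is a simplified, pure-Python reading of the approach and not squidpy's implementation: the helper names (interaction_score, permutation_pvalue), the toy expression values, and the choice of score (the mean of the ligand's average expression in the source cluster and the receptor's average expression in the target cluster) are assumptions made for this example.

```python
import random


def interaction_score(ligand, receptor, labels, source, target):
    """Average of the ligand mean in `source` and the receptor mean in `target`."""
    lig = [x for x, lab in zip(ligand, labels) if lab == source]
    rec = [x for x, lab in zip(receptor, labels) if lab == target]
    return (sum(lig) / len(lig) + sum(rec) / len(rec)) / 2


def permutation_pvalue(ligand, receptor, labels, source, target, n_perms=1000, seed=0):
    """Return the observed score and the fraction of shuffles that reach it."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    observed = interaction_score(ligand, receptor, labels, source, target)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perms):
        rng.shuffle(shuffled)  # shuffling keeps the cluster sizes unchanged
        if interaction_score(ligand, receptor, shuffled, source, target) >= observed:
            hits += 1
    return observed, hits / n_perms


# Toy data (made up): the ligand is expressed only in cluster "A",
# the receptor only in cluster "B".
labels = ["A"] * 5 + ["B"] * 5
ligand = [5.0, 6.0, 5.0, 6.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0]
receptor = [0.0, 0.0, 0.0, 0.0, 0.0, 4.0, 5.0, 4.0, 5.0, 4.0]

observed, pval = permutation_pvalue(ligand, receptor, labels, "A", "B")
print(observed, pval)
```

Because the toy ligand and receptor are each confined to one cluster, almost no shuffle reproduces the observed score, so the resulting p-value is small.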

Parameters:

adata (AnnData | SpatialData) – Annotated data object.
use_raw (bool) – Whether to access anndata.AnnData.raw.
interactions (Union[DataFrame, Mapping[str, Sequence[str]], Sequence[str], tuple[Sequence[str], Sequence[str]], Sequence[tuple[str, str]], None]) –
Interactions to test. The type can be one of:
- pandas.DataFrame - must contain at least 2 columns named ‘source’ and ‘target’.
- dict - dictionary with at least 2 keys named ‘source’ and ‘target’.
- typing.Sequence - either a sequence of str, in which case all combinations are produced, a sequence of tuples of 2 str, or a tuple of 2 sequences.
If None, the interactions are extracted from omnipath. Protein complexes can be specified by delimiting the components with ‘_’, such as ‘alpha_beta_gamma’.
complex_policy (Literal['min', 'all']) –
Policy on how to handle complexes. Valid options are:
- ‘min’ - select the gene with the minimum average expression. This is the same as in [Efremova et al., 2020].
- ‘all’ - select all possible combinations between ‘source’ and ‘target’ complexes.
interactions_params – Keyword arguments for omnipath.interactions.import_intercell_network() defining the interactions. These datasets from [Türei et al., 2016] are used by default: omnipath, pathwayextra, kinaseextra and ligrecextra.
transmitter_params – Keyword arguments for omnipath.interactions.import_intercell_network() defining the transmitter side of intercellular connections.
receiver_params – Keyword arguments for omnipath.interactions.import_intercell_network() defining the receiver side of intercellular connections.
cluster_key (str) – Key in anndata.AnnData.obs where clustering is stored.
clusters – Clusters from anndata.AnnData.obs['{cluster_key}']. Can be specified either as a sequence of tuples or as a sequence of cluster names, in which case all combinations are considered.
n_perms – Number of permutations for the permutation test.
threshold (float) – Do not perform the permutation test if any of the interacting components is expressed in less than threshold percent of cells within a given cluster.
seed – Random seed for reproducibility.
corr_method (Optional[str]) – Correction method for multiple testing. See statsmodels.stats.multitest.multipletests() for valid options.
corr_axis (Literal['interactions', 'clusters']) –
Axis over which to perform the FDR correction. Only used when corr_method != None. Valid options are:
- ‘interactions’ - correct interactions by performing FDR correction across the clusters.
- ‘clusters’ - correct clusters by performing FDR correction across the interactions.
alpha – Significance level for FDR correction. Only used when corr_method != None.
copy (bool) – If True, return the result, otherwise save it to the adata object.
key_added (Optional[str]) – Key in anndata.AnnData.uns where the result is stored if copy = False. If None, '{cluster_key}_ligrec' will be used.
numba_parallel – Whether to use numba.prange or not. If None, it is determined automatically. For small datasets or a small number of interactions, it is recommended to set this to False.
n_jobs – Number of parallel jobs.
backend – Parallelization backend to use. See joblib.Parallel for available options.
show_progress_bar – Whether to show the progress bar or not.
gene_symbols (Optional[str]) – Key in anndata.AnnData.var to use instead of anndata.AnnData.var_names.
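
The interactions and complex_policy parameters above can be illustrated with a small pure-Python sketch. It is not squidpy's code: the gene names (L1, L2, R1, R2), the mean_expr values and the helper resolve_complexes are hypothetical, and reading "all combinations" as the Cartesian product of the sequence with itself is an assumption.

```python
from itertools import product

# Hypothetical per-gene average expression (made-up numbers).
mean_expr = {"L1": 2.0, "L2": 0.5, "R1": 1.5, "R2": 3.0}

# Three of the documented ways to describe the same single interaction
# between the complexes "L1_L2" (source) and "R1_R2" (target).
as_dict = {"source": ["L1_L2"], "target": ["R1_R2"]}
as_pairs = [("L1_L2", "R1_R2")]
as_two_sequences = (["L1_L2"], ["R1_R2"])

# A plain sequence of names. Assumption: "all combinations" means the
# Cartesian product of the sequence with itself.
all_pairs = list(product(["L1", "R1"], repeat=2))


def resolve_complexes(source, target, policy="min"):
    """Turn one complex-vs-complex interaction into gene-level pairs."""
    src_parts = source.split("_")  # complexes are delimited with '_'
    tgt_parts = target.split("_")
    if policy == "min":
        # Keep the component with the lowest average expression on each side.
        return [(min(src_parts, key=mean_expr.__getitem__),
                 min(tgt_parts, key=mean_expr.__getitem__))]
    if policy == "all":
        # Keep every source-component / target-component combination.
        return list(product(src_parts, tgt_parts))
    raise ValueError(f"Unknown policy: {policy!r}")


print(resolve_complexes("L1_L2", "R1_R2", policy="min"))
print(resolve_complexes("L1_L2", "R1_R2", policy="all"))
```

With ‘min’, each complex is represented by its least expressed component, so a single weakly expressed subunit limits the whole complex. With ‘all’, every pairing is tested separately.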

Return type:

Mapping[str, DataFrame] | None

Returns:

If copy = True, returns a dict with the following keys:

‘means’ - pandas.DataFrame containing the mean expression.

‘pvalues’ - pandas.DataFrame containing the possibly corrected p-values.

‘metadata’ - pandas.DataFrame containing interaction metadata.

Otherwise, modifies the adata object with the following key:

anndata.AnnData.uns['{key_added}'] - the above mentioned dict.

NaN p-values mark combinations for which the mean expression of one of the interacting components was 0, or for which a component did not pass the threshold percentage of expressing cells within a given cluster.
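
The masking rule can be sketched in a few lines of pure Python. This is an illustration under stated assumptions, not the library's code: the helpers fraction_expressing and mask_pvalue are hypothetical, a cell counts as "expressing" when its value is greater than 0, and threshold is treated as a fraction between 0 and 1, which the default of 0.01 suggests but this page does not state outright.

```python
import math


def fraction_expressing(values, labels, cluster):
    """Fraction of cells in `cluster` with expression above zero."""
    in_cluster = [v for v, lab in zip(values, labels) if lab == cluster]
    return sum(v > 0 for v in in_cluster) / len(in_cluster)


def mask_pvalue(pvalue, ligand, receptor, labels, source, target, threshold=0.01):
    """Return NaN when either component fails the expression threshold."""
    lig_frac = fraction_expressing(ligand, labels, source)
    rec_frac = fraction_expressing(receptor, labels, target)
    if lig_frac < threshold or rec_frac < threshold:
        return math.nan  # combination is not tested
    return pvalue


# Toy data (made up): the ligand is seen only in cluster "A",
# the receptor only in cluster "B".
labels = ["A"] * 4 + ["B"] * 4
ligand = [1.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0]
receptor = [0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0]

kept = mask_pvalue(0.02, ligand, receptor, labels, "A", "B")
masked = mask_pvalue(0.02, ligand, receptor, labels, "B", "A")
print(kept, masked)
```

When reading the ‘pvalues’ table, NaN entries should therefore be treated as "not tested" and not as non-significant results.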