squidpy.gr.ligrec(adata, cluster_key, interactions=None, complex_policy='min', threshold=0.01, corr_method=None, corr_axis='clusters', use_raw=True, copy=False, key_added=None, gene_symbols=None, **kwargs)

Perform the permutation test as described in [Efremova et al., 2020].
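
The general idea of such a permutation test can be sketched in plain Python. This is a simplified illustration under stated assumptions, not squidpy's implementation: the statistic is taken to be the average of the mean ligand expression in the source cluster and the mean receptor expression in the target cluster, and the helper names (cluster_mean, permutation_pvalue) and the toy data are hypothetical.

```python
import random


def cluster_mean(values, labels, cluster):
    """Mean of `values` over the cells whose label equals `cluster`."""
    selected = [v for v, lab in zip(values, labels) if lab == cluster]
    return sum(selected) / len(selected)


def permutation_pvalue(ligand, receptor, labels, source, target, n_perms=1000, seed=0):
    """Return the observed statistic and the fraction of label shuffles reaching it."""

    def statistic(labs):
        # Assumed statistic: average of the two cluster means.
        return (cluster_mean(ligand, labs, source) + cluster_mean(receptor, labs, target)) / 2

    rng = random.Random(seed)  # fixed seed for reproducibility
    observed = statistic(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perms):
        rng.shuffle(shuffled)  # permute cluster labels; cluster sizes stay the same
        if statistic(shuffled) >= observed:
            hits += 1
    return observed, hits / n_perms


# Toy data: 6 cells, ligand high in cluster "A", receptor high in cluster "B".
labels = ["A", "A", "A", "B", "B", "B"]
ligand = [5.0, 6.0, 7.0, 0.0, 1.0, 0.0]
receptor = [0.0, 1.0, 0.0, 4.0, 5.0, 6.0]
observed, pvalue = permutation_pvalue(ligand, receptor, labels, "A", "B", n_perms=200, seed=42)
```

A small p-value here means that few random relabelings of the cells produce a statistic as large as the one observed for the real clustering.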

  • adata (AnnData) – Annotated data object.

  • use_raw (bool) – Whether to access anndata.AnnData.raw.

  • interactions (Union[DataFrame, Mapping[str, Sequence[str]], Sequence[str], Tuple[Sequence[str], Sequence[str]], Sequence[Tuple[str, str]], None]) –

    Interactions to test. The type can be one of:

    • pandas.DataFrame - must contain at least 2 columns named ‘source’ and ‘target’.

    • dict - dictionary with at least 2 keys named ‘source’ and ‘target’.

    • typing.Sequence - either a sequence of str, in which case all combinations are produced, a sequence of tuples of 2 str, or a tuple of 2 sequences.

    If None, the interactions are extracted from omnipath. Protein complexes can be specified by delimiting the components with ‘_’, such as ‘alpha_beta_gamma’.

  • complex_policy (Literal[‘min’, ‘all’]) –

    Policy on how to handle complexes. Valid options are:

    • ‘min’ - select the gene with the minimum average expression. This is the same as in [Efremova et al., 2020].

    • ‘all’ - select all possible combinations between ‘source’ and ‘target’ complexes.

  • interactions_params – Keyword arguments for omnipath.interactions.import_intercell_network() defining the interactions. These datasets from [Türei et al., 2016] are used by default: omnipath, pathwayextra, kinaseextra and ligrecextra.

  • transmitter_params – Keyword arguments for omnipath.interactions.import_intercell_network() defining the transmitter side of intercellular connections.

  • receiver_params – Keyword arguments for omnipath.interactions.import_intercell_network() defining the receiver side of intercellular connections.

  • cluster_key (str) – Key in anndata.AnnData.obs where clustering is stored.

  • clusters – Clusters from anndata.AnnData.obs ['{cluster_key}']. Can be specified either as a sequence of tuples or just a sequence of cluster names, in which case all combinations are considered.

  • n_perms – Number of permutations for the permutation test.

  • threshold (float) – Do not perform the permutation test if any of the interacting components is expressed in less than a threshold fraction of cells within a given cluster (the default of 0.01 corresponds to 1% of cells).

  • seed – Random seed for reproducibility.

  • corr_method (Optional[str]) – Correction method for multiple testing. See statsmodels.stats.multitest.multipletests() for valid options.

  • corr_axis (Literal[‘interactions’, ‘clusters’]) –

    Axis over which to perform the FDR correction. Only used when corr_method != None. Valid options are:

    • ‘interactions’ - correct interactions by performing FDR correction across the clusters.

    • ‘clusters’ - correct clusters by performing FDR correction across the interactions.

  • alpha – Significance level for FDR correction. Only used when corr_method != None.

  • copy (bool) – If True, return the result, otherwise save it to the adata object.

  • key_added (Optional[str]) – Key in anndata.AnnData.uns where the result is stored if copy = False. If None, '{cluster_key}_ligrec' will be used.

  • numba_parallel – Whether to use numba.prange or not. If None, it is determined automatically. For small datasets or a small number of interactions, it’s recommended to set this to False.

  • n_jobs – Number of parallel jobs.

  • backend – Parallelization backend to use. See joblib.Parallel for available options.

  • show_progress_bar – Whether to show the progress bar or not.

  • gene_symbols (Optional[str]) – Key in anndata.AnnData.var to use instead of anndata.AnnData.var_names.
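
The two corr_axis options differ only in the direction along which p-values are pooled before correction. A minimal sketch, assuming a p-value table with one row per interaction and one column per cluster pair: it uses a hand-written Benjamini-Hochberg adjustment as a stand-in for the method selected via corr_method, and omits NaN handling. The helpers bh_adjust and correct are hypothetical and are not squidpy functions.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values for a flat list without NaNs."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0  # also caps adjusted values at 1
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def correct(table, corr_axis):
    """Apply the adjustment row-wise or column-wise to a list-of-rows table."""
    if corr_axis == "interactions":
        # Each interaction (row) is corrected across the clusters.
        return [bh_adjust(row) for row in table]
    if corr_axis == "clusters":
        # Each cluster pair (column) is corrected across the interactions.
        columns = [bh_adjust(list(col)) for col in zip(*table)]
        return [list(row) for row in zip(*columns)]
    raise ValueError(f"Unknown corr_axis: {corr_axis!r}")


# Rows: 3 interactions; columns: 2 cluster pairs (toy values).
pvals = [
    [0.01, 0.04],
    [0.03, 0.02],
    [0.20, 0.50],
]
by_interaction = correct(pvals, "interactions")
by_cluster = correct(pvals, "clusters")
```

With ‘interactions’, each family of tests has as many members as there are cluster pairs; with ‘clusters’, it has as many members as there are interactions, so the same raw p-value can receive a different adjusted value under the two settings.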

Return type

Optional[Mapping[str, DataFrame]]


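If copy = True, returns a dict with the following keys:

  • ‘means’ - pandas.DataFrame containing the mean expression.

  • ‘pvalues’ - pandas.DataFrame containing the possibly corrected p-values.

  • ‘metadata’ - pandas.DataFrame containing interaction metadata.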

Otherwise, modifies the adata object with the following key:

  • anndata.AnnData.uns ['{key_added}'] - the dict described above.

NaN p-values mark combinations for which the mean expression of one of the interacting components was 0, or for which a component was not expressed in enough cells within a given cluster to pass the threshold.
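
The masking rule above can be illustrated with a small sketch. It assumes that a cell "expresses" a gene when its value is greater than 0 and that threshold is compared against the fraction of such cells in the cluster; the function names are hypothetical and this is not squidpy's code.

```python
import math


def fraction_expressed(values):
    """Fraction of cells with non-zero expression."""
    return sum(1 for v in values if v > 0) / len(values)


def is_testable(ligand_in_source, receptor_in_target, threshold=0.01):
    """True if both components have a non-zero mean and pass the threshold."""
    for values in (ligand_in_source, receptor_in_target):
        mean = sum(values) / len(values)
        if mean == 0 or fraction_expressed(values) < threshold:
            return False
    return True


def masked_pvalue(pvalue, ligand_in_source, receptor_in_target, threshold=0.01):
    """Return the p-value, or NaN when the combination is not tested."""
    if is_testable(ligand_in_source, receptor_in_target, threshold):
        return pvalue
    return math.nan


ligand = [0.0, 0.0, 0.0, 2.0]    # expressed in 1 of 4 cells of the source cluster
receptor = [1.0, 3.0, 0.0, 2.0]  # expressed in 3 of 4 cells of the target cluster

kept = masked_pvalue(0.03, ligand, receptor, threshold=0.2)     # both pass
dropped = masked_pvalue(0.03, ligand, receptor, threshold=0.5)  # ligand fails
silent = masked_pvalue(0.03, [0.0, 0.0], receptor)              # mean expression is 0
```

When post-processing results, such NaN entries are best treated as "not tested" rather than as non-significant.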