squidpy.gr.ligrec
- squidpy.gr.ligrec(adata, cluster_key, interactions=None, complex_policy='min', threshold=0.01, corr_method=None, corr_axis='clusters', use_raw=True, copy=False, key_added=None, gene_symbols=None, **kwargs)
Perform the permutation test as described in [Efremova et al., 2020].
- Parameters:
adata (AnnData | SpatialData) – Annotated data object.
use_raw (bool) – Whether to access anndata.AnnData.raw.
interactions (Union[DataFrame, Mapping[str, Sequence[str]], Sequence[str], tuple[Sequence[str], Sequence[str]], Sequence[tuple[str, str]], None]) – Interaction to test. The type can be one of:
pandas.DataFrame - must contain at least 2 columns named ‘source’ and ‘target’.
dict - dictionary with at least 2 keys named ‘source’ and ‘target’.
typing.Sequence - Either a sequence of str, in which case all combinations are produced, or a sequence of tuple of 2 str or a tuple of 2 sequences.
If None, the interactions are extracted from omnipath. Protein complexes can be specified by delimiting the components with ‘_’, such as ‘alpha_beta_gamma’.
complex_policy (Literal['min', 'all']) – Policy on how to handle complexes. Valid options are:
‘min’ - select gene with the minimum average expression. This is the same as in [Efremova et al., 2020].
‘all’ - select all possible combinations between ‘source’ and ‘target’ complexes.
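The interaction forms and complex policies described above can be illustrated with a small standard-library sketch. This is not squidpy's internal code: the helper names (expand_interactions, resolve_min, resolve_all) and the gene names are hypothetical, and reading "all combinations" as every ordered pair, self-pairs included, is an assumption.

```python
from itertools import product


def expand_interactions(interactions):
    """Turn a dict or sequence specification into (source, target) pairs."""
    if isinstance(interactions, dict):
        return list(zip(interactions["source"], interactions["target"]))
    items = list(interactions)
    if items and isinstance(items[0], str):
        # Assumption: "all combinations" means every ordered pair, self-pairs included.
        return list(product(items, repeat=2))
    # Otherwise treat the input as a sequence of (source, target) tuples.
    return [tuple(pair) for pair in items]


def resolve_min(complex_name, mean_expression):
    """'min' policy: keep the component with the lowest average expression."""
    return min(complex_name.split("_"), key=lambda gene: mean_expression[gene])


def resolve_all(source, target):
    """'all' policy: pair every source component with every target component."""
    return list(product(source.split("_"), target.split("_")))


# Hypothetical gene names; '_' delimits the components of a complex.
pairs = expand_interactions(["LIGA", "RECB_RECC"])
chosen = resolve_min("RECB_RECC", {"RECB": 2.0, "RECC": 0.5})
combos = resolve_all("LIGA_LIGD", "RECB_RECC")
```

Under the ‘min’ policy the complex ‘RECB_RECC’ is represented by a single gene, whereas ‘all’ turns one complex-to-complex interaction into several gene-to-gene interactions.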
interactions_params – Keyword arguments for omnipath.interactions.import_intercell_network() defining the interactions. These datasets from [Türei et al., 2016] are used by default: omnipath, pathwayextra, kinaseextra and ligrecextra.
transmitter_params – Keyword arguments for omnipath.interactions.import_intercell_network() defining the transmitter side of intercellular connections.
receiver_params – Keyword arguments for omnipath.interactions.import_intercell_network() defining the receiver side of intercellular connections.
cluster_key (str) – Key in anndata.AnnData.obs where clustering is stored.
clusters – Clusters from anndata.AnnData.obs['{cluster_key}']. Can be specified either as a sequence of tuple or just a sequence of cluster names, in which case all combinations are considered.
n_perms – Number of permutations for the permutation test.
threshold (float) – Do not perform the permutation test if any of the interacting components is expressed in less than threshold percent of cells within a given cluster.
seed – Random seed for reproducibility.
corr_method (Optional[str]) – Correction method for multiple testing. See statsmodels.stats.multitest.multipletests() for valid options.
corr_axis (Literal['interactions', 'clusters']) – Axis over which to perform the FDR correction. Only used when corr_method != None. Valid options are:
‘interactions’ - correct interactions by performing FDR correction across the clusters.
‘clusters’ - correct clusters by performing FDR correction across the interactions.
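The difference between the two corr_axis options can be sketched on a small p-value table. The code below is an illustration only: it uses a hand-written Benjamini-Hochberg adjustment as a stand-in for statsmodels.stats.multitest.multipletests(), and it assumes that rows are interactions, columns are cluster pairs, and NaN entries are left out of the correction. The function names and the numbers are hypothetical.

```python
import math


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values; NaN entries are skipped and stay NaN."""
    valid = [i for i, p in enumerate(pvals) if not math.isnan(p)]
    m = len(valid)
    adjusted = [math.nan] * len(pvals)
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank_from_top, i in enumerate(sorted(valid, key=lambda i: pvals[i], reverse=True)):
        rank = m - rank_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def correct(pvalues, corr_axis):
    """pvalues: list of rows (interactions), each a list over cluster pairs."""
    if corr_axis == "clusters":
        # One correction per cluster-pair column, pooling all interactions.
        columns = [bh_adjust(list(col)) for col in zip(*pvalues)]
        return [list(row) for row in zip(*columns)]
    if corr_axis == "interactions":
        # One correction per interaction row, pooling all cluster pairs.
        return [bh_adjust(row) for row in pvalues]
    raise ValueError(f"Unknown corr_axis: {corr_axis!r}")


table = [
    [0.01, 0.04],
    [0.03, math.nan],
    [0.005, 0.02],
]
by_clusters = correct(table, "clusters")
by_interactions = correct(table, "interactions")
```

The same raw p-value can therefore receive a different adjusted value depending on which family of tests it is pooled with.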
alpha – Significance level for FDR correction. Only used when corr_method != None.
copy (bool) – If True, return the result, otherwise save it to the adata object.
key_added (Optional[str]) – Key in anndata.AnnData.uns where the result is stored if copy = False. If None, '{cluster_key}_ligrec' will be used.
numba_parallel – Whether to use numba.prange or not. If None, it is determined automatically. For small datasets or a small number of interactions, it’s recommended to set this to False.
n_jobs – Number of parallel jobs.
backend – Parallelization backend to use. See joblib.Parallel for available options.
show_progress_bar – Whether to show the progress bar or not.
gene_symbols (Optional[str]) – Key in anndata.AnnData.var to use instead of anndata.AnnData.var_names.
- Return type:
- Returns:
If copy = True, returns a dict with the following keys:
‘means’ - pandas.DataFrame containing the mean expression.
‘pvalues’ - pandas.DataFrame containing the possibly corrected p-values.
‘metadata’ - pandas.DataFrame containing interaction metadata.
Otherwise, modifies the adata object with the following key:
anndata.AnnData.uns['{key_added}'] - the above mentioned dict.
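A call that uses the parameters documented on this page might look like the sketch below. The wrapper names run_ligrec and result_key are hypothetical, the argument values (1000 permutations, the 'fdr_bh' method accepted by statsmodels) are arbitrary choices for illustration, and squidpy is imported inside the function so that the snippet can be loaded without it installed.

```python
def result_key(cluster_key, key_added=None):
    """Key in adata.uns where the result is stored when copy=False."""
    return f"{cluster_key}_ligrec" if key_added is None else key_added


def run_ligrec(adata, cluster_key, interactions):
    """Run the test with copy=True and unpack the three returned tables."""
    import squidpy as sq  # lazy import: only needed when the function is called

    res = sq.gr.ligrec(
        adata,
        cluster_key,
        interactions=interactions,  # e.g. a DataFrame with 'source' and 'target' columns
        n_perms=1000,               # illustrative value
        threshold=0.01,
        corr_method="fdr_bh",       # one of the statsmodels multipletests methods
        corr_axis="clusters",
        use_raw=False,
        copy=True,
    )
    return res["means"], res["pvalues"], res["metadata"]
```

With copy=False the same dict would instead be read back from adata.uns[result_key(cluster_key)], or from adata.uns[key_added] if key_added was given.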
NaN p-values mark combinations for which the mean expression of one of the interacting components was 0, or for which a component was expressed in less than the threshold percentage of cells within a given cluster.
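How a single p-value, or a NaN, can arise is sketched below for one ligand-receptor pair and one pair of clusters. This is a simplified reading of the permutation test, not squidpy's implementation. It assumes that the test statistic is the average of the ligand's mean in the source cluster and the receptor's mean in the target cluster, that the p-value is the share of label permutations reaching at least the observed statistic, and that threshold is compared against the fraction of expressing cells. All names and data are hypothetical.

```python
import math
import random


def permutation_pvalue(ligand, receptor, labels, src, tgt,
                       n_perms=1000, threshold=0.01, seed=0):
    """ligand, receptor: per-cell expression; labels: per-cell cluster labels."""

    def values_in(values, labs, cluster):
        return [v for v, lab in zip(values, labs) if lab == cluster]

    def mean_in(values, labs, cluster):
        selected = values_in(values, labs, cluster)
        return sum(selected) / len(selected)

    def fraction_expressing(values, cluster):
        selected = values_in(values, labels, cluster)
        return sum(v > 0 for v in selected) / len(selected)

    lig_mean = mean_in(ligand, labels, src)
    rec_mean = mean_in(receptor, labels, tgt)
    # NaN when a component has zero mean or is expressed in too few cells.
    if (lig_mean == 0 or rec_mean == 0
            or fraction_expressing(ligand, src) < threshold
            or fraction_expressing(receptor, tgt) < threshold):
        return math.nan

    observed = (lig_mean + rec_mean) / 2
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perms):
        rng.shuffle(shuffled)  # permute cluster labels; cluster sizes are preserved
        permuted = (mean_in(ligand, shuffled, src) + mean_in(receptor, shuffled, tgt)) / 2
        hits += permuted >= observed
    return hits / n_perms


labels = ["a"] * 5 + ["b"] * 5
ligand = [5, 6, 5, 7, 6, 0, 0, 0, 0, 0]
receptor = [0, 0, 0, 0, 0, 4, 5, 4, 6, 5]

p_ab = permutation_pvalue(ligand, receptor, labels, "a", "b")
p_ba = permutation_pvalue(ligand, receptor, labels, "b", "a")  # ligand mean is 0 in "b"
sparse_ligand = [5, 0, 0, 0, 0, 0, 0, 0, 0, 0]
p_sparse = permutation_pvalue(sparse_ligand, receptor, labels, "a", "b", threshold=0.5)
```

Here p_ba and p_sparse are NaN because no permutation test is run for them, while p_ab is a regular p-value.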