%matplotlib inline

Analyze 4i data

This tutorial shows how to apply Squidpy for the analysis of 4i data.

The data used here was obtained from [Gut et al., 2018]. We provide a pre-processed subset of the data, in anndata.AnnData format. For details on how it was pre-processed, please refer to the original paper.

Import packages & data

To run the notebook locally, create a conda environment as conda env create -f environment.yml using this environment.yml <https://github.com/scverse/squidpy_notebooks/blob/main/environment.yml>_.

import squidpy as sq


# load the pre-processed dataset
adata = sq.datasets.four_i()

First, let’s visualize cluster annotation in spatial context with squidpy.pl.spatial_scatter().

sq.pl.spatial_scatter(adata, shape=None, color="cluster", size=1)
WARNING: Please specify a valid `library_id` or set it permanently in `adata.uns['spatial']`

Neighborhood enrichment analysis

Similar to other spatial data, we can investigate spatial organization of clusters in a quantitative way, by computing a neighborhood enrichment score. You can compute such score with the following function: squidpy.gr.nhood_enrichment(). In short, it’s an enrichment score on spatial proximity of clusters: if spots belonging to two different clusters are often close to each other, then they will have a high score and can be defined as being enriched. On the other hand, if they are far apart, the score will be low and they can be defined as depleted. This score is based on a permutation-based test, and you can set the number of permutations with the n_perms argument (default is 1000).

Since the function works on a connectivity matrix, we need to compute that as well. This can be done with squidpy.gr.spatial_neighbors(). Please see Building spatial neighbors graph for more details of how this function works.

Finally, we’ll directly visualize the results with squidpy.pl.nhood_enrichment(). We’ll add a dendrogram to the heatmap computed with linkage method ward.

sq.gr.spatial_neighbors(adata, coord_type="generic")
sq.gr.nhood_enrichment(adata, cluster_key="cluster")
sq.pl.nhood_enrichment(adata, cluster_key="cluster", method="ward", vmin=-100, vmax=100)
100%|██████████| 1000/1000 [00:21<00:00, 45.75/s]

A similar analysis can be performed with squidpy.gr.interaction_matrix(). The function computes the number of shared edges in the neighbor graph between clusters. Please see Compute interaction matrix for more details of how this function works.

sq.gr.interaction_matrix(adata, cluster_key="cluster")
sq.pl.interaction_matrix(adata, cluster_key="cluster", method="ward", vmax=20000)

Additional analyses to gain quantitative understanding of spatial patterning of sub-cellular observations are:

Spatially variable genes with spatial autocorrelation statistics

With Squidpy we can investigate spatial variability of gene expression. This is an example of a function that only supports 2D data. squidpy.gr.spatial_autocorr() conveniently wraps two spatial autocorrelation statistics: Moran’s I and Geary’s C. They provide a score on the degree of spatial variability of gene expression. The statistic as well as the p-value are computed for each gene, and FDR correction is performed. For the purpose of this tutorial, let’s compute the Moran’s I score. See Compute Moran’s I score for more details.

sq.gr.spatial_autocorr(adata, mode="moran")
I pval_norm var_norm pval_norm_fdr_bh
Yap/Taz 0.972975 0.0 0.000001 0.0
CRT 0.958505 0.0 0.000001 0.0
TUBA1A 0.939577 0.0 0.000001 0.0
NUPS 0.915073 0.0 0.000001 0.0
TFRC 0.895786 0.0 0.000001 0.0
HSP60 0.889395 0.0 0.000001 0.0
Actin 0.879217 0.0 0.000001 0.0
CTNNB1 0.876393 0.0 0.000001 0.0
Climp63 0.873853 0.0 0.000001 0.0
VINC 0.862522 0.0 0.000001 0.0

The results are stored in adata.uns['moranI'] and we can visualize selected genes with squidpy.pl.spatial_scatter().

sq.pl.spatial_scatter(adata, shape=None, color="Yap/Taz", size=1)
WARNING: Please specify a valid `library_id` or set it permanently in `adata.uns['spatial']`