%matplotlib inline

Analyze 4i data

This tutorial shows how to use Squidpy to analyze 4i (iterative indirect immunofluorescence imaging) data.

The data used here were obtained from [Gut et al., 2018]. We provide a pre-processed subset of the data in anndata.AnnData format. For details on the pre-processing, please refer to the original paper.

Import packages & data

To run the notebook locally, create a conda environment with conda env create -f environment.yml, using this environment.yml (https://github.com/scverse/squidpy_notebooks/blob/main/environment.yml).

import squidpy as sq

print(f"squidpy=={sq.__version__}")

# load the pre-processed dataset
adata = sq.datasets.four_i()

squidpy==1.2.2

First, let’s visualize the cluster annotation in its spatial context with squidpy.pl.spatial_scatter().

sq.pl.spatial_scatter(adata, shape=None, color="cluster", size=1)

WARNING: Please specify a valid `library_id` or set it permanently in `adata.uns['spatial']`


Neighborhood enrichment analysis

As with other spatial data, we can quantify the spatial organization of clusters by computing a neighborhood enrichment score with squidpy.gr.nhood_enrichment(). The score measures the spatial proximity of clusters: if observations belonging to two different clusters are often close to each other, the pair receives a high score and is considered enriched. If they are far apart, the score is low and the pair is considered depleted. The score is derived from a permutation test, and the number of permutations is set with the n_perms argument (default: 1000).

Because squidpy.gr.nhood_enrichment() operates on a connectivity matrix, we first need to compute one with squidpy.gr.spatial_neighbors(). Please see Building spatial neighbors graph for more details on how this function works.
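To illustrate what a connectivity matrix is, here is a minimal pure-Python sketch that links each observation to its nearest neighbors and stores the result as a symmetric 0/1 matrix. It is not Squidpy's implementation: the helper name knn_connectivity, the toy coordinates, and the decision to symmetrize the graph are assumptions made for illustration only.

```python
import math


def knn_connectivity(coords, n_neighs):
    """Build a symmetric 0/1 connectivity matrix from k nearest neighbors."""
    n = len(coords)
    adjacency = [[0] * n for _ in range(n)]
    for i, point in enumerate(coords):
        # Rank all other observations by Euclidean distance to observation i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(point, coords[j]),
        )
        for j in others[:n_neighs]:
            adjacency[i][j] = 1
            adjacency[j][i] = 1  # symmetrize (a choice made in this sketch)
    return adjacency


# Hypothetical 2D coordinates: a group of three points and a distant pair.
toy_coords = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (10.0, 0.0), (11.0, 0.0)]
adjacency = knn_connectivity(toy_coords, n_neighs=1)

for row in adjacency:
    print(row)
```

The two groups end up disconnected from each other, which is exactly the kind of structure that neighborhood statistics computed on the graph can pick up.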

Finally, we’ll directly visualize the results with squidpy.pl.nhood_enrichment(). We’ll add a dendrogram, computed with the ward linkage method, to the heatmap.

sq.gr.spatial_neighbors(adata, coord_type="generic")
sq.gr.nhood_enrichment(adata, cluster_key="cluster")
sq.pl.nhood_enrichment(adata, cluster_key="cluster", method="ward", vmin=-100, vmax=100)

100%|██████████| 1000/1000 [00:21<00:00, 45.75/s]
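To make the permutation idea behind the enrichment score concrete, the following is a minimal pure-Python sketch, not Squidpy's implementation. It counts the edges between cluster pairs in a toy neighbor graph, shuffles the cluster labels to build a null distribution, and reports a z-score for each cluster pair. The helper names (count_pairs, nhood_enrichment_sketch), the chain-shaped toy graph, and the choice of 200 permutations are all assumptions made for illustration.

```python
import random
import statistics


def count_pairs(edges, labels, n_clusters):
    """Count undirected edges between every pair of cluster labels."""
    counts = [[0] * n_clusters for _ in range(n_clusters)]
    for i, j in edges:
        a, b = labels[i], labels[j]
        counts[a][b] += 1
        if a != b:
            counts[b][a] += 1  # keep the matrix symmetric
    return counts


def nhood_enrichment_sketch(edges, labels, n_clusters, n_perms=200, seed=0):
    """Z-score of observed pair counts against label permutations."""
    rng = random.Random(seed)
    observed = count_pairs(edges, labels, n_clusters)

    # Null distribution: same graph, cluster labels shuffled across nodes.
    shuffled = list(labels)
    permuted = []
    for _ in range(n_perms):
        rng.shuffle(shuffled)
        permuted.append(count_pairs(edges, shuffled, n_clusters))

    zscores = [[0.0] * n_clusters for _ in range(n_clusters)]
    for a in range(n_clusters):
        for b in range(n_clusters):
            null = [p[a][b] for p in permuted]
            mean = statistics.fmean(null)
            std = statistics.pstdev(null)
            zscores[a][b] = (observed[a][b] - mean) / std if std > 0 else 0.0
    return zscores


# Toy graph (hypothetical): 12 observations on a line, each linked to the next.
toy_edges = [(i, i + 1) for i in range(11)]
# The first six observations form cluster 0, the last six form cluster 1.
toy_labels = [0] * 6 + [1] * 6

z = nhood_enrichment_sketch(toy_edges, toy_labels, n_clusters=2)
print("within cluster 0:", round(z[0][0], 2))
print("between clusters:", round(z[0][1], 2))
```

In this toy layout the two clusters touch at a single edge, so the within-cluster score is positive (enriched) and the between-cluster score is negative (depleted).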


A similar analysis can be performed with squidpy.gr.interaction_matrix(). The function computes the number of edges that clusters share in the neighbor graph. Please see Compute interaction matrix for more details on how this function works.

sq.gr.interaction_matrix(adata, cluster_key="cluster")
sq.pl.interaction_matrix(adata, cluster_key="cluster", method="ward", vmax=20000)
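The edge counting behind an interaction matrix can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than Squidpy's code: the function name interaction_matrix_sketch and the toy graph are hypothetical, and the sketch visits each undirected edge in both directions (as one would when iterating over the nonzero entries of a symmetric adjacency matrix), so within-cluster edges contribute twice to the diagonal.

```python
def interaction_matrix_sketch(edges, labels, n_clusters):
    """Count edges shared between clusters, visiting each edge in both directions."""
    counts = [[0] * n_clusters for _ in range(n_clusters)]
    for i, j in edges:
        counts[labels[i]][labels[j]] += 1  # direction i -> j
        counts[labels[j]][labels[i]] += 1  # direction j -> i
    return counts


# Hypothetical toy graph: five observations on a line, each linked to the next.
chain_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
# Observations 0-2 belong to cluster 0, observations 3-4 to cluster 1.
chain_labels = [0, 0, 0, 1, 1]

counts = interaction_matrix_sketch(chain_edges, chain_labels, n_clusters=2)
for row in counts:
    print(row)
```

Unlike the enrichment score, these are raw counts with no permutation baseline, so large clusters tend to dominate the matrix.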


Additional analyses that give a quantitative understanding of the spatial patterning of sub-cellular observations are:

Ripley’s statistics (see Compute Ripley’s statistics).
Co-occurrence score (see Compute co-occurrence probability).

Spatially variable genes with spatial autocorrelation statistics

With Squidpy we can investigate the spatial variability of gene expression. squidpy.gr.spatial_autocorr() conveniently wraps two spatial autocorrelation statistics, Moran’s I and Geary’s C, and is an example of a function that only supports 2D data. Both statistics score the degree of spatial variability of gene expression. The statistic and its p-value are computed for each gene, and FDR correction is applied. For the purpose of this tutorial, let’s compute the Moran’s I score. See Compute Moran’s I score for more details.

adata.var_names_make_unique()
sq.gr.spatial_autocorr(adata, mode="moran")
adata.uns["moranI"].head(10)

	I	var_norm
Yap/Taz	0.972975	0.000001
CRT	0.958505	0.000001
TUBA1A	0.939577	0.000001
NUPS	0.915073	0.000001
TFRC	0.895786	0.000001
HSP60	0.889395	0.000001
Actin	0.879217	0.000001
CTNNB1	0.876393	0.000001
Climp63	0.873853	0.000001
VINC	0.862522	0.000001
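To give some intuition for the I column above, here is a minimal pure-Python sketch of the textbook Moran's I formula on a toy one-dimensional neighbor graph. It is not Squidpy's implementation: the function names morans_i and chain_weights are hypothetical, and the sketch uses plain binary weights, whereas a library may transform or normalize the weights before computing the statistic.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a dict {(i, j): w_ij} of weights.

    I = (N / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2,
    where m is the mean of the values and W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(weights.values())
    cross = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    variance_sum = sum(d * d for d in dev)
    return (n * cross) / (w_total * variance_sum)


def chain_weights(n):
    """Binary weights linking each position on a line to its two neighbors."""
    weights = {}
    for i in range(n - 1):
        weights[(i, i + 1)] = 1.0
        weights[(i + 1, i)] = 1.0
    return weights


w = chain_weights(6)
smooth = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]          # neighbors have similar values
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # neighbors have opposite values

i_smooth = morans_i(smooth, w)
i_alternating = morans_i(alternating, w)
print("smooth gradient:", i_smooth)
print("alternating pattern:", i_alternating)
```

The smooth gradient yields a positive value and the alternating pattern a negative one. Values close to 1, like those at the top of the table, therefore indicate features whose intensity varies smoothly across neighboring observations.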

The results are stored in adata.uns['moranI'], and we can visualize selected genes, here the top-ranked Yap/Taz, with squidpy.pl.spatial_scatter().

sq.pl.spatial_scatter(adata, shape=None, color="Yap/Taz", size=1)

WARNING: Please specify a valid `library_id` or set it permanently in `adata.uns['spatial']`
