Launch binder

Analyze Imaging Mass Cytometry data

This tutorial shows how to apply Squidpy to Imaging Mass Cytometry data.

The data used here comes from a recent paper from [Jackson et al., 2020]. We provide a pre-processed subset of the data, in anndata.AnnData format. For details on how it was pre-processed, please refer to the original paper.

See also

See Analyze seqFISH data for additional analysis examples.

Import packages & data

To run the notebook locally, create a conda environment as conda env create -f environment.yml using this environment.yml

import scanpy as sc
import squidpy as sq

sc.logging.print_header()
print(f"squidpy=={sq.__version__}")

# load the pre-processed dataset
adata = sq.datasets.imc()

Out:

scanpy==1.7.1 anndata==0.7.5 umap==0.5.1 numpy==1.20.2 scipy==1.6.2 pandas==1.2.3 scikit-learn==0.24.1 statsmodels==0.12.2 python-igraph==0.9.1 leidenalg==0.8.3
squidpy==1.0.0

  0%|          | 0.00/1.50M [00:00<?, ?B/s]
  3%|3         | 48.0k/1.50M [00:00<00:06, 250kB/s]
 10%|9         | 152k/1.50M [00:00<00:03, 419kB/s]
 19%|#8        | 288k/1.50M [00:00<00:02, 572kB/s]
 70%|#######   | 1.05M/1.50M [00:00<00:00, 2.39MB/s]
100%|##########| 1.50M/1.50M [00:00<00:00, 2.26MB/s]

First, let’s visualize the cluster annotation in spatial context with scanpy.pl.spatial().

sc.pl.spatial(adata, color="cell type", spot_size=10)
cell type

We can appreciate how the majority of the tissue seems to consist of apoptotic tumor cells. There also seem to be other cell types scattered across the tissue, annotated as T cells, Macrophages and different types of Stromal cells. We can also appreciate how a subset of tumor cell, basal CK tumor cells seems to be located in the lower part of the tissue.

Co-occurrence across spatial dimensions

We can visualize cluster co-occurrence in spatial dimensions using the original spatial coordinates. The co-occurrence score is defined as:

\[\frac{p(exp|cond)}{p(exp)}\]

where \(p(exp|cond)\) is the conditional probability of observing a cluster \(exp\) conditioned on the presence of a cluster \(cond\), whereas \(p(exp)\) is the probability of observing \(exp\) in the radius size of interest. The score is computed across increasing radii size around each cell in the tissue.

We can compute this score with squidpy.gr.co_occurrence() and set the cluster annotation for the conditional probability with the argument clusters. Then, we visualize the results with squidpy.pl.co_occurrence(). We visualize the result for two conditional groups, namely basal CK tumor cell and T cells.

sq.gr.co_occurrence(adata, cluster_key="cell type")
sq.pl.co_occurrence(
    adata,
    cluster_key="cell type",
    clusters=["basal CK tumor cell", "T cells"],
    figsize=(15, 4),
)
$\frac{p(exp|T cells)}{p(exp)}$, $\frac{p(exp|basal CK tumor cell)}{p(exp)}$

Out:

  0%|          | 0/1 [00:00<?, ?/s]
/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.8/site-packages/seaborn/cm.py:1582: UserWarning: Trying to register the cmap 'rocket' which already exists.
  mpl_cm.register_cmap(_name, _cmap)
/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.8/site-packages/seaborn/cm.py:1583: UserWarning: Trying to register the cmap 'rocket_r' which already exists.
  mpl_cm.register_cmap(_name + "_r", _cmap_r)
/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.8/site-packages/seaborn/cm.py:1582: UserWarning: Trying to register the cmap 'mako' which already exists.
  mpl_cm.register_cmap(_name, _cmap)
/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.8/site-packages/seaborn/cm.py:1583: UserWarning: Trying to register the cmap 'mako_r' which already exists.
  mpl_cm.register_cmap(_name + "_r", _cmap_r)
/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.8/site-packages/seaborn/cm.py:1582: UserWarning: Trying to register the cmap 'icefire' which already exists.
  mpl_cm.register_cmap(_name, _cmap)
/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.8/site-packages/seaborn/cm.py:1583: UserWarning: Trying to register the cmap 'icefire_r' which already exists.
  mpl_cm.register_cmap(_name + "_r", _cmap_r)
/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.8/site-packages/seaborn/cm.py:1582: UserWarning: Trying to register the cmap 'vlag' which already exists.
  mpl_cm.register_cmap(_name, _cmap)
/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.8/site-packages/seaborn/cm.py:1583: UserWarning: Trying to register the cmap 'vlag_r' which already exists.
  mpl_cm.register_cmap(_name + "_r", _cmap_r)
/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.8/site-packages/seaborn/cm.py:1582: UserWarning: Trying to register the cmap 'flare' which already exists.
  mpl_cm.register_cmap(_name, _cmap)
/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.8/site-packages/seaborn/cm.py:1583: UserWarning: Trying to register the cmap 'flare_r' which already exists.
  mpl_cm.register_cmap(_name + "_r", _cmap_r)
/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.8/site-packages/seaborn/cm.py:1582: UserWarning: Trying to register the cmap 'crest' which already exists.
  mpl_cm.register_cmap(_name, _cmap)
/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.8/site-packages/seaborn/cm.py:1583: UserWarning: Trying to register the cmap 'crest_r' which already exists.
  mpl_cm.register_cmap(_name + "_r", _cmap_r)

We can observe that T cells seems to co-occur with endothelial and vimentin hi stromal cells, whereas basal CK tumor cell seem to largely cluster together, except for the presence of a type of stromal cells (small elongated stromal cell) at close distance.

Neighborhood enrichment

A similar analysis that can inform on the neighbor structure of the tissue is the neighborhood enrichment test. You can compute such score with the following function: squidpy.gr.nhood_enrichment(). In short, it’s an enrichment score on spatial proximity of clusters: if spots belonging to two different clusters are often close to each other, then they will have a high score and can be defined as being enriched. On the other hand, if they are far apart, the score will be low and they can be defined as depleted. This score is based on a permutation-based test, and you can set the number of permutations with the n_perms argument (default is 1000).

Since the function works on a connectivity matrix, we need to compute that as well. This can be done with squidpy.gr.spatial_neighbors(). Please see Building spatial neighbors graph for more details of how this function works.

Finally, we visualize the results with squidpy.pl.nhood_enrichment().

sq.gr.spatial_neighbors(adata)
sq.gr.nhood_enrichment(adata, cluster_key="cell type")
sq.pl.nhood_enrichment(adata, cluster_key="cell type")
Neighborhood enrichment

Out:

  0%|          | 0/1000 [00:00<?, ?/s]
/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.8/site-packages/pandas/core/arrays/categorical.py:2487: FutureWarning: The `inplace` parameter in pandas.Categorical.remove_unused_categories is deprecated and will be removed in a future version.
  res = method(*args, **kwargs)

Interestingly, T cells shows an enrichment with stromal and endothelial cells, as well as macrophages. Another interesting result is that apoptotic tumor cells, being uniformly spread across the tissue area, show a neighbor depletion against any other cluster (but a strong enrichment for itself). This is a correct interpretation from a permutation based approach, because the cluster annotation, being uniformly spread across the tissue, and in high number, it’s more likely to be enriched with cell types from the same class, rather than different one.

Interaction matrix and network centralities

Squidpy provides other descriptive statistics of the spatial graph. For instance, the interaction matrix, which counts the number of edges that each cluster share with all the others. This score can be computed with the function squidpy.gr.interaction_matrix(). We can visualize the results with squidpy.pl.interaction_matrix().

sq.gr.interaction_matrix(adata, cluster_key="cell type")
sq.pl.interaction_matrix(adata, cluster_key="cell type")
Interaction matrix

Out:

/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.8/site-packages/pandas/core/arrays/categorical.py:2487: FutureWarning: The `inplace` parameter in pandas.Categorical.remove_unused_categories is deprecated and will be removed in a future version.
  res = method(*args, **kwargs)

Finally, similar to the previous analysis, we can investigate properties of the spatial graph by computing different network centralities:

  • degree_centrality

  • average_clustering

  • closeness_centrality

Squidpy provides a convenient function for all of them: squidpy.gr.centrality_scores() and squidpy.pl.centrality_scores() for visualization.

sq.gr.centrality_scores(
    adata,
    cluster_key="cell type",
)
sq.pl.centrality_scores(adata, cluster_key="cell type", figsize=(20, 5), s=500)
Average clustering, Closeness centrality, Degree centrality

You can familiarize yourself with network centralities from the excellent networkx documentation . For the purpose of this analysis, we can appreciate that the apoptotic tumor cell clusters shows high closeness centrality, indicating that nodes belonging to that group are often close to each other in the spatial graph.

Total running time of the script: ( 0 minutes 44.291 seconds)

Estimated memory usage: 182 MB