Analyze 4i data

This tutorial shows how to apply Squidpy for the analysis of 4i data.

We provide a pre-processed subset of a published 4i dataset in `anndata.AnnData` format. For details on how it was pre-processed, please refer to the original paper.

See Analyze Imaging Mass Cytometry data for additional analysis examples.

Import packages & data

To run the notebook locally, create a conda environment with `conda env create -f environment.yml`, using this environment.yml.

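```
import scanpy as sc
import squidpy as sq

# print the versions of Scanpy and its dependencies, then of Squidpy
sc.logging.print_header()
print(f"squidpy=={sq.__version__}")

# load the pre-processed 4i dataset (downloaded on first use)
adata = sq.datasets.four_i()
```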

Out:

```
scanpy==1.9.1 anndata==0.8.0 umap==0.5.3 numpy==1.21.6 scipy==1.8.0 pandas==1.4.2 scikit-learn==1.1.0 statsmodels==0.13.2 python-igraph==0.9.10 pynndescent==0.5.7
squidpy==1.2.1

100%|##########| 173M/173M [00:04<00:00, 38.9MB/s]
/home/runner/work/squidpy_notebooks/squidpy_notebooks/.tox/docs/lib/python3.9/site-packages/anndata/_core/anndata.py:1830: UserWarning: Variable names are not unique. To make them unique, call `.var_names_make_unique`.
utils.warn_names_duplicates("var")
```

First, let’s visualize cluster annotation in spatial context with `scanpy.pl.spatial()`.

```
sc.pl.spatial(adata, color="cluster", spot_size=1)
```

Neighborhood enrichment analysis

Similar to other spatial data, we can investigate the spatial organization of clusters quantitatively by computing a neighborhood enrichment score with `squidpy.gr.nhood_enrichment()`. In short, it is an enrichment score on the spatial proximity of clusters: if spots belonging to two different clusters are often close to each other, the pair gets a high score and can be described as enriched. If they are far apart, the score is low and the pair can be described as depleted. The score is based on a permutation test, and you can set the number of permutations with the `n_perms` argument (default is 1000).

Since the function works on a connectivity matrix, we need to compute that as well. This can be done with `squidpy.gr.spatial_neighbors()`. Please see Building spatial neighbors graph for more details on how this function works.
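To give a feel for what a connectivity matrix built from generic coordinates looks like, here is a minimal pure-Python sketch of a k-nearest-neighbor graph. It is an illustration only and not Squidpy's implementation: the helper `knn_graph`, the toy coordinates, and the choice to symmetrize the matrix are all assumptions made for this example.

```python
import math


def knn_graph(coords, k):
    """Build a symmetric 0/1 adjacency matrix from point coordinates."""
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        # Rank every other point by its Euclidean distance to point i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )
        for j in others[:k]:
            adj[i][j] = 1
            adj[j][i] = 1  # symmetrize (an illustrative choice)
    return adj


# Hypothetical centroids: three cells in one group, two in another.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0), (6.0, 5.0)]
adj = knn_graph(coords, k=1)
for row in adj:
    print(row)
```

With `k=1`, the two groups of points end up as separate connected components, which is the kind of structure the graph-based statistics below operate on.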

Finally, we’ll visualize the results directly with `squidpy.pl.nhood_enrichment()`, adding a dendrogram to the heatmap computed with the `ward` linkage method.

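```
sq.gr.spatial_neighbors(adata, coord_type="generic")
sq.gr.nhood_enrichment(adata, cluster_key="cluster")
sq.pl.nhood_enrichment(adata, cluster_key="cluster", method="ward")
```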

Out:

```
100%|##########| 1000/1000 [00:21<00:00, 46.73/s]
```
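To make the permutation-based neighborhood enrichment score concrete, the following toy sketch counts edges between two clusters, shuffles the cluster labels to build a null distribution, and reports a z-score. It is a simplified illustration, not Squidpy's implementation: the helpers `count_pairs` and `enrichment_zscore`, the chain-shaped graph, and the labels are hypothetical.

```python
import random
import statistics


def count_pairs(edges, labels, a, b):
    """Count edges joining a node labeled `a` with a node labeled `b`."""
    return sum(1 for i, j in edges if {labels[i], labels[j]} == {a, b})


def enrichment_zscore(edges, labels, a, b, n_perms=1000, seed=0):
    """Z-score of the observed edge count against label permutations."""
    rng = random.Random(seed)
    observed = count_pairs(edges, labels, a, b)
    shuffled = list(labels)
    null = []
    for _ in range(n_perms):
        rng.shuffle(shuffled)  # break the link between label and position
        null.append(count_pairs(edges, shuffled, a, b))
    # Assumes the null counts are not all identical (non-zero spread).
    return (observed - statistics.mean(null)) / statistics.pstdev(null)


# Toy graph: a chain of 12 nodes, first half cluster "A", second half "B".
edges = [(i, i + 1) for i in range(11)]
labels = ["A"] * 6 + ["B"] * 6

z_same = enrichment_zscore(edges, labels, "A", "A")
z_cross = enrichment_zscore(edges, labels, "A", "B")
print(f"A-A z-score: {z_same:.2f}")
print(f"A-B z-score: {z_cross:.2f}")
```

Because the two clusters occupy opposite ends of the chain, the A-A pair scores positive (enriched) and the A-B pair scores negative (depleted).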

A similar analysis can be performed with `squidpy.gr.interaction_matrix()`. The function computes the number of edges that the neighbor graph places between each pair of clusters. Please see Compute interaction matrix for more details on how this function works.

```
sq.gr.interaction_matrix(adata, cluster_key="cluster")
```
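As a rough picture of the quantity involved, the sketch below tallies edges per unordered pair of clusters on a tiny hypothetical graph. The helper `interaction_counts` and the data are assumptions for illustration; how Squidpy lays out or normalizes its matrix is not covered here.

```python
from collections import Counter


def interaction_counts(edges, labels):
    """Count graph edges for every unordered pair of cluster labels."""
    counts = Counter()
    for i, j in edges:
        # Sort the two labels so that (A, B) and (B, A) share one key.
        pair = tuple(sorted((labels[i], labels[j])))
        counts[pair] += 1
    return counts


# Hypothetical neighbor graph on five cells with two clusters.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2)]
labels = ["A", "A", "B", "B", "A"]

counts = interaction_counts(edges, labels)
for pair, n_edges in sorted(counts.items()):
    print(pair, n_edges)
```

Unlike the enrichment score, these are raw counts with no permutation baseline, so large clusters tend to have large counts simply because they have more edges.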

Additional analyses that give a quantitative understanding of the spatial patterning of sub-cellular observations are:

- Compute Ripley’s statistics for Ripley’s statistics.
- Compute co-occurrence probability for the co-occurrence score.

Spatially variable genes with spatial autocorrelation statistics

With Squidpy we can investigate the spatial variability of gene expression. `squidpy.gr.spatial_autocorr()` conveniently wraps two spatial autocorrelation statistics: Moran’s I and Geary’s C. Both score the degree of spatial variability of gene expression. Note that this is an example of a function that only supports 2D data. The statistic and its p-value are computed for each gene, and FDR correction is performed. For this tutorial, let’s compute the Moran’s I score. See Compute Moran’s I score for more details.
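For intuition about what Moran’s I measures, here is a small pure-Python sketch of the textbook formula, I = (N / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The helpers `morans_i` and `chain_weights` and the toy values are hypothetical; this is not Squidpy's implementation and it computes no p-values.

```python
def chain_weights(n):
    """Symmetric 0/1 weights linking consecutive nodes of a chain."""
    w = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = w[i + 1][i] = 1
    return w


def morans_i(values, weights):
    """Moran's I for one feature given a spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    # Cross-products of deviations, weighted by spatial adjacency.
    num = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    den = sum(d * d for d in dev)
    return (n / w_total) * num / den


weights = chain_weights(6)
smooth = morans_i([1, 2, 3, 4, 5, 6], weights)       # gradual gradient
alternating = morans_i([1, 6, 1, 6, 1, 6], weights)  # checkerboard-like
print(f"smooth: {smooth:.3f}, alternating: {alternating:.3f}")
```

On this chain the smooth gradient gives a value of about 0.6 and the alternating pattern about -1: neighbors with similar values push the statistic up, while neighbors with opposite values push it down.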

```
adata.var_names_make_unique()
sq.gr.spatial_autocorr(adata, mode="moran")
adata.uns["moranI"].head(10)
```
|  | I | pval_norm | var_norm | pval_norm_fdr_bh |
| --- | --- | --- | --- | --- |
| Yap/Taz | 0.972969 | 0.0 | 0.000001 | 0.0 |
| CRT | 0.958588 | 0.0 | 0.000001 | 0.0 |
| TUBA1A | 0.939611 | 0.0 | 0.000001 | 0.0 |
| NUPS | 0.915056 | 0.0 | 0.000001 | 0.0 |
| TFRC | 0.895769 | 0.0 | 0.000001 | 0.0 |
| HSP60 | 0.889343 | 0.0 | 0.000001 | 0.0 |
| Actin | 0.879215 | 0.0 | 0.000001 | 0.0 |
| CTNNB1 | 0.876511 | 0.0 | 0.000001 | 0.0 |
| Climp63 | 0.873844 | 0.0 | 0.000001 | 0.0 |
| VINC | 0.862487 | 0.0 | 0.000001 | 0.0 |
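The `_fdr_bh` suffix of the last column suggests a Benjamini-Hochberg adjustment of the p-values. As a hedged illustration of that procedure (not the code Squidpy runs), the sketch below adjusts a short list of made-up p-values; the helper `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four features.
pvals = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(pvals)
print([round(p, 4) for p in adjusted])
```

For these inputs the adjusted values are approximately 0.02, 0.04, 0.04 and 0.02. Each raw p-value is scaled by the number of tests divided by its rank, then capped by the adjusted value of the next larger p-value.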

The results are stored in `adata.uns["moranI"]`, and we can visualize selected genes with `scanpy.pl.spatial()`.

```
sc.pl.spatial(adata, color="Yap/Taz", spot_size=1)
```
