分享

Seurat_V5|单细胞转录组 + 蛋白,WNN方法分析单细胞多模态数据

 生信补给站 2024-03-21 发布于北京

使用Stuart*, Butler* et al, Cell 2019提供的CITEseq数据作为示例,含有30,672 scRNA-seq 和 25 antibodies数据。

一 载入R包,数据

使用SeuratData中的bmcite数据示例,展示CITEseq数据中的单细胞转录组和蛋白数据的结合 。

如果代码直接安装bmcite有问题的小伙伴可以在网页中输入http://seurat./src/contrib/bmcite.SeuratData_0.3.0.tar.gz 直接下载tar文件,然后本地安装。

library(Seurat)library(SeuratData)library(cowplot)library(dplyr)
#InstallData("bmcite")install.packages('./bmcite.SeuratData_0.3.0.tar.gz', repos = NULL, type = "source")bm <- LoadData(ds = "bmcite")bm
An object of class Seurat 17034 features across 30672 samples within 2 assays Active assay: RNA (17009 features, 2000 variable features) 2 layers present: counts, data 1 other assay present: ADT 1 dimensional reduction calculated: spca

可以看到对象中有RNA 和 ADT 2个assay。确认下RNA和ADT的cell barcode 是否一致 

二 WNN 模态数据处理

1,RNA标准分析

先指定默认assay为RNA,然后进行至PCA的标准分析

DefaultAssay(bm) <- 'RNA'bm <- NormalizeData(bm) %>%   FindVariableFeatures() %>%   ScaleData() %>%   RunPCA()bm
An object of class Seurat 17034 features across 30672 samples within 2 assays Active assay: RNA (17009 features, 2000 variable features) 3 layers present: counts, data, scale.data 1 other assay present: ADT 2 dimensional reductions calculated: spca, pca
2, ADT标准分析指定默认assay为ADT,然后标准分析
DefaultAssay(bm) <- 'ADT'# we will use all ADT features for dimensional reduction# we set a dimensional reduction name to avoid overwriting the VariableFeatures(bm) <- rownames(bm[["ADT"]])bm <- NormalizeData(bm, normalization.method = 'CLR', margin = 2) %>%   ScaleData() %>%   RunPCA(reduction.name = 'apca')bm
An object of class Seurat 17034 features across 30672 samples within 2 assays Active assay: ADT (25 features, 25 variable features) 3 layers present: counts, data, scale.data 1 other assay present: RNA 3 dimensional reductions calculated: spca, pca, apca

注:这里使用所有的ADT特征进行降维,此外注意设置reduction.name,以避免覆盖 。
3,WNN

对于每个细胞,我们根据RNA和蛋白质相似性的加权组合识别数据集中的多模态邻居,并将结果存储在neighbors插槽中,注意reduction.list中的pca 和 apca 要和前面单独分析时定义的名字一致

# Identify multimodal neighbors. These will be stored in the neighbors slot, # and can be accessed using bm[['weighted.nn']]# The WNN graph can be accessed at bm[["wknn"]], # and the SNN graph used for clustering at bm[["wsnn"]]# Cell-specific modality weights can be accessed at bm$RNA.weightbm <- FindMultiModalNeighbors(  bm,   reduction.list = list("pca", "apca"),   dims.list = list(1:30, 1:18),   modality.weight.name = "RNA.weight")bm


这里修改modality.weight.name 可能不起作用,通过View(FindMultiModalNeighbors) 查看函数内容发现名字是和assay有关的,没啥影响。


然后就可以进行后续的降维 聚类分析了。
4,降维聚类

上述WNN方法完成数据合并后,分别可以基于(1)转录组 (2)蛋白组 (3)以及结合权重的数据 进行UMAP 可视化分析。(1)基于 RNA + protein 的结合数据
注意这里不是assay,而是nn.name 参数指定,当然要注意reduction.name 。

bm <- RunUMAP(bm, nn.name = "weighted.nn",               reduction.name = "wnn.umap",               reduction.key = "wnnUMAP_")bm <- FindClusters(bm,                    graph.name = "wsnn",                    algorithm = 3,                    resolution = 2, verbose = FALSE)bm
An object of class Seurat 17034 features across 30672 samples within 2 assays Active assay: ADT (25 features, 25 variable features) 3 layers present: counts, data, scale.data 1 other assay present: RNA 4 dimensional reductions calculated: spca, pca, apca, wnn.umap
p1 <- DimPlot(bm, reduction = 'wnn.umap',               label = TRUE,               repel = TRUE, label.size = 2.5) + NoLegend()p2 <- DimPlot(bm, reduction = 'wnn.umap',               group.by = 'celltype.l2',               label = TRUE, repel = TRUE,               label.size = 2.5) + NoLegend()p1 + p2



(2)单独基于 RNA 或 protein 数据
bm <- RunUMAP(bm, reduction = 'pca', dims = 1:30, assay = 'RNA'              reduction.name = 'rna.umap', reduction.key = 'rnaUMAP_')p3 <- DimPlot(bm, reduction = 'rna.umap', group.by = 'celltype.l2', label = TRUE,               repel = TRUE, label.size = 2.5) + NoLegend()              bm <- RunUMAP(bm, reduction = 'apca', dims = 1:18, assay = 'ADT',               reduction.name = 'adt.umap', reduction.key = 'adtUMAP_')p4 <- DimPlot(bm, reduction = 'adt.umap'group.by = 'celltype.l2', label = TRUE,               repel = TRUE, label.size = 2.5) + NoLegend()p3 + p4


(3)经典marker 可视化
查看典型marker基因和蛋白在多模态整合后的UMAP上的表达,为注释提供参考
p5 <- FeaturePlot(bm, features = c("adt_CD45RA","adt_CD16","adt_CD161"),                  reduction = 'wnn.umap', max.cutoff = 2,                   cols = c("lightgrey","darkgreen"), ncol = 3)p6 <- FeaturePlot(bm, features = c("rna_TRDC","rna_MPO","rna_AVP"),                   reduction = 'wnn.umap', max.cutoff = 3, ncol = 3)p5 / p6

ADT中含有的蛋白数量和种类不一致,综合考虑一些经典marker的蛋白 和 基因,可以更好的进行注释 以及 对注释进行验证。

◆ ◆ ◆  ◆ 

精心整理(含图PLUS版)|R语言生信分析,可视化(R统计,ggplot2绘图,生信图形可视化汇总)

RNAseq纯生信挖掘思路分享?不,主要是送你代码!(建议收藏)

    转藏 分享 献花(0

    0条评论

    发表

    请遵守用户 评论公约

    类似文章