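Preparation

Determining the number of PCs, following the official tutorial (https:///seurat/v3.1/pbmc3k_tutorial.html)

```r
library(Seurat)

# Load the saved workspace; it provides the Seurat object `data.filt`
load(file = 'Cluster_seurat.Rdata')
seurat_data <- data.filt
```
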
Method 1: the DimHeatmap function

```r
# Explore heatmaps of PCs 1 to 6
DimHeatmap(seurat_data, dims = 1:6, cells = 500, balanced = TRUE)
```

![](http://image109.360doc.com/DownloadImg/2021/07/1515/226326808_2_20210715033438100_wm)

```r
# Heatmaps of PCs 7 to 12
DimHeatmap(seurat_data, dims = 7:12, cells = 500, balanced = TRUE)
```

![](http://image109.360doc.com/DownloadImg/2021/07/1515/226326808_3_20210715033438428_wm)

Method 2: the ElbowPlot function

```r
# Plot the elbow plot for the first 30 PCs
ElbowPlot(object = seurat_data, ndims = 30)
```

![](http://image109.360doc.com/DownloadImg/2021/07/1515/226326808_4_20210715033438725_wm)

Method 3: the JackStrawPlot function

```r
# Note: JackStraw is very slow
seurat_data <- JackStraw(object = seurat_data, dims = 50)
seurat_data <- ScoreJackStraw(seurat_data, dims = 1:50)
JackStrawPlot(object = seurat_data, dims = 1:50)
```

![](http://image109.360doc.com/DownloadImg/2021/07/1515/226326808_5_20210715033438944_wm)
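
The JackStraw plot reports a p-value for each PC in its legend. As a rough sketch of how those values could be turned into a number, the helper below counts the leading PCs whose p-value stays under a threshold. The function name `n_significant_pcs`, the cutoff `alpha = 0.05`, the "stop at the first non-significant PC" rule, and the toy p-values are all assumptions made for illustration; they are not part of Seurat or of the original tutorial.

```r
# Hypothetical helper (not a Seurat function): count the leading PCs whose
# JackStraw p-value is below `alpha`, stopping at the first PC that is not.
n_significant_pcs <- function(p_values, alpha = 0.05) {
  not_sig <- which(p_values >= alpha)
  if (length(not_sig) == 0) {
    return(length(p_values))  # every PC passed the threshold
  }
  not_sig[1] - 1
}

# Invented p-values for seven PCs, for illustration only
toy_p <- c(1e-50, 1e-20, 1e-8, 0.001, 0.2, 0.01, 0.6)
n_sig <- n_significant_pcs(toy_p)
print(n_sig)
```

In a real analysis the p-values would come from the scored JackStraw result, for example by reading them off the plot legend. Like the plot itself, this only narrows the range and does not settle the final choice.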
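
The three methods above only give a rough range for the number of PCs. Because the clustering result changes considerably with the number of PCs chosen, a more specific number is needed. The authors propose three criteria for setting the PC threshold:

- The cumulative contribution of the PCs is greater than 90%
- The PC itself contributes less than 5% of the variation
- The difference between two consecutive PCs is less than 0.1%
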
```r
library(ggplot2)  # needed for the plot at the end

# Percent of variation associated with each PC
# (computed from the PC standard deviations)
stdev <- seurat_data[["pca"]]@stdev
pct <- stdev / sum(stdev) * 100

# Cumulative percent for each PC
cumu <- cumsum(pct)

# First PC where the cumulative percent exceeds 90% and the PC itself
# contributes less than 5% of the variation
co1 <- which(cumu > 90 & pct < 5)[1]
co1

# Difference in percent of variation between each PC and the next one;
# co2 is the PC right after the last drop that is larger than 0.1%
co2 <- sort(which((pct[1:(length(pct) - 1)] - pct[2:length(pct)]) > 0.1), decreasing = TRUE)[1] + 1
co2

# Use the smaller of the two values
pcs <- min(co1, co2)
pcs

# Data frame with the values to plot
plot_df <- data.frame(pct = pct, cumu = cumu, rank = seq_along(pct))

# Elbow plot to visualize the cutoff
ggplot(plot_df, aes(cumu, pct, label = rank, color = rank > pcs)) +
  geom_text() +
  geom_vline(xintercept = 90, color = "grey") +
  geom_hline(yintercept = min(pct[pct > 5]), color = "grey") +
  theme_bw()
```

![](http://image109.360doc.com/DownloadImg/2021/07/1515/226326808_6_20210715033439491_wm)
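
To reuse the threshold calculation on other objects, the same steps can be wrapped in a small function that takes only a vector of PC standard deviations. The sketch below uses base R; the function name `choose_pcs`, its argument names, and the toy standard deviations are assumptions introduced here for illustration.

```r
# Hypothetical helper: apply the threshold criteria to a vector of PC
# standard deviations (for example seurat_data[["pca"]]@stdev).
choose_pcs <- function(stdev, cum_cutoff = 90, pct_cutoff = 5, diff_cutoff = 0.1) {
  pct  <- stdev / sum(stdev) * 100  # percent of variation per PC
  cumu <- cumsum(pct)               # cumulative percent

  # First PC with cumulative percent above cum_cutoff and own percent below pct_cutoff
  co1 <- which(cumu > cum_cutoff & pct < pct_cutoff)[1]

  # PC right after the last drop between consecutive PCs larger than diff_cutoff
  drops <- pct[-length(pct)] - pct[-1]
  co2 <- rev(which(drops > diff_cutoff))[1] + 1

  list(co1 = co1, co2 = co2, pcs = min(co1, co2))
}

# Invented standard deviations for ten PCs, for illustration only
toy_stdev <- c(10, 8, 6, 4, 2, 1, 1, 1, 1, 1)
res <- choose_pcs(toy_stdev)
print(res$pcs)
```

For the toy vector, `co1` is 7 (the first PC past 90% cumulative that contributes under 5%) and `co2` is 6 (the PC after the last drop above 0.1%), so the function returns 6 PCs.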
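
Inspect the highly variable genes associated with each PC. If known marker genes of a rare cell type show up in a particular PC, we can choose to use all PCs from 1 up to that PC.

```r
# Print the most variable genes driving each PC
print(x = seurat_data[["pca"]], dims = 1:25, nfeatures = 5)
```
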
```
PC_ 1
Positive: NEIL1, LTB, KLF2, TP53INP1, CD27
Negative: TYMS, MKI67, PCLAF, RRM2, NUSAP1
PC_ 2
Positive: GZMA, ARL4C, PRF1, CST7, GZMM
Negative: SLC35E3, ID3, PRDX1, TOP2B, RPLP0
PC_ 3
Positive: HBA2, HBB, HBA1, AHSP, HBD
Negative: RPS18, RPL18A, RPS2, RPSA, RPL37A
PC_ 4
Positive: IGLL1, SLC35E3, PCDH9, CD38, F13A1
Negative: CCL17, HMBS, BLVRB, AQP1, CD36
PC_ 5
Positive: GYPC, RPS18, RPS2, C1QTNF4, RPL18A
Negative: MNDA, LYZ, S100A9, S100A8, FCN1
PC_ 6
Positive: PLK1, CDC20, CENPA, HMMR, CENPE
Negative: GINS2, MCM6, HELLS, MCM4, MCM3
PC_ 7
Positive: GYPC, C1QTNF4, LIMS1, NRIP1, S100A9
Negative: SPIB, TAGLN2, MS4A1, IGLC6, PTPRC
PC_ 8
Positive: FCGR3A, GZMB, SPON2, KLRF1, MYOM2
Negative: CCR7, CD3G, CD3D, IL7R, GPR183
PC_ 9
Positive: CCL17, LTB, TMEM154, CCND2, HSPA12B
Negative: ACTG1, LGALS1, IGLL1, CCDC81, TOP2B
PC_ 10
Positive: AHNAK, VIM, EMP1, LMNA, CD27
Negative: MT1X, CCL17, FTL, HSP90B1, NSMCE1
PC_ 11
Positive: NEIL1, LTB, FTH1, CFD, CST3
Negative: LCN2, RETN, S100A8, LTF, CAMP
PC_ 12
Positive: RPS12, RPLP1, RPL18A, EEF1B2, RPS5
Negative: HNRNPU, NCL, AHNAK, AC245060.5, EMP1
PC_ 13
Positive: CD3D, TRAC, CD3G, IGLC6, CD27
Negative: MARCH1, MS4A1, BANK1, ADAM28, LINC02397
PC_ 14
Positive: SCIMP, SRGN, GUSB, SHISA2, MARCH1
Negative: MS4A1, ZNF608, ENAM, CCND2, CCL17
PC_ 15
Positive: ATF5, HSPA5, PSAT1, PHGDH, MARCH1
Negative: NT5E, GIMAP4, TP53INP1, SHISA2, DBI
PC_ 16
Positive: ACSM3, IGLC6, SHISA2, REXO2, MT1X
Negative: CD82, GCHFR, PRDX1, UBASH3B, PTGDR
PC_ 17
Positive: MARCKSL1, FTH1, S100A1, CRIP2, EMP2
Negative: HSP90B1, HSPA5, UBASH3B, PPIB, FKBP5
PC_ 18
Positive: MARCH1, H3F3A, CALM2, ACTB, PRDX1
Negative: HSP90B1, ATF5, HSPA5, MT-ND6, CANX
PC_ 19
Positive: TRGC2, LGALS1, KLRG1, CCL5, PTMS
Negative: CCR7, TXK, FCER1G, CD7, TCF7
PC_ 20
Positive: PIM1, SOCS3, ADGRE5, RGCC, EPHA4
Negative: LRMP, BANK1, MS4A1, CLEC4E, NME1
PC_ 21
Positive: CCR7, CMTM2, S100A11, LRMP, TXK
Negative: TRGC2, RPS12, KLRG1, LCN6, RPS18
PC_ 22
Positive: CTGF, PMAIP1, FOS, KLF6, FOSB
Negative: FUT7, SLC9A3R2, LCN6, PPP1R14A, EMP3
PC_ 23
Positive: ATF5, PSAT1, HSP90B1, PHGDH, HSPA5
Negative: CTHRC1, NSMCE1, MAP1A, IGLL1, BTNL9
PC_ 24
Positive: SERINC2, LST1, NAMPT, MT1X, SLC25A37
Negative: SHISA2, DEPP1, GADD45A, PSTPIP2, CD33
PC_ 25
Positive: CDKN1C, RHOB, BATF3, CX3CR1, SERPINA1
Negative: FOS, ALDH2, MGST1, MPO, FOSB
```
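
Scanning a long printout for one marker gene is error-prone, so the lookup can also be done programmatically. The sketch below takes a genes-by-PCs loadings matrix and returns the first PC where a given gene is among the top positive or top negative features. The helper name `first_pc_with_gene`, the toy matrix, and the gene names `geneA` to `geneF` are invented for illustration. With a real object the matrix would presumably come from `Loadings(seurat_data, reduction = "pca")`, which is an assumption about how the PCA was stored.

```r
# Hypothetical helper: index of the first PC whose top `n_top` positive or
# top `n_top` negative loadings include `gene`; NA if no PC does.
first_pc_with_gene <- function(loadings, gene, n_top = 5) {
  stopifnot(is.matrix(loadings), gene %in% rownames(loadings))
  for (j in seq_len(ncol(loadings))) {
    x <- loadings[, j]
    top_pos <- head(names(sort(x, decreasing = TRUE)), n_top)
    top_neg <- head(names(sort(x)), n_top)
    if (gene %in% c(top_pos, top_neg)) {
      return(j)
    }
  }
  NA_integer_
}

# Invented loadings: 6 genes x 3 PCs, for illustration only
toy_loadings <- matrix(
  c(0.90,  0.10,  0.00,
   -0.80,  0.20,  0.10,
    0.10,  0.70,  0.00,
    0.00, -0.60,  0.20,
    0.05,  0.05,  0.90,
    0.02,  0.00, -0.70),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:6]), paste0("PC_", 1:3))
)

marker_pc <- first_pc_with_gene(toy_loadings, "geneE", n_top = 1)
print(marker_pc)
```

In the toy matrix `geneE` first reaches the top of a component at PC 3, so under the rule described above PCs 1 to 3 would be kept.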