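Preface: Since ggheatmap was released, I have received a lot of encouragement from readers, and I am truly grateful for the support. If you run into problems while using it, or there is a feature you would like to see, feel free to contact me. Every suggestion is a chance for me to learn and improve, and I hope we get the chance to discuss and learn together. Below I walk through two examples that show what ggheatmap can be used for, in the hope that they spark further ideas (in essence, it is just a heatmap plus plot composition).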
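Preparation note: the data below are not real. They were randomly generated by the author, so results may differ from one person's run to another. If in doubt, you can verify the output with the pheatmap package.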
    # Install ggheatmap from GitHub (requires the devtools package)
    devtools::install_github("XiaoLuo-boy/ggheatmap")
    library(ggheatmap)
    library(ggplot2)  # loaded explicitly because ggplot() is called directly below
    library(aplot)

    set.seed(123)
    # Expression matrix: 50 genes (rows) x 12 samples (columns)
    df <- matrix(runif(600, 0, 10), ncol = 12)
    colnames(df) <- paste("sample", 1:12, sep = "")
    rownames(df) <- sapply(1:50, function(x) paste(sample(LETTERS, 3, replace = FALSE), collapse = ""))
    df[1:4, 1:4]

    # Row (gene) annotations
    row_metaData <- data.frame(exprtype = sample(c("Up", "Down"), 50, replace = TRUE),
                               genetype = sample(c("Metabolism", "Immune", "None"), 50, replace = TRUE))
    rownames(row_metaData) <- rownames(df)

    # Column (sample) annotations
    col_metaData <- data.frame(tissue = sample(c("Normal", "Tumor"), 12, replace = TRUE),
                               risklevel = sample(c("High", "Low"), 12, replace = TRUE))
    rownames(col_metaData) <- colnames(df)

    # Named colour vectors, one per annotation variable
    exprcol <- c("#EE0000FF", "#008B45FF")
    names(exprcol) <- c("Up", "Down")
    genecol <- c("#EE7E30", "#5D9AD3", "#D0DFE6FF")
    names(genecol) <- c("Metabolism", "Immune", "None")
    tissuecol <- c("#98D352", "#FF7F0E")
    names(tissuecol) <- c("Normal", "Tumor")
    riskcol <- c("#EEA236FF", "#46B8DAFF")
    names(riskcol) <- c("High", "Low")
    col <- list(exprtype = exprcol, genetype = genecol, tissue = tissuecol, risklevel = riskcol)

    # Per-sample counts used for the stacked bar chart in Example 2
    stackDat <- data.frame(sample = rep(paste0("sample", 1:12), each = 2),
                           count = sample(1:10, 24, replace = TRUE),
                           type = rep(c("Negative", "Positive"), times = 12))
Data notes:
- stackDat: the total number of up-regulated and down-regulated genes in each sample (randomly generated here as well, for convenience)
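The preparation note suggests checking results against pheatmap. A minimal sketch of such a cross-check is shown here. It uses a small stand-in matrix (`mat`, `row_anno`, `col_anno`, and `anno_colors` are illustrative names, not objects from this post) so that it runs on its own. With the data above you would pass `df`, `row_metaData`, `col_metaData`, and `col` instead. Whether the dendrograms match ggheatmap exactly depends on both packages using the same distance and linkage settings, which is assumed here rather than verified.

```r
library(pheatmap)

set.seed(123)
# Stand-in matrix built the same way as df (50 genes x 12 samples)
mat <- matrix(runif(600, 0, 10), ncol = 12)
colnames(mat) <- paste0("sample", 1:12)
rownames(mat) <- paste0("gene", 1:50)  # hypothetical gene names

# One annotation column for rows and one for columns
row_anno <- data.frame(exprtype = sample(c("Up", "Down"), 50, replace = TRUE))
rownames(row_anno) <- rownames(mat)
col_anno <- data.frame(tissue = sample(c("Normal", "Tumor"), 12, replace = TRUE))
rownames(col_anno) <- colnames(mat)

anno_colors <- list(
  exprtype = c(Up = "#EE0000FF", Down = "#008B45FF"),
  tissue   = c(Normal = "#98D352", Tumor = "#FF7F0E")
)

# Row-scaled heatmap, trees cut into 5 row clusters and 3 column clusters
ph <- pheatmap(mat, scale = "row",
               cluster_rows = TRUE, cluster_cols = TRUE,
               cutree_rows = 5, cutree_cols = 3,
               annotation_row = row_anno, annotation_col = col_anno,
               annotation_colors = anno_colors)

# Row order chosen by the clustering, for comparison with the ggheatmap output
head(rownames(mat)[ph$tree_row$order])
```

Comparing the row and column order from `ph$tree_row$order` and `ph$tree_col$order` with the order drawn by ggheatmap is a quick way to confirm that both heatmaps describe the same clustering.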
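Example 1: Reproducing figure 1

**Figure source:** This heatmap comes from a paper in Cell (DOI: 10.1016/j.cell.2020.05.032). I first saw the figure in a post from "木舟笔记", thought the heatmap looked great, and did my best to reproduce it (imitating the form only, not the meaning).

Usage: this layout suits multiple annotations on specific genes or samples. For example, some genes are immune genes, some are autophagy-related genes, and some are tumor driver genes. A gene may be both an immune gene and an autophagy-related gene, in which case a single annotation bar is no longer appropriate, so the membership of each category can be shown as a separate column of points instead.

Code:

    # Note: the result is stored under the name `ggheatmap`, the same name as the function
    ggheatmap <- ggheatmap(df,
                           color = colorRampPalette(c("#66b032", "white", "#ff3800"))(100),
                           cluster_rows = TRUE, cluster_cols = TRUE, scale = "row",
                           cluster_num = c(5, 3),
                           tree_color_rows = c("#3B4992FF", "#EE0000FF", "#008B45FF", "#631879FF", "#008280FF"),
                           tree_color_cols = c("#1F77B4FF", "#FF7F0EFF", "#2CA02CFF"),
                           annotation_rows = row_metaData,
                           annotation_cols = col_metaData,
                           annotation_color = col)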
    # Membership table: 1 = gene belongs to the category, NA = it does not
    dat <- data.frame(Glycolysis = sample(c(1, NA), 50, replace = TRUE),
                      TAC = sample(c(1, NA), 50, replace = TRUE),
                      gene = rownames(df))

    # One column of points per category; rows with NA are dropped by geom_point (with a warning)
    p1 <- ggplot(dat, aes(x = Glycolysis, y = gene)) +
      geom_point(color = "#d40749", size = 3) + theme_classic() +
      theme(line = element_blank(), axis.text = element_blank(), axis.title.y = element_blank(),
            axis.title.x = element_text(colour = "#d40749", face = "bold", size = 10)) +
      xlab("Glycolysis") + scale_x_discrete(position = "top")
    p2 <- ggplot(dat, aes(x = TAC, y = gene)) +
      geom_point(color = "#0092db", size = 3) + theme_classic() +
      theme(line = element_blank(), axis.text = element_blank(), axis.title.y = element_blank(),
            axis.title.x = element_text(colour = "#0092db", face = "bold", size = 10)) +
      xlab("TAC") + scale_x_discrete(position = "top")
    # Attach the two point columns to the right of the heatmap
    ggheatmap %>%
      insert_right(p1, width = 0.1) %>%
      insert_right(p2, width = 0.1)
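Example 2

**Figure source:** This figure comes from a data.world visualization project: Energy Use at 10 Downing St in 2017 (columns are months, rows are days). Again, only the form is imitated, not the meaning.

Usage: this layout describes how some feature of the row or column elements changes. For example, if samples are ordered by the score from a model we built and we also need to show the expression of a particular class of genes, this kind of plot is a good starting point. You can also swap the bar chart for a line chart, and so on.

Code:

    ggheatmap2 <- ggheatmap(df, cluster_rows = TRUE, cluster_cols = TRUE, scale = "row",
                            cluster_num = c(5, 3),
                            tree_color_rows = c("#3B4992FF", "#EE0000FF", "#008B45FF", "#631879FF", "#008280FF"),
                            tree_color_cols = c("#1F77B4FF", "#FF7F0EFF", "#2CA02CFF"),
                            annotation_rows = row_metaData,
                            annotation_color = col, show_cluster_cols = FALSE)

    # Stacked bar chart of per-sample counts, with the x axis stripped so it sits flush on the heatmap
    p3 <- ggplot(stackDat, aes(x = sample, y = count, fill = type)) +
      geom_bar(position = "stack", stat = "identity") +
      scale_fill_manual(values = c("#24c1ff", "#ffbd24")) +
      theme_classic() +
      theme(axis.text.x = element_blank(), axis.title.x = element_blank(),
            axis.line.x = element_blank(), axis.ticks.x = element_blank()) +
      scale_y_continuous(expand = c(0, 0), position = "right")

    # Place the bar chart above the heatmap
    ggheatmap2 %>% insert_top(p3, height = 0.1)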
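As mentioned in the usage note, the bar chart can be swapped for a line chart. The following is a minimal sketch of that variant, assuming the same `stackDat` layout as above (it is regenerated here so the block runs on its own). The name `p_line` is illustrative and not part of the original code.

```r
library(ggplot2)

set.seed(123)
# Same structure as stackDat in the preparation step
stackDat <- data.frame(sample = rep(paste0("sample", 1:12), each = 2),
                       count = sample(1:10, 24, replace = TRUE),
                       type = rep(c("Negative", "Positive"), times = 12))

# One line per type; group = type is needed because the x axis is discrete
p_line <- ggplot(stackDat, aes(x = sample, y = count, colour = type, group = type)) +
  geom_line() +
  geom_point(size = 1) +
  scale_colour_manual(values = c("#24c1ff", "#ffbd24")) +
  theme_classic() +
  theme(axis.text.x = element_blank(), axis.title.x = element_blank(),
        axis.line.x = element_blank(), axis.ticks.x = element_blank()) +
  scale_y_continuous(position = "right")

# With the heatmap from Example 2 in the session, it is attached the same way as p3:
# ggheatmap2 %>% insert_top(p_line, height = 0.1)
```

The `insert_top()` call is left commented out because it needs the `ggheatmap2` object and the aplot package from the steps above.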