[Original] Simple for loops in R

医科研 2021-01-25


Welcome to 医科研. These are 白介素2's reading notes. Join me for stories from the clinic and the lab: biomedical data mining, R, and mining TCGA and GEO data.

Simple for loops

Create a simple data frame

Sys.setlocale('LC_ALL','C')
## [1] "C"
library(tidyverse)
## -- Attaching packages ---------------------------------------------------- tidyverse 1.2.1 --
## √ ggplot2 3.1.0       √ purrr   0.3.0  
## √ tibble  2.0.1       √ dplyr   0.8.0.1
## √ tidyr   0.8.2       √ stringr 1.4.0  
## √ readr   1.3.1       √ forcats 0.4.0
## -- Conflicts ------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
# Four columns of 10 standard-normal draws; no seed is set, so values differ on every run
df <- tibble(
  a = rnorm(10),
  b = rnorm(10),
  c = rnorm(10),
  d = rnorm(10)
)
df
## # A tibble: 10 x 4
##         a      b      c       d
##     <dbl>  <dbl>  <dbl>   <dbl>
##  1 -0.956 -0.848  0.701 -1.10  
##  2  1.10  -0.925 -0.326 -0.328 
##  3 -1.39  -0.553 -1.36   1.17  
##  4 -0.102 -2.13   0.949  0.219 
##  5 -0.930 -0.789 -1.81  -0.611 
##  6 -1.56  -1.08   1.54  -0.403 
##  7 -0.115 -0.310 -2.07  -1.10  
##  8  0.817  0.151 -0.464 -0.0366
##  9 -1.97   1.25  -0.396 -0.700 
## 10  2.44   0.486 -1.91  -0.334

The difference between df[1] and df[[1]]

Extract the first column: df[1] returns a one-column tibble

df[1]
## # A tibble: 10 x 1
##         a
##     <dbl>
##  1 -0.956
##  2  1.10 
##  3 -1.39 
##  4 -0.102
##  5 -0.930
##  6 -1.56 
##  7 -0.115
##  8  0.817
##  9 -1.97 
## 10  2.44

Extract the elements of the first column: df[[1]] returns a plain numeric vector

df[[1]]
##  [1] -0.9556780  1.1017707 -1.3890825 -0.1017430 -0.9304293 -1.5648136
##  [7] -0.1151510  0.8174654 -1.9693236  2.4369937
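To make the contrast concrete, here is a minimal sketch that inspects what each form returns. It uses a small base-R data frame named df_demo (a hypothetical stand-in for the tibble above, chosen so that no packages are needed); for single-index extraction the two brackets behave the same way on a tibble.

```r
# df_demo is a made-up base-R data frame standing in for the tibble `df`
df_demo <- data.frame(a = c(1.5, 2.5, 3.5), b = c(10, 20, 30))

col_as_df  <- df_demo[1]    # single bracket: still a data frame, with one column
col_as_vec <- df_demo[[1]]  # double bracket: the underlying numeric vector

class(col_as_df)            # class of the single-bracket result
class(col_as_vec)           # class of the double-bracket result
median(col_as_vec)          # median() is applied to the vector
```

This is why the loops later in the post pass df[[i]], not df[i], to median().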

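The task: compute the median of every column using a for loop.
With this little data the job could be done with much simpler code, but we will deliberately solve it with a for loop.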
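For reference, the "simpler code" could look something like the sketch below. It is only an illustration: df_demo and its values are made up, and vapply() and sapply() are one possible choice among several.

```r
# df_demo is a made-up data frame; the columns are chosen so the medians are easy to check
df_demo <- data.frame(a = c(1, 3, 5), b = c(2, 4, 100))

# vapply() applies median() to each column and insists on one double per column
med_vapply <- vapply(df_demo, median, numeric(1))

# sapply() does the same without the type check
med_sapply <- sapply(df_demo, median)

med_vapply
```

With the tidyverse loaded, purrr::map_dbl(df, median) is another option.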

First, consider the three parts of a for loop:

Output
Sequence
Loop body

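Output

Note that the length of the output is known in advance (one value per column), so the vector can be allocated up front.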

output <- vector("double", ncol(df))  # one slot per column
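As a quick illustration of what vector() hands back, the sketch below allocates a few vectors of fixed length. The variable names are made up for this example.

```r
# vector(mode, length) returns a vector of the given type filled with that type's default value
num_slots  <- vector("double", 3)     # three zeros
chr_slots  <- vector("character", 2)  # two empty strings
list_slots <- vector("list", 2)       # two NULL entries

num_slots
length(list_slots)
```

Allocating the full length first means the loop only has to fill slots in, not grow the vector on every pass.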

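Sequence + loop body

seq_along(df) plays the same role as 1:length(df) but is safer: for a zero-length input it yields an empty sequence, whereas 1:length(df) yields c(1, 0).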

for (i in seq_along(df)) {
  output[[i]] <- median(df[[i]])
}
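The advantage of seq_along() shows up with empty input. The sketch below counts how many times a loop body runs under each kind of sequence; the variable names are made up for this example.

```r
empty <- numeric(0)

bad_seq  <- 1:length(empty)   # 1:0 counts down, giving the two values 1 and 0
good_seq <- seq_along(empty)  # empty integer sequence

runs_bad <- 0
for (i in bad_seq) runs_bad <- runs_bad + 1     # body runs twice on empty input

runs_good <- 0
for (i in good_seq) runs_good <- runs_good + 1  # body never runs

c(runs_bad = runs_bad, runs_good = runs_good)
```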

Result

output
## [1] -0.5227902 -0.6708582 -0.4300781 -0.3684835

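Indexing with output[i] instead of output[[i]] gives the same result here, because each assignment sets a single element of an atomic vector.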

output <- vector("double", ncol(df))

for (i in seq_along(df)) {
  output[i] <- median(df[[i]])
}
output
## [1] -0.5227902 -0.6708582 -0.4300781 -0.3684835
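The equivalence can be checked directly. This is a small sketch with made-up values (vals, out_single, and out_double are hypothetical names) that fills two vectors, one with each form of indexing, and compares them.

```r
vals <- c(2.5, 7, 1)

out_single <- vector("double", length(vals))
out_double <- vector("double", length(vals))

for (i in seq_along(vals)) {
  out_single[i]   <- vals[[i]] * 2  # single-bracket assignment
  out_double[[i]] <- vals[[i]] * 2  # double-bracket assignment
}

identical(out_single, out_double)
```

Using [[ ]] still has a readability benefit: it signals that exactly one element is being set.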

A different approach: start from an empty vector of unknown length

output <- vector()  # empty; grows by one value per iteration

for (i in seq_along(df)) {
  value <- median(df[[i]])
  output <- c(output, value)  # append so the results stay in column order
}
output
## [1] -0.5227902 -0.6708582 -0.4300781 -0.3684835
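When a result is grown with c(), the order of the arguments decides where each new value lands. Here is a minimal sketch with made-up values (appended and prepended are hypothetical names):

```r
vals <- c(10, 20, 30)

appended  <- vector()
prepended <- vector()

for (v in vals) {
  appended  <- c(appended, v)   # new value goes at the end: input order is kept
  prepended <- c(v, prepended)  # new value goes at the front: input order is reversed
}

appended
prepended
```

For column summaries the appended form is usually what is wanted, since position i then corresponds to column i.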

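Turning the for loop into a function

R supports a functional programming style, which means a for loop can be wrapped in a function and the function called instead of writing the loop out each time. Let's convert our example. The function needs a suitable name; since it computes the median of each column, we can call it col_median.
Create a function col_median
Argument 1: df, a simple data frame (tibble)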

col_median <- function(df) {
  output <- vector()  # empty; grows by one value per column
  for (i in seq_along(df)) {
    value <- median(df[[i]])
    output <- c(output, value)  # append so the results stay in column order
  }
  output  # return value of the function
}
col_median(df)
## [1] -0.5227902 -0.6708582 -0.4300781 -0.3684835

Test the function

# Create a matrix
data <- matrix(rnorm(60), nrow = 6, ncol = 10)
head(data)
##              [,1]        [,2]       [,3]       [,4]       [,5]       [,6]
## [1,] -0.821624744 -1.42055483 -1.6212754 -0.9966531  0.3076887 -1.3621218
## [2,]  0.389617000  0.91243785 -0.1969125  2.2527410 -1.5184419 -0.9301008
## [3,]  0.002507928  0.33528398 -0.3859504  1.9842380  0.5913518  1.1907294
## [4,] -0.974889049  0.55289822  1.2403620 -0.2995565 -0.1231051  1.5828930
## [5,] -0.057873471  0.01868644  0.3741498  0.8565143 -1.3681877  2.0053276
## [6,]  0.539604156  3.16231565  0.4713369 -0.4801932  0.4394711 -0.9297590
##            [,7]       [,8]       [,9]      [,10]
## [1,] -0.4878761 -2.0633086 -0.9382000  1.2048067
## [2,] -0.1664908 -0.5548023 -1.5614915  1.3281471
## [3,] -1.1470670 -0.2582778  0.5214065  0.3797929
## [4,] -1.1619733 -0.9926352  0.6941835 -0.2334606
## [5,]  1.5798016 -1.3599199 -0.1201355 -0.2287048
## [6,] -1.5974384 -0.8201751  0.2489410  0.7259488
dim(data)
## [1]  6 10

Because the function expects a tibble, convert the matrix to a tibble first

data <- as_tibble(data)
## Warning: `as_tibble.matrix()` requires a matrix with column names or a `.name_repair` argument. Using compatibility `.name_repair`.
## This warning is displayed once per session.
col_median(data)
##  [1] -0.02768277  0.44409110  0.08861867  0.27847891  0.09229177
##  [6]  0.13048523 -0.81747157 -0.90640517  0.06440272  0.55287084
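One way to sanity-check a function like this is to compare it against apply() on the original matrix. The sketch below is self-contained and makes a few assumptions: col_median_demo is a hypothetical preallocating variant of the function, a base-R data frame replaces the tibble, and set.seed(1) is added only to make the run repeatable.

```r
# Hypothetical variant of col_median that allocates its output up front
col_median_demo <- function(df) {
  output <- vector("double", length(df))
  for (i in seq_along(df)) {
    output[[i]] <- median(df[[i]])
  }
  output
}

set.seed(1)  # assumption: fixed seed so the check is repeatable
m <- matrix(rnorm(60), nrow = 6, ncol = 10)

from_loop  <- col_median_demo(as.data.frame(m))  # loop over the columns of a data frame
from_apply <- apply(m, 2, median)                # column-wise medians straight from the matrix

all.equal(from_loop, from_apply)
```

If the two disagree, the first things to check are the order of the results and whether the input is really being iterated column by column.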