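A blog post I wrote earlier, kept here as a record of my learning path.

Goal

This time I want to implement the workflow shown in this flowchart.

Goal overview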
1. Generate three files

    (snake_test) [dengfei@localhost ex4]$ ls *txt
    1.txt  2.txt  3.txt
    (snake_test) [dengfei@localhost ex4]$ cat *txt
    this is 1.txt
    this is 2.txt
    this is 3.txt

2. Prepend "add a" to the content of each file

The corresponding Snakefile:

    rule adda:
        input: "{file}.txt"
        output: "{file}_add_a.txt"
        shell: "cat {input} |xargs echo add a >{output}"

Preview the commands with a dry run. Note: the target files {1,2,3}_add_a.txt must be written out on the command line for the command to run, because the output of rule adda contains the wildcard {file} and snakemake cannot work out which files to build without concrete target names.

    (snake_test) [dengfei@localhost ex4]$ snakemake -np {1,2,3}_add_a.txt
    Building DAG of jobs...
    Job counts:
        count   jobs
        3       adda
        3

    [Tue Apr 2 21:09:19 2019] rule adda: input: 3.txt output: 3_add_a.txt jobid: 2 wildcards: file=3
    cat 3.txt |xargs echo add a >3_add_a.txt
    [Tue Apr 2 21:09:19 2019] rule adda: input: 2.txt output: 2_add_a.txt jobid: 0 wildcards: file=2
    cat 2.txt |xargs echo add a >2_add_a.txt
    [Tue Apr 2 21:09:19 2019] rule adda: input: 1.txt output: 1_add_a.txt jobid: 1 wildcards: file=1
    cat 1.txt |xargs echo add a >1_add_a.txt
    Job counts:
        count   jobs
        3       adda
        3
    This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.

Run the command:

    snakemake {1,2,3}_add_a.txt
    Building DAG of jobs...
    Using shell: /usr/bin/bash
    Provided cores: 1
    Rules claiming more threads will be scaled down.
    Job counts:
        count   jobs
        3       adda
        3
    [Tue Apr 2 21:11:09 2019] rule adda: input: 3.txt output: 3_add_a.txt jobid: 0 wildcards: file=3
    [Tue Apr 2 21:11:09 2019] Finished job 0. 1 of 3 steps (33%) done
    [Tue Apr 2 21:11:09 2019] rule adda: input: 1.txt output: 1_add_a.txt jobid: 1 wildcards: file=1
    [Tue Apr 2 21:11:09 2019] Finished job 1. 2 of 3 steps (67%) done
    [Tue Apr 2 21:11:09 2019] rule adda: input: 2.txt output: 2_add_a.txt jobid: 2 wildcards: file=2
    [Tue Apr 2 21:11:09 2019] Finished job 2. 3 of 3 steps (100%) done
    Complete log: /home/dengfei/test/snakemake/ex4/.snakemake/log/2019-04-02T211109.153566.snakemake.log

Check the *add_a.txt files:

    (snake_test) [dengfei@localhost ex4]$ ls *add_a.txt
    1_add_a.txt  2_add_a.txt  3_add_a.txt
    (snake_test) [dengfei@localhost ex4]$ cat *add_a.txt
    add a this is 1.txt
    add a this is 2.txt
    add a this is 3.txt

Done.

3. Prepend "add b" to the content of each file

The corresponding Snakefile:

    rule adda:
        input: "{file}.txt"
        output: "{file}_add_a.txt"
        shell: "cat {input} |xargs echo add a >{output}"

    rule addb:
        input: "{file}_add_a.txt"
        output: "{file}_add_a_add_b.txt"
        shell: "cat {input} | xargs echo add b >{output}"

Run the command (there is no -n flag here, so the jobs are actually executed rather than previewed):

    (snake_test) [dengfei@localhost ex4]$ snakemake {1,2,3}_add_a_add_b.txt
    Building DAG of jobs...
    Using shell: /usr/bin/bash
    Provided cores: 1
    Rules claiming more threads will be scaled down.
    Job counts:
        count   jobs
        3       addb
        3
    [Tue Apr 2 21:13:57 2019] rule addb: input: 2_add_a.txt output: 2_add_a_add_b.txt jobid: 0 wildcards: file=2
    [Tue Apr 2 21:13:57 2019] Finished job 0. 1 of 3 steps (33%) done
    [Tue Apr 2 21:13:57 2019] rule addb: input: 1_add_a.txt output: 1_add_a_add_b.txt jobid: 1 wildcards: file=1
    [Tue Apr 2 21:13:57 2019] Finished job 1. 2 of 3 steps (67%) done
    [Tue Apr 2 21:13:57 2019] rule addb: input: 3_add_a.txt output: 3_add_a_add_b.txt jobid: 2 wildcards: file=3
    [Tue Apr 2 21:13:57 2019] Finished job 2. 3 of 3 steps (100%) done
    Complete log: /home/dengfei/test/snakemake/ex4/.snakemake/log/2019-04-02T211357.666661.snakemake.log

Command executed:

    snakemake {1,2,3}_add_a_add_b.txt

View the flowchart. Command:

    snakemake --dag {1,2,3}_add_a_add_b.txt |dot -Tpdf >a.pdf

This generates a.pdf, which contains the flowchart.

4. Prepend "add c" to the content of each file

Snakefile:

    rule adda:
        input: "{file}.txt"
        output: "{file}_add_a.txt"
        shell: "cat {input} |xargs echo add a >{output}"

    rule addb:
        input: "{file}_add_a.txt"
        output: "{file}_add_a_add_b.txt"
        shell: "cat {input} | xargs echo add b >{output}"

    rule addc:
        input: "{file}_add_a_add_b.txt"
        output: "{file}_add_a_add_b_add_c.txt"
        shell: "cat {input} | xargs echo add c >{output}"

Flowchart. Command:

    snakemake --dag {1,2,3}_add_a_add_b_add_c.txt |dot -Tpdf >a1.pdf

5. Merge the files

Snakefile (hebing is pinyin for "merge"):

    rule adda:
        input: "{file}.txt"
        output: "{file}_add_a.txt"
        shell: "cat {input} |xargs echo add a >{output}"

    rule addb:
        input: "{file}_add_a.txt"
        output: "{file}_add_a_add_b.txt"
        shell: "cat {input} | xargs echo add b >{output}"

    rule addc:
        input: "{file}_add_a_add_b.txt"
        output: "{file}_add_a_add_b_add_c.txt"
        shell: "cat {input} | xargs echo add c >{output}"

    rule hebing:
        input:
            a=expand("{file}_add_a_add_b_add_c.txt",file=["1","2","3"]),
            b=expand("{file}_add_a_add_b.txt",file=["1","2"])
        output:"hebing.txt"
        shell:"cat {input.a} {input.b} >{output}"

Run the command:

    snakemake hebing.txt

Result:

    Building DAG of jobs...
    Using shell: /usr/bin/bash
    Provided cores: 1
    Rules claiming more threads will be scaled down.
    Job counts:
        count   jobs
        3       addc
        1       hebing
        4
    [Tue Apr 2 21:21:04 2019] rule addc: input: 1_add_a_add_b.txt output: 1_add_a_add_b_add_c.txt jobid: 1 wildcards: file=1
    [Tue Apr 2 21:21:04 2019] Finished job 1. 1 of 4 steps (25%) done
    [Tue Apr 2 21:21:04 2019] rule addc: input: 3_add_a_add_b.txt output: 3_add_a_add_b_add_c.txt jobid: 3 wildcards: file=3
    [Tue Apr 2 21:21:04 2019] Finished job 3. 2 of 4 steps (50%) done
    [Tue Apr 2 21:21:04 2019] rule addc: input: 2_add_a_add_b.txt output: 2_add_a_add_b_add_c.txt jobid: 2 wildcards: file=2
    [Tue Apr 2 21:21:04 2019] Finished job 2. 3 of 4 steps (75%) done
    [Tue Apr 2 21:21:04 2019] rule hebing: input: 1_add_a_add_b_add_c.txt, 2_add_a_add_b_add_c.txt, 3_add_a_add_b_add_c.txt, 1_add_a_add_b.txt, 2_add_a_add_b.txt output: hebing.txt jobid: 0
    [Tue Apr 2 21:21:04 2019] Finished job 0. 4 of 4 steps (100%) done
    Complete log: /home/dengfei/test/snakemake/ex4/.snakemake/log/2019-04-02T212104.719887.snakemake.log

Done.

Postscript 1

I ran a test today. Because the final output file is hebing.txt, it can be declared as the input of a rule all at the top of the Snakefile, so that snakemake can be run without naming any target:

    rule all:
        input:"hebing.txt"

    rule adda:
        input: "{file}.txt"
        output: "{file}_add_a.txt"
        shell: "cat {input} |xargs echo add a >{output}"

    rule addb:
        input: "{file}_add_a.txt"
        output: "{file}_add_a_add_b.txt"
        shell: "cat {input} | xargs echo add b >{output}"

    rule addc:
        input: "{file}_add_a_add_b.txt"
        output: "{file}_add_a_add_b_add_c.txt"
        shell: "cat {input} | xargs echo add c >{output}"

    rule hebing:
        input:
            a=expand("{file}_add_a_add_b_add_c.txt",file=["1","2","3"]),
            b=expand("{file}_add_a_add_b.txt",file=["1","2"])
        output:"hebing.txt"
        shell:"cat {input.a} {input.b} >{output}"

Run the command:

    snakemake

The result:

    (base) [dengfei@localhost ex4]$ snakemake
    Provided cores: 1
    Rules claiming more threads will be scaled down.
    Job counts:
        count   jobs
        3       adda
        3       addb
        3       addc
        1       all
        1       hebing
        11
    rule adda: input: 1.txt output: 1_add_a.txt jobid: 7 wildcards: file=1
    Finished job 7. 1 of 11 steps (9%) done
    rule adda: input: 2.txt output: 2_add_a.txt jobid: 9 wildcards: file=2
    Finished job 9. 2 of 11 steps (18%) done
    rule adda: input: 3.txt output: 3_add_a.txt jobid: 10 wildcards: file=3
    Finished job 10. 3 of 11 steps (27%) done
    rule addb: input: 3_add_a.txt output: 3_add_a_add_b.txt jobid: 8 wildcards: file=3
    Finished job 8. 4 of 11 steps (36%) done
    rule addb: input: 1_add_a.txt output: 1_add_a_add_b.txt jobid: 3 wildcards: file=1
    Finished job 3. 5 of 11 steps (45%) done
    rule addb: input: 2_add_a.txt output: 2_add_a_add_b.txt jobid: 6 wildcards: file=2
    Finished job 6. 6 of 11 steps (55%) done
    rule addc: input: 3_add_a_add_b.txt output: 3_add_a_add_b_add_c.txt jobid: 5 wildcards: file=3
    Finished job 5. 7 of 11 steps (64%) done
    rule addc: input: 2_add_a_add_b.txt output: 2_add_a_add_b_add_c.txt jobid: 2 wildcards: file=2
    Finished job 2. 8 of 11 steps (73%) done
    rule addc: input: 1_add_a_add_b.txt output: 1_add_a_add_b_add_c.txt jobid: 4 wildcards: file=1
    Finished job 4. 9 of 11 steps (82%) done
    rule hebing: input: 1_add_a_add_b_add_c.txt, 2_add_a_add_b_add_c.txt, 3_add_a_add_b_add_c.txt, 1_add_a_add_b.txt, 2_add_a_add_b.txt output: hebing.txt jobid: 1
    Finished job 1. 10 of 11 steps (91%) done
    localrule all: input: hebing.txt jobid: 0
    Finished job 0. 11 of 11 steps (100%) done

Check the result:

    (base) [dengfei@localhost ex4]$ cat hebing.txt
    add c add b add a this is 1.txt
    add c add b add a this is 2.txt
    add c add b add a this is 3.txt
    add b add a this is 1.txt
    add b add a this is 2.txt

Postscript 2

By default snakemake looks for a workflow file named Snakefile, but a file with that name gets no syntax highlighting. The same content can be saved under another name, here a.py, and passed to snakemake with -s:

    rule all:
        input:"hebing.txt"

    rule adda:
        input: "{file}.txt"
        output: "{file}_add_a.txt"
        shell: "cat {input} |xargs echo add a >{output}"

    rule addb:
        input: "{file}_add_a.txt"
        output: "{file}_add_a_add_b.txt"
        shell: "cat {input} | xargs echo add b >{output}"

    rule addc:
        input: "{file}_add_a_add_b.txt"
        output: "{file}_add_a_add_b_add_c.txt"
        shell: "cat {input} | xargs echo add c >{output}"

    rule hebing:
        input:
            a=expand("{file}_add_a_add_b_add_c.txt",file=["1","2","3"]),
            b=expand("{file}_add_a_add_b.txt",file=["1","2"])
        output:"hebing.txt"
        shell:"cat {input.a} {input.b} >{output}"

Result:

    (base) [dengfei@localhost ex4]$ snakemake -s a.py
    Provided cores: 1
    Rules claiming more threads will be scaled down.
    Job counts:
        count   jobs
        3       adda
        3       addb
        3       addc
        1       all
        1       hebing
        11
    rule adda: input: 1.txt output: 1_add_a.txt jobid: 8 wildcards: file=1
    Finished job 8. 1 of 11 steps (9%) done
    rule adda: input: 3.txt output: 3_add_a.txt jobid: 10 wildcards: file=3
    Finished job 10. 2 of 11 steps (18%) done
    rule adda: input: 2.txt output: 2_add_a.txt jobid: 9 wildcards: file=2
    Finished job 9. 3 of 11 steps (27%) done
    rule addb: input: 3_add_a.txt output: 3_add_a_add_b.txt jobid: 7 wildcards: file=3
    Finished job 7. 4 of 11 steps (36%) done
    rule addb: input: 2_add_a.txt output: 2_add_a_add_b.txt jobid: 4 wildcards: file=2
    Finished job 4. 5 of 11 steps (45%) done
    rule addb: input: 1_add_a.txt output: 1_add_a_add_b.txt jobid: 3 wildcards: file=1
    Finished job 3. 6 of 11 steps (55%) done
    rule addc: input: 3_add_a_add_b.txt output: 3_add_a_add_b_add_c.txt jobid: 2 wildcards: file=3
    Finished job 2. 7 of 11 steps (64%) done
    rule addc: input: 2_add_a_add_b.txt output: 2_add_a_add_b_add_c.txt jobid: 5 wildcards: file=2
    Finished job 5. 8 of 11 steps (73%) done
    rule addc: input: 1_add_a_add_b.txt output: 1_add_a_add_b_add_c.txt jobid: 6 wildcards: file=1
    Finished job 6. 9 of 11 steps (82%) done
    rule hebing: input: 1_add_a_add_b_add_c.txt, 2_add_a_add_b_add_c.txt, 3_add_a_add_b_add_c.txt, 1_add_a_add_b.txt, 2_add_a_add_b.txt output: hebing.txt jobid: 1
    Finished job 1. 10 of 11 steps (91%) done
    localrule all: input: hebing.txt jobid: 0
    Finished job 0. 11 of 11 steps (100%) done
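
As a supplement to the wildcard rules used throughout this post, here is a rough Python sketch of how a requested target name can be matched against each rule's output pattern to recover the {file} wildcard and then traced back to an existing input. It is only a simplified illustration, not snakemake's actual implementation: the names RULES, pattern_to_regex, match_output and plan are all made up for this example, and the sketch ignores file timestamps and everything else a real scheduler checks.

```python
import re

# Hypothetical table mirroring rules adda, addb, addc: name -> (input, output).
RULES = {
    "adda": ("{file}.txt", "{file}_add_a.txt"),
    "addb": ("{file}_add_a.txt", "{file}_add_a_add_b.txt"),
    "addc": ("{file}_add_a_add_b.txt", "{file}_add_a_add_b_add_c.txt"),
}


def pattern_to_regex(pattern):
    """Turn "{file}_add_a.txt" into a regex with one named group per wildcard."""
    # re.split with a capturing group alternates: literal, name, literal, ...
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        regex += f"(?P<{part}>.+)" if i % 2 else re.escape(part)
    return regex


def match_output(pattern, target):
    """Return the wildcard values if target fits the output pattern, else None."""
    m = re.fullmatch(pattern_to_regex(pattern), target)
    return m.groupdict() if m else None


def plan(target, existing):
    """List the (rule, file) jobs needed to build target, in execution order."""
    if target in existing:
        return []  # simplification: a file that exists is never rebuilt
    for name, (inp, out) in RULES.items():
        wildcards = match_output(out, target)
        if wildcards:
            source = inp.format(**wildcards)
            return plan(source, existing) + [(name, wildcards["file"])]
    raise ValueError(f"no rule produces {target}")


originals = {"1.txt", "2.txt", "3.txt"}
full_chain = plan("3_add_a_add_b_add_c.txt", originals)
only_addb = plan("3_add_a_add_b.txt", originals | {"3_add_a.txt"})
print(full_chain)
print(only_addb)
```

In this sketch the second call schedules only addb because 3_add_a.txt is already present, which is consistent with the step 3 run above where the job counts list only addb. A bare pattern such as {file}_add_a.txt carries no concrete value for {file}, which is why the targets had to be spelled out on the command line.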
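
The hebing rule relies on expand() to build its two input lists. The following Python sketch shows what those two calls evaluate to. The helper expand_sketch is a hypothetical stand-in written for this example, assuming that expand fills the pattern with every combination of the supplied wildcard values; inside a real Snakefile you would simply call snakemake's own expand().

```python
from itertools import product


def expand_sketch(pattern, **wildcards):
    """Hypothetical stand-in for expand(): format the pattern once for
    every combination of the given wildcard values."""
    names = list(wildcards)
    combos = product(*(wildcards[name] for name in names))
    return [pattern.format(**dict(zip(names, combo))) for combo in combos]


# The two named inputs of rule hebing.
a = expand_sketch("{file}_add_a_add_b_add_c.txt", file=["1", "2", "3"])
b = expand_sketch("{file}_add_a_add_b.txt", file=["1", "2"])

# The shell command is "cat {input.a} {input.b}", so the a files come first.
hebing_inputs = a + b
print(hebing_inputs)
```

The resulting order, three add_c files followed by two add_b files, matches both the input list printed in the hebing log entry and the five lines of hebing.txt.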