The order of these posts is a bit messy. I was worried that readers of the earlier post on downloading public sequencing data would not know how I call the various tools, so I am inserting this post here to describe it. It is only a quick pass. In principle all my software is installed under the biosoft folder in my home directory, so my installs usually start like this:

cd ~/biosoft
mkdir macs2 && cd macs2 ## each tool is installed into its own folder

This is just my personal habit: I am not root, so I can't do much system-wide on the Linux server. Here is all my installation code:

## pre-step: download sratoolkit /fastx_toolkit_0.0.13/fastqc/bowtie2/bwa/MACS2/HOMER/QuEST/mm9/hg19/bedtools
## http://www.ncbi.nlm./Traces/sra/sra.cgi?view=software
## http://www.ncbi.nlm./books/NBK158900/

## Download and install sratoolkit
cd ~/biosoft
mkdir sratoolkit && cd sratoolkit
wget http://ftp-trace.ncbi.nlm./sra/sdk/2.6.3/sratoolkit.2.6.3-centos_linux64.tar.gz
## Length: 63453761 (61M) [application/x-gzip]
## Saving to: "sratoolkit.2.6.3-centos_linux64.tar.gz"
tar zxvf sratoolkit.2.6.3-centos_linux64.tar.gz

## Download and install bedtools
cd ~/biosoft
mkdir bedtools && cd bedtools
wget https://github.com/arq5x/bedtools2/releases/download/v2.25.0/bedtools-2.25.0.tar.gz
## Length: 19581105 (19M) [application/octet-stream]
tar -zxvf bedtools-2.25.0.tar.gz
cd bedtools2
make

## Download and install PeakRanger
cd ~/biosoft
mkdir PeakRanger && cd PeakRanger
wget https:///projects/ranger/files/PeakRanger-1.18-Linux-x86_64.zip/
## Length: 1517587 (1.4M) [application/octet-stream]
unzip PeakRanger-1.18-Linux-x86_64.zip
~/biosoft/PeakRanger/bin/peakranger -h

## Download and install bowtie
cd ~/biosoft
mkdir bowtie && cd bowtie
wget https:///projects/bowtie-bio/files/bowtie2/2.2.9/bowtie2-2.2.9-linux-x86_64.zip/download
## Length: 27073243 (26M) [application/octet-stream]
## Saving to: "download"
## I made a mistake when downloading bowtie2: wget saved the file as "download", so rename it
mv download bowtie2-2.2.9-linux-x86_64.zip
unzip bowtie2-2.2.9-linux-x86_64.zip
mkdir -p ~/biosoft/bowtie/hg19_index
cd ~/biosoft/bowtie/hg19_index
# download hg19 chromosome fasta files
wget http://hgdownload.cse./goldenPath/hg19/bigZips/chromFa.tar.gz
# unpack and concatenate chromosome and contig fasta files
tar zvfx chromFa.tar.gz
cat *.fa > hg19.fa
rm chr*.fa
## ~/biosoft/bowtie/bowtie2-2.2.9/bowtie2-build ~/biosoft/bowtie/hg19_index/hg19.fa ~/biosoft/bowtie/hg19_index/hg19

## Download and install BWA
cd ~/biosoft
mkdir bwa && cd bwa
## get bwa-0.7.12.tar.bz2 from http:///projects/bio-bwa/files/
tar xvfj bwa-0.7.12.tar.bz2
# x extracts, v is verbose (details of what it is doing), f names the archive file to read, and j tells it to decompress .bz2 files
cd bwa-0.7.12
make
export PATH=$PATH:/path/to/bwa-0.7.12
# Add bwa to your PATH by editing the ~/.bashrc file (or .bash_profile or .profile file)
# /path/to/ is a placeholder. Replace it with the real path to BWA on your machine
source ~/.bashrc
# bwa index [-a bwtsw|is] index_prefix reference.fasta
bwa index -p hg19bwaidx -a bwtsw ~/biosoft/bowtie/hg19_index/hg19.fa
# -p index name (change this to whatever you want)
# -a index algorithm (bwtsw for long genomes and is for short genomes)

## Download and install macs2
## // https://pypi./pypi/MACS2/
cd ~/biosoft
mkdir macs2 && cd macs2
wget ~~~~~~~~~~~~~~~~~~~~~~MACS2-2.1.1.20160309.tar.gz
tar zxvf MACS2-2.1.1.20160309.tar.gz
cd MACS2-2.1.1.20160309
python setup.py install --user
#################### The log for installing MACS2:
## Creating ~/.local/lib/python2.7/site-packages/site.py
## Processing MACS2-2.1.1.20160309-py2.7-linux-x86_64.egg
## Copying MACS2-2.1.1.20160309-py2.7-linux-x86_64.egg to ~/.local/lib/python2.7/site-packages
## Adding MACS2 2.1.1.20160309 to easy-install.pth file
## Installing macs2 script to ~/.local/bin
## Finished processing dependencies for MACS2==2.1.1.20160309
############################################################
~/.local/bin/macs2 --help
## Example for regular peak calling: macs2 callpeak -t ChIP.bam -c Control.bam -f BAM -g hs -n test -B -q 0.01
## Example for broad peak calling: macs2 callpeak -t ChIP.bam -c Control.bam --broad -g hs --broad-cutoff 0.1

## Download and install homer (Hypergeometric Optimization of Motif EnRichment)
## // http://homer./homer/
## // http://blog.:8080/archives/3024
## pre-install: Ghostscript, seqlogo, blat
cd ~/biosoft
mkdir homer && cd homer
wget http://homer./homer/configureHomer.pl
perl configureHomer.pl -install
perl configureHomer.pl -install hg19
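The `cd ~/biosoft` followed by `mkdir X && cd X` pattern repeats for every tool above, and a plain `mkdir` fails if the folder already exists. A minimal sketch of a helper that wraps this habit is below. It is not part of my original install script: the function name `biosoft_dir` and the `BIOSOFT` variable are hypothetical names introduced only for illustration, and `mkdir -p` is swapped in so that rerunning the script is harmless.

```shell
# Hypothetical helper (not from the original install script):
# create a per-tool folder under $BIOSOFT (default: ~/biosoft)
# and print its path so the caller can cd into it.
biosoft_dir() {
    if [ -z "${1:-}" ]; then
        echo "usage: biosoft_dir <tool-name>" >&2
        return 1
    fi
    # -p: no error if the folder already exists, parents created as needed
    mkdir -p "${BIOSOFT:-$HOME/biosoft}/$1" || return 1
    printf '%s\n' "${BIOSOFT:-$HOME/biosoft}/$1"
}
```

With this in your shell, each install section could start with `cd "$(biosoft_dir macs2)"` instead of the three separate commands.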
For someone at my level, installing software is routine and no longer causes trouble. If you are a beginner it will certainly not be that easy, so keep studying; I can't explain the details here. Once all the software is installed, we can download the paper's processed ChIP-seq results. This matters, because it lets us check whether we have reproduced the authors' analysis:

## step3 : download the results from paper
## http://www./1571.html
mkdir paper_results && cd paper_results
wget ftp://ftp.ncbi.nlm./geo/series/GSE52nnn/GSE52964/suppl/GSE52964_RAW.tar
tar xvf GSE52964_RAW.tar
ls *gz | xargs gunzip

## step4 : run FastQC to check the sequencing quality.
## Here we can see that the raw data we downloaded had already been processed by the authors: adapters and low-quality sequences were removed
ls *.fastq | while read id ; do ~/biosoft/fastqc/FastQC/fastqc "$id"; done
## Sequence length 51
## %GC 39
## Adapter Content passed
## The quality of the reads is pretty good, so we don't need to do any filtering or trimming
mkdir QC_results
mv *zip *html QC_results/
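The `ls *.fastq | while read` loop in step 4 is fine for simple file names. A slightly more defensive variant is sketched below, assuming the same FastQC location as above. The function name `run_fastqc_all` and the `FASTQC` variable are hypothetical names of mine, not anything FastQC provides. It loops over a glob, does nothing when no `.fastq` file is present, and uses FastQC's `-o` option to write the reports straight into the output folder instead of moving them afterwards.

```shell
# Hypothetical wrapper around the step4 loop.
# $1: output folder (default: QC_results)
# $FASTQC: path to the fastqc executable (default: the path used above)
run_fastqc_all() {
    fastqc_bin="${FASTQC:-$HOME/biosoft/fastqc/FastQC/fastqc}"
    outdir="${1:-QC_results}"
    # fastqc -o needs an existing folder
    mkdir -p "$outdir" || return 1
    count=0
    for fq in *.fastq; do
        # with no match the glob stays literal, so skip it
        [ -e "$fq" ] || continue
        "$fastqc_bin" -o "$outdir" "$fq" || return 1
        count=$((count + 1))
    done
    echo "processed $count file(s)"
}
```

Run it from the folder that holds the fastq files, for example `run_fastqc_all QC_results`.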
So we can take these data straight to alignment!