Full Installation

This document covers the full installation of all software required by the mg-process-fastq module and all programmes that it uses. It has been written with Ubuntu Linux in mind, although many of the commands are portable across *nix platforms.

If you already have certain packages installed, feel free to skip the corresponding steps. The bin, lib and code directories are assumed to live in the home directory; if this is not the case on your system, adjust the paths when running these commands.

Setup the System Environment

sudo apt-get install -y make build-essential libssl-dev zlib1g-dev       \
libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev \
libncursesw5-dev xz-utils tk-dev unzip mcl libgtk2.0-dev r-base-core     \
libcurl4-gnutls-dev python-rpy2 git libtbb2 pigz liblzma-dev libhdf5-dev \
texlive-latex-base tree libblas-dev liblapack-dev

cd ${HOME}
mkdir -p bin lib code
echo 'export PATH="${HOME}/bin:$PATH"' >> ~/.bash_profile
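
Running the commands above a second time appends the same export line to ~/.bash_profile again. As a minimal sketch of one way to avoid that, the helper below only appends a line when an identical line is not already in the file. The name append_once is hypothetical and is not part of mg-process-fastq.

```shell
# append_once FILE LINE
# Hypothetical helper: append LINE to FILE unless an identical line already exists.
append_once() {
    file=$1
    line=$2
    # Make sure the file exists so that grep does not fail on a missing file
    touch "$file"
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq -e "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example call (commented out so that sourcing this snippet changes nothing):
# append_once "${HOME}/.bash_profile" 'export PATH="${HOME}/bin:$PATH"'
```

The same helper could be used for the pyenv lines that are written to ~/.bash_profile later in this document.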

Setup pyenv and pyenv-virtualenv

This is required for managing the version of Python and the installation environment for the Python modules so that they can be installed in the user space.

git clone https://github.com/pyenv/pyenv.git ~/.pyenv
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bash_profile
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bash_profile
echo 'eval "$(pyenv init -)"' >> ~/.bash_profile

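# Add the .bash_profile to your .bashrc file
echo 'source ~/.bash_profile' >> ~/.bashrc

# Load the new settings so that PYENV_ROOT and pyenv are available in the current shell
source ~/.bash_profile

git clone https://github.com/pyenv/pyenv-virtualenv.git ${PYENV_ROOT}/plugins/pyenv-virtualenv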

pyenv install 2.7.12
pyenv virtualenv 2.7.12 mg-process-fastq

# Python 3 environment required by iNPS
pyenv install 3.5.3
ln -s ${HOME}/.pyenv/versions/3.5.3/bin/python ${HOME}/bin/py3
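
After this step the pyenv command and the py3 symlink should both be reachable from a new shell. The following is a small sketch for checking that, assuming nothing beyond a POSIX shell; missing_cmds is a hypothetical helper name, not something provided by pyenv.

```shell
# missing_cmds NAME...
# Hypothetical helper: print every NAME that cannot be found on PATH.
# Returns 0 when all commands are available and 1 otherwise.
missing_cmds() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            rc=1
        fi
    done
    return "$rc"
}

# Example call (commented out; prints nothing when everything is in place):
# missing_cmds git pyenv py3
```

If the call prints a name, open a new terminal or source ~/.bash_profile again before continuing, since the PATH changes only take effect in shells that have read that file.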

Installation Process

UCSC Tools

cd ${HOME}/lib
wget http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/bedToBigBed
wget http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/wigToBigWig

wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/faToTwoBit
wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/twoBitInfo

chmod +x bedToBigBed wigToBigWig faToTwoBit twoBitInfo
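
The binaries above are downloaded into ${HOME}/lib, which is not on the PATH set up earlier. If you want them to be callable by name, one option is to symlink them into ${HOME}/bin, as is done for other tools later in this document. The sketch below assumes that is what you want; link_into_bin is a hypothetical helper name, and the targets must be given as absolute paths so that the links resolve.

```shell
# link_into_bin BIN_DIR FILE...
# Hypothetical helper: symlink each executable FILE into BIN_DIR.
# FILE should be an absolute path, otherwise the link will not resolve.
link_into_bin() {
    bin_dir=$1
    shift
    mkdir -p "$bin_dir"
    for target in "$@"; do
        if [ -x "$target" ] && [ ! -d "$target" ]; then
            # -s: symbolic link, -f: replace an existing link with the same name
            ln -sf "$target" "$bin_dir/$(basename "$target")"
        else
            printf 'skipping %s: not an executable file\n' "$target" >&2
        fi
    done
}

# Example call (commented out so that sourcing this snippet changes nothing):
# link_into_bin "${HOME}/bin" "${HOME}/lib/bedToBigBed" "${HOME}/lib/wigToBigWig" \
#     "${HOME}/lib/faToTwoBit" "${HOME}/lib/twoBitInfo"
```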

BioBamBam2

BioBamBam is used for the filtering of aligned reads as part of the ChIP-seq pipeline. It also requires the libmaus2 package to be installed.

cd ${HOME}/lib
git clone https://github.com/gt1/libmaus2.git
cd libmaus2
libtoolize
aclocal
autoheader
automake --force-missing --add-missing
autoconf
./configure --prefix=${HOME}/lib/libmaus2
make
make install

cd ${HOME}/lib
git clone https://github.com/gt1/biobambam2.git
cd biobambam2
autoreconf -i -f
./configure --with-libmaus2=${HOME}/lib/libmaus2 --prefix=${HOME}/lib/biobambam2
make install

Bowtie2 Aligner

cd ${HOME}/lib
wget --max-redirect 1 https://downloads.sourceforge.net/project/bowtie-bio/bowtie2/2.3.4/bowtie2-2.3.4-linux-x86_64.zip
unzip bowtie2-2.3.4-linux-x86_64.zip

BWA Sequence Aligner

cd ${HOME}/lib
git clone https://github.com/lh3/bwa.git
cd bwa
make

FastQC

cd ${HOME}/lib
wget http://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.5.zip
unzip fastqc_v0.11.5.zip
cd FastQC/
chmod 755 fastqc

GEM Sequence Aligner

cd ${HOME}/lib
wget http://barnaserver.com/gemtools/releases/GEMTools-static-core2-1.7.1.tar.gz
tar -xzf GEMTools-static-core2-1.7.1.tar.gz

iNPS Peak Caller

cd ${HOME}/lib
mkdir iNPS
cd iNPS
wget http://www.picb.ac.cn/hanlab/files/iNPS_V1.2.2.zip
unzip iNPS_V1.2.2.zip

cd ${HOME}/bin
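# Quote the delimiter so that ${HOME} and "$@" are written literally
# and only expanded when the wrapper is run
cat > iNPS <<'EOL'
#!/usr/bin/env bash
py3 ${HOME}/lib/iNPS/iNPS_V1.2.2.py "$@"
EOL

chmod 755 iNPS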
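
The wrapper only forwards its arguments if "$@" ends up in the file literally, so that it is expanded when the wrapper runs and not when the file is written. As a sketch of the same idea in reusable form, the hypothetical helper below (write_wrapper is not part of iNPS or mg-process-fastq) writes such a script with printf. It assumes the interpreter and script paths contain no whitespace.

```shell
# write_wrapper FILE INTERPRETER SCRIPT
# Hypothetical helper: create FILE as an executable wrapper that runs
# "INTERPRETER SCRIPT" and forwards all of the wrapper's own arguments.
# Assumes INTERPRETER and SCRIPT contain no whitespace.
write_wrapper() {
    wrapper=$1
    interpreter=$2
    script=$3
    {
        printf '%s\n' '#!/bin/sh'
        # "$@" sits inside single quotes here, so it is written literally
        # and expanded only when the wrapper itself is executed
        printf 'exec %s %s "$@"\n' "$interpreter" "$script"
    } > "$wrapper"
    chmod 755 "$wrapper"
}

# Example call (commented out so that sourcing this snippet changes nothing):
# write_wrapper "${HOME}/bin/iNPS" py3 "${HOME}/lib/iNPS/iNPS_V1.2.2.py"
```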

Kallisto

cd ${HOME}/lib
wget https://github.com/pachterlab/kallisto/releases/download/v0.43.1/kallisto_linux-v0.43.1.tar.gz
tar -xzf kallisto_linux-v0.43.1.tar.gz

SAMtools

cd ${HOME}/lib
git clone https://github.com/samtools/htslib.git
cd htslib
autoheader
autoconf
./configure --prefix=${HOME}/lib/htslib
make
make install

cd ${HOME}/lib
git clone https://github.com/samtools/samtools.git
cd samtools
autoheader
autoconf -Wno-syntax
./configure --prefix=${HOME}/lib/samtools
make
make install

bedTools

cd ${HOME}/lib
wget https://github.com/arq5x/bedtools2/releases/download/v2.26.0/bedtools-2.26.0.tar.gz
tar -zxvf bedtools-2.26.0.tar.gz
cd bedtools2
make

Prepare the Python Environment

Install APIs and Pipelines

Check out the code for the DM API and the mg-process-fastq pipelines:

cd ${HOME}/code
pyenv activate mg-process-fastq
pip install git+https://github.com/Multiscale-Genomics/mg-dm-api.git
pip install git+https://github.com/Multiscale-Genomics/mg-tool-api.git

git clone https://github.com/Multiscale-Genomics/mg-process-fastq.git
cd mg-process-fastq
pip install -e .
pip install -r requirements.txt

Install MACS2

MACS2 should be installed as a dependency of the mg-process-fastq package. If it is not, install it separately as described below.

For Python 2.7:

cd ${HOME}/code
pyenv activate mg-process-fastq
pip install MACS2

ln -s ${HOME}/.pyenv/versions/mg-process-fastq/bin/macs2 ${HOME}/bin/macs2

For Python 3.6:

cd ${HOME}/code
pyenv activate mg-process-fastq
git clone https://github.com/taoliu/MACS.git
cd MACS
git checkout MACS2p3

cython MACS2/*.pyx
cython MACS2/IO/*.pyx
python setup_w_cython.py install

pip install .
alias macs2="macs2p3"

Install iDEAR

cd ${HOME}/lib

# The remaining commands are R code: start an R session (run R) and enter them there
source("https://bioconductor.org/biocLite.R")
biocLite("BSgenome")
biocLite("DESeq2")
if(!require("devtools")) install.packages("devtools")
devtools::install_bitbucket("juanlmateo/idear")

Install TADbit

cd ${HOME}/lib
wget https://github.com/3DGenomes/TADbit/archive/dev.zip -O tadbit.zip
unzip tadbit.zip
cd TADbit-dev

# If the pyenv env is not called mg-process-fastq then change this to match,
# the same is true for the version of python
python setup.py install --install-lib=${HOME}/.pyenv/versions/mg-process-fastq/lib/python2.7/site-packages/ --install-scripts=${HOME}/bin

Install BSseeker

cd ${HOME}/lib
git clone https://github.com/BSSeeker/BSseeker2.git

cd ${HOME}/code/mg-process-fastq
ln -s ${HOME}/lib/BSseeker2/bs_align bs_align
ln -s ${HOME}/lib/BSseeker2/bs_index bs_index
ln -s ${HOME}/lib/BSseeker2/bs_utils bs_utils

Trim Galore

cd ${HOME}/lib
pip install cutadapt
wget -O trim_galore.tar.gz https://github.com/FelixKrueger/TrimGalore/archive/0.5.0.tar.gz
tar -xzf trim_galore.tar.gz

cd ${HOME}/bin
ln -s ${HOME}/lib/TrimGalore-0.5.0/trim_galore trim_galore

When running on a COMPSs VM, the symlink must be created in a system-accessible location:

sudo ln -s ${HOME}/lib/TrimGalore-0.5.0/trim_galore /usr/local/bin/trim_galore
pip install cutadapt
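
The Trim Galore version number appears in the download URL, in the name of the unpacked directory and in each symlink command, so all of them have to change together when a different release is used. As a sketch of one way to keep them in step, the hypothetical helpers below build both strings from a single version value; the URL and directory patterns are taken from the commands above.

```shell
# Hypothetical helpers that derive every Trim Galore path from one version string.

# trim_galore_url VERSION: print the GitHub archive URL for that release
trim_galore_url() {
    printf 'https://github.com/FelixKrueger/TrimGalore/archive/%s.tar.gz\n' "$1"
}

# trim_galore_dir BASE VERSION: print the directory the archive unpacks to under BASE/lib
trim_galore_dir() {
    printf '%s/lib/TrimGalore-%s\n' "$1" "$2"
}

# Example use (commented out so that sourcing this snippet downloads nothing):
# version=0.5.0
# wget -O trim_galore.tar.gz "$(trim_galore_url "$version")"
# ln -s "$(trim_galore_dir "$HOME" "$version")/trim_galore" "${HOME}/bin/trim_galore"
```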

Post-Installation Tidy Up

cd ${HOME}/lib
rm *.zip *.tar.gz
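
Before deleting, it can be useful to see exactly which archives the command would remove. The sketch below lists the .zip and .tar.gz files that sit directly inside a directory, without descending into the unpacked tool directories. list_archives is a hypothetical helper name, and -maxdepth assumes GNU or BSD find, as shipped with Ubuntu.

```shell
# list_archives DIR
# Hypothetical helper: print the .zip and .tar.gz files directly inside DIR,
# one per line and sorted, without looking into subdirectories.
list_archives() {
    find "$1" -maxdepth 1 -type f \( -name '*.zip' -o -name '*.tar.gz' \) | sort
}

# Example call (commented out so that sourcing this snippet does nothing):
# list_archives "${HOME}/lib"
```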