Installing FunGene Pipeline

Introduction

The FunGene Pipeline processes functional gene sequences. This processing differs from that of rRNA gene sequences in several respects:

  • Chimeras are usually removed using uchime in de novo mode.
  • FrameBot is used to correct sequencing errors (insertions & deletions) and translate DNA sequences.
  • FrameBot can output an OTU table based on the closest match to a reference database (analogous to a supervised approach).
  • HMMER3 is used to align the translated sequences.

The aligned sequences can then be processed according to the unsupervised method, following the same steps as used for rRNA gene sequences.

Install Dependencies

The FunGene Pipeline makes use of RDPTools, so be sure to install it and its dependencies (java and hmmer) first. In addition, you will need the following:

  • infernal
  • FastTree
  • blast2
  • USEARCH 8.1

You can check whether the first three of these are already installed by searching for them in Synaptic Package Manager or by calling up the help text for each:

# For FastTree:
fasttree -h

# For infernal:
cmalign -h

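# For blast2 (legacy BLAST, which provides blastall and formatdb; running blastall with no arguments prints its usage):
blastall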

If you need to install any of these packages, you may do so with apt (new to Ubuntu 16.04) or apt-get, using the commands below. If a program is already installed, you will get a message to that effect, and possibly the option to update.

sudo apt install fasttree
sudo apt install infernal
sudo apt install blast2
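
Once the packages are installed, the individual checks above can be collapsed into one loop. The following is a minimal sketch rather than part of the pipeline: report_missing is a hypothetical helper name, and the command names are the ones used elsewhere on this page (fasttree, cmalign, hmmalign, java, plus blastall and formatdb from the config file). It prints the name of every command that is not on your PATH and prints nothing if all are found.

```shell
# report_missing is a hypothetical helper, not part of the FunGene Pipeline.
# It prints each command name that cannot be found on PATH.
report_missing() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Commands this page relies on; adjust the list to match your system.
report_missing fasttree cmalign hmmalign java blastall formatdb
```

Any name that is printed points to a package that still needs to be installed or a program that sits outside your PATH.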

If you do not already have USEARCH 8.1, get it from drive5.com, make it executable, rename it to usearch8.1, and move it to /usr/local/bin/. For example, if you download the file to ~/Downloads:

cd ~/Downloads
chmod 755 usearch8.1.1861_i86linux32
sudo mv usearch8.1.1861_i86linux32 /usr/local/bin/usearch8.1
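
As an alternative sketch, the coreutils install command can set the permissions and place the file under its new name in one step. Note that it copies rather than moves, so the download stays in place. The wrapper name install_tool is hypothetical and is only there to make the two arguments explicit.

```shell
# install_tool is a hypothetical helper: copy SRC to DEST and set mode 755.
install_tool() {
    install -m 755 "$1" "$2"
}

# Example (writing to /usr/local/bin needs root, so run the script with sudo):
# install_tool ~/Downloads/usearch8.1.1861_i86linux32 /usr/local/bin/usearch8.1
```

Afterwards, command -v usearch8.1 should print /usr/local/bin/usearch8.1 if that directory is on your PATH.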

Install FunGene Pipeline

Next, clone FunGene from its git repository:

cd /usr/local/
sudo git clone https://github.com/rdpstaff/fungene_pipeline.git

The final step is to create a configuration file. A template is provided as config_skel.ini in /usr/local/fungene_pipeline. The edited file should be saved in that same directory under the name config.ini. You may edit the file by changing to that directory and using sudo nano, but you may find it easier to copy config_skel.ini to your home directory as config.ini, edit it there with gedit, and then move it back to /usr/local/fungene_pipeline. For example:

cd ~/
cp /usr/local/fungene_pipeline/config_skel.ini ~/config.ini
gedit config.ini

# Edit the paths in the file and save changes.

sudo mv config.ini /usr/local/fungene_pipeline/config.ini

Having followed all instructions on this website, my config.ini file looks like this:

[pipeline]
resource_dir = ~/resources
blastx_db = #/scratch/blast_db/nr #unused
blastn_db = #/scratch/blast_db/nt #unused
distribute_jobs = false #Set to true to use qsub-like submission to job control system (only tested with open gridengine 6.2u3)

cmalign_cmd = /usr/bin/cmalign
hmmalign_cmd = /usr/bin/hmmalign
blast_cmd = /usr/bin/blastall
formatdb_cmd = /usr/bin/formatdb
parse_error_analysis_cmd = /usr/local/fungene_pipeline/parseErrorAnalysis.py
usearch_cmd=/usr/local/bin/usearch8.1

#gridware_env_path=/usr/bin:/usr/sbin:/Software/bin #not used

process_class_jar = /usr/local/RDPTools/SeqFilters.jar
cluster_jar = /usr/local/RDPTools/Clustering.jar
framebot_jar = /usr/local/RDPTools/FrameBot.jar
alignment_tools_jar = /usr/local/RDPTools/AlignmentTools/dist/AlignmentTools.jar
abundance_jar = /usr/local/RDPTools/AbundanceStats.jar

When editing your config.ini file, make sure that all paths are correct for your package installations.
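
One way to carry out this check is a short script that reads config.ini and reports every *_cmd and *_jar entry whose path does not exist. This is a minimal sketch, not part of the FunGene Pipeline: the function name check_config_paths is hypothetical, and it assumes each setting sits on one line as key = value, optionally followed by a # comment, with no spaces inside the path.

```shell
# check_config_paths is a hypothetical helper, not part of the FunGene Pipeline.
# It prints one line per *_cmd or *_jar setting whose path does not exist,
# and returns non-zero if any such setting was found.
check_config_paths() {
    # Reduce each matching "key = path  #comment" line to "key path".
    # Lines that start with # (commented-out settings) do not match.
    sed -nE 's/^[[:space:]]*([A-Za-z_]+_(cmd|jar))[[:space:]]*=[[:space:]]*([^#[:space:]]*).*$/\1 \3/p' "$1" | {
        missing=0
        while read -r key path; do
            if [ ! -e "$path" ]; then
                echo "MISSING: $key = $path"
                missing=1
            fi
        done
        # The status of this test becomes the status of the function.
        [ "$missing" -eq 0 ]
    }
}

# Example:
# check_config_paths /usr/local/fungene_pipeline/config.ini
```

No output means every listed program and jar file was found. Settings such as resource_dir are skipped on purpose, since the sketch only looks at keys ending in _cmd or _jar.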