The FunGene Pipeline processes functional gene sequences. Its workflow differs from the one used for rRNA gene sequences in several respects:
- Chimeras are usually removed using uchime in de novo mode.
- FrameBot is used to correct sequencing errors (insertions & deletions) and translate DNA sequences.
- FrameBot can output an OTU table based on closest match to a reference database. (Analogous to a supervised approach.)
- HMMER3 is used to align the translated sequences.
The aligned sequences can then be processed according to the unsupervised method, following the same steps as used for rRNA gene sequences.
The FunGene Pipeline makes use of RDPTools, so be sure to install it and its dependencies (java and hmmer) first. In addition, you will need the following:
- FastTree
- Infernal
- blast2
- USEARCH 8.1
You can check if you have already installed the first three of the above by searching for them in Synaptic Package Manager or by calling up help files associated with each:
    # For FastTree:
    fasttree -h
    # For infernal:
    cmalign -h
    # For blast2:
    blastn -help
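The same check can be scripted. The following is a minimal sketch, not part of the FunGene Pipeline itself: the helper name missing_cmds is hypothetical, and it assumes that the three packages provide the executables fasttree, cmalign, and blastn, as in the help commands above.

```shell
#!/usr/bin/env bash
# Print the name of every command given as an argument that is NOT on PATH.
# missing_cmds is a hypothetical helper written for this illustration.
missing_cmds() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Executables assumed to come from FastTree, Infernal, and blast2.
missing="$(missing_cmds fasttree cmalign blastn)"
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All three programs were found."
fi
```

Any name printed under "Not found on PATH" points to a package that still needs to be installed.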
If you need to install any of these packages, you may do so with apt (new to Ubuntu 16.04) or apt-get, using the commands in the code box below. If the programs are already installed, have no fear. You will get a message to that effect, and possibly the option to update.
    sudo apt install fasttree
    sudo apt install infernal
    sudo apt install blast2
You may have already installed USEARCH 8.1 when you installed RDPTools. If not, get it from drive5.com, make it executable, rename it to usearch8.1, and move it to the directory /usr/local/bin/. You can do this from the command line with the following:
    cd ~/Downloads
    wget https://drive5.com/downloads/usearch8.1.1861_i86linux32.gz
    gzip -d usearch8.1.1861_i86linux32.gz
    chmod 755 usearch8.1.1861_i86linux32
    sudo mv usearch8.1.1861_i86linux32 /usr/local/bin/usearch8.1
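To confirm that the binary ended up where the pipeline will look for it, a quick check along these lines may help. This is only a sketch: check_executable is a hypothetical helper, and the path is the one used in the commands above.

```shell
#!/usr/bin/env bash
# Report whether a file exists and is executable; return non-zero if not.
# check_executable is a hypothetical helper written for this illustration.
check_executable() {
    local bin="$1"
    if [ -x "$bin" ] && [ ! -d "$bin" ]; then
        echo "OK: $bin"
    else
        echo "NOT EXECUTABLE: $bin"
        return 1
    fi
}

# Path taken from the installation commands above.
check_executable /usr/local/bin/usearch8.1 || echo "Repeat the chmod and mv steps."
```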
Install FunGene Pipeline
Next, clone FunGene from its git repository into /usr/local, the location assumed by the paths used in the rest of this section:
    cd /usr/local
    sudo git clone https://github.com/rdpstaff/fungene_pipeline.git
The final step is to create a configuration file. A template file is provided as config_skel.ini in /usr/local/fungene_pipeline. The edited file should be saved in the same directory as config.ini. You may edit the file by changing to that directory and using sudo nano, but you may find it easier to copy config_skel.ini to your home directory as config.ini, edit it there with gedit, and then move it back to /usr/local/fungene_pipeline. For example:
    cd ~/
    cp /usr/local/fungene_pipeline/config_skel.ini ~/config.ini
    gedit config.ini
    # Edit the paths in the file and save changes.
    sudo mv config.ini /usr/local/fungene_pipeline/config.ini
Having followed all instructions on this website, my config.ini file looks like this:
    [pipeline]
    resource_dir = ~/resources
    blastx_db = #/scratch/blast_db/nr #unused
    blastn_db = #/scratch/blast_db/nt #unused
    distribute_jobs = false #Set to true to use qsub-like submission to job control system (only tested with open gridengine 6.2u3)
    cmalign_cmd = /usr/bin/cmalign
    hmmalign_cmd = /usr/bin/hmmalign
    blast_cmd = /usr/bin/blastall
    formatdb_cmd = /usr/bin/formatdb
    parse_error_analysis_cmd = /usr/local/fungene_pipeline/parseErrorAnalysis.py
    usearch_cmd=/usr/local/bin/usearch8.1
    #gridware_env_path=/usr/bin:/usr/sbin:/Software/bin #not used
    process_class_jar = /usr/local/RDPTools/SeqFilters.jar
    cluster_jar = /usr/local/RDPTools/Clustering.jar
    framebot_jar = /usr/local/RDPTools/FrameBot.jar
    alignment_tools_jar = /usr/local/RDPTools/AlignmentTools/dist/AlignmentTools.jar
    abundance_jar = /usr/local/RDPTools/AbundanceStats.jar
In editing your config.ini file, check to make sure that all paths are correct for your package installations.
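Checking each path by hand is error-prone, so a small script can do it. The sketch below is not part of the FunGene Pipeline: check_config_paths is a hypothetical helper, and it assumes the simple "key = value #comment" layout of the config.ini described in this section. It reports every entry whose value looks like a path (starts with / or ~) but does not exist on disk.

```shell
#!/usr/bin/env bash
# List config entries whose path-like value does not exist on this machine.
# check_config_paths is a hypothetical helper written for this illustration.
check_config_paths() {
    local config="$1" line key value
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip blank lines, whole-line comments, and the [section] header.
        case "$line" in
            ''|'#'*|'['*) continue ;;
        esac
        key="${line%%=*}"
        value="${line#*=}"
        value="${value%%#*}"   # drop any trailing "#comment"
        # Trim surrounding whitespace from the key and the value.
        key="$(printf '%s' "$key" | tr -d '[:space:]')"
        value="$(printf '%s' "$value" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')"
        # Expand a leading ~ and ignore values that are not paths.
        case "$value" in
            '~'*) value="$HOME${value#'~'}" ;;
            /*) ;;
            *) continue ;;
        esac
        [ -e "$value" ] || echo "MISSING: $key -> $value"
    done < "$config"
}

# Location of the config file as used in this section.
config=/usr/local/fungene_pipeline/config.ini
if [ -r "$config" ]; then
    check_config_paths "$config"
fi
```

If the script prints nothing, every path-like entry in the file points to something that exists. It does not verify that the file found is the right program or version.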