Treeing in QIIME
If you are using QIIME 2 to process your sequence data, by far the easiest way to tree the representative sequences is with the QIIME 2 phylogeny plug-in. Its align-to-tree-mafft-fasttree pipeline outputs both an unrooted and a midpoint-rooted tree in compressed qza format. These can be exported in regular Newick format. The code below is included in the script for processing 16S rRNA gene sequences with QIIME 2 and DADA2.
qiime phylogeny align-to-tree-mafft-fasttree \
--i-sequences rep-seqs.qza \
--o-alignment aligned-rep-seqs.qza \
--o-masked-alignment masked-aligned-rep-seqs.qza \
--o-tree unrooted-tree.qza \
--o-rooted-tree rooted-tree.qza
# Export tree files
qiime tools export \
--input-path unrooted-tree.qza \
--output-path phyloseq
cd phyloseq
mv tree.nwk unrooted_tree.nwk
cd ../
qiime tools export \
--input-path rooted-tree.qza \
--output-path phyloseq
cd phyloseq
mv tree.nwk rooted_tree.nwk
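The export-and-rename steps above can also be wrapped in a small shell function so that no cd is needed. This is only a sketch: the function name export_tree is hypothetical, and it assumes that qiime tools export writes a file named tree.nwk into the output directory, as in the commands above.

```shell
# Export a QIIME 2 tree artifact and rename the exported tree.nwk.
# export_tree is a hypothetical helper, not part of QIIME 2.
# Usage: export_tree <artifact.qza> <output-dir> <new-name.nwk>
export_tree() {
    local artifact=$1 outdir=$2 newname=$3
    # Export the artifact; stop if the export fails.
    qiime tools export \
        --input-path "$artifact" \
        --output-path "$outdir" || return 1
    # Rename so a later export into the same directory does not overwrite it.
    mv "$outdir/tree.nwk" "$outdir/$newname"
}

# Example calls (uncomment to run):
# export_tree unrooted-tree.qza phyloseq unrooted_tree.nwk
# export_tree rooted-tree.qza phyloseq rooted_tree.nwk
```

Renaming straight after each export matters because both exports write tree.nwk into the same phyloseq directory.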
Treeing with infernal and fasttree
Infernal is a structure-aware aligner and therefore a good choice for aligning rRNA gene sequences. The aligned sequences can then be treed with FastTree. This was the method of choice presented in the RDPTools command line tutorials on this web site.
Infernal makes use of covariance models built from a database of sequences. RDPTools includes covariance models for bacterial and archaeal 16S rRNA gene sequences and fungal 28S rRNA gene sequences.
See my GitBook at https://jfq3.gitbook.io/rdptools-docker for how to install Docker and RDPTools. Since the RDP closed, this is the only reliable way of installing RDPTools. The installation will map a directory on your computer that the docker image can access.
Once you have the RDPTools docker image installed, open a terminal and start RDPTools by entering the following:
docker start rdp_tools
docker attach rdp_tools
Copy the script below to your clipboard. It includes paths to three different covariance models. Be sure to comment out the ones you do not want to use. As the script is written, the model for bacteria will be used.
#!/bin/bash
# This configuration is correct for a Docker installation of RDPTools.
infernalDir=/usr/lib/infernal
# Comment out all models you do not want to use.
# For bacteria:
cmModelDir=/usr/local/fungene_pipeline/resources/RRNA_16S_BACTERIA
# # For archaea:
# cmModelDir=/mnt/research/rdp/public/fungene_pipeline/resources/RRNA_16S_ARCHAEA
# # For fungi:
# cmModelDir=/mnt/research/rdp/public/fungene_pipeline/resources/RRNA_28S
FastTree=/usr/bin/fasttree
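# The input fasta file is the first command-line argument.
f=$1
# Align the sequences to the covariance model and write an aligned fasta file.
$infernalDir/cmalign -g --noprob --outformat AFA \
--dnaout -o aligned.fasta $cmModelDir/model.cm "$f"
# Build the tree from the nucleotide alignment using the GTR model.
$FastTree -nt -gtr < aligned.fasta > my_tree.nwk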
Then open nano by typing:
nano tree_with_infernal.sh
Paste the clipboard contents into the terminal. Press Ctrl O and then Enter to write the file out under the same name, and Ctrl X to exit.
Make the script executable:
chmod u+x tree_with_infernal.sh
Put the fasta file of your representative sequences in the mapped directory and then run the script, giving the name of your fasta file (ref_seqs.fasta in the example below).
./tree_with_infernal.sh ref_seqs.fasta
The tree file my_tree.nwk will appear in the mapped directory. The tree is unrooted, but can be rooted in R later.
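Before moving on, it can be worth checking that the tree contains one tip per input sequence. The sketch below is not part of RDPTools, and the function names are hypothetical. It counts fasta header lines and counts tips in the Newick file as commas plus one, which assumes that no sequence name contains a comma.

```shell
# Count the sequences in a fasta file (header lines start with ">").
count_fasta_seqs() {
    grep -c '^>' "$1"
}

# Count the tips in a Newick tree. In any tree the number of tips is
# the number of commas plus one, provided no label contains a comma.
count_newick_tips() {
    local commas
    commas=$(tr -cd ',' < "$1" | wc -c)
    echo $((commas + 1))
}

# Example (uncomment to run in the mapped directory):
# echo "sequences: $(count_fasta_seqs ref_seqs.fasta)"
# echo "tips:      $(count_newick_tips my_tree.nwk)"
```

If the two numbers differ, look at aligned.fasta first to see whether every sequence made it through the alignment step.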
When you have finished using RDPTools and Docker, press Ctrl C and close the terminal. Then reopen the terminal and enter:
docker stop rdp_tools
Or stop rdp_tools in Docker Desktop.