The RDP Classifier was updated to version 2.13 and released on 30 July 2020 on SourceForge (https://sourceforge.net/projects/rdp-classifier/). The update is based on bacterial and archaeal training set No. 18, which adds over 800 new genera and significantly rearranges several phyla and genera based on the latest genome analyses. For detailed explanations of these revisions, please see the release notes.
As in earlier editions, databases are also included for classification of fungal ITS sequences by UNITE and Warcup taxonomies and for classification of fungal 28S rRNA gene sequences.
The web version of the classifier has been updated to use training set No. 18. The taxonomy browser has also been updated to comply with the new training set.
Installing as a stand-alone application
Written in Java, the RDP Classifier may be installed and run in Windows, Mac OS, Linux, Unix and Solaris environments.
- Test if the Java Runtime Environment (JRE) is already installed by opening a terminal and entering:
- If necessary, install JRE downloaded from here. While Oracle now charges for the development version of Java (JDK), JRE is still free.
Download RDP Classifier version 2.13 from here and extract the compressed file.
classifier.jar file will be in sub-directory
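The Java check described above can be sketched as a short shell script. This is a minimal sketch, not part of the RDP documentation: the helper name have_cmd is my own, and the paths and file names in the commented classifier call are placeholders to replace with your own.

```shell
#!/usr/bin/env bash
# Return success if the named command is available on the PATH.
# (have_cmd is a hypothetical helper name, not part of RDP Classifier.)
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd java; then
    # java writes its version banner to stderr, so redirect it to stdout.
    java -version 2>&1
else
    echo "Java not found: install a JRE before running classifier.jar"
fi

# Once Java is available, a typical classification run looks like this.
# The jar path, input file and output file are placeholders:
#   java -Xmx1g -jar /path/to/classifier.jar classify -o classified.txt input.fasta
```

The classify call is left as a comment because the location of classifier.jar depends on where you extracted the download. Check the release notes for the options available in your version.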
Installing as part of RDPTools
The only way to update the classifier in RDPTools is to remove and re-install RDPTools following the instructions here. Updating RDPTools will not work because the new databases are not part of the code but are downloaded during RDPTools installation.
Updating the QIIME2 version of the RDP Classifier
Reference sequences and corresponding taxonomy file for re-training the RDP Classifier included in QIIME2 can be downloaded by clicking here. See this page for how to re-train the classifier.
ASV Tables Created in R
ASV tables created using the Bioconductor/R version of DADA2 are matrix files with samples as rows and taxa as columns. The taxa names are the sequences themselves. Because these matrices can be quite large they are most conveniently saved as compressed rds files. Read these files into R and create an experiment level phyloseq object containing an OTU or ASV table and representative sequences with the following R script: Continue reading “Import DADA2 ASV Tables into phyloseq”
Compact letter displays (CLDs) are letters that show which treatment groups are not significantly different by some statistical test. It is often desirable to include CLDs on graphs. Here I show how to add them to a box plot created with ggplot2. Continue reading “Compact Letter Displays”
I have added a new function (get_plot_limits) to my package QsRutils. It extracts the minimum and maximum X and Y values for a ggplot panel. This is useful in formatting ggplots. For example, you may wish to expand the panel to avoid text running out of the panel, or nudge text relative to some point. For an example, see my post on adding compact letter displays to box plots created with ggplot2.
The method of rooting trees described in the post “Unifrac and Tree Roots” is now included in QsRutils beginning with version 0.3.2 as function root_phyloseq_tree. Given a phyloseq object with an unrooted tree, it returns the same type of phyloseq object with the tree rooted by the longest terminal branch.
Unifrac distances have the attraction of including phylogenetic relatedness, based on a tree of the representative sequences, in the distances among samples calculated from an OTU table. FastTree is the usual method of choice in generating the tree, although USEARCH also provides a method. Both methods calculate unrooted trees, and calculation of Unifrac distances requires a rooted tree. The problem arises in how to best root the tree. I found a discussion of the problem on the phyloseq GitHub site (https://github.com/joey711/phyloseq/issues/597). Continue reading “Unifrac and Tree Roots”
If you installed R from the Ubuntu repository with
sudo apt-get install r-base
you most likely got an out-of-date version. In February 2018, that method still gave me R version 3.2.3 (2015-12-10). To get the latest versions of R and its packages, you need to add CRAN to the apt-get repositories. Do this with the code below. Enter one line at a time. Cut and paste to prevent errors. Continue reading “Update R in Ubuntu”
If you need to know the execution time for a bash script, you can place it inside the script below. The total run time will be printed to the screen after the script finishes. Continue reading “Get Execution Time for a Shell Script”
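The script from the full post is not reproduced in this excerpt. As a minimal sketch of the same idea, assuming bash and its built-in SECONDS counter, a timing wrapper could look like the following. The function name format_elapsed and the sleep placeholder are my own illustrations, not the original code.

```shell
#!/usr/bin/env bash
# Format a whole number of seconds as HH:MM:SS.
# (format_elapsed is a hypothetical helper name.)
format_elapsed() {
    local total=$1
    printf '%02d:%02d:%02d\n' $((total / 3600)) $((total % 3600 / 60)) $((total % 60))
}

# bash increments SECONDS once per second; resetting it starts the clock.
SECONDS=0

# --- place the commands to be timed here (sleep is only a placeholder) ---
sleep 1

echo "Total run time: $(format_elapsed "$SECONDS")"
```

SECONDS has one-second resolution, which is usually enough for long-running pipelines. For finer timing of a single command, the bash keyword time is an alternative.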
Gedit is the basic editor that is included in Ubuntu and other Linux distributions. Its functionality can be extended with plugins as explained in the post below. I installed the plugin initially because it allows one to comment or un-comment selected lines of text. I find this useful when I want to include two configuration blocks in a script, say one for a local installation of a program and another for a remote installation on a cluster. If you do this, just make sure the appropriate blocks are commented and un-commented when you run the script.
Source: Code Comment – gedit Plugin | Delightly Linux
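To illustrate the two-configuration-block pattern described above, here is a minimal sketch of a script header with one block active and the other commented out. The variable names, paths and thread counts are hypothetical and chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical settings: none of these names or paths come from the original post.

# --- Local installation (active) ---
TOOL_DIR="$HOME/tools/mytool"
THREADS=4

# --- Cluster installation (commented out) ---
# TOOL_DIR="/shared/apps/mytool"
# THREADS=32

echo "Using $TOOL_DIR with $THREADS threads"
```

Selecting a block in gedit and toggling its comments with the plugin switches the script from one environment to the other. Only one block should be left un-commented at a time.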
The R package vegan includes the function ordiplot for making ordination plots using R’s base graphics. Additionally, vegan provides several functions for enhancing the plots with spiders, hulls, and ellipses. It is even possible to overlay an ordination plot with a cluster diagram. (See my package ggordiplots on GitHub for making the same kinds of plots with ggplot graphics.) Vegan’s functions for adding these layers begin with “ordi”: ordispider, ordihull, ordiellipse, ordicluster. Earlier this year I discovered the package vegan3d. It makes use of rgl graphics, which means that the plots it generates can be scaled and rotated with the mouse. This is not just fun – it allows you to see how well separated treatment groups are in the ordination space.
Continue reading “Rotatable 3D Plots”