Processing and Analyzing Sequencing Data

This site is primarily about processing and analyzing DNA sequencing data for the purpose of characterizing microbial communities. I started it as a repository for tutorials I have written and materials for workshops I have taught. Many of these concern how to use the tools provided by the Ribosomal Database Project (RDP) at Michigan State University. These tools have expanded beyond the original web-based tools for trimming, sorting, classifying, and clustering bacterial rRNA gene sequences. They presently include web-based and command-line versions of FunGene and Xander for functional genes, and now, in collaboration with Kostas Konstantinidis’ lab at Georgia Tech, MiGA for genome classification. Other tutorials are concerned with how to get data into R, and especially phyloseq, for downstream analysis of microbial community structure. To ease these processes, I have included R functions and packages with detailed vignettes.

Be sure to check out the Resources page for books and web resources I recommend. These can help you with tasks beyond the ones covered on this website.

Sequencing technology has changed greatly over the last 20 years, and R and its library of packages have continuously evolved, too. This means that, to keep up, this “old dog” has to constantly learn new tricks. I will blog here about the new tricks I discover, not only to share them with you but also because this old dog is sometimes forgetful and needs a place to be reminded of lessons previously learned.

I add to this site as time permits, so please check back often.