FIGARO

FIGARO is a program written by the folks at Zymo Research to take the guesswork out of deciding what truncation parameters to use with the QIIME2 DADA2 plug-in. A pre-print describing the program and its use is available here. The program is available on GitHub, along with instructions for installing a Docker version and a Python command-line version. I used conda to create an environment for FIGARO, and I can share my FIGARO yml file with you to make it easier to install your own FIGARO environment. If you have not created environments before, see my tutorial Conda and Virtual Environments.

Install FIGARO

Download my FIGARO yml file, paste it into a text editor (I recommend Notepad++), and save it as FIGARO.yml with line endings appropriate for the system where you will install FIGARO. For example, if you will be running the program on a cluster, save it with Linux/Unix line endings. Then move the file into your home directory.
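If the file has already been saved with Windows line endings, it can also be converted after the fact. The following is a minimal sketch rather than part of FIGARO itself; it assumes the file is named FIGARO.yml and sits in your home directory, and the function name to_unix_line_endings is only illustrative.

```python
from pathlib import Path


def to_unix_line_endings(data: bytes) -> bytes:
    """Convert Windows (CRLF) and old Mac (CR) line endings to Unix (LF)."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


if __name__ == "__main__":
    # Assumed location of the environment file; adjust to wherever you saved it.
    yml_path = Path.home() / "FIGARO.yml"
    if yml_path.exists():
        yml_path.write_bytes(to_unix_line_endings(yml_path.read_bytes()))
        print(f"Normalized line endings in {yml_path}")
    else:
        print(f"{yml_path} not found; nothing to convert")
```

The conversion works on raw bytes, so the rest of the file content is left untouched.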

Open a terminal and move into your home directory. If your installation of miniconda requires it, source conda.sh:

source ~/miniconda3/etc/profile.d/conda.sh

Then create your own FIGARO environment:

conda env create -f FIGARO.yml

Next, download the GitHub repository as a zip file (it will be named figaro-master.zip) and place it in your home directory. From your home directory, run the following commands:

unzip figaro-master.zip
rm figaro-master.zip
mv figaro-master figaro
cd figaro
chmod 755 figaro.py

Test FIGARO

You can test your FIGARO installation by running the following code. Paste it into a file named test_figaro.sh and run it.

#!/bin/bash

# Test FIGARO installation
# Activate the FIGARO environment
source ~/miniconda3/etc/profile.d/conda.sh # If necessary for your conda installation.
conda activate figaro

cd 
mkdir test_figaro
cd test_figaro

# Download example files from the QIIME2 tutorial pages
wget "https://data.qiime2.org/2020.2/tutorials/atacama-soils/1p/forward.fastq.gz"
wget "https://data.qiime2.org/2020.2/tutorials/atacama-soils/1p/reverse.fastq.gz"
 
# Decompress
gzip -d *.fastq.gz

# Rename the files in Zymo format
mv forward.fastq sam1_16s_R1.fastq
mv reverse.fastq sam1_16s_R2.fastq

# Run FIGARO
# cd to installation folder
cd ~/figaro
python figaro.py -i ~/test_figaro/ -o ~/test_figaro/ -f 10 -r 10 -a 253 -F zymo

conda deactivate

It is not necessary to demultiplex the files for this test. The parameters are:

    • -i    the input directory
    • -o    the output directory
    • -f    the length of the forward primer. Enter 1 if the primer has been removed.
    • -r    the length of the reverse primer. Enter 1 if the primer has been removed.
    • -a    the expected merged amplicon length. You could be conservative and give a slightly larger value.
    • -F    the file name format. The other possible value is illumina.

Several files are written to the output directory:

    • trimParameters.json
    • forwardExpectedError.png
    • reverseExpectedError.png
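A quick way to confirm that the test run finished is to check that these three files exist. This is a minimal sketch, assuming the output directory is ~/test_figaro as in the test script; the helper name missing_outputs is hypothetical and not part of FIGARO.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_OUTPUTS = (
    "trimParameters.json",
    "forwardExpectedError.png",
    "reverseExpectedError.png",
)


def missing_outputs(output_dir):
    """Return the expected FIGARO output files that are absent from output_dir."""
    output_dir = Path(output_dir)
    return [name for name in EXPECTED_OUTPUTS if not (output_dir / name).is_file()]


if __name__ == "__main__":
    # Assumed output directory from the test script.
    missing = missing_outputs(Path.home() / "test_figaro")
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All expected files found.")
```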

To get the recommended truncation parameters, view the beginning of trimParameters.json:

cd ~/test_figaro
less trimParameters.json

In this case, you should see the following:

[
    {
        "trimPosition": [
            143,
            150
        ],
        "maxExpectedError": [
            1,
            2
        ],
        "readRetentionPercent": 82.1,
        "score": 81.0979134529512
    },

The recommended forward truncation position is 143 and the recommended reverse truncation position is 150. After trimming and truncation, the maximum expected number of errors is 1 in the forward read and 2 in the reverse read. Using these truncation parameters with the QIIME2 DADA2 plug-in should result in merging 82.1% of the reads.
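Instead of reading the file by eye, the values can be pulled out with a few lines of Python. This is a minimal sketch, not part of FIGARO: it assumes trimParameters.json is a JSON list of entries shaped like the one shown above, and it picks the entry with the highest score rather than relying on file order. Mapping maxExpectedError onto the --p-max-ee-f and --p-max-ee-r options of qiime dada2 denoise-paired is my assumption, and the function names are illustrative.

```python
import json
from pathlib import Path


def best_trim_parameters(candidates):
    """Return the candidate entry with the highest FIGARO score."""
    return max(candidates, key=lambda entry: entry["score"])


def dada2_options(candidate):
    """Turn one FIGARO entry into options for qiime dada2 denoise-paired.

    Assumption: maxExpectedError corresponds to --p-max-ee-f / --p-max-ee-r.
    """
    trunc_f, trunc_r = candidate["trimPosition"]
    max_ee_f, max_ee_r = candidate["maxExpectedError"]
    return [
        "--p-trunc-len-f", str(trunc_f),
        "--p-trunc-len-r", str(trunc_r),
        "--p-max-ee-f", str(max_ee_f),
        "--p-max-ee-r", str(max_ee_r),
    ]


if __name__ == "__main__":
    # Assumed output location from the test script.
    json_path = Path.home() / "test_figaro" / "trimParameters.json"
    if json_path.exists():
        with json_path.open() as handle:
            best = best_trim_parameters(json.load(handle))
        print(" ".join(dada2_options(best)))
        print(f"Expected read retention: {best['readRetentionPercent']}%")
    else:
        print(f"{json_path} not found; run FIGARO first")
```

The printed options can then be appended to your own qiime dada2 denoise-paired command, alongside the input and output arguments that command requires.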