FIGARO

FIGARO is a program written by the folks at Zymo Research to take the guesswork out of choosing truncation parameters for the QIIME2 DADA2 plug-in. It chooses parameters that minimize expected errors in the forward and reverse reads, leave enough overlap to merge the reads, and at the same time maximize the number of reads passing this filtration step. A pre-print describing the program and its use is available here. The program is available on GitHub along with instructions for installing a Docker version and a Python command-line version. I used conda to create an environment for FIGARO, and I can share my FIGARO yml file with you to make it easier to install your own FIGARO environment. If you have not created environments before, see my tutorial Conda and Virtual Environments.

Install FIGARO

First, create a conda environment by doing the following:

Open a terminal and move into your home directory. If your installation of miniconda requires it, source conda.sh:

cd
source ~/miniconda3/etc/profile.d/conda.sh

Then create your own FIGARO environment using my FIGARO yml file:

wget https://john-quensen.com/wp-content/uploads/2023/12/figaro_conda.yml
conda env create -n figaro -f figaro_conda.yml

Next, download and install FIGARO by running the following commands from your home directory:

wget https://github.com/Zymo-Research/figaro/archive/master.zip
unzip master.zip
rm master.zip
cd figaro-master/figaro
chmod 755 *.py

Test FIGARO

Note: I run Ubuntu in WSL2 on Windows 11, and since updating to version 22H2, FIGARO fails with an error saying it cannot connect to an X display, something like:

qt.qpa.screen: QXcbConnection: Could not connect to display 172.22.48.1:0
Could not connect to any X display.

If you encounter this problem, it can be resolved by running the following command before running figaro.py:

export DISPLAY=:0.0

You can test your FIGARO installation by running the code below from your home directory. Copy and paste it into a file named test_figaro.sh and run it. (Alternatively you can download the file with wget https://github.com/jfq3/Miscellaneous-scripts/raw/master/test_figaro.sh).

#!/bin/bash

# Test FIGARO installation
# Activate the FIGARO environment
source ~/miniconda3/etc/profile.d/conda.sh # If necessary for your conda installation.
conda activate figaro

cd 
mkdir test_figaro
cd test_figaro

# Download example files from the QIIME2 tutorial pages
wget "https://data.qiime2.org/2020.2/tutorials/atacama-soils/1p/forward.fastq.gz"
wget "https://data.qiime2.org/2020.2/tutorials/atacama-soils/1p/reverse.fastq.gz"
 
# Decompress
gzip -d *.fastq.gz

# Rename the files in Zymo format
mv forward.fastq sam1_16s_R1.fastq
mv reverse.fastq sam1_16s_R2.fastq

# Run FIGARO
# cd to installation folder
cd ~/figaro-master/figaro
python figaro.py -i ~/test_figaro/ -o ~/test_figaro/ -f 10 -r 10 -a 253 -F zymo

conda deactivate

It is not necessary to demultiplex the files for this test. The parameters are:

    • -i    the input directory
    • -o   the output directory
    • -f    the length of the forward primer. Enter 1 if the primer has been removed.
    • -r    the length of the reverse primer. Enter 1 if the primer has been removed.
    • -a    the expected merged amplicon length. You could be conservative and give a slightly larger value.
    • -F    the file name format. The other possible value is illumina.
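
The same call can be assembled in Python, which is handy when looping over several data sets. This is only a sketch: figaro_command is a hypothetical helper of my own, not part of FIGARO, and it uses only the six options listed above. It substitutes 1 for a primer length of 0, following the advice for reads whose primers have been removed.

```python
import os
import shlex


def figaro_command(input_dir, output_dir, amplicon_length,
                   forward_primer_len, reverse_primer_len,
                   name_format="zymo"):
    """Build the argument list for figaro.py (hypothetical helper)."""
    if name_format not in ("zymo", "illumina"):
        raise ValueError("name_format must be 'zymo' or 'illumina'")
    # FIGARO needs non-zero primer lengths, so use 1 when primers are gone.
    forward = max(1, forward_primer_len)
    reverse = max(1, reverse_primer_len)
    return [
        "python", "figaro.py",
        "-i", os.path.expanduser(str(input_dir)),
        "-o", os.path.expanduser(str(output_dir)),
        "-f", str(forward),
        "-r", str(reverse),
        "-a", str(amplicon_length),
        "-F", name_format,
    ]


if __name__ == "__main__":
    # Same settings as the test script; the printed command is meant to be
    # run from the ~/figaro-master/figaro folder.
    print(shlex.join(figaro_command("~/test_figaro/", "~/test_figaro/",
                                    253, 10, 10)))
```

The list can also be passed to subprocess.run with cwd set to the FIGARO installation folder instead of being printed.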

Several files are written to the output directory:

    • trimParameters.json
    • forwardExpectedError.png
    • reverseExpectedError.png

To get the recommended truncation parameters, view the beginning of trimParameters.json:

cd ~/test_figaro
less trimParameters.json

In this case, you should see the following:

[
    {
        "trimPosition": [
            143,
            150
        ],
        "maxExpectedError": [
            1,
            2
        ],
        "readRetentionPercent": 82.1,
        "score": 81.0979134529512
    },

The recommended forward truncation position is 143 and the recommended reverse truncation position is 150. After trimming and truncation, the maximum expected number of errors is 1 in the forward read and 2 in the reverse read. Using these truncation parameters with the QIIME2 DADA2 plug-in should result in 82.1% of the reads passing this filtration step and moving on to DADA2’s denoising step.
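
If you would rather pull these numbers out with a script than read them off the screen, here is a minimal Python sketch. It assumes that trimParameters.json is a JSON list whose first entry is the recommended one, as the excerpt above suggests. The function names top_recommendation and dada2_options are my own and not part of FIGARO; --p-trunc-len-f and --p-trunc-len-r are the truncation options of qiime dada2 denoise-paired.

```python
import json
from pathlib import Path


def top_recommendation(json_text):
    """Return the first candidate from the text of trimParameters.json."""
    candidates = json.loads(json_text)
    first = candidates[0]  # assumption: the first entry is the recommendation
    forward_trunc, reverse_trunc = first["trimPosition"]
    forward_ee, reverse_ee = first["maxExpectedError"]
    return {
        "trunc_len_f": forward_trunc,
        "trunc_len_r": reverse_trunc,
        "max_ee_f": forward_ee,
        "max_ee_r": reverse_ee,
        "retention_percent": first["readRetentionPercent"],
    }


def dada2_options(rec):
    """Format the truncation lengths as qiime dada2 denoise-paired options."""
    return (f"--p-trunc-len-f {rec['trunc_len_f']} "
            f"--p-trunc-len-r {rec['trunc_len_r']}")


if __name__ == "__main__":
    # Location used by the test script; adjust for your own output directory.
    path = Path.home() / "test_figaro" / "trimParameters.json"
    if path.exists():
        rec = top_recommendation(path.read_text())
        print(rec)
        print(dada2_options(rec))
```

The printed options can be pasted into your own denoise-paired command along with its input and output arguments.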

If you encounter any problems installing and testing FIGARO, please email me for help (see the Contacts page).

Some Further Notes

FIGARO requires that all of the sequences it scans be the same length. This should be true if the sequences come directly from the sequencing facility. If you have altered them in some way, however, they may differ in length and FIGARO will fail. For example, if you have already trimmed primers from the sequences, they may differ in length, especially between forward and reverse reads. You can remedy that when trimming by specifying that the minimum and maximum lengths be the same, or after trimming by filtering all of the sequences to the same length, i.e. the minimum length found among all of the sequences.
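
Before running FIGARO on reads you have already processed, you can check whether they really are all the same length. The sketch below is my own illustration, not part of FIGARO. It assumes uncompressed FASTQ files with exactly four lines per record, and the function names are hypothetical. read_length_counts tallies the sequence lengths, and truncate_to_length cuts every sequence and quality string to a given length, which is one way to bring all reads down to the minimum length.

```python
from collections import Counter


def read_length_counts(fastq_lines):
    """Count sequence lengths in a four-line-per-record FASTQ."""
    counts = Counter()
    for index, line in enumerate(fastq_lines):
        if index % 4 == 1:  # second line of each record is the sequence
            counts[len(line.rstrip("\r\n"))] += 1
    return counts


def truncate_to_length(fastq_lines, length):
    """Yield FASTQ lines with sequence and quality cut to `length` bases."""
    for index, line in enumerate(fastq_lines):
        text = line.rstrip("\r\n")
        if index % 4 in (1, 3):  # sequence and quality lines
            text = text[:length]
        yield text + "\n"


if __name__ == "__main__":
    # Tiny in-memory example with reads of length 6 and 4.
    example = ["@r1\n", "ACGTAC\n", "+\n", "IIIIII\n",
               "@r2\n", "ACGT\n", "+\n", "IIII\n"]
    counts = read_length_counts(example)
    print(dict(counts))
    if len(counts) > 1:
        shortest = min(counts)
        print("".join(truncate_to_length(example, shortest)), end="")
```

For real data, open the FASTQ file and pass the file object in place of the example list, then write the truncated lines to a new file. Do this for the forward and reverse files with the same target length.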

FIGARO requires non-zero values be given for the lengths of the forward and reverse primers. If primers are not present, enter 1 for both lengths. You could then add 1 to the suggested truncation parameters, but doing so will not make much difference.

Also, FIGARO requires that the file names be in one of two formats, selected by setting the -F parameter to either illumina or zymo. Illumina sequencers output file names in the Illumina format. File names in the Zymo format have three parts separated by underscores, for example AAA_BBB_CCC.fastq. In some of the examples and exercises on this website I rename files to have this format. You may find occasion to do the same in your own work.
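
If you have many files to rename, a small Python helper can build and check Zymo-style names. This is a sketch under my own assumptions: the helper names are hypothetical, the three parts are labelled sample, label and read only by analogy with sam1_16s_R1.fastq from the test script, and only uncompressed .fastq names are handled.

```python
def zymo_name(sample, label, read):
    """Join three parts into a name such as sam1_16s_R1.fastq."""
    parts = (sample, label, read)
    for part in parts:
        # An underscore or dot inside a part would change the number of fields.
        if not part or "_" in part or "." in part:
            raise ValueError(f"unsuitable name part: {part!r}")
    return "_".join(parts) + ".fastq"


def is_zymo_style(filename):
    """Return True for names of the form AAA_BBB_CCC.fastq."""
    stem, dot, extension = filename.rpartition(".")
    fields = stem.split("_")
    return (dot == "." and extension == "fastq"
            and len(fields) == 3 and all(fields))


if __name__ == "__main__":
    for old, read in (("forward.fastq", "R1"), ("reverse.fastq", "R2")):
        new = zymo_name("sam1", "16s", read)
        # Print the planned rename; use os.rename(old, new) to apply it.
        print(old, "->", new, is_zymo_style(new))
```

Printing the planned names first makes it easy to check them before actually renaming anything.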