After almost a year in different stages of review and revision, in which the paper (but not the software) saw a total transformation, I am happy to announce that the paper describing Metaxa2 has been accepted in Molecular Ecology Resources and is available in a rudimentary online early form. The figures in this version are not that pretty, but those who wants to read the paper asap, you have the possibility to do so.
This means that if you have been using Metaxa2 for a publication, there is now a new preferred way of citing this, namely:
Bengtsson-Palme J, Hartmann M, Eriksson KM, Pal C, Thorell K, Larsson DGJ, Nilsson RH: Metaxa2: Improved Identification and Taxonomic Classification of Small and Large Subunit rRNA in Metagenomic Data. Molecular Ecology Resources (2015). doi: 10.1111/1755-0998.12399
The paper (1), apart from describing the new Metaxa version, also brings a very thorough evaluation of the software, compared to other tools for taxonomic classification implemented in QIIME (2). In short, we show that:
Metaxa2 can make trustworthy taxonomic classifications even with reads as short as 100 bp
Generally, the performance is reliable across the entire SSU rRNA gene, regardless of which V-region a read is derived from
Metaxa2 can reliably recapture species composition from short-read metagenomic data, comparable with results of amplicon sequencing
Metaxa2 outperforms other popular tools such as Mothur (3), the RDP Classifier (4), Rtax (5) and the QIIME implementation of Uclust (6) in terms of proportion of correctly classified reads from metagenomic data
The false positive rate of Metaxa2 is very close to zero; far superior to many of the above mentioned tools, many of which assume that reads must derive from the rRNA gene
Metaxa2 can be downloaded here. We have already used it for around two years internally, and it forms the base of the taxonomic classifications in e.g. our recently published paper on antibiotic resistance in a polluted Indian lake (7).
Bengtsson-Palme J, Hartmann M, Eriksson KM, Pal C, Thorell K, Larsson DGJ, Nilsson RH: Metaxa2: Improved Identification and Taxonomic Classification of Small and Large Subunit rRNA in Metagenomic Data. Molecular Ecology Resources (2015). doi: 10.1111/1755-0998.12399[Paper link]
Caporaso JG, Kuczynski J, Stombaugh J et al.: QIIME allows analysis of high-throughput community sequencing data. Nature Methods, 7, 335–336 (2010).
Schloss PD, Westcott SL, Ryabin T et al.: Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and Environmental Microbiology, 75, 7537–7541 (2009).
Wang Q, Garrity GM, Tiedje JM, Cole JR: Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied and Environmental Microbiology, 73, 5261–5267 (2007).
Soergel DAW, Dey N, Knight R, Brenner SE: Selection of primers for optimal taxonomic classification of environmental 16S rRNA gene sequences. The ISME Journal, 6, 1440–1444 (2012).
Edgar RC: Search and clustering orders of magnitude faster than BLAST. Bioinformatics, 26, 2460–2461 (2010).
Bengtsson-Palme J, Boulund F, Fick J, Kristiansson E, Larsson DGJ: Shotgun metagenomics reveals a wide array of antibiotic resistance genes and mobile elements in a polluted lake in India. Frontiers in Microbiology, 5, 648 (2014).