Last week, I uploaded a new database to the Metaxa2 Database Repository, called DAIRYdb. DAIRYdb (1) is a manually curated reference database for 16S rRNA amplicon sequences from dairy products. Significant effort has gone into improving annotation algorithms, such as Metaxa2 (2), while less attention has been paid to the curation of reliable and consistent databases (3). Previous studies have shown that databases restricted to the studied environment improve unambiguous taxonomy annotation to the species level, thanks to consistent taxonomy, lack of blanks and reduced competition between different reference taxonomies (4-5). Using DAIRYdb in combination with different classification tools yields a taxonomy annotation accuracy of over 90% at the species level for microbiome samples from dairy products. Species-level identification is essential for these samples, because most of the dominant lactic acid bacteria belong to a few closely related genera.
The database can be added to your Metaxa2 (version 2.2 or later) installation by using the following command:
metaxa2_install_database -g SSU_DAIRYdb_v1.1.2
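If you script your setup, the same command can be wrapped so that it fails with a readable message when Metaxa2 is not installed. The following is only a minimal sketch: the helper name `install_dairydb` and the variable `DB_NAME` are made up for illustration and are not part of Metaxa2, and the only Metaxa2 call used is the install command given above.

```shell
#!/bin/sh
# Database name exactly as given in the post.
DB_NAME="SSU_DAIRYdb_v1.1.2"

# install_dairydb is a hypothetical helper for illustration, not part of Metaxa2.
install_dairydb() {
    # Stop early if the Metaxa2 installer is not available on PATH.
    if ! command -v metaxa2_install_database >/dev/null 2>&1; then
        echo "metaxa2_install_database not found; install Metaxa2 2.2 or later first" >&2
        return 1
    fi
    # Same command as shown in the post.
    metaxa2_install_database -g "$DB_NAME"
}
```

Calling `install_dairydb` afterwards runs the installation, or returns a non-zero status if the installer cannot be found.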
Further adaptations of DAIRYdb can be found on GitHub, and the preprint has been deposited on bioRxiv (1). DAIRYdb was developed by Marco Meola, Etienne Rifa and their collaborators, who also provided most of the text for this post. Thanks, Marco, for this excellent addition to the database collection!
1. Meola M, Rifa E, Shani N, Delbes C, Berthoud H, Chassard C: DAIRYdb: A manually curated gold standard reference database for improved taxonomy annotation of 16S rRNA gene sequences from dairy products. bioRxiv, 386151 (2018). doi: 10.1101/386151
2. Bengtsson-Palme J, Hartmann M, Eriksson KM, Pal C, Thorell K, Larsson DGJ, Nilsson RH: Metaxa2: Improved identification and taxonomic classification of small and large subunit rRNA in metagenomic data. Molecular Ecology Resources, 15, 6, 1403–1414 (2015). doi: 10.1111/1755-0998.12399
3. Edgar RC: Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences. PeerJ, 6, e4652 (2018). doi: 10.7717/peerj.4652
4. Ritari J, Salojärvi J, Last L, de Vos WM: Improved taxonomic assignment of human intestinal 16S rRNA sequences by a dedicated reference database. BMC Genomics, 16, 1, 1056 (2015). doi: 10.1186/s12864-015-2265-y
5. Newton ILG, Roeselers G: The effect of training set on the classification of honey bee gut microbiota using the naïve bayesian classifier. BMC Microbiology, 12, 1, 221 (2012). doi: 10.1186/1471-2180-12-221