Microbiology, Metagenomics and Bioinformatics

Johan Bengtsson-Palme, University of Gothenburg


Mitochondrial DNA Part B today published a mitochondrial genome announcement paper (1) in which I was involved in doing the assemblies and annotation. The paper describes the mitogenome of Calanus glacialis, a marine planktonic copepod, which is a keystone species in the Arctic Ocean. The mitogenome is 20,674 bp long, and includes 13 protein-coding genes, 2 rRNA genes and 22 tRNA genes. While this is of course not a huge paper, we believe that this new resource will be of interest in understanding the structure and dynamics of C. glacialis populations. The main work in this paper has been carried out by Marvin Choquet at Nord University in Bodø, Norway. So hats off to him for great work, thanks Marvin! The paper can be read here.


  1. Choquet M, Alves Monteiro HJ, Bengtsson-Palme J, Hoarau G: The complete mitochondrial genome of the copepod Calanus glacialis. Mitochondrial DNA Part B, 2, 2, 506–507 (2017). doi: 10.1080/23802359.2017.1361357 [Paper link]

Today, I am very happy to announce that after years in the making and months in testing, the next generation of ITSx, version 1.1, is ready to step into the public light and scrutiny. I have today uploaded a public beta version of the ITSx 1.1 release, which I encourage everyone who has enjoyed using ITSx to try out.

The 1.1 release of ITSx brings a wide range of new features, including:

  • A 2-10x performance increase (depending on the dataset), since ITSx now utilizes hmmsearch instead of hmmscan to detect the ITS regions and distributes the CPU cores better
  • Improved ITS detection among fungi and chlorophyta, by addition of new HMM-profiles
  • The HMM profile format for ITSx has been updated to HMMER3/f (thus ITSx now requires HMMER version 3.1 or later)
  • Better handling of interrupted HMMER searches
  • Added the --require_anchor option to only include sequences where the complete anchor is found in the output
  • Added the possibility for partial sequence output for the SSU, LSU and 5.8S regions
  • Fixed a bug causing problems when reading sequence data from standard input

A lot of the code has changed in this version, which means that there might still be bugs lingering in the program. Since I will be on vacation throughout July, I encourage everyone to submit bug reports and questions, but I cannot promise to respond to them until August.

I hope that you will enjoy this new ITSx release, which you can download here. Happy barcoding!

Today, a review paper which I wrote together with Joakim Larsson and Erik Kristiansson was published in Journal of Antimicrobial Chemotherapy (1). We have for a long time used metagenomic DNA sequencing to study antibiotic resistance in different environments (2-6), including in the human microbiota (7). Generally, our ultimate purpose has been to assess the risks to human health associated with resistance genes in the environment. However, a multitude of methods exist for metagenomic data analysis, and over the years we have learned that not all methods are suitable for the investigation of resistance genes for this purpose. In our review paper, we describe and discuss current methods for sequence handling, mapping to databases of resistance genes, statistical analysis and metagenomic assembly. We also provide an overview of important considerations related to the analysis of resistance genes, and end by recommending some of the currently used tools, databases and methods that are best equipped to inform research and clinical practice related to antibiotic resistance (see the figure from the paper below). We hope that the paper will be useful to researchers and clinicians interested in using metagenomic sequencing to better understand the resistance genes present in environmental and human-associated microbial communities.


  1. Bengtsson-Palme J, Larsson DGJ, Kristiansson E: Using metagenomics to investigate human and environmental resistomes. Journal of Antimicrobial Chemotherapy, advance access (2017). doi: 10.1093/jac/dkx199 [Paper link]
  2. Bengtsson-Palme J, Boulund F, Fick J, Kristiansson E, Larsson DGJ: Shotgun metagenomics reveals a wide array of antibiotic resistance genes and mobile elements in a polluted lake in India. Frontiers in Microbiology, 5, 648 (2014). doi: 10.3389/fmicb.2014.00648 [Paper link]
  3. Lundström S, Östman M, Bengtsson-Palme J, Rutgersson C, Thoudal M, Sircar T, Blanck H, Eriksson KM, Tysklind M, Flach C-F, Larsson DGJ: Minimal selective concentrations of tetracycline in complex aquatic bacterial biofilms. Science of the Total Environment, 553, 587–595 (2016). doi: 10.1016/j.scitotenv.2016.02.103 [Paper link]
  4. Bengtsson-Palme J, Hammarén R, Pal C, Östman M, Björlenius B, Flach C-F, Kristiansson E, Fick J, Tysklind M, Larsson DGJ: Elucidating selection processes for antibiotic resistance in sewage treatment plants using metagenomics. Science of the Total Environment, 572, 697–712 (2016). doi: 10.1016/j.scitotenv.2016.06.228 [Paper link]
  5. Pal C, Bengtsson-Palme J, Kristiansson E, Larsson DGJ: The structure and diversity of human, animal and environmental resistomes. Microbiome, 4, 54 (2016). doi: 10.1186/s40168-016-0199-5 [Paper link]
  6. Flach C-F, Pal C, Svensson CJ, Kristiansson E, Östman M, Bengtsson-Palme J, Tysklind M, Larsson DGJ: Does antifouling paint select for antibiotic resistance? Science of the Total Environment, 590–591, 461–468 (2017). doi: 10.1016/j.scitotenv.2017.01.213 [Paper link]
  7. Bengtsson-Palme J, Angelin M, Huss M, Kjellqvist S, Kristiansson E, Palmgren H, Larsson DGJ, Johansson A: The human gut microbiome as a transporter of antibiotic resistance genes between continents. Antimicrobial Agents and Chemotherapy, 59, 10, 6551–6560 (2015). doi: 10.1128/AAC.00933-15 [Paper link]

Yesterday, Molecular Ecology Resources put online an unedited version of a recent paper which I co-authored. This time, Rodney Richardson at Ohio State University has done tremendous work evaluating three taxonomic classification tools – the RDP Naïve Bayesian Classifier, RTAX and UTAX – on a set of DNA barcoding regions commonly used for plants, namely the ITS2, and the matK, rbcL, trnL and trnH genes.

In the paper (1), we discuss the results, merits and limitations of the classifiers. In brief, we found that:

  • There is a considerable trade-off between accuracy and sensitivity for the classifiers tested, which indicates a need for improved sequence classification tools (2)
  • UTAX was superior with respect to error rate, but was exceedingly stringent and thus suffered from a low assignment rate
  • The RDP Naïve Bayesian Classifier displayed high sensitivity and low error at the family and order levels, but had a genus-level error rate of 9.6 percent
  • RTAX showed high sensitivity at all taxonomic ranks, but at the same time consistently produced high error rates
  • The choice of locus has significant effects on the classification sensitivity of all tested tools
  • All classifiers showed strong relationships between database completeness, classification sensitivity and classification accuracy
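
To make the trade-off in the list above concrete, here is a minimal Python sketch of how sensitivity and error rate could be tallied for one classifier at one taxonomic rank. The definitions used (sensitivity as the share of queries that receive any assignment, error rate as the share of assigned queries that are wrong), the function name and the toy data are assumptions for illustration, and not necessarily the exact metrics used in the paper.

```python
def evaluate(predictions, truth):
    """Return (sensitivity, error_rate) for one classifier at one rank.

    predictions: dict mapping sequence id -> predicted taxon (None if unassigned)
    truth: dict mapping sequence id -> known taxon
    """
    # Keep only the queries that the classifier dared to assign
    assigned = {k: v for k, v in predictions.items() if v is not None}
    wrong = sum(1 for k, v in assigned.items() if v != truth[k])
    sensitivity = len(assigned) / len(predictions) if predictions else 0.0
    error_rate = wrong / len(assigned) if assigned else 0.0
    return sensitivity, error_rate


# Hypothetical toy data: four query sequences with known genera
truth = {"s1": "Acer", "s2": "Quercus", "s3": "Salix", "s4": "Betula"}
strict = {"s1": "Acer", "s2": None, "s3": None, "s4": None}
lenient = {"s1": "Acer", "s2": "Quercus", "s3": "Populus", "s4": "Alnus"}

strict_result = evaluate(strict, truth)
lenient_result = evaluate(lenient, truth)
```

In this toy case the strict classifier makes no errors but assigns only a quarter of the queries, while the lenient one assigns everything and gets half of it wrong, which is the kind of trade-off described above.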

We believe that the methods of comparison we have used are simple and robust, and thereby provide a methodological and conceptual foundation for future software evaluations. On a personal note, I will thoroughly enjoy working with Rodney and Reed again; I had a great time discussing the ins and outs of taxonomic classification with them! The paper can be found here.

References and notes

  1. Richardson RT, Bengtsson-Palme J, Johnson RM: Evaluating and Optimizing the Performance of Software Commonly Used for the Taxonomic Classification of DNA Sequence Data. Molecular Ecology Resources, Early view (2016). doi: 10.1111/1755-0998.12628 [Paper link]
  2. This is something that several classifiers also showed in the evaluation we did for the Metaxa2 paper (3). Interestingly enough, Metaxa2 is better at maintaining high accuracy also as sensitivity is increased.
  3. Bengtsson-Palme J, Hartmann M, Eriksson KM, Pal C, Thorell K, Larsson DGJ, Nilsson RH: Metaxa2: Improved identification and taxonomic classification of small and large subunit rRNA in metagenomic data. Molecular Ecology Resources, 15, 6, 1403–1414 (2015). doi: 10.1111/1755-0998.12399 [Paper link]

Late yesterday, Microbiome put online our most recent work, covering the resistomes to antibiotics, biocides and metals across a vast range of environments. In the paper (1), we perform the largest characterization of resistance genes, mobile genetic elements (MGEs) and bacterial taxonomic compositions to date, covering 864 different metagenomes from humans (350), animals (145) and external environments such as soil, water, sewage, and air (369 in total). All the investigated metagenomes were sequenced to at least 10 million reads each, using Illumina technology, making the results more comparable across environments than in previous studies (2-4).

We found that the environment types had clear differences both in terms of resistance profiles and bacterial community composition. Humans and animals hosted microbial communities with limited taxonomic diversity as well as low abundance and diversity of biocide/metal resistance genes and MGEs. In contrast, the abundance of ARGs was relatively high in humans and animals. External environments, on the other hand, showed high taxonomic diversity and high diversity of biocide/metal resistance genes and MGEs. Water, sediment and soil generally carried low relative abundance and few varieties of known ARGs, whereas wastewater and sludge were on par with the human gut. The environments with the largest relative abundance and diversity of ARGs, including genes encoding resistance to last resort antibiotics, were those subjected to industrial antibiotic pollution and air samples from a Beijing smog event.

A paper investigating this vast amount of data is of course hard to describe in a blog post, so I strongly encourage the interested reader to head over to Microbiome’s page and read the full paper (1). However, here’s a very short summary of the findings:

  • The median relative abundance of ARGs across all environments was 0.035 copies per bacterial 16S rRNA
  • Antibiotic-polluted environments have (by far) the highest abundances of ARGs
  • Urban air samples carried high abundance and diversity of ARGs
  • Human microbiota has high abundance and diversity of known ARGs, but low taxonomic diversity compared to the external environment
  • The human and animal resistomes are dominated by tetracycline resistance genes
  • Over half of the ARGs were only detected in external environments, while 20.5 % were found in human, animal and at least one of the external environments
  • Biocide and metal resistance genes are more common in external environments than in the human microbiota
  • Human microbiota carries low abundance and richness of MGEs compared to most external environments
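
The unit "copies per bacterial 16S rRNA" used in the list above can be illustrated with a small Python sketch. It assumes one common normalisation scheme: read counts are divided by gene length, and the result is divided by the equally length-normalised 16S rRNA read count. The function name, the default 16S length of 1,500 bp and the example numbers are assumptions for illustration; the exact procedure used in the paper is described there.

```python
def copies_per_16s(gene_reads, gene_length_bp, rrna_reads, rrna_length_bp=1500):
    """Length-normalised gene abundance relative to the 16S rRNA gene.

    rrna_length_bp defaults to an assumed approximate 16S gene length.
    """
    if rrna_reads <= 0:
        raise ValueError("need at least one 16S rRNA read to normalise against")
    gene_coverage = gene_reads / gene_length_bp   # reads per bp of the gene
    rrna_coverage = rrna_reads / rrna_length_bp   # reads per bp of 16S
    return gene_coverage / rrna_coverage


# Hypothetical numbers: 40 reads hitting a 1,200 bp resistance gene,
# 1,000 reads hitting bacterial 16S rRNA in the same metagenome
example = copies_per_16s(gene_reads=40, gene_length_bp=1200, rrna_reads=1000)
```

Normalising against 16S in this way makes abundances comparable between samples with different sequencing depths and different proportions of bacterial DNA.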

Importantly, less than 1.5 % of all detected ARGs were found exclusively in the human microbiome. At the same time, 57.5 % of the known ARGs were only detected in metagenomes from environmental samples, even though the majority of the investigated ARGs were initially encountered in pathogens. Still, our analysis suggests that most of these genes are relatively rare in the human microbiota. Environmental samples generally contained a wider distribution of resistance genes to a more diverse set of antibiotic classes. For example, the relative abundance of beta-lactam resistance genes was much larger in external environments than in human and animal microbiomes. This suggests that the external environment harbours many more varieties of resistance genes than the ones currently known from the clinic. Indeed, functional metagenomics has resulted in the discovery of many novel ARGs in external environments (e.g. 5). This all fits well with an overall much higher taxonomic diversity of environmental microbial communities. In terms of consequences associated with the potential transfer of ARGs to human pathogens, we argue that unknown resistance genes are of greater concern than those already known to circulate among human-associated bacteria (6).

This study describes the potential for many external environments, including those subjected to pharmaceutical pollution, air and wastewater/sludge, to serve as hotspots for resistance development and/or transmission of ARGs. In addition, our results indicate that these environments may play important roles in the mobilization of yet unknown ARGs and their further transmission to human pathogens. To provide guidance for risk-reducing actions we – based on this study – suggest strict regulatory measures of waste discharges from pharmaceutical industries and encourage more attention to air in the transmission of antibiotic resistance (1).


  1. Pal C, Bengtsson-Palme J, Kristiansson E, Larsson DGJ: The structure and diversity of human, animal and environmental resistomes. Microbiome, 4, 54 (2016). doi: 10.1186/s40168-016-0199-5
  2. Durso LM, Miller DN, Wienhold BJ. Distribution and quantification of antibiotic resistant genes and bacteria across agricultural and non-agricultural metagenomes. PLoS One. 2012;7:e48325.
  3. Nesme J, Delmont TO, Monier J, Vogel TM. Large-scale metagenomic-based study of antibiotic resistance in the environment. Curr Biol. 2014;24:1096–100.
  4. Fitzpatrick D, Walsh F. Antibiotic resistance genes across a wide variety of metagenomes. FEMS Microbiol Ecol. 2016. doi:10.1093/femsec/fiv168.
  5. Allen HK, Moe LA, Rodbumrer J, Gaarder A, Handelsman J. Functional metagenomics reveals diverse β-lactamases in a remote Alaskan soil. ISME J. 2009;3:243–51.
  6. Bengtsson-Palme J, Larsson DGJ: Antibiotic resistance genes in the environment: prioritizing risks. Nature Reviews Microbiology, 13, 369 (2015). doi: 10.1038/nrmicro3399-c1

MycoKeys today put a paper online which I was involved in. The paper describes the results of a workshop in May, when we added and refined annotations for fungal ITS sequences according to the MIxS-Built Environment annotation standard (1). Fungi have been associated with a range of unwanted effects in the built environment, including asthma, decay of building materials, and food spoilage. However, the state of the metadata annotation of fungal DNA sequences from the built environment is very much incomplete in public databases. The workshop aimed to ease a little part of this problem, by distributing the re-annotation of public fungal ITS sequences across 36 persons. In total, we added or changed 45,488 data points drawing from published literature, including addition of 8,430 instances of countries of collection, 5,801 instances of building types, and 3,876 instances of surface-air contaminants. The results have been implemented in the UNITE database and shared with other online resources. I believe that distributed initiatives like this (and the ones I have previously been involved in (2,3)) serve a very important purpose for establishing better annotation of sequence data, an issue I have brought up also for sequences outside of barcoding genes (4). The full paper can be found here.


  1. Abarenkov K, Adams RI, Laszlo I, Agan A, Ambrioso E, Antonelli A, Bahram M, Bengtsson-Palme J, Bok G, Cangren P, Coimbra V, Coleine C, Gustafsson C, He J, Hofmann T, Kristiansson E, Larsson E, Larsson T, Liu Y, Martinsson S, Meyer W, Panova M, Pombubpa N, Ritter C, Ryberg M, Svantesson S, Scharn R, Svensson O, Töpel M, Unterseher M, Visagie C, Wurzbacher C, Taylor AFS, Kõljalg U, Schriml L, Nilsson RH: Annotating public fungal ITS sequences from the built environment according to the MIxS-Built Environment standard – a report from a May 23-24, 2016 workshop (Gothenburg, Sweden). MycoKeys, 16, 1–15 (2016). doi: 10.3897/mycokeys.16.10000
  2. Kõljalg U, Nilsson RH, Abarenkov K, Tedersoo L, Taylor AFS, Bahram M, Bates ST, Bruns TT, Bengtsson-Palme J, Callaghan TM, Douglas B, Drenkhan T, Eberhardt U, Dueñas M, Grebenc T, Griffith GW, Hartmann M, Kirk PM, Kohout P, Larsson E, Lindahl BD, Lücking R, Martín MP, Matheny PB, Nguyen NH, Niskanen T, Oja J, Peay KG, Peintner U, Peterson M, Põldmaa K, Saag L, Saar I, Schüßler A, Senés C, Smith ME, Suija A, Taylor DE, Telleria MT, Weiß M, Larsson KH: Towards a unified paradigm for sequence-based identification of Fungi. Molecular Ecology, 22, 21, 5271–5277 (2013). doi: 10.1111/mec.12481
  3. Nilsson RH, Hyde KD, Pawlowska J, Ryberg M, Tedersoo L, Aas AB, Alias SA, Alves A, Anderson CL, Antonelli A, Arnold AE, Bahnmann B, Bahram M, Bengtsson-Palme J, Berlin A, Branco S, Chomnunti P, Dissanayake A, Drenkhan R, Friberg H, Frøslev TG, Halwachs B, Hartmann M, Henricot B, Jayawardena R, Jumpponen A, Kauserud H, Koskela S, Kulik T, Liimatainen K, Lindahl B, Lindner D, Liu J-K, Maharachchikumbura S, Manamgoda D, Martinsson S, Neves MA, Niskanen T, Nylinder S, Pereira OL, Pinho DB, Porter TM, Queloz V, Riit T, Sanchez-García M, de Sousa F, Stefaczyk E, Tadych M, Takamatsu S, Tian Q, Udayanga D, Unterseher M, Wang Z, Wikee S, Yan J, Larsson E, Larsson K-H, Kõljalg U, Abarenkov K: Improving ITS sequence data for identification of plant pathogenic fungi. Fungal Diversity, 67, 1, 11–19 (2014). doi: 10.1007/s13225-014-0291-8
  4. Bengtsson-Palme J, Boulund F, Edström R, Feizi A, Johnning A, Jonsson VA, Karlsson FH, Pal C, Pereira MB, Rehammar A, Sánchez J, Sanli K, Thorell K: Strategies to improve usability and preserve accuracy in biological sequence databases. Proteomics, Early view (2016). doi: 10.1002/pmic.201600034

After a long wait (1), Science of the Total Environment has finally decided to make our paper on selection of antibiotic resistance genes in sewage treatment plants (STPs) available (2). STPs are often suggested to be “hotspots” for emergence and dissemination of antibiotic-resistant bacteria (3-6). However, we actually do not know if the selection pressures within STPs, which can be caused either by residual antibiotics or other co-selective agents, are sufficiently large to specifically promote resistance. To better understand this, we used shotgun metagenomic sequencing of samples from different steps of the treatment process (incoming water, treated water, primary sludge, recirculated sludge and digested sludge) in three Swedish STPs in the Stockholm area to characterize the frequencies of resistance genes to antibiotics, biocides and metals, as well as mobile genetic elements and taxonomic composition. In parallel, we also measured concentrations of antibiotics, biocides and metals.

We found that only the concentrations of tetracycline and ciprofloxacin in the influent water were above those that we predict to cause resistance selection (7). However, there was no consistent enrichment of resistance genes to any particular class of antibiotics in the STPs, nor of biocide and metal resistance genes. Instead, the most substantial change of the bacterial communities compared to human feces (sampled from Swedes in another study of ours (8)) occurred already in the sewage pipes, and was manifested by a strong shift from obligate to facultative anaerobes. Through the treatment process, resistance genes against antibiotics, biocides and metals were not reduced to the same extent as fecal bacteria were.

Worryingly, the OXA-48 beta-lactamase gene was consistently enriched in surplus and digested sludge. OXA-48 is still rare in Swedish clinical isolates (9), but provides resistance to carbapenems, one of our most critically important classes of antibiotics. However, taken together, metagenomic sequencing did not provide clear support for any specific selection of antibiotic resistance. Rather, since stronger selective forces affect gross taxonomic composition, and thereby also resistance gene abundances, it is very hard to interpret the metagenomic data from a risk-for-selection perspective. We therefore think that comprehensive analyses of resistant vs. non-resistant strains within relevant species are warranted.

Taken together, the main take-home messages of the paper (2) are:

  • There was no apparent evidence for direct selection of resistance genes by antibiotics or co-selection by biocides or metals
  • Abiotic factors (mostly oxygen availability) strongly shape taxonomy and seem to be driving changes of resistance genes
  • Metagenomic and/or PCR-based community studies may not be sufficiently sensitive to detect selection effects, as important shifts towards resistance may occur within species and not on the community level
  • The concentrations of antibiotics, biocides and metals were overall reduced, but not removed in STPs. Incoming concentrations of antibiotics in Swedish STPs are generally low
  • Resistance genes are overall reduced through the treatment process, but far from eliminated

References and notes

  1. Okay, those who take notes know that I have already complained once before about Science of the Total Environment’s ridiculously long production handling times. But, seriously, how can a journal’s production team return the proofs within three days of acceptance, and then wait seven weeks before putting the final proofs online? I still wonder what is going on behind the scenes, which is totally obscure because the production office also refuses to respond to e-mails. Not a nice publishing experience this time either.
  2. Bengtsson-Palme J, Hammarén R, Pal C, Östman M, Björlenius B, Flach C-F, Kristiansson E, Fick J, Tysklind M, Larsson DGJ: Elucidating selection processes for antibiotic resistance in sewage treatment plants using metagenomics. Science of the Total Environment, in press (2016). doi: 10.1016/j.scitotenv.2016.06.228 [Paper link]
  3. Rizzo L, Manaia C, Merlin C, Schwartz T, Dagot C, Ploy MC, Michael I, Fatta-Kassinos D: Urban wastewater treatment plants as hotspots for antibiotic resistant bacteria and genes spread into the environment: a review. Science of the Total Environment, 447, 345–360 (2013). doi: 10.1016/j.scitotenv.2013.01.032
  4. Laht M, Karkman A, Voolaid V, Ritz C, Tenson T, Virta M, Kisand V: Abundances of Tetracycline, Sulphonamide and Beta-Lactam Antibiotic Resistance Genes in Conventional Wastewater Treatment Plants (WWTPs) with Different Waste Load. PLoS ONE, 9, e103705 (2014). doi: 10.1371/journal.pone.0103705
  5. Yang Y, Li B, Zou S, Fang HHP, Zhang T: Fate of antibiotic resistance genes in sewage treatment plant revealed by metagenomic approach. Water Research, 62, 97–106 (2014). doi: 10.1016/j.watres.2014.05.019
  6. Berendonk TU, Manaia CM, Merlin C, Fatta-Kassinos D, Cytryn E, Walsh F, et al.: Tackling antibiotic resistance: the environmental framework. Nature Reviews Microbiology, 13, 310–317 (2015). doi: 10.1038/nrmicro3439
  7. Bengtsson-Palme J, Larsson DGJ: Concentrations of antibiotics predicted to select for resistant bacteria: Proposed limits for environmental regulation. Environment International, 86, 140–149 (2016). doi: 10.1016/j.envint.2015.10.015
  8. Bengtsson-Palme J, Angelin M, Huss M, Kjellqvist S, Kristiansson E, Palmgren H, Larsson DGJ, Johansson A: The human gut microbiome as a transporter of antibiotic resistance genes between continents. Antimicrobial Agents and Chemotherapy, 59, 10, 6551–6560 (2015). doi: 10.1128/AAC.00933-15
  9. Hellman J, Aspevall O, Bengtsson B, Pringle M: SWEDRES-SVARM 2014. Consumption of antimicrobials and occurrence of antimicrobial resistance in Sweden. Public Health Agency of Sweden and National Veterinary Institute, Solna/Uppsala, Sweden. Report No.: 14027. Available from: http://www.folkhalsomyndigheten.se/publicerat-material/ (2014)

Today marks the five year anniversary of the Metaxa software’s initial release. Much has happened to the software since; Metaxa started off as an rRNA extraction utility for metagenomic data (1), including coarse classification to organism/organelle type. Since then, it has gained full-scale taxonomic classification ability better than or on par with other software packages (2), much greater speed, support for the LSU gene and a range of related software tools (3), and has spurred development of other tools such as ITSx (4). I have also been involved in no less than four peer-reviewed publications directly related to the software (1-3,5).

But it does not end here; these five years were just the beginning. We are – in different constellations – working on further enhancements to Metaxa2, including support for more genes, an updated classification database, and better customization options. I am very much still devoted to keeping Metaxa2 alive and relevant as a tool for taxonomic analysis of metagenomes, applicable whenever accuracy is a key parameter. Thanks for being part of the community for these five years!


  1. Bengtsson J, Eriksson KM, Hartmann M, Wang Z, Shenoy BD, Grelet G, Abarenkov K, Petri A, Alm Rosenblad M, Nilsson RH: Metaxa: A software tool for automated detection and discrimination among ribosomal small subunit (12S/16S/18S) sequences of archaea, bacteria, eukaryotes, mitochondria, and chloroplasts in metagenomes and environmental sequencing datasets. Antonie van Leeuwenhoek, 100, 3, 471–475 (2011). doi:10.1007/s10482-011-9598-6. [Paper link]
  2. Bengtsson-Palme J, Hartmann M, Eriksson KM, Pal C, Thorell K, Larsson DGJ, Nilsson RH: Metaxa2: Improved identification and taxonomic classification of small and large subunit rRNA in metagenomic data. Molecular Ecology Resources, 15, 6, 1403–1414 (2015). doi: 10.1111/1755-0998.12399 [Paper link]
  3. Bengtsson-Palme J, Thorell K, Wurzbacher C, Sjöling Å, Nilsson RH: Metaxa2 Diversity Tools: Easing microbial community analysis with Metaxa2. Ecological Informatics, 33, 45–50 (2016). doi: 10.1016/j.ecoinf.2016.04.004 [Paper link]
  4. Bengtsson-Palme J, Ryberg M, Hartmann M, Branco S, Wang Z, Godhe A, De Wit P, Sánchez-García M, Ebersberger I, de Sousa F, Amend AS, Jumpponen A, Unterseher M, Kristiansson E, Abarenkov K, Bertrand YJK, Sanli K, Eriksson KM, Vik U, Veldre V, Nilsson RH: Improved software detection and extraction of ITS1 and ITS2 from ribosomal ITS sequences of fungi and other eukaryotes for use in environmental sequencing. Methods in Ecology and Evolution, 4, 10, 914–919 (2013). doi: 10.1111/2041-210X.12073 [Paper link]
  5. Bengtsson-Palme J, Hartmann M, Eriksson KM, Nilsson RH: Metaxa, overview. In:Nelson K. (Ed.) Encyclopedia of Metagenomics: SpringerReference (www.springerreference.com). Springer-Verlag Berlin Heidelberg (2013). doi: 10.1007/978-1-4614-6418-1_239-6 [Link]

Yesterday, Ecological Informatics put our paper describing Metaxa2 Diversity Tools online (1). Metaxa2 Diversity Tools was introduced with Metaxa2 version 2.1 and consists of

  • metaxa2_dc – a tool for collecting several .taxonomy.txt output files into one large abundance matrix, suitable for analysis in, e.g., R
  • metaxa2_rf – generates resampling rarefaction curves (2) based on the .taxonomy.txt output
  • metaxa2_si – species inference based on guessing species data from the other species present in the .taxonomy.txt output file
  • metaxa2_uc – a tool for determining if the community composition of a sample is significantly different from others through resampling analysis
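
As a rough illustration of what the metaxa2_dc item in the list above produces, the Python sketch below merges per-sample taxon counts into one abundance matrix with taxa as rows and samples as columns. It works on in-memory dictionaries rather than on real Metaxa2 output files, and the function name and example data are hypothetical; it is not the actual metaxa2_dc code.

```python
def build_abundance_matrix(samples):
    """Merge per-sample taxon counts into a single matrix.

    samples: dict mapping sample name -> {taxon: count}
    Returns (taxa, sample_names, rows), with one row of counts per taxon.
    """
    sample_names = sorted(samples)
    # Union of all taxa seen in any sample
    taxa = sorted({taxon for counts in samples.values() for taxon in counts})
    # Taxa missing from a sample get a count of zero
    rows = [[samples[name].get(taxon, 0) for name in sample_names] for taxon in taxa]
    return taxa, sample_names, rows


# Hypothetical counts for two samples
samples = {
    "lake": {"Bacteria;Proteobacteria": 12, "Bacteria;Firmicutes": 3},
    "soil": {"Bacteria;Proteobacteria": 5, "Archaea;Euryarchaeota": 2},
}
taxa, sample_names, rows = build_abundance_matrix(samples)
```

A matrix in this shape can be written out as tab-separated text and read directly into R or similar tools.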

At the same time as I did this update to the web site, I also took the opportunity to update the Metaxa2 FAQ to better reflect recent updates to the Metaxa2 software.

Metaxa2 Diversity Tools
One often requested feature of Metaxa2 (3) has been the ability to make simple analyses from the data after classification. The Metaxa2 Diversity Tools included in Metaxa2 2.1 are a seed for such an effort (although not close to a full-fledged community analysis package comparable to QIIME (4) or Mothur (5)). They currently consist of four tools.

The Metaxa2 Data Collector (metaxa2_dc) is the simplest of them (but probably the most requested), designed to merge the output of several *.level_X.txt files from the Metaxa2 Taxonomic Traversal Tool into one large abundance matrix, suitable for further analysis in, for example, R. The Metaxa2 Species Inference tool (metaxa2_si) can be used to further infer taxon information on, for example, the species level at a lower reliability than what would be permitted by the Metaxa2 classifier, using a complementary algorithm. The idea is that if only a single species is present in, e.g., a family and a read is assigned to this family, but not classified to the species level, that sequence will be inferred to belong to the same species as the other reads, given that it has more than 97% sequence identity to its best reference match. This can be useful if the user really needs species or genus classifications but many organisms in the studied species group have similar rRNA sequences, making it hard for the Metaxa2 classifier to classify sequences to the species level.
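
The species inference rule described above can be sketched in a few lines of Python. This is only an illustration of the stated idea (a single species observed in the family, plus more than 97% identity to the best reference match), assuming a simplified input where each read is a small dictionary; the field names and the function are hypothetical and this is not the actual metaxa2_si implementation.

```python
def infer_species(reads, min_identity=97.0):
    """Infer species for reads only classified to family level.

    reads: list of dicts with keys 'family', 'species' (None if the read was
    not classified to species level) and 'identity' (percent identity to the
    best reference match). Returns one species label (or None) per read.
    """
    # Collect the species actually observed within each family
    species_by_family = {}
    for read in reads:
        if read["species"] is not None:
            species_by_family.setdefault(read["family"], set()).add(read["species"])

    result = []
    for read in reads:
        species = read["species"]
        candidates = species_by_family.get(read["family"], set())
        # Infer only when exactly one species is present in the family
        # and the read is similar enough to its best reference match
        if species is None and len(candidates) == 1 and read["identity"] > min_identity:
            species = next(iter(candidates))
        result.append(species)
    return result


# Hypothetical reads
reads = [
    {"family": "Enterobacteriaceae", "species": "Escherichia coli", "identity": 99.5},
    {"family": "Enterobacteriaceae", "species": None, "identity": 98.2},
    {"family": "Enterobacteriaceae", "species": None, "identity": 95.0},
    {"family": "Bacillaceae", "species": "Bacillus subtilis", "identity": 99.0},
    {"family": "Bacillaceae", "species": "Bacillus cereus", "identity": 99.1},
    {"family": "Bacillaceae", "species": None, "identity": 99.0},
]
inferred = infer_species(reads)
```

Here the second read is inferred to be Escherichia coli, the third stays unresolved because its identity is too low, and the last stays unresolved because two species are present in its family.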

The Metaxa2 Rarefaction analysis tool (metaxa2_rf) performs a resampling rarefaction analysis (2) based on the output from the Metaxa2 classifier, taking into account also the unclassified portion of rRNAs. The Metaxa2 Uniqueness of Community analyzer (metaxa2_uc), finally, allows analysis of whether the community composition of two or more samples or groups is significantly different. Using resampling of the community data, the null hypothesis that the taxonomic content of two communities is drawn from the same set of taxa (given certain abundances) is tested. All these tools are further described in the manual and the recent paper (1).
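
For readers unfamiliar with resampling rarefaction, here is a minimal Python sketch of the general technique: reads are repeatedly subsampled without replacement to a given depth, and the number of distinct taxa seen is averaged over the replicates. The function name, parameters and counts are hypothetical, and the sketch does not model the special handling of unclassified rRNAs that metaxa2_rf performs.

```python
import random


def rarefaction_curve(counts, depths, replicates=100, seed=0):
    """Mean observed richness when subsampling reads to each depth.

    counts: dict mapping taxon -> number of reads
    depths: iterable of subsample sizes (each at most the total read count)
    """
    rng = random.Random(seed)
    # Expand the counts into one entry per read
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    curve = []
    for depth in depths:
        if depth > len(pool):
            raise ValueError("depth exceeds total number of reads")
        total = 0
        for _ in range(replicates):
            # Sample without replacement and count distinct taxa
            total += len(set(rng.sample(pool, depth)))
        curve.append(total / replicates)
    return curve


# Hypothetical community of 100 reads from five taxa
counts = {"A": 50, "B": 30, "C": 15, "D": 4, "E": 1}
curve = rarefaction_curve(counts, depths=[1, 10, 50, 100])
```

A curve that is still rising steeply at the full sequencing depth suggests that more taxa would be found with deeper sequencing.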

The latest version of Metaxa2, including the Metaxa2 Diversity Tools, can be downloaded here.


  1. Bengtsson-Palme J, Thorell K, Wurzbacher C, Sjöling Å, Nilsson RH: Metaxa2 Diversity Tools: Easing microbial community analysis with Metaxa2. Ecological Informatics, 33, 45–50 (2016). doi: 10.1016/j.ecoinf.2016.04.004 [Paper link]
  2. Gotelli NJ, Colwell RK: Quantifying biodiversity: procedures and pitfalls in the measurement and comparison of species richness. Ecology Letters, 4, 379–391 (2001). doi:10.1046/j.1461-0248.2001.00230.x
  3. Bengtsson-Palme J, Hartmann M, Eriksson KM, Pal C, Thorell K, Larsson DGJ, Nilsson RH: Metaxa2: Improved Identification and Taxonomic Classification of Small and Large Subunit rRNA in Metagenomic Data. Molecular Ecology Resources (2015). doi: 10.1111/1755-0998.12399 [Paper link]
  4. Caporaso JG, Kuczynski J, Stombaugh J et al.: QIIME allows analysis of high-throughput community sequencing data. Nature Methods, 7, 335–336 (2010).
  5. Schloss PD, Westcott SL, Ryabin T et al.: Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and Environmental Microbiology, 75, 7537–7541 (2009).

Metaxa2 has been updated again today to version 2.1.3. This update adds a few features to the Metaxa2 Diversity Tools (metaxa2_uc and metaxa2_rf). The core Metaxa2 programs remain the same as for the previous Metaxa2 versions. The new features were suggested as part of the review process of a Metaxa2-related manuscript, and we thank the anonymous reviewers for their great suggestions!

New features and bug fixes in this update:

  • Added the Chao1, iChao1 and ACE estimators in addition to the original species abundance (“Bengtsson-Palme”) model in metaxa2_rf
  • Added the Raup-Crick dissimilarity method to the metaxa2_uc tool
  • Added a warning message when data is highly skewed for metaxa2_uc
  • Improved robustness of the ‘model’ mode of metaxa2_uc for highly skewed sample groups
  • Fixed a bug causing miscalculation of Euclidean distances on binary data in metaxa2_uc
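
As background for the Chao1 estimator mentioned in the list above, the Python sketch below computes the textbook Chao1 richness estimate from per-taxon counts, using the number of singletons (F1) and doubletons (F2). It shows the standard formulas only; how metaxa2_rf implements the estimator internally is not described here, and the function name and example counts are hypothetical.

```python
def chao1(counts, bias_corrected=True):
    """Chao1 richness estimate for one sample.

    counts: iterable of per-taxon read counts
    """
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)                    # observed richness
    f1 = sum(1 for c in observed if c == 1)  # singletons
    f2 = sum(1 for c in observed if c == 2)  # doubletons
    if bias_corrected:
        # Bias-corrected form, defined also when there are no doubletons
        return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
    if f2 == 0:
        raise ValueError("classic Chao1 is undefined without doubletons")
    return s_obs + f1 ** 2 / (2 * f2)


# Hypothetical counts: 7 observed taxa, 3 singletons, 2 doubletons
estimate = chao1([10, 5, 2, 2, 1, 1, 1, 0])
classic = chao1([10, 5, 2, 2, 1, 1, 1, 0], bias_corrected=False)
```

Many rare taxa (singletons) relative to doubletons push the estimate well above the observed richness, reflecting taxa that were probably missed by the sampling.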

The updated version of Metaxa2 can be downloaded here.

Happy barcoding!