Microbiology, Metagenomics and Bioinformatics

Johan Bengtsson-Palme, University of Gothenburg | Wisconsin Institute for Discovery

I am part of the organizing committee for the newly established annual meeting of GOTBIN – the Gothenburg Bioinformatics Network. We will arrange a meeting on December 6 to kick-start the networking activities for 2017, and every bioinformatician in Gothenburg is invited!

GOTBIN was launched to bridge and bring together all researchers in Gothenburg who fully or partially deal with bioinformatics in their research. Through the network it should be possible to quickly find other local researchers tackling the same research problems as you are; to find appropriate resources to run your analyses; and to discuss research or infrastructure problems as they arise. To keep the network alive and kicking, it is crucial to keep relations active. It is also crucial to interact with key persons in the GOTBIN network to keep the lists of active researchers, resources and discussion forums up to date.

To facilitate future communication, we invite everyone who works with bioinformatics in Gothenburg to participate in a get-together workshop. The purpose of this workshop is to find out who is working with what and where, and to better get to know each other. This is a great opportunity to meet your next collaboration partner, post-doc or supervisor! The event will take place in Birgit Thilander at Medicinareberget (just next to the large lecture hall Arvid Carlsson) on December 6th, from 9.00 to 12.00. Fika will be provided and we will arrange ice-breaker activities suitable for the number of participants, so please register at: https://goo.gl/forms/KYdiiZMBDf0F9hvp2 by the 16th of November.

We hope to find everyone with a research interest in bioinformatics there and that this will be the launch of the next era of GOTBIN in 2017! See you there!

I just want to highlight that the paper on strategies to improve database accuracy and usability we recently published in Proteomics (1) has been included in their most recent issue, which is a special issue focusing on Data Quality Issues in Proteomics. I highly recommend reading our paper (of course) and many of the others in the special issue. Happy reading!

On another note, I will be giving a talk next Wednesday (October 5th) at a seminar day on next generation sequencing in clinical microbiology, titled “Antibiotic resistance in the clinic and the environment – There and back again”. You are very welcome to the lecture hall on floor 3 in our building at Guldhedsgatan 10A here in Gothenburg if you are interested! (Bear in mind though that it all starts at 8.15 in the morning.)

Finally, it seems that I am going to the Next Generation Sequencing Congress in London this year, which will be very fun! Hope to see some of you dealing with sequencing there!

References

  1. Bengtsson-Palme J, Boulund F, Edström R, Feizi A, Johnning A, Jonsson VA, Karlsson FH, Pal C, Pereira MB, Rehammar A, Sánchez J, Sanli K, Thorell K: Strategies to improve usability and preserve accuracy in biological sequence databases. Proteomics, 16, 18, 2454–2460 (2016). doi: 10.1002/pmic.201600034 [Paper link]

MycoKeys today put a paper online which I was involved in. The paper describes the results of a workshop in May, when we added and refined annotations for fungal ITS sequences according to the MIxS-Built Environment annotation standard (1). Fungi have been associated with a range of unwanted effects in the built environment, including asthma, decay of building materials, and food spoilage. However, the state of the metadata annotation of fungal DNA sequences from the built environment is very much incomplete in public databases. The workshop aimed to ease a small part of this problem by distributing the re-annotation of public fungal ITS sequences across 36 persons. In total, we added or changed 45,488 data points drawing from published literature, including addition of 8,430 instances of countries of collection, 5,801 instances of building types, and 3,876 instances of surface-air contaminants. The results have been implemented in the UNITE database and shared with other online resources. I believe that distributed initiatives like this (and the ones I have previously been involved in (2,3)) serve a very important purpose for establishing better annotation of sequence data, an issue I have brought up also for sequences outside of barcoding genes (4). The full paper can be found here.

References

  1. Abarenkov K, Adams RI, Laszlo I, Agan A, Ambrioso E, Antonelli A, Bahram M, Bengtsson-Palme J, Bok G, Cangren P, Coimbra V, Coleine C, Gustafsson C, He J, Hofmann T, Kristiansson E, Larsson E, Larsson T, Liu Y, Martinsson S, Meyer W, Panova M, Pombubpa N, Ritter C, Ryberg M, Svantesson S, Scharn R, Svensson O, Töpel M, Unterseher M, Visagie C, Wurzbacher C, Taylor AFS, Kõljalg U, Schriml L, Nilsson RH: Annotating public fungal ITS sequences from the built environment according to the MIxS-Built Environment standard – a report from a May 23-24, 2016 workshop (Gothenburg, Sweden). MycoKeys, 16, 1–15 (2016). doi: 10.3897/mycokeys.16.10000
  2. Kõljalg U, Nilsson RH, Abarenkov K, Tedersoo L, Taylor AFS, Bahram M, Bates ST, Bruns TT, Bengtsson-Palme J, Callaghan TM, Douglas B, Drenkhan T, Eberhardt U, Dueñas M, Grebenc T, Griffith GW, Hartmann M, Kirk PM, Kohout P, Larsson E, Lindahl BD, Lücking R, Martín MP, Matheny PB, Nguyen NH, Niskanen T, Oja J, Peay KG, Peintner U, Peterson M, Põldmaa K, Saag L, Saar I, Schüßler A, Senés C, Smith ME, Suija A, Taylor DE, Telleria MT, Weiß M, Larsson KH: Towards a unified paradigm for sequence-based identification of Fungi. Molecular Ecology, 22, 21, 5271–5277 (2013). doi: 10.1111/mec.12481
  3. Nilsson RH, Hyde KD, Pawlowska J, Ryberg M, Tedersoo L, Aas AB, Alias SA, Alves A, Anderson CL, Antonelli A, Arnold AE, Bahnmann B, Bahram M, Bengtsson-Palme J, Berlin A, Branco S, Chomnunti P, Dissanayake A, Drenkhan R, Friberg H, Frøslev TG, Halwachs B, Hartmann M, Henricot B, Jayawardena R, Jumpponen A, Kauserud H, Koskela S, Kulik T, Liimatainen K, Lindahl B, Lindner D, Liu J-K, Maharachchikumbura S, Manamgoda D, Martinsson S, Neves MA, Niskanen T, Nylinder S, Pereira OL, Pinho DB, Porter TM, Queloz V, Riit T, Sanchez-García M, de Sousa F, Stefaczyk E, Tadych M, Takamatsu S, Tian Q, Udayanga D, Unterseher M, Wang Z, Wikee S, Yan J, Larsson E, Larsson K-H, Kõljalg U, Abarenkov K: Improving ITS sequence data for identification of plant pathogenic fungi. Fungal Diversity, 67, 1, 11–19 (2014). doi: 10.1007/s13225-014-0291-8
  4. Bengtsson-Palme J, Boulund F, Edström R, Feizi A, Johnning A, Jonsson VA, Karlsson FH, Pal C, Pereira MB, Rehammar A, Sánchez J, Sanli K, Thorell K: Strategies to improve usability and preserve accuracy in biological sequence databases. Proteomics, Early view (2016). doi: 10.1002/pmic.201600034

I just wanted to share an experience with the FARAO software we recently published a paper about, and its compatibility with the GD and libpng libraries (used for creating PNG files). I have received questions from users about how to get this to work, and to test it out I decided to try to install it on my Mac. It turned out that it is nearly impossible to get this to work. These two packages are extremely picky with versions and dependencies. After trying for about an hour, I gave up and turned to my Linux machine. Surprisingly, I could not get it to work from scratch there either, even though I have had it running (with some previous version combination) when we programmed and tested FARAO.

I find this extremely annoying myself, and I will try to look into other solutions for PNG or JPEG output from FARAO. In the meantime, I can only recommend using the EPS output option instead, which produces nicer-looking figures and is considerably easier to set up. I am sorry about this and hope to be able to provide a better solution soon.

After a long wait (1), Science of the Total Environment has finally decided to make our paper on selection of antibiotic resistance genes in sewage treatment plants (STPs) available (2). STPs are often suggested to be “hotspots” for emergence and dissemination of antibiotic-resistant bacteria (3-6). However, we actually do not know if the selection pressures within STPs, which can be caused either by residual antibiotics or by other co-selective agents, are sufficiently large to specifically promote resistance. To better understand this, we used shotgun metagenomic sequencing of samples from different steps of the treatment process (incoming water, treated water, primary sludge, recirculated sludge and digested sludge) in three Swedish STPs in the Stockholm area to characterize the frequencies of resistance genes to antibiotics, biocides and metals, as well as mobile genetic elements and taxonomic composition. In parallel, we also measured concentrations of antibiotics, biocides and metals.

We found that only the concentrations of tetracycline and ciprofloxacin in the influent water were above those that we predict to cause resistance selection (7). However, there was no consistent enrichment of resistance genes to any particular class of antibiotics in the STPs, nor of biocide and metal resistance genes. Instead, the most substantial change of the bacterial communities compared to human feces (sampled from Swedes in another study of ours (8)) occurred already in the sewage pipes, and was manifested by a strong shift from obligate to facultative anaerobes. Through the treatment process, resistance genes against antibiotics, biocides and metals were not reduced to the same extent as fecal bacteria were.

Worryingly, the OXA-48 beta-lactamase gene was consistently enriched in surplus and digested sludge. OXA-48 is still rare in Swedish clinical isolates (9), but provides resistance to carbapenems, one of our most critically important classes of antibiotics. However, taken together, metagenomic sequencing did not provide clear support for any specific selection of antibiotic resistance. Rather, since stronger selective forces affect gross taxonomic composition, and thereby also resistance gene abundances, it is very hard to interpret the metagenomic data from a risk-for-selection perspective. We therefore think that comprehensive analyses of resistant vs. non-resistant strains within relevant species are warranted.

Taken together, the main take-home messages of the paper (2) are:

  • There was no apparent evidence for direct selection of resistance genes by antibiotics or co-selection by biocides or metals
  • Abiotic factors (mostly oxygen availability) strongly shape taxonomy and seem to be driving changes of resistance genes
  • Metagenomic and/or PCR-based community studies may not be sufficiently sensitive to detect selection effects, as important shifts towards resistance may occur within species and not on the community level
  • The concentrations of antibiotics, biocides and metals were overall reduced, but not removed in STPs. Incoming concentrations of antibiotics in Swedish STPs are generally low
  • Resistance genes are overall reduced through the treatment process, but far from eliminated

References and notes

  1. Okay, those who take notes know that I have already complained once before about Science of the Total Environment’s ridiculously long production handling times. But, seriously, how can a journal’s production team return the proofs three days after acceptance, and then wait seven weeks before putting the final proofs online? I still wonder what is going on behind the scenes, which is totally obscure because the production office also refuses to respond to e-mails. Not a nice publishing experience this time either.
  2. Bengtsson-Palme J, Hammarén R, Pal C, Östman M, Björlenius B, Flach C-F, Kristiansson E, Fick J, Tysklind M, Larsson DGJ: Elucidating selection processes for antibiotic resistance in sewage treatment plants using metagenomics. Science of the Total Environment, in press (2016). doi: 10.1016/j.scitotenv.2016.06.228 [Paper link]
  3. Rizzo L, Manaia C, Merlin C, Schwartz T, Dagot C, Ploy MC, Michael I, Fatta-Kassinos D: Urban wastewater treatment plants as hotspots for antibiotic resistant bacteria and genes spread into the environment: a review. Science of the Total Environment, 447, 345–360 (2013). doi: 10.1016/j.scitotenv.2013.01.032
  4. Laht M, Karkman A, Voolaid V, Ritz C, Tenson T, Virta M, Kisand V: Abundances of Tetracycline, Sulphonamide and Beta-Lactam Antibiotic Resistance Genes in Conventional Wastewater Treatment Plants (WWTPs) with Different Waste Load. PLoS ONE, 9, e103705 (2014). doi: 10.1371/journal.pone.0103705
  5. Yang Y, Li B, Zou S, Fang HHP, Zhang T: Fate of antibiotic resistance genes in sewage treatment plant revealed by metagenomic approach. Water Research, 62, 97–106 (2014). doi: 10.1016/j.watres.2014.05.019
  6. Berendonk TU, Manaia CM, Merlin C, Fatta-Kassinos D, Cytryn E, Walsh F, et al.: Tackling antibiotic resistance: the environmental framework. Nature Reviews Microbiology, 13, 310–317 (2015). doi: 10.1038/nrmicro3439
  7. Bengtsson-Palme J, Larsson DGJ: Concentrations of antibiotics predicted to select for resistant bacteria: Proposed limits for environmental regulation. Environment International, 86, 140–149 (2016). doi: 10.1016/j.envint.2015.10.015
  8. Bengtsson-Palme J, Angelin M, Huss M, Kjellqvist S, Kristiansson E, Palmgren H, Larsson DGJ, Johansson A: The human gut microbiome as a transporter of antibiotic resistance genes between continents. Antimicrobial Agents and Chemotherapy, 59, 10, 6551–6560 (2015). doi: 10.1128/AAC.00933-15
  9. Hellman J, Aspevall O, Bengtsson B, Pringle M: SWEDRES-SVARM 2014. Consumption of antimicrobials and occurrence of antimicrobial resistance in Sweden. Public Health Agency of Sweden and National Veterinary Institute, Solna/Uppsala, Sweden. Report No.: 14027. Available from: http://www.folkhalsomyndigheten.se/publicerat-material/ (2014)

I am happy to announce that our Viewpoint article on strategies for improving sequence databases has now been published in the journal Proteomics. The paper (1) defines some central problems hampering genomic, proteomic and metagenomic analyses and suggests five strategies to improve the situation:

  1. Clearly separate experimentally verified and unverified sequence entries
  2. Enable a system for tracing the origins of annotations
  3. Separate entries with high-quality, informative annotation from less useful ones
  4. Integrate automated quality-control software whenever such tools exist
  5. Facilitate post-submission editing of annotations and metadata associated with sequences

The paper is not long, so I encourage you to read it in its entirety. We believe that spreading this knowledge and pushing solutions to problems related to poor annotation metadata is vastly important in this era of big data. Although we specifically address protein-coding genes in this paper, the same logic also applies to other types of biological sequences. In this way the paper is related to my previous work with Henrik Nilsson on improving annotation data for taxonomic barcoding genes (2-4). This paper was one of the main end-results of the GoBiG network, and the backstory on the paper follows below the references…

References

  1. Bengtsson-Palme J, Boulund F, Edström R, Feizi A, Johnning A, Jonsson VA, Karlsson FH, Pal C, Pereira MB, Rehammar A, Sánchez J, Sanli K, Thorell K: Strategies to improve usability and preserve accuracy in biological sequence databases. Proteomics, Early view (2016). doi: 10.1002/pmic.201600034
  2. Kõljalg U, Nilsson RH, Abarenkov K, Tedersoo L, Taylor AFS, Bahram M, Bates ST, Bruns TT, Bengtsson-Palme J, Callaghan TM, Douglas B, Drenkhan T, Eberhardt U, Dueñas M, Grebenc T, Griffith GW, Hartmann M, Kirk PM, Kohout P, Larsson E, Lindahl BD, Lücking R, Martín MP, Matheny PB, Nguyen NH, Niskanen T, Oja J, Peay KG, Peintner U, Peterson M, Põldmaa K, Saag L, Saar I, Schüßler A, Senés C, Smith ME, Suija A, Taylor DE, Telleria MT, Weiß M, Larsson KH: Towards a unified paradigm for sequence-based identification of Fungi. Molecular Ecology, 22, 21, 5271–5277 (2013). doi: 10.1111/mec.12481
  3. Nilsson RH, Hyde KD, Pawlowska J, Ryberg M, Tedersoo L, Aas AB, Alias SA, Alves A, Anderson CL, Antonelli A, Arnold AE, Bahnmann B, Bahram M, Bengtsson-Palme J, Berlin A, Branco S, Chomnunti P, Dissanayake A, Drenkhan R, Friberg H, Frøslev TG, Halwachs B, Hartmann M, Henricot B, Jayawardena R, Jumpponen A, Kauserud H, Koskela S, Kulik T, Liimatainen K, Lindahl B, Lindner D, Liu J-K, Maharachchikumbura S, Manamgoda D, Martinsson S, Neves MA, Niskanen T, Nylinder S, Pereira OL, Pinho DB, Porter TM, Queloz V, Riit T, Sanchez-García M, de Sousa F, Stefaczyk E, Tadych M, Takamatsu S, Tian Q, Udayanga D, Unterseher M, Wang Z, Wikee S, Yan J, Larsson E, Larsson K-H, Kõljalg U, Abarenkov K: Improving ITS sequence data for identification of plant pathogenic fungi. Fungal Diversity, 67, 1, 11–19 (2014). doi: 10.1007/s13225-014-0291-8
  4. Nilsson RH, Tedersoo L, Ryberg M, Kristiansson E, Hartmann M, Unterseher M, Porter TM, Bengtsson-Palme J, Walker D, de Sousa F, Gamper HA, Larsson E, Larsson K-H, Kõljalg U, Edgar R, Abarenkov K: A comprehensive, automatically updated fungal ITS sequence dataset for reference-based chimera control in environmental sequencing efforts. Microbes and Environments, 30, 2, 145–150 (2015). doi: 10.1264/jsme2.ME14121

Backstory
In June 2013, the Gothenburg Bioinformatics Group for junior scientists (GoBiG) arranged a workshop with two themes: “Parallelized quantification of genes in large metagenomic datasets” and “Assigning functional predictions to NGS data”. The following discussion on how database quality influenced results and what could be done to improve the situation was rather intense, and several good ideas were thrown around. I took notes from the meeting, and in the evening I wrote them up on the balcony during a warm summer night. In fact, the notes were good enough to be an early embryo for a manuscript. So I sent it to some of the most active GoBiG members (Kaisa Thorell and Fredrik Boulund), who were positive regarding the idea to turn it into a manuscript. I wrote it up more properly and we decided that everyone who contributed ideas at the meeting would be invited to become co-authors. We submitted the manuscript in early 2014, only to see it (rather brutally) rejected. At that point most of us were caught up in our own projects, so nothing happened to this manuscript for over a year. Then we decided to give it another go, updated the manuscript heavily and changed a few parts to better reflect the current database situation (at this point, e.g., UniProt had already started implementing some of our suggested ideas). Still, some of the proposed strategies were more radical in 2013 than they would be now, more than three years later. We asked the Proteomics editors if they would be interested in the manuscript, and they turned out to be very positive. Indeed, the entire experience with the editors at Proteomics has been very pleasant. I am very thankful to the GoBiG team for this time, and to the editors at Proteomics who saw the value of this manuscript.

Late last year, we introduced FARAO – the Flexible All-Round Annotation Organizer – a software tool that allows visualization of annotated features on contigs. Today, the Applications Note describing the software was published as an advance access paper in Bioinformatics (1). As I have described before, storing and visualizing annotation and coverage information in FARAO has a number of advantages. FARAO is able to:

  • Integrate annotation and coverage information for the same sequence set, enabling coverage estimates of annotated features
  • Scale across millions of sequences and annotated features
  • Filter sequences, such that only entries with annotations satisfying certain given criteria will be outputted
  • Handle annotation and coverage data produced by a range of different bioinformatics tools
  • Handle custom parsers through a flexible interface, allowing for adaptation of the software to virtually any bioinformatic tool not supported out of the box
  • Produce high-quality EPS output
  • Integrate with MySQL databases

I have previously used FARAO to produce annotation figures in our paper on a polluted Indian lake (2), as well as in a paper on sewage treatment plants (which is in press and should be coming out any day now). We hope that the tool will find many more uses in other projects in the future!

References

  1. Hammarén R, Pal C, Bengtsson-Palme J: FARAO: The Flexible All-Round Annotation Organizer. Bioinformatics, advance access (2016). doi: 10.1093/bioinformatics/btw499 [Paper link]
  2. Bengtsson-Palme J, Boulund F, Fick J, Kristiansson E, Larsson DGJ: Shotgun metagenomics reveals a wide array of antibiotic resistance genes and mobile elements in a polluted lake in India. Frontiers in Microbiology, 5, 648 (2014). doi: 10.3389/fmicb.2014.00648 [Paper link]

Today marks the five-year anniversary of the Metaxa software’s initial release. Much has happened to the software since; Metaxa started off as an rRNA extraction utility for metagenomic data (1), including coarse classification to organism/organelle type. Since then, it has gained full-scale taxonomic classification ability better than or on par with other software packages (2), much greater speed, support for the LSU gene, and a range of related software tools (3), and has spurred development of other tools such as ITSx (4). I have also been involved in no less than four peer-reviewed publications directly related to the software (1-3,5).

But it does not end here; these five years were just the beginning. We are – in different constellations – working on further enhancements to Metaxa2, including support for more genes, an updated classification database, and better customization options. I am very much still devoted to keeping Metaxa2 alive and relevant as a tool for taxonomic analysis of metagenomes, applicable whenever accuracy is a key parameter. Thanks for being part of the community for these five years!

References

  1. Bengtsson J, Eriksson KM, Hartmann M, Wang Z, Shenoy BD, Grelet G, Abarenkov K, Petri A, Alm Rosenblad M, Nilsson RH: Metaxa: A software tool for automated detection and discrimination among ribosomal small subunit (12S/16S/18S) sequences of archaea, bacteria, eukaryotes, mitochondria, and chloroplasts in metagenomes and environmental sequencing datasets. Antonie van Leeuwenhoek, 100, 3, 471–475 (2011). doi:10.1007/s10482-011-9598-6. [Paper link]
  2. Bengtsson-Palme J, Hartmann M, Eriksson KM, Pal C, Thorell K, Larsson DGJ, Nilsson RH: Metaxa2: Improved identification and taxonomic classification of small and large subunit rRNA in metagenomic data. Molecular Ecology Resources, 15, 6, 1403–1414 (2015). doi: 10.1111/1755-0998.12399 [Paper link]
  3. Bengtsson-Palme J, Thorell K, Wurzbacher C, Sjöling Å, Nilsson RH: Metaxa2 Diversity Tools: Easing microbial community analysis with Metaxa2. Ecological Informatics, 33, 45–50 (2016). doi: 10.1016/j.ecoinf.2016.04.004 [Paper link]
  4. Bengtsson-Palme J, Ryberg M, Hartmann M, Branco S, Wang Z, Godhe A, De Wit P, Sánchez-García M, Ebersberger I, de Souza F, Amend AS, Jumpponen A, Unterseher M, Kristiansson E, Abarenkov K, Bertrand YJK, Sanli K, Eriksson KM, Vik U, Veldre V, Nilsson RH: Improved software detection and extraction of ITS1 and ITS2 from ribosomal ITS sequences of fungi and other eukaryotes for use in environmental sequencing. Methods in Ecology and Evolution, 4, 10, 914–919 (2013). doi: 10.1111/2041-210X.12073 [Paper link]
  5. Bengtsson-Palme J, Hartmann M, Eriksson KM, Nilsson RH: Metaxa, overview. In:Nelson K. (Ed.) Encyclopedia of Metagenomics: SpringerReference (www.springerreference.com). Springer-Verlag Berlin Heidelberg (2013). doi: 10.1007/978-1-4614-6418-1_239-6 [Link]

Yesterday, Ecological Informatics put our paper describing Metaxa2 Diversity Tools online (1). Metaxa2 Diversity Tools was introduced with Metaxa2 version 2.1 and consists of

  • metaxa2_dc – a tool for collecting several .taxonomy.txt output files into one large abundance matrix, suitable for analysis in, e.g., R
  • metaxa2_rf – generates resampling rarefaction curves (2) based on the .taxonomy.txt output
  • metaxa2_si – species inference based on guessing species data from the other species present in the .taxonomy.txt output file
  • metaxa2_uc – a tool for determining if the community composition of a sample is significantly different from others through resampling analysis
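To make the idea behind the data collector concrete, here is a minimal Python sketch of merging per-sample taxon counts into one abundance matrix. It is an illustration only and not the metaxa2_dc implementation: the function name `build_abundance_matrix` and the in-memory `{taxon: count}` input are assumptions, and parsing of the actual Metaxa2 output files is deliberately left out.

```python
def build_abundance_matrix(samples):
    """Merge per-sample {taxon: count} dicts into one taxon-by-sample matrix.

    `samples` maps a sample name to its {taxon: count} dict (hypothetical
    in-memory input; reading Metaxa2 output files is not shown here).
    Returns (taxa, sample_names, rows), where rows[i][j] is the count of
    taxa[i] in sample_names[j]. Taxa missing from a sample are counted as 0.
    """
    sample_names = list(samples)
    # Union of all taxa seen in any sample, sorted for a stable row order
    taxa = sorted({taxon for counts in samples.values() for taxon in counts})
    rows = [
        [samples[name].get(taxon, 0) for name in sample_names]
        for taxon in taxa
    ]
    return taxa, sample_names, rows
```

A matrix in this shape (taxa as rows, samples as columns) can be written out as a tab-separated table and read directly into R.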

At the same time as I did this update to the web site, I also took the opportunity to update the Metaxa2 FAQ to better reflect recent updates to the Metaxa2 software.

Metaxa2 Diversity Tools
One often requested feature of Metaxa2 (3) has been the ability to make simple analyses from the data after classification. The Metaxa2 Diversity Tools included in Metaxa2 2.1 is a seed for such an effort (although not close to a full-fledged community analysis package comparable to QIIME (4) or Mothur (5)). It currently consists of four tools.

The Metaxa2 Data Collector (metaxa2_dc) is the simplest of them (but probably the most requested), designed to merge the output of several *.level_X.txt files from the Metaxa2 Taxonomic Traversal Tool into one large abundance matrix, suitable for further analysis in, for example, R. The Metaxa2 Species Inference tool (metaxa2_si) can be used to further infer taxon information on, for example, the species level at a lower reliability than what would be permitted by the Metaxa2 classifier, using a complementary algorithm. The idea is that if only a single species is present in, e.g., a family and a read is assigned to this family, but not classified to the species level, that sequence will be inferred to the same species as the other reads, given that it has more than 97% sequence identity to its best reference match. This can be useful if the user really needs species or genus classifications but many organisms in the studied species group have similar rRNA sequences, making it hard for the Metaxa2 classifier to classify sequences to the species level.
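The species inference rule described above can be sketched in a few lines of Python. This is a simplified illustration of the stated rule, not the metaxa2_si code: the function name `infer_species`, the read representation (dicts with `family`, `species` and `identity` keys) and the restriction to the family level are all assumptions made for the example.

```python
def infer_species(reads, identity_cutoff=97.0):
    """Return one species label (or None) per read, in input order.

    Each read is a dict with hypothetical keys:
      'family'   - family the read was assigned to
      'species'  - species label, or None if not classified that far
      'identity' - percent identity to the best reference match
    """
    # Collect the set of species observed among classified reads per family
    species_by_family = {}
    for read in reads:
        if read["species"] is not None:
            species_by_family.setdefault(read["family"], set()).add(read["species"])

    inferred = []
    for read in reads:
        species = read["species"]
        candidates = species_by_family.get(read["family"], set())
        # Infer only when the family holds exactly one observed species
        # and the read is more than 97% identical to its best match
        if (species is None and len(candidates) == 1
                and read["identity"] > identity_cutoff):
            species = next(iter(candidates))
        inferred.append(species)
    return inferred
```

Reads in families with two or more observed species, or with identity at or below the cutoff, are left unclassified, which reflects the lower but still bounded reliability of the inference.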

The Metaxa2 Rarefaction analysis tool (metaxa2_rf) performs a resampling rarefaction analysis (2) based on the output from the Metaxa2 classifier, taking into account also the unclassified portion of rRNAs. The Metaxa2 Uniqueness of Community analyzer (metaxa2_uc), finally, allows analysis of whether the community composition of two or more samples or groups is significantly different. Using resampling of the community data, the null hypothesis that the taxonomic content of two communities is drawn from the same set of taxa (given certain abundances) is tested. All these tools are further described in the manual and the recent paper (1).
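As a rough illustration of what a resampling rarefaction analysis does, the Python sketch below repeatedly draws random subsets of reads and records how many taxa each subset contains. It is a generic sketch and not the metaxa2_rf implementation: the function name, the number of replicates, and the choice to treat unclassified reads as one extra category are assumptions.

```python
import random


def rarefaction_curve(counts, depths, replicates=100, seed=0):
    """Mean number of taxa observed when subsampling reads at each depth.

    `counts` maps taxon -> read count. In this sketch, unclassified reads
    are simply passed in as one more key (an assumption for illustration).
    """
    rng = random.Random(seed)
    # Expand the counts into one entry per read
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    curve = []
    for depth in depths:
        if depth > len(pool):
            raise ValueError("depth exceeds the number of reads")
        total = 0
        for _ in range(replicates):
            # Sample reads without replacement and count distinct taxa
            total += len(set(rng.sample(pool, depth)))
        curve.append(total / replicates)
    return curve
```

Plotting the returned means against the depths gives the rarefaction curve; a curve that flattens out suggests that sequencing deeper would add few new taxa.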

The latest version of Metaxa2, including the Metaxa2 Diversity Tools, can be downloaded here.

References

  1. Bengtsson-Palme J, Thorell K, Wurzbacher C, Sjöling Å, Nilsson RH: Metaxa2 Diversity Tools: Easing microbial community analysis with Metaxa2. Ecological Informatics, 33, 45–50 (2016). doi: 10.1016/j.ecoinf.2016.04.004 [Paper link]
  2. Gotelli NJ, Colwell RK: Quantifying biodiversity: procedures and pitfalls in the measurement and comparison of species richness. Ecology Letters, 4, 379–391 (2001). doi:10.1046/j.1461-0248.2001.00230.x
  3. Bengtsson-Palme J, Hartmann M, Eriksson KM, Pal C, Thorell K, Larsson DGJ, Nilsson RH: Metaxa2: Improved Identification and Taxonomic Classification of Small and Large Subunit rRNA in Metagenomic Data. Molecular Ecology Resources (2015). doi: 10.1111/1755-0998.12399 [Paper link]
  4. Caporaso JG, Kuczynski J, Stombaugh J et al.: QIIME allows analysis of high-throughput community sequencing data. Nature Methods, 7, 335–336 (2010).
  5. Schloss PD, Westcott SL, Ryabin T et al.: Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and Environmental Microbiology, 75, 7537–7541 (2009).

Metaxa2 has been updated again today to version 2.1.3. This update adds a few features to the Metaxa2 Diversity Tools (metaxa2_uc and metaxa2_rf). The core Metaxa2 programs remain the same as for the previous Metaxa2 versions. The new features were suggested as part of the review process of a Metaxa2-related manuscript, and we thank the anonymous reviewers for their great suggestions!

New features and bug fixes in this update:

  • Added the Chao1, iChao1 and ACE estimators in addition to the original species abundance (“Bengtsson-Palme”) model in metaxa2_rf
  • Added the Raup-Crick dissimilarity method to the metaxa2_uc tool
  • Added a warning message when data is highly skewed for metaxa2_uc
  • Improved robustness of the ‘model’ mode of metaxa2_uc for highly skewed sample groups
  • Fixed a bug causing miscalculation of Euclidean distances on binary data in metaxa2_uc
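For readers unfamiliar with the Chao1 estimator mentioned in the list above, here is a small Python sketch of its widely used bias-corrected form, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of taxa seen exactly once and exactly twice. Which variant metaxa2_rf uses is not stated here, so treat this as a generic illustration with a hypothetical function name.

```python
def chao1(counts):
    """Bias-corrected Chao1 richness estimate from a list of taxon counts."""
    observed = [n for n in counts if n > 0]
    s_obs = len(observed)                        # taxa actually observed
    f1 = sum(1 for n in observed if n == 1)      # singletons
    f2 = sum(1 for n in observed if n == 2)      # doubletons
    # The bias-corrected form stays defined even when there are no doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
```

Many singletons relative to doubletons push the estimate well above the observed richness, signalling that rare taxa are likely still undetected.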

The updated version of Metaxa2 can be downloaded here.

Happy barcoding!