Microbiology, Metagenomics and Bioinformatics

Johan Bengtsson-Palme, University of Gothenburg

TriMetAss has been updated to version 1.1. The new version addresses a number of minor issues and brings two new handy features. The update can be found here.

New features:

  • Multiple input files can now be specified by adding several -1 and -2 options.
  • TriMetAss now automatically stops if the candidate reads are the same for two iterations in a row.
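
The two new features can be pictured with a small Python sketch. This is not the TriMetAss code: the function names, the `select_candidates` callback and the `max_iterations` limit are hypothetical and only illustrate repeated -1 and -2 options and the "same candidate reads twice in a row" stop rule.

```python
import argparse


def parse_inputs(argv):
    """Collect repeated -1/-2 options into two parallel lists of file names."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-1", dest="forward", action="append", default=[])
    parser.add_argument("-2", dest="reverse", action="append", default=[])
    return parser.parse_args(argv)


def iterate_until_stable(select_candidates, max_iterations=10):
    """Call select_candidates(i) until it returns the same read set twice in a row.

    select_candidates is a hypothetical callback that returns the read IDs
    recruited in iteration i. Returns (iteration_stopped_at, read_set).
    """
    previous = None
    for iteration in range(1, max_iterations + 1):
        current = frozenset(select_candidates(iteration))
        if current == previous:
            # No change since the last round, so further rounds add nothing.
            return iteration, current
        previous = current
    return max_iterations, previous
```

With made-up read sets such as `{"r1"}`, `{"r1", "r2"}`, `{"r1", "r2"}`, the loop stops in the third round, since the second and third sets are identical.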

Fixed issues:

  • Support for recent versions of Trinity that no longer contain the Trinity.pl script.
  • A minor bug causing TriMetAss to use more memory than necessary has been fixed.
  • Fixed the --stop_total option so that TriMetAss actually uses this option (rather than --stop_length).
  • Allowed complicated paths to be supplied for the output directory.
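
For anyone wrapping Trinity in their own scripts, the first fix can be approximated along these lines. This is a rough sketch and not how TriMetAss does it: the function name, the `which` parameter and the order in which the two executable names are tried are all assumptions.

```python
import shutil


def find_trinity(candidates=("Trinity", "Trinity.pl"), which=shutil.which):
    """Return the path of the first Trinity launcher found on PATH, or None.

    The candidate order (newer name first, older Trinity.pl second) is an
    assumption; `which` can be swapped out to make the lookup testable.
    """
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None
```

Calling `find_trinity()` returns `None` if neither name is on the PATH, which a wrapper script can turn into a clear error message.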

I would like to thank users Rickard Hammarén, Dr. Tatsuya Unno, Dr. Gisle Vestergaard and Dr. Joseph Nesme for providing the information that made these fixes possible. Thanks a lot!

Vacation

This week is the first of my long summer break, and I will be on vacation until mid-August. This means that I will only read mail sporadically, if at all. For very urgent issues, please give me a call or send me an SMS, and I will attend to your message as soon as possible (this of course only applies to those of you who have my number in the first place).

For support questions, there are a few options:

  • For questions regarding Metaxa or Metaxa2, please add “METAXA” to the beginning of the subject line of the e-mail.
  • For questions regarding ITSx, please add “ITSX” to the beginning of the subject line.
  • For other support questions, please add “SUPPORT” to the beginning of the subject line.

This way I can easily assess which mails are urgent to reply to. Don’t add “IMPORTANT” or “URGENT”, since that will just invoke the spam filter.

I wish you all a very very great summer!

Late last year, an opinion paper by José Martínez, Teresa Coque and Fernando Baquero was published in Nature Reviews Microbiology (1). In this paper, the authors present a system – resistance readiness conditions (RESCon) – for ranking the risks associated with the detection of antibiotic resistance genes. They also outline the obstacles associated with determining risks presented by antibiotic resistance genes in environmental microbial communities in terms of their potential to transfer to human pathogens. Generally, I am very positive about this paper, which I think is a must-read for anyone who works with antibiotic resistance genes in metagenomes, regardless of whether they stem from the human gut or the external environment.

There is, however, one very important aspect that struck me and many other members of our research group as curious: the proposed system assigns antibiotic resistance genes already present on mobile genetic elements in human pathogens to the highest risk category (RESCon 1), while resistance genes encoding novel resistance mechanisms that have not yet been found on mobile elements in a pathogen are placed in lower risk categories. We believe that this system will overestimate the risks associated with well-known resistance factors that are already circulating among human pathogens and underappreciate the potentially disastrous consequences that the transfer of previously unknown resistance determinants from the environmental resistome could have (exemplified by the rapid clinical spread of the NDM-1 metallo-beta-lactamase gene (2,3)).

With this in mind, Joakim Larsson and I wrote a response letter to Nature Reviews Microbiology that went online last Monday (4), together with the authors’ reply to us (5). (I strongly suggest that you read the entire original paper (1) before you read the reply (5) to our response letter (4), since Martinez et al. change the scope slightly from the original paper in their reply, and these clarifications may (or may not) have been in response to our arguments.)

In our response, we also stress that the abundances of resistance genes, and not only their presence, should be accounted for when estimating risks (although that last point might have been slightly obscured due to the very low word limit). In other words, we think that identifying environmental hotspots for antibiotic resistance genes, where novel resistance genes could be selected for (6,7,8), is of great importance for mitigating public health risks related to environmental antibiotic resistance. Please read our full thoughts on the matter in Nature Reviews Microbiology.

Similar issues will be touched upon in my talk at the EDAR2015 conference later in May. Hope to see you there!

References

  1. Martinez JL, Coque TM, Baquero F: What is a resistance gene? Ranking risk in resistomes. Nat Rev Microbiol 2015, 13:116–123.
  2. Kumarasamy KK, et al.: Emergence of a new antibiotic resistance mechanism in India, Pakistan, and the UK: a molecular, biological, and epidemiological study. Lancet Infect Dis 2010, 10:597–602.
  3. Walsh TR, Weeks J, Livermore DM, Toleman MA: Dissemination of NDM‐1 positive bacteria in the New Delhi environment and its implications for human health: an environmental point prevalence study. Lancet Infect Dis 2011, 11:355–362.
  4. Bengtsson-Palme J, Larsson DGJ: Antibiotic resistance genes in the environment: prioritizing risks. Nat Rev Microbiol 2015, Advance online publication. doi:10.1038/nrmicro3399‐c1
  5. Martinez JL, Coque TM, Baquero F: Prioritizing risks of antibiotic resistance genes in all metagenomes. Nat Rev Microbiol 2015, Advance online publication. doi:10.1038/nrmicro3399‐c2
  6. Kristiansson E, et al.: Pyrosequencing of antibiotic‐contaminated river sediments reveals high levels of resistance and gene transfer elements. PLoS ONE 2011, 6:e17038.
  7. Bengtsson‐Palme J, Boulund F, Fick J, Kristiansson E, Larsson DGJ: Shotgun metagenomics reveals a wide array of antibiotic resistance genes and mobile elements in a polluted lake in India. Front Microbiol 2014, 5:648.
  8. Marathe NP, et al.: A treatment plant receiving waste water from multiple bulk drug manufacturers is a reservoir for highly multi‐drug resistant integron‐bearing bacteria. PLoS ONE 2013, 8:e77310.

I will be giving a talk at the Third International Symposium on the Environmental Dimension of Antibiotic Resistance (EDAR2015) next month (five weeks from now). The talk is entitled “Turn up the signal – wipe out the noise: Gaining insights into antibiotic resistance of bacterial communities using metagenomic data”, and will deal with handling of metagenomic data in antibiotic resistance gene research. The talk will highlight some particular pitfalls related to interpretation of data, and exemplify how flawed analysis practices can result in misleading conclusions regarding antibiotic resistance risks. I will particularly address how taxonomic composition influences the frequencies of resistance genes, the importance of knowledge of the functions of the genes in the databases used, and how normalization strategies influence the results. Furthermore, I will show how the context of resistance genes can allow inference of their potential to spread to human pathogens from environmental or commensal bacteria. All these aspects will be exemplified by data from our studies of environments subjected to pharmaceutical pollution in India, the effect of travel on the human resistome, and modern municipal wastewater treatment processes.

The talk will take place on Monday, May 18, 2015 at 13:20. The full scientific program for the conference can be found here. Registration for the conference is still possible, although not for the early-bird price. I look forward to seeing a lot of the people who will attend the conference, and hopefully also you!

It is nice to see that Indian media have picked up the story about antibiotic resistance genes in the heavily polluted Kazipally lake. In this case, it is the Deccan Chronicle that has reported on our findings and briefly interviewed Prof. Joakim Larsson about the study. The issue of pharmaceutical pollution of the environment in drug-producing countries is still rather under-reported, and public awareness of the problem might be rather low. Therefore, it makes me happy to see an Indian newspaper reporting on the issue. The scientific publication referred to can be found here.

Metaxa2 has been updated to version 2.0.2 and can be downloaded from the Metaxa2 web site. The 2.0.2 update fixes two minor bugs: one causing the “.graph” file to display incorrect or no names for the LSU regions, and one causing misreporting of the number of sequences in single-end FASTQ files (paired-end files were reported correctly). The update also brings a slightly improved classifier. Thanks to Marco Severgnini for reporting the FASTQ file issue! The update is available here.

Those of you who find that ITSx runs slowly despite being assigned multiple CPUs, particularly on datasets with only one kind of sequence (e.g. fungal, using the -t F option), might be interested in trying out Andrew Krohn’s parallel ITSx implementation. The solution essentially employs a bash script spawning multiple ITSx instances running on different portions of the input file. Although there are some limitations to the script (e.g. you cannot select a custom name for the output, and you will only get the ITS1 and ITS2 + full sequences FASTA files, as far as I understand the script), it may prove useful for many of you until we write up a proper solution to the poor multi-thread performance of ITSx (planned for version 1.1). Until then, I recommend that you check this solution out! See also the wiki documentation.
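
To give a feel for the "different portions of the input file" idea, here is a minimal Python sketch of the splitting step only. It is not Andrew Krohn’s script (which is written in bash), and the function name and the round-robin assignment of records to chunks are my own assumptions for illustration.

```python
def split_fasta(text, n_chunks):
    """Split FASTA text into up to n_chunks pieces without breaking any record.

    Records are dealt out round-robin, an arbitrary choice made here so that
    the chunks end up with roughly equal numbers of sequences.
    """
    records = []
    for line in text.splitlines():
        if line.startswith(">"):
            records.append([line])        # a header line starts a new record
        elif records and line.strip():
            records[-1].append(line)      # sequence line of the current record
    chunks = [[] for _ in range(n_chunks)]
    for index, record in enumerate(records):
        chunks[index % n_chunks].extend(record)
    # Drop empty chunks (fewer records than requested chunks).
    return ["\n".join(chunk) + "\n" for chunk in chunks if chunk]
```

Each chunk would then be written to its own file and handed to a separate ITSx process, with the per-chunk outputs concatenated afterwards.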

My speed tests show the following (on a quite small test set of fungal ITS sequences):

  • ITSx parallel on 16 CPUs, all ITS types (option “-t all”): 3 min, 16 sec
  • ITSx parallel on 16 CPUs, only fungal ITS types (option “-t f”): 54 sec
  • ITSx native on 16 CPUs, all ITS types (options “-t all --cpu 16”): 4 min, 59 sec
  • ITSx native on 16 CPUs, only fungal types (options “-t f --cpu 16”): 5 min, 50 sec

Why the fungal-only run took longer than the all-types run in the native implementation is a mystery to me, but it probably shows why there is a need to rewrite the multithreading code, as we did with Metaxa a couple of years ago. Stay tuned for ITSx updates!
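
For those who want the timings above as speed-up factors, the arithmetic is quickly done in a few lines of Python. The numbers are the ones reported in this post; the variable and function names are just for illustration.

```python
def to_seconds(minutes, seconds):
    """Convert a 'min, sec' timing to seconds."""
    return 60 * minutes + seconds


# Wall-clock times from the speed tests, keyed by (implementation, -t profile).
timings = {
    ("parallel", "all"): to_seconds(3, 16),
    ("parallel", "f"): to_seconds(0, 54),
    ("native", "all"): to_seconds(4, 59),
    ("native", "f"): to_seconds(5, 50),
}


def speedup(profile):
    """How many times faster the parallel script was than native ITSx."""
    return timings[("native", profile)] / timings[("parallel", profile)]


for profile in ("all", "f"):
    print(f"-t {profile}: {speedup(profile):.1f}x faster with the parallel script")
```

This works out to roughly 1.5 times faster for all ITS types and roughly 6.5 times faster for the fungal-only run, on this particular small test set.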

A couple of days ago, a paper I have co-authored describing an ITS sequence dataset for chimera control in fungi went online as an advance online publication in Microbes and Environments. There are several software tools available for chimera detection (e.g. Henrik Nilsson’s fungal chimera checker (1) and UCHIME (2)), but these generally rely on the presence of a chimera-free reference dataset. Until now, no such dataset existed for the fungal ITS region, and in this paper (3) we introduce a comprehensive, automatically updated reference dataset for fungal ITS sequences based on the UNITE database (4). This dataset supports chimera detection throughout the fungal kingdom and for full-length ITS sequences as well as partial (ITS1 or ITS2 only) datasets. We estimated the dataset performance on a large set of artificial chimeras to be above 99.5%, and also used the dataset to remove nearly 1,000 chimeric fungal ITS sequences from the UNITE database. The dataset can be downloaded from the UNITE repository. This also makes it possible for users to curate the dataset in the future through the UNITE interactive editing tools.

References:

  1. Nilsson RH, Abarenkov K, Veldre V, Nylinder S, Wit P de, Brosché S, Alfredsson JF, Ryberg M, Kristiansson E: An open source chimera checker for the fungal ITS region. Molecular Ecology Resources, 10, 1076–1081 (2010).
  2. Edgar RC, Haas BJ, Clemente JC, Quince C, Knight R: UCHIME improves sensitivity and speed of chimera detection. Bioinformatics, 27, 16, 2194–2200 (2011). doi: 10.1093/bioinformatics/btr381
  3. Nilsson RH, Tedersoo L, Ryberg M, Kristiansson E, Hartmann M, Unterseher M, Porter TM, Bengtsson-Palme J, Walker D, de Sousa F, Gamper HA, Larsson E, Larsson K-H, Kõljalg U, Edgar R, Abarenkov K: A comprehensive, automatically updated fungal ITS sequence dataset for reference-based chimera control in environmental sequencing efforts. Microbes and Environments, Advance Online Publication (2015). doi: 10.1264/jsme2.ME14121
  4. Kõljalg U, Nilsson RH, Abarenkov K, Tedersoo L, Taylor AFS, Bahram M, Bates ST, Bruns TT, Bengtsson-Palme J, Callaghan TM, Douglas B, Drenkhan T, Eberhardt U, Dueñas M, Grebenc T, Griffith GW, Hartmann M, Kirk PM, Kohout P, Larsson E, Lindahl BD, Lücking R, Martín MP, Matheny PB, Nguyen NH, Niskanen T, Oja J, Peay KG, Peintner U, Peterson M, Põldmaa K, Saag L, Saar I, Schüßler A, Senés C, Smith ME, Suija A, Taylor DE, Telleria MT, Weiß M, Larsson KH: Towards a unified paradigm for sequence-based identification of Fungi. Molecular Ecology, 22, 21, 5271–5277 (2013). doi: 10.1111/mec.12481

After receiving a devastating amount of comment spam in the last couple of days, I have decided to close all commenting functionality, at least temporarily, on this site. If you want to discuss or comment on anything, please send me an e-mail. I would love to re-post your thoughts on the website! It’s sad that it has come down to this, but I never received as many comments as I did e-mails anyway. Thanks for your understanding, and let’s keep in touch over e-mail (or ResearchGate)!

A couple of days ago, a paper summarizing the EU report on effect-based tools for use in toxicology in the aquatic environment, which I have been involved in, was published in Environmental Sciences Europe (1). The report itself was officially published last spring (2), and can be found here, with the annex available on the European Commission document website. My contribution to the paper was, as with the report, in the genomics and metagenomics section. The paper briefly presents modern bioassays, biomarkers and ecological methods that can be used for aquatic monitoring of the environment.

References:

  1. Wernersson A-S, Carere M, Maggi C, Tusil P, Soldan P, James A, Sanchez W, Dulio V, Broeg K, Reifferscheid G, Buchinger S, Maas H, Van Der Grinten E, O’Toole S, Ausili A, Manfra L, Marziali L, Polesello S, Lacchetti I, Mancini L, Lilja K, Linderoth M, Lundeberg T, Fjällborg B, Porsbring T, Larsson DGJ, Bengtsson-Palme J, Förlin L, Kienle C, Kunz P, Vermeirssen E, Werner I, Robinson CD, Lyons B, Katsiadaki I, Whalley C, den Haan K, Messiaen M, Clayton H, Lettieri T, Negrão Carvalho R, Gawlik BM, Hollert H, Di Paolo C, Brack W, Kammann U, Kase R: The European technical report on aquatic effect-based monitoring tools under the water framework directive. Environmental Sciences Europe, 27, 7 (2015). doi: 10.1186/s12302-015-0039-4
  2. Wernersson A-S, Carere M, Maggi C, Tusil P, Soldan P, James A, Sanchez W, Broeg K, Kammann U, Reifferscheid G, Buchinger S, Maas H, Van Der Grinten E, Ausili A, Manfra L, Marziali L, Polesello S, Lacchetti I, Mancini L, Lilja K, Linderoth M, Lundeberg T, Fjällborg B, Porsbring T, Larsson DGJ, Bengtsson-Palme J, Förlin L, Kase R, Kienle C, Kunz P, Vermeirssen E, Werner I, Robinson CD, Lyons B, Katsiadaki I, Whalley C, den Haan K, Messiaen M, Clayton H, Lettieri T, Negrão Carvalho R, Gawlik BM, Dulio V, Hollert H, Di Paolo C, Brack W (2014). Technical Report on Aquatic Effect-Based Monitoring Tools. European Commission. Technical Report 2014-077, Office for Official Publications of European Communities, ISBN: 978-92-79-35787-9. doi:10.2779/7260