Microbiology, Metagenomics and Bioinformatics

Johan Bengtsson-Palme, University of Gothenburg

After receiving a devastating amount of comment spam in the last couple of days, I have decided to close all commenting functionality on this site, at least temporarily. If you want to discuss or comment on anything, please send me an e-mail. I would love to re-post your thoughts on the website! It’s sad that it has come down to this, but I never received as many comments as e-mails anyway. Thanks for your understanding, and let’s keep in touch over e-mail (or ResearchGate)!

A couple of days ago, a paper summarizing the EU report on effect-based tools for use in aquatic toxicology, which I have been involved in, was published in Environmental Sciences Europe (1). The report itself was officially published last spring (2), and can be found here, with the annex available on the European Commission document website. My contribution to the paper was, as with the report, in the genomics and metagenomics section. The paper briefly presents modern bioassays, biomarkers and ecological methods that can be used for monitoring of the aquatic environment.

References:

  1. Wernersson A-S, Carere M, Maggi C, Tusil P, Soldan P, James A, Sanchez W, Dulio V, Broeg K, Reifferscheid G, Buchinger S, Maas H, Van Der Grinten E, O’Toole S, Ausili A, Manfra L, Marziali L, Polesello S, Lacchetti I, Mancini L, Lilja K, Linderoth M, Lundeberg T, Fjällborg B, Porsbring T, Larsson DGJ, Bengtsson-Palme J, Förlin L, Kienle C, Kunz P, Vermeirssen E, Werner I, Robinson CD, Lyons B, Katsiadaki I, Whalley C, den Haan K, Messiaen M, Clayton H, Lettieri T, Negrão Carvalho R, Gawlik BM, Hollert H, Di Paolo C, Brack W, Kammann U, Kase R: The European technical report on aquatic effect-based monitoring tools under the water framework directive. Environmental Sciences Europe, 27, 7 (2015). doi: 10.1186/s12302-015-0039-4 [Paper link]
  2. Wernersson A-S, Carere M, Maggi C, Tusil P, Soldan P, James A, Sanchez W, Broeg K, Kammann U, Reifferscheid G, Buchinger S, Maas H, Van Der Grinten E, Ausili A, Manfra L, Marziali L, Polesello S, Lacchetti I, Mancini L, Lilja K, Linderoth M, Lundeberg T, Fjällborg B, Porsbring T, Larsson DGJ, Bengtsson-Palme J, Förlin L, Kase R, Kienle C, Kunz P, Vermeirssen E, Werner I, Robinson CD, Lyons B, Katsiadaki I, Whalley C, den Haan K, Messiaen M, Clayton H, Lettieri T, Negrão Carvalho R, Gawlik BM, Dulio V, Hollert H, Di Paolo C, Brack W (2014). Technical Report on Aquatic Effect-Based Monitoring Tools. European Commission. Technical Report 2014-077, Office for Official Publications of European Communities, ISBN: 978-92-79-35787-9. doi:10.2779/7260

After almost a year in different stages of review and revision, in which the paper (but not the software) saw a total transformation, I am happy to announce that the paper describing Metaxa2 has been accepted in Molecular Ecology Resources and is available in a rudimentary online early form. The figures in this version are not that pretty, but those who want to read the paper as soon as possible now have the possibility to do so.

This means that if you have been using Metaxa2 for a publication, there is now a new preferred way of citing it, namely:

Bengtsson-Palme J, Hartmann M, Eriksson KM, Pal C, Thorell K, Larsson DGJ, Nilsson RH: Metaxa2: Improved Identification and Taxonomic Classification of Small and Large Subunit rRNA in Metagenomic Data. Molecular Ecology Resources (2015). doi: 10.1111/1755-0998.12399

The paper (1), apart from describing the new Metaxa version, also presents a very thorough evaluation of the software, comparing it to other tools for taxonomic classification implemented in QIIME (2). In short, we show that:

  • Metaxa2 can make trustworthy taxonomic classifications even with reads as short as 100 bp
  • Generally, the performance is reliable across the entire SSU rRNA gene, regardless of which V-region a read is derived from
  • Metaxa2 can reliably recapture species composition from short-read metagenomic data, comparable with results of amplicon sequencing
  • Metaxa2 outperforms other popular tools such as Mothur (3), the RDP Classifier (4), Rtax (5) and the QIIME implementation of Uclust (6) in terms of proportion of correctly classified reads from metagenomic data
  • The false positive rate of Metaxa2 is very close to zero, far superior to many of the above-mentioned tools, many of which assume that reads must derive from the rRNA gene

Metaxa2 can be downloaded here. We have already used it for around two years internally, and it forms the base of the taxonomic classifications in e.g. our recently published paper on antibiotic resistance in a polluted Indian lake (7).

References

  1. Bengtsson-Palme J, Hartmann M, Eriksson KM, Pal C, Thorell K, Larsson DGJ, Nilsson RH: Metaxa2: Improved Identification and Taxonomic Classification of Small and Large Subunit rRNA in Metagenomic Data. Molecular Ecology Resources (2015). doi: 10.1111/1755-0998.12399 [Paper link]
  2. Caporaso JG, Kuczynski J, Stombaugh J et al.: QIIME allows analysis of high-throughput community sequencing data. Nature Methods, 7, 335–336 (2010).
  3. Schloss PD, Westcott SL, Ryabin T et al.: Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and Environmental Microbiology, 75, 7537–7541 (2009).
  4. Wang Q, Garrity GM, Tiedje JM, Cole JR: Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied and Environmental Microbiology, 73, 5261–5267 (2007).
  5. Soergel DAW, Dey N, Knight R, Brenner SE: Selection of primers for optimal taxonomic classification of environmental 16S rRNA gene sequences. The ISME Journal, 6, 1440–1444 (2012).
  6. Edgar RC: Search and clustering orders of magnitude faster than BLAST. Bioinformatics, 26, 2460–2461 (2010).
  7. Bengtsson-Palme J, Boulund F, Fick J, Kristiansson E, Larsson DGJ: Shotgun metagenomics reveals a wide array of antibiotic resistance genes and mobile elements in a polluted lake in India. Frontiers in Microbiology, 5, 648 (2014).

My colleague Henrik Nilsson has been interviewed by the ResearchGate news team about the recent effort to better annotate ITS data for plant pathogenic fungi. It’s an interesting read, and I think Henrik nicely underscores why large-scale efforts for improving and correcting sequence annotations are important. You can read the interview here, and the paper they talk about is referenced below.

Nilsson RH, Hyde KD, Pawlowska J, Ryberg M, Tedersoo L, Aas AB, Alias SA, Alves A, Anderson CL, Antonelli A, Arnold AE, Bahnmann B, Bahram M, Bengtsson-Palme J, Berlin A, Branco S, Chomnunti P, Dissanayake A, Drenkhan R, Friberg H, Frøslev TG, Halwachs B, Hartmann M, Henricot B, Jayawardena R, Jumpponen A, Kauserud H, Koskela S, Kulik T, Liimatainen K, Lindahl B, Lindner D, Liu J-K, Maharachchikumbura S, Manamgoda D, Martinsson S, Neves MA, Niskanen T, Nylinder S, Pereira OL, Pinho DB, Porter TM, Queloz V, Riit T, Sanchez-García M, de Sousa F, Stefaczyk E, Tadych M, Takamatsu S, Tian Q, Udayanga D, Unterseher M, Wang Z, Wikee S, Yan J, Larsson E, Larsson K-H, Kõljalg U, Abarenkov K: Improving ITS sequence data for identification of plant pathogenic fungi. Fungal Diversity, Volume 67, Issue 1 (2014), 11–19. doi: 10.1007/s13225-014-0291-8 [Paper link]

In a recent paper in Nature, a completely new antibiotic, teixobactin, is described (1). The really cool thing about this antibiotic is that it was discovered in a screen of uncultured bacteria, grown using new technology that enables controlled growth of single colonies in situ. I really like this idea, and I think the prospect of a novel antibiotic using a previously unexploited mechanism is super-promising, particularly in the light of alarming resistance development in clinically important pathogens (2,3). What really annoys me about the paper is the claim (already in the abstract) that since “we did not obtain any mutants of Staphylococcus aureus or Mycobacterium tuberculosis resistant to teixobactin (…) the properties of this compound suggest a path towards developing antibiotics that are likely to avoid development of resistance.” To me, this sounds pretty much like a bogus statement, in essence telling me that we apparently have not learned anything from 70 years of antibiotic usage and resistance development. After working with antibiotic resistance for a couple of years, particularly from the environmental perspective, I have a very disturbing feeling that there are already resistance mechanisms against teixobactin waiting out in the wild (4,5). Pretending that a lack of mutation-associated resistance development means that there could not be resistance development did not help vancomycin (6,7), and we now see VRE (vancomycin-resistant Enterococcus) showing up as a major problem in clinics. The “avoid development of resistance” claim is downright irresponsible, and the cynic in me cannot help thinking that NovoBiotic Pharmaceuticals (the affiliation of almost half of the authors) has a monetary finger in this jar. In the end, time will tell how “resistance-resilient” teixobactin is and how well we can handle the gift of a novel antibiotic.

  1. Ling LL, Schneider T, Peoples AJ, Spoering AL, Engels I, Conlon BP, Mueller A, Schäberle TF, Hughes DE, Epstein S, Jones M, Lazarides L, Steadman VA, Cohen DR, Felix CR, Fetterman KA, Millett WP, Nitti AG, Zullo AM, Chen C, Lewis K: A new antibiotic kills pathogens without detectable resistance. Nature (2015). doi:10.1038/nature14098
  2. Finley RL, Collignon P, Larsson DGJ, McEwen SA, Li X-Z, Gaze WH, Reid-Smith R, Timinouni M, Graham DW, Topp E: The scourge of antibiotic resistance: the important role of the environment. Clin Infect Dis, 57: 704–710 (2013).
  3. French GL: The continuing crisis in antibiotic resistance. Int J Antimicrob Agents, 36 Suppl 3:S3–7 (2010).
  4. Bengtsson-Palme J, Boulund F, Fick J, Kristiansson E, Larsson DGJ: Shotgun metagenomics reveals a wide array of antibiotic resistance genes and mobile elements in a polluted lake in India. Frontiers in Microbiology, 5: 648 (2014).
  5. Larsson DGJ: Antibiotics in the environment. Ups J Med Sci, 119: 108–112 (2014).
  6. Wright GD: Mechanisms of resistance to antibiotics. Curr Opin Chem Biol, 7:563–569 (2003).
  7. Werner G, Strommenger B, Witte W: Acquired vancomycin resistance in clinically relevant pathogens. Future Microbiol, 3: 547–562 (2008).

We’re approaching Christmas, and this year I will try to spend lots of time with my family and less time at the computer. We’ll see how that goes, but all in all it means that I will most likely not respond promptly to e-mails until after New Year’s, maybe not until January 8 or 9. If, for example, you have asked a support question and have not received a response before January 12, 2015, then please feel free to re-send your e-mail, as by then I should at least have replied that I cannot solve your issue quickly.

A further note for the future is that I will be on parental leave with my lovely nine-month-old during the entire spring, so answering e-mails will not be my highest priority, and might be neglected entirely in periods. I apologize for all kinds of inconveniences that this might cause, especially for Metaxa, ITSx and Megraft users.

Merry Christmas and a Happy New Year!

A minor bug in the “its1.full_and_partial.fasta” file has been fixed in a minor update to ITSx (1.0.11) released today. The bug occasionally caused newline characters at the end of a sequence to be skipped and the next entry to begin on the same row. The bug only manifested itself when ITSx was used with the --partial option and only in the above-mentioned FASTA file. If you have been affected by the bug, you should have noticed, as the resulting FASTA file would be considered corrupted by most bioinformatics software. The updated version of ITSx can be downloaded here.
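
If you want to check an existing output file for this symptom, a minimal sketch in Python could look like the one below. It is not part of ITSx, and the function name find_merged_fasta_rows is only for illustration. It assumes that the bug shows up exactly as described above: a sequence row that runs straight into the next FASTA header, so that a ">" character appears on a row that does not itself start with ">".

```python
def find_merged_fasta_rows(lines):
    """Return the 1-based numbers of rows where a sequence runs into a header.

    Assumption: an affected row does not start with '>' (it is a sequence
    row) but contains a '>' further in, where the next entry was glued on.
    """
    suspicious = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        # Header rows start with '>' and are left alone; only sequence
        # rows that contain a '>' somewhere are reported.
        if not text.startswith(">") and ">" in text:
            suspicious.append(number)
    return suspicious


# Small made-up example: the second row has lost its newline.
example = [">seq1\n", "ACGTACGT>seq2\n", "TTGA\n"]
print(find_merged_fasta_rows(example))  # prints [2]
```

To run it on a real file, open the file and pass the file object to the function, for example `with open("its1.full_and_partial.fasta") as handle: print(find_merged_fasta_rows(handle))`. An empty list means that no row matched the pattern assumed here.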

If you’re looking for super-interesting jobs within bioinformatics, you don’t need to look any further. Instead, you should apply for a position at 1928 Diagnostics here in Gothenburg and join them in the fight against antibiotic-resistant bacteria. The position is in the development team and the application deadline is December 19. All the details can be found here.

Our paper describing the bacterial community of a polluted lake in India has now been typeset and appears in its final form in Frontiers in Microbiology. If I may say so, I think that the paper turned out to be very good-looking and it is indeed nice to finally see it in print. The paper describes an unprecedented diversity and abundance of antibiotic resistance genes and genes enabling transfer of DNA between bacteria. We also describe a range of potential novel plasmids from the lake. Finally, the paper briefly describes a new approach to targeted assembly of metagenomic data, TriMetAss, which can be downloaded here.

Reference:
Bengtsson-Palme J, Boulund F, Fick J, Kristiansson E, Larsson DGJ: Shotgun metagenomics reveals a wide array of antibiotic resistance genes and mobile elements in a polluted lake in India. Frontiers in Microbiology, 5, 648 (2014). doi: 10.3389/fmicb.2014.00648

One of the highlights of the Swedish Bioinformatics Workshop 2014 was of course the dinner entertainment, a song specially crafted for the event. It has now, fortunately, been put online. For anyone who might not catch all the words, here are the complete lyrics for the song (which is based on the song “Java Jive” in the Manhattan Transfer arrangement):

The Bioinformatics ABC

Grab your coffee
Grab your tea
Put down your spoon now and listen to me
For the bioinformatics ABC
Wake up, wake up, wake up, wake up, wake up
(Boy)

A for ABYSS
B for BLAST
And C for Clustal, though it’s not that fast
Alternatives are Muscle and MAFFT
ABYSS and BLAST and Clustal, Muscle, MAFFT
(Yeah)

D count reads with DESeq or E for EdgeR
And F for FastQC and G for GLIMMER
H for HMMER using Markov Chains
(Explain)
Hidden hidden Markov model

I for Inchworm;
Jellyfish
Add Chrysalis and Butterfly and wish
Assemble fast with a sound that goes swish
Contig, contig, contig, contig, transcript
(Girl)

KBASE
Lasergene
And the ton of tools for metagenomics
MEGAN, Megraft,
MetaPhlan, MG-RAST, Meta-GeneMark
And that’s just mentioning a few of them
(Talk it boy)

N for Newbler, old-school it is
If you’re still using 454 it’s a bliss
O for Oases, P for PyroNoise
45-45-45-454-454

Q is for Quake for that great quality
And R is for all those neat statistics
S for the Spades assembler, oh yeah

T for TopHat
U for Uclust
V for Velvet

There is Wham to align
XMatchView to review
And YASS to pursue
(But do you)
Know any tools beginning with Z?
What?
Yeah, Zorro, Zorro, Zorro
Oh yeah