Microbiology, Metagenomics and Bioinformatics

Johan Bengtsson-Palme, University of Gothenburg

A minor bug affecting the “its1.full_and_partial.fasta” file has been fixed in a minor update to ITSx (1.0.11) released today. The bug occasionally caused the newline character at the end of a sequence to be skipped, so that the next entry began on the same row. The bug only manifested itself when ITSx was used with the --partial option, and only in the above-mentioned FASTA file. If you have been affected by the bug, you should have noticed, as most bioinformatics software would consider the resulting FASTA file corrupted. The updated version of ITSx can be downloaded here.
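If you are unsure whether an old output file is affected, a quick check could look something like the sketch below. It is not part of ITSx; the function name `find_fused_records` is my own, and it assumes that the symptom is a ">" character appearing in the middle of a line that does not itself start with ">" (that is, a header fused onto the end of a sequence row).

```python
import sys
from typing import Iterable, List, Tuple


def find_fused_records(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, text) for rows where a header seems fused to a sequence.

    Assumption: a healthy FASTA file only has '>' as the first character of a
    header line, so a '>' inside a non-header line is treated as suspicious.
    """
    suspicious = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if text and not text.startswith(">") and ">" in text:
            suspicious.append((number, text))
    return suspicious


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_fasta.py path/to/file.fasta
    with open(sys.argv[1]) as handle:
        for number, text in find_fused_records(handle):
            print(f"line {number}: {text[:60]}")
```

If the script prints nothing, no fused rows were found under that assumption. Re-running ITSx 1.0.11 on the original input is still the safer fix if anything is reported.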

If you’re looking for super-interesting jobs within bioinformatics, you don’t need to look any further. Instead, you should apply for a position at 1928 Diagnostics here in Gothenburg and join them in the fight against antibiotic-resistant bacteria. The position is in the development team and the application deadline is December 19. All the details can be found here.

Our paper describing the bacterial community of a polluted lake in India has now been typeset and appears in its final form in Frontiers in Microbiology. If I may say so, I think that the paper turned out to be very good-looking, and it is indeed nice to finally see it in print. The paper describes an unprecedented diversity and abundance of antibiotic resistance genes and genes enabling transfer of DNA between bacteria. We also describe a range of potential novel plasmids from the lake. Finally, the paper briefly describes a new approach to targeted assembly of metagenomic data, TriMetAss, which can be downloaded here.

Bengtsson-Palme J, Boulund F, Fick J, Kristiansson E, Larsson DGJ: Shotgun metagenomics reveals a wide array of antibiotic resistance genes and mobile elements in a polluted lake in India. Frontiers in Microbiology, 5, 648 (2014). doi: 10.3389/fmicb.2014.00648

One of the highlights of the Swedish Bioinformatics Workshop 2014 was of course the dinner entertainment, a song specially crafted for the event. It has now, fortunately, been put online. For anyone who might not catch all the words, here are the complete lyrics for the song (which is based on the song “Java Jive” in the Manhattan Transfer arrangement):

The Bioinformatics ABC

Grab your coffee
Grab your tea
Put down your spoon now and listen to me
For the bioinformatics ABC
Wake up, wake up, wake up, wake up, wake up

And C for Clustal, though it’s not that fast
Alternatives are Muscle and MAFFT
ABYSS and BLAST and Clustal, Muscle, MAFFT

D count reads with DESeq or E for EdgeR
And F for FastQC and G for GLIMMER
H for HMMER using Markov Chains
Hidden hidden Markov model

I for Inchworm;
Add Chrysalis and Butterfly and wish
Assemble fast with a sound that goes swish
Contig, contig, contig, contig, transcript

And the ton of tools for metagenomics
MEGAN, Megraft,
MetaPhlan, MG-RAST, Meta-GeneMark
And that’s just mentioning a few of them
(Talk it boy)

N for Newbler, old-school it is
If you’re still using 454 it’s a bliss
O for Oases, P for PyroNoise

Q is for Quake for that great quality
And R is for all those neat statistics
S for the Spades assembler, oh yeah

T for TopHat
U for Uclust
V for Velvet

There is Wham to align
XMatchView to review
And YASS to pursue
(But do you)
Know any tools beginning with Z?
Yeah, Zorro, Zorro, Zorro
Oh yeah

With the publication of my latest paper last week (1), I would also like to highlight some of the software underpinning the findings. Extremely common resistance genes can be present in multiple contexts and variants, which causes assemblers such as Velvet (2) to perform sub-optimally. To get around this problem, we have written a software tool that utilizes Vmatch (3) and Trinity (4) to iteratively construct contigs from reads associated with resistance genes. This could of course be used in many other situations as well, whenever you want to specifically assemble a certain portion of a metagenome but suspect that the portion might be found in multiple contexts.

TriMetAss is a Perl program, employing Vmatch and Trinity to construct multi-context contigs. TriMetAss uses extracted reads associated with, e.g., resistance genes as seeds for a Vmatch search against the complete set of read pairs, extracting reads that match any of the seed reads over at least 49 bp (by default). These reads are then assembled using Trinity. The resulting contigs are then used as seeds for another Vmatch search against the complete set of reads, as above. All matches (including the previously matching read pairs) are then again used for a Trinity assembly. This iterative process is repeated until a stopping criterion is met, e.g. when the total number of assembled nucleotides starts to drop rather than increase. The software can be downloaded here.
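To make the iteration a bit more concrete, here is a toy sketch of the recruit-assemble-repeat loop in Python. It is not the TriMetAss code (which is Perl) and it does not call Vmatch or Trinity. As loudly labeled stand-ins, `recruit` matches reads by shared k-mers and `toy_assemble` is a naive greedy overlap merger; all function names are hypothetical. The loop stops when the total assembled length no longer grows, which is a simplification of the stopping criterion described above.

```python
from typing import Callable, Iterable, List, Optional, Set


def kmers(seqs: Iterable[str], k: int) -> Set[str]:
    """All substrings of length k found in the given sequences."""
    return {s[i:i + k] for s in seqs for i in range(len(s) - k + 1)}


def recruit(seeds: Iterable[str], reads: Iterable[str], k: int) -> List[str]:
    """Stand-in for the Vmatch step: keep reads sharing a k-mer with any seed."""
    seed_kmers = kmers(seeds, k)
    return [r for r in reads
            if any(r[i:i + k] in seed_kmers for i in range(len(r) - k + 1))]


def merge_once(contigs: List[str], min_overlap: int) -> Optional[List[str]]:
    """Merge the first pair with a suffix/prefix overlap; None if no pair merges."""
    for i, a in enumerate(contigs):
        for j, b in enumerate(contigs):
            if i == j:
                continue
            rest = [c for idx, c in enumerate(contigs) if idx not in (i, j)]
            if b in a:  # b is fully contained in a
                return rest + [a]
            for n in range(min(len(a), len(b)) - 1, min_overlap - 1, -1):
                if a.endswith(b[:n]):
                    return rest + [a + b[n:]]
    return None


def toy_assemble(reads: Iterable[str], min_overlap: int = 4) -> List[str]:
    """Stand-in for the Trinity step: greedy overlap merging (toy only)."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        merged = merge_once(contigs, min_overlap)
        if merged is None:
            return contigs
        contigs = merged


def iterative_targeted_assembly(
    seeds: Iterable[str],
    reads: List[str],
    assemble: Callable[[List[str]], List[str]] = toy_assemble,
    k: int = 49,
    max_iter: int = 10,
) -> List[str]:
    """Recruit reads matching the seeds, assemble, reseed with contigs, repeat."""
    best_contigs: List[str] = []
    best_total = 0
    current_seeds = list(seeds)
    recruited: Set[str] = set()
    for _ in range(max_iter):
        recruited.update(recruit(current_seeds, reads, k))  # keep earlier matches
        contigs = assemble(sorted(recruited))
        total = sum(len(c) for c in contigs)
        if total <= best_total:  # simplified stop: assembly stopped growing
            break
        best_contigs, best_total = contigs, total
        current_seeds = contigs
    return best_contigs


if __name__ == "__main__":
    reads = ["AAACCCGG", "CCGGGTTT", "GTTTACGT", "TGTGTGTG"]
    print(iterative_targeted_assembly(["AAACCC"], reads, k=4))
```

In the demo, the seed only matches the first read directly; the second and third reads are pulled in over later rounds as the contig grows, while the unrelated read is never recruited. The tiny `k=4` is only for the toy data; the 49 bp default mirrors the match length mentioned above.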


  1. Bengtsson-Palme J, Boulund F, Fick J, Kristiansson E, Larsson DGJ: Shotgun metagenomics reveals a wide array of antibiotic resistance genes and mobile elements in a polluted lake in India. Frontiers in Microbiology, 5, 648 (2014). doi: 10.3389/fmicb.2014.00648
  2. Zerbino DR, Birney E: Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18, 821–829 (2008). doi:10.1101/gr.074492.107
  3. Kurtz S: The Vmatch large scale sequence analysis software (2010). http://vmatch.de/
  4. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, et al.: Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol 29, 644–652 (2011). doi:10.1038/nbt.1883

The first work in which I have employed metagenomics to investigate antibiotic resistance has been accepted in Frontiers in Microbiology, and is (at the time of writing) available as a provisional PDF. In the paper (1), which is co-authored by Fredrik Boulund, Jerker Fick, Erik Kristiansson and Joakim Larsson, we have used shotgun metagenomic sequencing of an Indian lake polluted by dumping of waste from pharmaceutical production. We used this data to describe the diversity of antibiotic resistance genes and the genetic context of those, to try to predict their genetic transferability. We found resistance genes against essentially every major class of antibiotics, as well as large abundances of genes responsible for mobilization of genetic material. Resistance genes were estimated to be 7000 times more abundant in the polluted lake than in a Swedish lake included for comparison, where only eight resistance genes were found. The abundances of resistance genes have previously only been matched by river sediment subject to pollution from pharmaceutical production (2). In addition, we describe twenty-six known and twenty-one putative novel plasmids from the Indian lake metagenome, indicating that there is a large potential for horizontal gene transfer through conjugation. Based on the wide range and high abundance of known resistance factors detected, we believe that it is plausible that novel resistance genes are also present in the lake. We conclude that environments polluted with waste from antibiotic manufacturing could be important reservoirs for mobile antibiotic resistance genes. This work further highlights previous findings that pharmaceutical production settings could provide sufficient selection pressure from antibiotics (3) to drive the development of multi-resistant bacteria (4,5), resistance which may ultimately end up in pathogenic species (6,7). The paper can be read in its entirety here.


  1. Bengtsson-Palme J, Boulund F, Fick J, Kristiansson E, Larsson DGJ: Shotgun metagenomics reveals a wide array of antibiotic resistance genes and mobile elements in a polluted lake in India. Frontiers in Microbiology, Volume 5, 648 (2014). doi:10.3389/fmicb.2014.00648
  2. Kristiansson E, Fick J, Janzon A, Grabic R, Rutgersson C, Weijdegård B, Söderström H, Larsson DGJ: Pyrosequencing of antibiotic-contaminated river sediments reveals high levels of resistance and gene transfer elements. PLoS ONE, Volume 6, e17038 (2011). doi:10.1371/journal.pone.0017038
  3. Larsson DGJ, de Pedro C, Paxeus N: Effluent from drug manufactures contains extremely high levels of pharmaceuticals. J Hazard Mater, Volume 148, 751–755 (2007). doi:10.1016/j.jhazmat.2007.07.008
  4. Marathe NP, Regina VR, Walujkar SA, Charan SS, Moore ERB, Larsson DGJ, Shouche YS: A Treatment Plant Receiving Waste Water from Multiple Bulk Drug Manufacturers Is a Reservoir for Highly Multi-Drug Resistant Integron-Bearing Bacteria. PLoS ONE, Volume 8, e77310 (2013). doi:10.1371/journal.pone.0077310
  5. Johnning A, Moore ERB, Svensson-Stadler L, Shouche YS, Larsson DGJ, Kristiansson E: Acquired genetic mechanisms of a multiresistant bacterium isolated from a treatment plant receiving wastewater from antibiotic production. Appl Environ Microbiol, Volume 79, 7256–7263 (2013). doi:10.1128/AEM.02141-13
  6. Pruden A, Larsson DGJ, Amézquita A, Collignon P, Brandt KK, Graham DW, Lazorchak JM, Suzuki S, Silley P, Snape JR, et al.: Management options for reducing the release of antibiotics and antibiotic resistance genes to the environment. Environ Health Perspect, Volume 121, 878–885 (2013). doi:10.1289/ehp.1206446
  7. Finley RL, Collignon P, Larsson DGJ, McEwen SA, Li X-Z, Gaze WH, Reid-Smith R, Timinouni M, Graham DW, Topp E: The scourge of antibiotic resistance: the important role of the environment. Clin Infect Dis, Volume 57, 704–710 (2013). doi:10.1093/cid/cit355

Metaxa2 update

An update to Metaxa2 that has long remained in internal testing has been deemed bug-free (as far as we can tell) and has been uploaded to the Metaxa2 web site. The update brings a slightly improved classifier, and is the first release that we declare fully stable, although we have found no problems with the previously available version (release candidate 3). This also means that we jump directly from version 2.0, release candidate 3 to version 2.0.1 without passing a final 2.0 release. The update is available here.

I don’t have much time to attend to the web site these days, and there are probably other things I should/could do right now, but it’s Saturday night and my baby is sleeping so… I found this nice little story covering our little family in the latest Mistra newsletter (out a couple of weeks ago). It is a kind of cute take on the “synthesis” of two Mistra-funded programs. I guess our daughter will grow up with the pains of having two research parents in different fields…

Story in English
Story in Swedish

After a long delay in testing, ITSx version 1.0.10 has been made public. The new version patches a bug that caused the 3′ anchor not to be properly written to file when using the “--anchor hmm” option. If a number was used for the “--anchor” option, this bug did not apply. Thus, if you have not been using the “--anchor” option together with “hmm”, you have not been affected in any way by this bug. Nevertheless, I encourage updating in case you would use the “--anchor hmm” option in the future. The update can be downloaded here. Happy barcoding!

I would like to sincerely apologize for having been terrible at responding to support issues pertaining to ITSx, Metaxa, Atosh etc. lately. I am currently on 50% parental leave and at the same time I am wrapping up three first-author papers, organizing a workshop and preparing a talk. Thus, support issues have been lagging a bit behind over the last few weeks so that I could cope with everything else. I have been ticking off most (all?) of my support questions the last couple of days, but if there are any remaining issues that I have failed to reply to, please re-send them to me!

I will try to improve response times, but it is hard when I am working less than usual (also, note that I (strangely) don’t get paid for supporting software, so I have to do this in my “spare time”). My aim is to respond within a few days, so if I have not done so, please resend your e-mail with a friendly reminder that you are waiting for my response. Reminding me will very likely move your question up the priority pile.

So, my advice to dads-to-be is: Do take parental leave. Do take a lot of it. Share responsibilities with your partner. Because what you get back is awesome. (And also you get a good reason not to answer support questions in time.) But finally, don’t plan to wrap up the last couple of years’ worth of work and arrange a conference at the same time as you take parental leave. That will only make you feel inadequate on all fronts.

Keep the spirit high!