Microbiology, Metagenomics and Bioinformatics

Johan Bengtsson-Palme, University of Gothenburg

If you’re looking for a PhD position in bioinformatics, working with antibiotic resistance, there’s an opening in Erik Kristiansson’s (best bioinformatician in Gothenburg? I think so) group. To apply, you need a master’s-level degree in bioinformatics, mathematical statistics, mathematics, computer science, physics, molecular biology or an equivalent topic, obtained by June 2014 at the latest. If you’re a master’s student and want to join us, this is your chance! You can read more and apply for the position here.

Metaxa2 is here!


The new version of Metaxa, Metaxa2, which I first started talking about more than 1.5 years ago, has finally been determined to be stable enough that we can officially release it! The release comes around the same time as we submitted a paper describing the changes in it, but I will briefly go through the changes here:

  • Metaxa2 now handles extraction and classification of LSU rRNA sequences in addition to SSU rRNA
  • The classification engine has been completely redesigned, and now enables accurate taxonomic classifications down to the genus – or in some cases – species level
  • The classification database has been updated, and is now based on the SILVA 111 release
  • The Metaxa2 Taxonomic Traversal Tool – metaxa2_ttt – has been added to the package, to ease the counting of rRNA sequences in different organism groups (at various taxonomic levels)
  • Metaxa2 adds support for paired-end libraries
  • It is now possible to directly input sequences in FASTQ format to Metaxa2
  • The support for libraries with short read lengths (~100 bp) has been vastly improved (and short reads are now what the default settings assume)
  • Metaxa2 can do quality pre-filtering of reads in FASTQ-format
  • Metaxa2 adds support for the modern BLAST+ package (although the old blastall version is still default)
  • Compatibility with the HMMER 3.1 beta
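To give a feel for what counting rRNA sequences "at various taxonomic levels" means, here is a minimal Python sketch. It is not how metaxa2_ttt is implemented and it does not read Metaxa2's output files; the semicolon-separated lineage strings, the example taxa and the function name `count_by_level` are assumptions made purely for illustration.

```python
from collections import Counter


def count_by_level(lineages):
    """Count classified sequences at every taxonomic depth.

    Each lineage is assumed (for illustration only) to be a
    semicolon-separated string, most general rank first.
    Returns {depth: Counter({lineage_prefix: count})}, with depth 1
    being the top level.
    """
    counts = {}
    for lineage in lineages:
        ranks = [rank.strip() for rank in lineage.split(";") if rank.strip()]
        # A sequence classified to depth N also counts at depths 1..N-1.
        for depth in range(1, len(ranks) + 1):
            prefix = ";".join(ranks[:depth])
            counts.setdefault(depth, Counter())[prefix] += 1
    return counts


# Hypothetical classifications, one per rRNA read.
example = [
    "Bacteria;Proteobacteria;Gammaproteobacteria",
    "Bacteria;Proteobacteria;Alphaproteobacteria",
    "Bacteria;Firmicutes",
    "Archaea;Euryarchaeota",
]

levels = count_by_level(example)
for depth in sorted(levels):
    for taxon, n in sorted(levels[depth].items()):
        print(depth, taxon, n)
```

Note that a read classified only to a shallow level (such as "Bacteria;Firmicutes" above) contributes to the counts at depths 1 and 2 but not at depth 3, so totals can shrink at deeper levels.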

Metaxa2 brings together a large set of features that we have been gradually incorporating since 2011, many of which have been dependent on each other. Most of the new features and changes are thoroughly explained in the manual. While we hope Metaxa2 is bug free, there will likely be bugs caused by usage scenarios we have not envisioned. I therefore encourage anyone who comes across unexpected behavior to send me an e-mail. In particular, I would like to know how the software performs with HMMER 3.1 and BLAST+, where testing has been limited compared to older parts of the code.

We hope that you will find Metaxa2 useful, and that it will bring taxonomic assessment of metagenomes another step forward! Metaxa2 can be downloaded here.

I have fixed a long-standing bug in the Bloutminer script, which has thereby been pushed to version 0.9.6. The new version fixes an issue when using the -o blast option without the -n option. The new version can be downloaded here.

I read an interesting note in Nature today regarding the willingness to review papers. The author of the note (Dan Graur) claims that scientists who publish many papers contribute less to peer review, and proposes a system in which “journals should ask senior authors to provide evidence of their contribution to peer review as a condition for considering their manuscripts.” I think this is a very interesting thought, but I see other problems coming with it. Let us, for example, assume that a senior author is neglecting peer review not to be evil, but simply due to an already monumental workload. If we force peer review on such a person, what kind of reviews do we expect to get back? Will this person be able to carry out a proper, high-quality peer review assignment? I doubt it.

On the other hand, I don’t have a good alternative either. If no one wants to do the peer reviewing, the system will inevitably break down. However, I think it would be better to encourage peer review with positive incentives rather than pressure. Maybe faster handling times and higher priority for papers whose authors have done their share of peer reviewing in the last two years? Maybe cheaper publishing costs? In any case, I welcome that the subject is brought up for debate, since it is immensely important for the way we perform science today. Thanks Dan!

An ITSx user yesterday made me aware of a gap in the documentation (thanks Suzanne!) regarding the use of ITSx in combination with the HMMER 3.1 beta. I have not been entirely clear on why you might get the “Error: bad format, binary auxfiles, (…) binary auxfiles are in an outdated HMMER format (3/b); please hmmpress your HMM file again” error message when running ITSx with HMMER 3.1 installed. You might think that following the instructions for Metaxa would do the trick. As you will notice, however, it will not. Instead you will be presented with the following error message: “Error: Failed to open binary auxfiles”. This is because, while Metaxa 1.1.2 will re-create the HMM files if needed, ITSx does not. Instead, ITSx has the option "--reset T", which can be added to the command line to recreate the HMM files for the currently installed HMMER version (regardless of which 3.x version).

Thus, the solution for the “bad format, binary auxfiles” error is to simply add "--reset T" (without quotes) to the ITSx command line and run the software again. You only need to do this once, unless you update HMMER and/or get the same error message again for some other reason. The Metaxa-post has been updated to clarify this as well.
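If you call ITSx from a script, the fix above can be automated. The following is a minimal Python sketch, not part of ITSx itself: it runs ITSx once, looks for the "binary auxfiles" wording from the two error messages quoted above, and reruns with "--reset T" if it is found. The function names, the file names in the usage comment, and the choice to scan both stdout and stderr (I have not checked which stream the message ends up on) are assumptions.

```python
import subprocess

# Substring shared by both error messages quoted in the post.
STALE_HMM_MARKERS = ("binary auxfiles",)


def build_itsx_command(input_fasta, output_prefix, reset=False):
    """Build the ITSx argument list; add '--reset T' when requested."""
    cmd = ["ITSx", "-i", input_fasta, "-o", output_prefix]
    if reset:
        cmd += ["--reset", "T"]
    return cmd


def needs_reset(program_output):
    """Return True if the output mentions the stale HMM auxfile problem."""
    return any(marker in program_output for marker in STALE_HMM_MARKERS)


def run_itsx(input_fasta, output_prefix, runner=subprocess.run):
    """Run ITSx, retrying once with '--reset T' if the HMM files are stale.

    'runner' defaults to subprocess.run and can be swapped out for testing.
    """
    first = runner(build_itsx_command(input_fasta, output_prefix),
                   capture_output=True, text=True)
    # Assumption: the message may appear on either stream, so check both.
    if needs_reset((first.stdout or "") + (first.stderr or "")):
        return runner(build_itsx_command(input_fasta, output_prefix, reset=True),
                      capture_output=True, text=True)
    return first


# Usage (hypothetical file names; requires ITSx on the PATH):
# result = run_itsx("reads.fasta", "itsx_out")
```

Since the reset only has to happen once per HMMER installation, the retry should trigger on the first run only; later runs go straight through.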

A new year has begun, and it brings with it a few updates on the website. I have added a summary of the year 2013 from my perspective, and (as you may recognize) updated my picture on the front page. Briefly, this year will bring lots of exciting stuff. Personally, I am quite excited to finally be able to share the new version of Metaxa, Metaxa2, which will be released to the public late this winter (or early spring). Additionally, I look forward to wrapping up some manuscripts on metagenomics and antibiotic resistance, which I have been working on for more than 2.5 years now. Also, we look forward to some super-interesting technology developments in DNA sequencing, with PacBio finally finding proper usage scenarios, nanopore sequencing around the corner, and super-multiplexing on the Illumina instruments. We’re in for a treat with DNA sequencing in 2014!

It seems like our paper on the recently launched database on resistance genes against antibacterial biocides and metals (BacMet) has gone online as an advance access paper in Nucleic Acids Research today. Chandan Pal, the first author of the paper and one of my close colleagues as well as my roommate at work, has done a tremendous job taking the database from a list of genes and references to a full-fledged browsable and searchable database with a really nice interface. I have contributed throughout the process, and wrote the lion’s share of the code for the BacMet-Scan tool that can be downloaded along with the database files.

BacMet is a curated source of bacterial resistance genes against antibacterial biocides and metals. All gene entries included have at least one experimentally confirmed resistance gene with references in scientific literature. However, we have also made a homology-based prediction of genes that are likely to share the same resistance function (the BacMet predicted dataset). We believe that the BacMet database will make it possible to better understand co- and cross-resistance of biocides and metals to antibiotics within bacterial genomes and in complex microbial communities from different environments.

The database can be easily accessed here: http://bacmet.biomedicine.gu.se. If you use the database in scientific work, please cite the following paper, which recently appeared in Nucleic Acids Research:

Pal C, Bengtsson-Palme J, Rensing C, Kristiansson E, Larsson DGJ: BacMet: Antibacterial Biocide and Metal Resistance Genes Database. Nucleic Acids Research. Database issue, advance access. doi: 10.1093/nar/gkt1252 [Paper link]

Over the weekend, I’ve been able to finish off some stuff that has been stuck on my todo-list. Among these was to finish up the pieces of the ITSx update we put in the hands of our users today. This update brings three requested features, and a fix for an extremely rarely occurring bug:

  1. If the “--not_found T” option is used, ITSx now outputs both a list and a FASTA file of entries in the input file that did not have any ITS regions detected in them. This was a user-requested feature, and a very nice and easily implemented one.
  2. As mentioned in a previous blog post, ITSx has up until now not been able to preserve the sequence headers of the input file. In hindsight, such an option would have been obvious to include, and as of version 1.0.4 ITSx comes with a “--preserve” option that allows headers to be carried over to all the output files.
  3. ITSx is now better at handling certain chimeric sequences.

In addition, there was a minor bug that very rarely (I have only seen one such example) could cause the ITS region to be reported with a negative length. This issue has now been fixed.

This update brings ITSx to version 1.0.4, and it can be downloaded here.
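For those who drive ITSx from scripts, the two new options can be added to a command line as in this small Python sketch. It only assembles and prints the command; it does not run ITSx. The helper names and file names are made up for illustration, and writing the second option as "--preserve T" (with the same T/F value style as "--not_found T") is an assumption based on how the other options are written.

```python
def itsx_flags(**options):
    """Turn keyword options into ITSx-style '--name T' / '--name F' arguments.

    Booleans become 'T' or 'F'; other values are passed through as strings.
    """
    args = []
    for name, value in options.items():
        if isinstance(value, bool):
            value = "T" if value else "F"
        args += [f"--{name}", str(value)]
    return args


def build_command(input_fasta, output_prefix, **options):
    """Assemble a full ITSx argument list (hypothetical helper)."""
    return ["ITSx", "-i", input_fasta, "-o", output_prefix] + itsx_flags(**options)


# Hypothetical input and output names.
command = build_command("reads.fasta", "itsx_out", not_found=True, preserve=True)
print(" ".join(command))
```

The resulting list can be handed to `subprocess.run(command)` on a machine where ITSx 1.0.4 or later is installed.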

Those of you attending the Swedish Bioinformatics Workshop, this year given in Skövde, will have a chance to see me talk about how sequencing depth influences the picture we get of the environmental resistance gene diversity. I think the topic is very urgent and interesting, and I will likely come back to it in a more thorough blog post later. There are also a few other very interesting talks, for example about metagenomic gene quantification, and en masse sequencing of E. coli and H. pylori isolates. I think all attendees are in for a treat! See you there!

I am happy to inform you that our paper on ITSx is now out online in Methods in Ecology and Evolution, issue 4.10. Meanwhile, I am slowly getting my stuff together on an update that will bring some minor requested features. With the publication, the proper citation for the ITSx paper is:

Bengtsson-Palme, J., Ryberg, M., Hartmann, M., Branco, S., Wang, Z., Godhe, A., De Wit, P., Sánchez-García, M., Ebersberger, I., de Sousa, F., Amend, A. S., Jumpponen, A., Unterseher, M., Kristiansson, E., Abarenkov, K., Bertrand, Y. J. K., Sanli, K., Eriksson, K. M., Vik, U., Veldre, V., Nilsson, R. H. (2013), Improved software detection and extraction of ITS1 and ITS2 from ribosomal ITS sequences of fungi and other eukaryotes for analysis of environmental sequencing data. Methods in Ecology and Evolution, 4: 914–919. doi: 10.1111/2041-210X.12073