Tag: DNA sequencing

Pfam team aims at cleaning erroneous protein families

The guys at Pfam recently introduced a new database, called AntiFam, which will provide HMM profiles for some groups of sequences that seemingly form larger protein families, although they are not actually real proteins. For example, rRNA sequences can contain putative ORFs that seem to be conserved over broad lineages; the only problem is that they are not translated into proteins in real life, as they are part of an rRNA [1].
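To illustrate how such spurious ORFs can turn up, here is a minimal sketch of a naive ORF scan. It is not how AntiFam works (AntiFam provides HMM profiles); the function name `find_orfs`, the `min_codons` cutoff, and the restriction to ATG starts on the forward strand are all assumptions made for the example. Any sufficiently long nucleotide sequence, an rRNA gene included, can pass a scan like this by chance.

```python
# Naive ORF scan: hypothetical helper, forward strand only.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=30):
    """Return (start, end, frame) tuples for ATG-to-stop ORFs.

    start/end are 0-based, end-exclusive, and include the stop codon.
    min_codons counts the codons from ATG up to (not including) the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ATG after this stop
    return orfs
```

A gene caller that accepts every hit from a scan like this will happily "predict" proteins inside rRNA genes, which is exactly the kind of entry a negative profile database is meant to catch.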

With this initiative the Xfam team wants to “reduce the number of spurious proteins that make their way into the protein sequence databases.” I have run into this problem myself on some occasions with suspicious sequences in GenBank, and I highly encourage this development towards consistency and correctness in sequence databases. It is of extreme importance that databases remain reliable if we want bioinformatics to tell us anything about organismal or community functions. The AntiFam database is a first step towards such a cleanup of the databases, and as such I would like to applaud Pfam for taking action in this direction.

To my knowledge, GenBank is doing what it can with e.g. barcoding data (SSU, LSU, ITS sequences), but for bioinformatics and metagenomics (and even genomics) to remain viable, these initiatives need to come quickly, and automated (but still very sensitive) tools for this need to get our focus immediately. For example, Metaxa [2] could be used as a tool to clean up SSU sequences of misclassified origin. More such tools are needed, and a lot of work remains to be done in the area of keeping databases trustworthy in the age of large-scale sequencing.


  1. Tripp, H. J., Hewson, I., Boyarsky, S., Stuart, J. M., & Zehr, J. P. (2011). Misannotations of rRNA can now generate 90% false positive protein matches in metatranscriptomic studies. Nucleic Acids Research, 39(20), 8792–8802. doi:10.1093/nar/gkr576
  2. Bengtsson, J., Eriksson, K. M., Hartmann, M., Wang, Z., Shenoy, B. D., Grelet, G.-A., Abarenkov, K., et al. (2011). Metaxa: a software tool for automated detection and discrimination among ribosomal small subunit (12S/16S/18S) sequences of archaea, bacteria, eukaryotes, mitochondria, and chloroplasts in metagenomes and environmental sequencing datasets. Antonie van Leeuwenhoek, 100(3), 471–475. doi:10.1007/s10482-011-9598-6

Antibiotic resistance driving virulence?

It seriously worries me that a number of recent indications suggest that the heavy use of antibiotics drives not only the development of antibiotic resistance, but also the development of more virulent and aggressive strains of pathogenic bacteria. First, the genome sequencing of the E. coli strain that caused the EHEC outbreak in Germany in May revealed not only antibiotic resistance genes, but also that the strain is able to make Shiga toxin, which causes the severe diarrhoea and kidney damage related to the haemolytic uremic syndrome (HUS). The genes encoding the Shiga toxin are not originally bacterial genes, but instead seem to originate from phages. When E. coli gets infected with a Shiga toxin-producing phage, it becomes a human pathogen [1]. David Acheson, managing director for food safety at consulting firm Leavitt Partners, says that exposure to antibiotics might be enhancing the spread of Shiga toxin-producing phage. Some antibiotics trigger what is referred to as the SOS response, which induces the phage to start replicating. The replication of the phage causes the bacteria to burst, releasing the phages, and with them the toxin [1].

Second, there is apparently an ongoing outbreak of scarlet fever in Hong Kong. Kwok-Yung Yuen, microbiologist at the University of Hong Kong, has analyzed the draft sequence of the genome, and suggests that the bacteria acquired greater virulence and drug resistance by picking up one or more genes from bacteria in the human oral and urogenital tracts. He believes that the overuse of antibiotics is driving the emergence of drug resistance in these bacteria [2].

Now, both of these cases are just indications, but if they hold true, that would be an alarming development, where the use of antibiotics promotes not only the spread of resistance genes, impairing our ability to treat bacterial infections, but also the development of far more virulent and aggressive strains. Combining increasing untreatability with increasing aggressiveness seems to me like the ultimate weapon against our relatively high standards of treatment of common infections. Good thing hand hygiene still seems to help [3].


  1. Phage on the rampage (http://www.nature.com/news/2011/110609/full/news.2011.360.html), Published online 9 June 2011, Nature, doi:10.1038/news.2011.360
  2. Mutated Bacteria Drives Scarlet Fever Outbreak (http://news.sciencemag.org/scienceinsider/2011/06/mutated-bacteria-drives-scarlet.html?etoc&elq=cd94aa347dca45b3a82f144b8213e82b), Published online 27 June 2011.
  3. Luby SP, Halder AK, Huda T, Unicomb L, Johnston RB (2011) The Effect of Handwashing at Recommended Times with Water Alone and With Soap on Child Diarrhea in Rural Bangladesh: An Observational Study. PLoS Med 8(6): e1001052. doi:10.1371/journal.pmed.1001052 (http://www.plosmedicine.org/article/info%3Adoi%2F10.1371%2Fjournal.pmed.1001052)

Thesis presentation

I will present my master's thesis “Metagenomic Analysis of Marine Periphyton Communities” on Tuesday the 22nd of March, at 13.00. The presentation will take place in the room Folke Andreasson at Medicinaregatan 11 in Gothenburg. The presentation is open to everyone, but the number of seats is limited.

Raising the bar for genome sequencing

In a recent Nature article (1), Craig Venter and his co-workers at JCVI have sequenced not just one marine bacterium, but 137 different isolates. The main goal of the study was to better understand the ecology of marine picoplankton in the context of Global Ocean Sampling (GOS) data (2,3). As I see it, there are at least two really interesting things going on here:

First, this is a milestone in sequencing. We're not talking one genome – one article anymore. We're talking one article – 137 new genomes. This vastly raises the bar for any sequencing efforts in the future, but even more importantly, it shifts the focus even further from the actual sequencing to the purpose of the sequencing. One sequenced genome might be interesting enough if it fills a biological knowledge gap, but just sequencing a bacterial strain isn’t worth that much anymore. With the arrival of second- and third-generation sequencing techniques, this development was pretty obvious, but this article is (to my knowledge) the first real proof that this has finally happened. I expect that five to ten years from now, not sequencing an organism of interest for your research will be viewed as very strange and backwards-looking. “Why didn’t you sequence this?” will be a highly relevant review question for many publications. But the days when you could write “we here publish for the first time the complete genome sequence of <insert organism name here>” and have that as the central theme for an article will also soon be over. Sequencing will simply be reduced to the (valuable) tool it actually is. Which is probably good, as it brings us back to biology again. Articles like this one, where you look at ~200 genomes to investigate ecological questions, simply provide a more relevant biological perspective than staring at the sequence of one genome at a time when DNA data is flooding over us.

Second, this is the first (again, to my knowledge) publication where questions arising from metagenomics (2,3,4) have initiated a huge sequencing effort to understand the ecology of the environment with which the metagenome is associated. This highlights a new use of metagenomics as a prospective technique: mine various environments for interesting features, then select a few of their inhabitants and look closer at who is responsible for what. With a number of emerging single-cell sequencing and visualisation techniques (5,6,7,8), as well as the application of cell sorting approaches to environmental communities (5,9), we can expect metagenomics to play a huge role in organism, strain and protein discovery, but also in determining microbial ecosystem services. Though Venter’s latest article (1) is just a first step towards this new role for metagenomics, it’s a nice example of what (meta)genomics could look like towards the end of this decade, if not even sooner.

  1. Yooseph et al. Genomic and functional adaptation in surface ocean planktonic prokaryotes. Nature (2010) vol. 468 (7320) pp. 60-6
  2. Yooseph et al. The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLoS Biol (2007) vol. 5 (3) pp. e16
  3. Rusch et al. The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific. PLoS Biol (2007) vol. 5 (3) pp. e77
  4. Rusch et al. Characterization of Prochlorococcus clades from iron-depleted oceanic regions. Proceedings of the National Academy of Sciences of the United States of America (2010) pp.
  5. Woyke et al. Assembling the marine metagenome, one cell at a time. PLoS ONE (2009) vol. 4 (4) pp. e5299
  6. Woyke et al. One bacterial cell, one complete genome. PLoS ONE (2010) vol. 5 (4) pp. e10314
  7. Moraru et al. GeneFISH – an in situ technique for linking gene presence and cell identity in environmental microorganisms. Environ Microbiol (2010) pp.
  8. Lasken. Genomic DNA amplification by the multiple displacement amplification (MDA) method. Biochem Soc Trans (2009) vol. 37 (Pt 2) pp. 450-3
  9. Mary et al. Metaproteomic and metagenomic analyses of defined oceanic microbial populations using microwave cell fixation and flow cytometric sorting. FEMS microbiology ecology (2010) pp.