Published paper: The global resistome

Late yesterday, Microbiome put online our most recent work, covering the resistomes to antibiotics, biocides and metals across a vast range of environments. In the paper (1), we perform the largest characterization of resistance genes, mobile genetic elements (MGEs) and bacterial taxonomic compositions to date, covering 864 different metagenomes from humans (350), animals (145) and external environments such as soil, water, sewage, and air (369 in total). All the investigated metagenomes were sequenced to at least 10 million reads each, using Illumina technology, making the results more comparable across environments than in previous studies (2-4).

We found that the environment types had clear differences both in terms of resistance profiles and bacterial community composition. Humans and animals hosted microbial communities with limited taxonomic diversity as well as low abundance and diversity of biocide/metal resistance genes and MGEs. In contrast, the abundance of antibiotic resistance genes (ARGs) was relatively high in humans and animals. External environments, on the other hand, showed high taxonomic diversity and high diversity of biocide/metal resistance genes and MGEs. Water, sediment and soil generally carried low relative abundance and few varieties of known ARGs, whereas wastewater and sludge were on par with the human gut. The environments with the largest relative abundance and diversity of ARGs, including genes encoding resistance to last-resort antibiotics, were those subjected to industrial antibiotic pollution and air samples from a Beijing smog event.

A paper investigating this vast amount of data is of course hard to describe in a blog post, so I strongly suggest that interested readers head over to Microbiome’s page and read the full paper (1). However, here’s a very short summary of the findings:

  • The median relative abundance of ARGs across all environments was 0.035 copies per bacterial 16S rRNA
  • Antibiotic-polluted environments have (by far) the highest abundances of ARGs
  • Urban air samples carried high abundance and diversity of ARGs
  • Human microbiota has high abundance and diversity of known ARGs, but low taxonomic diversity compared to the external environment
  • The human and animal resistomes are dominated by tetracycline resistance genes
  • Over half of the ARGs were only detected in external environments, while 20.5 % were found in human, animal and at least one of the external environments
  • Biocide and metal resistance genes are more common in external environments than in the human microbiota
  • Human microbiota carries low abundance and richness of MGEs compared to most external environments
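
To make the unit in the first bullet concrete, here is a minimal sketch of how a relative abundance in "ARG copies per 16S rRNA" can be computed and summarized as a median. The sample names, read counts, the single mean ARG length per sample and the 1,500 bp 16S length are all illustrative assumptions, not values or methods taken from the paper (1).

```python
from statistics import median

def arg_per_16s(arg_reads, arg_gene_length, rrna_reads, rrna_gene_length=1500):
    """Length-normalized ARG copies per 16S rRNA gene copy for one sample.

    Dividing read counts by gene length gives a rough per-gene coverage,
    so that long genes do not look more abundant simply because they
    attract more reads. The default 16S length is an assumption.
    """
    if rrna_reads <= 0:
        raise ValueError("sample has no 16S rRNA reads to normalize against")
    arg_coverage = arg_reads / arg_gene_length
    rrna_coverage = rrna_reads / rrna_gene_length
    return arg_coverage / rrna_coverage

# Hypothetical samples: (reads matching ARGs, mean ARG length, reads matching 16S)
samples = {
    "soil": (40, 1000, 6000),
    "gut": (900, 1000, 9000),
    "sewage": (300, 1000, 7500),
}

abundances = {name: arg_per_16s(a, length, r) for name, (a, length, r) in samples.items()}
median_abundance = median(abundances.values())
```

A real analysis would normalize each gene by its own length rather than use one mean length per sample; the simplification here only keeps the example short.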

Importantly, less than 1.5 % of all detected ARGs were found exclusively in the human microbiome. At the same time, 57.5 % of the known ARGs were only detected in metagenomes from environmental samples, even though the majority of the investigated ARGs were initially encountered in pathogens. Still, our analysis suggests that most of these genes are relatively rare in the human microbiota. Environmental samples generally contained a wider distribution of resistance genes to a more diverse set of antibiotic classes. For example, the relative abundance of beta-lactam resistance genes was much larger in external environments than in human and animal microbiomes. This suggests that the external environment harbours many more varieties of resistance genes than the ones currently known from the clinic. Indeed, functional metagenomics has resulted in the discovery of many novel ARGs in external environments (e.g. 5). This all fits well with an overall much higher taxonomic diversity of environmental microbial communities. In terms of consequences associated with the potential transfer of ARGs to human pathogens, we argue that unknown resistance genes are of greater concern than those already known to circulate among human-associated bacteria (6).
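
The percentages above are shares of the detected ARGs that fall into different presence/absence categories. As a rough illustration of that bookkeeping (not the code used in the paper), the sketch below partitions a toy set of ARGs by where they were detected. The gene names `arg01` to `arg10` and their distribution are invented placeholders.

```python
def arg_shares(human, animal, external):
    """Percentage of all detected ARGs in each presence/absence category."""
    all_args = human | animal | external
    categories = {
        "only_external": external - human - animal,
        "only_human": human - animal - external,
        "human_animal_and_external": human & animal & external,
    }
    return {name: 100 * len(genes) / len(all_args) for name, genes in categories.items()}

# Toy detections (placeholder gene names, not real data)
human = {"arg01", "arg02", "arg03"}
animal = {"arg01", "arg02", "arg04"}
external = {"arg01"} | {f"arg{i:02d}" for i in range(5, 11)}

shares = arg_shares(human, animal, external)
```

With these toy sets, 60 % of the genes are seen only in external environments and 10 % only in humans, which mirrors the shape (though not the numbers) of the result described above.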

This study describes the potential for many external environments, including those subjected to pharmaceutical pollution, air and wastewater/sludge, to serve as hotspots for resistance development and/or transmission of ARGs. In addition, our results indicate that these environments may play important roles in the mobilization of yet unknown ARGs and their further transmission to human pathogens. To provide guidance for risk-reducing actions we – based on this study – suggest strict regulatory measures of waste discharges from pharmaceutical industries and encourage more attention to air in the transmission of antibiotic resistance (1).

References

  1. Pal C, Bengtsson-Palme J, Kristiansson E, Larsson DGJ: The structure and diversity of human, animal and environmental resistomes. Microbiome, 4, 54 (2016). doi: 10.1186/s40168-016-0199-5
  2. Durso LM, Miller DN, Wienhold BJ. Distribution and quantification of antibiotic resistant genes and bacteria across agricultural and non-agricultural metagenomes. PLoS One. 2012;7:e48325.
  3. Nesme J, Delmont TO, Monier J, Vogel TM. Large-scale metagenomic-based study of antibiotic resistance in the environment. Curr Biol. 2014;24:1096–100.
  4. Fitzpatrick D, Walsh F. Antibiotic resistance genes across a wide variety of metagenomes. FEMS Microbiol Ecol. 2016. doi:10.1093/femsec/fiv168.
  5. Allen HK, Moe LA, Rodbumrer J, Gaarder A, Handelsman J. Functional metagenomics reveals diverse β-lactamases in a remote Alaskan soil. ISME J. 2009;3:243–51.
  6. Bengtsson-Palme J, Larsson DGJ: Antibiotic resistance genes in the environment: prioritizing risks. Nature Reviews Microbiology, 13, 369 (2015). doi: 10.1038/nrmicro3399-c1

Annual GOTBIN Meeting on December 6

I am part of the organizing committee for the newly invented annual meeting for GOTBIN – the Gothenburg Bioinformatics Network. We will arrange a meeting on December 6 to get the networking activities for 2017 kickstarted, and every bioinformatician in Gothenburg is invited!

GOTBIN was launched to bridge and bring together all researchers in Gothenburg who fully or partially deal with bioinformatics in their research. Through the network it should be possible to quickly find other local researchers tackling the same research problems as you are; to find appropriate resources to run your analyses; and to discuss research or infrastructure problems as they arise. To keep the network alive and kicking, it is crucial to keep relations active. Furthermore, it is also crucial to interact with key persons in the GOTBIN network to keep the lists of active researchers, resources and discussion forums up to date.

To facilitate future communication, we invite everyone who works with bioinformatics in Gothenburg to participate in a get-together workshop. The purpose of this workshop is to find out who is working with what and where, and to better get to know each other. This is a great opportunity to meet your next collaboration partner, post-doc or supervisor! The event will take place in Birgit Thilander at Medicinareberget (just next to the large lecture hall Arvid Carlsson) on December 6th, from 9.00 to 12.00. Fika will be provided and we will arrange ice-breaker activities suitable for the number of participants, so please register at: https://goo.gl/forms/KYdiiZMBDf0F9hvp2 by the 16th of November.

We hope to find everyone with a research interest in bioinformatics there and that this will be the launch of the next era of GOTBIN in 2017! See you there!

1 billion seconds

In the area of random notes, I turned 1 gigasecond today. That is, this morning it was 1 billion seconds since I was born. As good a reason as any to celebrate with cake at work! You’re all welcome to the 2 gigasecond celebration in ~31 years. Cheers!
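
For anyone who wants to find their own gigasecond day, the arithmetic is short. This is a small sketch; the birth date used is a made-up example, not mine.

```python
from datetime import datetime, timedelta

GIGASECOND = timedelta(seconds=10**9)

def gigasecond_anniversary(birth, n=1):
    """Moment at which `n` gigaseconds have passed since `birth`."""
    return birth + n * GIGASECOND

# One gigasecond expressed in (Julian) years of 365.25 days
years_per_gigasecond = 10**9 / (365.25 * 24 * 3600)

# Hypothetical birth moment, for illustration only
example = gigasecond_anniversary(datetime(2000, 1, 1))
```

One gigasecond works out to a little under 31.7 years, so the second celebration follows roughly that long after the first.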

Published opinion piece: Why limit antibiotic pollution?

Joakim Larsson and I wrote an opinion/summary piece for the APUA Newsletter, issued by the Alliance for Prudent Use of Antibiotics, that was published yesterday (1). The paper is essentially a summary of work included in my PhD thesis, and discusses how to establish minimal selective concentrations of antibiotics for microbial communities (2-4), how to identify risk environments for resistance selection (5-9), and which mitigation strategies can be implemented (10-12). We also partially discussed these issues earlier in our paper in the Medicine Maker (10), but this paper goes deeper into why limiting antibiotic pollution is important to mitigate the accelerating antibiotic resistance problem. I recommend this short summary piece to anyone who would like a brief overview of our research on antibiotic resistance, and think that it can serve as a great starting point for further reading! In addition, this issue of the newsletter features very interesting pieces on reducing antibiotics use (and disposal) outside of the clinics (14) and revival of old antibiotics (13). Please go ahead to the APUA web site and read the entire newsletter!

References

  1. Bengtsson-Palme J, Larsson DGJ: Why limit antibiotic pollution? The role of environmental selection in antibiotic resistance development. APUA Newsletter, 34, 2, 6-9 (2016). [Paper link].
  2. Bengtsson-Palme J, Larsson DGJ: Concentrations of antibiotics predicted to select for resistant bacteria: Proposed limits for environmental regulation. Environment International, 86, 140-149 (2016). doi: 10.1016/j.envint.2015.10.015 [Paper link]
  3. Gullberg E, Cao S, Berg OG, Ilbäck C, Sandegren L, Hughes D, et al.: Selection of resistant bacteria at very low antibiotic concentrations. PLoS Pathogens 7, e1002158 (2011).
  4. Lundström S, Östman M, Bengtsson-Palme J, Rutgersson C, Thoudal M, Sircar T, Blanck H, Eriksson KM, Tysklind M, Flach C-F, Larsson DGJ: Minimal selective concentrations of tetracycline in complex aquatic bacterial biofilms. Science of the Total Environment, 553, 587–595 (2016). doi: 10.1016/j.scitotenv.2016.02.103
  5. Bengtsson-Palme J, Boulund F, Fick J, Kristiansson E, Larsson DGJ: Shotgun metagenomics reveals a wide array of antibiotic resistance genes and mobile elements in a polluted lake in India. Frontiers in Microbiology, 5, 648 (2014). doi: 10.3389/fmicb.2014.00648
  6. Bengtsson-Palme J, Hammarén R, Pal C, Östman M, Björlenius B, Flach C-F, Kristiansson E, Fick J, Tysklind M, Larsson DGJ: Elucidating selection processes for antibiotic resistance in sewage treatment plants using metagenomics. Science of the Total Environment, in press (2016). doi: 10.1016/j.scitotenv.2016.06.228
  7. Berendonk TU, Manaia CM, Merlin C, Fatta-Kassinos D, Cytryn E, Walsh F, et al.: Tackling antibiotic resistance: the environmental framework. Nature Reviews Microbiology, 13, 310–317 (2015). doi: 10.1038/nrmicro3439
  8. Martinez JL, Coque TM, Baquero F: What is a resistance gene? Ranking risk in resistomes. Nature Reviews Microbiology 2015, 13:116–123. doi:10.1038/nrmicro3399
  9. Bengtsson-Palme J, Larsson DGJ: Antibiotic resistance genes in the environment: prioritizing risks. Nature Reviews Microbiology, 13, 369 (2015). doi: 10.1038/nrmicro3399-c1
  10. Bengtsson-Palme J, Larsson DGJ: Time to limit antibiotic pollution. The Medicine Maker, 0416, 302, 17–18 (2016). [Paper link]
  11. Ashbolt NJ, Amézquita A, Backhaus T, Borriello P, Brandt KK, Collignon P, et al.: Human Health Risk Assessment (HHRA) for Environmental Development and Transfer of Antibiotic Resistance. Environmental Health Perspectives, 121, 993–1001 (2013)
  12. Pruden A, Larsson DGJ, Amézquita A, Collignon P, Brandt KK, Graham DW, et al.: Management options for reducing the release of antibiotics and antibiotic resistance genes to the environment. Environmental Health Perspectives, 121, 878–85 (2013).
  13. Theuretzbacher U: Optimizing the Use of Old Antibiotics — A Global Health Agenda. APUA Newsletter, 34, 2, 10-13 (2016). [Paper link].
  14. Amábile-Cuevas CF: Antibiotics and Antibiotic Resistance All Around Us. APUA Newsletter, 34, 2, 3-5 (2016). [Paper link].

Database quality paper in special issue

I just want to highlight that the paper on strategies to improve database accuracy and usability we recently published in Proteomics (1) has been included in their most recent issue, which is a special issue focusing on Data Quality Issues in Proteomics. I highly recommend reading our paper (of course) and many of the others in the special issue. Happy reading!

On another note, I will be giving a talk next Wednesday (October 5th) at a seminar day on next generation sequencing in clinical microbiology, titled “Antibiotic resistance in the clinic and the environment – There and back again”. You are very welcome to the lecture hall on floor 3 in our building at Guldhedsgatan 10A here in Gothenburg if you are interested! (Bear in mind though that it all starts at 8.15 in the morning.)

Finally, it seems that I am going to the Next Generation Sequencing Congress in London this year, which will be very fun! Hope to see some of you dealing with sequencing there!

References

  1. Bengtsson-Palme J, Boulund F, Edström R, Feizi A, Johnning A, Jonsson VA, Karlsson FH, Pal C, Pereira MB, Rehammar A, Sánchez J, Sanli K, Thorell K: Strategies to improve usability and preserve accuracy in biological sequence databases. Proteomics, 16, 18, 2454–2460 (2016). doi: 10.1002/pmic.201600034 [Paper link]

Published paper: Annotating fungi from the built environment

MycoKeys today put a paper online which I was involved in. The paper describes the results of a workshop in May, when we added and refined annotations for fungal ITS sequences according to the MIxS-Built Environment annotation standard (1). Fungi have been associated with a range of unwanted effects in the built environment, including asthma, decay of building materials, and food spoilage. However, the state of the metadata annotation of fungal DNA sequences from the built environment is very much incomplete in public databases. The workshop aimed to ease a small part of this problem, by distributing the re-annotation of public fungal ITS sequences across 36 persons. In total, we added or changed 45,488 data points drawing from published literature, including addition of 8,430 instances of countries of collection, 5,801 instances of building types, and 3,876 instances of surface-air contaminants. The results have been implemented in the UNITE database and shared with other online resources. I believe that distributed initiatives like this (and the ones I have been involved in in the past (2,3)) serve a very important purpose for establishing better annotation of sequence data, an issue I have brought up also for sequences outside of barcoding genes (4). The full paper can be found here.

References

  1. Abarenkov K, Adams RI, Laszlo I, Agan A, Ambrioso E, Antonelli A, Bahram M, Bengtsson-Palme J, Bok G, Cangren P, Coimbra V, Coleine C, Gustafsson C, He J, Hofmann T, Kristiansson E, Larsson E, Larsson T, Liu Y, Martinsson S, Meyer W, Panova M, Pombubpa N, Ritter C, Ryberg M, Svantesson S, Scharn R, Svensson O, Töpel M, Unterseher M, Visagie C, Wurzbacher C, Taylor AFS, Kõljalg U, Schriml L, Nilsson RH: Annotating public fungal ITS sequences from the built environment according to the MIxS-Built Environment standard – a report from a May 23-24, 2016 workshop (Gothenburg, Sweden). MycoKeys, 16, 1–15 (2016). doi: 10.3897/mycokeys.16.10000
  2. Kõljalg U, Nilsson RH, Abarenkov K, Tedersoo L, Taylor AFS, Bahram M, Bates ST, Bruns TT, Bengtsson-Palme J, Callaghan TM, Douglas B, Drenkhan T, Eberhardt U, Dueñas M, Grebenc T, Griffith GW, Hartmann M, Kirk PM, Kohout P, Larsson E, Lindahl BD, Lücking R, Martín MP, Matheny PB, Nguyen NH, Niskanen T, Oja J, Peay KG, Peintner U, Peterson M, Põldmaa K, Saag L, Saar I, Schüßler A, Senés C, Smith ME, Suija A, Taylor DE, Telleria MT, Weiß M, Larsson KH: Towards a unified paradigm for sequence-based identification of Fungi. Molecular Ecology, 22, 21, 5271–5277 (2013). doi: 10.1111/mec.12481
  3. Nilsson RH, Hyde KD, Pawlowska J, Ryberg M, Tedersoo L, Aas AB, Alias SA, Alves A, Anderson CL, Antonelli A, Arnold AE, Bahnmann B, Bahram M, Bengtsson-Palme J, Berlin A, Branco S, Chomnunti P, Dissanayake A, Drenkhan R, Friberg H, Frøslev TG, Halwachs B, Hartmann M, Henricot B, Jayawardena R, Jumpponen A, Kauserud H, Koskela S, Kulik T, Liimatainen K, Lindahl B, Lindner D, Liu J-K, Maharachchikumbura S, Manamgoda D, Martinsson S, Neves MA, Niskanen T, Nylinder S, Pereira OL, Pinho DB, Porter TM, Queloz V, Riit T, Sanchez-García M, de Sousa F, Stefaczyk E, Tadych M, Takamatsu S, Tian Q, Udayanga D, Unterseher M, Wang Z, Wikee S, Yan J, Larsson E, Larsson K-H, Kõljalg U, Abarenkov K: Improving ITS sequence data for identification of plant pathogenic fungi. Fungal Diversity, 67, 1, 11–19 (2014). doi: 10.1007/s13225-014-0291-8
  4. Bengtsson-Palme J, Boulund F, Edström R, Feizi A, Johnning A, Jonsson VA, Karlsson FH, Pal C, Pereira MB, Rehammar A, Sánchez J, Sanli K, Thorell K: Strategies to improve usability and preserve accuracy in biological sequence databases. Proteomics, Early view (2016). doi: 10.1002/pmic.201600034

A note on FARAO and graphical output

I just wanted to share an experience with the FARAO software we recently published a paper about, and its compatibility with the GD and libpng libraries (used for creating PNG files). I have received questions from users about how to get this to work, and to test it out I decided to try to install it on my Mac. It turned out to be nearly impossible to get this to work. These two packages are extremely picky with versions and dependencies. After trying for about an hour, I gave up and turned to my Linux machine. Surprisingly, I could not get it to work from scratch there either, even though I had it running (with some previous version combination) when we programmed and tested FARAO.

I find this extremely annoying myself, and I will try to look into other solutions for PNG or JPEG output from FARAO. In the meantime, I can only recommend using the EPS output option instead, which produces nicer-looking figures and is considerably easier to set up. I am sorry about this and hope to be able to provide a better solution soon.
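
If you really need a PNG, one possible workaround (not a FARAO feature, and not something I have tested across platforms) is to produce the EPS file as usual and convert it afterwards with Ghostscript. The sketch below assumes that the `gs` executable is installed and on your PATH; the file names and the 300 dpi default are placeholders.

```python
import shutil
import subprocess

def eps_to_png_command(eps_path, png_path, dpi=300):
    """Build a Ghostscript command line that rasterizes an EPS file to PNG."""
    return [
        "gs",
        "-dSAFER", "-dBATCH", "-dNOPAUSE",  # non-interactive, sandboxed run
        "-dEPSCrop",                        # crop the page to the EPS bounding box
        f"-r{dpi}",                         # output resolution in dpi
        "-sDEVICE=png16m",                  # 24-bit colour PNG device
        f"-sOutputFile={png_path}",
        eps_path,
    ]

def eps_to_png(eps_path, png_path, dpi=300):
    """Run the conversion; raises if Ghostscript is missing or fails."""
    if shutil.which("gs") is None:
        raise RuntimeError("Ghostscript (gs) was not found on PATH")
    subprocess.run(eps_to_png_command(eps_path, png_path, dpi), check=True)

# Example command for hypothetical file names
command = eps_to_png_command("figure.eps", "figure.png")
```

Calling `eps_to_png("figure.eps", "figure.png")` would then write the PNG next to the EPS file, without involving GD or libpng on the Perl side at all.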

Published paper: Antibiotic resistance in sewage treatment plants

After a long wait (1), Science of the Total Environment has finally decided to make our paper on selection of antibiotic resistance genes in sewage treatment plants (STPs) available (2). STPs are often suggested to be “hotspots” for emergence and dissemination of antibiotic-resistant bacteria (3-6). However, we actually do not know if the selection pressures within STPs, which can be caused either by residual antibiotics or other co-selective agents, are sufficiently large to specifically promote resistance. To better understand this, we used shotgun metagenomic sequencing of samples from different steps of the treatment process (incoming water, treated water, primary sludge, recirculated sludge and digested sludge) in three Swedish STPs in the Stockholm area to characterize the frequencies of resistance genes to antibiotics, biocides and metals, as well as mobile genetic elements and taxonomic composition. In parallel, we also measured concentrations of antibiotics, biocides and metals.

We found that only the concentrations of tetracycline and ciprofloxacin in the influent water were above those that we predict to cause resistance selection (7). However, there was no consistent enrichment of resistance genes to any particular class of antibiotics in the STPs, nor for biocide and metal resistance genes. Instead, the most substantial change of the bacterial communities compared to human feces (sampled from Swedes in another study of ours (8)) occurred already in the sewage pipes, and was manifested by a strong shift from obligate to facultative anaerobes. Through the treatment process, resistance genes against antibiotics, biocides and metals were not reduced to the same extent as fecal bacteria were.
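
The comparison in the first sentence above boils down to checking each measured concentration against a predicted selective concentration for the same compound. Here is a minimal sketch of that check; the compound names and all numbers are invented placeholders, not measurements from the paper (2) or thresholds from (7).

```python
def exceeding_selective_threshold(measured, predicted_selective):
    """Compounds whose measured concentration exceeds the predicted
    selective concentration. Compounds without a threshold are skipped,
    since nothing can be concluded for them."""
    return sorted(
        name
        for name, concentration in measured.items()
        if name in predicted_selective and concentration > predicted_selective[name]
    )

# Placeholder values, all in the same unit (e.g. µg/L)
measured = {"antibiotic_A": 0.5, "antibiotic_B": 0.02, "antibiotic_C": 1.2, "antibiotic_D": 3.0}
predicted_selective = {"antibiotic_A": 0.1, "antibiotic_B": 0.25, "antibiotic_C": 2.0}

flagged = exceeding_selective_threshold(measured, predicted_selective)
```

Both dictionaries must use the same unit for the comparison to mean anything, which is worth double-checking when the values come from different sources.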

Worryingly, the OXA-48 beta-lactamase gene was consistently enriched in surplus and digested sludge. OXA-48 is still rare in Swedish clinical isolates (9), but provides resistance to carbapenems, one of our most critically important classes of antibiotics. However, taken together, metagenomic sequencing did not provide clear support for any specific selection of antibiotic resistance. Rather, since stronger selective forces affect gross taxonomic composition, and thereby also resistance gene abundances, it is very hard to interpret the metagenomic data from a risk-for-selection perspective. We therefore think that comprehensive analyses of resistant vs. non-resistant strains within relevant species are warranted.

Taken together, the main take-home messages of the paper (2) are:

  • There was no apparent evidence for direct selection of resistance genes by antibiotics or co-selection by biocides or metals
  • Abiotic factors (mostly oxygen availability) strongly shape taxonomy and seem to be driving changes of resistance genes
  • Metagenomic and/or PCR-based community studies may not be sufficiently sensitive to detect selection effects, as important shifts towards resistance may occur within species and not on the community level
  • The concentrations of antibiotics, biocides and metals were overall reduced, but not removed in STPs. Incoming concentrations of antibiotics in Swedish STPs are generally low
  • Resistance genes are overall reduced through the treatment process, but far from eliminated

References and notes

  1. Okay, those who take notes know that I have already complained once before about Science of the Total Environment’s ridiculously long production handling times. But, seriously, how can a journal’s production team return the proofs three days after acceptance, and then wait seven weeks before putting the final proofs online? I still wonder what is going on behind the scenes, which is totally obscure because the production office also refuses to respond to e-mails. Not a nice publishing experience this time either.
  2. Bengtsson-Palme J, Hammarén R, Pal C, Östman M, Björlenius B, Flach C-F, Kristiansson E, Fick J, Tysklind M, Larsson DGJ: Elucidating selection processes for antibiotic resistance in sewage treatment plants using metagenomics. Science of the Total Environment, in press (2016). doi: 10.1016/j.scitotenv.2016.06.228 [Paper link]
  3. Rizzo L, Manaia C, Merlin C, Schwartz T, Dagot C, Ploy MC, Michael I, Fatta-Kassinos D: Urban wastewater treatment plants as hotspots for antibiotic resistant bacteria and genes spread into the environment: a review. Science of the Total Environment, 447, 345–360 (2013). doi: 10.1016/j.scitotenv.2013.01.032
  4. Laht M, Karkman A, Voolaid V, Ritz C, Tenson T, Virta M, Kisand V: Abundances of Tetracycline, Sulphonamide and Beta-Lactam Antibiotic Resistance Genes in Conventional Wastewater Treatment Plants (WWTPs) with Different Waste Load. PLoS ONE, 9, e103705 (2014). doi: 10.1371/journal.pone.0103705
  5. Yang Y, Li B, Zou S, Fang HHP, Zhang T: Fate of antibiotic resistance genes in sewage treatment plant revealed by metagenomic approach. Water Research, 62, 97–106 (2014). doi: 10.1016/j.watres.2014.05.019
  6. Berendonk TU, Manaia CM, Merlin C, Fatta-Kassinos D, Cytryn E, Walsh F, et al.: Tackling antibiotic resistance: the environmental framework. Nature Reviews Microbiology, 13, 310–317 (2015). doi: 10.1038/nrmicro3439
  7. Bengtsson-Palme J, Larsson DGJ: Concentrations of antibiotics predicted to select for resistant bacteria: Proposed limits for environmental regulation. Environment International, 86, 140–149 (2016). doi: 10.1016/j.envint.2015.10.015
  8. Bengtsson-Palme J, Angelin M, Huss M, Kjellqvist S, Kristiansson E, Palmgren H, Larsson DGJ, Johansson A: The human gut microbiome as a transporter of antibiotic resistance genes between continents. Antimicrobial Agents and Chemotherapy, 59, 10, 6551–6560 (2015). doi: 10.1128/AAC.00933-15
  9. Hellman J, Aspevall O, Bengtsson B, Pringle M: SWEDRES-SVARM 2014. Consumption of antimicrobials and occurrence of antimicrobial resistance in Sweden. Public Health Agency of Sweden and National Veterinary Institute, Solna/Uppsala, Sweden. Report No.: 14027. Available from: http://www.folkhalsomyndigheten.se/publicerat-material/ (2014)

Published paper: Strategies for better databases

I am happy to announce that our Viewpoint article on strategies for improving sequence databases has now been published in the journal Proteomics. The paper (1) defines some central problems hampering genomic, proteomic and metagenomic analyses and suggests five strategies to improve the situation:

  1. Clearly separate experimentally verified and unverified sequence entries
  2. Enable a system for tracing the origins of annotations
  3. Separate entries with high-quality, informative annotation from less useful ones
  4. Integrate automated quality-control software whenever such tools exist
  5. Facilitate post-submission editing of annotations and metadata associated with sequences
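
To give a flavour of what some of these strategies could look like in practice, here is a small data-structure sketch. It is my own illustration and not a schema from the paper or from any existing database: every class, field and accession name is hypothetical. It covers strategies 1, 2 and 5, by tagging each annotation with its evidence type and source and by recording edits as new revisions instead of overwriting old ones.

```python
from dataclasses import dataclass, field
from enum import Enum

class Evidence(Enum):
    EXPERIMENTAL = "experimentally verified"
    PREDICTED = "predicted"

@dataclass(frozen=True)
class Annotation:
    text: str
    evidence: Evidence  # strategy 1: verified vs. unverified
    source: str         # strategy 2: where the annotation came from

@dataclass
class SequenceEntry:
    accession: str
    sequence: str
    history: list = field(default_factory=list)

    def revise(self, text, evidence, source):
        """Strategy 5: a post-submission edit is appended, keeping the trail."""
        self.history.append(Annotation(text, evidence, source))

    @property
    def current(self):
        return self.history[-1] if self.history else None

def verified_only(entries):
    """Entries whose current annotation is experimentally verified."""
    return [e for e in entries if e.current and e.current.evidence is Evidence.EXPERIMENTAL]

# Hypothetical entries
a = SequenceEntry("X0001", "MKT")
a.revise("hypothetical protein", Evidence.PREDICTED, "automatic pipeline")
a.revise("beta-lactamase", Evidence.EXPERIMENTAL, "functional assay")
b = SequenceEntry("X0002", "MLS")
b.revise("putative transporter", Evidence.PREDICTED, "similarity to X0001")
verified = verified_only([a, b])
```

Keeping the full history means that an annotation inherited from another entry can be traced back and corrected if the original turns out to be wrong.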

The paper is not long, so I encourage you to read it in its entirety. We believe that spreading this knowledge and pushing solutions to problems related to poor annotation metadata is vastly important in this era of big data. Although we specifically address protein-coding genes in this paper, the same logic also applies to other types of biological sequences. In this way the paper is related to my previous work with Henrik Nilsson on improving annotation data for taxonomic barcoding genes (2-4). This paper was one of the main end-results of the GoBiG network, and the backstory on the paper follows below the references…

References

  1. Bengtsson-Palme J, Boulund F, Edström R, Feizi A, Johnning A, Jonsson VA, Karlsson FH, Pal C, Pereira MB, Rehammar A, Sánchez J, Sanli K, Thorell K: Strategies to improve usability and preserve accuracy in biological sequence databases. Proteomics, Early view (2016). doi: 10.1002/pmic.201600034
  2. Kõljalg U, Nilsson RH, Abarenkov K, Tedersoo L, Taylor AFS, Bahram M, Bates ST, Bruns TT, Bengtsson-Palme J, Callaghan TM, Douglas B, Drenkhan T, Eberhardt U, Dueñas M, Grebenc T, Griffith GW, Hartmann M, Kirk PM, Kohout P, Larsson E, Lindahl BD, Lücking R, Martín MP, Matheny PB, Nguyen NH, Niskanen T, Oja J, Peay KG, Peintner U, Peterson M, Põldmaa K, Saag L, Saar I, Schüßler A, Senés C, Smith ME, Suija A, Taylor DE, Telleria MT, Weiß M, Larsson KH: Towards a unified paradigm for sequence-based identification of Fungi. Molecular Ecology, 22, 21, 5271–5277 (2013). doi: 10.1111/mec.12481
  3. Nilsson RH, Hyde KD, Pawlowska J, Ryberg M, Tedersoo L, Aas AB, Alias SA, Alves A, Anderson CL, Antonelli A, Arnold AE, Bahnmann B, Bahram M, Bengtsson-Palme J, Berlin A, Branco S, Chomnunti P, Dissanayake A, Drenkhan R, Friberg H, Frøslev TG, Halwachs B, Hartmann M, Henricot B, Jayawardena R, Jumpponen A, Kauserud H, Koskela S, Kulik T, Liimatainen K, Lindahl B, Lindner D, Liu J-K, Maharachchikumbura S, Manamgoda D, Martinsson S, Neves MA, Niskanen T, Nylinder S, Pereira OL, Pinho DB, Porter TM, Queloz V, Riit T, Sanchez-García M, de Sousa F, Stefaczyk E, Tadych M, Takamatsu S, Tian Q, Udayanga D, Unterseher M, Wang Z, Wikee S, Yan J, Larsson E, Larsson K-H, Kõljalg U, Abarenkov K: Improving ITS sequence data for identification of plant pathogenic fungi. Fungal Diversity, 67, 1, 11–19 (2014). doi: 10.1007/s13225-014-0291-8
  4. Nilsson RH, Tedersoo L, Ryberg M, Kristiansson E, Hartmann M, Unterseher M, Porter TM, Bengtsson-Palme J, Walker D, de Sousa F, Gamper HA, Larsson E, Larsson K-H, Kõljalg U, Edgar R, Abarenkov K: A comprehensive, automatically updated fungal ITS sequence dataset for reference-based chimera control in environmental sequencing efforts. Microbes and Environments, 30, 2, 145–150 (2015). doi: 10.1264/jsme2.ME14121

Backstory
In June 2013, the Gothenburg Bioinformatics Group for junior scientists (GoBiG) arranged a workshop with two themes: “Parallelized quantification of genes in large metagenomic datasets” and “Assigning functional predictions to NGS data”. The following discussion on how database quality influenced results and what could be done to improve the situation was rather intense, and several good ideas were thrown around. I took notes from the meeting, and in the evening I wrote them up on the balcony during a warm summer night. In fact, the notes were good enough to be an early embryo for a manuscript. So I sent it to some of the most active GoBiG members (Kaisa Thorell and Fredrik Boulund), who were positive regarding the idea to turn it into a manuscript. I wrote it up more properly and we decided that everyone who contributed with ideas at the meeting would be invited to become co-authors. We submitted the manuscript in early 2014, only to see it (rather brutally) rejected. At that point most of us were sucked up in our own projects, so nothing happened to this manuscript for over a year. Then we decided to give it another go, updated the manuscript heavily and changed a few parts to better reflect the current database situation (at this point, e.g., UniProt had already started implementing some of our suggested ideas). Still, some of the proposed strategies were more radical in 2013 than they would be now, more than three years later. We asked the Proteomics editors if they would be interested in the manuscript, and they turned out to be very positive. Indeed, the entire experience with the editors at Proteomics has been very pleasant. I am very thankful to the GoBiG team for this time, and to the editors at Proteomics who saw the value of this manuscript.

Published paper: FARAO

Late last year, we introduced FARAO – the Flexible All-Round Annotation Organizer – a software tool that allows visualization of annotated features on contigs. Today, the Applications Note describing the software was published as an advance access paper in Bioinformatics (1). As I have described before, storing and visualizing annotation and coverage information in FARAO has a number of advantages. FARAO is able to:

  • Integrate annotation and coverage information for the same sequence set, enabling coverage estimates of annotated features
  • Scale across millions of sequences and annotated features
  • Filter sequences, such that only entries with annotations satisfying certain given criteria are output
  • Handle annotation and coverage data produced by a range of different bioinformatics tools
  • Handle custom parsers through a flexible interface, allowing for adaptation of the software to virtually any bioinformatic tool not supported out of the box
  • Produce high-quality EPS output
  • Integrate with MySQL databases
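
To illustrate the first and third points conceptually, the sketch below combines per-base coverage with annotated features and filters contigs on an annotation criterion. Note that this is not FARAO's code or interface (FARAO has its own command-line tools and data storage); all names and data here are hypothetical and only show the idea.

```python
from dataclasses import dataclass

@dataclass
class Feature:
    contig: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    label: str

def feature_coverage(feature, per_base_coverage):
    """Mean sequencing depth across the positions spanned by a feature."""
    depths = per_base_coverage[feature.contig][feature.start - 1:feature.end]
    return sum(depths) / len(depths)

def contigs_with(features, predicate):
    """Names of contigs carrying at least one feature that satisfies `predicate`."""
    return sorted({f.contig for f in features if predicate(f)})

# Toy data
coverage = {"contig1": [2, 2, 4, 4, 4, 2], "contig2": [1, 1, 1]}
features = [
    Feature("contig1", 3, 5, "resistance gene"),
    Feature("contig2", 1, 3, "transposase"),
]

gene_depth = feature_coverage(features[0], coverage)
selected = contigs_with(features, lambda f: "resistance" in f.label)
```

Storing annotation and coverage for the same sequence set together is what makes a question like "how deeply covered is this resistance gene?" a simple lookup.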

I have previously used FARAO to produce annotation figures in our paper on a polluted Indian lake (2), as well as in a paper on sewage treatment plants (which is in press and should be coming out any day now). We hope that the tool will find many more uses in other projects in the future!

References

  1. Hammarén R, Pal C, Bengtsson-Palme J: FARAO: The Flexible All-Round Annotation Organizer. Bioinformatics, advance access (2016). doi: 10.1093/bioinformatics/btw499 [Paper link]
  2. Bengtsson-Palme J, Boulund F, Fick J, Kristiansson E, Larsson DGJ: Shotgun metagenomics reveals a wide array of antibiotic resistance genes and mobile elements in a polluted lake in India. Frontiers in Microbiology, 5, 648 (2014). doi: 10.3389/fmicb.2014.00648 [Paper link]