Microbiology, Metagenomics and Bioinformatics

Johan Bengtsson-Palme, University of Gothenburg | Wisconsin Institute for Discovery


In an interesting development, Nature Publishing Group has launched a new initiative: Scientific Data, an online-only open access journal that publishes data sets without requiring that scientific hypotheses be tested in connection with the data. That is, the data itself is seen as the valuable product, not any findings that might result from it. There is an immediate upside to this: large scientific data sets might become accessible to the research community in a way that gives proper credit for the sample collection effort. Since there is no demand for a full analysis of the data, the data might be of use to others more quickly, without the authors having to worry that someone else will steal the impact of the data itself. I also see a possible downside, though. It would be easy to hold on to the data until you have analyzed it yourself, and then release it separately at about the time you submit the paper on the analysis, generating extra papers and citation counts. I don’t know if this is necessarily bad, but it seems it could contribute to “publishing unit dilution”. Nevertheless, I believe that this is a good initiative overall, although how well it actually works will be up to us, the scientific community. Some info copied from the journal website:

Scientific Data’s main article-type is the Data Descriptor: peer-reviewed, scientific publications that provide an in-depth look at research datasets. Data Descriptors are a combination of traditional scientific publication content and structured information curated in-house, and are designed to maximize reuse and enable searching, linking and data mining. (…) Scientific Data aims to address the increasing need to make research data more available, citable, discoverable, interpretable, reusable and reproducible. We understand that wider data-sharing requires credit mechanisms that reward scientists for releasing their data, and peer evaluation mechanisms that account for data quality and ensure alignment with community standards.

It seems that our paper on the recently launched database of resistance genes against antibacterial biocides and metals (BacMet) has gone online as an advance access paper in Nucleic Acids Research today. Chandan Pal, the first author of the paper and one of my close colleagues as well as my roommate at work, has done a tremendous job taking the database from a list of genes and references to a full-fledged browsable and searchable database with a really nice interface. I have contributed throughout the process, and wrote the lion’s share of the code for the BacMet-Scan tool, which can be downloaded along with the database files.

BacMet is a curated source of bacterial resistance genes against antibacterial biocides and metals. Every gene entry included is backed by at least one experimentally confirmed resistance gene, with references to the scientific literature. In addition, we have made a homology-based prediction of genes that are likely to share the same resistance function (the BacMet predicted dataset). We believe that the BacMet database will make it possible to better understand co- and cross-resistance of biocides and metals to antibiotics within bacterial genomes and in complex microbial communities from different environments.
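
To give a feel for what a homology-based prediction can look like in general, here is a minimal Python sketch that filters sequence-similarity hits against a set of confirmed genes. It is not the procedure used to build the BacMet predicted dataset: the function names, the thresholds (85% identity, 100 aligned positions, E-value 1e-10) and the example hits are all assumptions made up for illustration. The only thing taken as given is the standard 12-column BLAST tabular layout (query, subject, percent identity, alignment length, ..., E-value, bit score).

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One similarity hit between a candidate gene and a confirmed gene."""
    query: str       # candidate sequence ID
    subject: str     # confirmed resistance gene ID
    identity: float  # percent identity over the alignment
    length: int      # alignment length
    evalue: float    # expectation value of the hit


def parse_blast_tabular(lines: Iterable[str]) -> List[Hit]:
    """Parse lines in the standard 12-column BLAST tabular format."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits.append(Hit(fields[0], fields[1], float(fields[2]),
                        int(fields[3]), float(fields[10])))
    return hits


def predicted_homologs(hits: Iterable[Hit],
                       min_identity: float = 85.0,   # hypothetical cutoff
                       min_length: int = 100,        # hypothetical cutoff
                       max_evalue: float = 1e-10     # hypothetical cutoff
                       ) -> List[Hit]:
    """Keep hits similar enough to a confirmed gene to be called predicted."""
    return [h for h in hits
            if h.identity >= min_identity
            and h.length >= min_length
            and h.evalue <= max_evalue]


# Made-up example hits, tab-separated as a similarity search would emit them.
example_lines = [
    "cand1\tconfirmedA\t92.5\t310\t23\t0\t1\t310\t1\t310\t1e-150\t540",
    "cand2\tconfirmedA\t41.0\t120\t70\t1\t5\t124\t9\t128\t2e-08\t55.1",
]

all_hits = parse_blast_tabular(example_lines)
kept = predicted_homologs(all_hits)
for hit in kept:
    print(hit.query, "is a predicted homolog of", hit.subject)
```

With these invented cutoffs only the first candidate passes; the second falls below the identity threshold. Where such cutoffs are drawn is a trade-off between missing true resistance genes and including genes with a different function, which is why a predicted set is kept apart from the experimentally confirmed one.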

The database can be easily accessed here: http://bacmet.biomedicine.gu.se. If you use the database in scientific work, please cite the following paper, which recently appeared in Nucleic Acids Research:

Pal C, Bengtsson-Palme J, Rensing C, Kristiansson E, Larsson DGJ: BacMet: Antibacterial Biocide and Metal Resistance Genes Database. Nucleic Acids Research, Database issue, advance access. doi: 10.1093/nar/gkt1252