Finding novel genes in bacterial communities isolated from the environment

Krause, Lutz, Diaz, Naryttza N., Bartels, Daniela, Edwards, Robert A., Puehler, Alfred, Rohwer, Forest, Meyer, Folker and Stoye, Jens (2006). Finding novel genes in bacterial communities isolated from the environment. In: 14th Conference on Intelligent Systems for Molecular Biology, Fortaleza, Brazil, (E281-E289). Aug 06-10, 2006. doi:10.1093/bioinformatics/btl247

Author Krause, Lutz
Diaz, Naryttza N.
Bartels, Daniela
Edwards, Robert A.
Puehler, Alfred
Rohwer, Forest
Meyer, Folker
Stoye, Jens
Title of paper Finding novel genes in bacterial communities isolated from the environment
Conference name 14th Conference on Intelligent Systems for Molecular Biology
Conference location Fortaleza, Brazil
Conference dates Aug 06-10, 2006
Journal name Bioinformatics
Publication Year 2006
Sub-type Fully published paper
DOI 10.1093/bioinformatics/btl247
ISSN 1367-4803
Volume 22
Issue 14
Start page E281
End page E289
Total pages 9
Language eng
Abstract/Summary
Motivation: Novel sequencing techniques can give access to organisms that are difficult to cultivate using conventional methods. When applied to environmental samples, the data generated has some drawbacks, e.g. short length of assembled contigs, in-frame stop codons and frame shifts. Unfortunately, current gene finders cannot circumvent these difficulties. At the same time, the automated prediction of genes is a prerequisite for the increasing amount of genomic sequences to ensure progress in metagenomics.
Results: We introduce a novel gene finding algorithm that incorporates features overcoming the short length of the assembled contigs from environmental data, in-frame stop codons as well as frame shifts contained in bacterial sequences. The results show that by searching for sequence similarities in an environmental sample our algorithm is capable of detecting a high fraction of its gene content, depending on the species composition and the overall size of the sample. The method is valuable for hunting novel unknown genes that may be specific for the habitat where the sample is taken. Finally, we show that our algorithm can even exploit the limited information contained in the short reads generated by 454 technology for the prediction of protein coding genes.
Subjects 1308 Clinical Biochemistry
1706 Computer Science Applications
1703 Computational Theory and Mathematics
Q-Index Code E1
Q-Index Status Provisional Code
Institutional Status Unknown

Document type: Conference Paper
Collections: ResearcherID Downloads
Scopus Import
Citation counts: Thomson Reuters Web of Science: cited 35 times
Scopus: cited 51 times
Created: Fri, 30 Oct 2015, 19:17:52 EST by System User