An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea

McDonald, Daniel, Price, Morgan N., Goodrich, Julia, Nawrocki, Eric P., DeSantis, Todd Z., Probst, Alexander, Andersen, Gary L., Knight, Rob and Hugenholtz, Philip (2012) An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. ISME Journal, 6(3): 610-618. doi:10.1038/ismej.2011.139

Author McDonald, Daniel
Price, Morgan N.
Goodrich, Julia
Nawrocki, Eric P.
DeSantis, Todd Z.
Probst, Alexander
Andersen, Gary L.
Knight, Rob
Hugenholtz, Philip
Title An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea
Journal name ISME Journal
ISSN 1751-7362
Publication date 2012-03
Year available 2011
Sub-type Article (original research)
DOI 10.1038/ismej.2011.139
Volume 6
Issue 3
Start page 610
End page 618
Total pages 9
Place of publication New York, United States
Publisher Nature Publishing Group
Collection year 2012
Language eng
Formatted abstract
Reference phylogenies are crucial for providing a taxonomic framework for interpretation of marker gene and metagenomic surveys, which continue to reveal novel species at a remarkable rate. Greengenes is a dedicated full-length 16S rRNA gene database that provides users with a curated taxonomy based on de novo tree inference. We developed a ‘taxonomy to tree’ approach for transferring group names from an existing taxonomy to a tree topology, and used it to apply the Greengenes, National Center for Biotechnology Information (NCBI) and cyanoDB (Cyanobacteria only) taxonomies to a de novo tree comprising 408 315 sequences. We also incorporated explicit rank information provided by the NCBI taxonomy to group names (by prefixing rank designations) for better user orientation and classification consistency. The resulting merged taxonomy improved the classification of 75% of the sequences by one or more ranks relative to the original NCBI taxonomy with the most pronounced improvements occurring in under-classified environmental sequences. We also assessed candidate phyla (divisions) currently defined by NCBI and present recommendations for consolidation of 34 redundantly named groups. All intermediate results from the pipeline, which includes tree inference, jackknifing and transfer of a donor taxonomy to a recipient tree (tax2tree) are available for download. The improved Greengenes taxonomy should provide important infrastructure for a wide range of megasequencing projects studying ecosystems on scales ranging from our own bodies (the Human Microbiome Project) to the entire planet (the Earth Microbiome Project). The implementation of the software can be obtained from
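The rank-prefixing step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the seven-rank prefix convention (`k__` through `s__`) seen in Greengenes-style taxonomy strings; the abstract itself does not spell out the prefixes. The function names `prefix_lineage` and `classified_depth` are hypothetical and are not part of tax2tree.

```python
# Assumed rank prefixes, ordered from kingdom down to species.
# The abstract only says that rank designations are prefixed to group names;
# the exact strings below are an assumption for illustration.
RANK_PREFIXES = ["k__", "p__", "c__", "o__", "f__", "g__", "s__"]


def prefix_lineage(names):
    """Build a lineage string with an explicit rank prefix on every rank.

    `names` is a list of group names ordered from highest to lowest rank.
    Ranks without a name are emitted as a bare prefix (e.g. "g__").
    """
    if len(names) > len(RANK_PREFIXES):
        raise ValueError("more names than supported ranks")
    padded = list(names) + [""] * (len(RANK_PREFIXES) - len(names))
    return "; ".join(prefix + name for prefix, name in zip(RANK_PREFIXES, padded))


def classified_depth(lineage):
    """Count how many ranks in a prefixed lineage string carry a name."""
    # Every prefix is three characters long, so a longer field has a name.
    return sum(1 for field in lineage.split("; ") if len(field) > 3)


if __name__ == "__main__":
    before = prefix_lineage(["Bacteria", "Proteobacteria"])
    after = prefix_lineage(["Bacteria", "Proteobacteria", "Gammaproteobacteria"])
    print(before)
    print(after)
    print(classified_depth(after) - classified_depth(before))
```

Comparing `classified_depth` for two lineage strings of the same sequence gives one simple way to express an improvement "by one or more ranks"; the paper's own comparison procedure may differ.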
Keyword Evolution
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ
Additional Notes Published online 1 December 2011

Document type: Journal Article
Sub-type: Article (original research)
Collections: Official 2012 Collection
School of Chemistry and Molecular Biosciences
Citation counts: Cited 542 times in Thomson Reuters Web of Science
Cited 533 times in Scopus
Access Statistics: 86 Abstract Views
Created: Thu, 15 Mar 2012, 09:40:42 EST by Lucy O'Brien on behalf of School of Chemistry & Molecular Biosciences