Improved similarity scores for comparing motifs

Tanaka, Emi, Bailey, Timothy L., Grant, Charles E., Noble, William Stafford and Keich, Uri (2011) Improved similarity scores for comparing motifs. Bioinformatics, 27(12): 1603-1609. doi:10.1093/bioinformatics/btr257


Author Tanaka, Emi
Bailey, Timothy L.
Grant, Charles E.
Noble, William Stafford
Keich, Uri
Title Improved similarity scores for comparing motifs
Journal name Bioinformatics
ISSN 1367-4803
1460-2059
Publication date 2011-06-15
Sub-type Article (original research)
DOI 10.1093/bioinformatics/btr257
Volume 27
Issue 12
Start page 1603
End page 1609
Total pages 7
Place of publication Oxford, United Kingdom
Publisher Oxford University Press
Collection year 2012
Language eng
Formatted abstract
Motivation:
A question that often comes up after applying a motif finder to a set of co-regulated DNA sequences is whether the reported putative motif is similar to any known motif. While several tools have been designed for this task, Habib et al. pointed out that the scores that are commonly used for measuring similarity between motifs do not distinguish between a good alignment of two informative columns (say, all-A) and one of two uninformative columns. This observation explains why tools such as Tomtom occasionally return an alignment of uninformative columns which is clearly spurious. To address this problem, Habib et al. suggested a new score [Bayesian Likelihood 2-Component (BLiC)] which uses a Bayesian information criterion to penalize matches that are also similar to the background distribution.
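The observation about informative and uninformative columns can be illustrated with a minimal sketch. It assumes Euclidean distance as a stand-in for a column comparison score and a uniform DNA background; the function names are illustrative and are not taken from Tomtom or from the paper.

```python
import math

def euclidean_distance(p, q):
    """Distance between two motif columns (0.0 means a perfect match)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def information_content(p, background=(0.25, 0.25, 0.25, 0.25)):
    """Relative entropy, in bits, of a column against the background."""
    return sum(a * math.log2(a / b) for a, b in zip(p, background) if a > 0)

# Columns are probabilities over (A, C, G, T).
all_a = (1.0, 0.0, 0.0, 0.0)        # informative column: always A
uniform = (0.25, 0.25, 0.25, 0.25)  # uninformative column: equals the background

# Both self-alignments receive the same, best possible distance...
score_informative = euclidean_distance(all_a, all_a)
score_uninformative = euclidean_distance(uniform, uniform)

# ...even though only the all-A column carries any information.
ic_informative = information_content(all_a)
ic_uninformative = information_content(uniform)

print(score_informative, score_uninformative)
print(ic_informative, ic_uninformative)
```

Under these assumptions the two alignments are indistinguishable by the distance alone, while the information content separates them. This is the kind of case that a background-aware adjustment is meant to handle.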

Results:
We show that the BLiC score exhibits other, highly undesirable properties, and we offer instead a general approach to adjust any motif similarity score so as to reduce the number of reported spurious alignments of uninformative columns. We implement our method in Tomtom and show that, without significantly compromising Tomtom’s retrieval accuracy or its runtime, we can drastically reduce the number of uninformative alignments.
Q-Index Code C1
Q-Index Status Confirmed Code
Institutional Status UQ

Document type: Journal Article
Sub-type: Article (original research)
Collections: Official 2012 Collection
School of Chemistry and Molecular Biosciences
Institute for Molecular Bioscience - Publications
 
Citation counts: Cited 22 times in Thomson Reuters Web of Science
Cited 21 times in Scopus
Created: Mon, 24 Oct 2011, 17:50:03 EST by Dr Timothy Bailey on behalf of School of Chemistry & Molecular Biosciences