Many biological processes are mediated by dynamic interactions between and among proteins. In order to interact, two proteins must co-occur spatially and temporally. As protein-protein interactions (PPIs) and subcellular location (SCL) are discovered via separate empirical approaches, PPI and SCL annotations are independent and might complement each other in helping us to understand the role of individual proteins in cellular networks. We expect reliable PPI annotations to show that proteins interacting in vivo are co-located in the same cellular compartment. Our goal here is to evaluate the potential of using PPI annotation in determining SCL of proteins in human, mouse, fly and yeast, and to identify and quantify the factors that contribute to this complementarity.
Using publicly available data, we evaluate the hypothesis that interacting proteins must be co-located within the same subcellular compartment. Based on a large, manually curated PPI dataset, we demonstrate that a substantial proportion of interacting proteins are in fact co-located. We develop an approach to predict the SCL of a protein based on the SCL of its interaction partners, given sufficient confidence in the interaction itself. The frequency of false positive PPIs can be reduced by use of six lines of supporting evidence, three based on type of recorded evidence (empirical approach, multiplicity of databases, and multiplicity of literature citations) and three based on type of biological evidence (inferred biological process, domain-domain interactions, and orthology relationships), with biological evidence more-effective than recorded evidence. Our approach performs better than four existing prediction methods in identifying the SCL of membrane proteins, and as well as or better for soluble proteins.
Understanding cellular systems requires knowledge of the SCL of interacting proteins. We show how PPI data can be used more effectively to yield reliable SCL predictions for both soluble and membrane proteins. Scope exists for further improvement in our understanding of cellular function through consideration of the biological context of molecular interactions.