Computational methods to define the endosomal proteome

Josefine Sprenger (2010). Computational methods to define the endosomal proteome. PhD Thesis, Institute for Molecular Bioscience, The University of Queensland.

Author Josefine Sprenger
Thesis Title Computational methods to define the endosomal proteome
School, Centre or Institute Institute for Molecular Bioscience
Institution The University of Queensland
Publication date 2010-04
Thesis type PhD Thesis
Supervisor Dr Rohan D Teasdale
Dr Nicholas Hamilton
Dr Jenny Stow
Total pages 150
Total colour pages 25
Total black and white pages 125
Subjects 06 Biological Sciences
Abstract/Summary Endocytosis is one of the most dynamic processes in a mammalian cell. Besides regulating the protein and lipid composition of cell and vesicle membranes, it is also responsible for signaling and trafficking processes such as protein and pathogen degradation and the sorting of proteins and other molecules. The protein machinery regulating the necessary vesicle fission, fusion and transport has been partly investigated; however, a number of observed processes have not yet been explained, and the proteins involved are still unknown or have not been examined in relation to endocytosis. This thesis investigates methods to determine and understand a potential endosomal proteome. The endosomal proteome is the collection of proteins that associate with the intracellular organelles of the endosomal system. This includes the early endosome, late endosome, recycling endosome and lysosome subcellular compartments, as well as the membrane transport vesicles generated by or directed to these compartments. I focus on computational methods, specifically bioinformatics methods that have already been published. Integrating several methods combines their individual strengths and provides enough evidence to assign a protein to the endosomal proteome. The mammalian protein subcellular localization database LOCATE has grown substantially over recent years. Not only has the human proteome been added, but also computational subcellular location prediction methods, automated image analysis and third-party submission of literature data. LOCATE contains a manually reviewed set of endosomal proteins supported by peer-reviewed literature evidence. Computational tools allow subcellular location prediction for a complete proteome in a high-throughput approach. I reviewed the five leading prediction tools and tested them on an independent data set. This evaluation revealed individual strengths and weaknesses, and also highlighted the need for high-quality data sets.
The underrepresentation of minor organelles such as the Golgi apparatus and the endoplasmic reticulum results in misleading information. The attempt to use computational prediction methods to determine endosomal proteins was abandoned, as none of the methods predicts this cellular compartment reliably. I describe a new method to establish the endosomal proteome using computational approaches only. Starting with the confident data set of endosomal proteins from LOCATE and applying text mining, similarity, homology and affiliation to known protein complexes resulted in a potential data set of 645 proteins, labeled Endo645. Incorporating relational networks and their properties yielded new insights into Endo645, but also showed the effect of predicted annotations such as those from Gene Ontology. No certainty can be attached to Gene Ontology annotations, as the majority have not been manually curated. The different extensions of the network in Endo645 resulted in a variety of protein-protein interaction networks with low informative value. The list of proteins in Endo645 now serves as a foundation for the computationally defined endosomal proteome. To evaluate whether Endo645 represents the endosomal proteome, additional computational analyses were conducted. Gene expression data from tissues and cells of the immune system were integrated and mapped onto the Endo645 protein-protein interaction network. A major finding was the significant difference between the gene expression levels of Endo645 genes and the average expression levels of the whole mouse genome. I show different ways to compare gene expression data between individual genes and within protein complexes in order to identify highly correlated profiles. In conclusion, this work focused initially on the update of LOCATE and the evaluation of computational prediction tools, with the aim of sourcing information on endosomal proteins.
The outcome led me to set up my own methods and analyses to establish a potential endosomal proteome. The final data set of 645 proteins has the potential to represent the endosomal proteome. Endo645 provides the foundation for further analysis, both computational and experimental.
Keyword Subcellular location prediction, endosomal system, organelle proteome, LOCATE database
Additional Notes 16, 21, 55-56, 59, 65, 69, 71, 73, 80, 89-90, 103, 108, 111, 113-118, 120-123

Created: Wed, 29 Sep 2010, 07:49:51 EST by Ms Josefine Sprenger on behalf of Library - Information Access Service