The development and application of methods and tools for the assembly and analysis of second generation sequence data

Michael Imelfort (2011). The development and application of methods and tools for the assembly and analysis of second generation sequence data. PhD Thesis, School of Agriculture and Food Sciences, The University of Queensland.

Attached Files
Name Description MIMEType Size Downloads
s41159479_PhD_finalabstract.pdf Phd Abstract application/pdf 42.82KB 10
s41159479_PhD_finalthesis.pdf Phd thesis application/pdf 17.79MB 38
Author Michael Imelfort
Thesis Title The development and application of methods and tools for the assembly and analysis of second generation sequence data
School, Centre or Institute School of Agriculture and Food Sciences
Institution The University of Queensland
Publication date 2011-08
Thesis type PhD Thesis
Total pages 339
Total colour pages 11
Total black and white pages 328
Language eng
Subjects 060102 Bioinformatics
080301 Bioinformatics Software
080109 Pattern Recognition and Data Mining
Abstract/Summary Any modern approach for developing a thorough understanding of any particular organism or group of organisms will at some stage involve determining all or part of their corresponding DNA or RNA sequences. DNA sequencing is commonly used to gain insight into a wide array of biological processes. Improvements in technologies and processes employed to gather information about the biological world have led to the accumulation of enormous amounts of data which must be filtered, sorted and studied; a task beyond the capabilities of the human mind alone. Increasingly, the domain of biology has become fused with the domains of information technology and mathematics. Computational systems designed to shift the burden of data processing away from scientists have evolved into systems of such complexity as to become areas of study in their own right. This thesis describes the design and implementation of a number of sequence-based bioinformatics analyses and tools, and their applications in the fields of genomics and plant genome research. Almost all of the tools described here have been designed to work exclusively with data produced using second generation sequencing (2GS) technologies. Included in this thesis is a description of a novel 2GS de novo assembly algorithm called SaSSY. To demonstrate how SaSSY is being applied in current research, a selection of projects the Author is involved with that have either used, or are currently using, SaSSY are also described. These include the coral genome sequencing project, two comparative genomics projects involving the de novo assembly of BAC sequences from Secale cereale (rye) and Brassica rapa (rapeseed), and a project that aims to compare differences between different mitochondrial and chloroplast sequences in a variety of legumes.
Also presented are summaries of the Author's role in the development of three bioinformatics software packages: autoSNPdb, a web-based SNP detection and visualisation application; TagDB, a web-based short read mapping and visualisation application; and BGA, an annotation pipeline developed primarily for annotating plant-derived BAC and cDNA sequences. 2GS technologies have significantly influenced the direction, scope and perceived limitations of biological research as a whole and have particularly influenced the area of bioinformatics. It is becoming increasingly apparent that further revolutions in sequencing technology are expected to occur in the very near future, indicating that research in this area will continue to grow at an ever-increasing pace.
Keyword Bioinformatics
DNA sequence assembly
Additional Notes Colour pages: 21,22,30,48,93,96,101,185,187,270,276

Created: Mon, 12 Mar 2012, 13:33:58 EST by Mr Michael Imelfort on behalf of Library - Information Access Service