In 1977 it was discovered that the genes of eukaryotes contain introns: intervening sequences that are removed from the RNA transcript shortly after transcription. The work presented in this thesis contributes to the understanding of introns in two ways: through characterisation of intron data sets from various model organisms, and through computational identification and analysis of patterns of gene splicing.
Through the construction of gene data sets for eukaryotic organisms, extracted from publicly available data, it has been possible to study the overall characteristics of eukaryotic gene structures. This has led to the recognition that these characteristics are profoundly affected by regional base composition properties.
The split genes of eukaryotes allow for the generation of multiple gene products from a single gene, through the adoption of alternative patterns of gene splicing. Although the possibility of alternative splicing was recognised with the discovery of introns, and examples were found shortly afterwards, only recently has sufficient gene and transcript sequence data been available to allow for computational analysis. The methods employed for the identification of patterns of gene splicing, and details of the characterisation and analysis of the resultant data sets, are described in this thesis.