Wheat is an extremely important crop species for Australia and the rest of the world in both economic and social terms. Population growth, disease, and climate-related pressures require improvements to this crop if it is to remain a secure food source into the future. Experience with the first sequenced plant genomes has demonstrated the utility of genome sequence data, not just in scientific terms by extending understanding of plant biology and evolution, but also in the same economic and social terms under which these plants are deemed important. The polyploid complexity of wheat hinders the determination of its genome sequence using the techniques that have been more readily applicable to plants whose genomes have already been sequenced. Second-generation sequencing technologies are accelerating genome sequencing efforts in a number of crops but come with major computational challenges. While a number of bioinformatics tools have been established to align and assemble second-generation sequencing data, specialised thinking is required to apply these technologies appropriately to polyploid genomes. A number of factors are of great importance in sequencing complex genomes, and while it should be both feasible and valuable to leverage second-generation sequencing technologies, the wheat community has been slow to do so.
This thesis describes a new approach to sequencing the wheat genome that utilises second-generation sequencing technologies and syntenic relationships within the grasses to produce gene-based genomic scaffolds of wheat. It also demonstrates the application of this approach to extend wheat crop improvement and our understanding of wheat genome evolution. The approach was initially developed, applied, and validated using second-generation sequencing data from isolated wheat chromosome arm 7DS, demonstrating its capacity to assemble all or nearly all wheat genes. It was subsequently applied to chromosome arm 7BS to delimit a previously identified translocation to within a few genes and to predict a total gene count in wheat of ~77,000 genes. Finally, the approach was applied to all of the wheat group 7 chromosomes, providing the basis for the first assembly comparison of wheat’s subgenomes, which identified dispersion as one of the key factors that have driven genome fractionation in the recent evolution of the hexaploid wheat genome. The syntenic builds produced by this approach have been made publicly available through a user-friendly resource that simplifies access to functional information. The publicly accessible data provide a powerful resource to support wheat crop research and improvement, serving as a template for varietal polymorphism prediction and transcriptomic analysis.
While a number of challenges exist for polyploid crop improvement, the syntenic build approach provides a strong basis upon which not only to conduct crop improvement research but also to investigate polyploid genome evolution. The methodology can be further improved and extended in a number of ways, and will be a valuable approach to assist both wheat improvement and future genome sequencing efforts.