The ascomycete fungus Leptosphaeria maculans is a major pathogen of Brassica species, particularly canola (Brassica napus; rapeseed; oilseed rape). It is the causal agent of phoma stem canker, commonly referred to as blackleg, and the primary cause of canola crop losses in Australia. In late stages of infection, the fungus spreads through the stem vasculature, causing lesions that lead to poor growth, lodging and eventually plant death. L. maculans is found in canola-growing regions worldwide, including Australia, Canada and Europe, and increased canola production in these regions has led to a rise in the severity of the disease. In Australia alone, L. maculans infection is responsible for an estimated A$100 million in crop losses each year, with average losses ranging from 15 % to 48 %, and significant efforts are underway to improve resistance to this disease.
Understanding the characteristics of L. maculans is vital for developing an effective and sustainable approach to the management of blackleg disease on Brassica species. The completion of the L. maculans genome sequence was a significant development in the study of this fungal pathogen, as it provides a reference genome to which molecular markers can be physically mapped. Such mapping has proven highly useful in other plant pathogens with sequenced genomes, such as the wheat pathogen Parastagonospora nodorum and the cereal pathogen Fusarium graminearum. Importantly, a reference genome also allows the mapping of whole genome re-sequencing data, which is becoming a high-throughput, cost-effective method for studying genome-wide diversity, particularly for the relatively small, lower-complexity genomes of many fungal species. By re-sequencing the genomes of different L. maculans isolates, variations in genome sequence and structure can be elucidated.
Advances in genome sequencing technologies have revolutionised plant and fungal genomics. They have made genome sequencing, re-sequencing and Single Nucleotide Polymorphism (SNP) discovery highly accessible, high-throughput and cost-effective. The process of whole genome re-sequencing involves aligning millions of short sequence reads to a reference genome sequence. Once this has been achieved, it is possible to identify genetic variation between individuals, which can be linked to variation in phenotype to provide molecular genetic markers and insights into gene function. Sequence variation can have a major impact on how an organism develops and responds to the environment.
This thesis describes several approaches to elucidating genome structure and variation, including SNPs and presence/absence variations (PAVs), across a number of L. maculans isolates.
Initially, two L. maculans isolates were re-sequenced, leading to the identification of 21,814 SNPs. I demonstrated the application of a novel SNP calling method, SGSautoSNP, and its robustness and sensitivity in identifying polymorphisms in L. maculans. I described the use of these SNPs for phylogenetic analysis and for genome analysis, including the examination of SNP properties and density in relation to genomic position and predicted function. This method correctly predicted polymorphisms in AvrLm genes, which are important in the pathogen’s interaction with its host plant. Whole-genome polymorphic trends, such as genome-wide SNP density and transition/transversion ratios, were also determined with this approach. The SNPs from this study were subsequently applied to the genotyping of 59 L. maculans isolates from around Australia in a separate study conducted within our group (Patel et al., 2015).
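As an illustration of the transition/transversion ratio mentioned above, the following is a minimal sketch of how such a ratio can be computed from a list of SNP allele pairs. It is not the SGSautoSNP implementation; the function names and the example allele pairs are hypothetical and serve only to show the definition (transitions are purine-to-purine or pyrimidine-to-pyrimidine changes, and all other substitutions are transversions).

```python
# Minimal sketch (assumption: SNPs are given as pairs of single-base alleles).
# Function names and example data are hypothetical, not taken from SGSautoSNP.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
VALID_BASES = PURINES | PYRIMIDINES


def is_transition(allele_a, allele_b):
    """Return True if the substitution stays within purines or within pyrimidines."""
    pair = {allele_a.upper(), allele_b.upper()}
    if len(pair) != 2 or not pair <= VALID_BASES:
        raise ValueError(f"not a valid SNP allele pair: {allele_a!r}, {allele_b!r}")
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps):
    """Compute the transition/transversion ratio for an iterable of allele pairs."""
    transitions = 0
    transversions = 0
    for allele_a, allele_b in snps:
        if is_transition(allele_a, allele_b):
            transitions += 1
        else:
            transversions += 1
    if transversions == 0:
        raise ValueError("ratio undefined: no transversions observed")
    return transitions / transversions


# Hypothetical allele pairs: two transitions (A/G, C/T), two transversions (A/C, G/T).
example_snps = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]
example_ratio = ts_tv_ratio(example_snps)
```

In practice the allele pairs would be read from the SNP caller's output rather than listed by hand.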
Subsequently, a larger-scale SNP prediction was performed using ten L. maculans isolates with known avirulence (AvrLm) gene content, as determined by infection studies on Brassica species. Genome re-sequencing of these isolates yielded high genome coverage, ranging from 26× to 266×. This resulted in the identification of 47,097 SNPs, with an average of 1 SNP every 953 bp, which provides a greater resource than previous work for further study of L. maculans variation across individuals and populations. SNP properties and genomic positions were then analysed. Importantly, SGSautoSNP correctly predicted the mutations within AvrLm genes in these isolates, indicating that infection assays and computational approaches can complement each other and can be used together to identify novel infection-related genes.
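The average SNP density reported above can be related to the analysed sequence span by simple arithmetic: 47,097 SNPs at one SNP per 953 bp implies a span of roughly 44.9 Mb. The sketch below reproduces this calculation and shows one way SNP density could be tabulated along a contig in fixed-size windows. The function names, the window size and the example positions are hypothetical, and the span is inferred from the two reported figures rather than stated independently in this work.

```python
# Minimal sketch (assumptions: 1-based SNP positions, non-overlapping windows).
# Function names and the example positions are hypothetical.


def implied_span_bp(snp_count, bp_per_snp):
    """Sequence span implied by a SNP count and an average SNP spacing."""
    return snp_count * bp_per_snp


def snps_per_window(positions, contig_length, window_size):
    """Count SNPs in consecutive non-overlapping windows along one contig."""
    if contig_length <= 0 or window_size <= 0:
        raise ValueError("contig_length and window_size must be positive")
    n_windows = -(-contig_length // window_size)  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        if not 1 <= pos <= contig_length:
            raise ValueError(f"position {pos} lies outside the contig")
        counts[(pos - 1) // window_size] += 1
    return counts


# Figures reported for the ten-isolate analysis.
span = implied_span_bp(47_097, 953)  # 44,883,441 bp, i.e. about 44.9 Mb

# Hypothetical positions on a 25 bp toy contig, counted in 10 bp windows.
toy_counts = snps_per_window([1, 10, 11, 25], contig_length=25, window_size=10)
```

Windowed counts of this kind make it possible to see whether SNPs are spread evenly or concentrated in particular regions, which a single genome-wide average cannot show.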
Furthermore, prediction and analysis of presence/absence variations (PAVs) were undertaken in order to understand genome structure and variation within L. maculans. PAVs allow a better understanding of larger polymorphisms within this genome, which can be several hundred base pairs (bp) long, in contrast to the single-base nature of SNPs. I analysed the positions and occurrence of these variations, and their effect on both coding and non-coding regions of the genome, in the ten L. maculans isolates used in the SNP analysis. The results of these analyses indicate that a number of highly variable regions exist within the genome of L. maculans. This was particularly evident on SuperContig 13, where a number of secondary metabolite genes are located, some of which are involved in plant infection processes. Other genes of interest, such as genes involved in fatty acid metabolism and in antibiotic and antifungal resistance, were also shown to be affected by PAVs. This approach was also effective in identifying the presence or absence of avirulence genes known from previous studies to be present or lost, such as AvrLm1 and AvrLm6.
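One common way to call presence/absence variation from re-sequencing data is to examine read depth over each annotated region: a region with little or no read coverage in an isolate is a candidate for absence in that isolate. The sketch below illustrates this idea only; it is not necessarily the procedure used in this work, and the function name, the thresholds and the example depths are all hypothetical.

```python
# Minimal sketch of coverage-based presence/absence calling.
# Assumptions (hypothetical, not taken from this thesis): per-base read depths
# are available for each region, and a region is called "absent" when fewer
# than `min_covered_fraction` of its bases reach `min_depth`.


def call_presence_absence(region_depths, min_depth=1, min_covered_fraction=0.05):
    """Classify each region as 'present' or 'absent' from per-base read depths."""
    calls = {}
    for region, depths in region_depths.items():
        if not depths:
            raise ValueError(f"no depth values supplied for {region}")
        covered = sum(1 for depth in depths if depth >= min_depth)
        fraction = covered / len(depths)
        calls[region] = "present" if fraction >= min_covered_fraction else "absent"
    return calls


# Hypothetical per-base depths for two placeholder regions in one isolate.
example_depths = {
    "region_1": [30, 28, 31, 29, 27, 30, 32, 30, 29, 31],  # well covered
    "region_2": [0] * 40,                                   # no reads mapped
}
example_calls = call_presence_absence(example_depths)
```

The thresholds would need to be tuned to the sequencing depth of each isolate, since the isolates differ considerably in coverage.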
The aim of this project was to contribute to a greater understanding of Leptosphaeria maculans and its genomic characteristics, which in turn can support efforts to reduce the occurrence of blackleg disease in Australia and abroad. An improved understanding of this pathogen will aid in developing more resistant plant varieties, and thus improve both the yields and the long-term sustainability of the canola industry.