The genome of the human blood fluke Schistosoma japonicum was explored for dispersed repeat sequences, or mobile genetic elements (MGEs). Using expressed sequence tags encoding reverse transcriptase as probes, size-selected S. japonicum genomic DNA libraries were screened for the presence of different types of retrotransposons. Four families of class I transposable elements were characterised, including an Smα-like retroposon termed Sjα, a long terminal repeat (LTR) retrotransposon named Gulliver, and two families of non-LTR retrotransposons designated pido and SjR2.
Smα is a short interspersed element (SINE)-like retroposon that occurs in high copy number in the Schistosoma mansoni genome. Smα includes the hallmark features of SINE-like elements, including a promoter region for RNA polymerase III, an AT-rich stretch at its 3' terminus, a short length of 500 bp or less, and short direct repeat sequences flanking the insertion site. Interestingly, the sequence of Smα also encodes an active ribozyme bearing a hammerhead domain. Other authors recently stated that Smα-like elements were absent from the genome of S. japonicum, but in this study a family of Smα-like retroposons was identified in the S. japonicum genome and the elements were named the Sjα family. Like Smα, Sjα elements are SINE-like in structure and sequence, are present at about 10,000 copies per haploid genome, and contain an ostensibly functional hammerhead ribozyme motif. The presence of these elements in all species of Schistosoma so far examined suggests that the hammerhead domain was acquired by vertical transmission from a common schistosome ancestor.
The consensus sequence of Gulliver, an LTR retrotransposon, was 4,788 bp, and the element was flanked at its 5' and 3' ends by LTRs of 259 bp. Each LTR included RNA polymerase II promoter sequences, a CAAT signal and a TATA box. Gulliver exhibited features characteristic of a functional LTR retrotransposon, including two read-through (termination) open reading frames (ORFs) encoding retroviral gag and pol proteins. The gag ORF encoded motifs conserved in nucleic acid binding proteins, while the pol ORF encoded conserved domains of aspartic protease, reverse transcriptase (RT), RNaseH and integrase, in that order, a pol pattern conserved in the Gypsy lineage. Phylogenetic analysis revealed that Gulliver is related to the mag family from Bombyx mori. Gulliver was present at between 100 and 1,000 copies per haploid genome, and Southern blotting of S. mansoni genomic DNA indicated that a similar element was also present in the genome of the African schistosome.
pido and SjR2 are non-LTR retrotransposons. A consensus sequence of 3,564 bp of the truncated pido element was characterised. The sequence encoded part of the first ORF, the entire second ORF and, at its 3' terminus, a tandemly repetitive, A-rich (TA6TA5TA8) tail. ORF1 of pido encoded a nucleic acid binding protein and ORF2 encoded a polyprotein that included apurinic/apyrimidinic (AP) endonuclease (EN) and RT domains, in that order. pido did not appear to have a tight target site specificity. At least 1,000 partial copies of pido are dispersed throughout the genome. The SjR2 consensus is 3,921 bp in size and comprises a single ORF encoding a polyprotein with AP-EN and RT domains. The ORF is flanked by 5' and 3' untranslated regions (UTRs) and bears a short repeat, (TGAC)3, at its 3' terminus. It was estimated that ~10,000 copies of SjR2 were dispersed throughout the genome.
Transcripts encoding the RT domains of Gulliver, pido and SjR2 were detected by RT-PCR in larval and adult stages of S. japonicum, indicating that (at least) the RT domains of these elements are transcribed. Phylogenetic analyses of pido and SjR2 show that pido belongs to the CR1 family while SjR2 is from the RTE lineage. This suggests that non-LTR elements of schistosomes are more likely to group by their structure than by their host species, but vertical transmission also occurs in this family of elements. Taken together, these four MGEs account for about 20% of the S. japonicum genome.
Exploration of the coding and non-coding regions of SjR2 revealed two notable characteristics. First, recombinant RT and EN domains of SjR2 expressed in isolation in insect cells both primed reverse transcription of SjR2 mRNA in vitro, the latter activity representing a hitherto unknown attribute of retrotransposon-encoded EN. Second, the 5' UTR of SjR2 was >80% identical to the 3' UTR of a schistosome heat shock protein-70 (HSP-70) gene in the antisense orientation. Whereas this was an unexpected finding in a non-LTR retrotransposon, human HSP-70 is known to be able to assist human AP-EN to cleave substrate DNA.