J. Eukaryot. Microbiol., 49(5), 2002 pp. 374–382 © 2002 by the Society of Protozoologists

Isolation and Expression of Two Genes Encoding Eukaryotic Release Factor 1 from Paramecium tetraurelia

STEPHANIE KERVESTIN,a OLIVIER A. GARNIER,b ANDREY L. KARAMYSHEV,c,1 KOICHI ITO,c YOSHIKAZU NAKAMURA,c ERIC MEYERb and OLIVIER JEAN-JEANa

aUnité de Biochimie Cellulaire, CNRS UMR 7098, Université Pierre et Marie Curie, 9 quai Saint-Bernard, Paris 75005, France, and bRégulation de l'Expression Génétique, CNRS UMR 8541, Ecole Normale Supérieure, 46, rue d'Ulm, Paris 75005, France, and cDepartment of Basic Medical Sciences, Institute of Medical Science, University of Tokyo, 4–6–1 Shirokanedai, Tokyo, Minato-ku 108-8630, Japan

ABSTRACT. Paramecium tetraurelia, like some other ciliate species, uses an alternative nuclear genetic code where UAA and UAG are translated as glutamine and UGA is the only stop codon. It has been postulated that the use of stop codons as sense codons is dependent on the presence of specific tRNAs and on modification of eukaryotic release factor one (eRF1), a factor involved in stop codon recognition during translation termination. We describe here the isolation and characterisation of two genes, eRF1-a and eRF1-b, coding for eRF1 in P. tetraurelia. The two genes are very similar, both in genomic organization and in sequence, and might result from a recent duplication event. The two coding sequences are 1,314 nucleotides long, and encode two putative proteins of 437 amino acids with 98.5% identity. Interestingly, when compared with the eRF1 sequences either of ciliates having the same variant genetic code, or of other eukaryotes, the eRF1 of P. tetraurelia exhibits significant differences in the N-terminal region, which is thought to interact with stop codons. We discuss here the consequences of these changes in the light of recent models proposed to explain the mechanism of stop codon recognition in eukaryotes. In addition, analysis of the expression of the two genes by Northern blotting and primer extension reveals that they are differentially expressed during vegetative growth and autogamy.

Key Words. Alternative genetic code, autogamy, differential gene expression, eRF1, macronuclear DNA, stop codons, transcription start sites, translation termination, vegetative growth.

Termination of protein synthesis occurs when the ribosome translation machinery encounters an in-frame stop codon on the mRNA. Hydrolysis of the ester bond linking the polypeptide chain and the last tRNA at the P site of the ribosome is triggered by the peptidyl transferase center of the ribosome and requires specific release factors (RFs) and GTP (Kisselev and Buckingham 2000). In eukaryotes, release factor one (eRF1) recognizes all three stop codons and promotes the activation of the peptidyl transferase center, leading to the release of the nascent polypeptide (Frolova et al. 1994), and release factor three (eRF3) is a GTPase that enhances eRF1 activity (Zhouravleva et al. 1995). Release factor eRF1 functionally acts as a tRNA and thus must bind to the ribosomal A site (Nakamura, Ito, and Ehrenberg 2000). The elucidation of the crystal structure of human eRF1 confirmed the structural similarities between eRF1 and tRNA (Song et al. 2000). Using the structural data, Song et al. (2000) designed a model that organized eRF1 into three functional domains: 1) the N-terminal Domain 1 involved in stop codon recognition; 2) the central Domain 2 involved in activation of the peptidyl transferase center; and 3) the carboxy-terminal Domain 3 involved in binding to eRF3 and other eRF1-interacting proteins. The genetic and biochemical data obtained recently in yeast tend to confirm that the amino-terminal portion of eRF1 is indeed involved in stop codon recognition (Bertram et al. 2000).

One of the remarkable features of some ciliate species is their use of alternative nuclear genetic codes. As far as we know, all the changes concern the reassignment of stop codons to sense codons. For example, the oligohymenophoreans Tetrahymena and Paramecium and the spirotrichs Stylonychia and Oxytricha translate UAA and UAG as glutamine, UGA being the only stop codon. Conversely, the spirotrich Euplotes translates UGA as cysteine and uses UAA and UAG as stop codons (Caron and Meyer 1985; Harper and Jahn 1989; Helftenbein 1985; Herrick et al. 1987; Horowitz and Gorovsky 1985; Meyer et al. 1991; Preer et al. 1985). The distribution of these changes on the

Corresponding Author: S. Kervestin. Telephone number: 33-1-44-27-22-99; FAX number: 33-1-44-27-22-15; E-mail: stephanie.kervestin@snv.jussieu.fr
1 Present address: College of Medicine, Texas A&M University System, Health Sciences Center, College Station, TX 77843-1114, USA.

ciliate phylogenetic tree constructed using 28S ribosomal RNA (rRNA) sequences suggested that they have occurred independently several times within this phylum (Baroin-Tourancheau et al. 1995).

It seems clear now that the use of stop codons as sense codons involves changes in tRNAs and in the ability of eRF1 to recognize stop codons. In Tetrahymena thermophila, two specific tRNA(Gln) isoacceptors implicated in the decoding of UAA and UAG have been isolated (Hanyu et al. 1986; Kuchino et al. 1985). In Euplotes octocarinatus, a tRNA(Cys) with a GCA anticodon was proposed to decode UGA in addition to the normal UGU and UGC cysteine codons (Grimm et al. 1998). Moreover, it has been shown that eRF1 from Euplotes aediculatus does not recognize the UGA stop codon, supporting the hypothesis that in all ciliates using variant genetic codes, eRF1 has lost its ability to recognize the reassigned stop codons (Kervestin et al. 2001).

Because the study of modifications of ciliate eRF1 could provide the opportunity to elucidate the mechanism of stop codon recognition in eukaryotes, substantial efforts were made recently to clone and sequence eRF1 genes from various ciliate species (Inagaki and Doolittle 2001; Karamyshev, Ito, and Nakamura 1999; Liang et al. 2001; Lozupone, Knight, and Landweber 2001). The phylogenetic analysis of these sequences and the comparisons with eRF1s from eukaryotes using canonical genetic codes have shown that ciliate eRF1s exhibit a high evolutionary rate, which is reflected in a high number of variable positions (Inagaki and Doolittle 2001; Moreira et al. 2002). We describe here the isolation and the genomic organization of two eRF1 genes from the Paramecium tetraurelia macronuclear genome. The genes encode deduced proteins of 437 amino acids with 98.5% identity and are differentially expressed during the different stages of the life cycle, probably by differential regulation of transcription.

MATERIALS AND METHODS

Materials.
The Paramecium tetraurelia genomic DNA library (4.5 × 10^7 pfu/ml) was constructed by inserting total DNA of the homozygous strain d4.2 digested with MboI into the BamHI site of λEMBL3 bacteriophage. The probe used for library screening was a 900-base-pair EcoRI restriction fragment of a derivative of pUC118 plasmid containing the 3′-part of the Tetrahymena thermophila eRF1 cDNA (Karamyshev, Ito, and Nakamura 1999). Oligonucleotides pa51 (5′-CAATGGAGACATTGACAGAAG-3′) and pa31 (5′-ATTTCTCAGATAGATGTTCAA-3′) were used for PCR amplification, and oligonucleotides rt502 (5′-GTGCCAGCAGTTCTTTCCTGG-3′) and rt1102 (5′-GTGCCAGCAGTTCTTTCTTGA-3′) were used for primer extension analysis.

Cell lines and culture. Paramecium tetraurelia stocks 51 and d4.2 are well-characterized homozygous stocks carrying the A51 and A29 alleles of the gene for surface antigen A, respectively (Sonneborn 1974). Stock 51–2A expresses mating type VIII and stock 51–2B expresses mating type VII. Cells were grown at 27 °C in a wheat grass powder (Pines Int'l. Co., Lawrence, Kansas) infusion medium bacterized the day before use with Klebsiella pneumoniae and supplemented with 0.8 µg/ml of β-sitosterol. Basic methods of cell culture have been described previously (Sonneborn 1974). Autogamy was induced by starvation and assessed by staining cells with a 20:1 (vol./vol.) mix of carmine red (0.5% in 45% acetic acid) and fast green (1% in ethanol).

Library screening, gene manipulation, DNA sequencing, and PCR amplification. Library screening and gene manipulations were carried out by standard procedures (Sambrook, Fritsch, and Maniatis 1989). Phage plaques transferred onto Hybond N+ filters (Amersham-Pharmacia Biotechnology, Little Chalfont, United Kingdom) were detected by hybridization with the T. thermophila eRF1 cDNA probe labeled with a random priming kit (Roche, Meylan, France). Hybridization under nonstringent conditions was carried out for 1 h at 60 °C followed by slow cooling to 30 °C and washing in 2× SSC (0.3 M NaCl, 30 mM sodium citrate)-0.1% sodium dodecyl sulfate (SDS) at 35 °C. Fragments of positive phage inserts selected after Southern blot analysis (see below) were cloned in pBluescript II SK+ plasmid (Stratagene). The entire nucleotide sequences of cloned fragments were determined on both strands by the "walk on DNA" strategy using the dideoxy chain termination method (Sanger, Nicklen, and Coulson 1977).
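As a quick consistency check on the two primer-extension oligonucleotides listed under Materials, the following sketch compares rt502 and rt1102 position by position. The sequences are copied from the text; the helper name `mismatch_positions` is ours and purely illustrative.

```python
# Primer-extension oligonucleotides as given in Materials (5' -> 3').
RT502 = "GTGCCAGCAGTTCTTTCCTGG"   # designed on the eRF1-a sequence
RT1102 = "GTGCCAGCAGTTCTTTCTTGA"  # designed on the eRF1-b sequence


def mismatch_positions(primer_a: str, primer_b: str) -> list[int]:
    """Return 1-based positions (from the 5' end) where two equal-length primers differ."""
    if len(primer_a) != len(primer_b):
        raise ValueError("primers must have the same length")
    return [
        pos
        for pos, (a, b) in enumerate(zip(primer_a, primer_b), start=1)
        if a != b
    ]


if __name__ == "__main__":
    print(len(RT502), mismatch_positions(RT502, RT1102))
```

The two 21-mers differ at positions 18 and 21, that is, within the last four nucleotides and at the 3′-terminal nucleotide itself.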
Sequencing reactions were carried out with the Sequenase Version 2.0 DNA sequencing kit (Amersham-Pharmacia Biotechnology). PCR amplifications of DNA were carried out in 25-µl reaction mixtures containing 100 ng of total genomic DNA or 1 ng of recombinant λ phage DNA, 100 pmol of each primer (pa51 and pa31), 200 µM of each deoxynucleoside triphosphate, 1× commercial PCR buffer, and 2.5 units of Pwo DNA polymerase (Roche). Amplifications were run for 30 cycles (94 °C, 30 s; 50 °C, 30 s; 72 °C, 1 min) in a Perkin-Elmer Cetus thermocycler. PCR fragments were purified by the QIAquick PCR purification kit procedure (Qiagen, Courtaboeuf, France).

Genomic DNA extraction and Southern blotting. Cultures of exponentially growing cells (400 ml) at 1,000 cells/ml were centrifuged. After being washed in 10 mM Tris-HCl, pH 7.0, the pellet was resuspended in one vol. of 10 mM Tris-HCl, pH 7.0, and added quickly to four vol. of lysis solution (0.44 M EDTA at pH 9, 1% SDS, 0.5% N-laurylsarcosine, and 1 mg/ml of proteinase K) at 55 °C. The lysate was incubated for 5 h at 55 °C, gently extracted once with phenol and dialyzed twice against TE (10 mM Tris-HCl, pH 8, 1 mM EDTA) containing 20% ethanol and once against TE. After restriction enzyme digestion and electrophoresis, DNA was transferred from the agarose gel to Hybond N+ membranes in 0.4 N NaOH after depurination in 0.25 N HCl for seven minutes. Hybridization under stringent conditions was carried out in Church solution (0.5 M sodium phosphate, pH 7.2, 7% SDS, 1% bovine serum albumin, 1 mM EDTA) at 60 °C (Church and Gilbert 1984). Probes were labeled with 40 µCi [α-32P]dATP (3,000 Ci/mmol) using a random priming kit (Roche). Membranes were washed in 0.2× SSC-0.1% SDS for 30 min at 55 °C prior to autoradiography.

RNA preparation and Northern blotting. Paramecium tetraurelia RNA was obtained from cultures of either exponentially growing cells at 1,000 cells/ml (vegetative growth) or autogamous cells at 5,000 cells/ml. Total RNA was prepared essentially according to the method of Chomczynski and Sacchi (1987), except that the cells were lysed by vortex mixing in the presence of glass beads. RNA (15–20 µg/lane) was fractionated on 1% agarose-formaldehyde gels and transferred to Hybond N+ membranes using 20× SSC. Hybridizations and washing were carried out as described for Southern blotting.

Primer extension analysis. Five µg of total RNA extracted from P. tetraurelia as described above were hybridized with 100 pmol of either rt502 or rt1102 oligonucleotide for 15 min at 75 °C followed by slow cooling to room temperature. For primer extension reactions, the hybridization mixture was incubated with 0.5 mM of each dCTP, dGTP, and dTTP, 0.01 mM dATP, 40 µCi [α-32P]dATP (3,000 Ci/mmol), 56 units of RNase inhibitor, 1× commercial AMV reverse transcriptase buffer, and 50 units of AMV reverse transcriptase (Roche) in 40 µl final vol. for 1 h at 50 °C. The reaction was stopped by addition of EDTA (10 mM final concentration) and the reaction products were purified by classical phenol-chloroform extractions and precipitated twice with 100% ethanol in the presence of 0.3 M sodium acetate, pH 5.5. The reaction products were then separated on a 5% acrylamide-7 M urea sequencing gel. Sequencing reactions were performed with rt502 and rt1102 oligonucleotides and run in parallel.

RESULTS

Isolation of two genes coding for polypeptide chain release factor eRF1. To isolate the gene coding for eRF1, we screened a λ EMBL3 library of P. tetraurelia total DNA using the 3′-part of the T. thermophila eRF1 cDNA as a probe. Four positive phages, named 502, 802, 902 and 1102, were selected after three rounds of isolation and purification.
Phage 502 was further analyzed by restriction enzyme digestion and Southern blotting with the probe used for library screening. A strong positive signal was obtained with a single EcoRI-BamHI fragment of approximately 2 kilo-base pairs (kbp) (data not shown). This fragment was cloned in the vector pBluescript II SK+ and sequenced. The top BLASTX (Altschul and Koonin 1998) match against the GenBank nr database was with T. thermophila eRF1 (AB026195), with an expected value (e-value) of 10^-103.

Two primers (pa31 and pa51; see Materials and Methods section) specific for the 2-kbp fragment were used to perform a PCR amplification on the four positive phages and on P. tetraurelia genomic DNA. Analysis of the undigested PCR products revealed the amplification of a 1.4-kbp fragment for all phages and genomic DNA (Fig. 1A). The BamHI digestion of the PCR products showed a single 1.4-kbp fragment for phages 502 and 902, and for genomic DNA (Lanes 1, 3, 5, Fig. 1A), and two co-migrating fragments of 0.65 kbp for phages 802 and 1102 (Lanes 2, 4, Fig. 1A). These results suggested that all the selected phages contained the gene coding for eRF1. The differences in the BamHI digestion profile of the PCR product suggested that two genes encoding eRF1 were present in the P. tetraurelia genome. For the genomic DNA, the BamHI digestion profile of the PCR product corresponded to the digestion profile of phages 502 and 902 (Lanes 1, 3, 5, Fig. 1A). This could be explained by the fact that primers pa51 and pa31 used for the PCR reaction were derived from the sequence of the phage 502 insert (also note the lower level of DNA amplification for phages 802 and 1102). The PCR product of phage 1102 was cloned in plasmid pBluescript II SK+ and sequenced. The analysis of the sequence using the BLASTX program against the GenBank nr database confirmed that this fragment contained the 5′-part of


J. EUKARYOT. MICROBIOL., VOL. 49, NO. 5, SEPTEMBER–OCTOBER 2002

Fig. 1. Isolation of two genes coding for eukaryotic release factor 1 (eRF1) in Paramecium tetraurelia. A. PCR amplification on the DNA of phages isolated during library screening (lanes 1, 2, 3, and 4 correspond to phages 502, 802, 902, and 1102, respectively) and on genomic DNA (lane 5), using primers specific for the phage 502 insert. PCR products (not digested or digested with BamHI) were fractionated on a 1% agarose gel, which was stained with ethidium bromide. Lane a: molecular weight markers. B. Schematic representation of the genomic regions of P. tetraurelia macronuclear DNA containing eRF1 genes. The 3′-end of the upstream ORF with similarities to the human Zinc/Iron Regulated Transporter gene (annotated ZIRT-like), the intergenic regions, and the eRF1-a and eRF1-b genes are presented. Coding sequences are indicated by rectangles, introns by broken lines, and intergenic regions by straight lines. The numbers (in base pairs) correspond to the lengths of the different regions.
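The reasoning behind panel A (an amplicon that is either left intact or cut once by BamHI) can be illustrated with a minimal in-silico digest. This is only a sketch: the function name `bamhi_fragments` and the toy sequences are ours, not the actual amplicons, and the only external fact used is the BamHI recognition site G^GATCC.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence; top-strand cut is G^GATCC


def bamhi_fragments(seq: str) -> list[int]:
    """Return fragment lengths after complete BamHI digestion of a linear DNA."""
    seq = seq.upper()
    cuts = []
    start = seq.find(BAMHI_SITE)
    while start != -1:
        cuts.append(start + 1)  # cut falls after the first G of the site
        start = seq.find(BAMHI_SITE, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Toy 1,400-nt sequences (hypothetical, for illustration only).
    uncut = "A" * 1400
    cut_once = "A" * 697 + BAMHI_SITE + "A" * 697
    print(bamhi_fragments(uncut))     # one fragment: no site present
    print(bamhi_fragments(cut_once))  # two fragments of similar size
```

A product without a site stays as one band, whereas a single roughly central site yields two fragments of similar size that would co-migrate on a 1% agarose gel.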

a gene encoding eRF1. An EcoRI-BamHI fragment containing the 3′-part of this gene, which was lacking in the PCR product, was obtained from phage 1102 after Southern blot analysis (data not shown). The complete coding sequence was then reassembled in pBluescript II SK+. We have named eRF1-a the gene present in phages 502 and 902 and eRF1-b the gene in phages 802 and 1102.

The two sequences were compared and analyzed using the GCG package (Womble 2000) and the results of BLASTX. A deviant genetic code in which UAG and UAA code for glutamine instead of being translation termination signals was used for translating the DNA sequences. The positions of the ATG start codon, TGA stop codon, and exon/intron boundaries were inferred from the optimal amino acid sequence alignment, the maintenance of open reading frames, and the known properties of introns in P. tetraurelia (size between 20 and 30 nucleotides, consensus GT/AT intron-end sequences, and high A+T/G+C ratio). Excluding the introns, the coding sequences were 1.314 kbp long and encoded a 437 amino acid putative protein with a theoretical molecular mass of 49,500 Da (Fig. 1B). Moreover, 19 and 20 TAG or TAA codons were translated as glutamine in the eRF1-a and eRF1-b open reading frames (ORFs), respectively. The two nucleotide sequences were 89.5% identical with or without the introns. Among the 144 positions that were different, only 12 affected the eRF1 coding region.

A second ORF was found upstream of both eRF1 genes. In the GenBank nr database, the top BLASTX match for this ORF was with the 3′-end of the Homo sapiens Zinc/Iron Regulated Transporter (ZIRT)-like gene (AJ243650) with an e-value of 2 × 10^-27. The positions of ZIRT-like gene ORFs and introns were defined as described above for eRF1 genes: the two regions are very similar (Fig. 1B). The greatest differences were observed in the positions and lengths of the last introns of ZIRT-like gene ORFs and in the intergenic regions, which were remarkably short and differed in length (98 and 89 bp for eRF1-a and eRF1-b, respectively) and in nucleotide sequence. Sequence analyses confirmed that the phages isolated during the library screening contained two different regions of DNA corresponding to two genes coding for eRF1 in P. tetraurelia. The nucleotide sequence data reported in this paper have been previously described by Moreira et al. (2002) and appear in the DDBJ/EMBL/GenBank nucleotide sequence databases with accession numbers AF149035 and AF149036 for eRF1-a and eRF1-b, respectively.

The two putative proteins encoded by eRF1-a and eRF1-b are 98.5% identical, differing by six amino acids only (Fig. 2). Most of the different residues (arrowheads in Fig. 2) were located in non-conserved regions of the proteins and most of them were conservative changes. Using the Clustal W program (Thompson, Higgins, and Gibson 1994), P. tetraurelia eRF1 amino acid sequences were aligned with eRF1 from other eukaryotes (Fig. 2 only shows the alignment with T. thermophila and Homo sapiens eRF1 sequences). eRF1s of P. tetraurelia were 60% similar to H. sapiens eRF1, and 64% similar to T. thermophila eRF1. This multiple alignment clearly showed that eRF1s of P. tetraurelia resemble eRF1s of other eukaryotes. The central domain containing the GGQ motif responsible for the activation of the peptidyl transferase center of the ribosome is the most conserved, and the C-terminal region involved in the binding of other proteins is less conserved. In the N-terminal domain, most of the conserved amino acids in known eRF1s are also conserved in Paramecium eRF1s.

Recently, alignments of the N-terminus of all available eRF1 proteins, including those of various ciliate species using variant genetic codes, were published (Inagaki et al. 2002; Lozupone, Knight, and Landweber 2001). In these articles, it was pointed out that some sites that diverge from other eukaryotic eRF1s show convergent evolution in ciliate species using the same genetic code (e.g. reassignment of UAA and UAG to glutamine). However, when the P. tetraurelia eRF1 sequence is added to these alignments, most of the sites of convergent evolution described in these studies are not conserved in P. tetraurelia eRF1: N changed to S at position 46; L changed to R at position 51; and S changed to A at position 57 (numbering according to the human sequence). The exception is the V at position 35, which is convergently conserved. Interestingly, the region of P. tetraurelia eRF1 comprising residues 37 to 80 (underlined in Fig. 2) is significantly divergent from all other eRF1 sequences including eRF1 from Stylonychia, Oxytricha, and Tetrahymena. It has been proposed that this region, which corresponds to the helix-loop-helix region located at the tip of Domain 1 in the three-dimensional structure of human eRF1, is involved in stop codon recognition (Muramatsu et al. 2001; Nakamura, Ito, and Ehrenberg 2000; Song et al. 2000).
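The conceptual translation step described above, in which TAA and TAG are read as glutamine and TGA is the only stop, can be sketched as follows. This is a minimal illustration rather than the procedure actually used (the sequences were analyzed with the GCG package); the names `STANDARD_CODE`, `CILIATE_CODE`, and `translate`, and the toy sequence, are ours.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]
STANDARD_CODE = dict(zip(CODONS, STANDARD_AAS))

# Variant nuclear code described in the text: TAA and TAG encode glutamine (Q),
# so TGA remains the only stop codon.
CILIATE_CODE = {**STANDARD_CODE, "TAA": "Q", "TAG": "Q"}


def translate(cds: str, code: dict[str, str]) -> str:
    """Translate an in-frame coding sequence up to (not including) the first stop."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = code[cds[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    toy = "ATGTAACAATAGTGA"  # hypothetical sequence: ATG TAA CAA TAG TGA
    print(translate(toy, STANDARD_CODE))  # terminates at the first TAA
    print(translate(toy, CILIATE_CODE))   # reads through TAA and TAG as Q
```

The same bookkeeping accounts for the lengths given in the text: 1,314 nucleotides correspond to 438 codons, that is, 437 amino acids plus the terminal TGA.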

KERVESTIN ET AL.—PARAMECIUM TETRAURELIA eRF1


Fig. 2. Comparison of eukaryotic release factor 1 (eRF1) amino acid sequences. The alignment was performed using the CLUSTAL W program (Thompson, Higgins, and Gibson 1994). Pt-a, Paramecium tetraurelia eRF1-a (accession number AF149035); Pt-b, P. tetraurelia eRF1-b (AF149036); Tt, Tetrahymena thermophila eRF1 (BAA85336); Hs, Homo sapiens eRF1 (P46055). Numbering according to the human sequence. Identical amino acid residues are shaded in black and similar amino acid residues are shaded in gray. Divergent positions between eRF1-a and eRF1-b are indicated in the sequence by arrowheads. The portion of the N-terminal domain sequence that is significantly divergent from other eRF1 sequences (including ciliate eRF1s) in P. tetraurelia eRF1 is underlined.
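Given two rows of such an alignment, the divergent positions and a percent identity can be derived as sketched below. The function name `compare_aligned` and the toy sequences are ours, and the convention used here (identity over gap-free columns only) is an assumption; alignment programs may use other denominators and report slightly different percentages.

```python
def compare_aligned(seq_a: str, seq_b: str, gap: str = "-") -> tuple[float, list[int]]:
    """Return (percent identity over gap-free columns, 1-based differing columns)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = []
    for column, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == gap or b == gap:
            continue  # columns with a gap are ignored under this convention
        compared += 1
        if a != b:
            differing.append(column)
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    identity = 100.0 * (compared - len(differing)) / compared
    return identity, differing


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, not taken from Fig. 2).
    identity, differing = compare_aligned("MKLSQ-AV", "MKLTQGAV")
    print(f"{identity:.1f}% identity; differing columns: {differing}")
```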


Fig. 3. Expression of eRF1-a and eRF1-b genes in Paramecium tetraurelia. A. Southern blot analysis of genomic DNA from P. tetraurelia stock 51, mating type VIII (lanes 2A) and mating type VII (lanes 2B), digested with BamHI or HindIII. Samples were run on a 0.8% agarose gel, transferred to a nylon membrane and probed under stringent conditions with 32P-labeled probes specific for eRF1-a or eRF1-b, as indicated. Arrows indicate the restriction fragments specific for either eRF1-a or eRF1-b. The molecular weight ladder is in kilo-base pairs. B. Northern blot analysis of total RNA extracted from P. tetraurelia stock d4.2 during vegetative growth (V) or autogamy (A). RNA samples (5 µg) were loaded onto a 1% agarose-formaldehyde gel, transferred, and probed sequentially with the eRF1-a, eRF1-b, and 18S rRNA probes, as indicated.

Expression of eRF1-a and eRF1-b genes. Southern hybridization was performed to ascertain the number of eRF1-related sequences in P. tetraurelia. Because in Paramecium the ploidy ratio of macronuclear DNA to micronuclear DNA is approximately 250, only macronuclear DNA can be detected on conventional Southern blots. Genomic DNA extracted from two different mating types of stock 51 was digested with BamHI and HindIII restriction endonucleases and loaded onto a 0.8% agarose gel, transferred to a nylon membrane, and hybridized either with an eRF1-a probe (an internal EcoRV-XmnI fragment of 1,126 nucleotides), or with an eRF1-b probe (an internal EcoRV fragment of 909 nucleotides). The membrane was hybridized first with the eRF1-a probe, washed at low stringency, exposed to X-ray film, washed again at high stringency and re-exposed. Then, the membrane was stripped, exposed to verify the absence of remaining signal, reprobed with the eRF1-b probe, and washed and exposed as described above for the eRF1-a probe. Hybridization of the eRF1-a probe revealed single fragments of 4 kbp and 8 kbp for the BamHI and HindIII digestions, respectively (Fig. 3A). In the HindIII digestion, the faintly visible


band of 6 kbp likely corresponded to the eRF1-b gene (see below). Hybridization with the eRF1-b probe revealed two bands in the BamHI digestion, one of approximately 12 kbp and a faster migrating species of 4 kbp (Fig. 3A). Considering the high degree of homology of the two sequences used as probes, the latter band, which corresponded in size to the signal observed with the eRF1-a probe, was likely due to cross-hybridization with the eRF1-a gene. In the HindIII digestion, hybridization with the eRF1-b probe revealed a major band of 6 kbp, which could be attributed to the eRF1-b gene, and a minor band of 8 kbp representing the eRF1-a gene (Fig. 3A). A poor transfer of high molecular weight species was probably responsible for the low intensity of the 12-kbp eRF1-b band in the BamHI digestion in comparison to the intensity of the 4-kbp band. No other signals were observed in the Southern blot experiments, even with washing at lower stringency (data not shown). Thus, the results of the Southern blot analysis suggested that the P. tetraurelia macronuclear genome contains only two eRF1 genes that were amplified.

The expression of eRF1 genes was examined by Northern blotting of total RNA extracted from P. tetraurelia cells in different stages of the life cycle (i.e. vegetative growth and autogamy). The membrane was probed sequentially with the eRF1-a and eRF1-b probes as described for Southern blot analysis, and with an 18S rRNA probe as a standardization control. With both eRF1-a and eRF1-b probes, a single band of approximately 1,500 nucleotides was observed in each of the samples (Fig. 3B). However, due to the possibility of cross-hybridization, the relative abundance of each of the mRNAs was difficult to estimate. The signal intensity obtained with the eRF1-a probe was 3 times stronger for autogamous cells than for vegetative cells, whereas with the eRF1-b probe it was only slightly lower in vegetative cells than in autogamous cells.
These differences indicate that the abundance of the eRF1-a mRNA, relative to that of the eRF1-b mRNA, is higher in autogamous cells than in vegetative cells.

For further analysis, the 5′-ends of mature mRNAs were determined by the primer extension method. Total RNA extracted from vegetative and autogamous cells was hybridized with oligonucleotides rt502 and rt1102 (see Materials and Methods section) corresponding to the eRF1-a and eRF1-b sequences, respectively. These oligonucleotides were complementary to the same region of eRF1 coding sequences and had their 3′-end located 73 nucleotides downstream of the translation initiation ATG codon. In order to preclude elongation by the reverse transcriptase of an oligonucleotide paired on the non-cognate template mRNA, the oligonucleotides differed in two positions at their 3′-end, including the 3′-terminal nucleotide. There were a few differences between vegetative and autogamous cells for eRF1-a (Fig. 4). In both stages, a strong signal, albeit slightly heterogeneous in the case of vegetative cells, was located close to the ATG codon (positions −10 to −4) and a weak signal was found at position −64. An additional weak signal was present at position −17 in autogamous cells. The situation is different in the case of eRF1-b. In vegetative cells, a strong signal was found at position −64, and minor ones were distributed immediately downstream, up to position −41. During autogamy, a single signal was present at position −51. For both genes, all the mRNA 5′-ends mapped in the intergenic region (Fig. 4).

DISCUSSION

The use of stop codons as sense codons in some ciliate species must result from a complex interplay between tRNAs able to decode stop codons and modifications of the eRF1 protein. Thus, the analysis of ciliate eRF1 protein could help to elucidate the mechanism of stop codon recognition in eukaryotes. Toward this goal, we have isolated genes encoding eRF1 from a library of P. tetraurelia macronuclear DNA. We have shown here that P. tetraurelia contains two eRF1 genes located in two different regions of the macronuclear genome. The genomic organization of these two regions is very similar, with an upstream ORF encoding a putative zinc/iron regulated transporter (ZIRT) followed by the eRF1 ORF. For each of these two ORFs, the deduced amino acid sequences as well as the intron distribution are conserved between the two macronuclear regions. However, it is interesting to note that, contrary to the eRF1 ORFs, the boundaries of the last intron of the ZIRT-like ORFs are not conserved. Nevertheless, the high degree of similarity between the two regions suggests that they arose from a recent duplication event.

In ciliates, two eRF1 genes were also described in E. aediculatus and E. octocarinatus (Inagaki and Doolittle 2001; Liang et al. 2001). However, in these species, the two eRF1 genes are much more divergent, differing by the presence of an intron in the eRF1-A gene of E. aediculatus, by the number of UGA codons used as cysteine codons, and by the length of the deduced proteins. At the level of the deduced amino acid sequences, the two eRF1s of P. tetraurelia share 98.5% identity whereas the two eRF1s of E. octocarinatus (Eo-eRF1) and E. aediculatus share only 79% and 83.5% identity, respectively. Liang et al. (2001) have suggested that the differences in E. octocarinatus eRF1 amino acid sequences may be linked to differences in stop codon recognition: Eo-eRF1a recognizes only UAA whereas Eo-eRF1b recognizes UAA and UAG. The very high degree of similarity of eRF1 sequences in P. tetraurelia suggests that the two factors have the same activity in stop codon recognition, and in general, in translation termination.
One of the major goals in sequencing eRF1 from ciliate species with variant genetic codes was the elucidation of the mechanism of stop codon recognition in eukaryotes. On the basis of the results obtained for E. coli release factors RF1 and RF2 (Ito, Uno, and Nakamura 2000), it was proposed that a few amino acids of the eRF1 N-terminal domain bind to the stop codon and govern stop codon specificity. Hence, the eRF1 sequence must change at these discrete positions in ciliate species that use some of the stop codons as sense codons. Moreover, the changes must have evolved convergently in ciliates using the same variant genetic code. Indeed, the comparison of eRF1 sequences including ciliate eRF1s identifies 4 to 5 sites of convergent evolution in the N-terminus of eRF1 of ciliates using UAA and UAG as glutamine codons, i.e. Stylonychia, Oxytricha, and Tetrahymena (Inagaki et al. 2002; Lozupone, Knight, and Landweber 2001). However, only 4 or 5 ciliate sequences were used in the alignments. We have recently shown that P. tetraurelia eRF1 has the highest evolutionary rate among ciliate eRF1s so far sequenced and we have suggested that this high evolutionary rate may be related to structural changes in eRF1 (Moreira et al. 2002). Thus, the sequence of P. tetraurelia eRF1 was of great interest to validate the hypothesis of variant code-specific convergent positions. Surprisingly, the number of convergent sites is reduced to one (Val35) when the P. tetraurelia eRF1 sequence is added to the alignment. This result emphasizes the idea that additional ciliate eRF1 sequences, those with a high evolutionary rate in particular, are required to determine the residues that could be implicated in stop codon recognition.

In addition, in P. tetraurelia eRF1, the region covering the helix-loop-helix structure located at the tip of Domain 1 (underlined in Fig. 2) contains a high number of changes that are found neither in other ciliate eRF1s nor in eRF1s from other eukaryotes with the canonical genetic code. Some of these changes, particularly those of region 55–64, do not fit current models suggesting a direct interaction of stop codons with conserved residues of this region (Muramatsu et al. 2001; Nakamura, Ito, and Ehrenberg 2000; Song et al. 2000). Alternatively, cavity models suggest that the stop codon binds to the surface of a cavity created by the α helices and β sheets arising from residues 30 to 140 of Domain 1 of eRF1 (Bertram et al. 2000; Inagaki et al. 2002). With the exception of Glu55, which is changed to Gln, most of the amino acids thought to bind to one of the nucleotides of the stop codon in the cavity models are conserved in P. tetraurelia eRF1. However, examining these models, it is hard to decide on the orientation of the stop codons in the cavity and to predict the mechanism of stop codon discrimination. Direct testing of stop codon recognition by ciliate eRF1 will probably give more details.

The biggest differences between the two macronuclear regions containing eRF1 genes were found at the end of the ZIRT-like ORFs and in the sequences separating the ZIRT-like ORFs from the eRF1 ORFs. In these small intergenic regions, with a high A+T content (Fig. 4), the degree of nucleotide identity is lower (85%) than that of the coding region, which is 89.5%. Such small intergenic regions have been previously described (Haynes et al. 2000) and seem to be a general feature of the Paramecium macronuclear genome (Jean Cohen, pers. commun.). Analysis of eRF1 gene mRNA 5′-ends by primer extension shows that these short intergenic regions are most likely the sites of transcription initiation, and thus must also contain a part of the promoter sequences. Unfortunately, transcription signals are unknown in Paramecium and the high A+T content of the intergenic regions prevents the easy identification of putative canonical signals.

Comparison of eRF1 mRNA levels during vegetative growth and autogamy revealed that the relative level of eRF1-a mRNA is increased during autogamy whereas eRF1-b mRNA levels remain almost identical (Fig. 3).
An explanation is that (i) whatever the stage of the life cycle, the basal level of eRF1 is provided by expression of the eRF1-b gene, and (ii) the increase in eRF1 level required during autogamy is obtained by an increase in eRF1-a gene transcription, possibly concomitant with the beginning of the transcription of the new macronuclear genome. These differences in eRF1 gene expression are reminiscent of the previous observation that expression of eRF1 in Xenopus laevis is regulated at the mRNA level during development and that the amount of eRF1 increases during oogenesis with the level of protein synthesis (Tassan et al. 1993).

The primer extension analysis clearly showed that the differences observed in the intergenic region sequences corresponded to differences in the location of the 5′-ends of mRNAs. In the case of the eRF1-a gene, the putative weak site located 64 nucleotides upstream of the ATG appears to be conserved between vegetative growth and autogamy, while the position of the strong site located close to the ATG varied by a few nucleotides. Interestingly, in the eRF1-b gene, unique transcription start sites located at −64 and −51 were used during vegetative growth and autogamy, respectively (Fig. 4). These results suggest that the expression of eRF1 genes is regulated at the transcription level during different stages of the life cycle.

A surprising result of the primer extension experiment is that


the strongest transcription start sites of the eRF1-a gene are located very close to the first ATG codon (4 to 10 nucleotides only). On the basis of the sequence alignment, this ATG is likely to be the translation initiation codon, as translation initiation at the second ATG, encountered 151 codons downstream in the ORF, would delete the N-terminal domain of eRF1. It has been well established that the length of the 5′-leader sequence is one of the structural features of eukaryotic mRNA that contributes to the efficiency of translation initiation. In an in vitro translation system, translation initiation at the first AUG is profoundly decreased when the length of the 5′-leader sequence is decreased to less than ten nucleotides (Kozak 1991). In eukaryotes, the only examples described so far of short 5′-leader sequences of less than 10 nucleotides were found in P. tetraurelia surface protein mRNA (Martin et al. 1994; Scott, Leeck, and Forney 1993). However, the translation initiation efficiency at the first AUG codon of these mRNAs was not documented. One hypothesis is that P. tetraurelia transcripts with short 5′-leader sequences are inefficiently translated. Thus, in the case of P. tetraurelia eRF1 mRNAs, efficient translation initiation at the first AUG of the eRF1 ORF could be obtained with the transcripts beginning at the weak transcription start sites located 64 nucleotides upstream of this AUG. Under this hypothesis, the function of transcripts with a short 5′-leader sequence remains obscure. Another hypothesis is that short leader sequences are a common feature of Paramecium mRNAs and that the translation initiation complex is adapted to allow efficient translation initiation at the first AUG of these mRNAs.

Very little is known of transcriptional control in Paramecium. It has been previously reported that the 5′ coding region of surface antigen genes controls mutually exclusive transcription (Leeck and Forney 1996).
Here we show that differential regulation of transcription is probably mediated by gene duplication and variation in the sequences located immediately upstream of the ORF.

ACKNOWLEDGMENTS

We thank Mireille Bétermier and Karine Dubrana for assistance in Paramecium cultures and for helpful discussions during the course of this work. O. J.-J. was supported by grant no. 5511 from the Association pour la Recherche sur le Cancer. Work in the E. Meyer laboratory was supported by the Association pour la Recherche sur le Cancer (grant no. 5733), the Centre National de la Recherche Scientifique (Programme Génome), the Ministère de l’Education Nationale, de la Recherche et de la Technologie (Programme de Recherche fondamentale en Microbiologie et Maladies infectieuses et parasitaires), and the Comité de Paris de la Ligue Nationale contre le Cancer (grant no. 75/01-RS/73). S. K. held fellowships from the French Ministère de la Recherche et de l’Enseignement Supérieur. O. G. is a recipient of a fellowship from the Fondation pour la Recherche Médicale.

LITERATURE CITED

Altschul, S. F. & Koonin, E. V. 1998. Iterated profile searches with PSI-BLAST—a tool for discovery in protein databases. Trends Biochem. Sci., 23:444–447.
Baroin-Tourancheau, A., Tsao, N., Klobutcher, L. A., Pearlman, R. E.

Fig. 4. Primer extension of the 5′-ends of mRNA of eRF1-a and eRF1-b during vegetative growth or autogamy. The reaction products (indicated by asterisks) and sequencing reactions performed with the rt502 oligonucleotide (on the left of each panel) or the rt1102 oligonucleotide (on the right of each panel) were resolved on a 5% acrylamide-7 M urea sequencing gel. The locations of the mRNA 5′-ends are indicated. The sequence of the eRF1-a and eRF1-b intergenic regions is shown below. The A of the first ATG of the eRF1 genes is numbered +1. mRNA 5′-ends are in bold in the sequences and marked above with (A) for autogamy and with (V) for vegetative growth.
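
As a rough illustration of how the mapped 5′-ends relate to leader length, the following Python sketch converts a start-site position, numbered relative to the A of the first ATG (taken here as +1), into a 5′-leader length and compares it with the ten-nucleotide threshold reported by Kozak (1991). The function and constant names are hypothetical, and the sketch assumes that the mapped 5′-end nucleotide is itself part of the leader; the −4 and −10 entries are simply the bounds of the "4 to 10 nucleotides" range quoted in the text, not individually measured positions.

```python
# Sketch only: function and variable names are hypothetical.
# Positions are numbered relative to the A of the first ATG (+1);
# there is no position 0, so -1 is the nucleotide just before the A.

SHORT_LEADER_LIMIT = 10  # nucleotides; threshold taken from Kozak (1991)


def leader_length(five_prime_end: int) -> int:
    """Number of nucleotides from the mRNA 5' end up to the ATG.

    Assumes the 5'-end nucleotide belongs to the leader, so a
    transcript starting at -64 has a 64-nucleotide leader.
    """
    if five_prime_end >= 0:
        raise ValueError("5' end must lie upstream of the ATG (negative position)")
    return -five_prime_end


def is_short_leader(five_prime_end: int) -> bool:
    """True when the leader is shorter than the ten-nucleotide threshold."""
    return leader_length(five_prime_end) < SHORT_LEADER_LIMIT


if __name__ == "__main__":
    # Start sites quoted in the text (assumed mapping to signed positions).
    sites = {
        "eRF1-a weak site": -64,
        "eRF1-a strong sites, nearest bound": -4,
        "eRF1-a strong sites, farthest bound": -10,
        "eRF1-b, vegetative growth": -64,
        "eRF1-b, autogamy": -51,
    }
    for name, position in sites.items():
        length = leader_length(position)
        label = "short" if is_short_leader(position) else "not short"
        print(f"{name}: leader of {length} nt ({label})")
```

Under these assumptions, only start sites closer than ten nucleotides to the ATG would fall in the range where initiation at the first AUG is expected to be impaired in vitro; whether this holds in Paramecium is the open question raised in the discussion.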



& Adoutte, A. 1995. Genetic code deviations in the ciliates: evidence for multiple and independent events. EMBO J., 14:3262–3267.
Bertram, G., Bell, H. A., Ritchie, D. W., Fullerton, G. & Stansfield, I. 2000. Terminating eukaryote translation: domain 1 of release factor eRF1 functions in stop codon recognition. RNA, 6:1236–1247.
Caron, F. & Meyer, E. 1985. Does Paramecium primaurelia use a different genetic code in its macronucleus? Nature, 314:185–188.
Chomczynski, P. & Sacchi, N. 1987. Single step method of RNA isolation by acid guanidinium thiocyanate-phenol chloroform extraction. Anal. Biochem., 162:156–159.
Church, G. M. & Gilbert, W. 1984. Genomic sequencing. Proc. Natl. Acad. Sci. USA, 81:1991–1995.
Frolova, L., Le Goff, X., Rasmussen, H. H., Cheperegin, S., Drugeon, G., Kress, M., Arman, I., Haenni, A.-L., Celis, J. E., Philippe, M., Justesen, J. & Kisselev, L. 1994. A highly conserved eukaryotic protein family possessing properties of polypeptide chain release factor. Nature, 372:701–703.
Grimm, M., Brünen-Nieweler, C., Junker, V., Heckmann, K. & Beier, H. 1998. The hypotrichous ciliate Euplotes octocarinatus has only one type of tRNACys with GCA anticodon encoded on a single macronuclear DNA molecule. Nucleic Acids Res., 26:4557–4565.
Hanyu, N., Kuchino, Y., Susumu, N. & Beier, H. 1986. Dramatic events in ciliate evolution: alteration of UAA and UAG termination codons to glutamine codons due to anticodon mutations in two Tetrahymena tRNAsGln. EMBO J., 5:1307–1311.
Harper, D. S. & Jahn, C. L. 1989. Differential use of termination codons in ciliated protozoa. Proc. Natl. Acad. Sci. USA, 86:3252–3256.
Haynes, W. J., Ling, K. Y., Preston, R. R., Saimi, Y. & Kung, C. 2000. The cloning and molecular analysis of pawn-B in Paramecium tetraurelia. Genetics, 155:1105–1117.
Helftenbein, E. 1985. Nucleotide sequence of a macronuclear DNA molecule coding for alpha-tubulin from the ciliate Stylonychia lemnae. Special codon usage: TAA is not a translation termination codon. Nucleic Acids Res., 13:415–433.
Herrick, G., Hunter, D., Williams, K. & Kotter, K. 1987. Alternative processing during development of a macronuclear chromosome family in Oxytricha fallax. Genes Dev., 1:1047–1058.
Horowitz, S. & Gorovsky, M. A. 1985. An unusual genetic code in nuclear genes of Tetrahymena. Proc. Natl. Acad. Sci. USA, 82:2452–2455.
Inagaki, Y. & Doolittle, W. F. 2001. Class I release factors in ciliates with variant genetic codes. Nucleic Acids Res., 29:921–927.
Inagaki, Y., Blouin, C., Doolittle, W. F. & Roger, A. J. 2002. Convergence and constraint in eukaryotic release factor 1 (eRF1) domain 1: the evolution of stop codon specificity. Nucleic Acids Res., 30:532–544.
Ito, K., Uno, M. & Nakamura, Y. 2000. A tripeptide “anticodon” deciphers stop codons in messenger RNA. Nature, 403:680–684.
Karamyshev, A. L., Ito, K. & Nakamura, Y. 1999. Polypeptide release factor eRF1 from Tetrahymena thermophila: cDNA cloning, purification and complex formation with yeast eRF3. FEBS Lett., 457:483–488.
Kervestin, S., Frolova, L., Kisselev, L. & Jean-Jean, O. 2001. Stop codon recognition in ciliates: Euplotes release factor does not respond to reassigned UGA codon. EMBO Rep., 2:680–684.
Kisselev, L. L. & Buckingham, R. H. 2000. Translational termination comes of age. Trends Biochem. Sci., 25:561–566.
Kozak, M. 1991. A short leader sequence impairs the fidelity of initiation by eukaryotic ribosomes. Gene Expr., 1:111–115.
Kuchino, Y., Hanyu, N., Tashiro, F. & Nishimura, S. 1985. Tetrahymena thermophila glutamine tRNA and its gene that corresponds to UAA termination codon. Proc. Natl. Acad. Sci. USA, 82:4758–4762.
Leeck, C. L. & Forney, J. D. 1996. The 5′ coding region of Paramecium surface antigen genes controls mutually exclusive transcription. Proc. Natl. Acad. Sci. USA, 93:2838–2843.
Liang, A., Brünen-Nieweler, C., Muramatsu, T., Kuchino, Y., Beier, H. & Heckmann, K. 2001. The ciliate Euplotes octocarinatus expresses two polypeptide release factors of the type eRF1. Gene, 262:161–168.
Lozupone, C. A., Knight, R. D. & Landweber, L. F. 2001. The molecular basis of nuclear genetic code change in ciliates. Curr. Biol., 11:65–74.
Martin, L. D., Pollack, S., Preer, J. R., Jr. & Polisky, B. 1994. DNA sequence requirements for the regulation of immobilization antigen A expression in Paramecium tetraurelia. Dev. Genet., 15:443–451.
Meyer, F., Schmidt, H. J., Plumper, E., Hasilik, A., Mersmann, G., Meyer, H. E., Engstrom, A. & Heckmann, K. 1991. UGA is translated as cysteine in pheromone 3 of Euplotes octocarinatus. Proc. Natl. Acad. Sci. USA, 88:3758–3761.
Moreira, D., Kervestin, S., Jean-Jean, O. & Philippe, H. 2002. Evolution of eukaryotic translation elongation and termination factors: variations of evolutionary rate and genetic code deviations. Mol. Biol. Evol., 19:189–200.
Nakamura, Y., Ito, K. & Ehrenberg, M. 2000. Mimicry grasps reality in translation termination. Cell, 101:349–352.
Preer, J. R., Jr., Preer, L. B., Rudman, B. M. & Barnett, A. J. 1985. Deviation from the universal code shown by the gene for surface protein 51A in Paramecium. Nature, 314:188–190.
Prescott, D. M. 1994. The DNA of ciliated protozoa. Microbiol. Rev., 58:233–267.
Sambrook, J., Fritsch, E. F. & Maniatis, T. 1989. Molecular Cloning: A laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
Sanger, F., Nicklen, S. & Coulson, A. R. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA, 74:5463–5467.
Scott, J., Leeck, C. & Forney, J. 1993. Molecular and genetic analyses of the B type surface protein gene from Paramecium tetraurelia. Genetics, 134:189–198.
Song, H., Mugnier, P., Das, A. K., Webb, H. M., Evans, D. R., Tuite, M. F., Hemmings, B. A. & Barford, D. 2000. The crystal structure of human eukaryotic release factor eRF1-mechanism of stop codon recognition and peptidyl-tRNA hydrolysis. Cell, 100:311–321.
Sonneborn, T. M. 1974. Paramecium aurelia. In: King, R. C. (ed.), Handbook of Genetics: Plants, plant viruses and protists. Plenum Press, New York. 2:469–594.
Tassan, J. P., Le Guellec, K., Kress, M., Faure, M., Camonis, J., Jacquet, M. & Philippe, M. 1993. In Xenopus laevis the product of a developmentally-regulated mRNA is structurally and functionally homologous to a Saccharomyces cerevisiae protein involved in translational fidelity. Mol. Cell. Biol., 13:2815–2821.
Thompson, J. D., Higgins, D. G. & Gibson, T. J. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 22:4673–4680.
Womble, D. D. 2000. GCG: The Wisconsin Package of sequence analysis programs. Methods Mol. Biol., 132:3–22.
Zhouravleva, G., Frolova, L., Le Goff, X., Le Guellec, R., Inge-Vechtomov, S. G., Kisselev, L. & Philippe, M. 1995. Termination of translation in eukaryotes is governed by two interacting polypeptide chain release factors, eRF1 and eRF3. EMBO J., 14:4065–4072.

Received 11/14/01, 04/12/02, 05/12/02; accepted 05/26/02