Evidence for distinct prototype sequences within the Plasmodium falciparum Pf60 multigene family

Study of Pf60: a multigene family of P. falciparum

ARTICLE 1 Serge BONNEFOY, Emmanuel BISCHOFF, Micheline GUILLOTTE and Odile MERCEREAU-PUIJALON

Evidence for distinct prototype sequences within the Plasmodium falciparum Pf60 multigene family Molecular and Biochemical Parasitology, Vol. 87, p : 1-11, 1997

State of knowledge when the work was designed:
• A large multigene family is defined by the clone Pf60.1 (≈140 members).
• Sequences of 2 cDNAs, obtained by hybridisation with the Pf60.1 probe.
• Significant similarities between the members of the Pf60 family (Pf60.1, cDNA 5.1 and cDNA 6.1) and exon II of the var genes.

Questions asked:
1. What is the degree of sequence diversity between the members of the 2 families?
2. What are the phylogenetic relationships between the Pf60 family and the var family?


Molecular and Biochemical Parasitology (1997) 87 : 1-11

Evidence for distinct prototype sequences within the Plasmodium falciparum Pf60 multigene family

Serge Bonnefoy 1, Emmanuel Bischoff, Micheline Guillotte and Odile Mercereau-Puijalon 2
Unité d'Immunologie Moléculaire des Parasites, Institut Pasteur, 25 rue du Docteur Roux, 75724 Paris Cedex 15, France 3

4

Received 23 December 1996; received in revised form 24 February 1997; accepted 28 February 1997

Abstract Using oligonucleotides derived from Pf60.1, a member of the Plasmodium falciparum Pf60 multigene family, numerous fragments were amplified from genomic DNA and cDNA from the 3D7 P. falciparum clone. DNA sequencing showed that the various fragments presented considerable diversity, indicating that the 3D7 repertoire contains at least 20 distinct versions of the region analysed. The various sequences aligned with either of 2 prototype sequences. Characteristic of the A-type was the presence of a 21 bp motif, present in variable copy number, as well as a sequence homologous to the Babesia sp. RAP-1 consensus. The B prototype sequence did not present such features and substantially differed from the A-type, due to accumulation of point mutations and numerous triplet deletions. Consistent with the marked differences between both sub-families, individual members from each sub-family did not cross-hybridise, produced distinct multiple band patterns on Southern blots and distinct chromosome profiles. Numerous hybrid sequences were observed. Interestingly, most var genes and var-related unspliced cDNAs described so far are of A/B hybrid type. These data suggest that the family has evolved by successive amplifications from 2 ancestral copies, with accumulation of mutations, as well as recombination and/or gene conversion events.

1. Introduction Numerous pathogens possess multigene families encoding proteins with similar architecture but presenting structural and functional specificity. The best documented examples are the gene families encoding the variant surface antigens in African Trypanosomes {1}, Neisseria gonorrhoeae adhesins {2} or streptococcal M proteins {3}. Recently, multigene families have also been reported in malaria parasites. Families possessing a small number of members, which encode proteins located within merozoite apical organelles, have been described in Plasmodium knowlesi {4} and Plasmodium yoelii {5}. We have reported that Plasmodium falciparum parasites possess a large multigene family, comprising at least 140 members, that was called the Pf60 multigene family {6}. This family was discovered during the analysis of clone Pf60.1, which contained a sequence homologous to the consensus motif of the Babesia sp RAP-1 genes, encoding rhoptry-associated 60 kDa antigens {7}. Consistent with this, the Pf60.1 insert hybridised with mRNAs of approximately 3 kb expressed at the end of the cycle; and antisera raised to the Pf60.1 recombinant antigen reacted with P. falciparum 60 kDa merozoite antigens and cross-reacted with 60 kDa B. divergens merozoite antigens. We had therefore concluded that the Pf60 family was the P. falciparum homologue of the RAP-1 Babesia multigene family {6}. However, subsequent cloning and sequencing of the var genes coding for the PfEMP1 variant antigen exposed on the infected red blood cell surface showed that a region of var exon II presented a significant homology with the Pf60.1 sequence. The var genes have a large exon I coding for the extracellular region of the protein, containing several domains involved in receptor recognition, as well as a transmembrane domain. Exon II codes for the portion of the protein facing the erythrocyte cytoplasm and supposedly involved in interaction with the cytoskeleton or in signal transduction {8-10}. The homology of Pf60.1 with a region well conserved in all reported var exon II sequences suggested that Pf60.1 and var exon II were related and part of the same multigene family.

In order to better understand the relationship between var genes and the Pf60.1 family, we have analysed sequence diversity of the region corresponding to the original Pf60.1 clone by studying PCR products amplified from 3D7 genomic DNA and from a cDNA library constructed from 3D7 late schizonts and sexual stages {11}. We have compared these sequences to those reported for var and var-related genes and this provided new insights into the repertoire of this multigene family.

1 present address : Stanford University School of Medicine, Dpt of Microbiology & Immunology, Fairchild Science Building, Stanford, CA 94305-5402, USA. 2 Corresponding author. Tel: 33 01 45 68 86 23, Fax : 33 01 40 61 31 85, e-mail, [email protected] 4 Note: Nucleotide sequence data reported in this paper have been submitted to the Genbank™ data base with the accession numbers U82489 to U82509. 3

Key words: Malaria, Multigene family, Diversity, Antigenic variation, Plasmodium falciparum

0166-6851/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved.

PII S0166-6851 (97)0033-9


Figure 22 : Triplet alignment of nucleotide sequences of 3D7-derived genomic and cDNA clones (Genbank™ accession numbers U82489 to U82509). Sequences shown exclude the 5' P1 primer, but include the first 2 nucleotides of the 3' P2 primer. Note that the sequence shown for the 3 cDNAs was obtained by sequencing plasmid DNA. Sequences were aligned with the help of the CLUSTAL V program. Gaps have been introduced to maximise homologies. "-" indicates absence of any base at that position. Point mutations affecting one member or a subset of members have been outlined in black. B-type sequences are shaded in grey. The 21 bp motif is single underlined under the sequence of cDNA 6.1. The sequence coding for the RAP-1 consensus homologue is double underlined. The nucleotide position of the Pf60.1 is indicated on the top of each group of sequences. The atypical B region in PCR2, -9, -16 and 10 is boxed.

Bonnefoy et al.


Mol. Biochem. Parasitol. (1997) 87: 1-11

were major differences. Importantly, for 61 triplets, 2 alternative prototype sequences could clearly be identified, with either specific point mutations shared by a group of sequences or a triplet deletion at a specific position common to numerous members. This indicated the occurrence of 2 distinct sub-families within the 3D7 repertoire. Within each sub-family there was a good conservation of the sequences, but apart from 2 identical genomic clones, (PCR11 and 23), the various genomic or cDNA-derived fragments presented a unique sequence.

2. Materials and methods

2.1 PCR amplification

Pf60.1-specific amplifications were carried out using 3D7 genomic DNA. Reactions were done in a Hybaid Thermal Cycler in a total volume of 50 µl using 1 µM (each) primers P1: 5'TGG TAC TAG AAC CTA GTG GTA AC and P2: 5'GGA TAA TTA TAT TCT TCT CCA C, in the presence of 200 µM (each) dNTP, 1.25 mM MgCl2, 2.5 U Taq polymerase (Promega) in the reaction buffer provided by the manufacturer. The following conditions were used: 30 s at 94°C; 120 s at 55°C; 120 s at 72°C for 30 cycles {12}.
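The amplification set-up of section 2.1 can be written down as data, which makes the primer lengths, base composition and programmed cycling time easy to check. The sketch below is only an illustration: the primer sequences and step times are copied from the text, while the helper names (`gc_count`, `total_cycling_seconds`) are hypothetical and ramp times between steps are ignored.

```python
# Primer sequences and cycling parameters are copied from section 2.1;
# the helper names are illustrative only.
P1 = "TGG TAC TAG AAC CTA GTG GTA AC".replace(" ", "")
P2 = "GGA TAA TTA TAT TCT TCT CCA C".replace(" ", "")


def gc_count(seq):
    """Number of G or C bases in a DNA string."""
    return sum(base in "GC" for base in seq.upper())


# One cycle: 30 s at 94 C, 120 s at 55 C, 120 s at 72 C; 30 cycles in total.
STEP_SECONDS = (30, 120, 120)
CYCLES = 30


def total_cycling_seconds(steps=STEP_SECONDS, cycles=CYCLES):
    """Hold time summed over all cycles (ramp times are ignored)."""
    return sum(steps) * cycles


for name, seq in (("P1", P1), ("P2", P2)):
    print(name, len(seq), "nt,", gc_count(seq), "G+C")
print(total_cycling_seconds() / 60, "min of programmed hold time")
```

Both primers are AT-rich, as expected for P. falciparum sequences.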

Clones presenting an A-type sequence were the original Pf60.1 clone, the genomic copies PCR23, -11, -13, -1, -24, and -15, as well as 2 cDNAs (5.1 and 6.1). All presented the sequence encoding the RAP-1 consensus homologue (double underlined in Figure 22). Size varied from 310 to 358 bp, mainly due to variation in the number of the 5' located 21 bp motif (single underlined in Figure 22), present in 1 - 3 copies. Additional size variation was generated by 3 bp deletions. Excluding the extra copies of the 21 bp motif, the various members of this group shared 83 - 100 % homology in the region studied. There were 52 strictly conserved triplets and 39 triplets with limited variability: 27 where one position of the codon was substituted in 1 or more members, 7 where 2 nucleotides of the triplet showed variability, generating 1 or 2 alternative codons at that position, and 5 where all 3 nucleotides were variable, resulting in 2-4 alternative codons or in a triplet deletion. It is unlikely that such variability was due to Taq DNA polymerase errors, as identical point mutations were frequently observed in several distinct members. In addition, the observation of numerous triplet deletions argues against random amplification errors.
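The triplet-by-triplet tally described above can be sketched in a few lines. This is not the procedure used in the paper, which is not described in that detail: the function name `classify_triplets`, the toy alignment and the choice to report triplet deletions as their own category (the text groups them with fully variable triplets) are all assumptions made for illustration.

```python
def classify_triplets(aligned):
    """Count aligned triplets by their number of variable positions.

    `aligned` is a list of equal-length, in-frame aligned sequences in
    which '-' marks a gap. Returns a dict with keys 0, 1, 2, 3 (variable
    positions per triplet) and 'deletion' (at least one member lacks the
    whole triplet).
    """
    length = len(aligned[0])
    if length % 3 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be in frame and of equal length")
    counts = {0: 0, 1: 0, 2: 0, 3: 0, "deletion": 0}
    for start in range(0, length, 3):
        codons = [s[start:start + 3] for s in aligned]
        if "---" in codons:
            counts["deletion"] += 1
            continue
        # A position is variable if more than one base occurs there.
        variable = sum(len({c[i] for c in codons}) > 1 for i in range(3))
        counts[variable] += 1
    return counts


# Hypothetical toy alignment, not taken from Figure 22.
toy = ["ATGGCTAAATTT",
       "ATGGCAAAATTT",
       "ATGGCT---TAC"]
print(classify_triplets(toy))
```

Applied to a real alignment, key 0 would correspond to the strictly conserved triplets and keys 1 to 3 to the classes of limited variability listed in the text.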

2.2 Gel analysis of PCR products

Analysis of the PCR products was done on 2% agarose gels (Seakem GTG agarose, FMC) in TBE buffer (0.089 M Tris, 0.089 M boric acid, 0.001 M EDTA). Restriction analysis of PCR fragments was done by digesting 5 µl of the amplification reaction with 50 U RsaI restriction enzyme as recommended by the manufacturer (Biolabs).

2.3 DNA cloning and sequencing of PCR products

1 µl of the PCR reaction obtained after amplification from 3D7 genomic DNA was cloned into the pCRII vector. Commercially competent INVαF' cells (Invitrogen) were transformed with the ligation products and recombinant colonies were randomly picked for analysis. pCRII recombinant DNA was prepared by the alkaline denaturation method {13}. Double-stranded DNA sequencing was performed using the Sequenase kit (USB) using T7 and SP6 sequencing primers. Sequencing of the 3 cDNAs isolated by hybridisation using the Pf60.1 probe was done in a similar way {6}. Multiple alignment of the nucleotide and amino acid sequences was performed using the Clustal V program {14}.

2.4 Southern blot analysis

Parasites were harvested from in vitro cultures and genomic DNA was prepared as described {15}. DNA from P. falciparum Palo Alto FUP/CB was digested with EcoRI, EcoRI/HindIII, HindIII, TaqI, MboI and Sau3AI as recommended by the supplier. After digestion, DNA was electrophoresed on 0.8% agarose gels, transferred onto Hybond N membranes (Amersham) as recommended by the manufacturer, and hybridised with specific probes, generated by PCR amplification from bacteriophage λPf60.1 or from the PCR-16 plasmid using primers P1 and P2. PCR products were labelled with 32P by nick-translation (nick translation kit, Boehringer) using α-32P dATP (Amersham). For hybridisation, the Nylon membranes were incubated overnight at 65°C with the probes as described {15} and extensively washed at 65°C, three times in 6xSSC, once in 2xSSC and once in 0.1xSSC before autoradiography (1xSSC is 0.15 M NaCl, 0.015 M Na3citrate, pH 7.0). Dehybridisation was done as recommended by Amersham.
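Because the stringency of each wash is set by its salt concentration, it can help to spell out what the SSC strengths mean in molar terms. The short sketch below scales the 1xSSC composition given in the text; the function name `ssc` is hypothetical and the rounding to six decimals only hides floating-point noise.

```python
# 1xSSC composition in mol/l, as defined in the text.
SSC_1X = {"NaCl": 0.15, "Na3citrate": 0.015}


def ssc(strength):
    """Molar concentrations of each salt in `strength` x SSC."""
    return {salt: round(conc * strength, 6) for salt, conc in SSC_1X.items()}


# Wash series used for the Southern blots: 6x, 2x and 0.1x SSC.
for strength in (6, 2, 0.1):
    print(f"{strength}x SSC:", ssc(strength))
```

Lower salt means higher stringency, so the final 0.1xSSC wash is the most stringent step of the series.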

The genomic copies cloned in PCR16, -10, -7, -5, -20, -21 and -22 presented a B-type sequence. This group of clones had a shorter sequence (279 or 285 bp), shared strong homology (92 - 99 %) with each other, but presented a low homology with sub-family A, due to numerous deletions and point mutations. For instance, clone PCR16, which was used in hybridisation experiments (see below), presented only 44 % homology with the Pf60.1 sequence, including numerous triplet deletions and mismatches, affecting any position within codons. None of these clones encoded either the 21 bp repeat or the RAP-1 consensus homologue. Within this group of B-type sequences, 64 / 81 triplets were strictly conserved. There were 10 triplets where a single substitution was observed, affecting 1 or 2 members; 4 triplets with 2 nucleotides possibly substituted and 3 triplets with all 3 nucleotides showing variability.
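Homology percentages such as those quoted above depend on how gapped positions are treated, which the text does not specify. The sketch below computes pairwise identity from two aligned sequences under both conventions; the function name `percent_identity`, its `count_gaps` switch and the toy sequences are assumptions for illustration, not the authors' method.

```python
def percent_identity(a, b, count_gaps=True):
    """Percentage of identical positions between two aligned sequences.

    Columns where both sequences have a gap are always skipped. With
    count_gaps=True a gap facing a base counts as a mismatch; with
    count_gaps=False such columns are skipped as well.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = compared = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        if not count_gaps and "-" in (x, y):
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical toy pair with one triplet deletion and two substitutions.
seq_a = "ATGGCTAAATTT"
seq_b = "ATGGCA---TAT"
print(round(percent_identity(seq_a, seq_b), 1))
print(round(percent_identity(seq_a, seq_b, count_gaps=False), 1))
```

With many triplet deletions, as between the A and B prototypes, the two conventions can give noticeably different values, so the convention should be stated alongside any percentage.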

2.5 Pulse field gel analysis

P. falciparum 3D7 parasites, kindly provided by D. Walliker, were cultivated in A+ human red blood cells in AB+ human serum, as described {6}. Chromosome blocks were prepared as described {15}. Chromosomes were separated on a 0.7% agarose gel (chromosomal grade, Bio-Rad) in 0.5xTBE chamber buffer by pulse field gel electrophoresis (PFGE), performed in a CHEF apparatus as described {16}. The gel was stained with ethidium bromide, photographed and UV-irradiated for 3 min prior to alkaline transfer onto a Hybond N+ membrane (Amersham).

3. Results

Figure 22 also shows evidence for hybrid genes. The closely related clones cDNA 10.3 and PCR-14 possessed an A-type 5' sequence, with the largest number of 21 bp repeats (4.5 copies) and displayed in the second half of the sequence numerous point mutations followed by a mosaic A/B structure downstream the last 3' triplets of the RAP-1 consensus sequence. Clone PCR19 was an A-B hybrid. It presented the typical 21 bp motif but lacked the RAP-1 consensus homologue. A similar organisation was observed in clone PCR12 (a partial sequence was obtained for this clone, due to deletion during cloning in E. coli). Clones PCR2 and -9 were more complex, having A-type 5' and 3' sequences and an atypical B-type central region. They both displayed the A-type 21 bp motif and the RAP-1 consensus homologue. The atypical B-type sequence was similar to that observed for clones PCR16 and -10.

3.1 Analysis of Pf60 P1-P2 region in the 3D7 repertoire

As previously reported, numerous PCR fragments were amplified from 3D7 genomic DNA using primers P1 and P2, derived from the Pf60.1 sequence {6, 12}. The various products were cloned into the pCRII vector. Individual clones were analysed by size, restriction fragment polymorphism and DNA sequencing. The size of the amplified fragments ranged from 280 to 382 bp. In addition, the same region was sequenced from 3 distinct cDNAs isolated from a 3D7 cDNA library by hybridisation with the Pf60.1 probe {6}. Figure 22 shows the alignment of the various nucleotide sequences. Considerable diversity was observed. There were only 14 triplets which were strictly identical in all members. Nine triplets were conserved in all but one member, and 2 additional triplets were identical in all but 2 members. These showed the same point mutation. For all other positions, there


Figure 23: Alignment of deduced amino acid sequences, indicated in single letter code, encoded by 3D7 genomic and cDNA sequences, var exon II and var-related cDNA sequences. Genbank accession numbers of the various var-related DNA sequences from the Dd2 P. falciparum clone are as follows: L40600 (LT140), L40601 (LT141), L40602 (LT142), L40603 (LT145), L40604 (LT147), L40605 (LT148), L40606 (LT150), L40607 (LT152) {10}. Genbank accession numbers of the various var sequences were as follows: L40608 (Dd2var 1) {10}, L40609 (FCR3var 2-3) {10}, L42636 (Dd2var 7) {10}, L42244 (A4var) {9}, U27338 (MCvar1) {8}, U53324 (3D7var 1) {17}. Deduced amino acid sequences in the P1-P2 region were aligned with the help of the CLUSTAL V program. Gaps have been introduced to maximise homologies. "-" indicates absence of any residue at that position. Amino acids common to both prototype sequences (framework) are indicated in grey, A-specific amino acids are indicated in blue, and A variants common to ≥ 3 members are indicated in yellow. The A-specific 7 amino acid repeats are boxed with light lines, the common 6 amino acid motif is boxed with heavy bars. B-specific amino acids are indicated in green or red. Substitutions observed in a single member are boxed and indicated in white. The sequence in the region corresponding to the P1 and P2 primers is shown for all members; it is indicated in italics for the PCR products amplified using these primers, to indicate that it may not reflect the actual genomic sequence.


As illustrated in Figure 22, alignment of the various sequences involved deletions or insertions of single or multiple triplets, resulting in loss of 1 or more amino acids in the corresponding protein sequence. There was one example (clone PCR2), where a single base pair deletion introduced a frame shift, compensated 11 bases downstream by insertion of one nucleotide (underlined in the sequence of clone PCR2 shown in Figure 22).

3.2 Deduced protein sequence

In Figure 23, the deduced amino acid sequence of the 3D7-derived genomic and cDNA fragments was aligned with the protein sequence deduced from published var genes or spliced var cDNAs encoding PfEMP1, derived from several parasite lines or clones, Malayan Camp {8}, A4 {9}, Dd2 {10}, FCR3 {10} and 3D7 {17}. Additional var-related sequences were from Dd2 cDNAs containing unspliced sequences, i.e. containing some intron sequence and a genomic-type intron/exon II junction {10}. Figure 23 outlines that all Pf60-related, as well as var and var-related sequences aligned according to 2 major protein sequence types. The only members presenting a purely A-type or B-type sequence throughout the region analysed were those cloned from 3D7 genomic DNA or cDNA. Numerous A-B type hybrids were observed with variable A vs B ratios. Interestingly, with the exception of 3D7 var1, all var and var-related sequences were of hybrid type. There was a common framework (depicted in grey) of 54 conserved residues (29 strictly conserved in all sequences analysed and 25 with limited variability). A-type sequences were relatively homogeneous. There were 50 A-specific residues, indicated in blue or yellow, with 24 common (blue) strictly conserved amino acids and 26 positions showing limited variability in a subset of members, mainly due to point mutations. The substitutions observed in ≥ 3 members have been indicated in yellow. Clones cDNA 5.1 and 3D7var1 were identical and showed, within the A-type sequences, the greatest divergence from the consensus. B-type sequences differed substantially from the previous group. As shown in Figure 23, 38 - 40 B-specific residues (depicted in green or red) were identified, with 19 strictly identical B-specific residues in all B-type sequences (green). In addition, there were 23 positions where 2 alternative B-type sequences (either green or red) were observed, suggesting existence of 2 B sub-types. Within this frame, there was limited variability. Typical of B-type sequence was the lack of N-terminal 7 amino acid repeats, as well as absence of homology with the Babesia sp RAP-1 consensus.

Figure 24: Sequences hybridising to Pf60.1 and PCR16 do not co-localize. Southern blot analysis of FUP/CB genomic DNA digested with TaqI (lane 1), HindIII (lane 2), EcoRI/HindIII (lane 3) or EcoRI (lane 4) and probed after transfer onto a Hybond N membrane with the 32P-labelled, nick-translated P1/P2 PCR product generated from Pf60.1 clone (panel A) or PCR16 clone (panel B). Shown is an autoradiograph, obtained after washing the membrane in 0.2xSSC at 65°C. Position and size (in kbp) of the HindIII fragments of the bacteriophage λ DNA are marked by arrows.

The hybrid sequences represented a complex group. Five subgroups have been outlined in Figure 23. PCR-9 and -2 were A-B-A type hybrids, with a central atypical B-type sequence. The LT140, -145 and -147 cDNAs were difficult to classify. The first 2 were closely related. They presented the largest proportion of A-type sequence, but had a short central B-type sequence and shared with LT147 an atypical N-terminal sequence (related to that observed in LT141 and -142 and Dd2var 7), due to mutations/deletion in the region corresponding to primer P1. The 3 cDNAs lacked the canonical N-terminal A-specific 7 amino acid repeats. In contrast, the C-terminal region of LT147 was similar to that observed in cDNA 10.3, PCR 14, LT152 and FCR3var 2, with the A/B frontier located within the RAP-1 consensus homologue. This group shared a specific block of sequence (indicated in red) and in addition showed unique variability in the sequence corresponding to primer P2. In contrast, LT 150, Dd2var 1, FCR3var 3, A4var and MCvar1 formed a homogeneous group, with 4 identical sequences. They presented a specific series of substitutions and deletions in the region corresponding to the RAP-1 consensus homologue. The A/B boundary of PCR-19 and -12 was located closer to the N-terminus. LT141, -142 and Dd2var 7 had the lowest increment of A-type sequences. They showed an atypical sequence in the region corresponding to the 3' end of primer P1, but possessed downstream a typical A-type 9 amino acid-long block.

3.3 Hybridisation patterns

Figure 24 shows a comparison of the hybridisation pattern of representative members of each sub-family, namely clone Pf60.1 (sub-family A) and clone PCR16 (sub-family B) on Southern blots of Palo Alto FUP/CB genomic DNA. As previously reported {6}, the


Pf60.1 probe generated a multiple band pattern on P. falciparum genomic DNA (panel A). The PCR-16 probe also displayed multiple band hybridisation profiles (panel B), but for each restriction enzyme, the pattern was distinct from the one generated with the Pf60.1 probe. Most fragments hybridising with one probe did not react with the other one. This difference indicates that the 2 probes hybridised to distinct members of this multigene family, located on distinct restriction fragments. In other words, A-type and B-type sequences do not co-localize.

A common, well conserved framework of 54 amino acids was identified, accounting for 42 - 52 % of the residues in A-type sequences and 59 - 60 % of total residues in B-type sequences. Differences between both prototype sequences were due to i) presence (A type) or absence (B type) of the 21 bp motif encoding the Ser Gly Asn Asn Thr Thr Ala sequence. A related motif (Ser Gly Asn Asn Thr Pro) was present in many but not all sequences and showed substantial variability; ii) deletions of single or multiple triplets; iii) accumulation of point mutations. Interestingly, a large proportion (78 %) of the point mutations resulted in non-conservative amino acid substitutions. These numerous differences and in particular the large number of amino acid substitutions resulted in different net charge in both types: A-type sequences are predicted to be more acidic than B-type sequences. A-type sequences presented substantial size variation, mainly due to variable copy number of the 21 bp motif (present in single copy in Pf60.1) and deletions of 1-2 triplets. Size variability was limited in B-type sequences. There were 23 positions where 2 alternative B-specific sequences were observed (depicted in red or green in Figure 23). This was less frequent for type A sequences, namely a total of 11 positions with mutations shared by ≥ 3 A and/or hybrid sequences (depicted in yellow).
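The copy number of the A-specific repeat can be read directly off a deduced protein sequence. The sketch below counts back-to-back full copies of the heptapeptide encoded by the 21 bp motif (Ser Gly Asn Asn Thr Thr Ala, i.e. SGNNTTA in one-letter code). It assumes the copies are arranged in tandem and counts only exact, complete copies, so partial or variant copies are ignored; the function name and the test sequences are hypothetical.

```python
A_REPEAT = "SGNNTTA"  # Ser Gly Asn Asn Thr Thr Ala, one-letter code


def max_tandem_copies(protein, motif=A_REPEAT):
    """Largest number of back-to-back full copies of `motif` in `protein`."""
    best = 0
    start = protein.find(motif)
    while start != -1:
        # Walk forward motif by motif from this occurrence.
        copies = 0
        pos = start
        while protein.startswith(motif, pos):
            copies += 1
            pos += len(motif)
        best = max(best, copies)
        start = protein.find(motif, start + 1)
    return best


# Hypothetical sequences, not taken from Figure 23.
print(max_tandem_copies("MK" + A_REPEAT * 3 + "SGNNTPQE"))
print(max_tandem_copies("MKLVE"))
```

A result of 0 for a sequence would be consistent with a B-type N-terminus, although on its own it does not establish the type.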

Probing 3D7 chromosome blots confirmed this conclusion. Figure 25 shows that both probes produced distinct profiles. As previously reported, the Pf60.1 probe hybridised to most chromosomes, with strong signals on chromosomes 12, 10, 9, 8-6 and 1 (panel A). In contrast, the PCR-16 probe hybridised to fewer chromosomes; in particular, no hybridisation was observed on chromosomes 1 and 10, a reduced signal was detected on chromosomes 8-6, but an intense signal was observed on chromosome 4, which was only weakly labelled by the Pf60.1 probe.

4. Discussion

Characteristic of sub-family A was the presence of a well conserved sequence homologous to the RAP-1 consensus motif present in 60 kDa Babesia sp antigens {7}. We had previously hypothesised that this particular sequence was responsible for the reaction of clone Pf60.1 with the anti-Bd37 antiserum, raised to a B. divergens antigenic fraction {18} and for the reaction of anti-Pf60.1 antibodies with 60 kDa merozoite-associated antigens in B. divergens {6}. This interpretation is substantiated by the lack of

Little is known to date on the extent of diversity of the repertoire of genes encoding var and Pf60-related sequences in the P. falciparum genome. Results reported here show that the region homologous to the original Pf60.1 clone shows remarkable diversity. Twenty distinct versions of this region have been identified in the 3D7 repertoire. This is a minimum estimate. The sequence of both the P1 and P2 primers showed some variability, indicating that some members will not be amplified using this pair of primers. Both the nucleotide and the deduced amino acid alignments point to the presence of 2 main prototype sequences. The first prototype sequence, that we have called here the A-type sequence was described previously with the Pf60.1 clone {6}. The second one, called here B, was identified during the analysis reported here and has not yet been described. Interestingly, most of the divergences outlined upon alignment of the var sequences with Pf60.1 were indeed due to the fact that most var genes described so far are A-B hybrids, with varying ratios of A vs B sequences. The existence of such hybrids indicates that the Pf60-related sequences and var exon II are part of a single multigene family. The presence of 2 easily outlined prototype sequences suggests that the various genes analysed here derive from two ancestral sequences. This is reminiscent of the situation observed for other P. falciparum genes such as MSP-1 and MSP-2, which present numerous alleles derived from either of 2 quite distinct prototype sequences {19}. But unlike these single copy genes, the Pf60/var family has evolved by multiple amplification steps. During this process, point mutations as well as recombination and/or gene conversion events have been followed by subsequent gene duplication, resulting in groups of members presenting the same set of modifications. 
Several A/B junctions have been observed, indicating that recombination and/or gene conversion has occurred in several distinct positions. There were a few examples of A/B/A type hybrids. The absence of B/A type hybrids is remarkable in view of the large number of A/B hybrids observed here. This unidirectionality suggests that gene conversion generated hybrid and mosaic genes.
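The hybrid types discussed above (A/B, A/B/A, and the absent B/A) can be derived mechanically once each aligned residue has been labelled as A-specific, B-specific or framework. The sketch below collapses such a label string into a pattern and counts junctions. The labelling scheme ('A', 'B', and 'F' for framework), the function names and the examples are assumptions for illustration; a real analysis would also need a rule for ignoring isolated substitutions, which is not handled here.

```python
from itertools import groupby


def hybrid_pattern(labels):
    """Collapse per-residue labels into a pattern such as 'A', 'B' or 'A/B/A'.

    `labels` holds one character per aligned residue: 'A' or 'B' for
    prototype-specific residues; anything else (e.g. 'F' for framework
    residues shared by both prototypes) is uninformative and skipped.
    """
    informative = [c for c in labels if c in ("A", "B")]
    return "/".join(key for key, _ in groupby(informative))


def junction_count(labels):
    """Number of switches between A-type and B-type blocks."""
    return hybrid_pattern(labels).count("/")


# Hypothetical label strings, read from N-terminus to C-terminus.
for labels in ("AAFFAABFBBB", "AAFBBFFAA", "FFBBB"):
    print(labels, "->", hybrid_pattern(labels), junction_count(labels))
```

Under this scheme a B/A hybrid would appear as the pattern 'B/A', so the unidirectionality noted in the text could be checked by tallying patterns over all sequences.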

Figure 25: Pf60.1 and PCR-16 probes generate distinct chromosome profiles. Chromosomes from 3D7 clone separated by PFGE were probed as for Figure 24 with the 32P-labelled, nick-translated P1/P2 PCR product generated from Pf60.1 clone (panel A) or PCR16 clone (panel B). Autoradiography has been carried out after washing the membrane in 0.2xSSC at 65°C. The position of the 14 chromosomes of 3D7 clone are marked by arrows.


genes are also expressed by 3D7 erythrocytic stages (Bonnefoy, unpublished results). We still do not know whether these genes code for merozoite-associated products or for as yet undescribed PfEMP1 molecules. Work is in progress to characterise the corresponding genes.

reaction of the anti-Bd37 antiserum with a recombinant glutathione-S-transferase expressing the PCR16 fragment, derived from sub-family B and not presenting the RAP-1 homologue (data not shown). The presence of the conserved framework, scattered all over this region, provides however sufficient homology for immunologic cross-recognition. Indeed, anti-Pf60.1 specific antibodies did react with the PCR16 recombinant protein (data not shown). In addition, the antiserum raised to GT60.1, a recombinant Pf60.1 glutathione-S-transferase, reacted on immunoblots with a high molecular mass, Triton-insoluble, SDS-soluble antigen of variable size in the O and R variants of the Palo Alto FUP/SP line {20}. This is consistent with a cross-reaction of the anti-GT60.1 antibodies with the O and R-specific PfEMP1 variant antigen.

Acknowledgements This work has been supported by a grant from the Direction des Recherches Etudes et Techniques. We thank Artur Scherf and Gordon Langsley for kindly reviewing the manuscript.

References

{1} Pays, E., Vanhamme, L. and Berberof, M. (1994). Genetic controls for the expression of surface antigens in African trypanosomes. Annu. Rev. Microbiol. 48, 25-52.
{2} Meyer, T.F., Gibbs, C.P. and Haas, R. (1990). Variation and control of protein expression in Neisseria. Annu. Rev. Microbiol. 44, 451-477.
{3} Fischetti, V. (1989). Streptococcal M protein: molecular design and biological behavior. Clin. Microbiol. Rev. 2, 307-313.
{4} Adams, J.H., Hudson, D.E., Torii, M., Ward, M., Wellems, T.E., Aikawa, M. and Miller, L.H. (1990). The Duffy receptor family of Plasmodium knowlesi is located within the micronemes of invasive malaria merozoites. Cell 63, 141-153.
{5} Sinha, K., Keen, J.K., Ogun, S.A. and Holder, A.A. (1996). Comparison of two members of a multigene family coding for high molecular mass rhoptry proteins of Plasmodium yoelii. Mol. Biochem. Parasitol. 76, 329-332.
{6} Carcy, B., Bonnefoy, S., Guillotte, M., Le Scanf, C., Grellier, P., Schrevel, J., Fandeur, T. and Mercereau-Puijalon, O. (1994). A large multigene family expressed during the erythrocytic schizogony of Plasmodium falciparum. Mol. Biochem. Parasitol. 68, 221-233.
{7} Suarez, C.E., McElwain, T.F., Stephens, E.B., Mishra, V.S. and Palmer, G.H. (1991). Sequence conservation among merozoite apical complex proteins of Babesia bovis, Babesia bigemina and other Apicomplexa. Mol. Biochem. Parasitol. 49, 329-332.
{8} Baruch, D.I., Pasloske, B.L., Singh, H.B., Bi, X., Ma, X.C., Feldman, M., Taraschi, T. and Howard, R.J. (1995). Cloning of the P. falciparum gene encoding PfEMP1, a malarial variant antigen and adherence receptor on the surface of parasitized human erythrocytes. Cell 82, 77-87.
{9} Smith, J.D., Chitnis, C.E., Craig, A.G., Roberts, D.J., Hudson-Taylor, D.E., Peterson, D.S., Pinches, R., Newbold, C.I. and Miller, L.H. (1995). Switches in expression of Plasmodium falciparum var genes correlate with changes in antigenic and cytoadherent phenotypes on infected erythrocytes. Cell 82, 101-110.
{10} Su, X.Z., Heatwole, V.M., Wertheimer, S.P., Guinet, F., Herrfeldt, J.A., Peterson, D.S., Ravetch, J.A. and Wellems, T.E. (1995). The large diverse gene family var encodes proteins involved in cytoadherence and antigenic variation of Plasmodium falciparum-infected erythrocytes. Cell 82, 89-100.
{11} Rawlings, D.J. and Kaslow, D.C. (1992). A novel 40 kDa membrane associated EF-hand calcium binding protein in Plasmodium falciparum. J. Biol. Chem. 267, 3976-3982.
{12} Carcy, B., Bonnefoy, S., Schrevel, J. and Mercereau-Puijalon, O. (1995). Plasmodium falciparum: Typing of malaria parasites based on polymorphism of a novel multigene family. Exp. Parasitol. 80, 463-472.
{13} Birnboim, H.C. and Doly, J. (1979). A rapid alkaline extraction procedure for screening recombinant plasmid DNA. Nucleic Acids Res. 7, 1513-1523.
{14} Higgins, D.G., Bleasby, A.J. and Fuchs, R. (1992). CLUSTAL V: improved software for multiple sequence alignment. Comp. Applic. Biosci. 8, 189-191.
{15} Bonnefoy, S., Guillotte, M., Langsley, G. and Mercereau-Puijalon, O. (1992). Plasmodium falciparum: characterization of gene R45 encoding a trophozoite antigen containing a central block of six amino acid repeats. Exp. Parasitol. 74, 441-451.

Under stringent hybridisation conditions, the PCR16 insert (used as a representative B-type probe) did not cross-hybridise with the Pf60.1 clone and vice-versa. Both hybridised on Southern blots of genomic DNA with numerous restriction fragments and on chromosome blots with several chromosomes. The specific hybridisation patterns indicate that distinct members were visualised by both probes. Data shown in Figure 24 and Figure 25 indicate that sequences hybridising with the Pf60.1 probe and sequences hybridising with the PCR16 probe do not co-localize and show distinct chromosome distribution. The figure of approximately 140 members for the Pf60 family {6} was derived from quantification experiments where the Pf60.1 probe has been used under moderately stringent conditions. Our observation that this family contains purely A-type or B-type sequences as well as numerous hybrid genes raises some questions concerning the interpretation of this quantification. Under the hybridisation conditions used (2xSSC, at 65°C), the contribution of individual genes to the global hybridisation signal depends on their nucleotide sequence : A-type genes are predicted to generate strong signals, while B-type will produce fainter signals. The signals obtained for hybrid sequences will depend on the length of A-type sequence present in each gene. This indicates that the actual number of members of the Pf60/var family may be larger than estimated originally for the Pf60 family {6}. The complexity of the structure of the Pf60/var family in this region is puzzling. Two main prototype sequences have been identified, but it looks as if 3 classes of genes exist : genes with a purely A- or B-type sequence and those with a hybrid sequence. We do not know whether the presence of a specific type in the region reflects some functional differences. Available evidence indicates that the 3 main groups of sequences are part of actively expressed genes. 
A/B hybrid type sequences have been observed in var and var-related cDNAs. The situation for purely A- or B-type sequences is less clear. Purely A-type sequences have been observed in 2 quite distinct classes of cDNAs: the 3D7 var1 cDNA {11} and the 3 kb mRNA expressed late in schizogony, identified previously using the Pf60.1 probe {6}. Characterisation of the full sequences of cDNA 5.1 and 6.1 showed unique 5’ sequences with no homology to any of the var genes described so far, including 3D7 var1, which is very similar to cDNA 5.1 throughout its exon II sequence. Importantly, antisera raised to 5.1- and 6.1-specific regions reacted strongly with merozoites (Bischoff et al., in preparation). This is in line with our previous observations that antisera raised to GT60.1 reacted with 60 kDa merozoite-associated antigens both in P. falciparum and in B. divergens {6}. RT-PCR using PCR16-derived primers indicated that purely B-type


{16} Hinterberg, K. and Scherf, A. (1994). PFGE: Improved conditions for rapid and high-resolution separation of Plasmodium falciparum chromosomes. Parasitol. Today 10, 225.
{17} Rubio, J., Thompson, J.K. and Cowman, A.F. (1996). The var genes of Plasmodium falciparum are located in the subtelomeric region of most chromosomes. EMBO J. 15, 4069-4077.
{18} Précigout, E., Gorenflot, A., Valentin, A., Bissuel, G., Carcy, B., Brasseur, P., Moreau, Y. and Schrével, J. (1991). Analysis of immune responses of different hosts to Babesia divergens isolates from different geographic areas and capacity of culture-derived exoantigens to induce efficient cross-protection. Infect. Immun. 59, 2799-2805.
{19} Kemp, D.J., Cowman, A.F. and Walliker, D. (1990). Genetic diversity in Plasmodium falciparum. Adv. Parasitol. 29, 75-149.
{20} Le Scanf, C., Fandeur, T., Morales-Betoulle, M.E. and Mercereau-Puijalon, O. (1997). Altered expression of several antigens associated with the infected red blood cell membrane during antigenic variation in Plasmodium falciparum. Exp. Parasitol., in press.


I. DISCUSSION OF ARTICLE 1

The results reported in this article show that the region homologous to the genomic clone Pf60.1 displays remarkable diversity. Twenty different versions of this region were identified in the repertoire of the parasite strain 3D7. Alignments of the nucleotide sequences and of the deduced protein sequences reveal two prototype sequences in the region concerned (Figure 22, page 48; Figure 23, page 50). The first, termed A, had been described previously with clone Pf60.1 (Carcy et al., 1994); the second, termed B, was identified in the present study and had never been described before. Interestingly, var genes are generally A-B hybrids. The existence of such hybrids demonstrates that the genes of the Pf60 family and exon II of the var genes belong to the same multigene family.

The existence of two prototype sequences for the region corresponding to the genomic clone Pf60.1 suggests that the genes analysed derive from two ancestral sequences, a situation reminiscent of that observed for other P. falciparum genes, such as the allelic families of MSP1 (Figure 8, page 13). However, unlike that gene, which is present as a single copy per haploid genome, the Pf60/var multigene family has evolved through multiple amplification steps, during which point mutations as well as recombination and/or gene conversion took place; the existence of groups of members sharing the same modification profile is the result of these evolutionary processes. Several A/B junctions are observed, as well as A/B/A hybrids. The absence of B/A hybrids is striking given the numerous A/B hybrids observed; these unidirectional transitions strongly suggest that a gene conversion mechanism is responsible for generating the hybrids.
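The notion of A/B and A/B/A hybrids can be made concrete with a small sketch. It assumes gap-free, pre-aligned sequences of equal length and labels each fixed-size window by whichever prototype it matches better. The prototypes, the test sequences, the window size and the function names are all hypothetical and chosen only for illustration; this is not the procedure used in the article.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def segment_types(seq, proto_a, proto_b, window=4):
    """Label each non-overlapping window 'A' or 'B' by its closer prototype.

    Ties are assigned to 'A' (an arbitrary choice made for this sketch).
    """
    labels = []
    for start in range(0, len(seq) - window + 1, window):
        end = start + window
        sim_a = identity(seq[start:end], proto_a[start:end])
        sim_b = identity(seq[start:end], proto_b[start:end])
        labels.append("A" if sim_a >= sim_b else "B")
    return labels

def junction_pattern(labels):
    """Collapse runs of identical labels, e.g. ['A', 'A', 'B'] -> 'A/B'."""
    collapsed = [labels[0]]
    for label in labels[1:]:
        if label != collapsed[-1]:
            collapsed.append(label)
    return "/".join(collapsed)

# Hypothetical aligned prototypes and test sequences (not real Pf60 data).
PROTO_A = "AAAACCCCGGGG"
PROTO_B = "TTTTGGGGCCCC"
hybrid_ab = "AAAACCCCCCCC"    # A-like start, B-like end
hybrid_aba = "AAAAGGGGGGGG"   # A-like, then B-like, then A-like

print(junction_pattern(segment_types(hybrid_ab, PROTO_A, PROTO_B)))
print(junction_pattern(segment_types(hybrid_aba, PROTO_A, PROTO_B)))
```

With real data, the window size and the tie rule would need to be chosen with care, since short windows in conserved stretches match both prototypes equally well.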
Subfamily A is characterised by the presence of the consensus motif found in the RAP-1 proteins of Babesia species (Précigout et al., 1991). This sequence could account for the cross-reaction, on the one hand, between the anti-Bd37 serum and the recombinant protein derived from clone Pf60.1 (Précigout et al., 1991) and, on the other hand, between the anti-Pf60.1 serum and the 60 kDa rhoptry-associated antigen of B. divergens (Carcy et al., 1994). This interpretation appears to be supported by the absence of cross-reaction between the anti-Bd37 serum and the recombinant protein derived from the PCR16 fragment, a representative of subfamily B which, as such, lacks this motif. By contrast, the presence of a consensus backbone allows anti-Pf60.1 antibodies to react with this latter recombinant protein.

Since the publication of this article, a paradigm has fallen and a scientific revolution is under way: with the third millennium, we have entered "the post-genomic era". Data from the sequencing of the P. falciparum genome, combined with those presented in Figure 23, made it possible to establish an alignment of 125 protein sequences deduced from exon 2 of the var genes, from orf2 of genes 5.1 and 6.1 and, probably, from other paralogues (see Appendix C, page 119). From this alignment, 15 conserved, semi-conserved or polymorphic regions can be defined (Table 5). The polymorphism of the semi-conserved and polymorphic regions is of three kinds. First, there is sequence polymorphism, which characterises more markedly the semi-conserved regions (blocks 1, 4, 8 and 15). The polymorphic regions, for their part, are characterised by indels, which are sometimes large, as in blocks 10, 12 and 14. Remarkably for the P. falciparum genome, only block 3, previously described in Article 1, displays repeat polymorphism (repeats variable in number and in sequence).

Block  Diversity/conservation  No. of subfamilies  Type of polymorphism
1      semi-conserved          2                   sequence polymorphism
2      conserved               -                   -
3      polymorphic             2                   repeat polymorphism, indels
4      semi-conserved          3                   sequence polymorphism
5      conserved               -                   -
6      polymorphic             -                   sequence polymorphism, indels
7      conserved               -                   -
8      semi-conserved          -                   sequence polymorphism
9      conserved               -                   -
10     polymorphic             -                   indels
11     conserved               -                   -
12     polymorphic             -                   indels
13     semi-conserved          -                   sequence polymorphism, indels
14     polymorphic             -                   indels
15     semi-conserved          -                   sequence polymorphism

Table 5: Conserved, semi-conserved or polymorphic regions identifiable in the alignment presented in Appendix C.
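As a rough illustration of how blocks of an alignment such as those in Table 5 might be labelled, the sketch below scores each column by the frequency of its most common residue and classifies a block by the mean score. The text does not state the criteria actually used, so the thresholds (0.9 and 0.6), the function names and the toy alignment are assumptions made for this example only.

```python
from collections import Counter

def column_conservation(column):
    """Frequency of the most common character in one alignment column.

    A gap ('-') is counted as a character state like any residue.
    """
    counts = Counter(column)
    return counts.most_common(1)[0][1] / len(column)

def classify_block(sequences, start, end, conserved=0.9, semi=0.6):
    """Classify alignment columns [start, end) by mean column conservation.

    The two thresholds are hypothetical values, not taken from the text.
    """
    columns = [[seq[i] for seq in sequences] for i in range(start, end)]
    mean = sum(column_conservation(col) for col in columns) / len(columns)
    if mean >= conserved:
        return "conserved"
    if mean >= semi:
        return "semi-conserved"
    return "polymorphic"

# Toy alignment ('-' marks a gap); not taken from Appendix C.
alignment = [
    "KKLMAT-E",
    "KKLMSTQE",
    "KKLMAS-D",
    "KKLMSS-K",
]
print(classify_block(alignment, 0, 4))
print(classify_block(alignment, 2, 6))
print(classify_block(alignment, 4, 8))
```

A single mean score cannot distinguish sequence polymorphism from indels or repeat-number variation, which is why Table 5 records the type of polymorphism in a separate column.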

The portion of sequence shown in the original alignment (Figure 23) corresponds to half of block 2 and to blocks 3, 4 and 5 of the new alignment (Appendix C). Comparison of the two alignments calls for one major remark: the dichotomy between subfamily A and subfamily B does not extend over the whole length of the alignment but is, on the contrary, restricted to block 4. The members of subfamily B appear to be only a fraction of family A. On the other hand, given the larger sample and the exhaustive nature of genome sequencing, a third subfamily (C) emerges from the alignment, over its entire length (in magenta, block 4). Sequence variability at the site where the oligonucleotide primer P2 must hybridise is very probably responsible for the failure to amplify fragments of subfamily C with the P1/P2 primer pair used. The members of subfamilies B and C are closer to one another than are the members of subfamily A, as shown by the phylogenetic tree derived from the alignment (Figure 26). Despite this extensive polymorphism, a backbone of conserved positions appears over the whole length of the alignment (shown in light and dark grey, Appendix C), together with 5 conserved regions (blocks 2, 5, 7, 9 and 11).


[Figure 26: tree image not reproduced. Leaf labels include the PCR fragments (e.g. pcr16), cdna5.1, cdna6.1, cdna10.3, pf60.1 and genome-derived var sequences.]

Figure 26: Phylogenetic tree of the members of the Pf60 multigene family. The tree was constructed from the alignment presented in Appendix C. Please note that this tree is unrooted and that branch length is proportional to the degree of divergence between paralogues. Members of subfamily A are shown in blue, those of subfamily B in green and blue, and those of subfamily C in magenta.
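To give a sense of what "degree of divergence" means for such a tree, the sketch below computes pairwise p-distances (the proportion of differing sites) from aligned sequences. The text does not specify which distance measure or tree-building method was used for Figure 26, so the p-distance is an assumption made here for illustration, and the sequence names and residues are hypothetical.

```python
def p_distance(seq1, seq2):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def distance_matrix(aligned):
    """Return {(name1, name2): p-distance} for every ordered pair of names."""
    names = sorted(aligned)
    return {(m, n): p_distance(aligned[m], aligned[n]) for m in names for n in names}

# Hypothetical aligned sequences (not taken from Appendix C).
toy = {"seqA": "KKLMATQE", "seqB": "KKLMSS-D", "seqC": "KKLMSS-E"}
matrix = distance_matrix(toy)
for (m, n), d in sorted(matrix.items()):
    if m < n:
        print(f"{m} vs {n}: {d:.3f}")
```

In this toy example seqB and seqC are closer to each other than either is to seqA, which is the kind of relationship a distance-based tree would display as shorter branches between them.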

A complex picture thus emerges, in which diversity and conservation are intimately linked; this complexity is heightened by the fact that sequence diversity does not account for the functional diversity of the proteins. In the case of the Pf60/var multigene family, it is not the sequence of the homologous elements that functionally distinguishes a var gene from a Pf60 gene but, on the contrary, the sequence of the non-homologous elements (var exon 1 versus Pf60 orf1).
