RESEARCH ARTICLE

Human Mutation OFFICIAL JOURNAL

Sequence Contexts That Determine the Pathogenicity of Base Substitutions at Position +3 of Donor Splice-Sites

www.hgvs.org

Sandie Le Guédard-Méreuze,1 Christel Vaché,2 Nicolas Molinari,3 Julie Vaudaine,1 Mireille Claustres,1,2,4 Anne-Françoise Roux,1,2 and Sylvie Tuffery-Giraud1,4*

1INSERM, U827, Montpellier, F-34000, France; 2CHU Montpellier, Hôpital Arnaud de Villeneuve, Laboratoire de Génétique Moléculaire, Montpellier, F-34000, France; 3CHU Nîmes, Département d'Information Médicale, Nîmes, F-30000, France; 4Université Montpellier 1, UFR Médecine, Montpellier, F-34000, France

Communicated by Garry R. Cutting Received 27 February 2009; accepted revised manuscript 4 June 2009. Published online 16 June 2009 in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/humu.21070

ABSTRACT: Variations at position +3 of 5′ splice-sites (5′ ss) are reported to induce aberrant splicing in some cases but not in others, suggesting that the overall nucleotidic environment can dictate the extent to which 5′ ss are correctly selected. Functional studies of three variations identified in donor splice-sites of the USH2A and PCDH15 genes support this assumption. To gain insights into this question, we compared the nucleotidic context of U2-dependent 5′ ss that naturally deviate (+3G, +3C, or +3T) from the +3A consensus with 5′ ss for which a +3 variation (A>G, A>C, or A>T) was shown to induce aberrant splicing. Statistical differences were found between the two datasets, highlighting the role of one particular position in each context (+3G/+4A; +3C/−1G; and +3T/−1G). We provided experimental support for the biostatistical results through the analysis of a series of artificial mutants in reporter minigenes. Moreover, different 5′ end-mutated U1 snRNA expression plasmids were used to investigate the importance of position +3 and of the two identified compensatory positions −1 and +4 in the recognition of 5′ ss by the U1 snRNP. Overall, our findings establish general properties useful to molecular geneticists to identify nucleotide substitutions at position +3 that are more likely to alter splicing. Hum Mutat 30:1329–1339, 2009. © 2009 Wiley-Liss, Inc.

KEY WORDS: donor splice sites; position +3; sequence context; USH2A; PCDH15

Additional Supporting Information may be found in the online version of this article.
*Correspondence to: Sylvie Tuffery-Giraud, PhD, Laboratoire de Génétique Moléculaire et INSERM U827, Institut Universitaire de Recherche Clinique (IURC), 641 Avenue du Doyen Giraud, 34093 Montpellier Cedex 5, France. E-mail: [email protected]

Introduction

In clinical molecular genetics, it is crucial to discriminate splicing mutations of pathological relevance from neutral variants. Splicing is mediated by the spliceosome, a macromolecular machinery composed of five small nuclear RNAs (snRNAs) U1, U2, U4, U5, and U6, assembled into small nuclear ribonucleoprotein particles (snRNPs), and numerous associated non-snRNP proteins, that recognizes splicing signals and catalyzes intron removal [Jurica and Moore, 2003; Luhrmann et al., 1990; Pettigrew et al., 2005]. Essential splicing signals in the precursor mRNA include the 5′ splice site (5′ ss) or donor splice-site, the 3′ splice-site (3′ ss) or acceptor splice-site, and the branch point (BP) sequences. The correct recognition of the donor splice-site generally represents the first and most critical step of pre-mRNA splicing. In higher eukaryotes, the consensus sequence for the U2-type GT–AG 5′ ss motif, spanning from positions −3 to +6, corresponds to perfect Watson-Crick basepairing to the U1 snRNA 5′ terminus [Horowitz and Krainer, 1994]. Before the first splicing catalytic step, U1 is replaced by U5 and U6 snRNPs, for which the snRNA binds to the exonic (positions −1 and −2) and intronic (positions +2, +5, and +6) portion of the 5′ ss, respectively [Brow, 2002]. The disease-associated mutations reported in donor splice-sites cluster closely around the exon/intron junction, with 64% of 5′ ss mutations in the Human Gene Mutation Database (HGMD) affecting the obligate GT dinucleotide [Krawczak et al., 2007]. Mutations outside the GT are particularly prevalent at positions −2, −1, and +3 to +6, with 70% of them being located at either exonic position −1 or intronic position +5 [Krawczak et al., 2007]. Predicting the effects of changes in the 5′ ss consensus near intron/exon boundaries is thought to be relatively straightforward. However, single nucleotide polymorphisms (SNPs) have also been reported in splice-sites, and in contrast to pathological mutations, they are almost evenly distributed over all sequence positions of 5′ ss.
Previous bioinformatic studies and experimental data have shown that the different positions of the 5′ ss might have a mutual relationship; i.e., a mispair between U1 snRNA and one position of the 5′ ss can be compensated by a basepair of U1 snRNA to one or more other positions, thus maintaining the number of basepairs above a minimum [Burge and Karlin, 1997; Carmel et al., 2004; Clark and Thanaraj, 2002; Lund and Kjems, 2002; Nelson and Green, 1990; Ohno et al., 1999; Roca et al., 2008]. Determining the clinical significance of sequence variations identified at position +3 of donor splice-sites raises a specific problem. Indeed, several SNPs having alleles A/G at position +3 have been reported [Roca et al., 2008], most probably because both A and G are consensus nucleotides at that position [Burge and Karlin, 1997; Sahashi et al., 2007]. Hence such substitutions are generally classified as variants of unknown/uncertain significance (UVs) in diagnostic laboratories when the splicing outcome cannot be established by transcript analysis, even though computational algorithms can help to predict which variants are deleterious and which are neutral [Hartmann et al., 2008].


In the course of the molecular diagnosis of Usher syndrome, a clinically and genetically heterogeneous recessive disorder that combines hearing loss and retinitis pigmentosa [Saihan et al., 2009], we identified three sequence variations occurring at position +3 of 5′ ss by DNA sequencing in patients. They consist of the c.3811+3A>T [Baux et al., 2007] and c.10585+3A>G in the USH2A gene (MIM# 608400) and the c.157+3A>G in the PCDH15 gene (MIM# 605514), the latter being referenced as a polymorphism in dbSNP (rs41274636:A>G). We first determined the impact on splicing of the three identified substitutions at position +3 in Usher genes using splicing reporter minigenes. Then, we set up different matrices by compiling motifs from native 5′ ss and from 5′ ss carrying a disease-causing mutation at position +3 to further extend studies on the sequence contexts that dictate correct 5′ ss usage. Depending on the presence of a G, C, or T nucleotide at this position, we were able to demonstrate by biostatistical analyses that a single specific position in the 5′ ss contributes to splicing efficiency. These results were experimentally validated through expression of a series of minigenes, and the role of U1 snRNP in splice-site recognition was investigated in different sequence contexts by the use of appropriately modified U1 snRNAs.

Materials and Methods

Compiling of Native and Mutant 5′ ss Motifs

The dataset of native 5′ ss is based on a large collection of 189,249 5′ ss previously reported [Sahashi et al., 2007]. Only U2-dependent 5′ ss carrying an invariant GT dinucleotide at positions +1 and +2 were taken into account. Furthermore, we compiled a dataset of 48 mutant motifs by integrating 5′ ss sequences with disease-associated mutations at position +3 reported either in the literature (for references see Supp. Table S1) or in various databases: the HGMD database (www.hgmd.cf.ac.uk), the Human 5′ ss Database (www.uni-duesseldorf.de/rna), the DBASS5 (www.dbass.org.uk/5) [Buratti et al., 2007], the ATM and the FANCA databases (available at http://chromium.liacs.nl/lovd), and the CFTR database (www.genet.sickkids.on.ca/cftr/app). Only substitutions (A>G, A>C, and A>T) at position +3 with an experimentally verified effect upon splicing were selected for the study. Small deletions or insertions at this position were excluded. Nucleotide sequences of the 48 mutated 5′ ss are detailed in Supp. Table S1. For all cited sequence variations, nucleotide numbering reflects cDNA numbering, with +1 corresponding to the A of the ATG translation initiation codon in the reference sequence, according to the journal guidelines (www.hgvs.org/mutnomen). In particular, the description of the sequence variations selected for functional studies referred to the following mRNA RefSeqs: NM_206933.2 (USH2A), NM_033056.3 (PCDH15), NM_000260.2 (MYO7A), NM_002921.3 (RGR), NM_130837.2 (OPA1), and NM_000070.2 (CAPN3).
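To illustrate the cDNA-based numbering used for the variants in this study, the following minimal sketch splits a simple intronic substitution such as c.3811+3A>T into its components. The function name `parse_intronic_substitution`, the dictionary keys, and the regular expression are illustrative assumptions; they cover only single-nucleotide intronic substitutions and are not a full implementation of the nomenclature guidelines.

```python
import re

# Hypothetical helper: handles only simple intronic substitutions such as
# "c.3811+3A>T"; every other variant type is deliberately rejected.
_PATTERN = re.compile(r"^c\.(\d+)([+-])(\d+)([ACGT])>([ACGT])$")


def parse_intronic_substitution(name):
    """Split a description like 'c.3811+3A>T' into its numeric and base parts."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unsupported variant description: {name!r}")
    anchor, sign, offset, ref, alt = match.groups()
    return {
        "anchor_cdna_pos": int(anchor),       # nearest exonic nucleotide (cDNA numbering)
        "intron_offset": int(sign + offset),  # +3 = third intronic nucleotide after the exon
        "ref": ref,
        "alt": alt,
    }


variant = parse_intronic_substitution("c.3811+3A>T")
```

Here `variant["intron_offset"]` is 3, which is the position +3 of the donor splice-site discussed throughout the article.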

HUMAN MUTATION, Vol. 30, No. 9, 1329–1339, 2009

Construction of Matrices From the Native and Mutant Datasets of 5′ ss

Matrices representing the frequencies for each nucleotide from position −3 to position +6 of the 5′ ss were constructed for the native (from the 189,249 sequences) and mutant (from the 48 mutated sequences) datasets. Then a series of additional matrices was generated to allow comparison of the sequence context between native splice-sites and mutant splice-sites depending on the nucleotide found at position +3 (T, C, or G). For populations of 5′ ss carrying a G at position +3, we differentiated those carrying a G at +5 (+3G+5G) from those carrying an A, C, or T at +5 (+3G+5H). Pictograms of all the constructed matrices were designed using the WebLogo software (available at http://weblogo.berkeley.edu/logo.cgi). The strength of each 5′ ss included in the native and mutant datasets was calculated using two different algorithms. One is based on the information content (Ri) in bits, which is the dot product of a weighting matrix derived from the nucleotide frequencies at each position of a splice-site sequence database and the vector of a particular sequence [Rogan et al., 1998]. Calculated Ri values were obtained through the Excel program developed by Sahashi et al. [2007]. To accommodate dependencies between adjacent and nonadjacent positions of the 5′ ss, the MaxEnt algorithm (http://genes.mit.edu/burgelab/maxent/Xmaxentscan_scoreseq.html) was also used [Staden, 1984].
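The two steps described above, building a per-position frequency matrix and scoring a motif as the dot product of a weight matrix with the sequence, can be sketched as follows. This is a simplified illustration and not the Excel program used in the study: the weight 2 + log2(f) omits the small-sample correction of the published Ri method, and the function names are assumptions.

```python
import math
from collections import Counter

BASES = "ACGT"


def frequency_matrix(motifs):
    """Per-position nucleotide frequencies for a list of equal-length motifs."""
    length = len(motifs[0])
    if any(len(m) != length for m in motifs):
        raise ValueError("all motifs must have the same length")
    matrix = []
    for i in range(length):
        counts = Counter(m[i] for m in motifs)
        matrix.append({b: counts[b] / len(motifs) for b in BASES})
    return matrix


def ri_weights(freqs):
    """Weights in bits, 2 + log2(f); small-sample correction omitted (assumption)."""
    return [{b: 2.0 + math.log2(f) for b, f in col.items() if f > 0} for col in freqs]


def ri_score(motif, weights):
    """Sum the weight of the observed base at each position (the dot product)."""
    if len(motif) != len(weights):
        raise ValueError("motif and weight matrix lengths differ")
    # A base never observed at a position has no weight; score it as -infinity.
    return sum(col.get(base, float("-inf")) for base, col in zip(motif, weights))
```

With this weighting, a position where all four bases are equally frequent contributes 0 bits and a fully conserved position contributes 2 bits, so a 9-nt motif can score at most 18 bits.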

Statistical Analysis

For each constructed matrix, a proportion test was used to test the null hypothesis that the frequency of each nucleotide from position −3 to +2 and +4 to +6 was identical in the native and in the mutant 5′ ss datasets. Because multiple tests were performed, a Bonferroni correction was applied to limit spurious positives. A Shapiro-Wilk test was performed to test normality. In addition, we specifically compared the mean of the 5′ ss scores calculated for native 5′ ss having an A at position +3 (n = 112,635) with that obtained for the wild-type version of the 48 mutant 5′ ss. The difference in the mean of 5′ ss scores between the two sets was assessed using the nonparametric Mann-Whitney rank test.
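As a rough sketch of the testing strategy described above, a two-sample proportion test combined with a Bonferroni-adjusted threshold could look like the following. The pooled z-test shown here is one common form of proportion test; the exact variant used in the study is not stated, so P-values from this sketch should not be expected to reproduce the published ones. Function names are assumptions.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for equality of two proportions x1/n1 and x2/n2."""
    pooled = (x1 + x2) / (n1 + n2)
    if pooled in (0.0, 1.0):
        raise ValueError("test undefined when the pooled proportion is 0 or 1")
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    # Two-sided P-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Hypothetical counts, for illustration only.
z, p = two_proportion_z_test(90, 100, 10, 100)
significant = p < bonferroni_threshold(0.05, 10)
```

The normal approximation is least reliable for small groups, which is worth keeping in mind when one dataset is much smaller than the other.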

Splicing Reporter Constructs

A splicing reporter minigene (pGint vector) was used to study the impact on splicing of 5′ ss variations. This system is based on a Green Fluorescent Protein (GFP) reporter plasmid derived from pEGFP-N1, kindly provided by M.A. Garcia-Blanco [Wagner et al., 2004]. A test exon and its flanking intronic sequences (about 200 bp) are inserted into a heterologous intron (pI-12) that disrupts an Enhanced Green Fluorescent Protein (EGFP) coding sequence. Inclusion of the exon is designed to interrupt the EGFP reading frame (low level of fluorescence), whereas skipping restores the EGFP open reading frame (high level of fluorescence). A series of constructs was generated by introducing exons derived from human genes into the pGint plasmid. Fluorescence was used as a reliable and fast prescreening of the splicing impact of the tested variations (data not shown). For the three sequence variations identified in the laboratory, c.3811+3A>T (USH2A intron 17), c.10585+3A>G (USH2A intron 53), and c.157+3A>G (PCDH15 intron 3), the exons and their flanking intron sequences were amplified from the patients' genomic DNA using a proofreading polymerase. Amplicons were inserted into the pGint vector between the BamHI and SalI restriction sites using the In-Fusion Dry Down PCR Cloning Kit (Clontech, Saint-Germain-en-Laye, France). For the other constructs tested, we first amplified the exons of interest from genomic DNA of normal controls. Naturally occurring and artificial tested substitutions were introduced in the wild-type constructs using an oligonucleotide-directed mutagenesis system (QuikChange II Site-Directed Mutagenesis Kit; Stratagene, La Jolla, CA) according to the manufacturer's instructions. The primer sequences and PCR conditions used for each fragment are available upon request. The sequence fidelity of each minigene construct was verified by direct sequencing.

Cell Culture and Transfection

HeLa cells were cultured in Dulbecco's modified Eagle's medium (DMEM)-F12, supplemented with 5% fetal calf serum, 1% Ultroser (Pall Corporation, Saint-Germain-en-Laye, France), 2 mmol/L L-glutamine, 1% nonessential amino acids, and 100 U/mL penicillin, 0.1 mg/mL streptomycin, and 0.25 µg/mL amphotericin B, at 37°C in a humidified atmosphere of 95% air and 5% CO2 in 100 × 20 mm tissue culture dishes. Cells at about 80% confluence were transiently transfected in six-well plates (transcript analysis) or 24-well plates (fluorescence analysis) with the FuGENE6 Transfection Reagent (Roche Diagnostics, Indianapolis, IN) and with 1 µg or 200 ng of minigene constructs, respectively, according to the manufacturer's instructions. Cells in 24-well plates and six-well plates were harvested after 24 hr and 48 hr, respectively. All transfection experiments were performed in triplicate and in at least two independent transfection assays.

Transcripts Analysis

The impact on splicing of the tested sequence variations was evaluated first by fluorescence microscopy, and subsequently verified by transcript analysis, which permits the accurate determination of the splicing outcome of the studied sequences (exon skipping event, utilization of de novo 5′ ss or cryptic 5′ ss). To this aim, RT-PCRs were performed on RNA extracted from cells transfected in six-well plates by using the NucleoSpin RNA II Kit (Macherey-Nagel, Hoerdt, France) according to the manufacturer's instructions. cDNA synthesis was carried out using 1 µg of total RNA, random primers (Invitrogen, Cergy-Pontoise, France), and the Superscript II Reverse Transcriptase (Invitrogen). One-tenth of the synthesized cDNA was used as a template for RT-PCR amplification with vector-specific primers surrounding the cloning site: 5′-CATCCTGGTCGAGCTGGACG-3′ as forward primer and 5′-GTAGGTCAGGGTGGTCACGA-3′ as reverse primer. Amplification was performed for 30 cycles, consisting of 30 sec at 94°C, 30 sec at 58°C, and 30 sec at 72°C. The products were resolved on a 1.5% agarose gel and confirmed by sequencing. The proportion of exon-skipping or misspliced transcripts was measured using the Quantity One (v. 4.6.5) software (Bio-Rad, Marnes-La-Coquette, France).

U1 snRNA Constructs

The parental U1 snRNA clone pG3U1 (U1-WT), a derivative of pHU1 [Lund and Dahlberg, 1984], was kindly provided by Prof. F. Baralle and Dr. E. Buratti (Molecular Pathology Laboratory, International Centre for Genetic Engineering and Biotechnology [ICGEB], Trieste, Italy). The sequence variations (U1+4A, U1−1A, and U1−1T) of the 5′ end of the parental U1 snRNA were introduced by PCR-based mutagenesis (QuikChange II Site-Directed Mutagenesis Kit) and all constructs were verified by sequencing. HeLa cells were transfected in six-well plates with 1 µg of each minigene plasmid and 2 µg of the control empty vector or U1 snRNA construct. RNA was extracted after 48 hr and transcript analysis was performed as described above.

Results

We identified three sequence variations occurring at position +3 of 5′ ss in the USH2A and PCDH15 genes of patients with Usher syndrome, an autosomal recessive disorder. One is an A>T substitution in USH2A intron 17 (c.3811+3A>T), while the two others are A>G substitutions: c.10585+3A>G in USH2A intron 53 and c.157+3A>G in PCDH15 intron 3. Both the c.3811+3A>T and the c.10585+3A>G have been found in trans of a deleterious mutation, and were highly suspected to be pathogenic. In contrast, the c.157+3A>G in PCDH15 has been identified in cis of a nonsense mutation in an Usher patient, suggesting that it might be a neutral variant. Moreover, this sequence variation is referenced in dbSNP (rs41274636:A>G), but no data on frequency, segregation analysis, heterozygosity, or validation status are available. Because specific transcript analysis is not achievable in Usher patients, we determined the impact on splicing of these sequence variations by functional splicing assays.

Consequences on Splicing of the Three Identified Sequence Variations in Usher Genes

We inserted the wild-type and mutant version of each exon in a reporter splicing minigene, and analyzed the splicing pattern by RT-PCR after transient transfections in HeLa cells. The rs41274636:A>G in PCDH15 was found to preserve normal splicing (Fig. 1A). As with the wild-type construct (+3A), only the full-length transcript corresponding to the inclusion of exon 3 was detected in the presence of the +3G, as confirmed by direct sequencing of the RT-PCR products, indicating that this substitution could definitively be considered a nonpathogenic variant. In contrast, the two different base substitutions identified in the USH2A gene were found to alter splicing. The c.3811+3A>T in intron 17 led to complete exon skipping compared to the wild-type construct (Fig. 1B). The c.10585+3A>G in intron 53 (Ri (mutant) = 2.5 bits, MaxEnt (mutant) = −3.99) induced the activation of a stronger exonic cryptic donor splice-site (Ri (cryptic) = 5.6 bits, MaxEnt (cryptic) = 6.05), leading to the out-of-frame deletion of the last 82 nucleotides of exon 53 in all the transcripts (Fig. 1C and D). No trace of normal or skipped transcript was detected for the mutant construct. For both the exon 53 and the exon 17 wild-type constructs, small amounts of cryptic splice-site utilization and exon skipping, respectively, were noticed, which may be attributable to the heterologous expression system used. The results of the minigene studies illustrate the variability of the effect on splicing of base substitutions at position +3 in 5′ ss of human disease genes, in particular of A>G substitutions, and support a role of the sequence context in the splicing outcome of such sequence variations.

Figure 1. Ex vivo analysis of the three identified variants in Usher genes affecting position +3 of 5′ ss. Gel electrophoresis of RT-PCR products obtained from the splicing reporter minigene assay for (A) c.157+3A>G in PCDH15 intron 3, (B) c.3811+3A>T, and (C) c.10585+3A>G in intron 17 and intron 53 of USH2A, respectively. Splicing patterns obtained are specified on the right-hand side of each agarose gel. Gray boxes represent EGFP exons while white boxes represent tested exons. WT, wild-type version of the tested exon. D: Schematic diagram representing the resulting aberrant splicing of USH2A intron 53 carrying the c.10585+3A>G. Partial sequence of USH2A exon 53 and scores estimated by the information content (Ri) and the MaxEnt software are displayed. The +3A>G mutation decreases the values of the authentic 5′ ss (Ri: from 3.3 to 2.5 bits; MaxEnt: from 2.77 to −3.99), and activates a stronger exonic cryptic 5′ ss (underlined GT; Ri: 5.6 and MaxEnt: 6.05).

Compiling a Comprehensive Native and Mutant 5′ ss Motifs Collection

In order to determine why a discordant G at position +3 of the 5′ ss consensus that is not matched to the 5′ end of U1 snRNA has a consequence on splicing in some contexts (mutations) and not in others (neutral variant or native 5′ ss), we further explored the role of the nucleotidic environment in 5′ ss selection. To this aim, we set up two datasets for native and mutant 5′ ss. The dataset of native 5′ ss was based on the 189,249 human native 5′ ss previously reported [Sahashi et al., 2007]. Analysis of the nucleotide frequencies at each position revealed a consensus sequence, CAG|GTAAGT, fully complementary to the U1 snRNA sequence (Fig. 2). Position +3 is the sole position of the 9-bp motif to exhibit a high level of conservation of two nucleotides, A (59.5%) and G (34.6%). In our dataset of mutant motifs, we focused on mutations that affect the A nucleotide at position +3 and have an experimentally verified effect upon splicing. We selected 48 mutations consisting of 27 A>G substitutions (56.3%), 11 A>T substitutions (22.9%), and 10 A>C substitutions (20.8%) (Supp. Table S1). We also extracted from the literature two +3A>G variations reported as unlikely pathogenic variants occurring in intron 6 of RGR (MIM# 600342) [Morimura et al., 1999] and intron 7 of CAPN3 (MIM# 114240) [Krahn et al., 2006], respectively, in addition to the one we had identified in intron 3 of PCDH15. It is assumed that sequence variations at position +3 that cause aberrant splicing occur in the context of weak 5′ ss compared to +3 substitutions that preserve normal splicing [Madsen et al., 2006]. In accordance with this statement, a statistical difference was observed when we compared the mean of the distribution of Ri values for the 48 mutant 5′ ss in their wild-type version (+3A) (Ri,mean = 7.0 bits) with that of the 112,635 native human 5′ ss carrying an A at position +3 (Ri,mean = 8.1 bits) (Mann-Whitney rank test; P = 6.48 × 10⁻⁴). Consistent results were obtained with the MaxEnt algorithm (7.85 vs. 8.61 for the mutant 5′ ss and native human 5′ ss, respectively, P = 0.01125). Hence differences should exist in the sequence context of the 48 mutant 5′ ss that make them prone to sequence variations at position +3 even though they could still be considered good splice-sites. Indeed, the calculated mean Ri value of 7.0 bits is above the average information content for donor splice-sites required for splicing (Rsequence = 6.73 bits) [Rogan et al., 2003]. In an attempt to address this question, we sought to set up specifically designed matrices.

Figure 2. Pictorial representation of nucleotide frequencies at U2-type GT-AG 5′ ss; 189,249 native 5′ ss sequence motifs [Sahashi et al., 2007] were used to build this matrix. The region of the 5′ end of U1 snRNA that basepairs to the different positions of 5′ ss is indicated between matrices, pseudouridines (Ψ) corresponding to the two consecutive posttranscriptionally modified uridines (Us) of U1 snRNA. In this and the following pictorial representations, P-values obtained by proportion tests are indicated under pictograms (threshold: P < 0.00122); an arrow indicates positions that significantly differ between the native and mutant sets of 5′ ss. [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.]
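The comparison of 5′ ss score distributions between the mutant and native sets relies on the Mann-Whitney rank test. A minimal sketch of that test is given below, assuming a normal approximation without tie or continuity correction; it is meant to show the ranking logic only, and a statistics package would normally be preferred for real data. The function name and the example scores are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Return (U statistic for sample x, two-sided P-value by normal approximation)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + j) / 2 + 1  # average 1-based rank shared by tied values
        for k in range(i, j + 1):
            ranks[k] = mid_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction (assumption)
    z = (u_x - mean_u) / sd_u
    return u_x, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical splice-site scores, for illustration only.
u, p = mann_whitney_u([6.1, 6.8, 7.2], [7.9, 8.3, 8.8])
```

A U value of 0 for the first sample means that every score in it ranks below every score in the second sample.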


Construction and Comparison of Matrices for Native and Mutant 5′ ss

Global matrices representing the nucleotide frequencies at each position of the 5′ ss (−3 to +6) were constructed for the set of native human 5′ ss (n = 189,249) and for the set of 5′ ss having a +3 mutation (n = 48) (Fig. 2). Comparison of these matrices revealed that the distribution of nucleotides was significantly different at the +4 and +5 positions (P < 2.2 × 10⁻¹⁶ and P = 9.9 × 10⁻¹⁰, respectively). Indeed, the +4A and +5G nucleotides, which perfectly match the U1 snRNA at these positions, were found to be more represented in the native context than in the mutant one (69.4% vs. 41.7% and 77.6% vs. 52.1%, respectively). Even though these global matrices revealed a general tendency for differences in nucleotide distribution between native and mutant 5′ ss, they may not be suited to display more subtle differences that would depend on the nucleotide found at position +3 (+3G, +3C, or +3T). We then performed additional comparative studies between the native and mutant datasets taking this parameter into account. We first compared the nucleotidic environment of 5′ ss naturally carrying a +3G and 5′ ss for which a +3G mismatch induces aberrant splicing (Nat+3G vs. Mut+3G) (Fig. 3A). A total of 65,527 5′ ss naturally carrying a G at +3 and 27 +3A>G mutations were extracted from our data. Proportion tests revealed that the nucleotidic distributions at positions +4 and +5 were still significantly different in that context (P < 2.2 × 10⁻¹⁶ in both cases). Actually, when a native 5′ ss carries a +3G, this discordance from the consensus sequence seems to be compensated by an accordant G at +5 in 92% of the cases, while an accordant G is observed in only 33.3% of mutated sites (Fig. 3A). Because of the interdependence of positions +3 and +5 previously described [Burge and Karlin, 1997] and confirmed in this study, 5′ ss carrying a +3G and a +5G and those carrying a +3G and a different nucleotide at +5 (+5H) were studied separately.
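The stratified comparison described above amounts to selecting the 5′ ss that carry a given nucleotide at one position and asking how often they also carry a given nucleotide at another. A minimal sketch follows, assuming each motif is a 9-nt string spanning positions −3 to +6; the function name and the four example motifs are hypothetical and are not taken from the datasets of the study.

```python
# Index of each 5' ss position within a 9-nt motif spanning -3 to +6
# (there is no position 0 in splice-site numbering).
POSITIONS = [-3, -2, -1, 1, 2, 3, 4, 5, 6]
INDEX = {pos: i for i, pos in enumerate(POSITIONS)}


def fraction_with(motifs, given, target):
    """Among motifs matching every (position, base) pair in `given`,
    return the fraction that also carry `target` = (position, base)."""
    subset = [m for m in motifs
              if all(m[INDEX[pos]] == base for pos, base in given)]
    if not subset:
        return None  # no motif satisfies the conditioning context
    t_pos, t_base = target
    return sum(m[INDEX[t_pos]] == t_base for m in subset) / len(subset)


# Hypothetical motifs, for illustration only.
motifs = ["CAGGTGAGT", "AAGGTGAGT", "CAGGTGTAT", "CAGGTAAGT"]
plus5g_given_plus3g = fraction_with(motifs, [(3, "G")], (5, "G"))
```

In this toy set three motifs carry +3G and two of them also carry +5G, so the returned fraction is 2/3. Passing two pairs in `given`, for example `[(3, "G"), (5, "G")]`, restricts the subset to a +3G+5G context in the same way.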

Compensation of 13A4G Sequence Variation in a 15G Context and a 15H Context Comparison of nucleotidic environments of the set of native 50 ss carrying 13G and 15G (Nat13G15G) and the set of 50 ss for

which a 13G and a 15G is deleterious (Mut13G15G) disclosed the importance of position14 in that context (Po2.2 " 10!16) (Fig. 3B). We found that a discordant nucleotide at position 13 is compensated by an accordant A at position 14 in 82.6% of cases in a native context. In return, we noticed that none of the 50 ss in our mutant dataset having a G at position 15 (n 5 9) has an A at position 14. This hypothesis was further confirmed by examining the sequences of 50 ss with an associated 13A4G polymorphism in the PCDH15, RGR, and CAPN3 genes. All three carry definitely an accordant A at position 14. To provide clues to this hypothesis, we constructed reporter minigenes and first confirmed the absence of effect on splicing of the two reported 13A4G polymorphisms in intron 7 of CAPN3 (c.102913A4G) and intron 6 of RGR (c.75613A4G). As illustrated by the RT-PCR, a normal pattern of splicing was obtained in presence of a 13G as was observed for the13A construct for the two tested exons (Fig. 3C). We next introduced a discordant T at 50 ss position 14 for the three exons. As expected from the bioinformatic analysis,

Figure 3. Compensation of 13A4G sequence variation in a 15G and 15H context. A: Comparison of the nucleotidic environment of naturally 13G 50 ss (Nat13G) and 50 ss for which 13A4G (Mut 13G) induces aberrant splicing. B: Comparison of the nucleotidic environment of naturally 13G 50 ss having a 15G (Nat 13G15G) and 50 ss for which 13A4G in a 15G context (Mut 13G15G) induces aberrant splicing. C: Splicing patterns of minigene constructs in a wild-type (WT) context or carrying different deleterious and/or compensatory mutations as indicated on the right-hand of agarose gels. For example, 13G14T indicates that the position 13 of the 50 ss was substituted in G (mutant context) while the position 14 was modified in T. Numbers at the left-hand of agarose gels indicate the proportion (%) of exon-skipping (white arrow) compared to the expected full-length transcript (dark arrow). D: Comparison of nucleotidic environment of naturally 13G 50 ss carrying an A, C, or T at position15 (Nat 13G15H) and 50 ss for which 13A4G in a15H context (Mut 13G15H) induces aberrant splicing. E: Splicing patterns of the WT or mutated (13G and 13G14A) USH2A exon 53 minigene. The introduction of a 13A4G mutation shifts the use of the donor splice-site from the authentic one (dark arrow) toward an exonic cryptic splice-site (gray arrow) in all the transcripts. The introduction of a compensatory 14A (13G14A) reduced the proportion of misspliced transcripts (% given at the left-hand of agarose gels). [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.] HUMAN MUTATION, Vol. 30, No. 9, 1329–1339, 2009

1333

this sequence modification induced a partial (PCDH15 exon 3, 34% skipping) to total (RGR exon 6 and CAPN3 exon 7) abolition of splicing (Fig. 3C). A previous study suggested that an accordant T at position16 is essential to correct splicing in a 13G context [Ohno et al., 1999]. Even though we did not find any significant difference of nucleotide distribution at position 16 between the sets of native and mutant 50 ss no matter what the nucleotide at position 15 is, we introduced a discordant A at position 16 in the minigene PCDH15 exon 3, which exhibits an accordant T at 16. This base substitution led to only a limited extent of missplicing, whereas splicing was completely abolished when both a discordant 14T and 16A were introduced (Fig. 3C). In the absence of 15G, a mismatch with U1 snRNA at position 13 becomes critical to donor site selection [Burge and Karlin, 1997]. However, 2.7% of native human 50 ss carry a C, T, or A at that position. Comparison of this set of 50 ss to the set of 15H (C, T, or A) mutant 50 ss showed that native 50 ss tend to compensate for their mismatch at 13 and 15 by an accordant 14A in 58% of cases vs. only 17% in the mutant set (Po2.2 " 10!16) (Fig. 3D). No significant difference in nucleotide distribution was seen in other positions. To verify this hypothesis, an A was introduced at position 14 of the 50 ss of the USH2A exon 53, which is naturally 14T and for which we have demonstrated the deleterious effect of the 13A4G mutation (activation of the exonic cryptic splicesite). As expected from the statistical analysis, the introduction of an A at position 14 allowed partial restoration of normal splicing,

Figure 4.

confirming the importance of the position 14 in a 13G15H context (Fig. 3E).

Compensation of +3A>C and +3A>T Sequence Variations

Finally, this study was extended to the analysis of the subsets of +3A>T and +3A>C deviations at 5′ ss from the consensus sequence. Comparison of the native and mutant sets of 5′ ss sequences having a C at position +3 revealed that the distribution of nucleotides was significantly different for position −1 (P = 2.24 × 10⁻¹⁰) (Fig. 4A), whereas the importance of this position had not been evident in the global matrices used earlier (Fig. 2). In fact, when native 5′ ss carry a +3C, they also carry an accordant −1G in 92.4% of the cases vs. only 41.7% for the 5′ ss with a reported +3C mutation. Functional validation was performed by introducing a discordant T nucleotide at position −1 of a 5′ ss naturally carrying a +3C (exon 27 of MYO7A; MIM# 276903). This experiment revealed a strong inhibition of splicing, leading to the skipping of exon 27 in the vast majority (76%) of the transcripts (Fig. 4B). Conversely, a compensatory mutation at position −1 of the 5′ ss restored normal splicing of exon 10 of the OPA1 gene (MIM# 605290), which was impaired by an A>C mutation at position +3 (Fig. 4B). Similar studies were performed for +3T deviations, which revealed that an accordant G at −1 was also crucial for correct selection of +3T 5′ ss (P = 1.13 × 10⁻⁷) (Fig. 4C). In accordance with this prediction, the introduction of a discordant −1A in the native −1G+3T 5′ ss of USH2A exon 46 induced a complete abolition of normal splicing of this exon, leading to exon skipping and use of different exonic cryptic splice sites (Fig. 4D).

Figure 4. Compensation of +3A>C and +3A>T sequence variations. A: Comparison of the nucleotidic environment of naturally +3C 5′ ss and 5′ ss for which +3A>C substitutions induce aberrant splicing. B: RT-PCR patterns of minigenes carrying a C at position +3. The importance of position −1 for splicing of these minigenes was assessed by introducing a discordant nucleotide at position −1 (MYO7A exon 27) or a −1G compensatory change (OPA1 exon 10). The proportion (%) of exon skipping (white arrow) compared to normal splicing (dark arrow) is given at the left-hand side of the agarose gels. C: Comparison of the nucleotidic environment of naturally +3T 5′ ss and 5′ ss for which a +3A>T substitution induces aberrant splicing. D: RT-PCR patterns of the USH2A exon 46 minigene carrying a T at position +3. The introduction of a discordant A at position −1 leads to aberrant splicing (% given at the left-hand side of the agarose gel): exon skipping (white arrow) and activation of an exonic cryptic splice-site (gray arrow). [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.]
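The comparison described above rests on tallying which nucleotide occupies position −1 among 5′ ss that share a given nucleotide at position +3. The following Python sketch illustrates that tally only; it is not the authors' pipeline, the function and variable names are hypothetical, and the toy sequences are invented for illustration (they are not the study's datasets). Each site is assumed to be a 9-nt string covering positions −3 to +6.

```python
from collections import Counter

# Positions covered by a 9-nt 5' splice site string: -3..-1, then +1..+6
# (there is no position 0 in splice-site numbering).
POSITIONS = [-3, -2, -1, 1, 2, 3, 4, 5, 6]


def index_of(position):
    """Map a splice-site position (e.g. -1 or 3) to a string index."""
    return POSITIONS.index(position)


def conditional_frequencies(sites, given_pos, given_nt, query_pos):
    """Frequency of each nucleotide at query_pos among the sites that
    carry given_nt at given_pos. Returns {} if no site qualifies."""
    subset = [s.upper() for s in sites if s.upper()[index_of(given_pos)] == given_nt]
    if not subset:
        return {}
    counts = Counter(s[index_of(query_pos)] for s in subset)
    return {nt: counts.get(nt, 0) / len(subset) for nt in "ACGT"}


# Hypothetical toy sites (illustration only, not data from the study).
toy_native = ["CAGGTCAGT", "AAGGTCAGT", "CAGGTCTGT", "CATGTCAGT", "CAGGTAAGT"]
freqs = conditional_frequencies(toy_native, 3, "C", -1)
print(freqs)
```

Running the same function on a native set and on a mutant set would give the two distributions to be compared; the statistical test applied to them is left out here because the passage above does not name it.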

Do Compensatory Mutations at Positions −1 and +4 of 5′ ss Reinforce U1 snRNA Binding?

Substitutions at position +3 of donor splice-sites are expected to specifically impair their recognition by the U1 snRNA and, indeed, we could demonstrate that aberrant splicing due to an A>C substitution at position +3 of OPA1 exon 10 could be totally corrected by coexpression of a modified U1 snRNA (+3C/U1+3G) (Fig. 5A). The exon inclusion level obtained (91%) was nearly the same as that achieved with the wild-type construct tested in the previous experiment (87%; Fig. 4B). However, we did not succeed in restoring normal splicing in the mutated +3G USH2A exon 53 minigene using a modified U1+3C but instead reinforced the use of the exonic cryptic splice-site previously identified (data not shown). We hypothesized that binding of U1+3C at the mutated authentic donor splice-site of exon 53 directly competes with binding at the stronger cryptic splice-site activated by the mutation, which also carries a G at position +3 (Fig. 1D), thus explaining the exclusive use of the cryptic 5′ ss.

Next, we sought to establish whether the suppression of aberrant splicing that we obtained by compensatory changes introduced in the minigenes at positions +4 and −1 was attributable to enhanced U1 snRNA binding at suboptimal donor splice-sites. To this end, we performed complementation experiments with specifically engineered mutant U1 snRNAs. We first investigated the ability of U1 snRNAs capable of recognizing a −1A or −1T position to restore normal splicing of the OPA1 exon 10, MYO7A exon 27, and USH2A exon 46 minigenes having a C or a T at position +3. While the minigenes carrying a compensatory change at position −1 sustained normal splicing (Fig. 4B), the cotransfection of a modified U1−1A with the mutated (−1T+3C) minigenes was not able to reverse exon skipping (OPA1 exon 10) (Fig. 5A), or only to a limited extent (9%, MYO7A exon 27) (Fig. 5B). In the case of USH2A exon 46, the modified U1−1T construct slightly reinforced the use of cryptic splice sites (by 17%) over exon skipping; no normally processed transcript was detected (Fig. 5C). In contrast, compensatory changes between U1 snRNA and 5′ ss at position +4 (U1+4A) ameliorated aberrant splicing of the RGR exon 6, PCDH15 exon 3, and CAPN3 exon 7 minigenes carrying +3G+4T. We observed that 11% to 23% of the misspliced transcripts were rescued (Fig. 5D–F). In all U1 snRNA complementation experiments, overexpression of the wild-type U1 (U1-WT) or empty vector did not alter the splicing pattern.
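The modified U1 snRNAs used in these complementation experiments are named after the position they target and the base introduced to pair with the nucleotide found in the 5′ ss (for instance, a U1 carrying A opposite a −1T). A minimal sketch of that naming logic follows; the function name is hypothetical, Watson-Crick complementarity is assumed, and bases are written in the DNA alphabet as in the construct names used in the text, although U1 is an RNA.

```python
# Watson-Crick complements, written in the DNA alphabet to mirror the
# construct names used in the text.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def compensatory_u1_name(position, site_nt):
    """Return the name of a U1 variant whose base facing `position`
    of the 5' splice site is complementary to `site_nt`."""
    if position == 0:
        raise ValueError("splice-site numbering has no position 0")
    base = COMPLEMENT[site_nt.upper()]
    # The '+d' format spec prints an explicit sign: +3, -1, ...
    return f"U1{position:+d}{base}"


for pos, nt in [(3, "C"), (-1, "T"), (-1, "A"), (4, "T")]:
    print(pos, nt, "->", compensatory_u1_name(pos, nt))
```

The four examples correspond to the site contexts tested above (+3C, −1T, −1A, and +4T); the sketch only derives the complementary base and says nothing about whether a given construct rescues splicing.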

Figure 5. Correction of aberrant splicing by modified U1 snRNAs. Minigene variants inducing splicing defects were transfected alone or cotransfected with 2 µg of the indicated mutant U1 snRNAs. For example, U1−1A indicates that the nucleotide corresponding to position −1 is substituted with A to match the T of the studied 5′ ss. WT, wild-type. Numbers at the left-hand side of the agarose gels indicate the proportion (%) of exon skipping (white arrow) compared to the full-length transcript (black arrow), except in the case of USH2A exon 46, where numbers indicate the proportion of transcripts involving the use of cryptic splice sites (gray arrow) compared to exon skipping (white arrow). A, B: Note that U1+3G rescued aberrant splicing in the OPA1 exon 10 minigene, whereas U1−1A had no effect (A) or only a limited effect in the MYO7A exon 27 minigene (B). C: The use of U1−1T slightly reinforced the use of cryptic splice-sites. D–F: Cotransfection of U1+4A ameliorated aberrant splicing in PCDH15 exon 3, CAPN3 exon 7, and RGR exon 6.

Discussion

The consensus sequence of the 5′ ss (position −3 to +6) reflects the Watson-Crick basepairing between the 5′ ss and the 5′ terminus of the U1 snRNA that is involved in the early steps of splicing [Alvarez and Wise, 2001; Wassarman and Steitz, 1992; Zhuang and Weiner, 1986]. However, not all positions contribute equally to this basepairing. Among the nine positions of the 5′ ss motif, positions −1 and +5 appear to be the most prominent in the basepairing with U1 snRNA, excluding the invariant +1 and +2 positions [Carmel et al., 2004]. The higher frequency of disease-associated splicing mutations at these noncanonical positions in the Human Gene Mutation Database [Stenson et al., 2003] supports this assumption and most probably reflects a stronger selective pressure upon their maintenance [Krawczak et al., 2007]. This may be related to the fact that these positions also pair with other splicing factors (position −1 with U5 snRNA and position +5 with U6 snRNA) [Sawa and Shimura, 1992; Sontheimer and Steitz, 1993]. The detrimental effect on splicing of base substitutions at other, less conserved positions in 5′ ss, i.e., positions −3, −2, +3, +4, and +6, is more questionable. In particular, interpretation of sequence variations at position +3 is problematic in clinical genetics, since both nucleotides A and G show a high level of conservation, and a subset of these variations is not deleterious. Through the functional splicing analysis of variations identified in Usher patients, we further illustrate that such substitutions can definitely be pathogenic in some cases (c.10585+3A>G in USH2A) and not in others (c.157+3A>G in PCDH15).

In this study, we have sought to establish how splicing mutations of pathological relevance differ from splice-site neutral variants at position +3 of 5′ ss by analyzing sequence context. Biostatistical analyses and experimental data based on a mutational approach of reporter minigenes have highlighted that a single specific position could compensate for a deviation at +3 (G, C, or T) in native 5′ ss. Interestingly, the compensatory position differs depending on the nucleotide found at position +3, establishing the following pairs: +3C/−1G, +3T/−1G, and +3G/+4A. This may indicate that compensation of impaired binding of U1 snRNA with position +3 relies on different mechanisms involving position +4 or position −1 and potentially distinct splicing factors depending on the mismatch at position +3. Interdependence of positions +3 and +5 has been previously reported [Burge and Karlin, 1997], and indeed we observed a preference for G at position +5 (92%; Fig. 3A) in the subset of native 5′ ss having a +3G compared to 77.6% in the global matrix (Fig. 2), while only 33% of the +3G mutant 5′ ss were found to have a +5G (Fig. 3A).
However, the comparison of the +3G/+5G and +3G/+5H matrices revealed that whatever the nucleotidic context at +5 (+5G or +5H), an A at position +4 is sufficient to compensate for a +3G deviation, pinpointing the importance of this particular position in that context. Contrary to what was previously reported [Ohno et al., 1999], we found no evidence, either by biostatistical analyses or functional assays, that position +6 is essential in the compensation for a +3G. In addition, we could establish that a G nucleotide at position −1 is a major determinant of the normal splicing process of native +3C/T 5′ ss. This is consistent with the results of the recently published association matrix for Homo sapiens that suggests a −1G+3C/T pairwise association in human native 5′ ss and a depletion of human native 5′ ss that carry an A, T, or C at −1 when position +3 is C or T [Roca et al., 2008]. However, we did not observe a markedly enhanced frequency of the +3C+4T pair for the native 5′ ss (17.4% vs. 11.1% in the global matrix) (Fig. 4A), as reported in the previous association matrix and thought to reflect a stronger basepairing to the U6 ACAGAG box [Roca et al., 2008]. Complementarity to U1 snRNA plays a pivotal role in 5′ ss selection [Seraphin et al., 1988; Siliciano and Guthrie, 1988; Zhuang and Weiner, 1986], even though cases of U1-independent splicing through a U6-dependent mechanism have been documented [Crispino et al., 1994; Raponi and Baralle, 2008]. It is therefore reasonable to hypothesize that the compensatory changes observed in native 5′ ss serve to maintain the number of basepairs between U1 snRNA and 5′ ss above the minimum. Up to 97.5% of native 5′ ss have five or more matches between 5′ ss and the 5′ end arm of the U1 snRNA, with a higher frequency for six and seven consecutive basepairing matches (cumulative frequency of 70%) (Fig. 6). 
In contrast, most 5′ ss mutants at position +3 have five or fewer matches (66.7%), the majority of them (50%) exhibiting five matches, which greatly differs from native 5′ ss (16%). Because the number and also the position of accordant nucleotides are expected to determine normal splicing [Ohno et al., 1999], we next considered the number of consecutive matches for both native and mutant 5′ ss. We observed that 62.6% of the human native 5′ ss have five or more consecutive matches with U1 snRNA, with a higher frequency of five and six consecutive basepairing matches (46%). Conversely, none of the +3 mutants exhibited more than five consecutive matches, and the majority of them (70.8%) have only three or four consecutive matches. These observations reinforce the assumption that the number of consecutive accordant nucleotides is a crucial determinant for 5′ ss selection [Ketterling et al., 1999]. An extended U1 snRNA/5′ ss interaction increases 5′ ss recognition and hence exon inclusion [Freund et al., 2005], even though fewer than seven potential Watson-Crick basepairs to U1 snRNA are thought to be preferred [Carmel et al., 2004]. Indeed, hyperstabilization of the spliceosome may inhibit the splicing process [Konarska and Query, 2005; Lund and Kjems, 2002].

The U1 snRNA complementation experiments provided some clues about this question. We could clearly demonstrate that U1+3G was able to suppress a mutation at donor site position +3 and to redirect the selection of the correct 5′ ss in the mutant +3C OPA1 exon 10 minigene. Moreover, experiments with U1+4A indicate that the correct binding of U1 snRNA at position +4 in a +3G context plays a role in 5′ ss selection. The enhanced frequency of the +3G+4A pair in native 5′ ss can be structurally explained by the fact that a consensus A at +4 might stabilize the weak basepair between the +3G and the pseudouridine (Ψ) nucleotide in the 5′ arm of the U1 snRNA, a stabilization that is not essential in the case of an A-Ψ basepair. The role of position −1 in the interaction of 5′ ss with U1 snRNA is less clear. Recent data have established the importance of U1 snRNA basepairing to the exonic portion of the 5′ ss in the context of +5 and +6 mutations [Carmel et al., 2004]. Position −1 is involved in a G:C basepairing that forms three hydrogen bonds and contributes to the stability of the U1:5′ ss duplex. Also, it is inferred that G at positions −1 and +1 generates a strong stacking effect between two adjacent purines [Fini and Slaugenhaupt, 2002; Sontheimer and Steitz, 1993]. In our study, we demonstrate for the first time that this position is sufficient to compensate for +3C and +3T mismatches in minigenes (Fig. 4). However, the cotransfection of a modified U1 snRNA had poor effects on the exon skipping level (OPA1 exon 10 and MYO7A exon 27) or instead reinforced the use of cryptic splice-sites (USH2A exon 46). Unlike position +4, which is predicted to interact only with the U1 snRNA, position −1 is also engaged in binding with U5 snRNP [Cortes et al., 1993; Newman and Norman, 1992] and other splicing factors, such as U1(C) [Rossi et al., 1996] or U1(Sm) [Zhang et al., 2001]. Hence we cannot rule out that splicing factors other than the U1 snRNP alone contribute to the splicing process in the three tested minigenes, and may interfere with the rescue of splicing by the modified U1 snRNA.

Table 1. General Rules for the 5′ ss Selection in the Presence of a G, C, or T Base Substitution at Position +3

5′ ss position(a)                  Probability of
−1      +3      +4      +5         splicing defect
NC      A>G     A       G          −
NC      A>G     A       ≠G         +
NC      A>G     ≠A      G          ++
NC      A>G     ≠A      ≠G         +++
G       A>T     NC      NC         +
≠G      A>T     NC      NC         +++
G       A>C     NC      NC         +
≠G      A>C     NC      NC         +++

(a) Nucleotide substitution at position +3 and sequence context at positions −1, +4, and +5. The (++), (+), and (−) symbols reflect a decreasing scale in the probability of aberrant splicing. The (+++) symbol indicates a high probability of a splicing defect. NC, not contributive.

Figure 6. Comparison of basepairing of native and mutant 5′ ss with U1 snRNA. A: Total number of basepairs across the 5′ ss (from position −3 to +6) for native 5′ ss and +3 mutant 5′ ss. B: Number of consecutive basepairs for native 5′ ss and +3 mutant 5′ ss.
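The two quantities compared between native and mutant 5′ ss, the total number of matches with U1 snRNA and the longest run of consecutive matches, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' procedure: a site is a 9-nt DNA string covering positions −3 to +6, a match is identity with the consensus CAGGTAAGT (taken here as the sequence fully complementary to the U1 5′ end), and non-Watson-Crick pairs such as G opposite pseudouridine are counted as mismatches. The function names are hypothetical.

```python
# Consensus 5' splice site, positions -3..+6, used as a stand-in for
# perfect Watson-Crick complementarity to the 5' end of U1 snRNA.
CONSENSUS = "CAGGTAAGT"


def total_matches(site):
    """Number of positions at which the site equals the consensus."""
    return sum(a == b for a, b in zip(site.upper(), CONSENSUS))


def longest_consecutive_matches(site):
    """Length of the longest uninterrupted run of matching positions."""
    best = run = 0
    for a, b in zip(site.upper(), CONSENSUS):
        run = run + 1 if a == b else 0  # a mismatch resets the run
        best = max(best, run)
    return best


# A +3 substitution in an otherwise consensus site splits the duplex
# into a run of five (-3..+2) and a run of three (+4..+6).
print(total_matches("CAGGTCAGT"), longest_consecutive_matches("CAGGTCAGT"))
```

Under this counting scheme, any site with a mismatch at position +3 can reach at most five consecutive matches, which agrees with the observation that no +3 mutant exceeded five.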
In conclusion, our study confirms that interdependencies between bases need to be considered in the interpretation of sequence variations in splice-sites. We provide clues crucial to predicting the impact on splicing of base substitutions affecting position +3 of donor splice-sites. Based on our data, we propose general rules that may be helpful for diagnostic purposes (Table 1) and may also help refine prediction algorithms that accommodate nucleotide dependencies. First, +3A>G substitutions are highly expected to be deleterious when position +4 differs from an A, and 24 out of the 27 (88.9%) mutations in Supp. Table S1 actually respect this rule. This is true whatever the nucleotide found at position +5, but an A, C, or T at position +5 should probably be considered an aggravating factor. Second, the deleterious effect on splicing of a +3A>C or +3A>T mutation is very likely, since only 6% of the human native 5′ ss carry a C or T at position +3, unless a G is present at position −1, which may reduce the probability of aberrant splicing. Some 5′ ss sequences may escape the proposed rules because splicing is a highly regulated mechanism involving a plethora of splicing factors. Moreover, cis-regulatory elements consisting of exonic and intronic splicing enhancers or silencers are known to play a critical role in 5′ ss recognition. However, the general trends exposed in this study are expected to prevail for the majority of 5′ ss sequences. They can be of great help to molecular geneticists in a first attempt to detect potential candidates for splicing defects and prompt them to set up additional transcript analyses.
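The general rules proposed in Table 1 amount to a small lookup on the substituted nucleotide at position +3 and the context at positions −1, +4, and +5. The sketch below encodes them as a Python function for illustration; the function name, argument names, and the ASCII symbols returned are choices made here, not part of the study, and the output is a first-pass indication to be followed by transcript analysis rather than a validated predictor.

```python
def splicing_defect_probability(substitution, minus1, plus4, plus5):
    """Return '-', '+', '++' or '+++' following the rules of Table 1.

    substitution: nucleotide replacing the consensus A at position +3.
    minus1, plus4, plus5: nucleotides at positions -1, +4 and +5.
    """
    sub = substitution.upper()
    if sub == "G":
        # +3A>G: positions +4 and +5 are informative, -1 is not.
        if plus4.upper() == "A":
            return "-" if plus5.upper() == "G" else "+"
        return "++" if plus5.upper() == "G" else "+++"
    if sub in ("C", "T"):
        # +3A>C or +3A>T: only position -1 is informative.
        return "+" if minus1.upper() == "G" else "+++"
    raise ValueError("substitution at +3 must be G, C or T")


# Example: +3A>G with a non-A at +4 and a non-G at +5.
print(splicing_defect_probability("G", minus1="G", plus4="T", plus5="A"))
```

Encoding the table as nested conditions keeps the "not contributive" positions explicit: they are simply never consulted in the corresponding branch.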

Acknowledgments

We thank Nicole Lautrédou (Montpellier Rio Imaging platform) for her technical support in fluorescent microscopy. We also thank Mariano Garcia Blanco (Department of Molecular Genetics, Durham, NC) for the gift of the pGint vector, and F. Baralle and E. Buratti (Molecular Pathology Laboratory, International Centre for Genetic Engineering and Biotechnology [ICGEB], Trieste, Italy) for the U1 plasmid. S.L-G. is supported by a fellowship from the French SOS Retinite Association. This work was partly supported by the Association Française contre les Myopathies (AFM).

References

Alvarez CJ, Wise JA. 2001. Activation of a cryptic 5′ splice site by U1 snRNA. RNA 7:342–350.
Annunen S, Korkko J, Czarny M, Warman ML, Brunner HG, Kaariainen H, Mulliken JB, Tranebjaerg L, Brooks DG, Cox GF, Cruysberg JR, Curtis MA, Davenport SLH, Friedrich CA, Kaitila I, Krawczynski MR, Latos-Bielenska A, Mukai S, Olsen BR, Shinho N, Somer M, Vikkula M, Zlotogora J, Prockop DJ, Ala-Kokko L. 1999. Splicing mutations of 54-bp exons in the COL11A1 gene cause Marshall syndrome, but other mutations cause overlapping Marshall/Stickler phenotypes. Am J Hum Genet 65:974–983.
Aretz S, Uhlhaas S, Sun Y, Pagenstecher C, Mangold E, Caspari R, Moslein G, Schulmann K, Propping P, Friedl W. 2004. Familial adenomatous polyposis: aberrant splicing due to missense or silent mutations in the APC gene. Hum Mutat 24:370–380.
Ars E, Kruyer H, Morell M, Pros E, Serra E, Ravella A, Estivill X, Lazaro C. 2003. Recurrent mutations in the NF1 gene are common among neurofibromatosis type 1 patients. J Med Genet 40:e82.
Asselta R, Montefusco MC, Duga S, Malcovati M, Peyvandi F, Mannucci PM, Tenchini ML. 2003. Severe factor V deficiency: exon skipping in the factor V gene causing a partial deletion of the C1 domain. J Thromb Haemost 1:1237–1244.
Baux D, Larrieu L, Blanchet C, Hamel C, Ben Salah S, Vielle A, Gilbert-Dussardier B, Holder M, Calvas P, Philip N, Edery P, Bonneau D, Claustres M, Malcolm S, Roux AF. 2007. Molecular and in silico analyses of the full-length isoform of usherlin identify new pathogenic alleles in usher type II patients. Hum Mutat 28:781–789.
Beaufrere L, Rieu S, Hache JC, Dumur V, Claustres M, Tuffery S. 1998. Altered rep-1 expression due to substitution at position +3 of the IVS13 splice-donor site of the choroideremia (CHM) gene. Curr Eye Res 17:726–729.
Berger I, Shaag A, Anikster Y, Baumgartner ER, Bar-Meir M, Joseph A, Elpeleg ON. 2001. Mutation analysis of the MCM gene in Israeli patients with mut(0) disease. Mol Genet Metab 73:107–110.
Bidichandani SI, Shiach CR, Lanyon WG, Connor JM. 1994. Novel splice donor mutation affecting position +3 in intron 6 of the factor-VIII gene. Hum Mol Genet 3:651–653.
Brackett JC, Sims HF, Rinaldo P, Shapiro S, Powell CK, Bennett MJ, Strauss AW. 1995. Two alpha-subunit donor splice-site mutations cause human trifunctional protein-deficiency. J Clin Invest 95:2076–2082.
Brow DA. 2002. Allosteric cascade of spliceosome activation. Ann Rev Genet 36:333–360.
Brunkow ME, Gardner JC, Van Ness J, Paeper BW, Kovacevich BR, Proll S, Skonier JE, Zhao L, Sabo PJ, Fu YH, Alisch RS, Gillett L, Colbert T, Tacconi P, Galas D, Hamersma H, Beighton P, Mulligan JT. 2001. Bone dysplasia sclerosteosis results from loss of the SOST gene product, a novel cystine knot-containing protein. Am J Hum Genet 68:577–589.
Buraczynska M, Wu WP, Fujita R, Buraczynska K, Phelps E, Andreasson S, Bennett J, Birch DG, Fishman GA, Hoffman DR, Inana G, Jacobson SG, Musarella MA, Sieving PA, Swaroop A. 1997. Spectrum of mutations in the RPGR gene that are identified in 20% of families with X-linked retinitis pigmentosa. Am J Hum Genet 61:1287–1292.
Buratti E, Chivers M, Kralovicova J, Romano M, Baralle M, Krainer AR, Vorechovsky I. 2007. Aberrant 5′ splice sites in human disease genes: mutation pattern, nucleotide structure and comparison of computational tools that predict their utilization. Nucleic Acids Res 35:4250–4263.
Burge C, Karlin S. 1997. Prediction of complete gene structures in human genomic DNA. J Mol Biol 268:78–94.
Carmel I, Tal S, Vig I, Ast G. 2004. Comparative analysis detects dependencies among the 5′ splice-site positions. RNA 10:828–840.
Carstens RP, Fenton WA, Rosenberg LR. 1991. Identification of RNA splicing errors resulting in human ornithine transcarbamylase deficiency. Am J Hum Genet 48:1105–1114.
Claes K, Machackova E, De Vos M, Mortier G, De Paepe A, Messiaen L. 1999. Mutation analysis of the BRCA1 and BRCA2 genes results in the identification of novel and recurrent mutations in 6/16 Flemish families with breast and/or ovarian cancer but not in 12 sporadic patients with early-onset disease. Mutation in Brief Online. Hum Mutat 13:256.
Clark F, Thanaraj TA. 2002. Categorization and characterization of transcript-confirmed constitutively and alternatively spliced introns and exons from human. Hum Mol Genet 11:451–464.
Cortes JJ, Sontheimer EJ, Seiwert SD, Steitz JA. 1993. Mutations in the conserved loop of human U5 snRNA generate use of novel cryptic 5′ splice sites in-vivo. EMBO J 12:5181–5189.
Crispino JD, Blencowe BJ, Sharp PA. 1994. Complementation by SR proteins of pre-messenger-RNA splicing reactions depleted of U1 snRNP. Science 265:1866–1869.
Deboer M, Bolscher B, Dinauer MC, Orkin SH, Smith CIE, Ahlin A, Weening RS, Roos D. 1992. Splice site mutations are a common cause of X-linked chronic granulomatous-disease. Blood 80:1553–1558.
Fini ME, Slaugenhaupt SA. 2002. Enzymatic mechanisms in corneal ulceration with specific reference to familial dysautonomia: potential for genetic approaches. In: Sullivan DA, Stern ME, Tsubota K, Dartt DA, Sullivan RM, Bromberg BB, editors. Lacrimal Gland, Tear Film and Dry Eye Syndromes 3. Advances in Experimental Medicine and Biology, vol. 506. New York: Springer. p 629–639.
Freund M, Hicks MJ, Otte M, Hertel KJ, Schaal H. 2005. Extended base pair complementarity between U1 snRNA and the 5′ splice site does not inhibit splicing in higher eukaryotes, but rather increases 5′ splice site recognition. Nucleic Acids Res 33:5112–5119.
Fujita R, Buraczynska M, Gieser L, Wu WP, Forsythe P, Abrahamson M, Jacobson SG, Sieving PA, Andreasson S, Swaroop A. 1997. Analysis of the RPGR gene in 11 pedigrees with the retinitis pigmentosa type 3 genotype: paucity of mutations in the coding region but splice defects in two families. Am J Hum Genet 61:571–580.
Harland M, Taylor CF, Chambers PA, Kukalizch K, Randerson-Moor JA, Gruis NA, de Snoo FA, ter Huurne JAC, Goldstein AM, Tucker MA, Bishop DT, Bishop JAN. 2005. A mutation hotspot at the p14ARF splice site. Oncogene 24:4604–4608.
Hartmann L, Theiss S, Niederacher D, Schaal H. 2008. Diagnostics of pathogenic splicing mutations: does bioinformatics cover all bases? Front Biosci 13:3252–3272.
Horowitz DS, Krainer AR. 1994. Mechanisms for selecting 5′ splice sites in mammalian pre-messenger-RNA splicing. Trends Genet 10:100–106.
Indo Y, Tsuruta M, Hayashida Y, Karim MA, Ohta K, Kawano T, Mitsubuchi H, Tonoki H, Awaya Y, Matsuda I. 1996. Mutations in the TRKA/NGF receptor gene in patients with congenital insensitivity to pain with anhidrosis. Nat Genet 13:485–488.
Jurica MS, Moore MJ. 2003. Pre-mRNA splicing: awash in a sea of proteins. Mol Cell 12:5–14.
Kaler SG, Gallo LK, Proud VK, Percy AK, Mark Y, Segal NA, Goldstein DS, Holmes CS, Gahl WA. 1994. Occipital horn syndrome and a mild Menkes phenotype associated with splice-site mutations at the MNK locus. Nat Genet 8:195–202.
Kan R, Twigg SRF, Berg J, Wang L, Jin F, Wilkie AOM. 2004. Expression analysis of an FGFR2 IIIc 5′ splice site mutation (1084+3A-G). J Med Genet 41:e108.
Ketterling RP, Drost JB, Scaringe WA, Liao DZ, Liu JZ, Kasper CK, Sommer SS. 1999. Reported in vivo splice-site mutations in the factor IX gene: severity of splicing defects and a hypothesis for predicting deleterious splice donor mutations. Hum Mutat 13:221–231.
Kirschner LS, Carney JA, Pack SD, Taymans SE, Giatzakis C, Cho YS, Cho-Chung YS, Stratakis CA. 2000. Mutations of the gene encoding the protein kinase A type I-alpha regulatory subunit in patients with the Carney complex. Nat Genet 26:89–92.
Kohonen Corish M, Ross VL, Doe WF, Kool DA, Edkins E, Faragher I, Wijnen J, Khan PM, Macrae F, St John DJB. 1996. RNA-based mutation screening in hereditary nonpolyposis colorectal cancer. Am J Hum Genet 59:818–824.
Konarska MM, Query CC. 2005. Insights into the mechanisms of splicing: more lessons from the ribosome. Genes Dev 19:2255–2260.


Krahn M, Bernard R, Pecheux C, Hammouda EH, Eymard B, De Munain AL, Cobo AM, Romero N, Urtizberea A, Leturcq F, Leevy N. 2006. Screening of the CAPN3 gene in patients with possible LGMD2A. Clin Genet 69:444–449.
Krawczak M, Thomas NST, Hundrieserl B, Mort M, Wittig M, Hampe J, Cooper DN. 2007. Single base-pair substitutions in exon-intron junctions of human genes: nature, distribution, and consequences for mRNA splicing. Hum Mutat 28:150–158.
Laake K, Jansen L, Hahnemann JM, Brondum-Nielsen K, Lonnqvist T, Kaariainen H, Sankila R, Lahdesmaki A, Hammarstrom L, Yuen J, Leevy N. 2000. Characterization of ATM mutations in 41 Nordic families with ataxia telangiectasia. Hum Mutat 16:232–246.
Liu B, Parsons RE, Hamilton SR, Petersen GM, Lynch HT, Watson P, Markowitz S, Willson JKV, Green J, Delachapelle A, Kinzler KW, Vogelstein B. 1994. hMSH2 mutations in hereditary nonpolyposis colorectal-cancer kindreds. Cancer Res 54:4590–4594.
Lucchiari S, Donati MA, Parini R, Melis D, Gatti R, Bresolin N, Scarlato G, Comi GP. 2002. Molecular characterization of GSD III subjects and identification of six novel mutations in AGL. Hum Mutat 20:480.
Luhrmann R, Kastner B, Bach M. 1990. Structure of spliceosomal snRNPs and their role in pre-messenger-RNA splicing. Biochim Biophys Acta 1087:265–292.
Lund E, Dahlberg JE. 1984. True genes for human U1 small nuclear-RNA. Copy number, polymorphism, and methylation. J Biol Chem 259:2013–2021.
Lund M, Kjems J. 2002. Defining a 5′ splice site by functional selection in the presence and absence of U1 snRNA 5′ end. RNA 8:166–179.
Madsen PP, Kibaek M, Roca X, Sachidanandam R, Krainer AR, Christensen E, Steiner RD, Gibson KM, Corydon TJ, Knudsen I, Wanders RJA, Ruiter JPN, Gregersen N, Andresen BS. 2006. Short/branched-chain acyl-CoA dehydrogenase deficiency due to an IVS3+3A>G mutation that causes exon skipping. Hum Genet 118:680–690.
Moller-Morlang K, Tavassoli K, Eigel A, Pollmann H, Horst J. 1999. Mutational screening in the factor VIII gene resulting in the identification of three novel mutations, one of which is a donor splice mutation. Mutation in Brief Online. Hum Mutat 13:504.
Morimura H, Saindelle-Ribeaudeau F, Berson EL, Dryja TP. 1999. Mutations in RGR, encoding a light-sensitive opsin homologue, in patients with retinitis pigmentosa. Nat Genet 23:393–394.
Nelson KK, Green MR. 1990. Mechanism for cryptic splice site activation during pre-messenger-RNA splicing. Proc Natl Acad Sci USA 87:6253–6257.
Newman AJ, Norman C. 1992. U5 snRNA interacts with exon sequences at 5′ and 3′ splice sites. Cell 68:743–754.
Ohno K, Brengman JM, Felice KJ, Cornblath DR, Engel AG. 1999. Congenital endplate acetylcholinesterase deficiency caused by a nonsense mutation and an A-G splice-donor-site mutation at position +3 of the collagen-like-tail-subunit gene (COLQ): how does G at position +3 result in aberrant splicing? Am J Hum Genet 65:635–644.
Park KY, Dalakas MC, Goebel HH, Ferrans VJ, Semino-Mora C, Litvak S, Takeda K, Goldfarb LG. 2000. Desmin splice variants causing cardiac and skeletal myopathy. J Med Genet 37:851–857.
Pensotti V, Radice P, Presciuttini S, Calistri D, Gazzoli I, Perez APG, Mondini P, Buonsanti G, Sala P, Rossetti C, Ranzani GN, Bertario L, Pierotti MA. 1997. Mean age of tumor onset in hereditary nonpolyposis colorectal cancer (HNPCC) families correlates with the presence of mutations in DNA mismatch repair genes. Genes Chromosomes Cancer 19:135–142.
Pesch UEA, Leo-Kottler B, Mayor S, Jurklies B, Kellner U, Apfelstedt-Sylla E, Zrenner E, Alexander C, Wissinger B. 2001. OPA1 mutations in patients with autosomal dominant optic atrophy and evidence for semi-dominant inheritance. Hum Mol Genet 10:1359–1368.
Pettigrew C, Wayte N, Lovelock PK, Tavtigian SV, Chenevix-Trench G, Spurdle AB, Brown MA. 2005. Evolutionary conservation analysis increases the colocalization of predicted exonic splicing enhancers in the BRCA1 gene with missense sequence changes and in-frame deletions, but not polymorphisms. Breast Cancer Res 7:R929–R939.
Purandare SM, Lanyon WG, Connor JM. 1994. Characterization of inherited and sporadic mutations in neurofibromatosis type-1. Hum Mol Genet 3:1109–1115.
Qi M, Byers PH. 1998. Constitutive skipping of alternatively spliced exon 10 in the ATP7A gene abolishes Golgi localization of the Menkes protein and produces the occipital horn syndrome. Hum Mol Genet 7:465–469.
Raponi M, Baralle D. 2008. Can donor splice site recognition occur without the involvement of U1 snRNP? Biochem Soc Trans 36:548–550.
Richard MM, Erenberg G, Triggsraine BL. 1995. An A-to-G mutation at the +3-position of intron-8 of the HEXA gene is associated with exon-8 skipping and Tay-Sachs-disease. Biochem Mol Med 55:74–76.
Roca X, Olson AJ, Rao AR, Enerly E, Kristensen VN, Borresen-Dale AL, Andresen BS, Krainer AR, Sachidanandam R. 2008. Features of 5′-splice-site efficiency derived from disease-causing mutations and comparative genomics. Genome Res 18:77–87.

Rogan PK, Faux BM, Schneider TD. 1998. Information analysis of human splice site mutations. Hum Mutat 12:153–171.
Rogan PK, Svojanovsky S, Leeder JS. 2003. Information theory-based analysis of CYP2C19, CYP2D6 and CYP3A5 splicing mutations. Pharmacogenetics 13:207–218.
Rossi F, Forne T, Antoine E, Tazi J, Brunel C, Cathala G. 1996. Involvement of U1 small nuclear ribonucleoproteins (snRNP) in 5′ splice site-U1 snRNP interaction. J Biol Chem 271:23985–23991.
Sahashi K, Masuda A, Matsuura T, Shinmi J, Zhang Z, Takeshima Y, Matsuo M, Sobue G, Ohno K. 2007. In vitro and in silico analysis reveals an efficient algorithm to predict the splicing consequences of mutations at the 5′ splice sites. Nucleic Acids Res 35:5995–6003.
Saihan Z, Webster AR, Luxon L, Bitner-Glindzicz M. 2009. Update on Usher syndrome. Curr Opin Neurol 22:19–27.
Sandoval N, Platzer M, Rosenthal A, Dork T, Bendix R, Skawran B, Stuhrmann M, Wegner RD, Sperling K, Banin S, Shiloh Y, Baumer A, Bernthaler U, Sennefelder H, Brohm M, Weber BHF, Schindler D. 1999. Characterization of ATM gene mutations in 66 ataxia telangiectasia families. Hum Mol Genet 8:69–79.
Savino M, Borriello A, D’Apolito M, Criscuolo M, Del Vecchio M, Bianco AM, Di Perna M, Calzone R, Nobili B, Zatterale A, Zelante L, Joenje H, Della Ragione F, Savoia A. 2003. Spectrum of FANCA mutations in Italian Fanconi anemia patients: identification of six novel alleles and phenotypic characterization of the S858R variant. Hum Mutat 22:338–339.
Sawa H, Shimura Y. 1992. Association of U6 snRNA with the 5′-splice site region of pre-messenger-RNA in the spliceosome. Genes Dev 6:244–254.
Seraphin B, Kretzner L, Rosbash M. 1988. A U1 snRNA–pre-mRNA base-pairing interaction is required early in yeast spliceosome assembly but does not uniquely define the 5′ cleavage site. EMBO J 7:2533–2538.
Siliciano PG, Guthrie C. 1988. 5′ splice site selection in yeast: genetic alterations in base-pairing with U1 reveal additional requirements. Genes Dev 2:1258–1267.
Sontheimer EJ, Steitz JA. 1993. The U5 and U6 small nuclear RNAs as active-site components of the spliceosome. Science 262:1989–1996.
Staden R. 1984. Computer methods to locate signals in nucleic-acid sequences. Nucleic Acids Res 12:505–519.
Stenson PD, Ball EV, Mort M, Phillips AD, Shiel JA, Thomas NST, Abeysinghe S, Krawczak M, Cooper DN. 2003. Human Gene Mutation Database (HGMD®): 2003 update. Hum Mutat 21:577–581.
Tamary H, Dgany O, Toledano H, Shalev Z, Krasnov T, Shalmon L, Schechter T, Bercovich D, Attias D, Laor R, Koren A, Yaniv I. 2004. Molecular characterization of three novel Fanconi anemia mutations in Israeli Arabs. Eur J Haematol 72:330–335.

Tuddenham EG, Schwaab R, Seehafer J, Millar DS, Gitschier J, Higuchi M, Bidichandani S, Connor JM, Hoyer LW, Yoshioka A. 1994. Haemophilia A: database of nucleotide substitutions, deletions, insertions and rearrangements of the factor VIII gene, second edition. Nucleic Acids Res 22:4851–4868.
Tzetis M, Efthymiadou A, Doudounakis S, Kanavakis E. 2001. Qualitative and quantitative analysis of mRNA associated with four putative splicing mutations (621+3A-G, 2751+2T-A, 296+1G-C, 1717-9T-C-D565G) and one nonsense mutation (ES22X) in the CFTR gene. Hum Genet 109:592–601.
Wagner EJ, Baines A, Albrecht T, Brazas RM, Garcia-Blanco MA. 2004. Imaging alternative splicing in living cells. Methods Mol Biol 257:29–46.
Wagner TMU, Hirtenlehner K, Shen PD, Moeslinger R, Muhr D, Fleischmann E, Concin H, Doeller W, Haid A, Lang AH, Mayer P, Petru E, Ropp E, Langbauer G, Kubista E, Scheiner O, Underhill P, Mountain J, Stierer M, Zielinski C, Oefner P. 1999. Global sequence diversity of BRCA2: analysis of 71 breast cancer families and 95 control individuals of worldwide populations. Hum Mol Genet 8:413–423.
Wang XH, Pohfitzpatrick M, Chen T, Malavade K, Carriero D, Piomelli S. 1995. Systematic screening for RNA with skipped exons: splicing mutations of the ferrochelatase gene. Biochim Biophys Acta 1271:358–362.
Wassarman DA, Steitz JA. 1992. Interactions of small nuclear RNAs with precursor messenger-RNA during in vitro splicing. Science 257:1918–1925.
Wilton SD, Johnsen RD, Pedretti JR, Laing NG. 1993. Two distinct mutations in a single dystrophin gene: identification of an altered splice-site as the primary Becker muscular-dystrophy mutation. Am J Med Genet 46:563–569.
Wimmer K, Roca X, Beiglbock H, Callens T, Etzler J, Rao AR, Krainer AR, Fonatsch C, Messiaen L. 2007. Extensive in silico analysis of NF1 splicing defects uncovers determinants for splicing outcome upon 5′ splice-site disruption. Hum Mutat 28:599–612.
Zeniou M, Pannetier S, Fryns JP, Hanauer A. 2002. Unusual splice-site mutations in the RSK2 gene and suggestion of genetic heterogeneity in Coffin-Lowry syndrome. Am J Hum Genet 70:1421–1433.
Zhang D, Abovich N, Rosbash M. 2001. A biochemical function for the Sm complex. Mol Cell 7:319–329.
Zhuang Y, Weiner AM. 1986. A compensatory base change in U1 snRNA suppresses a 5′ splice site mutation. Cell 46:827–835.
Zolezzi F, Valli M, Clementi M, Mammi I, Cetta G, Pignatti PF, Mottes M. 1997. Mutation producing alternative splicing of exon 26 in the COL1A2 gene causes type IV osteogenesis imperfecta with intrafamilial clinical variability. Am J Med Genet 71:366–370.
