The amelogenin story: origin and evolution

Copyright © Eur J Oral Sci 2006

Eur J Oral Sci 2006; 114 (Suppl. 1): 64–77 Printed in Singapore. All rights reserved

European Journal of Oral Sciences

Sire J-Y, Delgado S, Girondot M. The amelogenin story: origin and evolution. Eur J Oral Sci 2006; 114 (Suppl. 1): 64–77. © Eur J Oral Sci, 2006

Genome sequencing and gene mapping have permitted the identification of HEVIN (SPARC-Like1) as the probable ancestor of the enamel matrix proteins (EMPs), amelogenin (AMEL), ameloblastin (AMBN) and enamelin (ENAM). We have undertaken a phylogenetic analysis to elucidate their relationships. AMEL genes available in databases, and new sequences obtained by blast searching genomes or expressed sequence tags, were compiled (22 full-length sequences) and aligned, and the ancestral sequence was calculated and used to search for similarities using psi-blast. Hits were obtained with the N-terminal region of AMBN, ENAM, and HEVIN. We retrieved all available AMBN (n = 8), ENAM (n = 3), and HEVIN (n = 4) sequences. The sequences of the four proteins were aligned and analyzed phylogenetically. AMEL and AMBN are sister genes, which diverged after duplication of a common ancestor that itself arose from ENAM. The latter derived from a copy of HEVIN. Comparisons of gene organization, amino acid sequences and location of ENAM and AMBN, adjacent on the same chromosome, suggest that AMBN is closer to ENAM than AMEL is. This supports AMEL as being derived from AMBN duplication. This duplication occurred long before tetrapod differentiation, probably in an ancestral osteichthyan. The story of AMEL origin is completed as follows: SPARC → HEVIN → ENAM → AMBN → AMEL.

Duplication and diversification of ancestral genes encoding mineralized tissues – bone, dentin and enamel – were crucial events in vertebrate evolution, in particular for protection and locomotion (skeleton), and for feeding adaptations (teeth). These tissues emerged early in vertebrate history, as they were already present in taxa without jaws, some 500 million yr ago (Ma) (1). Teeth were probably recruited from odontodes, which were located on the dermal armor of these early vertebrates, at the time the jaws formed in early gnathostomes (2–6). To date, teeth are present in most vertebrate lineages and their loss in, for example, turtles and birds has occurred secondarily, 200 and 80 Ma, respectively (7). In all vertebrate species, teeth share a typical structure comprising a protective tissue (enamel or enameloid) covering a sensitive tissue (dentin) surrounding a pulp cavity containing nerve endings and blood vessels. Dentin, which is deposited by mesenchymal cells (i.e. the odontoblasts), probably shares its origin with bone. Both tissues form on a collagenous lattice, which is secondarily impregnated with numerous calcium-binding phosphoproteins [e.g. dentin sialophosphoprotein (DSPP), dentin matrix acidic phosphoprotein 1 (DMP1), integrin-binding sialoprotein (IBSP), and secreted phosphoprotein 1 (SPP1)]. In contrast to dentin, enamel, which is deposited by epithelial cells (i.e. the ameloblasts), is a unique tissue. In mammals, its organic matrix is composed of three, non-pleiotropic, enamel

Jean-Yves Sire1, Sidney Delgado1, Marc Girondot2

1Equipe 'Evolution & Développement du Squelette', UMR7138 'Systématique, Adaptation, Evolution' – CNRS, Université Pierre & Marie Curie, MNHN, IRD, ENS – Paris, France; 2UMR 8079, Ecologie, Systématique et Evolution, Université Paris-Sud, Orsay, France

Dr Jean-Yves Sire, Université Paris 6, UMR 7138, Case 5, 7 quai St-Bernard, 75005 Paris, France
Telefax: +33–1)44273572
E-mail: [email protected]

Key words: ameloblastin; amelogenin; bioinformatic; enamelin; evolution; hevin

Accepted for publication November 2005

matrix proteins (EMPs): amelogenin (AMEL), which accounts for 90% of the forming enamel matrix; ameloblastin (AMBN); and enamelin (ENAM). Enameloid, although considered a tissue homologous to enamel, is particular in that both odontoblasts and ameloblasts contribute to its formation. Odontoblasts first deposit a loose collagenous matrix, which is secondarily impregnated by epithelial proteins, probably including proteinases, leading to the maturation of a hypermineralized tissue, which is similar to enamel. Enameloid covers the teeth of chondrichthyans (e.g. sharks and rays), actinopterygians (e.g. trout, zebrafish, cichlids), and basal sarcopterygians (lungfish, coelacanth, and salamander and newt larvae). Enamel covers the teeth of tetrapods [adult amphibians, sauropsids (= reptiles) and mammals]. The fossil record allows the evolution of these tissues to be traced in vertebrates, and it is currently well established that tetrapod enamel derives from an ancestral enameloid through a heterochronic process: reduction of the odontoblast contribution and increase of the ameloblast production of enamel-specific proteins (1,8,9). During tetrapod evolution, the enamel structure was subjected to evolutionary changes in the various lineages, leading to its current structural diversity: non-prismatic in amphibians and sauropsids, and prismatic in mammals, the latter condition deriving from the former (10). These changes in enamel structure could be the result of both a different


organization of the ameloblasts (the presence of Tomes’ processes in mammals only) and variations in the structure and function of EMP-coding genes. Although immunohistochemical studies have indicated the possible presence of EMPs in enameloid of sharks and fish (11–13), none of the genes coding for these proteins has been detected, to date, in taxa possessing enameloid. In contrast, EMP genes are known in tetrapods and their evolutionary analysis has the potential to provide insight into their origin, their relationships, and their specific structural features, which could be linked to their function. Previous studies dealing with the evolution of mineralized tissues have indicated that the three EMP genes belong to a single family, which probably derives from a common ancestor, HEVIN (also known as SPARC-like 1, MAST9, SC1, QR1), a member of the SPARC family (14–16). For several years, we have focused our attention on the best known EMP, AMEL, with the objective of gaining a better understanding of its evolution. Recently, we have described AMEL evolution in mammals and calculated an ancestral putative sequence for mammalian AMEL (i.e. dating back to ~200 Ma) (17). We have also shown that the first two exons of AMEL have a stronger relationship with those of AMBN and ENAM than with those of HEVIN (18), confirming a probable common origin for the EMPs. The present study was undertaken to answer the question of the relationships of AMEL. We compiled all available sequences for AMEL, AMBN, ENAM and HEVIN, added new sequences, and performed an evolutionary analysis. We demonstrate that AMEL and AMBN are sister genes, which have a common origin, long before tetrapod diversification, and that AMEL was created after the duplication of an ancestral AMBN. In addition, our study reveals that the AMBN/AMEL ancestor originated from a duplication of ENAM, which itself was derived from HEVIN.

Material and methods

Sequence retrieval

Amelogenin: We retained 15 complete coding sequences (CDS) available in databases, to which the following sequences were added: opossum and hamster sequences (although they lack the signal peptide); the computer-predicted dog sequence (after correction using blast search and comparison with other sequences); the chimpanzee sequence (completed after blast search and comparison with human); the sequence of Chalcides viridanus, a scincid lizard (19); the sequence of Rana catesbeiana, a frog (20); and the sequence of Xenopus tropicalis, a frog (blast search against the currently well-advanced sequenced genome in the JGI sequencing project database). The references concerning these 22 tetrapod sequences are listed in Table 1.

Ameloblastin (amelin): We retained eight complete CDS, to which the following sequences were added: the computer-predicted dog sequence after correction; the cow sequence after being completed; and the Xenopus tropicalis sequence (blast search against the genome). The references for these 11 tetrapod sequences are listed in Table 1. The


computer-predicted chimpanzee sequence found in GenBank was not included in the analysis because of the presence of a reading frameshift in exon 13, probably the result of a sequencing error.

Enamelin: Only three full-length CDS were available in databases, to which we added the computer-predicted sequences of rat, chimpanzee and dog, after correction. The reference data concerning these six mammalian sequences are listed in Table 1.

HEVIN (SPARC-like1, MAST9, SC1, QR1): Six full-length CDS were found in the databases, including the quail and fugu sequences. We added four more sequences: computer-predicted sequences from chimpanzee and dog (after correction); and pig and cow sequences [retrieved after blast searching expressed sequence tags (ESTs)]. Two additional sequences were recovered after blast searching the chicken and tetraodon genomes. The N-terminal region of bird and fish sequences could not be aligned to that of mammals because of a very high number of substitutions. Therefore, they were not used for the present study. Only the eight mammalian sequences were used. References to these sequences are listed in Table 1.

Molecular analyses

Putative amino acid sequences were deduced from nucleotide sequences using DNA strider 1.2 software (for Mac: Ch. Marck and C.E.A., BGM, GiF/Yvette, France, 1991). The inferred protein sequences were aligned using clustalx 1.83 (ftp-igbmc.u-strasbg.fr) and hand-checked afterwards using the sequence alignment editor Se-Al 2.0 (1996–2002; Andrew Rambaut, University of Oxford, Oxford, UK).

Putative ancestral sequences: Such sequences are more useful than consensus sequences for tracing a gene back to its origin. However, because representative sequences were not always available in all lineages, the ancestral sequences have to be considered as being only indicative of our current knowledge of the evolution of these genes. For each gene, the sequences were aligned and gaps were removed.
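The gap-removal step can be illustrated with a minimal Python sketch. It assumes that "gaps were removed" means dropping every alignment column in which at least one sequence has a gap; the text does not specify this, and the function name and the toy sequences are hypothetical, not taken from the study.

```python
def strip_gap_columns(alignment, gap_chars="-"):
    """Return a copy of the alignment without any column that contains a gap.

    `alignment` maps sequence names to aligned strings of equal length.
    This is an illustrative reading of "gaps were removed", not the
    procedure actually used by the authors.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # Keep the index of every column in which no sequence carries a gap.
    keep = [
        i for i, column in enumerate(zip(*seqs))
        if not any(residue in gap_chars for residue in column)
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in zip(names, seqs)}


# Toy alignment (hypothetical residues, for illustration only).
toy = {
    "seq_a": "MGTW-ILFAC",
    "seq_b": "MGTWAILF-C",
    "seq_c": "MKTWAILFAC",
}
stripped = strip_gap_columns(toy)
for name, seq in stripped.items():
    print(name, seq)
```

Removing whole columns keeps the remaining positions homologous across all sequences, which is what a per-site ancestral reconstruction needs.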
Then, each alignment was transferred into paup 4.0 (Sinauer Associates, Sunderland, MA, USA), taking into account the current tetrapod phylogeny using macclade 3.06 (Sinauer Associates). The putative ancestral sequences were calculated at each node of the tree. For instance, AMEL sequences were obtained for putative mammal, amniote, and tetrapod ancestors.

Psi-blast search: The psi-blast program [National Center for Biotechnology Information (NCBI) website] performs an iterative search, in which sequences found in one round of searching (iteration) are used to build a score model for the next round of searching. Different regions (exon 2 alone, exons 2+3, exons 2+3+5, exons 2–6) of the three ancestral AMEL sequences were used for blast searching all nonredundant GenBank peptide sequences, with the statistical significance threshold for including a sequence set at 0.5. For the queries using only exon 2 (18 residues) and exons 2+3 (34 residues), we used pam-30 as the blast substitution matrix. For the larger sequences, blosum-62 was preferred as the most sensitive matrix.

Phylogenetic analysis: The 47 sequences (22 AMEL, 11 AMBN, 6 ENAM and 8 HEVIN) were aligned using

Table 1. Species used in the study and references to nucleotide sequences

| Lineage | Name | Species | AMEL (n = 22) | AMBN (n = 11) | ENAM (n = 6) | HEVIN (n = 12) |
|---|---|---|---|---|---|---|
| Mammals, Primates, Hominids | Human | Homo sapiens | AF436849 | AF219994 | AF125373 | AF321976 |
| Mammals, Primates, Hominids | Chimpanzee | Pan troglodytes | AB091781(+++) | – | XM_526591(+++) | CB294583(+++) |
| Mammals, Primates, Hominids | Orangutan | Pongo pygmaeus | – | – | – | CR860739 |
| Mammals, Primates, Cebids | Common squirrel monkey | Saimiri sciureus | AB091783 | – | – | – |
| Mammals, Primates, Lemurids | Ring-tailed lemur | Lemur catta | AB091785 | – | – | – |
| Mammals, Primates, Galagonids | Small-eared galago | Otolemur garnettii | AB091787 | – | – | – |
| Mammals, Rodents, Murids | House mouse | Mus musculus | D31768 | BC087927 | U82698 | BC003759 |
| Mammals, Rodents, Murids | Norway rat | Rattus norvegicus | U67130 | U35097 | XM_223338(+++) | BC061755 |
| Mammals, Rodents, Murids | Golden hamster | Mesocricetus auratus | AF005245 | – | – | – |
| Mammals, Rodents, Caviids | Domestic guinea pig | Cavia porcellus | AJ012200 | AJ537436 | – | – |
| Mammals, Cetartiodactyls, Suids | Pig | Sus scrofa | U43405 | U43404 | U52196 | Present study |
| Mammals, Cetartiodactyls, Bovids | Cow | Bos taurus | M63631 | AF157019(+++) | – | Present study |
| Mammals, Cetartiodactyls, Bovids | Goat | Capra hircus | AF215889 | – | – | – |
| Mammals, Perissodactyls, Equids | Horse | Equus caballus | AB032193 | – | – | – |
| Mammals, Carnivores, Canids | Dog | Canis familiaris | AB080686(+++) | XM_539304(+++) | XM_539305(+++) | XM_535648(+++) |
| Mammals, Metatherians, Didelphids | Gray short-tailed opossum | Monodelphis domestica | U43407 | – | – | – |
| Sauropsids, Aves, Phasianids | Chicken | Gallus gallus | – | – | – | Present study |
| Sauropsids, Aves, Phasianids | Japanese quail | Coturnix japonica | – | – | – | M61908 |
| Sauropsids, Crocodylids | Cuvier’s dwarf caiman | Paleosuchus palpebrosus | AF095568 | – | – | – |
| Sauropsids, Crocodylids | Spectacled caiman | Caiman crocodilus | – | AY043290 | – | – |
| Sauropsids, Squamates, Serpentes | Colubrid snake | Elaphe quadrivirgata | AF118568 | – | – | – |
| Sauropsids, Squamates, Scincids | Canarian skink | Chalcides viridanus | (19) | – | – | – |
| Amphibians, Anurans, Pipids | African clawed frog (two genes) | Xenopus laevis | AF095569; AF095570 | AY181985; AY181986 | – | – |
| Amphibians, Anurans, Pipids | Western clawed frog | Xenopus tropicalis | Present study | Present study | – | – |
| Amphibians, Anurans, Ranids | Bull frog | Rana catesbeiana | (20) | – | – | – |
| Actinopterygians, Teleosts, Tetraodontids | Torafugu | Takifugu rubripes | – | – | – | AY575075 |
| Actinopterygians, Teleosts, Tetraodontids | Tetraodon | Tetraodon nigroviridis | – | – | – | Present study |

AMBN, ameloblastin; AMEL, amelogenin; ENAM, enamelin. +++, nucleotide sequences that were completed using blast search.


clustalx. Only the N-terminal region, aligning to the 62 residues of the tyrosine-rich amelogenin peptide (TRAP) region of AMEL, was used in the analysis. The phylogeny was constructed using maximum likelihood in paup 4.0 (transitions and transversions at different rates, empirical base frequencies, gamma distribution for among-site variation with estimated shape parameter). A backbone constraint was used for the phylogeny within each gene, based on the well-established tetrapod phylogeny. The tree was rooted on the HEVIN gene, as it is the only one of the four genes present in actinopterygian genomes.
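For readers unfamiliar with rooted trees, the gene-level topology reported in this study (HEVIN as the root, then ENAM, then the AMBN/AMEL pair) can be written as nested tuples and queried for sister groups. The following Python sketch is purely illustrative: the nested-tuple encoding and the function names are assumptions made here, and the tree is collapsed to one tip per gene rather than the 47 sequences actually analyzed.

```python
def leaves(node):
    """Return the set of tip names below a node (a string or a 2-tuple)."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def sister_of(tree, clade):
    """Return the tip set of the sister group of `clade`, or None if absent.

    `tree` is a rooted binary tree encoded as nested 2-tuples;
    `clade` is a set of tip names that must form a complete subtree.
    """
    if isinstance(tree, str):
        return None
    left, right = tree
    if leaves(left) == clade:
        return leaves(right)
    if leaves(right) == clade:
        return leaves(left)
    return sister_of(left, clade) or sister_of(right, clade)


# Gene-level topology as stated in the text, rooted on HEVIN.
gene_tree = ("HEVIN", ("ENAM", ("AMBN", "AMEL")))

amel_sister = sister_of(gene_tree, {"AMEL"})
pair_sister = sister_of(gene_tree, {"AMBN", "AMEL"})
emp_sister = sister_of(gene_tree, {"ENAM", "AMBN", "AMEL"})
print(amel_sister, pair_sister, emp_sister)
```

Rooting on HEVIN fixes the direction of the tree, so that "sister group" has an unambiguous meaning for each of the three nested clades.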

Results

Amelogenin sequences

AMEL, the major protein found in the forming enamel matrix (up to 90% in cattle) (21), plays a crucial role in enamel structure and mineralization, as exemplified by the severe disorders resulting from various mutations (X-linked amelogenesis imperfecta; reviewed previously [22]). However, the exact functional contribution of its various domains and conserved residues is still poorly understood. In most eutherian mammals, two copies exist, one on each of the sex chromosomes, X and Y, while in other taxa the gene is autosomal. AMEL-Y sequences were not retained in our study because they show a particular evolution rate (7). Prior to this study, full-length AMEL sequences were known in tetrapods (i.e. mammals, sauropsids, and amphibians).

New sequences, alignment, and ancestral sequence – The chimpanzee sequence is identical to the human sequence (no variation, not even at the nucleotide level), in contrast to the other primate sequences, which show a few substitutions (Fig. 1). Interestingly, the two lemur sequences show the characteristic insertions of triplet repeats in the region of a hotspot of mutation identified in exon 6 of mammalian AMEL (17). The dog sequence lacks a coding exon 4, and no AMEL copy was found on the Y chromosome. The X. tropicalis sequence resembles that of X. laevis, and particularly that of clone 2, but 26 residues (12.9%) differ between the two sequences. The alignment of the 22 AMEL sequences indicates that the tetrapod ancestral gene comprised six exons coding for a protein composed of 193 amino acids (Fig. 2). Indeed, exon 4 is lacking in amphibians, in sauropsids and in some mammals, indicating that it has been recruited recently in mammals, after the differentiation of eutherian lineages. Exons 8 and 9, which were found in rodent sequences (23), are also lacking.
The N-terminal region (the so-called TRAP region: the first 62 amino acids, excluding those of exon 4) contains 21 residues that were unchanged during 350 million yr (Myr) of tetrapod evolution. Sauropsid and amphibian sequences show much greater variation in exons 2 and 5 than mammal sequences. The central region of the protein (most of exon 6) is rich in proline (P) and glutamine (Q) residues, highly variable (in particular in sauropsids and amphibians), and not easy to align. In the region considered to be a hotspot of mutation, numerous substitutions (and also large insertions and/or deletions)


have occurred independently in various species of the mammalian, sauropsid and amphibian lineages. The C-terminal region (the last 32 amino acids) possesses five well-conserved amino acids. The amino acid similarity was found to be high in mammals (e.g. 94.3% between human and dog), medium between human and crocodile (51.8%), and low between human and frog (45.6%), the latter being close to the ancestral sequence.

PSI-BLAST search of sequence similarities – The N-terminal region of the putative ancestral AMEL sequence for mammal, amniote (= mammals + sauropsids), and tetrapod (= amniotes + amphibians) lineages was used to search for sequence similarities. Exons 2+3 (34 amino acids), exons 2+3+5 (49 amino acids) and exons 2+3+5+6 (64 amino acids) detected similarities with AMBN at the first or second iteration. Search with large sequences (exons 2–5 and 2–6) of the three ancestral AMEL identified ENAM at the third or fourth iteration. The best hits were obtained with exons 2–5 of the mammal and amniote ancestral sequences: in addition to AMBN and ENAM, they also detected similarities with HEVIN at the sixth iteration.

Ameloblastin, enamelin and hevin sequences

In contrast to our rather representative data set for AMEL (22 sequences), our knowledge is more limited for AMBN (n = 11), ENAM (n = 6) and HEVIN (n = 8) (Table 1).

Ameloblastin sequences – Briefly, AMBN represents less than 5% of the EMPs in the developing enamel, but it is assumed to play a key role in the control of enamel crystal growth and in the subsequent determination of the prismatic structure (24). However, the function of its various domains is far from understood. Given its possible role in structuring enamel, AMBN mutations should result in genetic disorders. However, although suspected to be responsible for autosomal-dominant amelogenesis imperfecta, AMBN mutations have not been reported to date. Prior to this study, full-length AMBN sequences were known in eutherian mammals, a sauropsid and an amphibian (Fig. 3). The pig, cow (for which we completed exons 6 and 12) and dog sequences lack exons 8 and 9, which are present in the primate sequences. The X. tropicalis sequence resembles that of X. laevis, and particularly that of clone 1, but 58 residues (12.8%) differ between the two sequences. The alignment of the 11 AMBN sequences indicates that the tetrapod ancestral gene was composed of 11 exons only, coding for a 432-amino acid protein (Fig. 3). Indeed, exons 8 and 9 are lacking in most sequences, indicating that these exons have been recruited recently in mammals, and probably during primate differentiation, by two repeats of exon 7. The N-terminal region (~110 amino acids, from exon 2 to the start of exon 6) contains a number of well-conserved residues, although the sauropsid and amphibian sequences show more variations in exons 2, 4 and 5 than in the mammalian


Fig. 1. Amelogenin. Alignment of the amino acid sequences. The putative ancestral tetrapod sequence is shown at the top. Vertical bars indicate the limits between exons. The signal peptide is in a box. The names of the new sequences are shown in bold. Unchanged residues are shown on a gray background. The two arrows indicate the TRAP proteolytic sites. (.), Identical residue; (–), indel; (*), unknown residue. See Table 1 for sequence references and Latin names of the species.

exons. The central region of the protein (most of exon 6, exons 7, 10 and 13) is variable (numerous substitutions), while exon 12 is more conservative. Exons 5 and 6 are rich in P and Q. Only the last four to seven residues of the C-terminal region are well conserved. The amino acid similarity is 82.9% between human and dog sequences, 42.8% between human and crocodile and only 24.3% between human and frog (which is close to the ancestral sequence).
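Pairwise percentages of this kind can be illustrated with a short Python sketch. It expands the dot notation used in the alignment figures ("." for a residue identical to the reference row) and computes percent identity over ungapped columns. The two strings are the exon 2 block of the ancestral and dog ameloblastin rows as far as they can be read from the extracted alignment; the function names are hypothetical, and plain identity is an assumption, since the text does not state how "similarity" was calculated.

```python
def expand_dots(reference, row):
    """Replace each '.' in an alignment row by the reference residue at that column."""
    if len(reference) != len(row):
        raise ValueError("reference and row must have the same length")
    return "".join(ref if res == "." else res for ref, res in zip(reference, row))


def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Exon 2 block (two groups of 10 columns), ancestral row and dog row.
ancestral = "MELLVLVLCL" + "LKMSFAVPAF"
dog_row = ".KD...I..." + "........V."

dog = expand_dots(ancestral, dog_row)
identity = percent_identity(ancestral, dog)
print(dog, f"{identity:.1f}%")
```

Over such a short block the value is only indicative; the percentages given in the text refer to the full-length sequences.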

Enamelin sequences – ENAM represents less than 5% of the matrix in the developing mammalian enamel, but it is assumed to play a crucial role in the enamel structure, as illustrated by a severe form of autosomal-dominant amelogenesis imperfecta resulting from ENAM mutations (25). For further details on the current knowledge on ENAM we refer to the recent review by Hu & Yamakoshi (26). In contrast to the other EMPs, ENAM was known in only four eutherian mammals. This is the


Fig. 2. Schematic representation of the gene structure of the putative ancestral coding sequences calculated for AMEL, AMBN, ENAM and HEVIN (N-terminal region only). The reference to exon number on top of the boxes is that known for human sequences. The amino acid number coded by each exon is indicated within the boxes.

largest EMP, with more than 1,000 residues (1,142 amino acids in human). Prior to this study, full-length ENAM sequences were known in three eutherian mammals only (Fig. 4). The ENAM sequence retrieved in the chimpanzee genome shows 19 amino acid differences (1.7%) compared with human. This substitution level validates the three previously reported cases of polymorphism in human (26) and suggests the possible existence of a larger polymorphism. The dog sequence is similar to the other sequences, while the rat sequence shows two particularities: absence of exon 4 and presence of an insertion of 176 residues (11 residues repeated 16 times) similar to that in the mouse (14 repeats). The alignment of the six ENAM sequences indicates that the ancestral gene of eutherian mammals was composed of nine exons coding for a 1,129-amino acid protein (Fig. 4). Exon 9 is the largest exon with more than 900 residues (+143 in mouse and +176 in rat sequences). The number of species investigated is too small to reveal the regions that were particularly well conserved during evolution. However, exons 2, 3, 4, 5, 7, 8, the start of exon 9, and the last six residues of exon 9, possess well-conserved residues. Exons 6 and 7 are rich in P and Q. Exon 3 is homologous to exon 2 in AMEL and AMBN, although ENAM exon 2 possesses an ATG and well-conserved residues, which suggest that it could be translated (15). The sequence similarity between human and dog is 77.1%.

Hevin sequences – HEVIN codes for a matricellular protein, which modulates cell–matrix interactions (antiadhesive activities) and cell function. In contrast to the EMPs, HEVIN does not act as a structural protein. While the function of SPARC (also known as osteonectin) is rather well known, the roles played by HEVIN are unclear. The protein is composed of a highly acidic N-terminal domain I, a follistatin-like domain II, and an extracellular calcium-binding C-terminal domain III.
Prior to this study, full-length HEVIN sequences were known in four mammals, the quail (a sauropsid), and the fugu (a teleost fish). Domains II and III, located in the C-terminal region, were well conserved through evolution (and also among the SPARC family members, HEVIN, SPARC, TESTICANs and SMOCs). In contrast, the


large acidic domain I (more than 400 residues) was subjected to a large number of substitutions during evolution in each tetrapod lineage (M. Girondot et al. unpublished). We were interested in the N-terminal, variable region of HEVIN only (exon 2, which houses the signal peptide; exon 3; the large exon 4; and the short exon 5, for a total of 430 residues in human), because this is the only region in which amino acid similarities were detected by psi-blast search. The C-terminal region, with the follistatin and calcium-binding domains, is highly conserved, but no similarity was found with the three EMPs. The N-terminal region of the sequence retrieved in the chimpanzee genome shows 14 amino acid differences (3.3%) when compared with human (Fig. 5). The dog, pig and cow sequences possess an insertion of nine residues in exon 3. The alignment of the eight HEVIN sequences from eutherian mammals indicates that the ancestral gene was composed of five exons encoding a 444-amino acid peptide (Fig. 5). The N-terminal region of the eight sequences shows a high substitution rate and a low level of amino acid similarity (65.2% between human and dog sequences).

Relationships of amelogenin

The alignments obtained for the four proteins (AMEL, AMBN, ENAM and HEVIN) were, in turn, tentatively aligned together (47 sequences). Only the N-terminal regions were used, because psi-blast search revealed that the relationships were located in this region only. The aligned region covered the first 62 amino acids of AMEL (from exon 2 to the TRAP proteolytic site at the start of exon 6). AMBN sequences were rather easy to align with AMEL sequences. ENAM alignment was slightly more complicated, while HEVIN was difficult to align, because only a few residues, most being located in exon 2, were common to the other sequences. Next, a phylogenetic tree was obtained, indicating the relationships between all sequences analyzed, using maximum likelihood (Fig. 6). This analysis demonstrates that (i) the AMEL cluster is the sister group of the AMBN cluster; (ii) AMEL and AMBN clusters form a group, which is the sister group of ENAM; and (iii) HEVIN appears as the sister group of the three EMPs. This result means that AMEL and AMBN have a common ancestor, which arose from a duplication of an ENAM ancestor, itself derived from a copy of the HEVIN ancestor. This phylogenetic analysis, however, cannot help in answering the question of which came first, AMBN or AMEL. The comparison of the structural organization of the four genes reveals that they share the same first coding exon (18 amino acids), which contains the signal peptide. The first three small exons of the ancestral EMPs have a similar size (14–18 amino acids), in contrast to HEVIN, which possesses a large exon 3 (59 amino acids) and an extended exon 4 (343 amino acids) (Fig. 2). The gene organization indicates that AMBN could be closer to ENAM than AMEL is. The comparison of protein sequences also indicates the same


[Fig. 3. Ameloblastin. Alignment of the amino acid sequences, exons 2–13. Rows: Ancestral, X. tropicalis, X. laevis 1, X. laevis 2, Caiman, Dog, Cow, Pig, Guinea-pig, Rat, Mouse, Human. (.), Identical residue; (–), indel.]

QHPEVVHLQN ..-.-..... -R-.-..... ---Q-..P.. EQ.Q.MQDVW .Q.QIM.DGW .Q.QIK.DAW .Q.QIKRDAW RQ.QM..EPW HQ.Q-..NAW HQ.Q-..NAW .E..MM.DAW

498 H---FQEP .NNY.... .NNY.... .NNY.... .---.... .---.... .---.... R---.... .---.... R---.... R---.... .---....

Fig. 3. Ameloblastin. Alignment of amino acid sequences. The putative ancestral tetrapod sequence is shown at the top. Vertical bars indicate the limits between exons. The signal peptide is in a box. The names of the new sequences are shown in bold. Unchanged residues are shown on a gray background. (.), Identical residue; (–), indel; (*), unknown residue. See Table 1 for sequence references and Latin names of the species.

Fig. 4. Enamelin. Alignment of amino acid sequences. The putative ancestral tetrapod sequence is shown at the top. Vertical bars indicate the limits between exons. The signal peptide is in a box. The names of the new sequences are shown in bold. Unchanged residues are shown on a gray background. (.), Identical residue; (–), indel; (*), unknown residue. See Table 1 for sequence references and Latin names of the species. Rows, from top to bottom: Ancestral, Dog, Pig, Rat, Mouse, Chimpanzee, Human.
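The dot notation described in the figure legends can be expanded mechanically. The following is a minimal sketch, not the authors' tooling: the function name `expand_row` is hypothetical, and the example pair is a ten-residue block read from the exon 7 region of the enamelin alignment, assuming the rows follow the species order Ancestral, Dog, Pig.

```python
def expand_row(reference: str, row: str) -> str:
    """Expand a dot-notation alignment row against the reference row.

    '.' means "same residue as the reference at this column"; any other
    character (a residue letter, '-' for an indel, '*' for unknown) is
    kept as written.
    """
    if len(reference) != len(row):
        raise ValueError("reference and row must have the same aligned length")
    return "".join(ref if ch == "." else ch for ref, ch in zip(reference, row))


# Assumed example block (ancestral row and one species row).
ancestral = "LYQQPPWQVP"
pig_row = "P....L.H.."
pig_full = expand_row(ancestral, pig_row)
print(pig_full)
```

Expanding every row this way gives ordinary aligned sequences that can be passed to standard alignment or phylogeny software.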

relationship, with the C-terminal region of AMBN and ENAM showing a similar amino acid pattern, while this region is lacking in AMEL. Given the relationships found between AMEL and AMBN, we compared both sequences in detail. Similarities indicate the regions that are phylogenetically related and that could have a similar function. Differences point to the functional specificities of each protein. As both genes have representative sequences in mammals, sauropsids and amphibians, the two ancestral sequences, obtained from the previous alignments at the tetrapod level (i.e. 350 Ma), were compared (Fig. 7). At the time the tetrapods differentiated, both coding sequences had already diverged, as shown by their gene structure: AMBN is larger than AMEL (432 residues vs. 196) and possesses a larger number of exons (10 vs. 5). Exons 2, 3, 5, and the beginning of exon 6 of AMEL aligned rather well with exons 2, 3, 4, and the beginning of exon 5 of AMBN. The rest of the AMEL sequence shows a number of amino acid similarities with the rest of AMBN exons 5 and 6. This means that one of the largest differences between the two genes is the absence, in AMEL, of a counterpart of AMBN exons 7–13 (Fig. 7). In addition, although similarities are detected between AMEL exon 6 and AMBN exons 5–6, they are mainly the result of the large number of P and Q residues in both sequences. AMEL, however, is distinguished from AMBN by the clear organization of its P and Q residues into triplets (PXQ or PXX, where X is a variable residue) (Fig. 7).
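The triplet organization mentioned above can be quantified in a simple way. The sketch below is an illustration only, not the method used in the study: it finds the longest run of consecutive three-residue units that each start with P, and counts how many of those units are PXQ. The function name and both test sequences are hypothetical.

```python
import re


def longest_p_triplet_run(seq: str) -> tuple[int, int]:
    """Return (triplets in the longest tandem P-X-X run, how many are P-X-Q)."""
    # Each match is a maximal stretch of back-to-back triplets starting with P.
    runs = [m.group(0) for m in re.finditer(r"(?:P[A-Z]{2})+", seq)]
    if not runs:
        return (0, 0)
    best = max(runs, key=len)
    triplets = [best[i:i + 3] for i in range(0, len(best), 3)]
    pxq = sum(1 for t in triplets if t.endswith("Q"))
    return (len(triplets), pxq)


# Hypothetical sequences: one with tandem PXQ repeats, one that is P-rich
# but has no tandem organization.
repeat_like = "MPLQPMQPLQPHQAAGG"
scattered = "MPAGKPEEDPTSGLAPQ"
print(longest_p_triplet_run(repeat_like))
print(longest_p_triplet_run(scattered))
```

A sequence can be rich in P and Q without forming such runs, which is the distinction drawn in the text between AMEL exon 6 and AMBN exons 5–6.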

Discussion

Amelogenin is derived from an ancestral ameloblastin

Our phylogenetic analysis clearly demonstrates that AMEL and AMBN are sister genes, which differentiated from an ancestral gene having diverged from a copy of ENAM. Was this ancestral gene AMEL or AMBN? The similarities found in ENAM and AMBN gene organization, and in the amino acid pattern, both support AMBN as being the probable ancestor. In addition, AMBN and ENAM are located adjacent to each other on the same autosomal chromosome in human, mouse and rat, while AMEL is located on the sex chromosomes. This strongly supports AMBN as being the first gene having diverged after a duplication of ENAM. AMEL was secondarily created after a duplication of the ancestral AMBN, and translocated to another chromosome. This finding on the origin of

Fig. 5. Hevin. Alignment of amino acid sequences of the N-terminal region (domain I). The putative ancestral tetrapod sequence is shown at the top. Vertical bars indicate the limits between exons. The signal peptide is in a box. The names of the new sequences are shown in bold. Unchanged residues are shown on a gray background. (.), Identical residue; (–), indel; (*), unknown residue. See Table 1 for sequence references and Latin names of the species. Rows, from top to bottom: Ancestral, Dog, Pig, Cow, Mouse, Rat, Orangutan, Chimpanzee, Human.

AMEL closes a quest that started in 2001, when Delgado et al. (14) showed a relationship between the AMEL signal peptide (most of exon 2) and the signal peptides of SPARC (osteonectin) and SPARC-like 1 (HEVIN). In addition, these authors calculated that AMEL originated some 600 Ma, before the SPARC/HEVIN duplication. Taking into account the recent mapping of dental (AMBN, ENAM, DSPP) and bone (DMP1, IBSP, MEPE, SPP1) protein genes on the same chromosome in human and rodents, Kawasaki & Weiss (15) proposed that EMPs, on the one hand, and the dentin and bone proteins (the secretory calcium-binding phosphoproteins, SCPPs), on the other hand, constitute two families. Moreover, they suggested that HEVIN, which is located on the same chromosome, between the EMP and SCPP clusters, was the founder of all these proteins, which are involved in the mineralization of dental and bone tissues. Given the similar gene structure of AMBN, ENAM and AMEL, Kawasaki &


Fig. 6. Relationships of the enamel matrix protein genes, amelogenin (AMEL), ameloblastin (AMBN) and enamelin (ENAM), with HEVIN, a member of the SPARC family. Only the N-terminal regions of the 47 sequences were aligned to construct the tree using PAUP 4.0 (maximum likelihood). The tree has been rooted on HEVIN because this gene is present in actinopterygian genomes.


Weiss concluded that AMEL also belongs to the same family, while having been secondarily translocated to another chromosome after the duplication from HEVIN. Here, we confirm that AMEL is indeed related to HEVIN (and also, to a lesser degree, to SPARC, which is the closest relative of HEVIN), but the AMEL/HEVIN relationship is far from a direct descent, because several duplication events (and therefore a long evolutionary period) occurred before AMEL was created. After having been created from SPARC, HEVIN duplicated several times, giving rise to ENAM and to an SCPP founder, which remains to be identified. A good candidate could be SPP1 (osteopontin), which is the only SCPP gene found so far in a teleost fish genome (16). Next, a second round of duplication led from ENAM to an ancestral AMBN, which itself duplicated, giving rise to AMEL, which translocated to another chromosome either immediately or later. Such successive rounds of duplication, separated by long periods of differentiation, explain in part why HEVIN is now so different from AMEL, AMBN, and ENAM, although ENAM shares a similar amino acid pattern with the N-terminal region of HEVIN. A second reason for this high sequence divergence is HEVIN itself, which shows a high rate of evolution, especially in the N-terminal region (M. Girondot et al., unpublished). The strength of our findings lies in the comparison of a total of 47 sequences of AMEL, AMBN, ENAM and HEVIN, in contrast to the previous study by Kawasaki & Weiss (15), in which the few sequences used did not allow such a phylogenetic analysis to be performed. However, our current data set is still too small for understanding the relationships between HEVIN and ENAM, and more sequences need to be investigated in various basal species of mammals, in sauropsids and in amphibians, to establish firmly the relationships among HEVIN, ENAM and AMBN.
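The proposed succession of duplications can be written down as a small parent map, which makes the number of duplication steps separating any gene from SPARC explicit. This is only a bookkeeping sketch of the scenario described above; the dictionary and function names are hypothetical, and the entry `SCPP_founder` stands for the still unidentified founder gene.

```python
# Each gene maps to the gene from which it is proposed to have been duplicated.
PARENT = {
    "HEVIN": "SPARC",
    "ENAM": "HEVIN",
    "SCPP_founder": "HEVIN",  # unidentified; SPP1 is suggested as a candidate
    "AMBN": "ENAM",
    "AMEL": "AMBN",
}


def lineage(gene: str) -> list[str]:
    """Return the chain of ancestors from the root down to the given gene."""
    path = [gene]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path[::-1]


amel_path = lineage("AMEL")
print(" -> ".join(amel_path))
print("duplication steps from SPARC:", len(amel_path) - 1)
```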

Fig. 7. Alignment of the ancestral tetrapod amino acid sequences of AMEL and AMBN. Vertical bars indicate the limits between exons. The signal peptide is in a box. (.), Identical residue; (–), indel.


Amelogenin was created long before tetrapod differentiation

More than 500 Ma, early vertebrates without jaws possessed a dermal skeleton ornamented with odontodes, the tissues of which are considered to be the ancestors of the current tooth tissues, dentin and enamel, as previously reviewed in detail (5,6). Therefore, it is difficult to imagine that EMPs, or at least one member of the family (i.e. ENAM), were not already present, and that another protein could have played a similar role, as suggested by Kawasaki et al. (16). It is currently accepted that the enamel-like hard mineralized tissue protecting the early 'dental' structures was an enameloid. Given our present findings, HEVIN, the suspected founder gene of the EMPs, was therefore not only duplicated from SPARC before the vertebrate diversification, but at least one subsequent duplication also occurred, giving rise to the first member of the EMP family (i.e. ENAM). These duplication events probably occurred after the differentiation of Ciona intestinalis (a urochordate) because HEVIN was not found in its genome (or in protostomian genomes). HEVIN is present in teleost fish genomes (at least in the fugu and tetraodon). There is no indication of its presence in sharks, but, to date, no chondrichthyan genome has been sequenced to verify this. Shark teeth are covered with a well-mineralized enameloid. This type of tissue was found to be morphologically different from the other enameloid tissues described in fish (27). However, there is no reason to think that at least the first EMP, ENAM, is not present. No gene of this family has yet been identified in sharks, but enamelins were detected in teeth using immunohistochemical techniques (12). Similarly, positive results were obtained with amelogenin antibodies in teleost teeth (13) and in the enamel tissue covering the scales in a polypterid, a basal actinopterygian (28). All these immunohistological data, although they have to be considered carefully because they were obtained using mammalian antibodies, support the presence of EMPs in early gnathostomes, 450 Ma, long before tetrapod differentiation, which occurred approximately 350 Ma. A possible scenario would be that HEVIN duplicated from SPARC after urochordate differentiation (i.e. before the Cambrian period). Later, but still before the Cambrian explosion (545 Ma), which saw a rapid vertebrate diversification, ENAM was created from HEVIN. Enamel-like tissue developed in early vertebrates and was well differentiated when teeth formed in the first gnathostome. This could explain the positive results obtained with enamelin antibodies in shark teeth. Subsequently, ENAM duplication gave rise to AMBN, and AMBN to AMEL, at some point during the evolution of the osteichthyans, but probably before the sarcopterygian/actinopterygian divergence because enamel was identified in polypterids (29,30). Some EMPs could have been lost secondarily during teleost fish evolution, or have diverged from the current structure known in tetrapods, because no EMP gene has yet been identified in the zebrafish, fugu, and tetraodon genomes.

Amelogenin diversification after its recruitment from AMBN

Duplication events leading from an ancestral HEVIN to AMEL were followed by a diversification of the amino acid composition and function of the newly recruited genes; redundant genes would have been eliminated during evolution. This explains why each gene shows its own particularities in relation to the new function acquired after the duplication, despite the gene structure being similar in the N-terminal region for the first two or three exons, which allows their relationships to be established. Our evolutionary analysis, demonstrating the origin of AMEL from an ancestral AMBN, and the subsequent comparison of both protein sequences, allowed identification of the regions that could have a similar function in both genes, and of the domains that, in contrast, are specific to each protein and indicate its particular function. Our previous studies of AMEL evolution, although limited to mammals, have pointed clearly to well-conserved residues in the N-terminal region (i.e. the first 62 amino acids, encoded by exons 2, 3, 5, and the beginning of exon 6) (17,18). We know, from various studies on AMEL, that the main functions of these domains are to be found in cell–matrix and protein–mineral crystal interactions (22,31–33). These regions have to be considered as homologous, and we conclude that AMEL has inherited these functions from AMBN. In contrast, the rest of the two molecules [i.e. most of exon 6 in AMEL (approximately 130 residues) and most of exon 5 to exon 13 in AMBN (more than 350 residues)] have diverged more during the separate evolution of both genes, although they present some similarities, such as their richness in P and Q residues. These diverging regions are therefore related to the particular function of each molecule. In AMEL, this region is hydrophobic and is subject to numerous variations, but it is characterized by a high number of repeats of amino acid triplets (PXQ or PXX), while such repeats are virtually absent from AMBN exons 5–13.
The comparisons of AMEL sequences indicate that these PXX repeats arose long before tetrapod differentiation. Moreover, new insertions of triplet repeats have occurred independently in several mammalian taxa, in a particular region recognized as a hotspot of mutation (17,18). To date, however, human mutations have not been reported to occur in this particular region (22). This does not mean that such mutations do not exist, but it may indicate that they have no important consequences for the appearance of the enamel. We propose the following scenario for AMEL evolution. After its recruitment, probably in an osteichthyan ancestor (around 450 Ma), AMEL differentiated from the AMBN ancestor: half of the C-terminal region was lost and the remaining part accumulated substitutions and indels, which resulted in the formation of triplet repeats. The structure of this new protein was probably favorable for the differentiation of a new tissue, enamel, from the pre-existing enameloid. Because enamel may represent an advantage compared with enameloid, AMEL was not only conserved, but

76

Sire et al.

also new substitutions and indels were retained in various lineages, in particular in sauropsids and mammals. AMEL differentiation was probably rapid in the ancestral osteichthyans (therefore limiting redundancy with the sister gene), but then slowed down. Indeed, AMEL shows a higher amino acid sequence similarity among tetrapods (45.6%) than AMBN (24.3%). This also suggests that AMBN has diverged considerably during tetrapod evolution. Concomitant with AMEL differentiation, AMBN was conserved, although probably slightly modified, which means that its function is still important for enamel. Finally, given the structural similarities between the AMEL and AMBN amino acid sequences, we suggest a possible compensation by AMBN when AMEL is mutated or lacking. This could explain why enamel is always present (although hypoplastic and/or hypomineralized) after inhibition of AMEL translation (34), in AMEL null mice (35), and in human mutations (22,36). Conversely, given the predominance of AMEL (90%) vs. AMBN (5%), the compensation could mask the effects of an AMBN mutation.
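The similarity values quoted above for tetrapods (45.6% for AMEL, 24.3% for AMBN) summarize how alike the aligned sequences are. The text does not state the exact formula used, so the following Python sketch shows only one common way such a figure can be obtained: the mean pairwise percent identity over an alignment, skipping columns in which either sequence has a gap. The function names and the three short aligned strings are hypothetical; a similarity score based on a substitution matrix would give different numbers.

```python
from itertools import combinations


def pairwise_identity(a, b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence carries a gap are ignored (an assumption
    of this sketch; other conventions count gaps as mismatches).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_pairwise_identity(aligned):
    """Average percent identity over all pairs of aligned sequences."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment (not real AMEL or AMBN data).
TOY_ALIGNMENT = ["MPLPPH", "MPLPSH", "MP-PPH"]
print(round(mean_pairwise_identity(TOY_ALIGNMENT), 1))
```

Applied separately to a set of aligned AMEL sequences and a set of aligned AMBN sequences, a measure of this kind allows the two proteins to be compared on the same scale, which is the sense in which AMEL is described as more conserved among tetrapods than AMBN.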

References
1. Donoghue PCJ, Sansom IJ. Origin and early evolution of vertebrate skeletonization. Microsc Res Tech 2002; 59: 352–372.
2. Ørvig T. Phylogeny of tooth tissues: evolution of some calcified tissues in early vertebrates. In: Miles AEW, ed. Structural and chemical organization of teeth, Vol. 1. New York: Academic Press, 1967; 45–105.
3. Ørvig T. A survey of odontodes ('dermal teeth') from developmental, structural, functional, and phyletic points of view. In: Andrews SM, Miles RS, Walker AD, eds. Problems in vertebrate evolution. London, New York: Academic Press, 1977; 53–75.
4. Reif WE. Evolution of dermal skeleton and dentition in vertebrates. Evol Biol 1982; 15: 287–368.
5. Huysseune A, Sire J-Y. Evolution of patterns and processes in teeth and tooth-related tissues in non-mammalian vertebrates. Eur J Oral Sci 1998; 106: 437–481.
6. Sire J-Y, Huysseune A. Formation of skeletal and dental tissues in fish: a comparative and evolutionary approach. Biol Rev 2003; 78: 219–249.
7. Girondot M, Sire J-Y. Evolution of the amelogenin gene in toothed and tooth-less vertebrates. Eur J Oral Sci 1998; 106: 501–508.
8. Smith MM. Heterochrony in the evolution of enamel in vertebrates. In: Evolutionary change and heterochrony (McNamara KJ, ed.). New York: John Wiley and Sons, 1995; 125–150.
9. Donoghue PCJ. Evolution and development of the vertebrate dermal and oral skeletons: unraveling concepts, regulatory theories, and homologies. Paleobiology 2002; 28: 474–507.
10. Sander PM. Prismless enamel in amniotes: Terminology, function and evolution. In: Teaford M, Ferguson MWJ, Smith MM, eds. Development, function and evolution of teeth. New York: Cambridge University Press, 2001; 92–106.
11. Slavkin HC, Samuel N, Bringas P Jr, Nanci A. Selachian tooth development. II. Immunolocalization of amelogenin polypeptides in epithelium during secretory amelogenesis in Squalus acanthias. J Craniofac Gen Dev Biol 1983; 3: 43–52.
12. Herold RC, Rosenbloom J, Granovsky M. Phylogenetic distribution of enamel proteins: immunolocalization with monoclonal antibodies indicates the evolutionary appearance of enamelins prior to amelogenins. Calcif Tissue Int 1989; 45: 88–94.
13. Lyngstadaas SP, Risnes S, Nordbo H, Flones AG. Amelogenin gene similarity in vertebrates: DNA sequences encoding amelogenin seem to be conserved during evolution. J Comp Physiol 1990; 160: 469–472.
14. Delgado S, Casane D, Bonnaud L, Laurin M, Sire JY, Girondot M. Molecular evidence for Precambrian origin of amelogenin, the major protein of vertebrate enamel. Mol Biol Evol 2001; 18: 2146–2153.
15. Kawasaki K, Weiss KM. Mineralized tissue and vertebrate evolution: The secretory calcium-binding phosphoprotein gene cluster. Proc Natl Acad Sci USA 2003; 100: 4060–4065.
16. Kawasaki K, Suzuki T, Weiss KM. Genetic basis for the evolution of vertebrate mineralized tissue. Proc Natl Acad Sci USA 2004; 101: 11356–11361.
17. Delgado S, Girondot M, Sire J-Y. Molecular evolution of amelogenin in mammals. J Mol Evol 2005; 60: 12–30.
18. Sire J-Y, Delgado S, Fromentin D, Girondot M. Amelogenin: Lessons from evolution. Arch Oral Biol 2005; 50: 205–212.
19. Delgado S, Couble ML, Magloire H, Sire J-Y. Cloning, sequencing and expression of amelogenin in the scincid lizard, Chalcides (Squamata). J Dent Res 2006; 85: 136–143.
20. Wang X, Ito Y, Luan X, Yamane A, Diekwisch T. Amelogenin sequence and enamel biomineralization in Rana pipiens. J Exp Zool Mol Dev Evol 2005; 304B: 1–10.
21. Termine JD, Belcourt AB, Christner PJ, Conn KM, Nylen MU. Properties of dissociatively extracted fetal tooth matrix proteins. I. Principal molecular species in developing bovine enamel. J Biol Chem 1980; 255: 9760–9768.
22. Hart PS, Hart TC, Simmer JP, Wright JT. A nomenclature for X-linked amelogenesis imperfecta. Arch Oral Biol 2002; 47: 255–260.
23. Baba O, Takahashi N, Terashima T, Li W, Denbesten PK, Takano Y. Expression of alternatively spliced RNA transcripts of amelogenin gene exons 8 and 9 and its end products in the rat incisor. J Histochem Cytochem 2002; 50: 1229–1236.
24. Paine ML, Luo W, Zhu DH, Bringas P Jr, Snead ML. Functional domains for amelogenin revealed by compound genetic defects. J Bone Miner Res 2003; 18: 466–472.
25. Hart PS, Michalec MD, Seow WK, Hart TC, Wright JT. Identification of the enamelin (g.8344delG) mutation in a new kindred and presentation of a standardized ENAM nomenclature. Arch Oral Biol 2003; 48: 589–596.
26. Hu JCC, Yamakoshi Y. Enamelin and autosomal-dominant amelogenesis imperfecta. Crit Rev Oral Biol Med 2003; 14: 387–398.
27. Sasagawa I. Mineralization patterns in elasmobranch fish. Microsc Res Tech 2002; 59: 396–407.
28. Zylberberg L, Sire JY, Nanci A. Immunodetection of amelogenin-like proteins in the ganoine of experimentally regenerating scales of Calamoichthys calabaricus, a primitive actinopterygian fish. Anat Rec 1997; 249: 86–95.
29. Sire J-Y, Géraudie J, Meunier FJ, Zylberberg L. On the origin of ganoine. Histological and ultrastructural data on the experimental regeneration of the scales of Calamoichthys calabaricus (Osteichthyes, Brachiopterygii, Polypteridae). Am J Anat 1987; 180: 391–402.
30. Sire J-Y. Ganoine formation in the scales of primitive actinopterygian fishes, lepisosteids and polypterids. Connect Tissue Res 1995; 33: 213–222.
31. Ravindranath RMH, Basilrose RM, Ravindranath NH, Vaitheesvaran B. Amelogenin interacts with cytokeratin-5 in ameloblasts during enamel growth. J Biol Chem 2003; 278: 20293–20302.
32. Paine ML, Wang HJ, Snead ML. Amelogenin self-assembly and the role of the proline located within the carboxyl-teleopeptide. Connect Tissue Res 2003; 44: 52–57.
33. Snead M. Amelogenin protein exhibits a modular design: Implications for form and function. Connect Tissue Res 2003; 44: 47–51.
34. Diekwisch T, David S, Bringas P Jr, Santos V, Slavkin HC. Antisense inhibition of AMEL translation demonstrates supramolecular controls for enamel HAP crystal growth during embryonic mouse molar development. Development 1993; 117: 471–482.
35. Gibson CW, Yuan ZA, Hall B, Longenecker G, Chen E, Thyagarajan T, Sreenath T, Wright JT, Decker S, Piddington R, Harrison G, Kulkarni AB. Amelogenin-deficient mice display an amelogenesis imperfecta phenotype. J Biol Chem 2001; 276: 31871–31875.
36. Wright JT, Hart PS, Aldred MJ, Seow K, Crawford PJM, Hong SP, Gibson CW, Hart TC. Relationship of phenotype and genotype in X-linked amelogenesis imperfecta. Connect Tissue Res 2003; 44: 72–78.