Phylogenetic Reconstruction
Yves Desdevises
Université Pierre et Marie Curie (Paris 6), Observatoire Océanologique de Banyuls, France
[email protected]
http://desdevises.free.fr
http://desdevises.free.fr/Phylogenetic_reconstruction

1

References

• Felsenstein J. 2004. Inferring phylogenies. Sinauer.

• Lemey P., Salemi M. & Vandamme A.-M. 2009. The phylogenetic handbook. Second Edition. Cambridge University Press.

• Hall B. 2007. Phylogenetic trees made easy. Third Edition. Sinauer.

• Page R. & Holmes E. 1998. Molecular evolution: a phylogenetic approach. Blackwell.

• Nei M. & Kumar S. 2000. Molecular Evolution and Phylogenetics. Oxford University Press.

2

• Goal: propose a hypothesis of relationships between several taxa
• Phylogeny = tree (≠ ladder)
• Speciation: binary
• Based on homology: similarity from a common ancestor
  • Indicates the existence of a common ancestor
  • Identified from a phylogenetic tree, and the basis for building it!

3

Phylogenetic trees

[Figure: phylogenetic tree drawings. Tip labels include species of Symphodus, Labrus, Ctenolabrus, Cheilinus, Epibulus, Stetojulis, Halichoeres, Labropsis, Anampses, Labroides, Labrichthys, Coris, Hemigymnus, Thalassoma, Pictilabrus, Notolabrus, Bodianus and Clepticus, plus Pagrus major.]

4

• Cladogram
  • No branch lengths
  • Clades
• Phylogram
  • Branch lengths

[Figure: an ultrametric tree and an additive tree]
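The two kinds of phylogram can be told apart numerically: in an ultrametric tree every tip lies at the same distance from the root, whereas an additive tree only requires branch lengths to add up along each path. A minimal Python sketch follows; the tree encoding (a tip is a string, an internal node is a list of `(child, branch_length)` pairs) and the example trees are assumptions made for illustration, not taken from the slides.

```python
def root_to_tip_distances(node, dist=0.0):
    """Return {tip_name: distance from the root}.

    A tip is a string; an internal node is a list of (child, branch_length)
    pairs. This encoding is an assumption made for the example.
    """
    if isinstance(node, str):
        return {node: dist}
    distances = {}
    for child, length in node:
        distances.update(root_to_tip_distances(child, dist + length))
    return distances


def is_ultrametric(tree, tol=1e-9):
    """True if all tips are equally distant from the root (within tol)."""
    d = list(root_to_tip_distances(tree).values())
    return max(d) - min(d) <= tol


# Hypothetical example trees with three tips each.
ultrametric_tree = [([("A", 1.0), ("B", 1.0)], 2.0), ("C", 3.0)]
additive_tree = [([("A", 0.5), ("B", 2.0)], 1.0), ("C", 3.0)]

print(is_ultrametric(ultrametric_tree))  # all tips at distance 3.0
print(is_ultrametric(additive_tree))     # tips at 1.5, 3.0 and 3.0
```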

5

[Figure: tree with terminal taxa A to J, annotated with: leaves = terminal taxa, clade, terminal branches, internal branches, polytomy, node, root]

6

• Speciation

7

Hypothesis

[Figure: taxa A, B and C]

8

Rooting

• Gives the branching order
• Use of an outgroup
• Rest = ingroup

[Figure: an unrooted tree, and the rooted tree obtained by adding an outgroup]

9

• Outgroup: sister taxon of the ingroup
• Character states shared between outgroup and ingroup = ancestral character states
• Sometimes no outgroup: root placed halfway between the two most distant tips (needs branch lengths) = midpoint rooting

[Figure: two trees of taxa A to F illustrating midpoint rooting]
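Midpoint rooting can be sketched in a few lines: find the longest tip-to-tip path and place the root halfway along it. The Python example below is a minimal illustration, assuming a hypothetical unrooted tree given as a list of weighted edges; the node names `x` and `y` and all branch lengths are invented for the example.

```python
from itertools import combinations


def build_adjacency(edges):
    """Turn (node, node, branch_length) triples into a symmetric adjacency dict."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def path_between(adj, start, goal):
    """Return the nodes on the unique path from start to goal in a tree."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        for nxt in adj[node]:
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    raise ValueError("no path between the two nodes")


def midpoint_root(adj):
    """Return ((u, v), offset): the root lies on branch u-v, offset away from u."""
    tips = [n for n in adj if len(adj[n]) == 1]
    best_path, best_len = None, -1.0
    for a, b in combinations(tips, 2):
        path = path_between(adj, a, b)
        length = sum(adj[u][v] for u, v in zip(path, path[1:]))
        if length > best_len:
            best_path, best_len = path, length
    half, walked = best_len / 2, 0.0
    for u, v in zip(best_path, best_path[1:]):
        if walked + adj[u][v] >= half:
            return (u, v), half - walked
        walked += adj[u][v]


# Hypothetical unrooted tree: tips A-D, internal nodes x and y.
adj = build_adjacency([
    ("A", "x", 1.0), ("B", "x", 3.0), ("x", "y", 1.0),
    ("C", "y", 2.0), ("D", "y", 6.0),
])
edge, offset = midpoint_root(adj)
print(edge, offset)
```

Here the most distant tips are B and D (path length 10), so the root falls on the branch leading to D, one unit away from node y. The pairwise search is quadratic in the number of tips, which is fine for a teaching example but not for large trees.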

10

• Groups
  • Monophyletic (clade): natural group
    • Mammals
  • Paraphyletic
    • Reptiles
  • Polyphyletic
    • Algae, protozoans
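Whether a group is monophyletic can be tested directly on a rooted tree: the group is a clade if the smallest clade containing all its members contains nothing else. The Python sketch below assumes a rooted tree written as nested tuples of tip names; the toy tetrapod tree, including the taxa Crocodile and Bird, is an assumption added for illustration.

```python
def tips(node):
    """Return the set of tip names below a node (a tip is a string)."""
    if isinstance(node, str):
        return {node}
    return set().union(*(tips(child) for child in node))


def smallest_clade(node, group):
    """Return the tip set of the smallest clade containing every member of group."""
    children = () if isinstance(node, str) else node
    for child in children:
        if group <= tips(child):
            return smallest_clade(child, group)
    return tips(node)


def is_monophyletic(tree, group):
    """True if group is exactly the set of tips of one clade."""
    group = set(group)
    return smallest_clade(tree, group) == group


# Hypothetical rooted tree (nested tuples), for illustration only.
tree = ("Frog", (("Dog", "Human"), ("Lizard", ("Crocodile", "Bird"))))

print(is_monophyletic(tree, {"Dog", "Human"}))         # mammals in this tree
print(is_monophyletic(tree, {"Lizard", "Crocodile"}))  # "reptiles" without birds
```

This check only separates monophyletic groups from the rest; telling paraphyletic from polyphyletic groups additionally requires reasoning about the group's common ancestor, which the sketch does not attempt.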

11

Characters

• Organisms are composed of different features
• These features differ among taxa: character states
• All character states form a character
• These states are produced by heritable changes
• Phylogenetic inference is performed from differences between character states

12

• We want to establish the ancestor-descendant link from the presence/absence of character states
• We look for the appearance of new character states in descendants
• The different character states are homologies
• Taxa sharing this new (derived) character state form clades
  • Example: hair in mammals
• Characters can be differentially weighted

13

• Homology

14

15

• Homoplasy

16

• Ancestral characters: plesiomorphies
• Shared ancestral characters: symplesiomorphies
• Derived characters: apomorphies
• Shared derived characters: synapomorphies
  • Ideally, identify clades
• Non-shared derived characters = particular to a given taxon: autapomorphies
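The vocabulary above can be applied mechanically once the ancestral state is known. The Python sketch below polarises one character by outgroup comparison (the state seen in the outgroup is treated as ancestral) and labels each state found in the ingroup. The function name, the choice of Frog as outgroup and the scoring of the character are assumptions for illustration; a state labelled as a synapomorphy here is only a candidate, since homoplasy can produce the same pattern.

```python
def classify_states(ingroup_states, outgroup_state):
    """Label each character state observed in the ingroup.

    ingroup_states: {taxon: state}; outgroup_state: state of the outgroup,
    taken as the ancestral state (outgroup comparison).
    """
    carriers = {}
    for taxon, state in ingroup_states.items():
        carriers.setdefault(state, []).append(taxon)

    labels = {}
    for state, taxa in carriers.items():
        shared = len(taxa) > 1
        if state == outgroup_state:
            labels[state] = "symplesiomorphy" if shared else "plesiomorphy"
        else:
            labels[state] = "synapomorphy" if shared else "autapomorphy"
    return labels


# Hypothetical scoring of the character HAIR, with Frog as outgroup.
hair = {"Dog": "present", "Human": "present", "Lizard": "absent"}
print(classify_states(hair, outgroup_state="absent"))
```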

17

18

Homology

• Homologies are supposed to show similarities in:
  • position
  • structure
  • development
• A recognized criterion to support homology is the congruence with other characters

19

[Figure: tree of Dog, Lizard, Frog and Human with the character HAIR (absent/present) and its change mapped onto the tree]

20

Homoplasy

• Non-homologous similarities
• Results from independent evolution
  • Convergence
  • Parallelism
  • Reversion
• Blurs phylogenetic signal: may lead to false evolutionary relationships

21

[Figure: parallelism, convergence and reversion]

22

[Figure: two trees of Lizard, Human, Frog and Dog with the character TAIL (absent/present) mapped]

23

• Without homoplasy, phylogenetic inference would be easy

• Main problem of phylogenetic reconstruction: discriminating homoplasy (noise) from homology (signal)

• Data quality (“good” phylogenetic signal) is more important than the method used

24

• If there is only one correct tree, then when characters support different trees, at least one of them contains homoplasies

[Figure: the characters HAIR (absent/present) and TAIL (absent/present) mapped on trees of Dog, Lizard, Frog and Human]

25

Congruence

• The chosen tree is the tree maximising the number of congruent characters

[Figure: tree of Dog, Human, Lizard and Frog; changes in HAIR, MILK, ... mark the MAMMALS (Dog, Human)]
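One way to make congruence concrete is to count, for each candidate tree, the minimum number of state changes each character requires; the tree on which the characters agree best needs the fewest changes in total. The Python sketch below uses Fitch's parsimony counting as a stand-in for this idea; it is not presented as the method of the slides. The 0/1 scoring of HAIR, MILK and TAIL and the two candidate trees are assumptions made for the example.

```python
def fitch_steps(tree, states):
    """Minimum number of changes of one character on a rooted binary tree.

    tree: nested 2-tuples of tip names; states: {tip: state}.
    """
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:
            return common, left_cost + right_cost
        # No shared state: one change is needed at this node.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


# Hypothetical character scoring (1 = present, 0 = absent).
characters = {
    "HAIR": {"Dog": 1, "Human": 1, "Lizard": 0, "Frog": 0},
    "MILK": {"Dog": 1, "Human": 1, "Lizard": 0, "Frog": 0},
    "TAIL": {"Dog": 1, "Human": 0, "Lizard": 1, "Frog": 0},
}

# Two candidate trees: one groups the mammals, the other groups tailed taxa.
tree_mammals = (("Dog", "Human"), ("Lizard", "Frog"))
tree_tails = (("Dog", "Lizard"), ("Human", "Frog"))


def total_steps(tree):
    return sum(fitch_steps(tree, states) for states in characters.values())


print(total_steps(tree_mammals), total_steps(tree_tails))
```

With this scoring, the tree grouping Dog and Human needs 4 changes (TAIL is homoplastic on it) and the alternative needs 5 (HAIR and MILK are both homoplastic), so the first tree is preferred.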

26

Case of molecular data

• Homoplasy is more common with molecular than with morphological data
  • Few states (4 for DNA: A G C T)
  • Chemically close
  • Evolutionary rates can be high
  • No identification of homoplasy via structure or development

27

Data

• Fossils: rare
• Morphological characters
• Molecular characters: DNA, proteins, ...
  • By far the most used now: models, numerous characters, less subjective, ...
  • But... phylogeny of the DNA fragment (≠ taxa)
  • Future: genomes ➙ phylogenomics
• Others (behaviour, hosts, habitat, ...)

28

Morphological data

• Homology difficult to identify
• Characters often not numerous: a problem when studying many taxa, especially if they are closely related
• Some subjective decisions
• Evolutionary processes poorly known: limits the choice of method
• Require coding
  • Sometimes difficult
  • Hypotheses on character evolution

29

Coding

• Binary: absence/presence = 0/1
• Multiple states (ordered or not): definition of the number of steps between states
  • Additive binary coding: e.g. 00, 01, 10, 11
  • Linear coding: e.g. 0, 1, 2
  • Both can be combined

30
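The "number of steps between states" can be written as a step matrix. In the usual convention, an ordered character with linear coding 0, 1, 2 costs |i - j| steps to go from state i to state j, while an unordered character costs one step for any change. A minimal Python sketch under that assumption (the function name is hypothetical):

```python
def step_matrix(n_states, ordered):
    """Return the matrix of step counts between states 0 .. n_states-1.

    ordered=True:  cost |i - j| (states form a linear series 0 -> 1 -> 2 ...).
    ordered=False: cost 1 for any change between different states.
    """
    return [
        [abs(i - j) if ordered else int(i != j) for j in range(n_states)]
        for i in range(n_states)
    ]


for row in step_matrix(3, ordered=True):
    print(row)
for row in step_matrix(3, ordered=False):
    print(row)
```

Under ordered coding, going from state 0 to state 2 counts as two steps; under unordered coding it counts as one.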

31

Molecular data

• Nucleotides or amino acids (for ancient divergences)
• Characters = base (or AA) positions
• Character states = base (or AA) identity
• Important step: alignment
  • Sometimes manual
  • Automated methods: manual editing required
  • No test: no null hypothesis
  • Can use information on secondary structure or coding nature

32

• Nucleotides: only 4 states (in 2 types: purines and pyrimidines)
• Evolution can be modelled
• Homoplasy “easy”
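The two types of nucleotides (purines A and G, pyrimidines C and T) give two kinds of substitution: transitions within a type and transversions between types. This distinction is what many substitution models build on. The Python sketch below counts both kinds between two aligned sequences; the function name and the example sequences are invented for illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq1, seq2):
    """Count (transitions, transversions) between two aligned DNA sequences.

    Positions with a gap or an ambiguous base in either sequence are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned (same length)")
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # change within one type
        else:
            transversions += 1    # change between the two types
    return transitions, transversions


print(count_ts_tv("AACGT", "GACTA"))
```

In the example, A/G at the first position is a transition, while G/T and T/A at the last two positions are transversions.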

33

• Amino acids
  • 20 states
  • 5 categories
  • Evolution much more difficult to model
• Codons
  • 61 states!

34
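The figure of 61 codon states follows from simple counting: there are 4 × 4 × 4 = 64 possible triplets, and stop codons are excluded. The short Python check below assumes the standard genetic code, with its three stop codons TAA, TAG and TGA; the slides do not say which code is meant.

```python
from itertools import product

# Stop codons of the standard genetic code (an assumption of this example).
STOP_CODONS = {"TAA", "TAG", "TGA"}

all_codons = ["".join(triplet) for triplet in product("ACGT", repeat=3)]
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print(len(all_codons), len(sense_codons))
```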

• Gene tree ≠ species tree
• Genes: orthologous or paralogous

[Figure: gene tree in which a duplication of an ancestral gene gives the gene copies A, B, C and a, b, c, with orthologs and paralogs indicated]

35

Alignment