Master 2 Biostatistics module: population genetics models
Gene genealogies and the coalescent
Raphaël Leblois & François Rousset
Centre de Biologie pour la Gestion des populations (CBGP, UMR INRA)
December 2013

Outline
Introduction
Coalescent theory (2 lineages, k lineages, TMRCA)
Trees and mutations
Coalescent advantages
Simulation algorithms (tree simulation: generation by generation, continuous time; simulating mutations)
Conclusions
TD (tutorial session)
This is the introduction to coalescent theory. The coalescent will be used extensively in the next courses:
- fluctuating population size
- structured populations
- ML-based inferences under coalescent models (MCMC & IS)
- coalescent with recombination and inferences from genomic data (HMM)
- inference using ABC
The Wright-Fisher Model (reminder...)
• In a population of constant and finite size, with non-overlapping generations, and in which all individuals have equal reproductive success, each gene does not necessarily leave exactly one descendant in the next generation: the number of descendants of each gene is a random variable following a probability distribution with expectation equal to one.
→ Drift, fixation of alleles, loss of genetic variation, ...?
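The reminder above can be illustrated with a small forward simulation. This is only a sketch under the haploid Wright-Fisher assumptions stated on the slide; the function names and the choice N = 10 are illustrative and not part of the course material.

```python
import random

def wright_fisher_generation(parents, rng):
    """Build the next generation: each of the N offspring copies a parent drawn uniformly."""
    n = len(parents)
    return [parents[rng.randrange(n)] for _ in range(n)]

def generations_to_fixation(n_genes, rng):
    """Count generations until all N genes descend from a single original gene."""
    population = list(range(n_genes))  # every gene starts with its own label
    generations = 0
    while len(set(population)) > 1:
        population = wright_fisher_generation(population, rng)
        generations += 1
    return generations

rng = random.Random(1)
print(generations_to_fixation(10, rng))  # drift alone eventually fixes one lineage
```

Running it with different seeds shows how variable the time to fixation is, even for a fixed N.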
The coalescent theory
In the coalescent framework, we look backward in time at the genealogy of a sample up to its most recent common ancestor (MRCA).
[Figure: genealogy of the population (forward in time) versus genealogy of the sample, i.e. the coalescent tree (backward in time)]
• Classical approach: population, gene frequencies, forward in time
• Coalescent approach: sample, gene genealogies, backward in time
The coalescent theory
• The main idea behind coalescent theory is the following:
• By definition, in neutral models, the number of descendants of a gene does not depend on its genetic type.
• Thus mutations do not affect the genealogy.
→ Mutational processes are independent of demographic processes, i.e. mutations and genealogy can be analyzed separately.
Coalescence of 2 lineages
In one generation (from t = 0 to t = 1): probability that the two genes have a common parental gene in the previous generation:
$P(G_2 = 1) = \frac{1}{N}$
Coalescence of 2 lineages
In two generations (haploid population of size N, from t = 0 to t = 2): probability that the two genes did not coalesce in the first generation, multiplied by the probability that the two genes have a common parental gene in the second generation:
$P(G_2 = 2) = \left(1 - \frac{1}{N}\right)\frac{1}{N}$
Coalescence of 2 lineages in i generations
In i generations (from t = 0 to t = i): probability that the two genes did not coalesce in the first (i − 1) generations, multiplied by the probability that the two genes have a common parental gene in the i-th generation:
$P(G_2 = i) = \left(1 - \frac{1}{N}\right)^{i-1}\frac{1}{N}$
Coalescence of 2 lineages in i generations
• Coalescence probability of two lineages in i generations:
$P(G_2 = i) = \left(1 - \frac{1}{N}\right)^{i-1}\frac{1}{N}$
• It is a geometric distribution with parameter 1/N.
[Figure: P(G2 = i) (probability) as a function of time]
→ The expectation of G2 is
$E(G_2) = \sum_{i=1}^{\infty} i\,P(G_2 = i) = N$
[intuitively, if there is one chance in 6 of rolling a 4 with a die, we need 6 rolls, on average, to get a 4...]
• Thus, on average, a common ancestor for the two genes is found N generations backward in time, but there is a large variance: $V(G_2) = N(N - 1) \approx N^2$
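The mean and variance of the geometric distribution can be checked by simulation. The sketch below assumes the haploid Wright-Fisher model of the slides, with N = 20 and the number of replicates chosen arbitrarily for illustration.

```python
import random

def coalescence_time_two_lineages(n, rng):
    """Go back one generation at a time until both lineages pick the same parent."""
    generations = 0
    while True:
        generations += 1
        if rng.randrange(n) == rng.randrange(n):
            return generations

def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, variance

N = 20
rng = random.Random(42)
times = [coalescence_time_two_lineages(N, rng) for _ in range(20000)]
mean, variance = mean_and_variance(times)
print(f"mean = {mean:.1f} (theory {N}), variance = {variance:.0f} (theory {N * (N - 1)})")
```

The simulated values should fall close to N and N(N − 1), up to sampling noise.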
Coalescence of 2 lineages in i generations
For x ≪ 1, we have $(1 - x)^t \approx e^{-tx}$.
→ For large N, the discrete geometric distribution can be approximated by a continuous exponential distribution with rate 1/N, i.e. with expectation N:
$P(G_2 = i) \approx \frac{1}{N} e^{-i/N}$
[Figure: illustration with N = 20; probability as a function of time]
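To see how close the two distributions are for N = 20, the value used in the illustration, the two formulas can be tabulated side by side. This is a minimal sketch; the function names are illustrative.

```python
import math

def geometric_pmf(i, n):
    """Exact probability that two lineages coalesce exactly i generations back."""
    return (1 - 1 / n) ** (i - 1) * (1 / n)

def exponential_approx(i, n):
    """Continuous approximation with rate 1/N, evaluated at time i."""
    return (1 / n) * math.exp(-i / n)

N = 20
for i in (1, 5, 20, 60):
    print(i, round(geometric_pmf(i, N), 5), round(exponential_approx(i, N), 5))
```

The discrepancy shrinks as N grows, which is why the continuous approximation is reserved for large populations.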
Coalescence of k lineages in i generations
Considering a sample of k genes, there are $\binom{k}{2} = \frac{k(k-1)}{2}$ pairs, each of which can coalesce with probability 1/N. The probability that a single pair of genes coalesces in the previous generation is thus (neglecting simultaneous coalescences, which is reasonable when k ≪ N)
$P(G_k = 1) = \binom{k}{2}\frac{1}{N} = \frac{k(k-1)}{2N}$
Then the probability that a coalescence took place i generations backward in time in a sample of k genes follows a geometric distribution with parameter $\frac{k(k-1)}{2N}$:
$P(G_k = i) = \left(1 - \frac{k(k-1)}{2N}\right)^{i-1}\frac{k(k-1)}{2N}$
[Figure: illustration with N = 20; probability as a function of time]
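The per-generation probability k(k − 1)/(2N) ignores generations in which more than two lineages coalesce. As a rough check, it can be compared with the exact Wright-Fisher probability that at least two of the k lineages share a parent. This is a sketch with illustrative function names.

```python
def prob_some_coalescence_exact(k, n):
    """Exact probability that at least two of k lineages pick the same parent."""
    no_coalescence = 1.0
    for j in range(1, k):
        # The (j+1)-th lineage must avoid the j parents already chosen.
        no_coalescence *= 1 - j / n
    return 1 - no_coalescence

def prob_coalescence_approx(k, n):
    """Approximation used on the slide: number of pairs times 1/N."""
    return k * (k - 1) / (2 * n)

for n in (20, 1000):
    print(n, prob_some_coalescence_exact(4, n), prob_coalescence_approx(4, n))
```

For k = 4 the two values differ noticeably when N = 20 and nearly coincide when N = 1000.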
Coalescence of k lineages in i generations
For large N, the distribution of coalescence times in a sample of k genes can thus be approximated by a continuous exponential distribution with parameter $\frac{k(k-1)}{2N}$:
$P(G_k = i) \approx \frac{k(k-1)}{2N}\, e^{-i\,\frac{k(k-1)}{2N}}$
Then, scaling time by the population size (i.e. T = G/N, change of variable), we get
$P(T_k = t) \approx \frac{k(k-1)}{2}\, e^{-t\,\frac{k(k-1)}{2}} = \binom{k}{2}\, e^{-t\binom{k}{2}}$
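The change of variable can be written out explicitly. In this sketch, $P(\cdot)$ is read as a probability density in the continuous approximation, and the notation $f$ is introduced here only for that purpose:

```latex
\begin{align*}
f_{G_k}(g) &\approx \frac{k(k-1)}{2N}\, e^{-g\,\frac{k(k-1)}{2N}}, \qquad T_k = \frac{G_k}{N} \\
f_{T_k}(t) &= N\, f_{G_k}(Nt)
            \approx N \cdot \frac{k(k-1)}{2N}\, e^{-Nt\,\frac{k(k-1)}{2N}}
            = \binom{k}{2}\, e^{-t\binom{k}{2}}
\end{align*}
```

The factor N comes from the Jacobian of the substitution g = Nt, and it cancels the 1/N in the rate.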
Coalescence of k lineages in i generations
$E(G_k) = \frac{2N}{k(k-1)}$
$V(G_k) = \frac{4N^2}{k^2(k-1)^2}$
• The larger the sample size or lineage number is, the shorter the expected coalescence times are.
• Coalescence times have high variance: two independent loci could show very different coalescence times, and thus very different coalescent trees (genealogies).
[Figure: illustration with N = 20; probability as a function of time]
TMRCA, length of a coalescent tree
TMRCA = Time to the Most Recent Common Ancestor = length (or height) of the coalescent tree
For time in generations:
$E(G_{MRCA}) = \sum_{j=2}^{k} E(G_j) = \sum_{j=2}^{k} \frac{2N}{j(j-1)} = 2N \sum_{j=2}^{k} \left(\frac{1}{j-1} - \frac{1}{j}\right) = 2N\left(1 - \frac{1}{k}\right)$
For time scaled by population size:
$E(T_{MRCA}) = 2\left(1 - \frac{1}{k}\right)$
→ The expected TMRCA tends to 2N generations (or 2 in scaled time) for large sample sizes.
→ The TMRCA for a relatively small random sample is almost the same as the one for the whole population.
→ $E(T_2) = 1$ means that about half of the expected tree height is due to the last coalescent event (i.e. that of the last pair of genes).
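The telescoping sum can be checked numerically. The short sketch below uses N = 20 as an arbitrary example and also reports the share of the expected tree height contributed by the last coalescent event, E(G2) = N.

```python
def expected_tmrca_generations(k, n):
    """Sum the expected waiting times E(G_j) = 2N / (j(j-1)) for j = 2..k."""
    return sum(2 * n / (j * (j - 1)) for j in range(2, k + 1))

def closed_form(k, n):
    """Closed form 2N(1 - 1/k) obtained from the telescoping sum."""
    return 2 * n * (1 - 1 / k)

N = 20
for k in (2, 5, 10, 100):
    total = expected_tmrca_generations(k, N)
    share_of_last_event = N / total  # E(G_2) = N
    print(k, round(total, 2), round(closed_form(k, N), 2), round(share_of_last_event, 2))
```

The share of the last event decreases from 1 (k = 2) towards 1/2 as the sample size grows.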
Addition of mutations to a coalescent tree
• Under the neutrality assumption, mutations are independent of the genealogy, because the genealogical process depends strictly on demographic parameters.
→ First, genealogies are built given the demographic parameters considered (e.g. N); then mutations are added a posteriori on each branch of the genealogy, from the MRCA to the leaves, given a mutation rate and a mutation model. This yields polymorphism data under the demographic and mutation models considered.
Addition of mutations to a coalescent tree
• The number of mutations on each branch of the tree is a function of the mutation rate of the genetic marker and of the branch length.
Mutation rate µ = mean number of mutations per locus per generation, e.g. $5 \times 10^{-4}$ for microsatellites, $10^{-8}$ per nucleotide for DNA sequences.
→ For a branch of length t (in generations), the number of mutations m thus follows a binomial distribution with parameters (t, µ), often approximated by a Poisson distribution with parameter µt:
$P(m \mid t) = \frac{(\mu t)^m e^{-\mu t}}{m!}$
• Once the number of mutations on each branch is fixed, a genetic type (allele or haplotype) is chosen for the MRCA, and then the effect of each mutation is added step by step, from the MRCA to the leaves, given the mutation model considered.
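The quality of the Poisson approximation can be checked directly for a microsatellite-like mutation rate. In this sketch the branch length of 1000 generations is an arbitrary illustrative value, not one taken from the slides.

```python
import math

def binomial_pmf(m, t, mu):
    """Probability of m mutations in t generations, each mutating with probability mu."""
    return math.comb(t, m) * mu ** m * (1 - mu) ** (t - m)

def poisson_pmf(m, t, mu):
    """Poisson approximation with parameter mu * t."""
    lam = mu * t
    return lam ** m * math.exp(-lam) / math.factorial(m)

mu, t = 5e-4, 1000  # illustrative values: microsatellite rate, branch of 1000 generations
for m in range(4):
    print(m, round(binomial_pmf(m, t, mu), 6), round(poisson_pmf(m, t, mu), 6))
```

With µ this small the two columns agree to several decimal places.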
Advantages of the coalescent
• It offers a probabilistic model for gene genealogies.
• The coalescent often simplifies the analysis of stochastic population genetic models and their interpretation.
• The coalescent allows extremely efficient simulations of genetic polymorphism under various demo-genetic models (the sample is simulated rather than the entire population).
• The coalescent allows the development of powerful methods for the inference of population-level evolutionary parameters (genetic, demographic, reproductive, ...); some of these methods use all the information contained in the genetic data (likelihood-based methods).
Simulation of trees and polymorphism data under the coalescent theory
(Reminder)
• For neutral markers, the number of offspring is independent of the genetic types of the parents.
→ Demographic and mutational processes are thus independent.
• Simulation of polymorphism data is thus done in two steps:
(1) Tree simulation: topology and branch lengths
(2) Addition of mutations on the tree
Simulation of trees under the coalescent
There are two main methods for coalescent tree simulation:
• Continuous-time algorithm (Hudson, 1991): very fast, but relies on approximations that are only valid for large population sizes and low mutation and migration rates.
• Generation-by-generation algorithm: can consider any mutational and demographic model, but can be very slow.
SPEED: continuous-time approximations > generation by generation
FLEXIBILITY: generation by generation > continuous-time approximations
Representation of a tree and usual terminology
[Figure: a coalescent tree with its time axis running from the Present to the Past]
generation by generation algorithm
Very simple and without any approximations:
• Go backward in time generation by generation
• At each generation, stochastically draw all events that can affect the genealogy, e.g. coalescence, migration, recombination
• Stop when the most recent common ancestor of all sampled genes (MRCA) is reached
Toy example:
• 4 genes
• A single neutral locus
• A haploid WF population with N = 10
→ there are only coalescence events
generation by generation algorithm
• Generation G = 0: the four sampled lineages.
→ Coalescence: randomly choose the parents by assigning a uniform random number between 1 and N to each lineage.

Node / lineage          1   2   3   4
Random parental gene    5   2   6   2
Generation of the node  0   0   0   0

[Figure: the four sampled lineages 1, 2, 3 and 4 at the present]
generation by generation algorithm
• Generation G = 1:
→ Make the coalescence event: lineages 2 and 4 drew the same parental gene (2), so they are replaced by a new node 5, created at generation 1.
→ Coalescence: randomly choose the parents by assigning a uniform random number between 1 and N to each remaining lineage.

Node / lineage          1   3   5
Random parental gene    1   4   1
Generation of the node  0   0   1

[Figure: partial tree in which node 5 joins lineages 2 and 4]
generation by generation algorithm
• Generation G = 2:
→ Make the coalescence event: lineages 1 and 5 drew the same parental gene (1), so they are replaced by a new node 6, created at generation 2.
→ Coalescence: randomly choose the parents by assigning a uniform random number between 1 and N to each remaining lineage.

Node / lineage          3   6
Random parental gene    3   9
Generation of the node  0   2

[Figure: partial tree in which node 5 joins lineages 2 and 4, and node 6 joins lineages 1 and 5]
generation by generation algorithm
• Generation G = 3:
→ Nothing happened at generation 3: lineages 3 and 6 drew different parental genes (3 and 9).
→ Coalescence: randomly choose the parents by assigning a uniform random number between 1 and N to each lineage.

Node / lineage          3   6
Random parental gene    5   7
Generation of the node  0   2

[Figure: same partial tree as at generation 2]
generation by generation algorithm
• Generation G = 4:
→ Nothing happened at generation 4: lineages 3 and 6 drew different parental genes (5 and 7).
→ Coalescence: randomly choose the parents by assigning a uniform random number between 1 and N to each lineage.

Node / lineage          3   6
Random parental gene    10  2
Generation of the node  0   2

[Figure: same partial tree as at generation 2]
generation by generation algorithm
• Generation G = 5:
→ Nothing happened at generation 5: lineages 3 and 6 drew different parental genes (10 and 2).
→ Coalescence: randomly choose the parents by assigning a uniform random number between 1 and N to each lineage.

Node / lineage          3   6
Random parental gene    3   3
Generation of the node  0   2

[Figure: same partial tree as at generation 2]
generation by generation algorithm
• Generation G = 6:
→ Make the coalescence event: lineages 3 and 6 drew the same parental gene (3), so they are replaced by a new node 7, created at generation 6. Node 7 is the MRCA.

Node / lineage          7
Random parental gene    3
Generation of the node  6

• The coalescent tree is finished! We have the topology and the branch lengths.
• This is a stochastic process with a high variance, so if we build many trees, they will all be different but share common properties.
• To get polymorphism data, we need to add mutations on the tree...
[Figure: complete tree, with node 5 joining lineages 2 and 4, node 6 joining lineages 1 and 5, and node 7 (MRCA) joining lineages 3 and 6]
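The generation-by-generation procedure walked through above can be sketched in a few lines. This is a minimal illustration for a single haploid Wright-Fisher population with coalescence only. The function name and the record format `(new_node, children, generation)` are hypothetical choices made here, and several lineages drawing the same parent in one generation are merged into a single node.

```python
import random

def simulate_tree_generation_by_generation(n_samples, pop_size, rng):
    """Return a list of (new_node, children, generation) coalescence records."""
    active = list(range(1, n_samples + 1))  # sampled lineages are nodes 1..n
    next_node = n_samples + 1
    generation = 0
    events = []
    while len(active) > 1:
        generation += 1
        # Each active lineage draws its parental gene uniformly among the N genes.
        children_of = {}
        for node in active:
            parent_gene = rng.randint(1, pop_size)
            children_of.setdefault(parent_gene, []).append(node)
        # Lineages that drew the same parental gene coalesce into a new node.
        active = []
        for children in children_of.values():
            if len(children) == 1:
                active.append(children[0])
            else:
                events.append((next_node, children, generation))
                active.append(next_node)
                next_node += 1
    return events

# Toy example of the slides: 4 genes, N = 10 (the seed is arbitrary).
for record in simulate_tree_generation_by_generation(4, 10, random.Random(3)):
    print(record)
```

The generation of the last record is the TMRCA of the sample, and repeated runs with different seeds show the high variance mentioned above.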
continuous time algorithm
Two steps in the case of a single WF population with large N:
• First simulate the topology of the tree by randomly coalescing lineages (all leaves are equivalent).
• Then draw coalescence times to add branch lengths, e.g. using the continuous-time exponential approximations.
Same toy example:
• 4 genes
• A single neutral locus
• A haploid WF population with N = 10
→ there are only coalescence events
It is a bit more complex for structured models because the topology is constrained by migration events...
continuous time algorithm
(1) Build the topology by randomly coalescing ancestral lineages
→ Lineages 2 and 4 were randomly chosen. That's the 1st coalescent event (node 5).
→ Lineages 1 and 5 were randomly chosen. That's the 2nd coalescent event (node 6).
→ Lineages 3 and 6 were randomly chosen. That's the 3rd and last coalescent event (node 7, the MRCA).
[Figure: tree topology, with node 5 joining lineages 2 and 4, node 6 joining lineages 1 and 5, and node 7 (MRCA) joining lineages 3 and 6]
continuous time algorithm
(1) Build the topology by randomly coalescing ancestral lineages
(2) Draw the 3 branch lengths T4, T3 and T2, i.e. the times during which 4, 3 and 2 lineages remain.
• Branch lengths are drawn from exponential distributions:
$P(T_k = t) \approx \frac{k(k-1)}{2}\, e^{-t\,\frac{k(k-1)}{2}}$
• Random deviates → T4 = 0.8, T3 = 1.4 and T2 = 4.3.
[Figure: the tree with T4 from the leaves up to node 5, T3 between nodes 5 and 6, and T2 between node 6 and the MRCA (node 7)]
continuous time algorithm
(1) Build the topology by randomly coalescing ancestral lineages
(2) Draw the 3 branch lengths T4, T3 and T2.
• The coalescent tree is finished! We have the topology and the branch lengths.
• Note: the distributions of coalescence times must be known under the demographic model considered!
• To get polymorphism data, we need to add mutations on the tree...
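The two steps of the continuous-time algorithm can be sketched as follows for a single population. Note that this sketch draws the waiting time and the coalescing pair together at each event, which is equivalent to building the topology first and the times afterwards because the two are independent here. Times are in units of N generations, and the function name and record format are hypothetical.

```python
import random

def simulate_tree_continuous_time(n_samples, rng):
    """Return (new_node, (child_a, child_b), time) records, time in units of N generations."""
    active = list(range(1, n_samples + 1))  # sampled lineages are nodes 1..n
    next_node = n_samples + 1
    time = 0.0
    events = []
    while len(active) > 1:
        k = len(active)
        # Waiting time while k lineages remain: exponential with rate k(k-1)/2.
        time += rng.expovariate(k * (k - 1) / 2)
        # All pairs of lineages are equally likely to coalesce.
        child_a, child_b = rng.sample(active, 2)
        active.remove(child_a)
        active.remove(child_b)
        active.append(next_node)
        events.append((next_node, (child_a, child_b), time))
        next_node += 1
    return events

rng = random.Random(10)
for record in simulate_tree_continuous_time(4, rng):
    print(record)
```

Averaging the time of the last event over many replicates should approach the expected scaled TMRCA 2(1 − 1/k), i.e. 1.5 for k = 4.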
Addition of mutations
• General principle (reminder):
- Mutations are distributed on the different branches, from the MRCA to the leaves, as a function of the mutation rate µ
- Each mutation induces a change in the allelic/nucleotide state of the descending node
- This change of genetic state is made according to the mutational model considered, which may reflect the real mutational processes of some genetic markers
Addition of mutations
• For a branch of length t, the number of mutations m follows a binomial distribution with parameters (t, µ), which is often approximated by a Poisson distribution with parameter µt:
$P(m \mid t) = \frac{(\mu t)^m e^{-\mu t}}{m!}$
[Figure: coalescent tree below the MRCA with mutations (☇) placed on its branches]
Addition of mutations
• For a branch of length t, the number of mutations m is often approximated by a Poisson distribution with parameter µt:
$P(m \mid t) = \frac{(\mu t)^m e^{-\mu t}}{m!}$
• Example for microsatellites under a SMM (stepwise mutation model): the effect of each mutation is a gain or a loss of one motif (repeat).
- Random genetic type for the MRCA (drawn from the stationary distribution); here the MRCA carries 20 repeats
- Then add the effect of each mutation
→ A polymorphic sample of 4 genes is obtained with allelic states 20, 20, 21, 19.
[Figure: coalescent tree with mutations (☇) on its branches and the resulting number of repeats at each node, starting from 20 at the MRCA]
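Mutating one branch under a stepwise mutation model can be sketched as below. Two assumptions are made here that the slides do not specify: gains and losses are equally likely, and the repeat number has no lower or upper bound. The helper `poisson_draw` is a simple standard-library stand-in for a Poisson generator.

```python
import random

def poisson_draw(mean, rng):
    """Draw a Poisson variate by counting unit-rate exponential arrivals before `mean`."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count

def mutate_smm(allele, branch_length, mu, rng):
    """Apply a Poisson(mu * t) number of +1 / -1 repeat changes to a microsatellite allele."""
    for _ in range(poisson_draw(mu * branch_length, rng)):
        allele += rng.choice((-1, 1))  # assumed symmetric gain or loss of one repeat
    return allele

rng = random.Random(20)
# Illustrative values: ancestral state 20, branch of 2000 generations, mu = 5e-4.
print(mutate_smm(20, 2000, 5e-4, rng))
```

Applying `mutate_smm` along every branch, from the MRCA to the leaves, gives the allelic states of the sample.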
Addition of mutations
• For a branch of length t, the number of mutations m is often approximated by a Poisson distribution with parameter µt:
$P(m \mid t) = \frac{(\mu t)^m e^{-\mu t}}{m!}$
• Example on DNA sequence markers (5 bp).
- Choice of the ancestral sequence (ATTGC)
- Independent mutation on each site
→ A polymorphic sample of 4 genes is obtained with haplotypes TTTGC, TTAGC, TTAGC, AAACC.
[Figure: coalescent tree with mutations (☇) on its branches and the resulting sequence at each node, starting from ATTGC at the MRCA]
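For sequence data, the effect of a given number of mutations on one branch can be sketched as follows. The mutation model is an assumption made for illustration: each mutation hits a uniformly chosen site and replaces the base by one of the three others with equal probability. The slides do not specify which model was used.

```python
import random

NUCLEOTIDES = "ACGT"

def mutate_sequence(sequence, n_mutations, rng):
    """Apply n_mutations single-site changes to a DNA sequence."""
    bases = list(sequence)
    for _ in range(n_mutations):
        site = rng.randrange(len(bases))
        # Assumed model: the new base differs from the current one, all three equally likely.
        bases[site] = rng.choice([b for b in NUCLEOTIDES if b != bases[site]])
    return "".join(bases)

rng = random.Random(7)
print(mutate_sequence("ATTGC", 1, rng))  # one mutation on the ancestral sequence
```

The number of mutations passed in would itself be drawn from the Poisson distribution with parameter µt for the branch considered.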
What to do with these simulated coalescent trees and polymorphism data? (subject of the next courses)
• Exploratory approaches: study the effects of various parameters on the shape of coalescent trees, on the distribution of polymorphism in a sample, and on various summary statistics (e.g. He, FST, ...), e.g. the effect of past changes in population size.
• Simulation tests: create simulated data sets to test the precision and robustness of genetic data analysis methods.
• Inferential approach: estimate population-level evolutionary parameters (population sizes, dispersal, demographic history) from polymorphism data.
Books
Some examples in R: with the package 'coalesceR' by Renaud Vitalis
https://r-forge.r-project.org/
Gene genealogies are affected by demography
With population growth, recent coalescent events are less frequent (large N) than ancient coalescent events (small N). Hence external branches are longer, and internal branches shorter...
→ we expect an excess of low-frequency mutations.
Gene genealogies are affected by demography
With population decline, recent coalescent events are more frequent (small N) than ancient coalescent events (large N). Hence external branches are shorter, and internal branches longer...
→ we expect a deficit of low-frequency mutations.
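The effect of demography on the timing of coalescent events can be illustrated with two lineages only. In this sketch the per-generation coalescence probability is 1/N(g), where N(g) is the population size g generations in the past. Both scenarios (sizes 50 and 1000, change 50 generations ago) are hypothetical values chosen for illustration.

```python
import random

def pairwise_coalescence_time(size_at, rng):
    """Generations back to the common ancestor of two lineages when N varies over time."""
    generation = 0
    while True:
        generation += 1
        # Going back one generation, two lineages share a parent with probability 1/N.
        if rng.random() < 1 / size_at(generation):
            return generation

def growth(generation):   # hypothetical scenario: N = 50 in the past, 1000 recently
    return 1000 if generation <= 50 else 50

def decline(generation):  # hypothetical scenario: N = 1000 in the past, 50 recently
    return 50 if generation <= 50 else 1000

def recent_fraction(size_at, rng, replicates=2000):
    """Fraction of replicates that coalesce within the last 50 generations."""
    times = [pairwise_coalescence_time(size_at, rng) for _ in range(replicates)]
    return sum(t <= 50 for t in times) / replicates

rng = random.Random(2013)
print("growth :", recent_fraction(growth, rng))
print("decline:", recent_fraction(decline, rng))
```

Recent coalescences are rare under growth and common under decline, which is the pattern behind the longer or shorter external branches described above.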