Using Genomics to investigate Cospeciation and Lateral Gene Transfer in symbiotic systems
http://desdevises.free.fr/MEEG_YD
Yves Desdevises, Observatoire Océanologique de Banyuls, Université Pierre et Marie Curie, France
Who am I?
• Professor
• Marine Station of Banyuls/Mer (Deputy Director)
• UMR 7232 Integrative Biology of Marine Organisms
• Head of the team Marine Interactions - Evolution and Adaptation
What kind of research am I doing?
• Evolution and diversity in (marine) symbiotic systems
  • Microalgae-Viruses
  • Fish-Platyhelminthes, …
• Development of analytical methods
  • Cophylogenetics
  • Comparative method
The tools I use
• Phylogenetics, mainly molecular data
• Comparative genomics
• Environmental genomics
• Statistics and numerical ecology
Morphology and Phylogeny in Lamellodiscus
Figure 1. Phylogeny of several Lamellodiscus species obtained by maximum likelihood and Bayesian inference. As both reconstruction methods gave congruent topologies and similar branch lengths, the most resolved tree, obtained by maximum likelihood, was retained and is presented here. Bootstrap values (1000 replicates) and posterior probabilities (> 0.5; dashes correspond to values < 0.5) are indicated at each node. The clusters of individuals for which the alignment of ITS1 was possible are outlined in grey boxes. Thick black lines indicate the ergensi and ignoratus groups. doi:10.1371/journal.pone.0026252.g001
Source of Figure 1: PLoS ONE | www.plosone.org, October 2011, Volume 6, Issue 10, e26252
[Figure: correspondence analysis, axes CA1 (32.62%) and CA2 (31.46%), of sequence groups (Mamiella, Mantoniella, Micromonas A to E, Ostreococcus, Bathycoccus, Crustomastix, Dolichomastix, Monomastix and unassigned groups), with the total amount of sequences per group (scale 0 to 60000)]
Symbiotic systems
• Symbiotic associations in a broad sense (involving eukaryotes, prokaryotes or viruses): parasitism, mutualism, commensalism, ...
• Closely interacting partners ➡ Closely interacting genomes: coevolution
• Common phylogenetic history?
• Lateral gene transfers between partners?
Outline
1. Methods
  1.1. Assessing cophylogenetic history
  1.2. Finding lateral gene transfer
2. Case study: a microalgae-virus system
  2.1. Cophylogeny
  2.2. Lateral gene transfer
Finding cophylogenetic patterns in symbiotic associations
[Figure: host-parasite associations between Parasites and Hosts]
• Cospeciation; coevolution; cophylogeny; parallel cladogenesis; cocladogenesis; cophylogenetic descent; cophylogenetic maps ...
• Here: macroevolutionary context
• How to reconstruct the common evolutionary history of two clades, for example hosts and parasites?
• Some key dates
  • 1981: Brooks (see Klassen 1992)
  • 1994: Page; Hafner et al.
• Books
  • Page (ed.). 2003. Tangled trees. University of Chicago Press.
  • Garamszegi L. Z. (ed.). 2014. Modern comparative methods and their application in evolutionary biology. Chap. 20. Springer.
Four cophylogenetic events
Cospeciation
Transfer
Duplication
Sorting
Bellec L, Desdevises Y. Quand virus et hôtes évoluent ensemble : la fidélité est-elle la règle ? [When viruses and hosts evolve together: is fidelity the rule?] Virologie 2015; xx(x) : 1-10. doi:10.1684/vir.2015.0612
Laure Bellec (1), Yves Desdevises (2). (1) Sorbonne Universités, UPMC Univ Paris 06, CNRS, Biologie des organismes et écosystèmes aquatiques (BOREA, UMR 7208), Muséum national d'Histoire naturelle, Université Pierre et Marie Curie, Université de Caen Basse-Normandie, CNRS, IRD, CP26 75231, 43 rue Cuvier, Paris cedex 5, France. (2) Sorbonne Universités, UPMC Univ Paris 06, CNRS, Biologie intégrative des organismes marins (BIOM, UMR 7232), Observatoire océanologique, 66650, Banyuls/Mer, France. Reprints: Y. Desdevises.
Abstract. Viruses display strong interactions with their hosts, from physiological and ecological points of view, often leading to strict patterns of host specificity. It is then tempting to consider that viruses evolve in the same way as their hosts, behaving more or less like hosts' characters. However, the cospeciation between viruses and their hosts, that is the degree to which their evolutionary trees are similar, has been the subject of relatively few studies, in a field otherwise very dynamic. The main concepts and methods to study the patterns of cospeciation, and more generally cophylogeny, are reviewed here. Their uses with host-virus systems suggest that, contrary to a common belief, the joint evolutionary history of viruses and their hosts is often complex. Without a rigorous cophylogeny study, it is then very risky to consider that the evolutionary history of viruses mirrors that of their hosts.
Key words: cophylogeny, cospeciation, host switch, evolution, specificity
Introduction. The study of the evolutionary relationships between symbionts, or parasites such as viruses (as will be the case in this article), and their hosts is referred to in the scientific literature by many more or less complex terms: coevolution, cophylogeny, cospeciation, codivergence, cocladogenesis, cophylogenetic descent... This reflects the dynamism of this research field, which truly took off in the 1980s-90s with the development of dedicated analytical methods.
Generally, cospeciation refers to the process of concomitant speciation of two closely associated taxonomic groups of organisms, resulting in two congruent (highly similar) if not identical phylogenies [1]. However, it is important to distinguish between the terms cospeciation and coevolution. Since its introduction by Ehrlich and Raven in 1964 [2], the concept of coevolution has been restricted to microevolutionary phenomena [3], in other words a kind of arms race between two interacting species over the course of their evolution, such as a parasite and its host. The term coevolution can thus designate the reciprocal evolution of adaptations between hosts and their viruses, but this system can evolve without cospeciation ...

Methods
• Event-based methods
  • Fit symbiont tree onto host tree by adequately mixing the four types of events = reconciled trees
  • Optimality criterion: maximise number of cospeciation events or minimise global cost (each kind of event is attributed a cost)
  • Computationally intensive: for simple problems (quite small trees and/or specific parasites and/or extensive cospeciation)
• With complex cases, solutions cannot be computed in polynomial time, unless
  • host-switching is precluded, and/or
  • time constraints are considered, to limit the number of possible switches (that requires fully dated trees)
• Global fit methods
  • Global congruence between trees
  • Influence of individual events
    • Host-parasite links
  • Importance of the null hypothesis
    • Cospeciation (e.g. Johnson et al. 2001)
    • Random associations (Legendre et al. 2002)
Theoretical prerequisites
• Well known and fully resolved trees needed for event methods
• Branch lengths: molecular phylogenies
• Exhaustive sampling
• Monophyletic groups = evolutionary entities
• ... quite difficult
• Topological congruence does not necessarily mean cospeciations!
• Different causes for congruence/incongruence
Some methods
• Event-based methods
  • Brooks parsimony analysis (BPA; Brooks 1981)
  • Reconciled trees (Component, TreeMap 1, 2 and 3: Page 1993, 1994; Charleston 1998. Tarzan: Merkle and Middendorf 2005. CoRe-PA: Merkle et al. 2010. Jane: Conow et al. 2010. TreeCollapse: Drinkwater and Charleston 2014)
  • Generalised parsimony (TreeFitter; Ronquist 1995)
  • Probabilistic methods: ML, Bayesian inference (Huelsenbeck et al. 1997, 2000)
• Global fit methods
  • Homogeneity test (Johnson et al. 2001)
  • Congruence tests (ParaFit; Legendre et al. 2002, Hommola et al. 2009)
• Most methods work well if
  • Widespread cospeciation
  • ≈ 1 host / 1 parasite
  • Small phylogenies
• Else: event-based methods are computationally very intensive, optimal solution not guaranteed
• Event-based methods all require fully-resolved trees
• Different methods: different results
Reconciled trees
TreeMap (Page 1994)
• Goal: fitting parasites tree onto host tree by adequately mixing the 4 types of events
• Criterion: maximise number of cospeciations (TM 1)
• Test against a random distribution
• Can take branch lengths into account
• TreeMap 1: problems
  • Transfers added a posteriori
  • Limited optimality criterion (can generate many optimal solutions)
  • Difficulty with widespread parasites (several host species)
• TreeMap 2 (Charleston & Page, 2002)
  • Jungles algorithm: introduces event costs
  • Optimisation of global cost
  • Finds optimal solutions but is very computationally intensive
  • Many modifiable parameters
  • Tests:
    • Global cost
    • Cherry Picking Test: influence of individual associations
  • Each host must have at least one parasite(!!)
  • Needs fully resolved trees
• TreeMap 3 is in development: in Java, for all platforms, with a few new functions
• TreeCollapse is a fast heuristic (greedy) algorithm to compute the solution
Example
[Figure: tanglegram linking the Pocket Gophers tree (hosts, e.g. talpoides, bottae, bursarius, hispidus, cavator, underwoodi, cherriei, heterodus) to the Chewing Lice tree (parasites, e.g. wardi, minor, thomomyus, actuosi, ewingi, chapini, panamensis, setzeri, cherriei, costaricensis). Phylogenies: COXI]
TreeMap 1
[Figure: panels (a) to (f), reconstructions of the chewing lice tree mapped onto the pocket gopher tree]
TreeMap 2
• 6 optimal solutions (out of 9)

Solution   Co   Sw (total distance)   Du   Lo   Total cost
1 of 9      8   4 (3.565)             10    0   14
2 of 9      8   4 (3.67)              10    0   14
3 of 9     12   3 (3.009)              6    1   10
4 of 9     12   3 (3.009)              6    1   10
5 of 9     12   2 (1.9345)             6    3   11
6 of 9     12   2 (1.9345)             6    3   11

Co = cospeciations, Sw = switches (transfers), Du = duplications, Lo = losses (sorting).
[Figure: the six reconstructions drawn on the pocket gopher tree]
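The total costs reported for these TreeMap 2 solutions are consistent with a simple additive scheme in which a cospeciation costs 0 and every other event costs 1. The sketch below recomputes three of them under that assumption. The cost values and the name `total_cost` are illustrative and are not taken from TreeMap itself.

```python
# Assumed unit costs (not read from TreeMap): cospeciation is free,
# every switch, duplication or loss costs 1.
EVENT_COSTS = {"Co": 0, "Sw": 1, "Du": 1, "Lo": 1}


def total_cost(events, costs=EVENT_COSTS):
    """Global cost = sum over event kinds of (count x cost of that kind)."""
    return sum(costs[kind] * count for kind, count in events.items())


# Event counts as reported on the slide for three reconstructions.
solutions = {
    "1 of 9": {"Co": 8, "Sw": 4, "Du": 10, "Lo": 0},
    "3 of 9": {"Co": 12, "Sw": 3, "Du": 6, "Lo": 1},
    "6 of 9": {"Co": 12, "Sw": 2, "Du": 6, "Lo": 3},
}

for name, events in solutions.items():
    print(name, "total cost =", total_cost(events))
```

Changing the values in `EVENT_COSTS` changes which reconstruction is cheapest, which is why the cost scheme has to be reported together with the result.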
• Test against a null distribution (from randomised trees) of inferred number of cospeciations (or global cost with TreeMap 2)
• Comparison with the observed value
[Figure: histogram of the number of cospeciations (x-axis, 1 to 12) against frequency (y-axis, 0 to 250) over randomised trees, with the observed value marked by *. P < 0.05: the observed number of cospeciations is higher than 95 % of random iterations]
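The final step of this test, comparing the observed number of cospeciations with the null distribution, can be sketched as follows. This is only the comparison step: the null counts here are drawn at random as a stand-in, whereas the real procedure reconciles randomised trees. The function name and all values are hypothetical.

```python
import random


def permutation_p_value(observed, null_values):
    """One-sided p-value: share of null values >= observed.

    The observed value is counted as one of the outcomes, hence the +1.
    """
    at_least = sum(1 for value in null_values if value >= observed)
    return (at_least + 1) / (len(null_values) + 1)


# Stand-in null distribution: 999 hypothetical cospeciation counts between 1 and 8.
rng = random.Random(1)
null_counts = [rng.randint(1, 8) for _ in range(999)]

observed_cospeciations = 10  # hypothetical observed value
p = permutation_p_value(observed_cospeciations, null_counts)
print(p)  # 0.001 here, because no simulated count reaches 10
```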
Temporal congruence
• TreeMap can be used to compare divergences in cospeciating pairs
• Evolutionary rates can be compared (e.g. parasites usually evolve faster than their hosts)
• Temporal congruence of speciation events can be assessed, this is a condition for true cospeciation
• Useful to discriminate evolutionary scenarios
• Simultaneity of speciation events?
• No clock: additive trees, independent branch lengths
  • Copaths
  • Fewer changes in ii: evolves more slowly? diverged later?
• Molecular clock: ultrametric trees, dependent branch lengths
  • Coalescence times; intervals between speciation events
  • If intercept = 0: cospeciation
  • Slope: compare evolutionary rates for hosts and parasites
[Figure: host tree (1 to 4) and parasite tree (i to iv), with parasite values plotted against host values for the associated pairs (Page 1996)]
• Tests in TreeMap
  • Branch lengths must be correctly estimated on the tree (e.g. with an evolutionary model)
  • Additive trees
    • Copaths based on reconstruction
    • Correlation coefficient r between copaths tested via branch lengths randomisation, because copaths are not independent (via phylogeny)
  • Ultrametric trees
    • Coalescence times can be used
    • Same test
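The idea of testing r by randomisation rather than with a parametric test can be sketched as below. This is a simplification: TreeMap randomises branch lengths on the trees, whereas this sketch just shuffles which parasite copath is paired with which host copath. The copath lengths and function names are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def randomisation_test(x, y, n_perm=999, seed=0):
    """Return (observed r, one-sided p-value) by shuffling the pairing of y with x."""
    rng = random.Random(seed)
    r_obs = pearson_r(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson_r(x, shuffled) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical copath lengths for six cospeciating host-parasite pairs.
host_copaths = [0.02, 0.05, 0.08, 0.11, 0.15, 0.20]
parasite_copaths = [0.05, 0.12, 0.15, 0.26, 0.31, 0.45]

r, p = randomisation_test(host_copaths, parasite_copaths)
print("r =", round(r, 3), "p =", p)
```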
• Example with additive trees
[Figure: pocket gopher (Hosts) and chewing lice (Parasites) trees with numbered nodes, and plot of parasite copaths against host copaths for the associated pairs (e.g. talpoides-thomomyus, bottae-actuosi, hispidus-chapini, cavator-panamensis, underwoodi-setzeri, cherriei-cherriei); r = 0.5663]
• Example with ultrametric trees
• Slope = compared rates (if same gene)
• If the intercept is not different from 0: simultaneous speciation events = cospeciation
(Hafner et al. 2003)
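A minimal sketch of how slope and intercept are read from such a plot, using ordinary least squares for brevity. That choice is an assumption and not necessarily the regression used in the cited studies. The coalescence times are invented, and no significance test of the intercept is included.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical coalescence times (same gene, arbitrary units) at four
# cospeciation nodes: parasites accumulate three times more change.
host_times = [1.0, 2.0, 3.0, 4.0]
parasite_times = [3.0, 6.0, 9.0, 12.0]

slope, intercept = fit_line(host_times, parasite_times)
print("slope =", slope)          # relative rate, parasite vs host
print("intercept =", intercept)  # close to 0 suggests simultaneous speciation
```

A slope above 1 would indicate that the parasite gene evolves faster than the host gene, provided the same gene is compared in both groups.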
• Example: aphids and bacteria
  • Significant cospeciation (P < 0.01)
    • TreeMap (topologies)
    • ParaFit (distances)
    • ML (sequences)
Tarzan and Jane
• Connected to Jungles (TreeMap 2 and 3), but faster due to heuristic algorithms, with statistical tests
• Tarzan can consider time ranges for nodes in the parasite tree, to preclude switches that are impossible in time (e.g. to an ancestor)
• Jane can consider time ranges for nodes in the host and parasite trees, and can modify switch cost according to phylogenetic distances between hosts
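The effect of time ranges on the set of possible switches can be illustrated with a simple interval check. This is only a sketch of the principle, not the algorithm used by Tarzan or Jane. The lineage names, the time scale (time elapsed since the root) and both functions are hypothetical.

```python
def intervals_overlap(a, b):
    """True when the closed time intervals a = (start, end) and b share a time point."""
    return a[0] <= b[1] and b[0] <= a[1]


def allowed_switch_targets(parasite_node_range, host_branch_ranges):
    """Host lineages that exist during the time range of a parasite node."""
    return sorted(
        host
        for host, branch_range in host_branch_ranges.items()
        if intervals_overlap(parasite_node_range, branch_range)
    )


# Hypothetical host branches, as (start, end) in time units since the root.
host_branches = {
    "ancestor": (0, 4),   # ends before the parasite node: switch impossible
    "host_A": (4, 10),
    "host_B": (4, 10),
    "host_C": (7, 10),    # appears after the parasite node: switch impossible
}

# Hypothetical time range for one parasite speciation node.
print(allowed_switch_targets((5, 6), host_branches))
```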
ParaFit
• Assess congruence between distance matrices (potentially) computed from phylogenies of hosts and parasites, via host-parasite associations
• Statistical tests (via permutations)
  • Global congruence between two trees/matrices (H0: random associations)
  • Contribution of each individual association to this congruence (structuring effect)
[Figure: host-parasite associations (A) linking the Parasites tree (B) and the Hosts tree (C)]
• Phylogenetic distances are transformed into principal coordinates
  • Data: sequences (e.g. ACGTTCGGA, ACTGTCGGA, AGTGTCCGA) or coded characters (e.g. 010010100, 010110110, 001110110)
  • Raw or patristic distances (n × n matrix)
  • Principal coordinates analysis
  • Production of a maximum number of n-1 independent variables (principal coordinates) fully equivalent to phylogenetic distances
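As an illustration of the first step, raw distances can be computed from the three aligned sequences shown on the slide as the proportion of differing sites (p-distance). The taxon labels are hypothetical, and the principal coordinates analysis itself is not shown.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned (same length)")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


# The three sequences from the slide; taxon names are placeholders.
sequences = {
    "taxon_1": "ACGTTCGGA",
    "taxon_2": "ACTGTCGGA",
    "taxon_3": "AGTGTCCGA",
}

for (name_a, seq_a), (name_b, seq_b) in combinations(sequences.items(), 2):
    print(name_a, name_b, round(p_distance(seq_a, seq_b), 3))
```

Patristic distances would instead be summed along the branches of an estimated tree; either kind of matrix can then be passed to a principal coordinates analysis.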
• Matrix A: absence/presence of host-parasite associations (0/1 data)
• Matrix B: principal coordinates (columns) describing the parasite phylogenetic tree
• Matrix C: principal coordinates (rows) describing the host phylogenetic tree
• Matrix D: SSCP parameters to be estimated
[Figure: layout of matrices A, B, C and D for the pocket gophers and chewing lice example. Hosts (pocket gophers): T. talpoides, T. bottae, Z. trichopus, P. bulleri, O. hispidus, O. underwoodi, O. cavator, O. cherriei, O. heterodus, C. merriami, C. castanops, G. bursarius majus., G. bursarius halli, G. breviceps, G. personatus. Parasites (chewing lice): T. barbarae, T. minor, G. trichopi, G. nadleri, G. chapini, G. setzeri, G. panamensis, G. cherriei, G. costaricensis, G. thomomyus, G. perotensis, G. actuosi, G. expansus, G. geomydis, G. oklahomensis, G. ewingi, G. texanus]
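The way the four matrices fit together can be sketched in a few lines. The sketch assumes that A has parasites in rows and hosts in columns, so that D = C A' B, and that the global statistic is the sum of the squared elements of D. The permutation scheme (shuffling each row of A independently) is one possible reading of "random associations". All coordinates below are invented, and this is not the ParaFit program itself.

```python
import random


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def parafit_global(a, b, c):
    """Sum of squared elements of D = C A' B (assumed orientation, see lead-in)."""
    d = matmul(matmul(c, transpose(a)), b)
    return sum(value * value for row in d for value in row)


def permutation_test(a, b, c, n_perm=999, seed=0):
    """One-sided test of the global statistic under random associations."""
    rng = random.Random(seed)
    observed = parafit_global(a, b, c)
    hits = 0
    for _ in range(n_perm):
        # Each parasite (row of A) is re-associated with hosts at random.
        permuted = [rng.sample(row, len(row)) for row in a]
        if parafit_global(permuted, b, c) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical example: 3 parasites, 3 hosts, each parasite on one host.
A = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # parasites x hosts (0/1)
B = [[1, 0], [0, 1], [-1, -1]]          # parasites x parasite coordinates
C = [[1, 0, -1], [0, 1, -1]]            # host coordinates x hosts

print(parafit_global(A, B, C))
```

With only three taxa per tree a permutation test has almost no power; the example is sized for checking the matrix algebra by hand, not for inference.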
• Drawbacks
  • Events not considered
  • No scenarios
• Advantages
  • Statistical tests, and tested via simulations
  • Adapted to complex problems
  • Various numbers of hosts/parasite and parasites/host
  • Use distance matrices: no problem with polytomies, or multiple trees
• ParaFit implemented in CopyCat
• Use different methods in cophylogenetic studies
Looking for Lateral Gene Transfers between symbionts
• LGT in the tree of life
• Lateral gene transfer is more and more recognized as an important factor shaping the evolution of life
• Current debate is no longer on the existence of LGT but on its importance: can we still consider that the evolution of life is mainly tree-like?
  • No (?) in Prokaryotes
  • Yes (?) in Eukaryotes
• LGT needs to be identified and removed before concatenating genes to build trees from genomes
• Books
  • Gogarten et al. (eds). 2009. Horizontal gene transfer. Humana Press.
  • Pagel and Pomiankowski (eds). 2008. Evolutionary genomics and proteomics. Sinauer.
Possible mechanisms
• "Prokaryotes"
• Eukaryotes
REPORTS 28. W. D. Koenig, D. Van Vuren, P. N. Hooge, Trends Ecol. Evol. 11, 514 (1996). 29. C. M. Arnaud, F. S. Dobson, J. O. Murie, Mol. Ecol. 21, 493 (2012). 30. K. B. Armitage, D. H. Van Vuren, A. Ozgul, M. K. Oli, Ecology 92, 218 (2011). Acknowledgments: I thank J. Hoogland for encouraging me to reexamine my information on dispersal; my 150+ research assistants over the 31 years of research (especially my four offspring); and D. Boesch, K. Fuller, R. Gardner, R. Morgan, and L. Pitelka of the University of the Maryland Center for Environmental Science (UMCES) for the opportunity for long-term comparative research. Financial support was provided by Colorado Parks and Wildlife, the Denver Zoological Foundation, Environmental Defense, the Eppley Foundation, the Harry Frank Guggenheim Foundation, the National Fish and Wildlife Foundation, the National Geographic Society,
Gene Transfer from Bacteria and Archaea Facilitated Evolution of an Extremophilic Eukaryote Gerald Schönknecht,1,2*† Wei-Hua Chen,3,4† Chad M. Ternes,1† Guillaume G. Barbier,5†‡ Roshan P. Shrestha,5†§ Mario Stanke,6 Andrea Bräutigam,2 Brett J. Baker,7 Jillian F. Banfield,8 R. Michael Garavito,9 Kevin Carr,10 Curtis Wilkerson,5,10 Stefan A. Rensing,11|| David Gagneul,12 Nicholas E. Dickenson,13 Christine Oesterhelt,14 Martin J. Lercher,3,15 Andreas P. M. Weber2,5,15*
NSF, Princeton University, the Ted Turner Foundation, UMCES, and the Universities of Michigan and Minnesota. For help with the manuscript, I thank R. Alexander, D. Blumstein, D. Bowler, C. Brown, J. Clobert, T. H. Clutton-Brock, A. Davis-Robosky, F. S. Dobson, L. Handley, K. Holekamp, A. Hoogland, S. Keller, X. Lambin, M. Oli, P. Sherman, N. Solomon, and D. Van Vuren. Data for this report are archived as supplementary materials on Science Online.
Supplementary Materials
18 October 2012; accepted 23 January 2013 10.1126/science.1231689
lead to expansion of existing gene families (8). In contrast, archaea and bacteria commonly adapt through horizontal gene transfer (HGT) from other lineages (9). HGT has also been observed in some unicellular eukaryotes (10); however, to our knowledge, horizontally acquired genes have not been linked to fitness-relevant traits in freeliving eukaryotes (11). Phylogenetic analyses of G. sulphuraria genes using highly stringent criteria indicate at least 75 separate gene acquisitions from archaea and bacteria (supplementary materials). The origin of these G. sulphuraria genes from HGT is supported by the finding that compared to the genomic average, they have
• LGT is viewed as the main adaptive mechanism in
Some microbial eukaryotes, such as the extremophilic red alga Galdieria sulphuraria, live in hot, toxic metal-rich, acidic environments. To elucidate the underlying molecular mechanisms of 1 adaptation, we sequenced the 13.7-megabase genome of G. sulphuraria. This alga shows an Department of Botany, Oklahoma State University, Stillwater, REPORTS enormous metabolic flexibility, growing either photoautotrophically or heterotrophically on more OK 74078, USA. 2Institute of Plant Biochemistry, HeinrichHeine-Universität Düsseldorf, 40225 Düsseldorf, Germany. than 50 carbon sources. Environmental adaptation seems to have been facilitated by horizontal 3 Ecol. Evol. 17. J. L. Hoogland, Science 215, 1639 (1982). 28. W. D. Koenig, D. Van Vuren, P. N. Hooge, Trends NSF, Science, Princeton University, the Ted Turner Foundation, Institute for Computer Heinrich-Heine-Universität gene transfer from various bacteria and archaea, often followed by gene family expansion. At least 4 18. J. L. Hoogland, The Black-tailed Prairie Dog: Social Life of a 11, 514 (1996). UMCES, and the Universities of Michigan and Minnesota. Düsseldorf, 40225 Düsseldorf, Germany. European Molecular 5%Burrowing of protein-coding genes of G. sulphuraria were probably acquired horizontally. These proteins Biology (EMBL) Heidelberg, Meyerhofstrasse Mammal (Univ. of Chicago Press, Chicago, 1995). 29. C. M. Arnaud, F. S. Dobson, J. O. Murie, Mol. Ecol.Laboratory 21, For help with theEMBL, manuscript, I thank R. Alexander, 5 are J.involved in ecologically processes ranging from 493 heavy-metal 1, 69117 Heidelberg,D.Germany. Department of C. Plant Biology, 19. L. Hoogland, J. Mammal. important 80, 243 (1999). (2012). detoxification to Blumstein, D. Bowler, Brown, J. Clobert, T. H. Clutton-Brock, Wilson State University, Lansing, MI glycerol uptake and metabolism. Thus, findings show that a pan-domain poolVuren, has A. Ozgul, 612 20. J. L. Hoogland, in Rodent Societies, J. O.our Wolff, 30. K. B. Armitage, D.gene H. 
Van M. K. Oli, Road, Michigan A. Davis-Robosky, F. S.East Dobson, L. Handley, K. Holekamp, 48824, USA. 6Institut Mathematik und Informatik, Ernst facilitated environmental adaptation this Chicago, unicellular eukaryote. P. W. Sherman, Eds. (Univ. of ChicagoinPress, Ecology 92, 218 (2011). A. für Hoogland, S. Keller, X. Lambin, M. Oli, P. Sherman,
• This remains to be studiedAin Eukaryotes, where
Moritz Arndt Universität Greifswald, Walther-Rathenau-Straße 2007), pp. 438–450. N. Solomon, and D. Van Vuren. Data for this report are 47, 17487 Greifswald, Germany. 7Department of Earth and Envi21. Details about bacteria my materials and methods aredomavailable(6). as TheAcknowledgments: J. Hoogland for encouraging archived as supplementary only member ofI thank the Cyanidiophyceae lthough and archaea usually ronmental me Sciences, 4011 CC Little Building, 1100 materials North Uni-on Science Online. supplementary materials on Science hot Online. to reexamine my sequence informationwas on dispersal; my 150+ versityresearch Avenue, University of Michigan, Ann Arbor, MI 48109, a genome previously inate extreme environments, and ex- for which 22. J. B. Silk, Philos. Trans. R. Soc. London Ser. B 362, 539 assistants over the 31 years of research (especially my8 four Earth and Planetary Science, Department tremely acidic habitats are typically devoid available, Cyanidioschyzon merolae (7), diverged USA. Department ofSupplementary Materials (2007). offspring); and D. Boesch, K. Fuller, R. Gardner, of R.Environmental Morgan, Science, Policy, and Management, University 9 www.sciencemag.org/cgi/content/full/339/6124/1205/DC1 sulphuraria about 1 billion years ago, of California, of photosynthetic bacteria. Instead, eukaryotic from G.and 23. J. C. Mitani, J. Call, P. M. Kappeler, R. Palombit, J. B. Silk, L. Pitelka of the University of the Maryland Center for Berkeley CA 94720–4767, USA. Department of Materials and Methods which approximates the evolutionary distance beunicellular red algae of the Cyanidiophyceae are Biochemistry and Molecular Biology, 603 Wilson Road, Michigan The Evolution of Primate Societies (Univ. of Chicago Environmental Science (UMCES) for the opportunity for S1 MI 48824, USA. 10Research TechLansing, fliescomparative and humans (see fig. S1 and the Press, principal photosynthetic organisms in these tween fruit Chicago, 2012). long-term research. 
Financial supportState wasUniversity, East Fig. Tables and Laboratories, S2 nology Support Facility, PlantS1 Biology 612 Wilson merolae maintains ecological niches (1). Cyanidiophyceae 24. J. L. Hoogland, Behaviour 69, 1 (1979).can grow supplementary provided materials). by Colorado C. Parks and Wildlife, the Denver Zoological Road, Michigan StateReferences University, East Lansing, MI 48824, USA. (31–38) 25. J. L.0 Hoogland, Behav. Ecol. Sociobiol. 63, 1621 Environmental Defense, the Eppley a strictlyFoundation, photoautotrophic lifestyle and does not Foundation, at pH to 4 and temperatures up to 56°C, close(2009). 11 Faculty of Biology and BIOSS Centre for Biological Signalling 18 October 2012; accepted January 2013 26. P. J.upper Greenwood, Anim. limit Behav.for28,eukaryotic 1140 (1980). Harry Guggenheim Foundation, the National Fish high saltFrank or metal concentrations; it difto the temperature life toleratethe Studies, University of Freiburg, Schänzlestrasse 1, 79104 23 Freiburg, 10.1126/science.1231689 27. M. Waser,sulphuraria W. T. Jones,isQ.a Rev. Biol.member 58, 355of(1983). and Wildlife the in National Geographic Society,12UMR USTL-INRA 1281 “Stress Abiotiques et Differs markedly from Foundation, G. sulphuraria ecology, cell Germany. (2). P.Galdieria unique the Cyanidiophyceae, displaying high salt and biology, and physiology. Accordingly, we find férenciation des Végétaux cultivés,”13 Université de Lille 1, 59650 Villeneuve d'Ascq Cédex, France. Department of Microbiology metal tolerance and exhibiting extensive meta- orthologs for only 42% of the 6623 G. sulphuraria and Molecular Genetics, Oklahoma State University, Stillwater, bolic versatility (3, 4). G. sulphuraria naturally proteins in C. merolae, and only 25% of both ge- OK 74078, USA. 14CyanoBiofuels GmbH, Magnusstrasse 11, lead to expansion of existing gene families (8). In inhabits volcanic hot sulfur springs, solfatara soils, nomes constitute syntenic blocks (fig. S2). 
Coding 12489 Berlin, Germany. 15Cluster of Excellence on Plant Scicontrast, archaea and bacteria commonly adapt Düsseldorf, 40225 and anthropogenic hostile environments. In habi- sequences make up 77.5% of the G. sulphuraria ences (CEPLAS), Heinrich-Heine-Universität tats with high concentrations of arsenic, alumi- genome, resulting in a median intergenic distance Düsseldorf, Germany.through horizontal gene transfer (HGT) from other should addressed. lineages (9).be HGT hasE-mail: also been observed in num, cadmium, mercury, and other toxic metals, of 20 base pairs (bp) (fig. S3). Protein-coding *To whom correspondence
[email protected] (G.S.); andreas.weber@ some unicellular eukaryotes (10); however, to G. sulphuraria frequently represents up to 90% genes contain on average two introns (fig. S4), uni-duesseldorf.de (A.P.M.W.) of total biomass and almost all the eukaryotic with median lengths of 55 bp (fig. S5). Thus, the †These authors contributed our knowledge, acquired genes have equally to thishorizontally work. Novozymes, Inc, 1445toDrew Avenue, G. sulphuraria genome is highly condensed by ‡Permanent address: biomass (1, 5). not been linked fitness-relevant traits in freeTo understand the molecular mechanisms comparison with that of C. merolae and most Davis, CA 95618, USA. living (11). Phylogenetic analyses of §Permanent address: Scrippseukaryotes Institution of Oceanography, 1,2 extremophilic and 3,4other eukaryotes. 5 underlying G. sulphuraria’s Gerald Schönknecht, *† Wei-Hua Chen, † Chad M. Ternes,1† Guillaume G. Barbier, †‡ of California, University San Diego, CA 92037, G. sulphuraria genesUSA. using highly stringent crite5 6 2 7 through 8 innovations arise metabolically flexible lifestyle (Fig. 1), we deter||Permanent address: of Biology, Philipps-University Roshan P. Shrestha, †§ Mario Stanke, AndreaEukaryotic Bräutigam, Brett usually J. Baker, Jillian F. Banfield, riaFaculty indicate at least 75 separate gene acquisimined its genome sequence (13.7 Mb;10 table S1) gene duplications Marburg, 35032 9 5,10 and neofunctionalizations, 11 which 12 Marburg, Germany.
gene duplication and evolution is an important process (but see Schonknecht et al. 2013, Science) Gene Transfer from Bacteria and Archaea Facilitated Evolution of an Extremophilic Eukaryote
R. Michael Garavito, Kevin Carr, Curtis Wilkerson, Stefan A. Rensing, || David Gagneul, Nicholas E. Dickenson,13 Christine Oesterhelt,14 Martin J. Lercher,3,15 Andreas P. M. Weber2,5,15* www.sciencemag.org
SCIENCE
VOL 339
8 MARCH 2013
Some microbial eukaryotes, such as the extremophilic red alga Galdieria sulphuraria, live in hot, toxic metal-rich, acidic environments. To elucidate the underlying molecular mechanisms of adaptation, we sequenced the 13.7-megabase genome of G. sulphuraria. This alga shows an enormous metabolic flexibility, growing either photoautotrophically or heterotrophically on more than 50 carbon sources. Environmental adaptation seems to have been facilitated by horizontal gene transfer from various bacteria and archaea, often followed by gene family expansion. At least 5% of protein-coding genes of G. sulphuraria were probably acquired horizontally. These proteins are involved in ecologically important processes ranging from heavy-metal detoxification to glycerol uptake and metabolism. Thus, our findings show that a pan-domain gene pool has facilitated environmental adaptation in this unicellular eukaryote.
tions from archaea and bacteria (supplementary materials). The origin of these G. sulphuraria 1207by the finding genes from HGT is supported that compared to the genomic average, they have 1
Bacteria and Archaea
Methods to unveil LGT
• Compositional methods
• Comparison of evolutionary rates
• Look for similar sequences in databases through BLAST
• Phylogenetic approach
Compositional methods
• Look via bioinformatics in complete genomes for
  • atypical nucleotide composition in putatively transferred genes
  • atypical codon usage patterns
• Only for recent transfer events (before homogenization)
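The compositional idea above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names (`gc_content`, `atypical_genes`), the use of GC content as the only compositional signal, and the threshold of 2 standard deviations are all hypothetical choices, not taken from the slides.

```python
from statistics import mean, pstdev


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def atypical_genes(genes, threshold=2.0):
    """Names of genes whose GC content lies more than `threshold` standard
    deviations away from the mean GC content of all genes (assumed cutoff)."""
    gc = {name: gc_content(seq) for name, seq in genes.items()}
    mu = mean(gc.values())
    sd = pstdev(gc.values())
    if sd == 0:
        return []  # all genes share the same composition: nothing stands out
    return sorted(name for name, value in gc.items() if abs(value - mu) / sd > threshold)
```

Calling `atypical_genes` on a dictionary mapping gene names to sequences returns the candidates for recent transfer. Real analyses would also look at codon usage and would use a genome-specific null model rather than a fixed cutoff.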
Evolutionary rates
• Compare pairwise distances between gene orthologs within families vs distances between genomes (from a reference tree): if no LGT, these distances should be roughly equal
• Problem: many factors other than LGT can cause substitution rates (and hence distances) to differ
• Requires orthologs to be present in the genomes under study: works better within than between phyla
• Another possibility is to compare instantaneous substitution matrices in genes vs genomes
  • In case of LGT, these rates should differ
  • Problem: difficult to accurately estimate the substitution matrix for short DNA stretches (e.g. single genes)
  • Supposes LGT is rare within genomes, because the genome rate computation includes transferred genes
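As a rough illustration of the first comparison above (gene distances vs reference distances), the sketch below computes uncorrected p-distances from an alignment and divides them by reference distances. The function names, the input layout (a dict of aligned sequences and a dict of reference distances keyed by taxon pairs) and the use of p-distance instead of a model-based distance are assumptions made for the example.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences, ignoring gapped sites."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped sites shared by the two sequences")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_ratios(alignment, reference):
    """Ratio gene distance / reference distance for every pair of taxa.

    alignment: dict taxon -> aligned sequence of the gene under study
    reference: dict frozenset({taxon1, taxon2}) -> distance taken from a reference tree
    """
    ratios = {}
    for t1, t2 in combinations(sorted(alignment), 2):
        ratios[(t1, t2)] = p_distance(alignment[t1], alignment[t2]) / reference[frozenset((t1, t2))]
    return ratios
```

Without LGT the ratios should be roughly equal across pairs; a pair whose ratio departs strongly from the others is only a candidate, since rate heterogeneity can produce the same signal.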
BLAST and similarity
• Find homologs of a query sequence in databases, genomes, ... via a similarity search (e.g. BLAST)
• Pattern of gene presence/absence in organisms = phyletic pattern
• Identification of LGT for genes with unusual affiliation
• Drawback: similarity does not necessarily mean evolutionary proximity
• Sensitive to taxa/gene representation in databases
• BLAST can be used to estimate the number of transferred genes in closely related organisms using their genomes
• Can find if genes specific to each genome are putatively transferred
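A phyletic pattern can be built from similarity-search results with very little code. The sketch below assumes the hits have already been reduced to (query, subject, e-value) tuples and that a lookup table from subject identifier to organism is available; the function name, the data layout and the e-value cutoff are hypothetical.

```python
def phyletic_patterns(hits, organism_of, organisms, max_evalue=1e-5):
    """Presence/absence (1/0) of each query gene across a fixed list of organisms.

    hits: iterable of (query, subject, evalue) tuples from a similarity search
    organism_of: dict subject identifier -> organism name (assumed to be known)
    organisms: ordered list of organisms defining the columns of the pattern
    Queries without any hit below `max_evalue` are absent from the result.
    """
    present = {}
    for query, subject, evalue in hits:
        if evalue <= max_evalue:
            present.setdefault(query, set()).add(organism_of[subject])
    return {
        query: tuple(int(organism in found) for organism in organisms)
        for query, found in present.items()
    }
```

A gene whose pattern shows presence only in distantly related organisms has an "unusual affiliation" in the sense used above, but the caveats about database representation apply unchanged.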
Phylogenetic approach
• Look for individual gene trees incongruent with a reference phylogeny
• Reference tree: rDNA, genomes, gene concatenation, consensus tree, supertree, ...
• Need well supported trees
• Test for incongruence between topologies (KH, SH, AU, SOWH, …)
• Cannot detect LGT between neighbours
• Mix phylogenetic and phyletic approaches (BLAST): find putatively close sequences using BLAST and include them in a phylogenetic analysis
• Between symbionts with complete genomes available: BLAST symbiont ORFs against the host genome to identify putatively transferred genes
• Cophylogenetic event-based methods (gene tree within species tree) can be used to infer a scenario for the LGT (= host-switch)
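To make "incongruent with a reference phylogeny" concrete, the sketch below lists the clades found in one rooted tree but not in the other. It is a plain clade comparison (in the spirit of a rooted Robinson-Foulds distance), not one of the statistical tests named above (KH, SH, AU, SOWH). The nested-tuple tree encoding and the function names are assumptions made for the example.

```python
def clades(tree):
    """Non-trivial clades of a rooted tree written as nested tuples of leaf names.

    Each clade is a frozenset of leaf names; single leaves and the root clade are left out.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    root = leaves(tree)
    found.discard(root)
    return found


def conflicting_clades(gene_tree, reference_tree):
    """Clades present only in the gene tree, and clades present only in the reference tree."""
    gene, reference = clades(gene_tree), clades(reference_tree)
    return gene - reference, reference - gene
```

Two empty sets mean the topologies agree. Any difference still has to be weighed against node support before it is read as evidence of LGT, which is why well supported trees are needed.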
Examples
• Antifreeze protein (AFP) gene in fish
[Figure: comparison of the herring AFP gene with the homologous gene in other fish (labels: rainbow smelt, zebrafish); AFP tree vs 16S tree]
• Carbohydrate-active enzymes transferred from marine bacteria to the microbiota of Japanese people
  • Transfer from seaweed bacteria (Porphyra in sushi)
• TOP6B transferred from Archaea to photosynthetic eukaryotes and δ-proteobacteria
• frp gene acquired in red algae and green plants from δ-proteobacteria
• Multiple transfers virus-to-host and host-to-virus between Emiliania huxleyi and EhV86
Case study: Prasinophyte microalgae and their viruses
Hosts: Prasinophyceae
• Chlorophyta: green algae (Order Mamiellales, ubiquitous picoplankton)
• 3 main genera, 6 complete genomes to date
  • Ostreococcus (3 genomes): smallest free-living eukaryote and photosynthetic genome
  • Bathycoccus (1 genome): scales
  • Micromonas (2 genomes): flagellum
[Figure: Ostreococcus, Bathycoccus, Micromonas]
Viruses
• Phycodnavirus
• Prasinovirus
• Important role in the regulation of phytoplanktonic populations
[Photo credit: ML Escande, OOB]
• Hosts and viruses mainly sampled in the Gulf of Lion
• Evolution
• Diversity
• Specificity (viruses) and richness (hosts)
• Coevolution
• Lateral gene transfers
• Large viral genomes: about 200 kb
• 14 complete genomes to date
  • Ostreococcus virus: 2 OtV, 7 OlV, 1 OmV, 1 OxV
  • Bathycoccus virus: 2 BpV
  • Micromonas virus: 1 MpV
• OtV5 genome
• Genome comparisons
Genome Flexibility of Ostreococcus Viruses

TABLE 4 Percent identity of the core genes between the seven Ostreococcus lucimarinus viruses (a)

% identity to:
Virus   OlV1   OlV2   OlV3   OlV4   OlV5   OlV6   OlV7
OlV1    100
OlV2     66    100
OlV3     67     95    100
OlV4     92     68     68    100
OlV5     67     94     95     69    100
OlV6     67     94     95     68     97    100
OlV7     92     69     68     92     69     69    100

(a) Boldface numbers indicate comparisons between viruses of the same type.

FIG 4 Number of core and specific genes among the seven Ostreococcus lucimarinus viruses. Pale blue circle, OlV core genes; gray circle, genes present in more than one OlV but not in all seven of the genomes; dark blue triangles, genes specific to only one OlV; yellow external circles, number of these specific genes shared between this OlV and at least one other prasinovirus. See Materials and Methods for the determination of the orthologous genes.

FIG 5 Synteny among the seven Ostreococcus lucimarinus virus genomes. Synteny analysis is based on the alignment between annotated open reading frames translated into amino acids for each of the seven OlVs. Each red line represents one orthologous gene. Window, 20 amino acids. Each blue line represents an inverted orthologous gene sequence.

[...] types present in the English Channel (not investigated in either study) and successfully cross-infects O. tauri (30). The morphology and size of the six new OlVs sequenced here is typical of other characterized prasinoviruses that infect Micromonas or Ostreococcus (17, 18). The particles also are morphologically similar to the much larger (165 to 190 nm) Chlorella viruses except for the spike structure at the vertex, which is not observed in prasinoviruses (45, 48). Globally, they show icosahedral symmetry, like the great majority of the other dsDNA aquatic eukaryote viruses currently described (49), without any tail, in contrast to many archaeal and bacterial bacteriophages (50, 51). The size of their genomes (around 200 kb) and the number of potential ORFs also are similar to those of the other sequenced prasinoviruses (17–20).

Several lines of evidence support the delineation of two distinct OlV groups that we term type I and type II. In phylogenetic analysis of 22 proteins shared across the analyzed viruses, the OlVs formed two bootstrap-supported groups (Fig. 8). Concordance of the two reconstructions performed here (Fig. 8) indicates that PolB is a reliable phylogenetic marker for investigating natural diversity of chlorella- and prasinoviruses (9, 13). One O. tauri virus (OtV2) was grouped with OlV type II viruses. However, this virus was isolated using Ostreococcus sp. strain RCC393, a clade B strain that is present primarily in oligotrophic waters (25), rather than a strain from clade C (represented by O. tauri). Hence, while it is unknown whether the so-called OtV2 also infects clade A or C strains, its name is misleading, since it was not isolated against O. tauri. In addition to the observed phylogenetic relationships, nucleotide identities were higher for gene orthologs from OlVs in the same phylogenetic group than between groups, and gene presence/absence patterns were more similar within each group than between groups. Types I and II do not appear to correlate with geographical origins of the viruses. This indicates that the inversion and sequence divergence arose before the dispersion of the two groups. Moreover, the fact that the bona fide O. tauri viruses grouped with type I OlVs suggests these viruses cospeciated with their hosts from a common ancestor with OlV2, OlV3, OlV5, and OlV6. Among type II OlVs, the percent nucleotide identity between orthologous genes of viruses isolated from the same location (OlV5 and OlV6) (Table 1) is higher (97%) than that with the other type II viruses (94 to 95%), suggesting that inside each subgroup the sequence distance reflects the geographical distance, or that the viruses infecting Mediterranean Sea host strains have become more specialized for these hosts. However, additional sampling of viruses will be required to test this hypothesis. The presence of the sequence inversion in the two virus subgroups suggests that this inversion is an ancient rare event that occurred before the separation of the two groups. This hypothesis also is supported by the sequence divergence observed inside the inverted fragment that is similar to the divergence found in the rest of the OlV genomes. Furthermore, the phylogeny suggests that the most parsimonious explanation is that O. tauri viruses have arisen from a type I O. lucimarinus viral ancestor (with the inversion) by host switching.

Among the genes which are shared by at least two but fewer than the seven OlVs, 2-oxoglutarate-Fe(II) oxygenase genes (52) are present in multiple copies in several viruses. These highly similar copies are located toward the 3′ ends of the genomes, suggesting gene duplications from a single or a limited number of initial acquisitions. This gene family also has been described in multiple copies in viruses infecting cyanobacteria (53). The authors proposed that these genes were involved in the regulation of the cellular nitrogen metabolism or in DNA repair for the benefit of the virus. However, they also could be involved in controlling host translation during the infection (54, 55). Interestingly, the pho4 [...]
• Genome stats
• Phylogeny
• Specificity
Some of our questions
• What is the genetic diversity of viruses and their hosts? Are they correlated in some way? Is it linked to environmental variables?
• What are the features of host and virus genomes?
• What are the resistance mechanisms in viruses?
• Virus evolution: are prasinoviruses monophyletic? Are they coevolving with their hosts? Are there any gene transfers between hosts and viruses? Are evolutionary rates different between hosts and their viruses? What is the origin of viral genes?
Cophylogeny
• Trees are based on the analysis of a partial DNA polymerase gene (about 600 bp) for viruses, and (generally) 18S rDNA for hosts
• Host specificity is assessed experimentally
Cospeciation
• Significant cospeciation with reduced dataset
• Need more data
  • Longer sequences
  • Specificity
  • New strains
• More complete dataset (51 viruses on 22 hosts): too long to compute exact tests
[Figure: tanglegram of host (algae: Bathycoccus, Ostreococcus, Micromonas) and parasite (virus) trees with their associations; node support values shown on both trees]
• Rational approach: compute a congruence test with ParaFit and propose a reconstruction with topology-based algorithms
• A reconstruction from Jane
• Statistical testing
  Original instance: P ≤ 0.000
• Difficult to define evolutionary or geographical entities for viruses because of their global distribution ("everything is everywhere"): distance approaches are less biased than topology-based approaches
• No physical barriers between hosts and viruses
• If the cophylogenetic pattern is significant, it reflects adaptation rather than a lack of opportunity for transfer
• "Real" cospeciation
➡ ... Need to study this dataset in a more thorough way (e.g. by partitioning the data)... your job!
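The logic of a distance-based congruence test can be illustrated with a much simpler stand-in than ParaFit: an exact Mantel-type permutation test on host and parasite distance matrices. The sketch below assumes one parasite per host (host i is associated with parasite i), which ParaFit does not require; the function names are hypothetical. It enumerates all n! relabellings, which also shows why exact tests become too long to compute for large datasets.

```python
from itertools import permutations


def mantel_z(host, parasite, order):
    """Sum of products of host distances and parasite distances after relabelling parasites."""
    n = len(host)
    return sum(
        host[i][j] * parasite[order[i]][order[j]]
        for i in range(n)
        for j in range(i + 1, n)
    )


def exact_congruence_test(host, parasite):
    """Observed statistic and exact one-sided P value over all n! parasite relabellings.

    Assumes square symmetric distance matrices of equal size and a one-to-one
    association between host i and parasite i.
    """
    n = len(host)
    observed = mantel_z(host, parasite, tuple(range(n)))
    stats = [mantel_z(host, parasite, perm) for perm in permutations(range(n))]
    p_value = sum(z >= observed for z in stats) / len(stats)
    return observed, p_value
```

With 4 host-parasite pairs there are only 24 relabellings, so the smallest attainable P value is 1/24; with 22 hosts the number of permutations is already far beyond exhaustive enumeration, and random permutations are used instead.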
Lateral Gene Transfer
• Special case of LGT here: between hosts and their viruses
• Viruses are known as "bags of genes" or "gene robbers", stealing genes from their hosts
• Suspected to be vectors of gene transfers between eukaryotes
General methodology for identifying LGT
• Define candidate genes for transfer via BLAST: present in host and viruses
• Find the same genes in different taxa (using BLAST, GenBank, ...): make a dataset with the most closely related hits (BLAST), reference taxa, and the candidate gene in host and virus
• Align sequences and make a tree
• Look at the tree to identify LGT
Host-virus LGT in OtV5?
• BLAST each viral ORF against the host genome and keep ORFs meeting specific criteria (AA ID > 45% on > 50 AA)
• Blastp against GenBank nr, keep all viral ORFs with the host in the 50 best BLAST hits, and get these BBHs
• Keep these sequences if similar known gene function in Phycodnaviruses
• 6 candidates for LGT
• Make a phylogenetic tree for each candidate, adding host and virus sequences in the alignment + other BBHs (and reference sequences)
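The first filtering step above (AA ID > 45% on > 50 AA) can be sketched as a small parser for BLAST tabular output. The sketch assumes the standard 12-column tabular format, in which column 1 is the query, column 3 the percent identity and column 4 the alignment length; the function name and the input as a list of lines are hypothetical.

```python
def candidate_orfs(blast_lines, min_identity=45.0, min_length=50):
    """Viral ORFs with at least one host hit above both thresholds.

    blast_lines: lines of tab-separated BLAST output (assumed standard 12 columns:
    query, subject, % identity, alignment length, ...). Comment and empty lines are skipped.
    """
    keep = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, identity, length = fields[0], float(fields[2]), int(fields[3])
        if identity > min_identity and length > min_length:
            keep.add(query)
    return sorted(keep)
```

The ORFs returned here are only the starting list; the later steps (best BLAST hits against GenBank nr, then one tree per candidate) decide whether a transfer is actually supported.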
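[Figures: phylogenetic trees of the candidate genes, each annotated with a verdict on LGT]
• Pyrophosphatase: NO
• GDP-mannose: NO
• Unknown function: ?
• Topoisomerase: NO
• Ribonucleoside-diphosphate reductase: NO
• Remaining candidate: ? Maybe...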
[Figure: phylogenetic trees (scale bars 0.1) with node support values; tips include host algae (Ostreococcus lucimarinus, Ostreococcus tauri, Bathycoccus, Micromonas CCMP1545 and RCC299), other eukaryotes, archaea, bacteria, prasinoviruses (OtV1, OtV5, OlV1, MpV1, BpV1, BpV2) and other large DNA viruses (EhV86, CeV, Mimivirus, chlorella viruses)]
Prasinovirus genomes
• Phylogenetic tree of an "evolutionary marker gene", the DNA polymerase
• Zoom based on the concatenation of 5 genes (same for hosts and viruses)
• Different AA metabolism in related viral genomes
[Figure annotations: "Only in MpV and OtV: LGT?" (twice); "Only in OtV: LGT?"]
• An HSP70 gene exists only in the BpV genome: LGT from its host?
• This gene is thought to have been frequently and independently recruited in taxonomically distant viruses: what happened here?
➡ ... Find out for yourself!
LGT of inteins in prasinoviruses
• Virus-to-virus transfer
• Inteins are selfish genetic elements that can insert into a gene and disrupt its sequence without affecting its activity (they self-excise after translation)
• Scattered phylogenetic distribution
• Comparison with reference tree (from DNA polymerase)
Figures from Clerissi, Grimsley and Desdevises (2012), Evolution 67-1: 18–33, doi:10.1111/j.1558-5646.2012.01738.x
FIG. 3. Phylogenetic tree of polB sequences belonging to Prasinoviruses. The phylogenetic tree was built using only the PolB sequences of intein-containing viruses and from reference sequences lacking inteins, BI (codon model) and ML (codon model; 297 nucleotides; 100 bootstrap replicates). Numbers show posterior probabilities (BI) and bootstrap proportions [...]
• Array of methods
  • Base composition
  • Codon usage
  • Tree comparison
  • Tree reconciliation
FIG. 4. GC content (A) and NC codon usage statistic (B) of the PolB gene and associated inteins. Symbols distinguish BpV-like (Bathycoccus virus-like, ■), OtV-like (Ostreococcus virus-like) and MpV-like (Micromonas virus-like) sequences. The straight lines have a slope of 1 and correspond to PolB-intein couples without recent transfer signal.
FIG. 5. Tanglegram of PolB and associated inteins belonging to Prasinovirus. Phylogenetic trees were built using codon models in BI and ML (426 positions for the PolB tree; 438 positions for the inteins tree; 100 bootstrap replicates). Numbers show posterior probabilities (BI) and bootstrap proportions (ML) reflecting clade support. Trees were rooted according to Fig. 3. PolB and intein trees are shown on the left and on the right, respectively.
FIG. 6. Cophylogenetic scenario. Black and grey trees represent PolB and intein sequences, respectively. ●: Switch; ○: Codivergence; - - -: Loss. This scenario was produced with Jane 3 using the following costs: Codivergence: 0; Duplication: 1; Switch: 1; Loss/Sorting: 2; Failure to diverge: 1.
• GC content and codon usage point out (recent) transfers
• Cophylogenetic analysis identifies the players
• Identification of probable LGTs
  • between OtV
  • between putative BpV
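Event-based reconciliation ranks scenarios by their total cost under an event cost scheme such as the one quoted in the FIG. 6 caption (Codivergence 0, Duplication 1, Switch 1, Loss/Sorting 2, Failure to diverge 1). The sketch below only does this arithmetic for a given set of event counts. It does not search for scenarios as Jane does, and the event counts in the usage note are invented for illustration.

```python
# Event costs as quoted in the FIG. 6 caption; the dictionary keys are names chosen for this sketch.
JANE_COSTS = {
    "codivergence": 0,
    "duplication": 1,
    "switch": 1,
    "loss": 2,
    "failure_to_diverge": 1,
}


def scenario_cost(event_counts, costs=JANE_COSTS):
    """Total cost of a reconciliation scenario given the number of events of each type."""
    unknown = set(event_counts) - set(costs)
    if unknown:
        raise ValueError(f"unknown event type(s): {sorted(unknown)}")
    return sum(costs[event] * count for event, count in event_counts.items())
```

For example, a hypothetical scenario with 5 codivergences, 2 switches and 1 loss costs 0 + 2 + 2 = 4 under this scheme; a scenario explaining the same trees with more losses and fewer switches would be compared on the same scale.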
GENETIC EXCHANGES OF INTEINS BETWEEN PRASINOVIRUSES (PHYCODNAVIRIDAE)
Camille Clerissi (1,2,3), Nigel Grimsley (1,3), and Yves Desdevises (1,3)
1 UPMC Univ Paris 06, UMR 7232, Observatoire Océanologique, Avenue du Fontaulé, 66650, Banyuls-sur-Mer, France
2 E-mail: [email protected]
3 CNRS, UMR 7232, Observatoire Océanologique, Avenue du Fontaulé, 66650, Banyuls-sur-Mer, France
Received April 3, 2012
Accepted June 29, 2012
Phylogenetic diversity in the Phycodnaviridae (double-stranded DNA viruses infecting photosynthetic eukaryotes) is most often studied using their DNA polymerase gene (PolB). This gene and its translated protein product can harbor a selfish genetic element called an “intein” that disrupts the sequence of the host gene without affecting its activity. After translation, the intein peptide sequence self-excises precisely, producing a functional ligated host protein. In addition, inteins can encode homing endonuclease (HEN) domains that permit the possibility of lateral transfers to intein-free alleles. However, no clear evidence for their transfer between viruses has previously been shown. The objective of this paper was to determine whether recent transfers of inteins have occurred between prasinoviruses (Phycodnaviridae) that infect the Mamiellophyceae, an abundant and widespread class of unicellular green algae, by using DNA sequence analyses and cophylogenetic methods. Our results suggest that transfer among prasinoviruses is a dynamic ongoing process and, for the first time in the Phycodnaviridae family, we showed a recombination event within an intein.
KEY WORDS: Gene conversion, lateral gene transfer, Mamiellophyceae, recombination, virus.
Microbes are at the base of food networks in the oceans and thus they shape the structure and function of ecosystems (Azam et al. 1983). Marine viruses are one order of magnitude more abundant than the microbial hosts they predate (Suttle 2005). Hence, they have a strong influence on biogeochemical cycles and on the community structure of microorganisms (Proctor and Fuhrman 1990; Thingstad and Lignell 1997). The majority of them are prokaryotic viruses (bacteriophages) (Suttle and Chan 1993; Sullivan et al. 2003; Weinbauer 2004), but there is increasing interest in eukaryotic viruses, especially because (1) picoeukaryotes make a large contribution to planktonic primary production (Moon-van der Staay et al. 2001), (2) some eukaryotic algae are toxic and kill shellfish (Tarutani et al. 2001), and (3) aquatic environments are likely to harbor an uncharacterized diversity of large algal viruses that have recently been detected by virtue of their DNA sequence similarities to Mimivirus, the largest virus ever discovered (La Scola et al. 2003; Monier et al. 2008). Thus, these algal
viruses were included in the Mimiviridae family (Monier et al. 2008; Fischer et al. 2010), recently proposed to be reclassified as the Megaviridae family (Arslan et al. 2011). To date, the most extensively studied viruses infecting eukaryotic plankton belong to the Phycodnaviridae and Megaviridae families (Van Etten et al. 2002; Dunigan et al. 2006; Iyer et al. 2006; Monier et al. 2008; Fischer et al. 2010; Arslan et al. 2011). Both families contain large, double-stranded DNA viruses (sometimes called “Giruses,” short for Giant viruses [Claverie et al. 2006; Claverie and Ogata 2009; Ogata et al. 2009; Forterre 2010]), and form monophyletic groups within the nucleocytoplasmic large DNA viruses (NCLDV; Iyer et al. 2006; Fischer et al. 2010). Although five marine plankton-infecting viruses have been described so far for the Megaviridae (Phaeocystis phouchetii virus [PpV-01], Jacobsen et al. 1996; Chrysochromulina ericina virus [CeV-01], Pyramimonas orientalis virus [PoV-01], Sandaa et al. 2001; Cafeteria roenbergensis virus [CroV], Fischer et al. 2010;
© 2012 The Author(s). Evolution © 2012 The Society for the Study of Evolution. Evolution 67-1: 18–33