
Evolutionary genetics of the self-incompatibility in Solanaceae and Papaveraceae

ProQuest Dissertations and Theses, 2009
Dissertation
Author: Timothy Paape

Table of Contents

Signature Page
Table of Contents
List of Tables
List of Figures
Acknowledgements
Vita and Publications
Abstract of the Dissertation

Chapter I
A 15 Million Year Old Genetic Bottleneck
  Abstract
  Introduction
  Materials and Methods
    Genealogy of S-alleles from Solanaceae
    Species Phylogeny and Divergence Time
  Results
  Discussion
  Acknowledgements
  References
  Reprint Acknowledgement

Chapter II
Differential Selection in S-RNases in the genera Physalis and Solanum
  Abstract
  Introduction
  Materials and Methods
    Sequences and Phylogeny Construction
    Selection Estimates
  Results
  Discussion
  Acknowledgements
  References

Chapter III
Evolutionary Genetics of S-locus in Papaveraceae
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
  Acknowledgements
  References

List of Tables

Chapter I

Table 1. Prior probability and posterior probability distribution estimates of the Solanaceae phylogeny

Chapter II

Table 1. Average pairwise nucleotide divergence among S-alleles
Table 2. Rate distributions of non-synonymous and synonymous substitutions in each dataset
Table 3. Sites predicted to be under differential positive selection using the Bayesian ratio of omegas test and the fixed effects likelihood (FEL) test
Table 4. Estimates of average non-synonymous (dN) and synonymous (dS) substitutions at terminal branches

Chapter III

Table 1. Results of hand pollinations among Argemone munita full sibs with matching genotype
Table 2. Results of hand pollinations of Platystemon californicus full sibs with matching genotypes
Table 3. Statistics of combined full sib crosses from each species
Table 4. Average pairwise nucleotide divergence among sequences within genera
Table 5. Results of PAML analysis of positive selection on 5’ sequences from Papaver, Argemone and Platystemon
Table 6. Results of non-parametric permutation analyses of recombination and coalescent likelihood estimates of ρ = 4N_e r
Table 7. Simulation results testing the power to detect recombination under variable levels of genetic diversity

List of Figures

Chapter I

Figure 1. Maximum likelihood phylogeny of 72 S-alleles from Solanaceae
Figure 2. Bayesian consensus species phylogeny and divergence time estimates of the Solanaceae based on sequence data from 2 chloroplast genes

Chapter II

Figure 1. Phylogeny of Physalis and Solanum S-RNases
Figure 2. Posterior probability scores of sites predicted to be under positive selection in Physalis and Solanum
Figure 3. Bayesian estimate of the ratio of omega values
Figure 4. Contrast of point estimates of dN/dS for Physalis and Solanum for sites that were found to have omega ratios significantly above 1
Figure 5. Fixed effects likelihood comparisons of non-synonymous (dN) substitutions at sites predicted to be under significantly different selection pressures
Figure 6. Power test for Fixed Effects Likelihood test of positive selection on Physalis and Solanum datasets

Chapter III

Figure 1. Amino acid alignment of all sequences containing 5’ ends
Figure 2. Maximum likelihood phylogeny of known Papaver S-alleles and putative S-alleles from Alaskan and Californian taxa
Figure 3. Sliding window analysis of average pairwise nucleotide diversity of putative S-alleles with 5’ ends
Figure 4. Sliding window analysis of average pairwise nucleotide diversity of putative S-alleles with 3’ ends
Figure 5. RT-PCR of putative S-locus genes from stigmatic, stem and leaf tissues using allele-specific forward primers and a universal amplification reverse primer
Figure 6. Posterior probabilities for codons estimated to be under positive selection using OmegaMap

Acknowledgments

Thanks to Michael Lawrence and VE Franklin-Tong for paving the way for SI studies in poppies, and to Gary Hannan for work with Platystemon and assistance with population locations. Thanks to all of the UC Reserve staff who helped early on to find plant populations and assisted in seed collection, especially Mark Stromberg at Hastings and Daniel Dawson and Kim Rose at SNARL. To all of our collaborators (outside of the Kohn lab) on the bottleneck paper, Stacey Smith, Richard Olmstead and Lynn Bohs: thank you for sharing your data, samples and knowledge. Boris Igic and Matt Streisfeld were instrumental in orienting me in the lab as well as teaching me lots of great molecular skills. Special thanks to Sergei Kosakovsky Pond for sharing so much of your expertise, time and beer during our discussions of the comparative selection paper. Thanks also to Danny Wilson for responding to so many emails and sharing code for OmegaMap and R. Thanks to Jill Miller for a great collecting trip down the Baja Peninsula to collect Lycium. Thanks to Ed Newbigin for all of the insight into pollen S in Solanaceae. I had the good fortune to be well funded throughout my research from many sources: the Jeanne Messier Memorial Fund, the California Desert Research Fund, the Mildred Mathias Fund, the National Science Foundation, and James Crow, for his generous fellowship. Thanks to my advisor Josh Kohn for all of the advice (well most of it anyway), discussion, the lab space, grueling hours of editing and writing, letters of support, securing resources and encouragement over the past five years. Thanks to my parents for all of the amazing support over the years. Thanks to my wife Amy for riding the roller coaster of events, both real and imaginary, throughout my Ph.D., and for all of your comforting, care and love…and most of all, patience. A special thanks to Yogananda for your garden and for making Encinitas a wonderful sanctuary during my stay in San Diego. And of course all of the other blessings as well…

The text of Chapter I, in full, is a reprint of the material as it appeared in Molecular Biology and Evolution. I was the primary researcher and author. The co-authors listed in this publication contributed species level sequence data and plant material, and Dr. Kohn directed and supervised the research included in this chapter.

Curriculum vitae

Timothy Paape, Ph.D.

A) Professional Preparation:

2002 B.S. Biology, Fort Lewis College, Durango, CO
2003-2004 Lab Technician, Plant Sciences, University of Minnesota, St. Paul, MN
2009 Ph.D. Biological Sciences, University of California San Diego, CA

B) Appointments

2004-2009 Teaching Assistantship UC San Diego

C) Dissertation Subject

a) Genetics of S-locus polymorphism in three native California species (Papaveraceae)

b) Evolutionary genetic history and differential selection of S-RNases across the genera

D) Publications

Newbigin, E., T. Paape, J.R. Kohn. 2008. RNase based self-incompatibility: Puzzled by Pollen S. Plant Cell. 20:2286-2292

Paape, T., B. Igic, S. Smith, R. Olmstead, L. Bohs, J.R. Kohn. 2008. A 15-Million-Year-Old Genetic Bottleneck at the S-locus of the Solanaceae. Mol. Biol. Evol. 25: 655-663

Silverstein, K., M. Graham, T. Paape, and K.A. VandenBosch. 2005. Genome Organization of More Than 300 Defensin-Like Genes in Arabidopsis. Plant Physiology. 138: 600–610


Abstract of the Dissertation

Evolutionary Genetics of Gametophytic Self-Incompatibility in Solanaceae and Papaveraceae

by

Timothy Paape

Doctor of Philosophy in Biological Sciences
University of California, San Diego, 2009
Professor Joshua R. Kohn, Chair

Flowering plants are able to avoid inbreeding by several genetically based mechanisms. Gametophytic self-incompatibility (GSI) occurs when pollen is rejected in the style or on the stigma if it possesses a matching allele with either of the ovule parent’s S-alleles. This mechanism typically involves a single genetic locus that is highly polymorphic within populations and species. S-alleles are maintained by strong negative frequency dependent selection that essentially favors alleles when they become rare in a population. This type of balancing selection preserves variation at the S-locus for millions of years, enabling us to infer ancient demographic patterns through phylogenetic analyses of genealogies of S-alleles. GSI has been described in several taxa of Solanaceae but only one genus of Papaveraceae, the genus Papaver. Although the molecular mechanisms of self-recognition in these respective families differ remarkably, the underlying theoretical predictions regarding their genetics and evolution are expected to be similar. In Chapter I of this dissertation, I first explore the evolutionary history of a genetic bottleneck in the Solanaceae. Self-incompatible species in the sister genera Physalis and Witheringia share restricted variation at the S-locus indicative of an ancient bottleneck that occurred in a common ancestor. Using phylogenetic approaches to look at S-allele variation in species of the subtribe Iochrominae, the clade containing Physalis and Witheringia, we are able to determine when this bottleneck event occurred. We then use two chloroplast markers, fossil calibrations and a Bayesian relaxed molecular clock approach to determine the approximate date of the bottleneck. In Chapter II, I examine the molecular evolution of individual codons from S-alleles from the bottlenecked lineages of Physalis compared to those of the non-bottlenecked lineages of Solanum. Because Physalis S-RNases appear to have diversified more recently than those of Solanum, we find significantly different patterns among amino acids undergoing positive selection using maximum likelihood phylogenetic and Bayesian coalescent methods. (increase in subst. at 4 fold degenerate sites or 3rd position (general synonymous relative to 1-2nd pos.) HyPhy.docs, overall dS increase in Physalis relative to Solanum; overall increase in dN in Physalis?) In Chapter III, I explore the genetics of a putative S-locus polymorphism in three previously uncharacterized species of Papaveraceae native to California. Analyses of putative S-allele sequences from A. munita, P. californicus and Romneya coulteri sampled from natural populations have shown that each harbors substantial genetic polymorphism homologous to stylar S-alleles from Papaver rhoeas. These genes appear to be expressed only in female reproductive tissues as expected for stylar S-locus products. In A. munita and P. californicus, greenhouse crosses among full sibs with matching putative S-genotypes usually don’t result in seed set while crosses among individuals with non-matching genotypes almost always do. In addition, potential duplications at or near this locus have been detected in diploid P. californicus. Contrary to other well known SI systems, allelic genealogies from Papaveraceae show a general pattern of monophyletic clustering according to species. Genealogies of these species’ S-alleles and those from newly sequenced Papaver alleles show general patterns of monophyletic clustering. The reduced levels of trans-specific polymorphism may be explained by founder events or population bottlenecks in each of the species, though other possibilities must also be considered. We also employ maximum likelihood models to estimate positive selection among putative alleles from these taxa.

Chapter I

A 15-Myr-Old Genetic Bottleneck

Abstract

Balancing selection preserves variation at the self-incompatibility locus (S-locus) of flowering plants for tens of millions of years, making it possible to detect demographic events that occurred prior to the origin of extant species. In contrast to other Solanaceae examined, self-incompatible species in the sister genera Physalis and Witheringia share restricted variation at the S-locus indicative of an ancient bottleneck that occurred in a common ancestor. We sequenced 14 S-alleles from the subtribe Iochrominae, a group that is sister to the clade containing Physalis and Witheringia. At least six ancient S-allele lineages are represented among these alleles, demonstrating that the Iochrominae taxa do not share the restriction in S-locus diversity. Therefore the bottleneck occurred after the divergence of the Iochrominae from the lineage leading to the most recent common ancestor of Physalis and Witheringia. Using cpDNA sequences, three fossil dates, and a Bayesian relaxed molecular clock approach, the crown group of Solanaceae was estimated to be 51 MY old and the restriction of variation at the S-locus occurred 14.0-18.4 MY before present. These results confirm the great age of polymorphism at the S-locus and the utility of loci under balancing selection for deep historical inference.

Introduction

Theoretical population genetic studies of balancing selection predict that it will greatly increase the coalescence time of allelic polymorphism relative to neutral variation (Takahata 1990; Vekemans and Slatkin 1994). This prediction has been confirmed by studies of self-recognition loci such as the MHC loci of jawed vertebrates (Klein et al. 1993), and the mating compatibility loci of both fungi (Muirhead et al. 2002) and plants (Ioerger et al. 1990; Richman and Kohn 2000; Castric and Vekemans 2004). In all of these systems, the time to coalescence of allelic variation is far older than extant species. Loci under balancing selection can therefore provide evidence of historical genetic and demographic events that far predate current species, a utility that has been termed “molecular paleo-population biology” (Takahata and Clark 1993).

In many flowering plants, self-incompatibility (SI) systems allow hermaphroditic individuals to recognize and reject their own pollen in favor of pollen from other individuals, thus avoiding the deleterious effects of self-fertilization (de Nettancourt 1977). In single-locus gametophytic SI, as found in the Solanaceae (nightshade family) studied here, a match between the S-allele carried by the haploid pollen grain and either of the S-alleles in the diploid style triggers pollen tube rejection, preventing self-fertilization and also cross-fertilization if the cross-pollen grain carries either allele found in the female parent. In such systems, rare alleles have a selective advantage because they are compatible with more mates (Wright 1939). Selection favoring rare alleles is quite strong, even with large numbers of alleles segregating in populations. For instance, a new pollen S-allele entering a population that already contains 20 alleles has an 11.1% male mating advantage (Clark 1993).

Strong negative frequency-dependent selection is responsible for the two outstanding features of S-locus polymorphism. First, dozens of alleles occur in natural populations with alleles accumulating until a balance is reached between selection favoring rarity and drift causing allele loss (Wright 1939; Lawrence 2000). Second, alleles are often very old because, if any allele drifts towards rarity, selection acts to increase its frequency (Ioerger et al. 1990; Clark 1993).

In the Solanaceae, the S-locus gene responsible for self-pollen recognition and rejection in the female tissue is an RNase (S-RNase hereafter; McClure et al. 1989). The great age of polymorphism at the S-locus is exemplified by the fact that S-RNase alleles from the same diploid individual of Solanaceae often differ at more than 50% of their amino acid sites. In addition, S-RNase alleles from species in different genera often cluster together in phylogenetic analyses, evidence of broadly shared ancestral polymorphism (Ioerger et al. 1990; Richman and Kohn 2000; Igic et al. 2004, 2006; Savage and Miller 2006). Much of the S-locus polymorphism found in SI Solanaceae was present in their common ancestor, which must also have been SI (Igic et al. 2004; 2006).

A striking contrast exists between the sequence diversity of S-alleles from species of the closely allied genera Physalis and Witheringia, and nearly all other Solanaceae whose S-alleles have been sampled (species of Brugmansia, Lycium, Nicotiana, Petunia and Solanum). While the numbers of S-alleles present in Physalis and Witheringia species are similar to those found in other Solanaceae (Lawrence 2000; Savage and Miller 2006; Stone and Pierce 2005; Igic et al. 2007), all 93 S-RNases sequenced from three Physalis (Richman et al. 1996a; Richman and Kohn 1999; Lu 2001) and two Witheringia (Richman and Kohn 2000; Stone and Pierce 2005) species cluster within only three S-allele lineages that pre-date the divergence of Physalis and Witheringia. For other Solanaceae, even small samples of alleles usually represent many more ancient lineages (reviewed in Richman and Kohn 2000; Castric and Vekemans 2004; see also Savage and Miller 2006; Igic et al. 2007). This finding has been interpreted as evidence of an ancient bottleneck that restricted variation at the S-locus in some common ancestor of the genera Physalis and Witheringia. No such restriction is evident at the S-locus of any other sampled SI Solanaceae (Richman et al. 1996b; Richman and Kohn 2000; Richman 2000; Igic et al. 2004; Stone and Pierce 2005; Igic et al. 2006) except for African species of the genus Lycium (Miller et al. in press), whose S-locus shows evidence of a bottleneck associated with colonization of the Old World from America. After the restriction of S-allele diversity in some common ancestor of Physalis and Witheringia, the remaining S-allele lineages diversified leaving the observed pattern of large numbers of S-alleles representing only a restricted number of ancient S-allele lineages.

In this paper we date the historical restriction of S-locus variation common to Physalis and Witheringia. First, we examine S-locus diversity in the South American monophyletic subtribe Iochrominae, which is found to comprise the sister group of the lineage containing Physalis and Witheringia by Olmstead et al. (in review). We ask whether the Iochrominae share the reduced set of S-allele lineages found in Physalis and Witheringia. If so, then the restriction of diversity at the S-locus predates the most recent common ancestor (MRCA) of the group containing the subtribe Iochrominae as well as Physalis and Witheringia. On the other hand, if the Iochrominae harbor a wide diversity of ancient S-allele lineages, then the restriction at the S-locus must have occurred after the divergence of the Iochrominae from the group containing Physalis and Witheringia but before the MRCA of Physalis and Witheringia. We then generated a cpDNA phylogeny of Solanaceae using a fossil-anchored Bayesian relaxed molecular clock approach, to date the branch along which the bottleneck is shown to have occurred.
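The 11.1% male mating advantage cited in the Introduction from Clark (1993) can be recovered with a short calculation. The sketch below rests on a simplifying assumption that is ours rather than stated in the text: the n resident alleles are equally frequent, so a fraction 2/n of diploid styles carries any given resident allele and rejects pollen bearing it, while pollen carrying a new allele is accepted by essentially every style. The function names are illustrative only.

```python
def resident_compatibility(n_alleles: int) -> float:
    """Fraction of styles accepting pollen that carries one of
    n equally frequent resident S-alleles (assumed model)."""
    if n_alleles < 3:
        raise ValueError("gametophytic SI requires at least three alleles")
    # Each diploid style holds two distinct alleles, so 2/n of styles
    # carry any particular resident allele and reject matching pollen.
    return 1.0 - 2.0 / n_alleles


def rare_allele_advantage(n_alleles: int) -> float:
    """Relative male mating advantage of a new allele that is
    compatible with every style, versus a resident allele."""
    return 1.0 / resident_compatibility(n_alleles) - 1.0


if __name__ == "__main__":
    # 20 resident alleles: 1 / (1 - 2/20) - 1 = 2/18
    print(f"{rare_allele_advantage(20):.1%}")
```

Under this assumption the advantage simplifies to 2/(n - 2), which shows why selection for rarity stays appreciable even when many alleles are segregating.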

Materials and Methods

Plant material and molecular procedures

Stylar tissue from one to four individuals from seven self-incompatible (Smith and Baum 2006a) species from the subtribe Iochrominae (Dunalia brachyacantha Miers, Eriolarynx lorentzii (Dammer) Hunz., Iochroma australe Griseb., Iochroma cyaneum (Lindl.) M. L. Green, Iochroma gesnerioides Miers, Iochroma loxense Miers, and Vassobia breviflora (Sendt.) Hunz.) was collected from plants growing at the University of Wisconsin greenhouse facility. Smith and Baum (2006b) determined these species’ SI status through manual self- and cross-pollinations. Seeds for these taxa were acquired largely from the Solanaceae Germplasm collection at Radboud University, Nijmegen, The Netherlands and a few from offspring of wild-collected individuals. Voucher numbers and accession information are given in Smith and Baum (2006a,b) and in the supplementary online material (http://mbe.oxfordjournals.org/). No large population samples were available for any single species within the Iochrominae. Sampling across species should provide an estimate of S-locus diversity within a group with the caveat that occasionally the same functional S-allele (same specificity) may be sampled from more than one species, making the estimate of the amount of S-allele diversity in the group conservative.

Total RNA was extracted and reverse transcription performed to amplify S-alleles according to methods described by Richman et al. (1995), except for the application of 3’-RACE as in Igic et al. (2007). The forward degenerate primer PR1 (5’-GAATTCAYGGNYTNTGGCCNGA-3’) amplifies from the 5’ end of the conserved region C2 (Ioerger et al. 1991) to the 3’ end of the coding region of the S-RNase cDNA. Products obtained via PCR were cloned using the TOPO TA Cloning Kit (Invitrogen Corp.) to separate alleles at the obligately heterozygous S-locus. Amplified cloned PCR products were screened by restriction digests (ten clones per individual, on average) and sent for automated sequencing by Eton Bioscience Inc., San Diego, CA.
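To make the term "degenerate primer" concrete, the following sketch expands PR1 using the standard IUPAC nucleotide ambiguity codes (Y = C or T, N = any base) and counts how many distinct oligonucleotides the primer pool contains. It is an illustration only and not part of the published protocol; the helper names `degeneracy` and `expand` are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

PR1 = "GAATTCAYGGNYTNTGGCCNGA"  # forward primer sequence as given in the text


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence in the degenerate primer pool."""
    choices = (IUPAC[base] for base in primer)
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(len(PR1), "nt;", degeneracy(PR1), "sequences in the pool")
```

PR1 contains two Y and three N positions, so the pool holds 2 × 4 × 2 × 4 × 4 = 256 sequences; enumeration stays cheap at this size but grows multiplicatively with each additional ambiguous position.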

Genealogy of S-alleles from Solanaceae For phylogenetic analysis of S-RNase sequences from the Iochrominae, additional S-alleles were obtained from GenBank for the following species (number of alleles): Lycium andersonii (10), Nicotiana alata (6), Petunia integrifolia (6), Physalis cinerascens (12), Solanum carolinense (9), and Witheringia solanacea (15; see supplementary online material for Genbank accession numbers). We chose the taxa and allele sequences used for the phylogenetic analysis based on three criteria. First, we aimed for broad taxonomic representation across the Solanaceae. Second, S- RNase sequences had to cover at least the entire region between conserved regions

8 two and five as described by Ioerger et al. (1991). Many sequences in GenBank are shorter and these were discarded. Third, in order to apply maximum-likelihood and Bayesian methods without prohibitively long computation times, we reduced the number of sequences used in the final dataset by first constructing a neighbor-joining tree in PAUP* v4.0b10 (Swofford 2002) using 71 S-allele sequences from Genbank along with our Iochrominae sequences. We then removed one of any intra-specific sister pair of non-Iochrominae alleles with fewer than ten amino acid differences. Twelve alleles were removed in this manner. This should not affect our goal of determining the number of ancient S-allele lineages represented among alleles recovered from the Iochrominae. Alleles that arose prior to the divergence of the Iochrominae are unlikely to fall between any very closely related pair of alleles from within a given species. DNA sequences were manually aligned using BioEdit 7.0.1.4 (Hall 1999) and Se-Al 2.0 (Rambaut 1996) for phylogenetic analysis. Because many S-allele sequences in GenBank do not include the 3’ end of the gene, this region was removed from all sequences leaving 354 bp in the final alignment used for phylogenetic analysis. This represents approximately 62% of the coding region of the S-RNase gene including the hypervariable regions most frequently implicated as involved in specificity determination (Ioerger et al. 1991; Savage and Miller 2006; Igic et al. 2007). We generated a maximum likelihood (ML) tree of S-alleles using PAUP* v4.0b10 (Swofford 2002). Maximum likelihood model parameters were determined using ModelTest 3.0 (Posada and Crandall 1998). The Akaike Information Criterion

(Akaike 1974) best-fit model (TVM+I+Γ) was used to heuristically search for the ML phylogeny. One S-RNase from Antirrhinum hispanicum (Plantaginaceae; Xue et al. 1996) was used as the outgroup. Bootstrap values were generated using a maximum-likelihood heuristic search of 1000 replicates, using the same base frequencies found above, to produce a 50% majority-rule consensus tree.

We also used MrBayes v3.1.1 (Ronquist and Huelsenbeck 2003) to generate a 50% majority-rule consensus tree for comparison with the ML tree. Bayesian analysis was run using four simultaneous Markov chain Monte Carlo chains (three heated and one cold) with a GTR+Γ substitution model with rate variation across sites. The analysis was run for 1,000,000 generations, sampling every 100th generation for a total of 10,000 trees. After determining stationarity, the initial 2,501 trees were discarded as burn-in. The remaining trees represent generations 250,001 to 1,000,000 (7,500 trees), on which posterior probabilities were calculated.

Species phylogeny and divergence time estimation

The chloroplast sequences used for species divergence time estimation represent a subset of a much larger sample (200 species) of Solanaceae (Olmstead et al. in review). To reduce the computational time needed for divergence time estimation, we limited our taxonomic sample to 29 Solanaceae representing only genera from which S-alleles have been sampled, or genera that represent basal nodes in the diversification of the Solanaceae (e.g. Schizanthus and Cestrum; Olmstead and Sweere 1994; Olmstead et al. in review) but from which no S-locus information is currently available. An alignment of sequences from two chloroplast regions, ndhF coding (2116 bases) and trnL-trnF coding and intergenic spacer sequences (1377

bases) was used for a combined total of 3488 bases. For outgroup comparison and root placement we included ndhF and trnL-trnF sequence information from two species of Convolvulaceae (Ipomoea batatas and Convolvulus arvensis), which is considered to be the sister family to the Solanaceae (Olmstead and Sweere 1994). Because our small taxonomic sample could lead to erroneous estimation of relationships, we assumed the topological constraints (ordering of generic divergences) found in the larger phylogenetic analysis of Olmstead et al. (in review), all of which receive ≥ 90% bootstrap support.

Likelihood ratio tests (Felsenstein 1988) were used to determine whether the sequence data conformed to the expectation of a molecular clock. Maximum-likelihood analyses with and without the enforcement of a clock were performed using PAUP* on the constrained topology for each gene separately and on the combined dataset (both genes). The 2-parameter HKY85 model (Hasegawa et al. 1985) was selected with a four-category gamma distribution of rates across sites estimated from the data. Base frequencies, the transition/transversion ratio, and the gamma distribution shape parameter were estimated while running the ML analyses. For each partition, the null model was HKY85 + I + Γ with a clock enforced (HKY85 + I + Γ + c) and the alternative model was HKY85 + I + Γ without a clock; the test has N − 2 degrees of freedom, where N is the number of terminal sequences. The likelihood ratio test statistic, $\Lambda = -2(\ln L_{\text{clock}} - \ln L_{\text{no clock}})$, was assumed to follow a $\chi^2$ distribution.

Because the data do not conform to a strict molecular clock (see results), a Bayesian method (Thorne et al. 1998; Thorne and Kishino 2002; Drummond et al.

2006) of relaxing this assumption was used to estimate divergence times among species. The program BEAST v1.4 (Drummond and Rambaut 2003) implements both exponential and lognormal uncorrelated estimates of nucleotide substitution rates along the lineages of a phylogeny using a Markov chain Monte Carlo simulation process. The Bayesian method of Drummond and Rambaut (2003) also allows the user to specify uncertainty in fossil dates using soft-bound priors, which is not possible using likelihood methods of divergence time estimation (Yang 2006).

Our prior probability parameters were as follows. We assumed the HKY85 + Γ model of nucleotide substitution with a proportion of invariant sites estimated from the sequence data. We fixed the mean substitution rate at the root node to be 0.0007 substitutions per million years, consistent with estimated coding and non-coding rates of cpDNA evolution (Palmer 1991; Schnabel and Wendel 1998). We assumed an uncorrelated lognormal relaxed model of rate heterogeneity among branches and a Yule prior model of speciation. The software also allows the user to calibrate specific nodes on the phylogeny to estimated fossil dates, along with confidence intervals, as priors. We used two fossil dates within the Solanaceae (Solanum-like and Physalis-like seeds from the mid-Miocene) and a Lower Eocene Convolvulaceae fossil (Benton 1993) as prior constraints on particular nodes (Table 1). Based on these fossils, we assumed normally distributed priors with a mean of 10 MY (SD = 4.0 MY) for the age of both Solanum and Physalis and a mean of 52 MY (SD = 5.2 MY) for the outgroup (Convolvulaceae) divergence (Magallón et al. 1999). The standard deviations on the priors represent the upper and lower bounds of the geological epochs from which the fossils were obtained.
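To make the calibration priors above more concrete, the following sketch computes the central 95% interval implied by each normal prior. It is only an illustration: the function name `prior_interval` is hypothetical, and the choice of a central 95% interval is an assumption made here for display, not a setting reported in the text.

```python
from statistics import NormalDist


def prior_interval(mean_my, sd_my, mass=0.95):
    """Central interval (in MY) containing `mass` of a normal prior.

    Hypothetical helper for illustration; it does not reproduce any BEAST setting.
    """
    prior = NormalDist(mu=mean_my, sigma=sd_my)
    tail = (1.0 - mass) / 2.0
    return prior.inv_cdf(tail), prior.inv_cdf(1.0 - tail)


# Means and standard deviations as stated in the text above.
solanum_physalis = prior_interval(10.0, 4.0)   # Solanum and Physalis nodes
convolvulaceae = prior_interval(52.0, 5.2)     # outgroup divergence
```

Under these assumptions, the 10 MY prior places most of its mass between roughly 2 and 18 MY, and the 52 MY prior between roughly 42 and 62 MY.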

We constrained the starting tree and all subsequent trees in the MCMC analysis to conform to the topology estimated by Olmstead et al. (in review). This preserves species relationships but allows for variation in node heights, which translate to ages in millions of years. The MCMC was run twice, each time for 5,000,000 generations, sampling every 500th generation with a burn-in phase of 500,000 generations for each run. The two runs were checked for convergence, and the posterior age distributions of the nodes of interest were analyzed using Tracer v1.3 (Rambaut and Drummond 2004). The estimated node ages for both runs were combined and resampled at a frequency of every 1000th tree, providing a sample of 10,000 trees. The time between the MRCA of Physalis and Witheringia and the MRCA of those genera plus the Iochrominae was estimated by subtracting the relevant node ages for each of the 10,000 samples. The results of the MCMC procedure are given as the mean and the 95% highest posterior density (HPD) intervals in millions of years. The mean and standard deviation of the duration of this branch were calculated from these values. Trees from both runs were combined to produce an ultrametric consensus tree using FigTree v1.0 (Rambaut 2006).

It should be noted that although the Bayesian program MULTIDIVTIME (http://statgen.ncsu.edu/thorne/multidivtime.html) does not allow soft-bound prior distributions on fossil dates, similar estimates for ingroup divergences were achieved using the above priors. We present the analysis using BEAST because it facilitates the estimation of the duration and associated error of the branch during which the restriction of variation at the S-locus occurred (see below).
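The per-sample subtraction of node ages described above can be sketched in Python. This is a minimal illustration rather than the BEAST or Tracer procedure itself: the function names are hypothetical, and the HPD routine uses the common shortest-interval definition, which is an assumption about how the intervals were computed.

```python
import math
import statistics


def branch_durations(older_node_ages, younger_node_ages):
    """Subtract paired posterior node ages, sample by sample (ages in MY)."""
    if len(older_node_ages) != len(younger_node_ages):
        raise ValueError("both nodes must have one age per posterior sample")
    return [old - young for old, young in zip(older_node_ages, younger_node_ages)]


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


def summarize(durations):
    """Mean, sample standard deviation and 95% HPD of the branch durations."""
    return (statistics.mean(durations),
            statistics.stdev(durations),
            hpd_interval(durations, 0.95))
```

Subtracting within each posterior sample, rather than subtracting the two mean node ages, keeps the correlation between the two node ages, so the standard deviation and HPD of the branch duration reflect the joint posterior.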

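The likelihood ratio test of the molecular clock described in the methods above can be sketched as follows. This is a minimal, standard-library illustration, not the PAUP* workflow: the function names are hypothetical, and the chi-square tail probability is computed with the closed-form series for integer degrees of freedom instead of a statistics package.

```python
import math


def lrt_statistic(lnl_clock, lnl_no_clock):
    """-2 * (ln L under the clock - ln L without the clock)."""
    return -2.0 * (lnl_clock - lnl_no_clock)


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    if x == 0:
        return 1.0
    chi = math.sqrt(x)
    phi = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)  # standard normal density at chi
    if df % 2 == 1:
        # Odd df: normal tail plus a finite series in odd powers of chi.
        total, term = 0.0, chi
        for r in range(1, (df - 1) // 2 + 1):
            if r > 1:
                term *= x / (2 * r - 1)
            total += term
        return math.erfc(chi / math.sqrt(2.0)) + 2.0 * phi * total
    # Even df: finite series in powers of x / 2.
    total, term = 1.0, 1.0
    for r in range(1, (df - 2) // 2 + 1):
        term *= x / (2 * r)
        total += term
    return math.sqrt(2.0 * math.pi) * phi * total


def clock_test(lnl_clock, lnl_no_clock, n_terminals):
    """Return (statistic, degrees of freedom, p-value) with df = N - 2."""
    stat = lrt_statistic(lnl_clock, lnl_no_clock)
    df = n_terminals - 2
    return stat, df, chi2_sf(stat, df)
```

As a usage example, `clock_test(-6867.69, -6830.92, 31)` treats the two ndhF values quoted in the Results as negative log-likelihoods with and without the clock, and takes 31 terminal sequences (29 Solanaceae plus two outgroup species). Both readings are assumptions made for illustration. The call returns a statistic of about 73.5 and a very small p-value, in line with the rejection of the clock reported in the Results.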

Results

S-allele genealogy

A total of 14 different alleles from 15 individuals from the seven Iochrominae species were successfully amplified and sequenced. The low number of alleles relative to the number of individuals sampled resulted from two causes. First, several individuals shared common alleles. For example, four individuals of Iochroma australis and four of Eriolarynx lorentzii possessed only three different alleles per species. Our sample of plants was derived from small germplasm collections that likely contain lower S-locus diversity than would be found in nature. Second, only one allele was successfully isolated from two individuals.

As found in previous studies, the genealogy of Solanaceae S-alleles shows extensive shared ancestral polymorphism among most species (Figure 1B). The S-alleles of each species of Petunia, Nicotiana, Lycium and Solanum represent five to seven lineages that arose before the divergence of these genera. This is true even though only a subset of available alleles and species were included to simplify the analysis. In contrast, all alleles from Physalis cinerascens and Witheringia solanacea fall within only three lineages that pre-date the MRCA of these two genera. Previous studies that incorporated additional S-alleles and species have consistently found the same result (Richman et al. 1996; Richman and Kohn 2000; Lu 2001; Stone and Pierce 2005).

Despite the limited sampling of Iochrominae alleles, several observations can be made. First, in two cases, very similar alleles were recovered from different

species of Iochrominae. These close pairs (E.lor1 and I.lox2, I.cya1 and I.ges2) differ by 2 and 3 amino acid residues, respectively, over the region compared and may represent sequence divergence within a specificity that arose after species divergence. Therefore, the 14 Iochrominae alleles sampled may represent fewer than 14 specificities. Among this set, we recovered at least six ancient Iochrominae S-allele lineages, five of which diverged from one another prior to the origin of the genus Solanum (Fig. 1). Alleles from group 1 (Fig. 1) are more closely related to alleles from Solanum than to other Iochrominae alleles. Given uncertainty in the topology in Fig. 1, this group of alleles could represent either one or two S-allele lineages that diverged prior to the origin of Solanum. Iochrominae S-allele lineages 2, 3 and 5 are each sister to different S-alleles from Nicotiana, and group 6 is sister to a pair of alleles from Petunia and Lycium. Iochrominae S-alleles from group 4 are more closely related to alleles from Physalis and Witheringia than to alleles from other sampled genera.

Only one Iochrominae S-allele (I.aus 2) falls within any of the three clades of alleles found in Physalis and Witheringia. The placement of that allele is uncertain; it may be sister to all other members of clade I (Fig. 1). A basal position for this allele would be consistent with diversification of this clade of alleles in Physalis and Witheringia after divergence of the Iochrominae. Most S-alleles recovered from Iochrominae fall neither within, nor sister to, the three clades of alleles represented in species of Physalis and Witheringia.

Species phylogeny and divergence estimates

Likelihood ratio tests strongly rejected the molecular clock for each chloroplast gene individually ($\chi^2$ distributions: ndhF: $\Lambda = 2[6867.69 - 6830.92] = 73.53$,

Full document contains 133 pages