What evolutionary process or processes tend to decrease the variation between separate populations?

Microbial Experimental Evolution

D. Dykhuizen, in Encyclopedia of Evolutionary Biology, 2016

Gene Frequency Change

Mutation, genetic drift, migration, and natural selection are the forces that change gene frequency. In microbial experimental evolution, population size is usually large enough that genetic drift is unimportant. However, in very small populations with a high mutation rate, detrimental mutations can become fixed, decreasing the fitness of the population. If this continues to happen, generation after generation, it is theoretically possible that multiple detrimental mutations are fixed until the population becomes unviable. This idea is called Muller’s ratchet (Muller, 1964). When an RNA virus is passaged through a single plaque, the fitness decreases (Chao, 1990), showing that Muller’s ratchet can affect populations. But as fitness decreases, compensatory mutations that restore fitness become more frequent (Poon and Chao, 2005), reversing the effects of Muller’s ratchet. Thus, it is extremely unlikely that Muller’s ratchet will continue until the population becomes unviable.

URL: https://www.sciencedirect.com/science/article/pii/B9780128000496002328

Evolution, Theory of

Gregory C. Mayer, Catherine L. Craig, in Encyclopedia of Biodiversity (Second Edition), 2013

Speciation

Mutation supplies the raw material of variation. Migration, drift, and selection are the forces that affect the fate of that variation within lineages; with respect to nonneutral variants, it can be said mutation proposes, selection disposes. The four evolutionary forces cause evolutionary changes, and these changes within a single ancestor–descendant lineage are called anagenesis. But if evolution consisted only of anagenesis, there would be only one kind of organism – changing over time, but just one lineage. To produce the great contemporaneous diversity of organic beings that inhabit Earth, there must be a multiplication of lineages. The generation of new lineages is called cladogenesis or speciation (Coyne and Orr, 2004; Price, 2008).

For the study of evolutionary mechanisms, including speciation, the most fruitful concept of species, at least for sexual forms, is the biological species concept (Coyne and Orr, 2004). Under this concept, species are groups of actually or potentially interbreeding populations in nature, which are isolated from other such groups by reproductive isolating barriers. Reproductive isolating barriers include phenomena such as courtship behavior, pollinator specificity, and hybrid sterility, all of which act as barriers to the exchange of genes between species. Under the biological species concept, the question of how species originate is the question of how reproductive isolating barriers arise.

One way to achieve this is to divide a population into two or more spatially separated populations, so that migration – a homogenizing force – is no longer possible between the spatially separated descendant populations. Each population initiates an independent lineage, within which mutation, migration, drift, and selection will convert individual variation into variation in time and, because the independent lineages coexist in time, into variation in space. If this independent anagenesis leads to the populations changing so much that they are reproductively isolated (e.g., their mating calls change to the point where they no longer recognize one another as potential mates), then the descendant populations will be able to coexist in the same place without losing their identity through interbreeding: they will have become separate species.

This scenario of geographic isolation is one mode of speciation, called allopatric speciation. It is the conversion of an initially extrinsic isolation (geographic separation) into intrinsic reproductive isolating barriers. There are other possible scenarios, which involve such possibilities as selection acting to reinforce reproductive isolating barriers, or little or no initial geographic separation (parapatric and sympatric speciation). Allopatric speciation is widely accepted (Mayr, 1942), whereas there remains considerable debate about at least some aspects of other modes of speciation (Coyne and Orr, 2004; Price, 2008).

As speciation is repeated over time, there is a potentially exponential increase in the number of species, and the ramifying, branching history of life over time is the phylogenetic tree. Different branches of the tree will speciate and go extinct at different rates. Some branches will undergo adaptive radiations, in which the varied species will come to occupy a diversity of stations in the economy of nature on scales from the archipelagic (Grant and Grant, 2008; Losos, 2009) to the continental (Lull, 1917). The lineages of the various branches will interact with one another and with their environments, and will disperse over, and move about with, the wandering continents and oceans. This constitutes biotic evolution, the evolutionary history of the set of all organic beings, and this is the history of biodiversity on Earth.

URL: https://www.sciencedirect.com/science/article/pii/B9780123847195000484

GENETICS AND GENETIC RESOURCES | Population, Conservation and Ecological Genetics

C. Mátyás, in Encyclopedia of Forest Sciences, 2004

Directed and Random Genetic Changes in Large Populations

Owing to various, partly random genetic effects, such as natural selection (i.e., genetic adaptation), mutation, genetic drift, gene flow, or unequal sexual contribution of individuals, allele frequencies change over time, unlike in idealized populations in Hardy–Weinberg equilibrium. In large populations, gene flow and adaptation are the prime forces influencing genetic diversity and intraspecific variability.

Hardy–Weinberg Law

In infinite, random-mating populations, allele and genotype frequencies remain constant from generation to generation in the absence of selection, migration, drift, and mutation effects. For two alleles with frequencies $p$ and $q$ at a locus, the equilibrium frequencies of the homozygous ($p^2$ and $q^2$) and heterozygous ($2pq$) genotypes satisfy $p^2 + 2pq + q^2 = 1$.
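As a minimal sketch of the Hardy–Weinberg proportions, the following computes the expected genotype frequencies from one allele frequency. The function name `hardy_weinberg` and the example frequency are illustrative assumptions, not taken from the chapter.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies (AA, Aa, aa) at a two-allele locus.

    p is the frequency of allele A; q = 1 - p is the frequency of allele a.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


# Illustrative allele frequency (an assumption for the example).
freq_AA, freq_Aa, freq_aa = hardy_weinberg(0.7)
print(freq_AA, freq_Aa, freq_aa, freq_AA + freq_Aa + freq_aa)
```

The three values always sum to 1 (up to floating-point rounding), which is the identity stated above.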

Gene Flow

Gene flow describes the spatial movement of genes, typically through seed and pollen dispersal, either within a population or between separated stands. (An alternative term, ‘migration,’ is reserved here for shifts in time of the geographic range of species or populations.) Wind-pollinated species, producing abundant pollen, such as most temperate forest trees, show major gene flow. In favorable weather, pollen clouds may travel hundreds of kilometers and contribute significantly to local pollination (Figure 2).

Figure 2. Density (grains m−2) of pollen in a Scots pine seed orchard. Fifty percent of the female strobili were receptive before the appearance of local pollen. Reproduced with permission from Lindgren D, Paule L, Xihuan S, et al. (1995) Can viable pollen carry Scots pine genes over long distances? Grana 34: 64–69.

Animal-pollinated (mostly tropical) tree species depend on the movement of their pollen vectors. Investigations have shown pollen transport of several kilometers and medium-level gene flow between trees and stands. The very rare apomictic and self-pollinating tree species show the lowest gene exchange rate.

The evolutionary and practical significance of gene flow is high. Its function is to counter genetic drift (random fluctuations in allele frequencies) within the range, to disperse fitness-improving mutant alleles, to maintain high levels of genetic variation and adaptability, and to avert inbreeding in fragmented populations. Gene flow has therefore a decisive role in shaping within-species genetic variation patterns (Table 2), and consequently influences appropriate strategies for forest reproductive material use and conservation.
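The homogenizing effect of gene flow can be illustrated with a deliberately simple deterministic sketch: two populations exchange a fraction `m` of their gene pools each generation, and drift, selection, and mutation are ignored. This two-population model, the function name `gene_flow`, and the parameter values are assumptions made for illustration and are not taken from the chapter.

```python
def gene_flow(p1, p2, m, generations):
    """Allele frequencies in two populations under symmetric gene flow.

    Each generation, a fraction m of each population's gene pool is
    replaced by genes from the other population. Returns the list of
    (p1, p2) pairs, starting with the initial values.
    """
    history = [(p1, p2)]
    for _ in range(generations):
        p1, p2 = (1 - m) * p1 + m * p2, (1 - m) * p2 + m * p1
        history.append((p1, p2))
    return history


# Hypothetical starting frequencies and exchange rate.
for generation, (a, b) in enumerate(gene_flow(0.9, 0.1, 0.05, 40)):
    if generation % 10 == 0:
        print(generation, round(a, 4), round(b, 4), round(abs(a - b), 4))
```

In this sketch the difference between the two populations shrinks by a factor of (1 − 2m) per generation while the mean allele frequency stays constant, so between-population variation decays without any loss of overall variation.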

Natural Selection, Adaptation

The prerequisite of the selective force described by Darwin and Wallace is easily visible in forests, namely conspicuous differences between individual trees in competitive and reproductive ability, the two key components of fitness. Natural selection acts through higher mortality and fewer offspring of less-fit individuals. Consequently, the gene pool of the next generation will be selected for greater fitness.

The shift in genetic variability tends to change the population profiles and over several generations may lead to evolution. Darwinian natural selection is an important, although not exclusive, driving force of evolution, as random effects play an important role too.

Genetic adaptation and fitness

While natural selection explains the ‘statistical’ aspect of selection, adaptation describes the sum of biological processes that safeguard the survival of the population under constantly changing conditions. Some of these mechanisms are nonheritable.

Genetic adaptation operates on populations through fitness selection. Fitness encapsulates the differential effect of many traits expressed during the life cycle. The selective value of individual traits depends on the actual environment, which can change constantly. In environments influenced by humans, e.g., in managed forests or plantations, fitness will be modified (cultivation- or domestic fitness).

Fisher's fundamental theorem of natural selection postulates that the increase of average fitness ($\Delta\bar{W}$) is a function of the average relative fitness of the parent population ($\bar{W}$) and the within-population additive genetic variance ($V_W$):

$$\Delta\bar{W} \approx \frac{V_W}{\bar{W}}$$

i.e., progress of natural selection depends on adaptedness ($\bar{W}$), and adaptability of a population depends on the available genetic variance – if genetic variability is depleted, adjustment to a changed environment becomes difficult or impossible (Figure 3).
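The relationship can be checked numerically in the simplest possible setting. The sketch below assumes an asexual (clonal) population with discrete generations, in which all fitness variance is heritable and the relation holds exactly; that model, the function names, and the example frequencies and fitnesses are assumptions for illustration, not part of the chapter.

```python
def mean_fitness(freqs, fitnesses):
    """Population mean fitness, W-bar."""
    return sum(p * w for p, w in zip(freqs, fitnesses))


def fitness_variance(freqs, fitnesses):
    """Variance in fitness among genotypes, V_W."""
    w_bar = mean_fitness(freqs, fitnesses)
    return sum(p * (w - w_bar) ** 2 for p, w in zip(freqs, fitnesses))


def select(freqs, fitnesses):
    """One generation of selection: genotype i changes by the factor w_i / W-bar."""
    w_bar = mean_fitness(freqs, fitnesses)
    return [p * w / w_bar for p, w in zip(freqs, fitnesses)]


# Hypothetical clonal genotypes: frequencies and fitnesses.
freqs = [0.5, 0.3, 0.2]
fitnesses = [1.0, 0.9, 0.8]

observed = mean_fitness(select(freqs, fitnesses), fitnesses) - mean_fitness(freqs, fitnesses)
predicted = fitness_variance(freqs, fitnesses) / mean_fitness(freqs, fitnesses)
print(observed, predicted)
```

If every genotype has the same fitness, the variance is zero and mean fitness cannot increase, which is the dependence on available genetic variance described above.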

Figure 3. Improvement of the fitness average of populations over time in a theoretical niche. The progress of population 1 is slower toward the fitness maximum (Wmax), because its genetic variability is smaller, and its average fitness is closer to the maximum. Population 2 has a larger genetic load (L), but also the selection pressure is stronger. In practice, the progress is not very effective owing to environmental fluctuation and heterogeneity, resulting in ever-changing fitness optima. The precondition for the improvement of fitness is sufficient genetic variability!

The consequences of reduced genetic diversity on adaptability on species level can be illustrated by the examples of two contrasting boreal pine species, jack pine (Pinus banksiana) and red pine (P. resinosa), which have similar life histories and ecological niches in boreal forests of eastern North America. While the former displays very broad genetic variability, the latter seems practically devoid of diversity. Regarding distributions, red pine has only restricted, fragmented occurrences and is becoming rare in certain areas, while jack pine is the dominant species in many forest associations. The difference in distribution pattern may be attributable to the loss of diversity in red pine, probably through ‘genetic bottlenecks’ of glacial periods.

Variation in Reproductive Fitness: Unequal Sexual Contribution of Individuals

Differences between genotypes in flowering and seeding vary over 10-fold, even among dominant trees in a forest stand. Owing to unbalanced flowering and seeding, neither natural regeneration nor the seed crop collected in a stand or in a seed orchard is genetically identical to the gene pool of the parents. Many genotypes contribute insignificantly to the next generation. The top quartile of genotypes may be represented in over two-thirds, and the bottom quartile in less than 3% of the progeny. Therefore the effective population size (Ne) is usually far smaller (by roughly an order of magnitude) than the total number of individuals, and can be calculated for a monoecious species from the reproductive contribution of each individual (Wi):

$$N_e = \frac{1}{\sum_i W_i^2}$$
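A short sketch of this calculation follows. It assumes that each $W_i$ is an individual's share of the total reproductive output, so that the shares sum to 1; the normalization step, the function name, and the example counts are assumptions for illustration.

```python
def effective_population_size(contributions):
    """N_e = 1 / sum(W_i^2), with W_i the share of total reproductive output.

    `contributions` may be raw counts (e.g., seeds per tree); they are
    normalized to shares here, which is an assumption about how W_i is defined.
    """
    total = sum(contributions)
    if total <= 0:
        raise ValueError("total reproductive contribution must be positive")
    shares = [c / total for c in contributions]
    return 1.0 / sum(w * w for w in shares)


# Hypothetical stand of 10 trees: equal versus strongly unbalanced seeding.
equal = [100] * 10
unbalanced = [600, 200, 100, 40, 20, 15, 10, 10, 3, 2]
print(effective_population_size(equal))
print(effective_population_size(unbalanced))
```

With equal contributions the result equals the census number of individuals; the more unbalanced the contributions, the further $N_e$ falls below it.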

URL: https://www.sciencedirect.com/science/article/pii/B0121451607000855

Optimization in Evolution, Limitations of

H.G. Spencer, in International Encyclopedia of the Social & Behavioral Sciences, 2001

2.7 Selection Need not Optimize

Perhaps the most counterintuitive reason for optimization being limited is that numerous selective regimes do not maximize fitness. While most verbal descriptions of natural selection (including Darwin's) emphasize the optimizing aspect of fitness, mathematical models show that the conditions under which fitness reaches a maximum (even a local maximum) are rather restrictive. Ignoring the effects of genetic drift, migration, mutation and nonrandom mating as described above, constant selection, arising from viability differences among different phenotypes reflecting genetic variation at a single locus, will maximize the mean fitness of the population. If selection pressures vary in any way, if selection arises because of fertility or fecundity differences or if more than one locus is involved, the population's mean fitness need not be maximized. It is even possible to construct simple mathematical models under which mean fitness is minimized. A straightforward verbal scenario (quoted in Gould and Lewontin 1979, p. 592) in which selection acts but does not improve mean fitness concerns an allele for increased fecundity, which selection rapidly fixes in the population. If the population is otherwise unchanged, no more adults are likely to survive than previously, and the size of the population is unchanged (and may even decrease if predators key into the more abundant juveniles). Neither individuals nor the population is optimized.

The non-optimizing property of fertility selection is a reflection of its formal parallels with selective mating: the number of offspring (‘fitness’) is a property of the mated pair, ‘fertility’ for fertility selection and ‘the chance of mating’ for selective mating. Mean fitness can decrease in both sorts of systems. Nevertheless, it can be argued that the sorts of fitnesses that lead to such non-intuitive behavior are not often found in nature. This rarity may arise either because genetic variants bearing such fitnesses arise infrequently (a form of mutational bias) or because they are weeded out by selection in the longer term.

Systems with two or more loci do not optimize mean fitness because of genetic linkage and the non-independence of the effects of genes at the different loci (a property known as epistasis). The recombinatory effects of sexual reproduction prevent the population's mean fitness being maximized. But, again, a case can be made that alleles with strong epistatic interactions are unusual and, as a result, genes (as well as traits) are selectively quasi-independent.

URL: https://www.sciencedirect.com/science/article/pii/B008043076703134X

Detection and Classification of Extracellular Action Potential Recordings

Karim Oweiss, Mehdi Aghagolzadeh, in Statistical Signal Processing for Neuroscience and Neurotechnology, 2010

2.2.2 Unknown Spike

Prior knowledge of the deterministic spike shape, s, or its parameters if stochastic, greatly simplifies the analysis. In practice, however, this information may not be fully available. Spikes may also be nonstationary over short time intervals, for example during bursting periods (Harris et al., 2000; Csicsvari et al., 2003). The shapes can also change over prolonged periods, for example as a result of electrode drift or cell migration during chronic recordings with implanted electrodes (Buzsaki et al., 2004; Gold et al., 2006). In this case, the models H0 and H1 may involve some nuisance (unknown) parameters for which performance depends on how close a chosen signal model is to the real spike waveform. The optimal detector here selects the maximum value of the distribution $p_{Y \mid H_1, s}(y \mid H_1, s)$ with respect to the unknown parameters. This happens when y = s, or

$$\max_{s}\, p_{Y \mid H_1, s}(y \mid H_1, s) = \frac{1}{\sqrt{|2\pi\Sigma|}} \tag{2.9}$$

The other hypothesis does not depend on the signal, and the LRT becomes a generalized LRT (GLRT), often termed the square law or incoherent energy detector; it takes the form

$$y^{T} \Sigma^{-1} y \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \gamma \tag{2.10}$$

A similar interpretation of the noise kernel applies, but now the product involves a whitened observation vector, $y' = D^{-1/2} U^{-1} y$. The test statistic in this case becomes a blind energy detector, where the observations are whitened, squared, and then compared to a threshold.

Performance of this detector can be easily derived by noting that the test statistic becomes the sum of the squares of statistically independent Gaussian variables and is chi-square–distributed (Papoulis et al., 1965). Similar to the analysis in Eq. (2.7), it can be shown that this distribution has the following form under the null hypothesis:

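$$p_{H_0}(T, 0) = \frac{T^{N/2 - 1}\, e^{-T/2}}{\Gamma(N/2)\, 2^{N/2}} \tag{2.11}$$

where $\Gamma(\cdot)$ is the gamma function. Under H1, the test statistic has the noncentrality parameter $\alpha = s^{T}\Sigma^{-1}s$ (Urkowitz, 1967):

$$p_{H_1}(T, s^{T}\Sigma^{-1}s) = \frac{T^{N/2 - 1}\, e^{-(T + \alpha)/2}}{2^{N/2}} \sum_{k=0}^{\infty} \frac{(\alpha T)^{k}}{2^{2k}\, k!\, \Gamma(k + N/2)} \tag{2.12}$$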

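If we denote by $F(T \mid N, \alpha)$ the cumulative distribution function of a noncentral chi-square–distributed random variable T with N degrees of freedom and noncentrality parameter α, these results can help us calculate the performance of the detector in terms of PD and PF as follows:

$$\begin{aligned} P_D &= P_{H_1}(T > \gamma) = 1 - F_{H_1}(\gamma) = 1 - F_{H_1}(\gamma \mid N, s^{T}\Sigma^{-1}s) \\ P_F &= P_{H_0}(T > \gamma) = 1 - F_{H_0}(\gamma) = 1 - F_{H_0}(\gamma \mid N, 0) \end{aligned} \tag{2.13}$$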

Figure 2.4 illustrates the performance of this detector. As in the known signal case, the performance improves when the number of samples N increases and/or the ratio of the average spike power to the average noise power increases.
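The energy detector and its false-alarm probability can be sketched with the standard library alone under two simplifying assumptions that are not in the chapter: the noise covariance is diagonal, so whitening reduces to dividing each squared sample by its noise variance, and N is even, so the central chi-square tail has a finite closed form. The function names and sample values are hypothetical.

```python
import math


def energy_statistic(y, noise_var):
    """T = y^T Sigma^{-1} y for a diagonal Sigma (assumed here).

    With a diagonal covariance, whitening is a per-sample division
    by the noise variance before squaring and summing.
    """
    return sum(v * v / s for v, s in zip(y, noise_var))


def false_alarm_probability(gamma, n):
    """P_F = P(T > gamma) under H0 for even n (central chi-square tail).

    For n = 2m degrees of freedom:
    P(T > gamma) = exp(-gamma/2) * sum_{k=0}^{m-1} (gamma/2)^k / k!
    """
    if n <= 0 or n % 2:
        raise ValueError("this closed form requires a positive even n")
    half = gamma / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(n // 2))


# Hypothetical 4-sample snippet with per-sample noise variances.
y = [0.2, 1.5, -2.1, 0.4]
noise_var = [0.25, 0.25, 0.25, 0.25]
T = energy_statistic(y, noise_var)
gamma = 13.28  # example threshold, chosen arbitrarily
print(T, false_alarm_probability(gamma, len(y)), T > gamma)
```

Computing P_D would additionally require the noncentral chi-square distribution of Eq. (2.12), which is left out of this sketch.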

Figure 2.4. Performance of the GLRT energy detector versus the number of samples for different values of the noncentrality parameter α.

URL: https://www.sciencedirect.com/science/article/pii/B9780123750273000028

The Nuclear Genome

Stefano Mariani, Dorte Bekkevold, in Stock Identification Methods (Second Edition), 2014

14.4 Conclusions

The vast amount of information contained in the nuclear genome is becoming progressively more available owing to recent developments in DNA sequencing techniques. However, the wealth of information that can be “read” across genomes may have rather different meanings. The most substantial difference is between neutral and adaptive genetic variation: the former is primarily shaped by the interaction between random genetic drift and gene flow, whereas the latter is constrained by the action of natural selection. This fundamental issue is often “lost in translation” when geneticists communicate their findings to fisheries managers, much like the way nuclear and mitochondrial DNA evidence is sometimes lumped into the general term genetics.

There is still no general consensus or golden rule as to when it is more appropriate to use adaptive or neutral markers. Initially, an emphasis on migrant exchange had put neutral markers at the forefront of stock identification applications, with the view that markers under selection would essentially “obscure” migration/drift dynamics. Recently, with the aim to resolve structure in cases where neutral tools failed to reject panmixia, adaptive markers have become an attractive alternative. Since population divergence over evolutionary timescales is the result of processes that affect the whole organism and its genome collectively, Funk et al. (2012) have recently proposed that for the identification of evolutionarily significant units (ESUs), all sets of markers should be employed together to assess divergence. On the other hand, for the identification of management units (MUs), which emphasize demographic independence, they suggest that neutral markers would be best suited, while markers under selection should be applied to detect local adaptation in population components requiring special conservation (CUs). Here we argue that, in marine populations, neutral markers may be at times ineffective at describing demographic boundaries between stock units, in which case the employment of selected adaptive markers will be necessary to address management issues—while obviously ensuring that the patterns observed are temporally stable over the timescales of interest. This approach will be strengthened by the rapidly increasing knowledge of genomes, which will allow understanding of the functional relevance of the chosen markers on a case-by-case basis.

Every branch of conservation biology operates at the delicate interface between applied and fundamental research, and in order to deliver effective management advice, it is often unnecessary to explain the subtleties of the biological processes underlying the patterns observed. In fact, while ecological and evolutionary research is eventually bound to investigate biological processes, stock identification is primarily interested in detecting patterns that are informative for the purpose of management. Advances in the field of stock identification and many other applications are attained through the inherently deep-delving and self-correcting process of scientific inquiry, which seeks to resolve the ultimate causes for the patterns observed in nature. However, in order to effectively translate basic knowledge into management, it is invariably necessary to simplify and streamline the information.

In this sense, it is important to sustain continued efforts in fundamental research, because it is upon the solid platform of good scientific knowledge that improved methods can be designed, tested, and implemented.

URL: https://www.sciencedirect.com/science/article/pii/B978012397003900014X

Epidemiology and Evolution of Fungal Pathogens in Plants and Animals

P. Gladieux, ... T. Giraud, in Genetics and Evolution of Infectious Diseases (Second Edition), 2017

6.2.1 Rate and Direction of Gene Flow

Pathogenic fungal species are often organized into discrete populations. Population genetics usually assumes a simple model of n populations, each of which is equally likely to exchange migrants with each of the other populations. Under this model, given additional simplifying assumptions, a relationship between $N_e m_e$ ($N_e$ being the effective size of each population and $m_e$ the effective migration rate between populations) and $F_{ST}$ (an F-statistic that measures genetic differentiation among populations by quantifying the differences in allele frequencies between them) can be derived: $F_{ST} \approx 1/(1 + 4 N_e m_e)$. This approach has been severely criticized by some authors [95,96], who raised concerns about the unrealistic assumptions of the n-island model (constant population sizes, symmetrical migration at constant rates, no selection, and persistence for periods of time long enough to achieve migration–drift equilibrium). Even though they do not provide reliable estimates of rates of gene flow, measures of population differentiation can nonetheless be used to gain information on the history of dispersal. Several studies reported very low differentiation among samples of fungal pathogens of agricultural crops or forestry trees from different localities across a continent (e.g., Venturia inaequalis [97,98], Melampsora larici-populina [99]).
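The island-model relationship is easy to evaluate numerically. The sketch below computes $F_{ST}$ from $N_e m_e$ and inverts the formula; the function names and parameter values are illustrative assumptions, and, as noted above, values obtained this way should not be read as reliable estimates of gene flow.

```python
def fst_island_model(ne, me):
    """Expected F_ST under the n-island approximation: 1 / (1 + 4 * Ne * me)."""
    if ne <= 0 or me < 0:
        raise ValueError("ne must be positive and me non-negative")
    return 1.0 / (1.0 + 4.0 * ne * me)


def nem_from_fst(fst):
    """Invert the approximation: Ne * me = (1 / F_ST - 1) / 4."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("F_ST must lie in (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


# Hypothetical effective size and a range of effective migration rates.
for me in (0.0, 0.0001, 0.001, 0.01):
    print(me, fst_island_model(1000, me))
print(nem_from_fst(0.2))
```

Under this approximation, differentiation falls steeply once the effective number of migrants per generation, $N_e m_e$, exceeds about one.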

Coalescent theory [100] relates patterns of common ancestry within a set of genes to the structure of the populations from which they were sampled. In coalescent models, patterns of relationships among genes are represented by a genealogy, and the structure of the population is represented by parameters such as population size, rates of population growth, or, most relevant to the present discussion, rates and directions of gene flow. Both the genealogy and the parameters are generally unknown, and one usually wants to estimate the parameters of the model. It is generally impossible to jointly consider all possible ancestral relationships and parameter values and to search for the combinations that maximize the probability of the model. Instead, approaches have been developed that simultaneously explore many relatively probable genealogies (loosely speaking, irrelevant genealogies are disregarded) and parameter values (see Refs. 101,102 for reviews). These approaches are collectively referred to as “coalescent genealogy samplers.” Several methods relying on coalescent genealogy samplers were designed to estimate, among other parameters, rates of gene flow between species or populations [103,104]. These methods offer the advantage of allowing less restrictive models than the more traditional methods presented earlier. They have been successfully applied to infer the ancestral routes of colonization for several globally distributed fungal plant pathogens, such as the barley scald pathogen Rhynchosporium secalis [105] and the apple scab pathogen V. inaequalis [97].

Methods based on coalescent genealogy samplers remain computationally demanding. For many datasets and models of population structure, they even remain computationally intractable. As a result, there is an increasing interest in developing alternative approaches that are faster and easier to implement. One of the most promising approaches is approximate Bayesian computation [106]; it has been shown to be particularly powerful for determining the origin and routes of introduction of invading pest species [107–109], and it is very likely that it will also provide important insights into the history of fungal pathogens.

URL: https://www.sciencedirect.com/science/article/pii/B9780127999425000044

Biogeography and Larval Dispersal Inferred from Population Genetic Analysis

Serge Planes, in Coral Reef Fishes, 2002

VI. The Future Role of Genetics for Coral Reef Fish Studies

The genetic approach has been adopted by reef fish biologists as a way of assessing variation among geographically remote populations of coral reef fishes, and therefore for testing effectiveness of larval dispersal. Most results to date suggest that spatial genetic differentiation in coral reef fishes is very often more pronounced than what would be expected from the duration of the pelagic phase, and the dispersal capacities of larvae through oceanic currents. Selective processes could also contribute to such a pattern by generating genetic variation that results from adaptation rather than from the equilibrium between genetic drift and migration.

Nevertheless, the number of genetic studies completed remains very small [12 publications since Bell et al. (1982)], making it impossible to draw any general conclusions. Available data give some trends, but genetic surveys have to contend with interactions between biological parameters, historical features, hydrodynamics, and geomorphological characteristics of coral reefs that vary among places and species. These make any generalization impossible without many more studies. Dealing with such numbers of variables requires multispecies and multiscale intensive studies on numerous individuals if a synthetic general pattern is desired. Only a few studies have been done from that perspective, and the only point of satisfaction is that they all give significant results that provide key information for understanding persistence and maintenance of coral reef fish populations. Clearly, more data are needed, and not just data for more species, but data obtained from well-designed studies.

A second research perspective, in which genetics will certainly have a significant input, concerns questions relevant to biogeography and phylogeography. A comparison of reef fish assemblages over large scales requires knowledge of the establishment of each assemblage. Genetic markers can surely provide good data on the time of origin of species and of colonizations of particular regions if some suitable calibration from the fossil record is available. Again such approaches require large-scale surveys. A few are now available for some coral reef invertebrates, but this field remains virtually unknown for coral reef fishes. For the few species that have been studied, several models seem to be appropriate, depending on the species analyzed. No general patterns are yet evident, but general trends may emerge as more data become available.

Finally, the genetic/demographic link is starting to be investigated. Species with high fecundity, such as most coral reef fishes, often display two paradoxes: (1) much lower genetic variation than expected under neutrality theory based on their abundance (Avise, 1994; Nei and Graur, 1984), and (2) “chaotic patchiness” involving seemingly stochastic genetic heterogeneity over small spatial and temporal scales. Variance in the reproductive success among spawning groups may account in part for both observations (Hedgecock et al., 1992). This hypothesis was tested in coral reef fishes (Avise and Shapiro, 1986; Planes and Lecaillon, 2002) and did not reveal a significant pattern. Further work needs to be developed in this direction in order to couple genetics and demography during the larval stage and the recruitment transition.

The techniques used for genetic analysis are changing rapidly. Allozyme techniques have been the most frequently used approach up to now, and have provided significant results. In general, if it is feasible to use variation in allozymes rather than nucleic acids for a particular question, one should do so (Parker et al., 1998). This is especially the case when looking at multispecies surveys, because setting up an allozyme protocol can be rapid compared to DNA approaches. However, we should keep in mind that the choice of marker also has implications on evolutionary processes that will be investigated. Choosing markers under genetic selection (such as allozymes or the cytochrome b gene of mtDNA) may provide a different result than choosing neutral markers (such as microsatellites). Differences in results may not only be a consequence of the higher variability investigated but more likely are related to the fact we are looking at different evolutionary processes (selection vs. genetic drift).

In addition, the use of hypervariable codominant microsatellite systems can be viewed as the future tool for population genetics. Because they express many alleles for each locus (up to 50), these systems allow more precise genetic discrimination of populations and even of individuals. Five microsatellite loci with an average of 20 alleles per locus provide 95 independent variables, yet it requires at least 25 polymorphic allozyme loci with an average of 5 alleles per locus to get almost the same number of independent variables. No allozyme study done on fish has provided such a level of polymorphism. The disadvantage of the use of microsatellites for population genetics is that it requires significant effort and time to set up the protocol. This has heretofore limited any multispecies approach. In the present context, microsatellite systems seem an ideal tool when effort is centered on one or two species with numerous individuals and samples to process and compare. Finally, microsatellite analysis offers the possibility of genetic surveys of small larvae (Ruzzante et al., 1996). This surely represents a new area of interest that can link genetics and demography during the larval stage and recruitment.
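The figures quoted above are consistent with counting (alleles − 1) independent variables per locus, since allele frequencies at a locus must sum to 1. That counting rule is inferred from the numbers in the text rather than stated by the author, and the function name is hypothetical.

```python
def independent_variables(n_loci, mean_alleles_per_locus):
    """Independent allele-frequency variables, assuming (alleles - 1) per locus."""
    return n_loci * (mean_alleles_per_locus - 1)


microsatellites = independent_variables(5, 20)  # five loci, 20 alleles each
allozymes = independent_variables(25, 5)        # 25 loci, 5 alleles each
print(microsatellites, allozymes)
```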

URL: https://www.sciencedirect.com/science/article/pii/B9780126151855500128

Genome Sequence Databases: Types of Data and Bioinformatic Tools

A. G-Preciado, ... E. Merino, in Encyclopedia of Microbiology (Third Edition), 2009

Phylogeny Prediction

Phylogeny and molecular evolution have provided a huge toolbox to study the evolutionary history of organisms, allowing inferences of phylogenetic relations (ancestor–descendent) of protein domains, genes, and organisms. The resulting phylogenetic hypotheses are crucial to make phylogeny predictions or inferences. It also allows estimation of evolutionary forces (such as selection, genetic drift, migration, and recombination) in protein domains, genes, genomes, and populations. Phylogeny can make important hypotheses such as the universal tree of life and distinction between orthologues and paralogues as well as the important inferences of their function, for example, the likely change of function between paralogue proteins.

Phylogeny’s objective is to trace the ancestor–descendant relations of organisms through different taxonomic levels. The markers used to build a phylogeny are contained in DNA or protein sequences and include restriction fragment length polymorphisms (RFLPs) and genomic fingerprints, among others. The sequences that contain this information must be aligned with the previously described bioinformatic programs, such as ClustalW, T-Coffee, and MUSCLE.

In general, DNA sequences give a finer resolution of the evolutionary history of an organism since a great variability exists in the substitution rate within DNA sequences, for example, comparing coding regions and intergenic regions, catalytic residues versus noncatalytic residues, structural domains versus nonstructural domains, third positions versus first and second positions of codons in coding sequences, and stems versus loops of rRNAs and tRNAs. Moreover, different genes evolve at different rates; viral genes evolve very fast in contrast to the slow evolutionary rate of 16S rRNAs.

Horizontal gene transfer (HGT) and homoplasy represent a problem and a limitation of phylogenies. Homoplasy occurs when characters are similar, but are not derived from a common ancestor. There are several types of homoplasy: parallel evolution, which is the independent evolution to reach the same final state, from the same ancestral state; convergent evolution, which is the independent evolution to reach the same final state, from a different ancestral state; and secondary loss, which is a reversion to the ancestral state.

A phylogenetic tree is a mathematical structure used to represent the evolutionary history among a group of sequences or organisms. Phylogenetic inference requires a precise selection of the method to use from all the available ones, given a set of sequences. The aim of phylogenetic inference is to obtain the best estimate of an evolutionary history based on the incomplete and noisy information contained in the sequences.

One of the most commonly used approaches to constructing a phylogenetic tree is based on distances between sequences derived from a multiple sequence alignment. The distance values are arranged in a distance matrix, whose values depend on the evolutionary model selected, and the tree is then calculated from this matrix by clustering methods such as the Unweighted Pair Group Method with Arithmetic mean (UPGMA) and Neighbor Joining (NJ). These methods are very fast, but very sensitive to certain parameters such as the order in which Operational Taxonomic Units (OTUs) are added to the tree. This is because the distance matrix is built pairwise, that is, a distance measure is chosen to quantify the differences between each pair of items. NJ and UPGMA are useful for getting a quick idea of how the tree looks, but the resulting tree will not be robust.
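As a rough illustration of distance-based clustering, the sketch below implements a bare-bones UPGMA in Python. The four OTU names and their distances are invented for the example, the function name `upgma` is hypothetical, and branch lengths are omitted. It is only a sketch of the idea, not how any particular phylogenetics package implements the method.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster OTUs with UPGMA and return the topology as nested tuples.

    names: list of OTU labels.
    dist:  dict mapping frozenset({a, b}) -> pairwise distance.
    """
    clusters = {name: 1 for name in names}  # cluster label -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Join the two clusters separated by the smallest distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        size_a, size_b = clusters[a], clusters[b]
        # Distance from the new cluster to every other cluster is the
        # size-weighted mean of the two old distances (the "arithmetic mean").
        for other in clusters:
            if other == a or other == b:
                continue
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))] + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        del clusters[a], clusters[b]
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


# Hypothetical distance matrix for four OTUs (illustrative values only).
pairwise = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
    frozenset(("A", "D")): 10.0,
    frozenset(("B", "D")): 10.0,
    frozenset(("C", "D")): 10.0,
}
tree = upgma(["A", "B", "C", "D"], pairwise)
print(tree)  # ('D', ('C', ('A', 'B')))
```

Because the example distances are perfectly clock-like, UPGMA recovers the grouping directly; with real, noisy distances the result is best treated as a first look at the tree, as the text cautions.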

As an alternative to distance methods, trees can be constructed using discrete methods such as Maximum Parsimony (MP), Maximum Likelihood (ML), and Bayesian inference. These methods consider each site (column) in the alignment directly. The MP and ML methods each follow a different optimization criterion, which allows a better selection of the resulting topology from the millions of topologies that are to be analyzed. The big limitation of these optimization criteria is that they are computationally costly.

MP assumes an implicit evolutionary model that prefers the phylogeny requiring the fewest substitutions. Parsimony-informative sites are those that partition sequences into at least two groups, each with at least two members. MP uses the branch and bound optimization algorithm. Exhaustive search and branch and bound methods guarantee finding the best tree. However, exhaustive search methods do not work for anything more than ten sequences on existing single-processor computers. It is important to consider that MP often gives multiple equally parsimonious trees, making it difficult to choose among them; it also underestimates branch lengths and does not account for multiple substitutions at a given site.
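The definition of a parsimony-informative site can be checked column by column. The Python sketch below uses a made-up four-sequence alignment and a hypothetical function name, and it assumes ungapped sequences of equal length:

```python
from collections import Counter


def parsimony_informative_sites(alignment):
    """Return the 0-based column indices that are parsimony informative.

    A column qualifies when at least two different character states
    each occur in at least two sequences.
    Assumption: all sequences have equal length and contain no gaps.
    """
    informative = []
    for col in range(len(alignment[0])):
        counts = Counter(seq[col] for seq in alignment)
        states_seen_twice = sum(1 for n in counts.values() if n >= 2)
        if states_seen_twice >= 2:
            informative.append(col)
    return informative


# Hypothetical alignment (illustrative only).
alignment = ["AAGT", "AAGC", "ATCA", "ATCG"]
sites = parsimony_informative_sites(alignment)
print(sites)  # [1, 2]
```

Column 0 is constant and column 3 has four singleton states, so neither splits the sequences into two groups of at least two members; only columns 1 and 2 do.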

ML tree reconstruction is an explicit statistical technique based on the likelihood framework. ML relies on substitution models that make several assumptions. The most typical are (1) the probability of a change is independent of the prior history of the site (a Markov Model; see above), (2) substitution probabilities do not change with time or over the tree (a homogeneous Markov process), and (3) change is time reversible. All sites are informative because a site that has the same base for two sequences tells us something about the time separating the two molecules.
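To make the last point concrete, the sketch below computes a pairwise log-likelihood under the Jukes-Cantor (JC69) model, which satisfies the three assumptions listed. JC69 is chosen here only for illustration and is not named in the text; the sequences and function names are hypothetical. Identical sites enter the likelihood through the "same base" probability, so they contribute information just as differing sites do.

```python
import math


def jc69_log_likelihood(seq_a, seq_b, d):
    """Log-likelihood of two aligned sequences separated by distance d.

    d is the expected number of substitutions per site (d > 0).
    Assumption: JC69, i.e. equal base frequencies and equal rates.
    """
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability a site keeps the same base
    p_diff = 0.25 - 0.25 * decay   # probability of one specific other base
    loglik = 0.0
    for x, y in zip(seq_a, seq_b):
        # 0.25 is the stationary probability of the base in seq_a.
        loglik += math.log(0.25 * (p_same if x == y else p_diff))
    return loglik


def jc69_distance(seq_a, seq_b):
    """Closed-form maximum likelihood estimate of d under JC69."""
    n = len(seq_a)
    k = sum(x != y for x, y in zip(seq_a, seq_b))
    return -0.75 * math.log(1.0 - 4.0 * k / (3.0 * n))


# Hypothetical sequences: 10 sites, 1 difference.
a = "ACGTACGTAC"
b = "ACGTACGTAA"
d_hat = jc69_distance(a, b)
print(round(d_hat, 4))
print(jc69_log_likelihood(a, b, d_hat))
```

Evaluating `jc69_log_likelihood` at values of `d` on either side of `d_hat` gives lower log-likelihoods, which is what it means for `d_hat` to be the maximum likelihood estimate in this simple two-sequence case.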

ML has several advantages: it is mathematically rigorous and performs well in computer simulations, it takes into account multiple/hidden substitutions, and a large body of statistical theory supports likelihood estimation and inference and its extension to Bayesian analysis. However, there are disadvantages: ML may be inconsistent if the model of evolution is misspecified, it is computationally tedious and intensive, and it is not immediately intuitive.

URL: https://www.sciencedirect.com/science/article/pii/B9780123739445000274

Population or Point-of-Origin Identification

Einar Eg Nielsen, in Seafood Authenticity and Traceability, 2016

Population Genetics of Marine Organisms

What Is a Genetic Population?

Genetic point-of-origin identification is based on assigning fish back to their genetic or evolutionary population. Under an evolutionary paradigm, a population can be defined as “A group of individuals of the same species living in close enough proximity that any member of the group can potentially mate with any other member” (Waples and Gaggiotti, 2006). That definition is distinct from an ecological paradigm where a population is defined as individuals of the same species that co-occur in an area and potentially interact. Thus the distinction is related to reproduction and a genetic population should therefore be reproductively isolated to some degree from any other genetic population of the species. The term “to some degree” is, however, quite vague and hardly operational. A more quantitative and practical definition of when groups of individuals are different enough to be considered populations is based on the number of effective migrants exchanged between populations per generation (Nem, see later). If this number is above 25, populations become very difficult to distinguish using standard genetic tools (Waples and Gaggiotti, 2006). So, when populations are practically genetically indistinguishable they are, in this framework, defined as a single population. This shift from a theoretical delimitation of populations to a more application-based definition is important in applying DNA-based identification of populations to seafood.

Evolutionary Forces and Genetic Population Structure

Following evolutionary theory, individuals within reproductively isolated populations are subject to the same evolutionary forces that determine their genetic composition. These are: mutation, migration, random genetic drift, and selection. In popular terms, mutation is the long-term process that generates the genetic raw material in the form of new genetic variants, “alleles” at any gene locus (position of the DNA sequence in question in the genome). Migration of individuals carries those alleles among populations and, if migrants interbreed successfully, spreads them through the process of “gene flow”. Random genetic drift is the sampling error associated with breeding; that is, if “effectively” few individuals (where Ne is defined as the effective population size) participate in mating, then allele frequencies in the population will change fast and ultimately lead to the loss of allelic variants. Finally, an individual carrying specifically favorable alleles may be at an advantage related to survival and reproduction (fitness), mediated through natural or sexual selection. Thus differential selection pressure among populations can lead to fast changes and large differences in allele frequency between populations. On the relatively short evolutionary timescale often associated with population processes, migration, drift, and selection are the most important processes. The relative impact of the different evolutionary drivers ultimately determines the genetic composition of populations and the genetic differentiation among them. Thus, migration tends to homogenize allele frequencies among populations, while random genetic drift and differential selection act to differentiate them. As a rule of thumb, small, isolated populations subject to special environmental conditions tend to show the largest genetic differentiation and are therefore most easily distinguished using genetic tools.
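The homogenizing effect of migration can be sketched with a simple deterministic island model in Python. The model is an assumption made for illustration: every population replaces a fraction m of its gene pool each generation with migrants drawn from the average of all populations, and drift, selection, and mutation are ignored. The starting frequencies and the migration rate are invented.

```python
def migrate(freqs, m):
    """One generation of island-model gene flow.

    freqs: allele frequency of one allele in each population.
    m:     fraction of each population replaced by migrants per generation.
    Assumption: migrants carry the mean allele frequency of all populations.
    """
    p_bar = sum(freqs) / len(freqs)
    return [(1.0 - m) * p + m * p_bar for p in freqs]


# Two hypothetical populations that start out strongly differentiated.
freqs = [0.1, 0.9]
m = 0.1
generations = 20
for _ in range(generations):
    freqs = migrate(freqs, m)

# The difference shrinks by a factor (1 - m) every generation,
# while the mean frequency across populations stays the same.
print(freqs[1] - freqs[0])
print(sum(freqs) / len(freqs))
```

In this sketch the gap between the two populations decays geometrically toward zero. In real populations, drift and differential selection push in the opposite direction, as described above.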
Genetic differentiation due to population structure is traditionally measured using a fixation index, “FST” (Wright, 1950; Weir and Cockerham, 1984). The index ranges from zero to one, where zero denotes no differentiation and one represents fixation of different alleles among populations. As a measure of scale, FST among humans on different continents ranges between 0.1 and 0.15 (Jorde and Wooding, 2004).
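As a rough numerical illustration of the index, the Python sketch below computes FST for a single biallelic locus as (HT - HS) / HT, where HS is the mean expected heterozygosity within populations and HT is the expected heterozygosity of the pooled population. This is one common textbook formulation, assumed here for simplicity; it is not the Weir and Cockerham (1984) estimator and it ignores sample-size corrections. The second function uses Wright's island-model approximation FST ≈ 1 / (4Nem + 1), which assumes neutrality and migration-drift equilibrium. All function names and allele frequencies are hypothetical.

```python
def fst_biallelic(freqs):
    """FST for one biallelic locus from per-population allele frequencies.

    Assumption: equally weighted populations, no sampling correction.
    """
    p_bar = sum(freqs) / len(freqs)
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    if h_total == 0.0:
        return 0.0  # locus is fixed for the same allele everywhere
    return (h_total - h_within) / h_total


def fst_island(nem):
    """Equilibrium FST under Wright's island model (approximation)."""
    return 1.0 / (4.0 * nem + 1.0)


print(fst_biallelic([0.5, 0.5]))  # 0.0, no differentiation
print(fst_biallelic([1.0, 0.0]))  # 1.0, fixation of different alleles
print(fst_biallelic([0.4, 0.6]))
print(fst_island(25))
```

Under the island-model approximation, 25 effective migrants per generation corresponds to an FST of about 0.01, which gives a sense of why populations exchanging that many migrants are hard to tell apart with standard genetic tools.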

Types of Population Structure

A prerequisite for using genetics to determine the population of origin of seafood is that the marine organism in question displays some sort of genetic structuring of populations. In general, there are many evolutionary and ecological models for population structuring of marine organisms, which are beyond the scope of this chapter. However, three crude categories of significant importance for the identification of origin can be recognized (Laikre et al., 2005; Fig. 2.1). The first is no genetic differentiation (panmixia) across the geographical regions of interest, that is, migration and associated gene flow are sufficient to homogenize populations. This means that genetic tools cannot be used for origin identification, as the different regions of the species distribution display non-distinguishable genetic compositions. This may, naturally, be an inherent characteristic of the species in question; however, it may also be an artifact of the sampling strategy and/or the genetic and analytical tools applied (for more details see the following sections). Another type is continuous genetic change, that is, allele frequencies shift gradually along a geographical or environmental transect. Accordingly, the genetic compositions at each end of the species distribution are highly genetically differentiated, while intermediate locations display minute and gradual genetic changes. This kind of population structure imposes some problems in relation to determination of origin, as the statistical power associated with referring individuals to specific sites, as opposed to adjacent locations, is expected to be relatively weak. In addition, a significant sampling and genetic typing effort has to be undertaken in order to be able to describe the genetic shape of this “isolation by distance,” that is, to establish whether the continuous change is homogeneous across the whole distributional area.
The final major type is distinct populations, where migration among populations is sufficiently small to allow the buildup of distinct genetic differences. This type of population structure not only represents the ideal setting for population-based management and conservation, it also represents the optimal structure for population/origin assignment. As all populations in this scenario are geographically defined and genetically distinct, the population of origin of individuals can be inferred with high probability, dependent on the levels of genetic differentiation among populations. However, as the genetic population represents the reproductive unit, different populations may have distributional areas that significantly differ and overlap outside spawning time. In the latter case, the genetically determined population of origin of an individual fish may provide little information on the geographical origin of the sample. Still the mixture composition, using information from a sufficiently large sample of individuals in concert, may be able to provide insights into the geographical origin. This issue is treated in more detail in a subsequent section.
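One way to picture assignment to distinct populations is a likelihood comparison: compute how probable an individual's multilocus genotype is under each candidate population's allele frequencies and pick the most likely source. The Python sketch below is an illustration under stated assumptions, not the method used in this chapter: it assumes Hardy-Weinberg proportions within populations, independent loci, and no zero allele frequencies. Population names, allele labels, and frequencies are all invented.

```python
import math


def genotype_log_likelihood(genotype, allele_freqs):
    """Log-probability of a diploid multilocus genotype in one population.

    genotype:     list of (allele1, allele2) tuples, one per locus.
    allele_freqs: list of dicts (allele -> frequency), one per locus.
    Assumptions: Hardy-Weinberg proportions, independent loci,
    and every observed allele has a non-zero frequency.
    """
    loglik = 0.0
    for (a1, a2), freqs in zip(genotype, allele_freqs):
        p, q = freqs[a1], freqs[a2]
        prob = p * p if a1 == a2 else 2.0 * p * q
        loglik += math.log(prob)
    return loglik


def assign(genotype, populations):
    """Return the name of the population with the highest likelihood."""
    return max(
        populations,
        key=lambda name: genotype_log_likelihood(genotype, populations[name]),
    )


# Hypothetical reference populations with two loci each.
populations = {
    "north": [{"A": 0.9, "a": 0.1}, {"B": 0.8, "b": 0.2}],
    "south": [{"A": 0.2, "a": 0.8}, {"B": 0.3, "b": 0.7}],
}
fish = [("A", "A"), ("B", "b")]
origin = assign(fish, populations)
print(origin)  # north
```

The more the allele frequencies differ between populations, the larger the gap between the likelihoods, which is why assignment power depends on the level of genetic differentiation.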

Figure 2.1. Three types of population structure for marine organisms: (A) no genetic differentiation, (B) isolation by distance, (C) distinct populations (see text for explanation).

Population Structure of Marine Organisms

The population structure and level of genetic differentiation for important commercial species are of paramount importance for successful origin determination, and subsequently for improved stock management. The level of genetic differentiation among populations (FST) is typically much lower for marine organisms than for freshwater and anadromous species (Fig. 2.2A, redrawn from Ward et al., 1994). Likewise, it is evident (Fig. 2.2B) that the vast majority of marine fish species display very low levels of genetic differentiation among populations (FST < 0.03). The reason for the relatively low levels of genetic differentiation for “classical marine organisms,” including many of our most important commercial species such as clupeoids, gadoids, and scombrids (Nielsen and Kenchington, 2001), relates to a number of inherent characteristics of these species. First, obvious physical barriers are less pronounced in the sea than in freshwater. For example, while highly mobile marine fish can freely migrate vast distances in the oceans, fish living in a lake are restricted to this particular water body. Likewise, many marine organisms have pelagic eggs and larvae, which can be spread over vast areas by ocean currents before settling. Finally, most marine species have comparatively large (effective) population sizes (Hare et al., 2011), resulting in minute levels of random genetic drift and related low levels of genetic differentiation. However, although it may seem that the oceans are devoid of any physical boundaries, this is not the case. The major oceans are separated by large landmasses, restricting gene flow on a large geographical scale. In addition, factors such as bathymetry and ocean currents may serve as barriers to active migration of adult specimens, or act to retain eggs and larvae, so that the juveniles will settle in proximity to the parental population (e.g., see Sinclair and Power, 2015).
Environmental differences may also restrict migration among populations (Limborg et al., 2009). Thus differences in temperature, salinity, and other environmental factors may define the boundaries between populations. Habitat preference and life history may also restrict gene flow geographically. This phenomenon also sets the scene for the identification of genes subject to differential selection in populations inhabiting different environments. This selection can create vast differences in allele frequencies even in the face of relatively high levels of gene flow, which makes these genes particularly interesting for origin determination. This will be treated in detail in subsequent sections. In conclusion, marine organisms display relatively low levels of genetic differentiation among populations, which, in association with the lack of obvious physical boundaries among populations, poses a number of challenges for origin determination of seafood products.

Figure 2.2. Levels of genetic differentiation among populations of marine fish. (A) Comparison of genetic differentiation (FST) among freshwater, anadromous and marine fish. (B) Distribution of FST values in marine fish.

Redrawn from Ward, R.D., Woodwark, M., Skibinski, D.O.F., 1994. A comparison of genetic diversity levels in marine, freshwater and anadromous fishes. Journal of Fish Biology 44, 213–232.

URL: https://www.sciencedirect.com/science/article/pii/B9780128015926000085

What decreases genetic variation between populations?

Inbreeding, genetic drift, restricted gene flow, and small population size all contribute to a reduction in genetic diversity. Fragmented and threatened populations are typically exposed to these conditions, which is likely to increase their risk of extinction (Saccheri et al.

Which mechanism of evolution decreases variation in a population?

Mechanisms that decrease genetic variation include genetic drift and natural selection.

What process decreases genetic variation?

Two forces affecting genetic variation are genetic drift (which decreases genetic variation within but increases genetic differentiation among local populations) and gene flow (which increases variation within but decreases differentiation among local populations).

Which reduces variation in a population?

Both inbreeding and drift reduce genetic diversity, which has been associated with an increased risk of population extinction, reduced population growth rate, reduced potential for response to environmental change, and decreased disease resistance, which impacts the ability of released individuals to survive and ...