U.S. Dept Commerce/NOAA/NMFS/NWFSC/Publications

Genetic Effects of Straying of Non-Native Hatchery Fish into Natural Populations


Joseph Felsenstein

Department of Genetics
University of Washington, Box 357360
Seattle, WA 98195-7360, U.S.A.


In the first part of this talk, I will briefly review the evolutionary forces acting upon natural populations, and in the second part, show you the results of simulations that illustrate what would happen in natural populations affected by straying from hatchery populations. The results are abstract and idealized, and the simulations have been done in nice symmetrical ways for mathematical convenience, but it is important to realize that the same principles operate in a more complicated way in real-life situations. The task is to understand what the most important principles are and how you can apply them to a real situation.

Processes Influencing Genetic Change

I will mention five basic evolutionary forces. We usually do not think of random mating as an evolutionary force, but it is. Another force is natural selection, in particular natural selection to adapt populations to local conditions. A third force that can potentially change the genetic makeup of a population is mutation. Since mutation occurs about equally everywhere, it is not, to a first approximation, a force that makes populations more different or more similar. Strictly speaking, mutation tends to make populations slightly more similar to each other, but it is a minor force. Migration is another important force; for this workshop, we are concerned with hatchery straying, which is one kind of migration. Population geneticists equate migration with gene flow, the actual incorporation of migrant genes into a receiving population, and not just with the physical movement of an individual. The final mechanism that can lead to genetic changes in a population is random change from random births and deaths of individuals. This process is called random genetic drift. I will concentrate on gene flow, random genetic drift, and natural selection.

Gene flow

To illustrate the simple mathematics of gene flow between natural populations, let us imagine that we have a genetic locus (a place on a chromosome) which is variable in a set of populations. The information encoded by the locus occurs in different forms, called alleles. Each fish normally has two copies of a gene, one inherited from each parent, and the frequency of a particular allele in a natural population can be measured as the proportion of all the copies of that gene that are of one allele or another. This proportion is called a gene or allele frequency. Imagine we have a population represented in Figure 1 as a big square, and we find--using one biochemical or molecular technique or another--two alleles, one of which is at a frequency of 0.80, or 80% of the total. In the next generation, imagine that 70% of the individuals in the population stayed in that population, but 20% came from population 2, and 10% came from population 3. In populations 2 and 3, the frequency of the same allele was 0.1 and 0.2, respectively. The allele frequency in population 1 is simply the weighted average of the frequencies in the residents and the migrants.
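The weighted average in this example can be written out as a short calculation. The following is a minimal sketch; the function name and the data layout are illustrative assumptions, and the numbers are the ones given above (70% residents at 0.80, 20% migrants at 0.10, 10% migrants at 0.20).

```python
def allele_freq_after_migration(sources):
    """Weighted average of allele frequencies.

    sources: list of (fraction_of_recipient_population, allele_frequency).
    The fractions describe the make-up of the recipient population after
    migration and must sum to 1.
    """
    total = sum(fraction for fraction, _ in sources)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return sum(fraction * freq for fraction, freq in sources)


# Population 1 in the example: residents, migrants from 2, migrants from 3.
population_1 = [
    (0.70, 0.80),  # residents of population 1
    (0.20, 0.10),  # migrants from population 2
    (0.10, 0.20),  # migrants from population 3
]

new_freq = allele_freq_after_migration(population_1)
print(round(new_freq, 2))  # 0.6
```

In this example the allele frequency in population 1 falls from 0.80 to 0.60 in a single generation of gene flow.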

When the mix of individuals in population 1 begins to mate, let us assume that they mate at random, without regard to where they came from. If so, the basic units are not whole genotypes, but individual alleles that re-assort themselves each generation, and the best way to think about gene flow is to consider the flow of individual alleles rather than the flow of genotypes. Because of the peculiarities of sexual reproduction and random mating, geneticists talk about the frequencies of alleles or genes, which partially determine the frequencies of genotypes in a population. In calculating allelic frequencies in the recipient population, we have to consider migration rates. Migration rates are calculated as the fraction of new migrants in the recipient population, and population size can be important when the donating and receiving populations are very different, as is often the case for salmonid populations. For example, a migration rate of 20% in a small recipient population may represent a much smaller fraction of a large donor population. If you think about the number of individuals leaving a population, you may get the wrong impression about the effects of migration.

Random genetic drift

Suppose that we assume that natural selection is not occurring; that is, we are not favoring one allele or another and that the alleles are passively reproducing themselves at random. Following the frequencies of two alleles from one generation to the next in a population is much like tossing a coin in which each side of the coin represents an allele of the gene. If there are 100 individuals in a population, 200 copies of the gene are present--two copies for each individual. We can simulate random drift by tossing a coin 200 times to get the frequency in the next generation. But instead of having a probability of 0.5 for a particular side of the coin, the probability of getting a particular side in the toss would be the frequency of the allele in the population before reproduction, but after migration. As you know, if you toss a coin several times, you usually do not get the exact proportion that you expect. In a small number of tosses, which simulates a small population, the frequencies vary a lot from the expected. With a large number of tosses (a large population), the frequencies are closer to the expected proportion. Figure 2 illustrates the random changes in allele frequencies that might occur in a population.

In natural populations, randomness arises from three sources: randomness of deaths (some individuals may die early), randomness of births (some pairs may have a lot of offspring and others very few), and randomness of Mendelian segregation of genes during gamete formation (only one of two parental genes occurs in each gamete). These three sources of randomness lead to small changes in allele frequencies in a population from one generation to the next. An important characteristic of drift is that these small changes are cumulative; that is, the starting point for the next generation is the allele frequency of the present generation, and not the frequencies of previous generations. If frequencies change from 0.30 to 0.32 in one generation, the next generation starts from 0.32 and has no memory that the frequency was ever at 0.30. Another characteristic of random drift is that the direction of change is not predetermined. The frequency of each generation can change up or down, so the frequency can randomly 'walk' away from the original frequency, then cross back over it again. If you repeat the same simulation with the same starting allele frequency, you will not get the same path each time.
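The coin-tossing picture of drift can be sketched in a few lines. This is an idealized illustration only: it assumes non-overlapping generations and a constant population size, and the function name, seed, and parameter values are hypothetical choices rather than anything taken from Figure 2.

```python
import random


def drift_trajectory(p0, n_individuals, generations, rng):
    """Follow one allele's frequency under random genetic drift alone.

    Each generation, 2 * n_individuals gene copies are drawn, and each
    copy is the allele of interest with probability equal to the
    current frequency (the biased coin described in the text).
    """
    copies = 2 * n_individuals
    p = p0
    freqs = [p]
    for _generation in range(generations):
        count = sum(1 for _toss in range(copies) if rng.random() < p)
        p = count / copies  # next generation starts from here: no memory
        freqs.append(p)
    return freqs


rng = random.Random(1)  # fixed seed so the run is repeatable
small = drift_trajectory(0.30, 10, 50, rng)    # 10 fish, 20 gene copies
large = drift_trajectory(0.30, 1000, 50, rng)  # 1,000 fish, 2,000 copies

print("small population:", [round(p, 2) for p in small[:6]])
print("large population:", [round(p, 2) for p in large[:6]])
```

Changing the seed gives a different path from the same starting frequency, and the small population takes much larger steps than the large one. Once the frequency reaches 0 or 1 in this sketch it stays there, because no mutation or migration is included.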

Natural selection

Natural selection results in the unequal representation of different alleles in the next generation, owing to differences in survival or reproduction between different genotypes. A particular case that is of importance to this symposium is the adaptation of genotypes to the local environment. We might have an allele, A, that is favored in the local environment but not elsewhere. Thus genotype AA might have fitness 1.05, genotype Aa fitness 1.03, and genotype aa 1.00. For population genetic purposes, it does not matter what units we measure fitness in: all that matters is the ratio of the fitnesses of different genotypes. In this case, we have arbitrarily taken genotype aa to have fitness 1.00. Genotype AA has a 5% higher fitness than aa, and Aa has a 3% higher fitness. The quantities 0.05 and 0.03 here are called selection coefficients (s): they give us a quick idea of how strong natural selection is.

If natural selection occurs in a randomly mating population, with no migration or genetic drift, we can easily calculate what happens to the allele frequencies. It will surprise no one that in this case, allele A will continue to increase in frequency until it approaches 1. The speed with which this happens is a function of the selection coefficient. If the selection coefficient is 0.01, it will take hundreds of generations for allele frequencies to change substantially. For example, with the fitnesses I just gave (1.05:1.03:1), it will take about 200 generations for the allele frequency to rise from 0.10 to 0.90. If the selection coefficient is smaller, it will take proportionally longer. For selection coefficients one-tenth as great (1.005:1.003:1), it will take about 2,000 generations instead of 200.
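The calculation referred to here can be sketched with the standard one-locus, two-allele recursion for a randomly mating population. The function names below are illustrative assumptions; the fitnesses are the ones given above.

```python
def next_freq(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of selection.

    Assumes random mating (Hardy-Weinberg proportions before selection),
    no migration, and no genetic drift.
    """
    q = 1.0 - p
    mean_fitness = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / mean_fitness


def generations_to_reach(p_start, p_end, w_AA, w_Aa, w_aa, max_gen=1_000_000):
    """Count generations until the frequency of A first reaches p_end."""
    p = p_start
    t = 0
    while p < p_end and t < max_gen:
        p = next_freq(p, w_AA, w_Aa, w_aa)
        t += 1
    return t


fast = generations_to_reach(0.10, 0.90, 1.05, 1.03, 1.00)
slow = generations_to_reach(0.10, 0.90, 1.005, 1.003, 1.000)
print("1.05 : 1.03 : 1   ->", fast, "generations")
print("1.005 : 1.003 : 1 ->", slow, "generations")
```

Under this recursion the first case takes on the order of the 200 generations quoted above, and the second takes roughly ten times as long.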

There are many interesting and complex results for more complex patterns of fitness (overdominance, in which the heterozygote has the highest fitness, underdominance, in which it has the lowest fitness, frequency-dependent fitnesses, temporally varying fitnesses, fitnesses dependent on multiple loci, and so on). But we will primarily deal with the simple pattern of local adaptation here.

Combining Evolutionary Forces

Migration and random genetic drift

Let us first combine the effects of migration and random drift. Migration between populations tends to average out allele frequencies so populations become more and more similar, whereas random drift tends to make populations different. Figure 3 shows three populations that are exchanging genes at a particular rate and in some kind of pattern. The allelic frequencies in each population will wander over time as they undergo genetic drift, but the amount and direction of divergence between the populations is constrained by migration between them. If one population reaches a high allele frequency, a high proportion of the migrants into the other two populations will have the high-frequency gene, and migration will tend to pull the frequencies in the other two populations in the same direction. At the same time, random drift--thermal noise like Brownian motion--will tend to pull the frequencies of the three populations apart. The result is that the whole set of populations, or the species as a whole, will change at a slower rate than individual populations.

When migration and genetic drift are operating in the absence of natural selection, the important quantity is four times the effective population size, Ne, times the migration rate m, 4Nem. The effective population size is the population size corrected for other factors that affect the amount of genetic drift expected in the population. These factors include unequal contributions of offspring from different individuals in the population, unequal numbers of males and females, overlapping generations, and several other factors. These factors usually reduce the effective population size and cause more genetic drift. Population genetics theory shows that if 4Nem is much less than one, the populations act more or less independently of one another and allelic frequencies in a set of populations become quite dispersed. If this number is much greater than one, allele frequencies in the populations tend to be similar to one another. Note that Nem, the effective population size times the proportion of migrants coming into a population, is simply the number of migrants. If the number of migrants for a set of populations exchanging migrants is less than one per generation, the populations will tend to drift apart, and this is true whether the sizes of the populations are 100 or 1 million. The importance of genetic drift depends not on the proportion of migrants, but on the number of migrants, and the size of the population is unimportant. This is strange but true.
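A short calculation makes the point that only the number of migrants matters. The sketch below uses Wright's island-model approximation for the standardized variance of allele frequencies among populations, F_ST = 1 / (1 + 4Nem). That formula is a standard textbook result rather than something stated in this talk, and the function names and parameter values are illustrative assumptions.

```python
def migrants_per_generation(ne, m):
    """Number of migrants: effective size times migration rate."""
    return ne * m


def island_model_fst(ne, m):
    """Wright's island-model approximation, F_ST = 1 / (1 + 4 Ne m).

    Values near 1 mean populations have drifted far apart;
    values near 0 mean their allele frequencies are similar.
    """
    return 1.0 / (1.0 + 4.0 * ne * m)


cases = [
    (100, 0.01),            # small populations, 1 migrant per generation
    (1_000_000, 0.000001),  # huge populations, still 1 migrant
    (100, 0.0001),          # 0.01 migrants: 4Nem much less than one
    (100, 0.1),             # 10 migrants: 4Nem much greater than one
]
for ne, m in cases:
    print(ne, m,
          round(migrants_per_generation(ne, m), 4),
          round(island_model_fst(ne, m), 3))
```

The first two cases give the same expected differentiation even though the population sizes differ by a factor of 10,000, because the number of migrants is the same.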

Population geneticists use abstract models to understand the effects of random drift and migration on sets of populations with specific geographic structures. One such model is called the island model of migration, in which local populations receive immigrants from a pool of migrants drawn from each population. There is really no geographic structure in the model. No two populations are closer to each other than any other two. Another abstract representation of population structure is called the stepping stone model, in which migration is limited to neighboring populations. Stepping stone models can be one-, two-, or even three-dimensional, depending on the biology of the species being considered. More realistic models can also be constructed in which populations can be situated anywhere with specific sizes and specific migration rates. These kinds of models, however, are complicated mathematically and are usually studied with numerical simulations.

Stepping stone migration and natural selection

Some work has been done on models similar to the one I will develop here (Haldane 1930, Hanson 1966), which I call patch swamping. Let us imagine five populations with stepping stone migration between them; that is, each population exchanges migrants only with its two neighbors, receiving a fraction m1/2 (half of m1) from each, so that the total fraction of immigrants is m1 (Fig. 4). An end population receives migrants from a hatchery population, also with a migration rate of m1/2. Whatever comes into the population most distant from the hatchery can get there only through the other populations by working its way down the chain of populations. Long-range straying is also possible; I am not sure what kind of gene flow is most important for salmon. In this long-range model of straying, migrants can go into any of the populations. Throughout, m1 denotes the exchange rate between neighboring populations, and m2 denotes the long-distance migration rate.

First of all, let us consider an allele at a gene that has an adaptive advantage over other alleles in the local populations. In the absence of migration from the hatchery, this allele will increase in the natural populations to a frequency of 100%, except for the small effects of mutation. Next, let us add the effects of migration from a hatchery population that does not have the favored allele, so that the frequency of this allele in the hatchery is 0%. The pattern of allele frequencies among the populations depends on the relative amounts of local and long-range straying that we expect to see. The 'simulations' reported here are exact calculations, by computer, of the allele frequencies that we would see in the absence of genetic drift.

In the first simulation, we set m1 to 0.10, so that 10% of the fish in the end population are strays from the hatchery. We also set selection to 0.10, so that fish carrying the favored allele have a 10% increase in fitness for each copy of the allele they carry. If a fish is heterozygous with one non-native allele and one favored allele, it is 10% better off than a hatchery fish with two non-native alleles; however, if it is homozygous with two copies of the favored allele, it is 21% better off. In the first generation of the simulation, some of the non-adapted alleles from the hatchery get into the end population, so the frequency of the adapted allele is only 90% in that population (Fig. 5A; I have drawn the hatchery population twice so that its allele frequency is more visible). The frequency of the favored allele is still 100% in the remaining four populations. The simulations then continue for 5,000 generations (1, 10, and 5,000 generations are shown to give you a feel for the rate of change). As the simulation proceeds, the frequency of the non-native allele begins to increase down the chain of natural populations, but it is lower in the more distant populations. The frequency of the favored allele in the most distant population is still close to 100%, so this population is resisting the immigration of the non-adaptive allele from the hatchery. When we set the migration rate to 20%, we get a similar pattern, except that more hatchery alleles appear in the natural populations, and the frequency of the favored allele in the most distant population drops to 98%. At 50% migration, a smooth geographic pattern appears--a cline--and the frequency of the favored allele in the end population is 90% when the system stabilizes. One conclusion from these results is that for a linear string of populations with stepping stone migration, the populations have a tremendous ability to resist migration from hatcheries. But note that the selection coefficient used here was rather large.

What happens with a favored allele with only a 1% selective advantage? Such a selective value is not small in evolutionary terms, and is sufficient to make large changes in allele frequencies over long periods. In real life, however, it is difficult to measure a fitness value of only 1%, because humans can measure far fewer fish than nature can. Researchers are limited to the number of fish they can measure with the sizes of grants usually available from funding agencies, whereas nature measures millions of fish. It is also difficult to get a grant that would last 5,000 generations. We will still use 10% immigration from neighboring populations. These results show that after 5,000 generations, more of the hatchery allele is getting through to the end population, which has a frequency of the favored allele of 66% (Fig. 5B). This shows that immigration of non-adaptive alleles is more effective when selection favoring local adaptation is not strong.

If we increase the amount of migration with a 1% selection coefficient, we see the patch swamping phenomenon. At 20% migration, allele frequencies appear to form a cline after a few generations, but the cline stabilizes at very low frequencies. The locally favored allele is still present, but only at a maximum of 20%, and the hatchery allele is getting through to the most distant population (Fig. 5C). At a higher migration rate of 30%, the cline collapses, and at 5,000 generations only a very small frequency of the favorable allele is present in the natural populations (Fig. 5D). The patches of local adaptation have been completely erased by migration from the hatchery into a single end population. This model does not take into consideration that the hatchery straying rate may be much higher than the natural migration rate among wild populations. It also does not account for long-distance migration beyond neighboring populations.
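A deterministic calculation of this kind can be sketched as follows. This is a minimal reconstruction under stated assumptions, not the program used for Figure 5: migration is applied before selection in each generation, the population next to the hatchery receives m1/2 from the hatchery and m1/2 from its one wild neighbor, the most distant population receives m1/2 from its single neighbor, and fitnesses are multiplicative, (1+s)^2 : (1+s) : 1. The function names are illustrative, and the numbers it produces need not match the figures exactly.

```python
def select(p, s):
    """Favored-allele frequency after selection.

    With multiplicative fitnesses (1+s)^2 : (1+s) : 1 and random mating,
    the diploid recursion reduces to this per-allele form.
    """
    return p * (1.0 + s) / (1.0 + s * p)


def stepping_stone(m1, s, n_pops=5, generations=5000, hatchery_freq=0.0):
    """Favored-allele frequencies along a chain of wild populations.

    Index 0 is the population adjacent to the hatchery; the last index
    is the most distant population. No genetic drift is included.
    """
    p = [1.0] * n_pops  # wild populations start fixed for the favored allele
    for _generation in range(generations):
        migrated = []
        for i in range(n_pops):
            left = hatchery_freq if i == 0 else p[i - 1]
            if i == n_pops - 1:
                # far end of the chain: one neighbor only
                new = (1.0 - m1 / 2) * p[i] + (m1 / 2) * left
            else:
                new = (1.0 - m1) * p[i] + (m1 / 2) * (left + p[i + 1])
            migrated.append(new)
        p = [select(x, s) for x in migrated]
    return p


strong = stepping_stone(m1=0.10, s=0.10)
weak_10 = stepping_stone(m1=0.10, s=0.01)
weak_20 = stepping_stone(m1=0.20, s=0.01)
weak_30 = stepping_stone(m1=0.30, s=0.01)

for label, freqs in [("s=0.10, m1=0.10", strong), ("s=0.01, m1=0.10", weak_10),
                     ("s=0.01, m1=0.20", weak_20), ("s=0.01, m1=0.30", weak_30)]:
    print(label, [round(x, 3) for x in freqs])
```

Under these assumptions the qualitative pattern described above appears: strong selection holds the most distant population close to 100%, while with 1% selection the cline drops as migration rises and collapses at 30% migration.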

Long-distance migration, random drift, and natural selection

Let us now incorporate long-distance migration, by using a 1% long-range straying rate from the hatchery superimposed on a natural migration rate of 10% between the wild populations. An allele-frequency cline appears, but many more hatchery alleles move into the most distant population than would be the case for no long-distance straying (Fig. 6A). Compare this with the same values for selection and natural migration, but without long-distance straying (Fig. 5A). Long-distance straying dramatically increases the migration of hatchery alleles. Increasing long-distance hatchery straying to 2%, 5%, and 8% progressively depresses the allele-frequency cline among the natural populations, so that the cline has virtually collapsed at 8% long-distance straying, and only a very few adaptive alleles are present in the natural populations (Fig. 6B). The point is that long-distance straying greatly erodes the populations' ability to resist the immigration of non-adaptive alleles, because the non-adaptive alleles can get to the end of the chain of populations in one jump without having to travel through the string of populations.
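Long-distance straying can be added to the same kind of deterministic calculation. The sketch below is an illustration under assumptions, not the program behind Figure 6: every wild population receives a fraction m2 of hatchery strays each generation in addition to stepping stone exchange at rate m1, the hatchery lacks the favored allele, migration precedes selection, and fitnesses are multiplicative. The function name and the exact order of events are hypothetical choices.

```python
def chain_with_long_range(m1, m2, s, n_pops=5, generations=5000):
    """Favored-allele frequencies with local and long-distance straying.

    Index 0 is next to the hatchery. The hatchery allele frequency is 0
    for the favored allele, so long-distance strays simply dilute it.
    """
    p = [1.0] * n_pops
    for _generation in range(generations):
        new_p = []
        for i in range(n_pops):
            left = 0.0 if i == 0 else p[i - 1]           # hatchery at one end
            right = p[i] if i == n_pops - 1 else p[i + 1]  # no neighbor past the far end
            local = (1.0 - m1) * p[i] + (m1 / 2) * (left + right)
            diluted = (1.0 - m2) * local                  # long-distance hatchery strays
            # multiplicative selection, (1+s)^2 : (1+s) : 1
            new_p.append(diluted * (1.0 + s) / (1.0 + s * diluted))
        p = new_p
    return p


far_end = {}
for m2 in (0.0, 0.01, 0.02, 0.05, 0.08):
    freqs = chain_with_long_range(m1=0.10, m2=m2, s=0.10)
    far_end[m2] = freqs[-1]
    print("m2 =", m2, [round(x, 3) for x in freqs])
```

In this sketch the favored allele in the most distant population declines steadily as m2 increases, because strays no longer have to pass through the intervening populations.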

All of these results show that the collapse of the patch of adaptation occurs at a critical ratio of the strength of selection to the migration rate, and that this critical ratio depends on which model is used. If the rate of immigration is larger than the fitness difference between the adaptive allele and the non-adaptive hatchery allele (m > s), locally adaptive alleles will predictably be swamped by hatchery alleles. Since this occurs locus by locus, allele by allele, a situation could arise in which a local population has several locally adaptive alleles, some strongly favored and others only weakly favored, in the face of some mix of local and long-distance migration. Weakly favored alleles may be replaced by hatchery alleles, but strongly favored alleles may persist in a clinal pattern. Because of this locus-by-locus complexity, fish in the populations along the cline will be made up of a mix of adapted and non-adapted genotypes to varying degrees. If the alleles in the natural populations are neutral to selection and have differentiated among populations because of random drift, then hatchery alleles will push out local alleles and homogenize the frequencies of alleles among the natural populations. So, fortuitous differences due to genetic drift will not resist invasion from hatchery alleles. On the other hand, adaptations due to natural selection will resist the invasion of hatchery alleles to the extent that the strength of natural selection is greater than the amount of gene flow.
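The m > s rule is easiest to see in the simplest possible case, a single wild population receiving hatchery strays. The sketch below assumes one-way immigration at rate m from a hatchery that lacks the favored allele, followed by multiplicative selection with coefficient s; the closed-form equilibrium follows from that recursion. This one-population setting and the function names are assumptions made for illustration, and the critical ratio differs in the chain models above.

```python
def simulated_equilibrium(m, s, generations=20000):
    """Iterate migration then selection until the frequency settles."""
    p = 1.0
    for _generation in range(generations):
        p = (1.0 - m) * p                      # strays dilute the favored allele
        p = p * (1.0 + s) / (1.0 + s * p)      # multiplicative selection
    return p


def predicted_equilibrium(m, s):
    """Equilibrium of the recursion above.

    Positive only when m < s / (1 + s), which is approximately m < s.
    """
    return max(0.0, (s - m - m * s) / (s * (1.0 - m)))


s = 0.05
for m in (0.01, 0.03, 0.10):
    print("m =", m,
          "simulated:", round(simulated_equilibrium(m, s), 4),
          "predicted:", round(predicted_equilibrium(m, s), 4))
```

With s = 0.05, the favored allele persists at an intermediate frequency for m = 0.01 and m = 0.03 but is lost when m = 0.10, which exceeds s.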


In the simulations presented here, we have assumed that the effects for one locus are independent of those for other loci. This is not quite true, because loci are often physically linked together on the same chromosome. Slatkin (1975) showed that if two genes are close to each other on a chromosome, and there is little recombination between them, alleles at the two loci will tend to be associated with one another in geographically structured populations. For example, suppose we have two populations: population 1 carries only the A allele at locus A and the B allele at locus B, and population 2 carries only the a and b alleles at the two corresponding loci. If individuals from the two populations are mixed, you would find only A-B and a-b chromosomes. After random mating, with very low rates of recombination because of linkage, most chromosomes will still be A-B or a-b, and double heterozygotes will mostly have the genotype A-B/a-b. Only a few recombinant chromosomes, A-b and a-B, will appear; these can also combine into double heterozygotes, A-b/a-B, which carry the same alleles in the opposite linkage phase.
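How slowly the association between linked loci breaks down can be illustrated with the standard two-locus recursion for chromosome frequencies. The sketch assumes random mating with no selection and no further migration after the initial mixture; the recombination fractions and function names are illustrative assumptions. The association is measured by the linkage disequilibrium D, the A-B frequency times the a-b frequency minus the A-b frequency times the a-B frequency.

```python
def next_haplotypes(x_AB, x_Ab, x_aB, x_ab, r):
    """Chromosome frequencies after one generation of random mating.

    r is the recombination fraction between the two loci.
    """
    d = x_AB * x_ab - x_Ab * x_aB
    return (x_AB - r * d, x_Ab + r * d, x_aB + r * d, x_ab - r * d)


def disequilibrium_after(generations, r, mix=0.5):
    """D after mixing a fraction `mix` of A-B fish with 1 - mix of a-b fish."""
    hap = (mix, 0.0, 0.0, 1.0 - mix)  # only A-B and a-b chromosomes at first
    for _generation in range(generations):
        hap = next_haplotypes(*hap, r)
    x_AB, x_Ab, x_aB, x_ab = hap
    return x_AB * x_ab - x_Ab * x_aB


tight = disequilibrium_after(10, r=0.01)  # closely linked loci
free = disequilibrium_after(10, r=0.5)    # unlinked loci
print("D after 10 generations, r = 0.01:", round(tight, 4))
print("D after 10 generations, r = 0.5: ", round(free, 6))
```

Under this recursion D shrinks by a factor of (1 - r) each generation, so closely linked loci keep most of their initial association after ten generations while unlinked loci lose almost all of it.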

Let us assume that the a-b chromosome is from hatchery fish and the A-B chromosome is from adapted wild fish. A correlation appears in the population in which the adapted alleles at one locus are associated with the adapted alleles at the other locus. This association has the effect of helping favored alleles resist migration from non-favored hatchery alleles, because they travel together and natural selection favors chromosomes with both adapted alleles over those with just one adapted allele. Because they are physically linked, selection for one allele is also selection for the other. The strength of selection is as though the two individual selection coefficients are added together. For a chromosome with two linked loci each with alleles having a selection coefficient of 10%, the total strength of selection for that chromosome is 20%. Selection for linked loci provides more resistance to invasion by hatchery alleles than does selection on two similar, but unlinked loci. So to be able to predict the effects of hatchery straying in real life, we would have to know how many genes confer local adaptations, the kind of natural selection favoring them, and the strength of the linkage on the chromosome. In addition, we would have to know how much local and how much long-distance migration is occurring.


The genetic makeup of natural populations is potentially influenced by an interacting mix of evolutionary forces. In the absence of natural selection, the key quantity is 4Nem, four times the number of migrants per generation. If Nem is greater than one, then differentiation among natural populations from random genetic drift is unimportant. When natural selection is overlaid on migration and genetic drift, patch swamping will occur when immigration from hatcheries is greater than the strength of selection for local adaptation. Patch swamping also occurs more quickly with long-distance hatchery migration than with migration into a single natural population. Linkage between loci with adapted alleles, however, increases a wild population's ability to resist the invasions of non-adapted alleles.


Haldane, J. B. S. 1930. A mathematical theory of natural and artificial selection. VI. Isolation. Proceedings of the Cambridge Philosophical Society 26:220-230.

Hanson, W. D. 1966. Effects of partial isolation (distance), migration, and different fitness requirements among environmental pockets upon steady state gene frequencies. Biometrics 22:453-468.

Slatkin, M. 1975. Gene flow and selection in a two-locus system. Genetics 81:787-802.


Question: Mike Lynch: If we know a specific straying rate, what you showed was the greater the strength of adaptive selection, the lower the equilibrium frequency of deleterious alleles. So from a fishery point of view, the question might be that, given a particular amount of migration and a particular strength of selection, how does a natural population 'feel'? Is this analogous to mutational load where the load on the population does not depend on selection, only on mutation?

Answer: Joe Felsenstein: Yes, the analogy with mutational load is correct. The fitness of a wild population will be controlled in much the same way that the fitness of a population receiving deleterious mutations is reduced. This is the concept of mutational load. The effects of hatchery straying would be called migrational load, and you can use the principle that deleterious alleles will sooner or later be selected out. When deleterious alleles are selected out, you have one reproductive failure (death) for each copy of the deleterious allele that comes into the population. On the other hand, if the individuals being eliminated through selection carry multiple hatchery alleles, then there will be less than one death per allele, because each death eliminates more than one deleterious allele.

If migration is 10% in a set of populations that have formed a cline because the hatchery alleles are being resisted, the fitness of a population is reduced by 10%, and you do not need to know what the actual selection coefficient of the allele is. It is the migration rate that is most important in reducing fitness. However, if the adapted patches are swamped by hatchery alleles, you can 'calculate' the effects by saying that the natural populations used to have an adaptive allele, but do not have it any more. In this case, the reduction in fitness depends on the selection coefficient.
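The contrast between these two cases can be checked in a one-population sketch. It assumes one-way immigration at rate m from a hatchery lacking the favored allele, followed by multiplicative selection, and it measures fitness per gene copy relative to a population fixed for the favored allele. That bookkeeping, the single-population setting, and the function name are assumptions made for illustration; a per-individual measure under diploid multiplicative fitnesses would give somewhat different numbers.

```python
def migration_load(m, s, generations=20000):
    """Proportional fitness reduction at equilibrium.

    Fitness per gene copy is 1 + s for the favored allele and 1 for the
    hatchery allele; the load is measured after migration and before
    selection, relative to a population with only the favored allele.
    """
    p = 1.0
    for _generation in range(generations):
        p_migrated = (1.0 - m) * p
        mean_fitness = 1.0 + s * p_migrated
        p = p_migrated * (1.0 + s) / mean_fitness
    mean_fitness = 1.0 + s * (1.0 - m) * p
    return 1.0 - mean_fitness / (1.0 + s)


# Hatchery alleles resisted (m well below s): load tracks m, not s.
print(round(migration_load(m=0.01, s=0.05), 4))
print(round(migration_load(m=0.01, s=0.10), 4))
# Patch swamped (m above s): load is set by s instead.
print(round(migration_load(m=0.20, s=0.05), 4))
```

In the first two runs the load equals the migration rate regardless of the selection coefficient, and in the swamped case it equals s / (1 + s).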

Question: Gary James: You have assumed that hatchery alleles are not favorable in natural habitats, but what if the hatchery alleles do show favorable traits for local adaptation?

Answer: Joe Felsenstein: If that is the case, then only genetic drift is important. With more than 1/4 of a migrant per generation, the natural populations will all have similar allelic frequencies. If not, they will genetically diverge by genetic drift. Allele-frequency clines would not appear, patch swamping would be unimportant, and fitness in the wild populations would not change.

Question: Audience: It seems from what salmon biologists know about hatchery straying that the value of Nem is probably larger than one, so how do we use these principles?

Answer: Joe Felsenstein: If this value is greater than one, the frequencies of selectively neutral genes in the natural populations will be pushed toward the frequencies of these genes in the hatchery population. If wild alleles have higher fitness than hatchery alleles, the effects depend on the balance between the migration rate and the selection coefficient.

Question: Audience: Why did you use 5,000 generations in these simulations? Most salmon populations we work with have not been in the rivers for that long because of events in the Late Pleistocene, and selection has changed over that time and will continue to change in the future.

Answer: Joe Felsenstein: I thought 5,000 generations would be enough to show whatever might happen, but most of the changes took place in tens of generations. In most of these simulations, equilibrium conditions arrived in a short while; I continued the simulations just to make sure. I hope you do not come away with the impression that these are only very long-term problems.

Comment: Robin Waples: One of the limitations of theoretical population genetics is that it is difficult to make the transition from dealing with frequencies of alleles at a single locus to what happens in organisms as a whole, which have thousands of gene loci affecting fitness.
