www.elsevier.com/locate/gene
Abstract
Some DNA data show patterns of variation not expected under the neutral theory. Here, the independent multicodon (IMC) model, a nearly
neutral mutation model assuming no interaction among codons, was studied by computer simulation for the case where population size changes.
Patterns of variation expected under the model were investigated using the statistics of neutrality tests. The average dispersion index is greater
than one when population size changes slowly, but it never becomes large. The diversity at linked silent sites decreases when the strength of
selection is intermediate, and the reduction is larger when population size changes slowly. Tajima's (1989. Genetics 123, 585–595) D is
generally negative. Rejections by Tajima's test occur more frequently if population size changes quickly, but the effect of selection is
confounded with the size change itself in this case. If we apply the test of McDonald and Kreitman (1991. Nature 351, 652–654), the rejection
is always in the direction of excess replacement polymorphisms. The rejection probability decreases as the rate of population size change
decreases. These results show that the predictions of the IMC model are consistent with the pattern observed in mitochondrial DNA data but
not consistent with some nuclear DNA data. Interaction among codons or variable selection would be necessary to explain such cases.
© 2000 Elsevier Science B.V. All rights reserved.
Keywords: Nearly neutral model; Neutrality tests; Random genetic drift; Bottleneck effect
Abbreviations: IMC model, independent multicodon model
* Corresponding author. Tel.: +81-92-726-4577; fax: +81-92-726-4644.
E-mail address: htachscb@mbox.nc.kyushu-u.ac.jp (H. Tachida).
0378-1119/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved.
PII: S0378-1119(00)00475-3
H. Tachida / Gene 261 (2000) 3–9

1. Introduction

Recent advances in molecular techniques allow us to obtain DNA sequences fairly easily, and data on variation in protein-coding DNA sequences within and between species have been accumulating. Some of those data show patterns of variation not expected under the neutral theory, which states that the main cause of evolutionary change at the molecular level is random fixation of selectively neutral or very nearly neutral mutations rather than Darwinian selection (Kimura, 1968, 1983). First, a large dispersion index, which measures the extent of variation in substitution rate, was observed in mammals (Gillespie, 1989; Ohta, 1995). Secondly, low levels of diversity at silent sites were observed in regions of low recombination in Drosophila (Begun and Aquadro, 1992), Lycopersicon (Stephan and Langley, 1998) and human (Nachman et al., 1998). Finally, applications of neutrality tests such as those of Tajima (1989), Fu and Li (1993), HKA (Hudson et al., 1987) and McDonald and Kreitman (1991) revealed patterns not compatible with predictions of the strict neutral model (see Moriyama and Powell, 1996). These data suggest some action of selection on protein coding genes. But we still do not know what type of selection is acting on these sequences.

One candidate model involving selection is the nearly neutral mutation model originally proposed by Ohta (1973, 1992). In the neutral model, selection coefficients of mutants are assumed to have a bimodal distribution. More concretely, let $N_e$ and $s$ be the effective population size and the selection coefficient of a mutant, respectively. In the neutral model, $2N_e s$ is either much larger or much smaller than one, so that only very neutral mutations with $2N_e s \ll 1$ contribute to polymorphisms in populations and substitutions among species. On the other hand, in the nearly neutral model, selection coefficients are distributed continuously and some mutations have $s$ with $2N_e s$ of the order of one. Thus, random genetic drift and selection both affect the fates of mutant genes.

In the present paper, I report a simulation study on the sequence pattern expected under the independent multicodon (IMC) model (Tachida, 2000) when population size changes. The model is a variant of the nearly neutral mutation model. With constant population size, it was shown that the dispersion index was close to one, Tajima's D was negative and excess replacement polymorphisms compared
to the neutral expectation were detected by the McDonald and Kreitman (1991) test (Tachida, 2000). However, because the behavior of the nearly neutral mutation models depends on population size, we need to explore the pattern expected under the model when population size changes.

2. Model and simulation

To simulate protein evolution, we need to specify two aspects: the relationship between gene and fitness, and the population structure.

2.1. Gene model

The gene model studied is one of the simplest mimicking a protein coding gene. A gene consists of $n$ codons, each comprising three sites. Each site allows four alleles and the mutation rate to each of the other three alleles is $u/3$. The first and second sites determine the amino acid of the codon, so that there are 16 kinds of amino acids. They are called replacement sites. The third site is a silent site. The fitness of an amino acid specified by a codon is determined by drawing a random number from a normal distribution with variance $\sigma^2$ at the beginning of a simulation. The fitness of a gene is determined multiplicatively by the fitness of amino acids at the respective codons. Because fitness is determined multiplicatively, there is no interaction among codons. Hence, the model is called the independent multicodon (IMC) model (Tachida, 2000) and is one of the simplest nearly neutral models imitating a protein coding gene. Another extreme model is the house-of-cards (fixed) model in which all codons interact (Ohta and Tachida, 1990). Between these extreme models, various intermediate ones can be conceived (Kauffman, 1993; Ohta, 1997) but as a first step, we stick to the simplest model. Both advantageous and deleterious mutations occur, but most mutations become deleterious after the population evolves to high fitness. We assume that no recombination occurs within a gene.

2.2. Population structure

A random mating haploid population with discrete generations (the Wright–Fisher model) is assumed, but its size fluctuates stochastically in the simulation. For simplicity the size takes only two values, $N_1$ and $N_2$, with $N_1 \ge N_2$. The waiting times for size change are $T_1$ for $N_1$ and $T_2$ for $N_2$, and they are assumed to be distributed geometrically with means $t_1$ and $t_2$, respectively. In the simulation, exponential random numbers truncated to integers were used as approximations to geometrical random numbers for generating $T_i$. So there are four population parameters, $N_1$, $N_2$, $t_1$ and $t_2$.

In order to compare effects of changing population size, I define the effective population size based on the average homozygosity, $f$, under the infinite allele model (see Kimura, 1983) at a site when all sites are neutral. The homozygosity in the equilibrium state is computed to be (M. Iizuka, H. Tachida and H. Matsuda, personal communication)

\[
f = \frac{1}{(t_1 + t_2)(1 - A_1 A_2)} \left[ \frac{(1 - A_1)(t_1 + t_2 A_2)}{1 + 2N_1 u} + \frac{(1 - A_2)(t_2 + t_1 A_1)}{1 + 2N_2 u} \right] \qquad (1)
\]

where

\[
A_i = \frac{1}{(1 + 2N_i u)\,\dfrac{t_i}{N_i} + 1}
\]

This formula can be obtained by first noting that the probability of population size being $N_i$ at a random time is $t_i/(t_1 + t_2)$ and then computing the expected homozygosity applying Eq. (6) of Nei et al. (1975) for each case. As $t_1$ and $t_2$ get larger compared to $N_1$ and $N_2$, the homozygosity approaches

\[
\frac{t_1}{t_1 + t_2} \cdot \frac{1}{1 + 2N_1 u} + \frac{t_2}{t_1 + t_2} \cdot \frac{1}{1 + 2N_2 u}
\]

as it should. For a given mutation rate, the effective size, $N_e$, is defined as the size of a population that has constant size and the same homozygosity as the one with changing size. The effective size can be computed from Eq. (1). The reason why the effective size is defined in this way is that we can observe only the current DNA data and have little information on past population size. So our strategy is to carry out simulations with parameter values that lead to nucleotide diversity similar to the observed values.

2.3. Simulation

Computer simulation was carried out following the model specified above. After about $5/u$ generations to avoid effects of the initial state, the population is split into two of the same size. These two populations evolve independently after the split. Their size changes are also independent. Genes were sampled from the two populations at intervals of $0.05/u$ generations after the split of the population and various test statistics for neutrality were computed. The statistics examined are the dispersion index ($I$), nucleotide diversity ($\pi$), Tajima's D and the statistics for the McDonald and Kreitman test. In all simulations, $N_e = 500$ and $u = 10^{-5}$ so that $2N_e u = 0.01$. Only cases with $N_1 = 10N_2$ and $N_1 = 40N_2$ were examined. For $N_1 = 10N_2$, effects of the rate of size change were investigated. Intensity of selection is measured by a parameter $\alpha = 2N_e \sigma$ and it is changed from 0.2 to 50. Population parameters used in the simulations are listed in Table 1. The number of codons, $n$, is 400 so that there are 1200 sites in a gene. The number of replications was 1000 for the Quick, Medium and Slow cases and 500 for the Large case. For comparison, the result for the case with constant population size (Tachida, 2000) will
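The gene model of Section 2.1 can be illustrated with a short Python sketch. It is only a sketch under stated assumptions, not the simulation code used in the paper: the text says that an amino acid's fitness is drawn from a normal distribution with variance $\sigma^2$ but not how the draw enters the product, so the factor is assumed here to be one plus the draw. The allele coding 0 to 3 and all function names are hypothetical.

```python
import math
import random

N_ALLELES = 4        # alleles per site, as in the text
AA_PER_CODON = 16    # 4 x 4 combinations of the two replacement sites


def make_fitness_table(n_codons, sigma, rng):
    """Draw one fitness factor per (codon position, amino acid) at the start.

    ASSUMPTION: the factor is 1 + x with x ~ N(0, sigma^2); the text only
    states that a normal distribution with variance sigma^2 is used.
    """
    return [[1.0 + rng.gauss(0.0, sigma) for _ in range(AA_PER_CODON)]
            for _ in range(n_codons)]


def amino_acid(codon):
    """Map a codon (three alleles coded 0..3) to one of 16 amino acids.

    Only the first and second sites matter; the third site is silent.
    """
    first, second, _silent = codon
    return N_ALLELES * first + second


def gene_fitness(gene, table):
    """Multiply the factors over codons, so codons do not interact."""
    return math.prod(table[i][amino_acid(c)] for i, c in enumerate(gene))


def mutate_site(gene, codon_idx, site_idx, rng):
    """Return a copy of the gene with one site changed to another allele."""
    new_gene = [list(c) for c in gene]
    old = new_gene[codon_idx][site_idx]
    others = [a for a in range(N_ALLELES) if a != old]
    new_gene[codon_idx][site_idx] = rng.choice(others)
    return [tuple(c) for c in new_gene]


rng = random.Random(1)
table = make_fitness_table(n_codons=5, sigma=0.01, rng=rng)
gene = [(0, 1, 2)] * 5
silent_mutant = mutate_site(gene, codon_idx=2, site_idx=2, rng=rng)
# A third-site change leaves the amino acid, and hence the fitness, unchanged.
print(gene_fitness(gene, table) == gene_fitness(silent_mutant, table))
```

Because the table is indexed by codon position and amino acid only, a mutation at one codon changes exactly one factor of the product, which is the sense in which the codons are independent.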
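The two-state size process of Section 2.2 can be sketched in the same way. The waiting times are generated as the text describes, by truncating exponential random numbers to integers. The choice of starting state is an assumption, as are the function names and the parameter values in the example, which are not taken from Table 1.

```python
import random


def waiting_time(mean, rng):
    """Exponential random number with the given mean, truncated to an integer.

    A value of 0 simply means that the period is skipped.
    """
    return int(rng.expovariate(1.0 / mean))


def size_trajectory(n1, n2, t1, t2, generations, rng):
    """Return the population size in each generation, alternating n1 and n2."""
    # ASSUMPTION: the starting state is drawn with probability t1 / (t1 + t2)
    # of being n1; the text does not say how the process is started.
    state = 0 if rng.random() < t1 / (t1 + t2) else 1
    sizes = []
    while len(sizes) < generations:
        size, mean = (n1, t1) if state == 0 else (n2, t2)
        sizes.extend([size] * waiting_time(mean, rng))
        state = 1 - state
    return sizes[:generations]


rng = random.Random(42)
sizes = size_trajectory(n1=1000, n2=100, t1=50, t2=50, generations=20, rng=rng)
print(sizes)
```

A full simulation would run one Wright–Fisher generation per entry of the returned list, using that entry as the number of haploid individuals.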
Table 1
Parameter values for the population structure used in the simulation
Simulation N1 N2 t1 t2 u
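For any parameter set of the kind listed in Table 1, the effective size defined in Section 2.2 can be evaluated numerically. The sketch below codes Eq. (1) as it is read here, with $A_i = 1/[(1 + 2N_i u)\,t_i/N_i + 1]$, and then solves $1/(1 + 2N_e u) = f$ for $N_e$. The function names and the example parameter values are hypothetical and are not the values used in the paper.

```python
def a_factor(n, t, u):
    """A_i of Eq. (1) for a state with size n and mean waiting time t."""
    return 1.0 / ((1.0 + 2.0 * n * u) * t / n + 1.0)


def homozygosity(n1, n2, t1, t2, u):
    """Equilibrium homozygosity f of Eq. (1), as read from the text."""
    a1 = a_factor(n1, t1, u)
    a2 = a_factor(n2, t2, u)
    term1 = (1.0 - a1) * (t1 + t2 * a2) / (1.0 + 2.0 * n1 * u)
    term2 = (1.0 - a2) * (t2 + t1 * a1) / (1.0 + 2.0 * n2 * u)
    return (term1 + term2) / ((t1 + t2) * (1.0 - a1 * a2))


def effective_size(n1, n2, t1, t2, u):
    """Constant size giving the same homozygosity: 1 / (1 + 2 Ne u) = f."""
    f = homozygosity(n1, n2, t1, t2, u)
    return (1.0 / f - 1.0) / (2.0 * u)


# Hypothetical parameter values, chosen only to exercise the functions.
params = dict(n1=1000, n2=100, t1=200, t2=200, u=1e-5)
print(effective_size(**params))
```

Two checks follow from the text and from the form of the formula: with $N_1 = N_2$ the function returns that common size, and for very large $t_1$ and $t_2$ the homozygosity approaches the time-weighted average of $1/(1 + 2N_i u)$.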
Fig. 6. The rejection probability by Tajima's test at the 5% level as functions of $\alpha = 2N_e \sigma$. The sample size is $m = 50$.
Fig. 7. The rejection probability by the McDonald and Kreitman (1991) test at the 5% level as functions of $\alpha = 2N_e \sigma$. Twenty samples ($m = 20$) were taken from each population at $0.1/u$ generations after the split of the two populations.
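The direction of a McDonald and Kreitman rejection can be read from the ratio $\zeta$ that the text defines as $[N(fr)/N(fs)]/[N(pr)/N(ps)]$. The following sketch computes $\zeta$ and its base-10 logarithm. The function names are hypothetical, zero counts in a denominator are simply rejected (the text does not say how they were handled), and the Adh counts in the example are the values commonly quoted from McDonald and Kreitman (1991), which should be checked against that paper.

```python
import math


def zeta(n_fr, n_fs, n_pr, n_ps):
    """Ratio of fixed replacement/silent to polymorphic replacement/silent."""
    # ASSUMPTION: tables with a zero denominator are rejected outright.
    if n_fs == 0 or n_pr == 0 or n_ps == 0:
        raise ValueError("zeta is undefined when a denominator count is zero")
    return (n_fr / n_fs) / (n_pr / n_ps)


def direction(z):
    """Interpret zeta as the text does."""
    if z < 1:
        return "excess replacement polymorphisms"
    if z > 1:
        return "excess replacement fixations"
    return "no excess"


# Adh in Drosophila: fixed replacement 7, fixed silent 17,
# polymorphic replacement 2, polymorphic silent 42 (as commonly quoted).
z_adh = zeta(n_fr=7, n_fs=17, n_pr=2, n_ps=42)
print(round(math.log10(z_adh), 2), direction(z_adh))
# prints: 0.94 excess replacement fixations
```

The value 0.94 agrees with the figure the text gives for Adh, whereas the mitochondrial examples cited in the text have negative $\log_{10}\zeta$.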
tions after the split. The McDonald and Kreitman test can detect the present type of selection very well if population size is constant. However, if population size changes, the rejection probability decreases, especially when it changes slowly.

In order to see how the rejection occurs, we define a statistic $\zeta$. Let $N(fr)$, $N(fs)$, $N(pr)$ and $N(ps)$ be the numbers of fixed-replacement, fixed-silent, polymorphic-replacement and polymorphic-silent differences, respectively. We define the ratio $\zeta$ as

\[
\zeta = \frac{N(fr)/N(fs)}{N(pr)/N(ps)}
\]

This ratio is expected to be approximately one under neutrality. If $\zeta < 1$, there are excess replacement polymorphisms. If $\zeta > 1$, there are excess replacement fixations. The distribution of $\log_{10}\zeta$ when the rejection probability is highest ($\alpha = 20.0$) is shown in Fig. 8. For comparison, that in the Constant case with $\alpha = 20.0$ is also shown. With the current type of selection, the distribution is shifted toward negative values and is wider. Since $\log_{10}\zeta$ is negative, the rejection is always in the direction of excess replacement polymorphisms. As the size change becomes slower, the distribution is shifted toward positive values, but $\zeta$ rarely becomes more than one even in the Slow case.

This pattern of excess replacement polymorphisms appears in mitochondrial data but is rarely observed in nuclear data. For example, in mitochondrial DNA, $\log_{10}\zeta$ is −0.46 in human-chimpanzee (Nachman et al., 1996), −0.99 in mice (Nachman et al., 1994) and −0.36 (Rand et al., 1994) and −0.69 (Ballard and Kreitman, 1994) in Drosophila, but it is 0.94 in Adh of Drosophila (McDonald and Kreitman, 1991). Thus, the IMC model can explain results of the McDonald and Kreitman test for mitochondrial DNA data but cannot explain those seen in nuclear data.

4. Conclusion

Acknowledgements

This work was supported in part by a grant from the Program for Promotion of Basic Research Activities for Innovative Biosciences (PROBRAIN) and a grant from the Uehara Memorial Foundation to H.T.

References

Araki, H., Tachida, H., 1997. Bottleneck effect on evolutionary rate in the nearly neutral mutation model. Genetics 147, 907–914.
Ballard, J.W.O., Kreitman, M., 1994. Unraveling selection in the mitochondrial genome of Drosophila. Genetics 138, 757–772.
Begun, D.J., Aquadro, C.F., 1992. Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature 356, 519–520.
Bulmer, M., 1989. Estimating the variability of substitution rates. Genetics 123, 615–619.
Charlesworth, B., Morgan, M.T., Charlesworth, D., 1993. The effects of deleterious mutations on neutral molecular variation. Genetics 134, 1289–1303.
Fu, Y.-X., Li, W.-H., 1993. Statistical tests of neutrality of mutations. Genetics 133, 693–709.
Gillespie, J.H., 1989. Lineage effects and the index of dispersion of molecular evolution. Mol. Biol. Evol. 6, 636–647.
Gillespie, J.H., 1991. The Causes of Molecular Evolution. Oxford University Press, New York.
Gillespie, J.H., 1997. Junk ain't what junk does: neutral alleles in a selected context. Gene 205, 291–299.
Hudson, R.R., Kreitman, M., Aguadé, M., 1987. A test of neutral molecular evolution based on nucleotide data. Genetics 116, 153–159.
Jukes, T.H., Cantor, C.R., 1969. Evolution of protein molecules. In: Munro, H.N. (Ed.), Mammalian Protein Metabolism, Vol. 3. Academic Press, New York, NY, pp. 21–132.
Kaplan, N.L., Hudson, R.R., Langley, C.H., 1989. The 'hitchhiking effect' revisited. Genetics 123, 887–899.
Kauffman, S.A., 1993. The Origins of Order. Oxford University Press, Oxford.
Kimura, M., 1968. Evolutionary rate at the molecular level. Nature 217, 624–626.
Kimura, M., 1983. The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge.
McDonald, J.H., Kreitman, M., 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351, 652–654.
Moriyama, E.N., Powell, J.R., 1996. Intraspecific nuclear DNA variation in Drosophila. Mol. Biol. Evol. 13, 261–277.
Nachman, M.W., Boyer, S.N., Aquadro, C.F., 1994. Nonneutral evolution at the mitochondrial NADH dehydrogenase subunit 3 gene in mice. Proc. Natl. Acad. Sci. USA 91, 6364–6368.
Nachman, M.W., Brown, W.M., Stoneking, M., Aquadro, C.F., 1996. Nonneutral mitochondrial DNA variation in humans and chimpanzees. Genetics 142, 953–963.
Nachman, M.W., Bauer, V.L., Crowell, S.L., Aquadro, C.F., 1998. DNA variability and recombination rates at X-linked loci in humans. Genetics 150, 1133–1141.
Nei, M., Maruyama, T., Chakraborty, R., 1975. The bottleneck effect and genetic variability in populations. Evolution 29, 1–10.
Ohta, T., 1973. Slightly deleterious mutant substitutions in evolution. Nature 246, 96–98.
Ohta, T., 1992. The nearly neutral theory of molecular evolution. Annu. Rev. Ecol. Syst. 23, 263–286.
Ohta, T., 1995. Synonymous and nonsynonymous substitutions in mammalian genes and the nearly neutral theory. J. Mol. Evol. 40, 56–63.
Ohta, T., 1997. The meaning of near-neutrality at coding and non-coding regions. Gene 205, 261–267.
Ohta, T., Tachida, H., 1990. Theoretical study of near neutrality I. Heterozygosity and rate of mutant substitution. Genetics 126, 219–229.
Przeworski, M., Charlesworth, B., Wall, J.D., 1999. Genealogies and weak purifying selection. Mol. Biol. Evol. 16, 246–252.
Rand, D.M., Dorfsman, M., Kan, L.M., 1994. Neutral and nonneutral evolution of Drosophila mitochondrial DNA. Genetics 138, 741–756.
Stephan, W., Langley, C.H., 1998. DNA polymorphism in Lycopersicon and crossing-over per physical length. Genetics 150, 1585–1593.
Tachida, H., 2000. Molecular evolution in a multisite nearly neutral mutation model. J. Mol. Evol. 50, 69–81.
Tajima, F., 1989. Statistical method for testing the neutral mutation hypothesis. Genetics 123, 585–595.
Tajima, F., 1993. Statistical analysis of DNA polymorphism. Jpn. J. Genet. 68, 567–595.
Zeng, L.-W., Comeron, J.M., Chen, B., Kreitman, M., 1998. The molecular clock revisited: the rate of synonymous vs. replacement change in Drosophila. Genetica 102/103, 369–382.