A Writing Project
Presented to
The Faculty of the Department of Mathematics
San José State University
In Partial Fulfillment
of the Requirements for the Degree
Master of Arts
by
Simon A. Ward
May 2009
© 2009
Simon A. Ward
ALL RIGHTS RESERVED
APPROVED FOR THE DEPARTMENT OF MATHEMATICS
by Simon A. Ward
The divergence or convergence of the series of prime reciprocals was finally resolved
by Euler in 1744. Euler showed directly that this series is divergent, which shed some light
on the density of primes. The divergence shows that the primes are not so sparse that the sum
of their reciprocals converges. The asymptotic formula for the partial sums of reciprocals of
primes is the cornerstone of Mertens’ theorem. The formula is derived and the constant in the
formula is analyzed to show that the probability that a random integer is prime decreases to
zero with large numbers. This probability estimate is known as the product form of Mertens’
theorem. Many proofs of the divergence of the series of prime reciprocals are reviewed in
detail, including modern proofs by Erdös, Dux, Clarkson and Niven. Chebyshev’s theorem
provides an upper and lower bound for the number of primes up to any given number.
The proof of this theorem is explained in detail, and the divergence of the series of prime
reciprocals is shown to be a consequence. This paper details many of the important results
in the history of the development of the distribution of prime numbers. The sum of prime
reciprocals is shown from many different approaches; and along the way many supporting
lemmas and other useful results in elementary number theory are discussed and proved. The
reader will come away with a good understanding of the problem of counting prime numbers
and a motivation for understanding and proving the prime number theorem.
ACKNOWLEDGEMENTS
I would like to thank Daniel Goldston for the guidance and resources he provided. Thank
you to committee members Marylin Blockus and Mohammed Saleem for many helpful
suggestions regarding this project. Thank you to my wife Nora and my family for their support
while I completed this project.
TABLE OF CONTENTS
§1: Introduction
1.1: Prime Numbers
1.2: A Formula for Prime Numbers
1.3: The Prime Number Theorem
1.4: Euler’s Product
1.5: The Divergence of the Sum of Prime Reciprocals
§2: The Divergence of $\sum_p \frac{1}{p}$
2.1: Euler’s Proof
2.2: Erdös’ Proof
2.3: Dux’s Proof
2.4: Clarkson’s Proof
2.5: Niven’s Proof
§3: Chebyshev’s Theorem
3.1: Theorem (Chebyshev)
3.2: The Chebyshev Function Theorem
3.3: Proof of Chebyshev’s Theorem
3.4: Another Small Step Towards the Prime Number Theorem
References
§1: Introduction
The prime numbers are the natural numbers which have exactly two distinct divisors,
one and themselves. The Chinese thought of the prime numbers as “macho” numbers,
which attempted to resist any attempt to break them down into a product of smaller
integers [17]. From the fundamental theorem of arithmetic, we can conclude that the prime
numbers are the most basic elements of the integers. That is, every positive integer greater
than 1 can be written uniquely as a product of primes, with the prime factors in the
product written in nondecreasing order [15]. A composite number is a number greater than
1 that is not prime. Thus 1 is the only natural number which is neither prime nor
composite. The integers also form an integral domain, whose field of quotients forms the
rational numbers, which in turn can be used to construct the real numbers [11]. The
fundamental importance of prime numbers to mathematics makes them worthy of special
study.
The ancient civilizations which emerged in Babylon, Egypt and China have left
evidence that they had an understanding of prime numbers. However, the properties of
prime numbers were first formally documented by the Pythagoreans before 500 B.C. in
Greece [21]. In the third century BC, the Greek Eratosthenes invented his famous sieve,
which could, with relative ease, determine all the primes up to a given number. For
example, suppose we want to create a list of primes up to a modest number, say 100. The
Sieve of Eratosthenes works by starting with 2, and deleting all numbers greater than 2
that are multiples of 2 (even numbers), all numbers greater than 3 that are multiples of 3,
and for each successive prime not exceeding $\sqrt{100} = 10$, deleting all larger multiples of that
prime. We do not need to consider a prime greater than 10, since any composite number
all of whose prime factors are greater than 10 would exceed 100 (the first such integer being
$11^2 = 121$). The numbers that remain in the list must be prime. This method allowed
mathematicians from antiquity to begin making tables of prime numbers [17]. Around the
same time as Eratosthenes, Euclid’s book Elements was published. This book contained a
proof that there are infinitely many primes. The elegance of the proof allows us to present
it here in one short paragraph.
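The sieve described above can be sketched in a few lines of Python. This is only an illustration: the function name `sieve_of_eratosthenes` is our own, and starting each pass at $p^2$ is a common shortcut that the text does not mention.

```python
def sieve_of_eratosthenes(limit):
    """Return the list of all primes <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:              # primes above sqrt(limit) need no pass
        if is_prime[p]:
            # Smaller multiples of p were already deleted by smaller primes,
            # so starting at p*p is enough (an optimization, not in the text).
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


primes = sieve_of_eratosthenes(100)
print(len(primes), primes)
```

For a limit of 100 only the primes 2, 3, 5 and 7 trigger a deletion pass, in agreement with the bound $\sqrt{100} = 10$.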
Suppose there are only $n$ primes, and consider the sequence of primes $2, 3, 5, \ldots, p_n$.
Take the product of these $n$ primes, $2 \cdot 3 \cdots p_{n-1} \cdot p_n$, and then add one to it, forming the
integer $N = 2 \cdot 3 \cdots p_{n-1} \cdot p_n + 1$. By the fundamental theorem of arithmetic, $N$ must have a
prime divisor. But if the prime divisor is any one of the $p_i$, $1 \le i \le n$, then $p_i$ must divide
$N - 2 \cdot 3 \cdots p_{n-1} \cdot p_n = 1$. This implies $p_i$ divides one, which is a contradiction. Thus our
assumption that there are finitely many primes is false, and hence there must be infinitely
many primes. Euclid was responsible for the only known proof of the infinitude of the
primes for over 2,000 years!
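A small numerical illustration of Euclid's construction may help. The sketch below takes the first six primes (our choice), forms $N$, and confirms that $N$ leaves remainder 1 on division by each of them, so any prime factor of $N$ is new. The helper `smallest_prime_factor` is a hypothetical name introduced only for this example.

```python
from math import prod


def smallest_prime_factor(n):
    """Smallest prime dividing n (n >= 2), found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


first_primes = [2, 3, 5, 7, 11, 13]
N = prod(first_primes) + 1                     # 2*3*5*7*11*13 + 1
remainders = [N % p for p in first_primes]     # every remainder equals 1
new_prime = smallest_prime_factor(N)
print(N, remainders, new_prime)
```

Note that $N$ itself need not be prime: here $N = 30031 = 59 \cdot 509$. The argument only needs a prime factor outside the list.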
can one find a formula involving n which will produce the nth prime? Mathematicians have
tried for centuries to find such a formula, however no formula is known. It is highly likely
that no such formula exists, since the gaps between consecutive primes are apparently
random. One might then search for a weaker function, that is a function that assumes only
prime values, not necessarily sequentially. Over the course of mathematical history, there
have been a few formulas that have been conjectured to be prime valued. One such
function was proposed by Pierre de Fermat (1605-1665) to always be prime valued, and
produces the $n$th “Fermat” number $F_n = 2^{2^n} + 1$. The first four Fermat numbers, 5, 17, 257 and
65,537, are all prime. The reader might recognize $F_n$ from a well known proof by Karl
Gauss (1777-1855), that if $F_n$ is prime, then a regular polygon of $F_n$ sides can be inscribed
in a circle with only a compass and straightedge. Fermat’s conjecture was proved false by
Euler in 1732, when he showed that 641 is a divisor of the fifth Fermat number
$F_5 = 2^{32} + 1$ = 4,294,967,297. Another theorem from Fermat provides a way to find
divisors of very large numbers [15].
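Euler's counterexample is easy to confirm with exact integer arithmetic. The following sketch, with `fermat_number` as an illustrative helper name, lists $F_1$ through $F_5$ and divides $F_5$ by 641.

```python
def fermat_number(n):
    """Return F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


for n in range(1, 6):
    print(n, fermat_number(n))

F5 = fermat_number(5)
quotient, remainder = divmod(F5, 641)
print(quotient, remainder)    # the remainder is 0, so 641 divides F_5
```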
Fermat’s Little Theorem: If $p$ is a prime and $a$ is a positive integer with $(a, p) = 1$, then
$a^{p-1} \equiv 1 \pmod{p}$, where $(m, n)$ denotes the greatest common divisor of $m$ and $n$.
Proof: Consider the $p - 1$ integers $a, 2a, 3a, \ldots, (p-1)a$. None of these integers is
divisible by $p$, since if $p \mid ka$, then since $(a, p) = 1$, $p \mid k$. But $1 \le k \le p - 1$, and hence it is
not possible that $p \mid k$. Furthermore, no two of these integers are congruent mod $p$. For if
$sa \equiv ta \pmod{p}$ with $1 \le s < t \le p - 1$, then since $(a, p) = 1$, $s \equiv t \pmod{p}$, which is
impossible for distinct integers in this range. Since none of the integers is divisible by $p$, and no two are congruent to
each other, they represent in some order the nonzero residue classes mod $p$. That is, they are
congruent, in some order, to $1, 2, \ldots, p - 1$. Therefore, by the multiplication properties of congruence, we
have $a^{p-1}(p-1)! \equiv (p-1)! \pmod{p}$. Now since $((p-1)!, p) = 1$, it follows that $a^{p-1} \equiv 1 \pmod{p}$.
The power of this theorem is that it gives information about the divisors of very large
integers, for Fermat’s little theorem states that $a^{p-1} - 1$ is divisible by $p$ whenever
$(a, p) = 1$. For example, since $(6, 97) = 1$, $6^{96} - 1$ is divisible by 97, and therefore composite.
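Both the key step of the proof and the worked example can be checked directly. In the sketch below, `multiples_permute_residues` is a hypothetical helper that tests whether $a, 2a, \ldots, (p-1)a$ reduce mod $p$ to a rearrangement of $1, \ldots, p-1$; Python's built-in three-argument `pow` performs modular exponentiation.

```python
def multiples_permute_residues(a, p):
    """True if a, 2a, ..., (p-1)a are congruent mod p to 1, ..., p-1 in some order."""
    residues = sorted((k * a) % p for k in range(1, p))
    return residues == list(range(1, p))


p, a = 97, 6
print(multiples_permute_residues(a, p))   # the step used in the proof
print(pow(a, p - 1, p))                   # a^(p-1) mod p
print((a ** (p - 1) - 1) % p)             # 6^96 - 1 reduced mod 97
```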
Another formula that is used today to compute the largest known primes is the
formula $2^n - 1$. This function is certainly not always prime valued, since $2^4 - 1 = 15 = 3 \cdot 5$.
However, when this number is prime, then $n$ must be a prime by the following [10].
Theorem: If $n > 1$ and $a^n - 1$ is prime, then $a = 2$, and $n$ is prime.
Proof: $a^n - 1 = (a - 1)(a^{n-1} + a^{n-2} + \cdots + a + 1)$, and hence if $a > 2$ then both factors exceed 1 and $a^n - 1$ is
composite. Therefore $a = 2$. If $n$ is composite, say $n = st$ where $1 < s \le t < n$, then
$2^n - 1 = (2^s)^t - 1 = (2^s - 1)((2^s)^{t-1} + (2^s)^{t-2} + \cdots + 2^s + 1)$. Since this number is prime, we
must have $2^s - 1 = 1$, that is $s = 1$. This is a contradiction, and hence $n$ is prime.
When $2^p - 1$ is prime, it is known as a “Mersenne” prime, named after the French
monk Marin Mersenne (1588-1648). Mersenne claimed that this formula gave prime
numbers for $n = 2, 3, 5, 7, 13, 17, 19, 31, 67, 127$ and $257$. It turns out that 67 and 257 do not
give Mersenne primes. However, once a new prime $P$ is found, then one knows that $2^P - 1$
is a good candidate for a large prime. The largest known prime to date is the Mersenne
prime $2^{43,112,609} - 1$ [18], which has about 12,900,000 digits!
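Mersenne's list can be tested by machine. The sketch below uses the Lucas-Lehmer test, a standard primality test for numbers of the form $2^p - 1$ that is not discussed in this paper and is brought in here purely as an illustration; the function name `lucas_lehmer` is our own.

```python
def lucas_lehmer(p):
    """Return True if 2^p - 1 is prime, where p is a prime exponent."""
    if p == 2:
        return True                    # 2^2 - 1 = 3 is prime
    m = (1 << p) - 1                   # the Mersenne number 2^p - 1
    s = 4
    for _ in range(p - 2):             # s_{k+1} = s_k^2 - 2 (mod m)
        s = (s * s - 2) % m
    return s == 0


mersenne_list = [2, 3, 5, 7, 13, 17, 19, 31, 67, 127, 257]
failures = [p for p in mersenne_list if not lucas_lehmer(p)]
print(failures)
```

The exponents that fail the test are exactly the two mentioned above, 67 and 257.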
Other formulas that have been conjectured to be only prime valued are $n^2 - n + 41$,
which is prime for all $0 \le n \le 40$, and $n^2 - 79n + 1601$, which is prime for all $0 \le n \le 79$.
One can stop the search for a prime valued polynomial function with integral coefficients
by the following [10].
Theorem:
No nonconstant polynomial f (n) with integral coefficients, can be prime for all n, or for all
sufficiently large n.
Proof: We can assume that the leading coefficient of $f(n)$ is positive, so that $f(n) \to \infty$ as
$n \to \infty$. Thus $f(n) > 1$ for all $n > N$, for some large $N$. Fix such an $n$ and let $M = f(n) > 1$.
Then
$f(kM + n) = a_s(kM + n)^s + a_{s-1}(kM + n)^{s-1} + \cdots + a_1(kM + n) + a_0$
is divisible by $M$ for every integer $k$, since $f(kM + n) \equiv f(n) \equiv 0 \pmod{M}$, and $f(kM + n) \to \infty$ as $k \to \infty$.
Hence $f(n)$ is composite for infinitely many values. It follows that no such polynomial can
be strictly prime valued.
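The two quadratic examples and the divisibility step of the proof can be checked numerically. In the sketch below, `is_prime`, `f` and `g` are illustrative names, and the base point $n = 2$ for the divisibility check is an arbitrary choice.

```python
def is_prime(n):
    """Primality by trial division (adequate for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def f(n):
    return n * n - n + 41


def g(n):
    return n * n - 79 * n + 1601


print(all(is_prime(f(n)) for n in range(0, 41)))   # prime for 0 <= n <= 40
print(all(is_prime(g(n)) for n in range(0, 80)))   # prime for 0 <= n <= 79
print(f(41), is_prime(f(41)))                      # the run ends: f(41) = 41^2

# Divisibility step of the proof: with M = f(n), M divides f(k*M + n).
n = 2
M = f(n)
print(all(f(k * M + n) % M == 0 for k in range(1, 50)))
```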
Not to be dissuaded, one might search for a function whose range contains an infinity
of primes. A trivial example is f (n) = n. G. Lejeune Dirichlet (1805-1859) has given us the
following.
Theorem:
If a and b are relatively prime, then there are infinitely many primes of the form an + b.
There is no known simple proof of this theorem. Dirichlet’s original proof used complex
variables. A very complicated elementary proof was found in the 1950’s by Atle Selberg (b.
1917) [15]. We will prove a special case here, that is
Theorem: There are infinitely many primes of the form 4n + 3.
To prove this theorem we will require the following lemma.
Lemma: If a and b are integers of the form 4n + 1, then ab is also of this form.
Proof: If a = 4r + 1, and b = 4s + 1, then ab = 16rs + 4r + 4s + 1 = 4(4rs + r + s) + 1,
which is of the form 4n + 1.
Now we shall prove the desired result.
Proof: Suppose there are only finitely many primes of the form $4n + 3$, say
$p_0 = 3, p_1, \ldots, p_k$. Let
\[ Q = 4 p_1 p_2 \cdots p_k + 3. \]
Then by the fundamental theorem of arithmetic, $Q$ must have a prime factorization. At
least one of the primes in this factorization must be of the form $4n + 3$, because the odd
primes must be in either the residue class {1} or {3} mod 4. If none were in {3}, the
lemma above implies $Q$ is also of the form $4n + 1$, which would be a contradiction. Now,
none of the $p_0 = 3, p_1, \ldots, p_k$ divides $Q$. First, 3 does not divide $Q$, for if $3 \mid Q$ then
$3 \mid 4 p_1 p_2 \cdots p_k$, which is impossible since none of $p_1, \ldots, p_k$ equals 3. Similarly, none of the $p_j$, $1 \le j \le k$, can divide
$Q$, because if $p_j \mid Q$, then $p_j \mid 3$, which is clearly a contradiction. Thus $Q$ has a prime factor
of the form $4n + 3$ that is not in the list. Therefore there are
infinitely many primes of the form $4n + 3$. It is still an open question whether simple
quadratic forms such as $n^2 + 1$ assume infinitely many prime values.
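To see the construction in the proof of the $4n + 3$ theorem at work, the sketch below starts from a short list of primes of that form (the list is our choice), builds $Q$, and factors it. The helper `prime_factors` is a hypothetical name for a plain trial-division routine.

```python
from math import prod


def prime_factors(n):
    """Prime factorization of n (with multiplicity) by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


known = [3, 7, 11, 19]                    # p_0 = 3 followed by p_1, ..., p_k
Q = 4 * prod(known[1:]) + 3               # Q = 4 * p_1 * ... * p_k + 3
factors = prime_factors(Q)
new_primes = [q for q in factors if q % 4 == 3 and q not in known]
print(Q, factors, new_primes)
```

Here $Q = 5855 = 5 \cdot 1171$: the factor 5 is of the form $4n + 1$, while 1171 is a prime of the form $4n + 3$ missing from the list, as the proof predicts.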
In the last 50 years or so, several techniques have been developed for generating prime
numbers involving sieving and systems of Diophantine equations. However the formulas are
very complicated and not practical for use [9].
This sum is much more accurate than $\frac{N}{\log N}$. However, as the prime number theorem
states, they are both asymptotic to $\pi(N)$. The formula $\frac{N}{\log N}$ is easier to use, and when
dealing with large enough numbers is usually sufficient for the estimation of $\pi(N)$.
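The quality of the estimate $N / \log N$ can be seen by counting primes directly. The sketch below is an illustration only; `prime_count` is our own name for a sieve-based implementation of $\pi(N)$, and the sample values of $N$ are arbitrary.

```python
from math import isqrt, log


def prime_count(N):
    """Return pi(N), the number of primes <= N, using a sieve."""
    if N < 2:
        return 0
    sieve = bytearray([1]) * (N + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(N) + 1):
        if sieve[p]:
            # Zero out p*p, p*p + p, ... in one slice assignment.
            sieve[p * p::p] = bytearray(len(range(p * p, N + 1, p)))
    return sum(sieve)


for N in (10**3, 10**4, 10**5, 10**6):
    estimate = N / log(N)
    print(N, prime_count(N), round(estimate), prime_count(N) / estimate)
```

The ratio in the last column decreases slowly towards 1, which is the content of the prime number theorem.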
1.4 Euler’s Product
After the fall of Ancient Greece, history saw little advancement in the theory of prime
numbers until the 17th century, with significant contributions from Fermat and Mersenne,
and later when the Swiss mathematician Leonhard Euler (1707-1783) published Variae
observationes circa series infinitas in 1744 [16]. Euler introduced the famous product that
now bears his name,
\[ \sum_{n=1}^{\infty} \frac{1}{n} = \prod_{p} \left(1 - \frac{1}{p}\right)^{-1}. \tag{1.4.1} \]
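Throughout this paper the index $p$ runs over all the primes. Fortunately, there were a few
important results from medieval mathematics, and perhaps the most important result is
due to Nicole Oresme (1323-1382), that the summation on the left hand side of (1.4.1), or
the “harmonic” series, is divergent. Oresme noticed an inequality when he doubled the
index of the partial sums,
\begin{align*}
S_2 &= 1 + \frac{1}{2} \\
S_4 &= 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} > 1 + \frac{1}{2} + \underbrace{\frac{1}{4} + \frac{1}{4}}_{\frac{1}{2}} \\
S_8 &= 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} > 1 + \frac{1}{2} + \underbrace{\frac{1}{4} + \frac{1}{4}}_{\frac{1}{2}} + \underbrace{\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}}_{\frac{1}{2}} \\
&\;\;\vdots \\
S_{2^n} &\ge 1 + \frac{n}{2}
\end{align*}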
and since n was arbitrary, the series diverges. From this known result, (1.4.1) yielded the
first new proof of the infinitude of prime numbers in two millennia. This is due to the
simple fact that for the product on the right hand side to diverge, we must necessarily have
infinitely many indices (primes) in the product.
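Oresme's inequality for the partial sums $S_{2^n}$ can be confirmed with exact rational arithmetic. The sketch below uses Python's `fractions` module; the function name `harmonic` and the range of $n$ are our own choices.

```python
from fractions import Fraction


def harmonic(N):
    """Exact partial sum S_N = 1 + 1/2 + ... + 1/N."""
    return sum(Fraction(1, k) for k in range(1, N + 1))


for n in range(1, 11):
    S = harmonic(2 ** n)
    bound = 1 + Fraction(n, 2)
    print(n, float(S), float(bound), S >= bound)
```

Equality holds only for $n = 1$; from $n = 2$ on the inequality is strict, and since the lower bound $1 + n/2$ is unbounded, so are the partial sums.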
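Euler’s product (1.4.1) is actually not very difficult to prove. Euler
started by supposing that the harmonic sum converges to $S$, that is
\[ S = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \cdots \]
and dividing by two
\[ \frac{S}{2} = \frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \frac{1}{8} + \frac{1}{10} + \cdots \]
therefore, subtracting the second series from the first,
\[ \frac{S}{2} = 1 + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \frac{1}{9} + \cdots \tag{1.4.2} \]
Since all the terms in (1.4.2) have odd denominators, in a similar fashion to the Sieve of
Eratosthenes, we can eliminate all denominators that are multiples of 3 by dividing (1.4.2)
by 3, to get
\[ \frac{S}{2 \cdot 3} = \frac{1}{3} + \frac{1}{9} + \frac{1}{15} + \frac{1}{21} + \cdots \tag{1.4.3} \]
then subtracting (1.4.3) from (1.4.2) to obtain
\[ \frac{1 \cdot 2}{2 \cdot 3}\, S = 1 + \frac{1}{5} + \frac{1}{7} + \frac{1}{11} + \frac{1}{13} + \cdots \]
Continuing this process indefinitely, Euler obtained
\[ \frac{1 \cdot 2 \cdot 4 \cdots (p - 1) \cdots}{2 \cdot 3 \cdot 5 \cdots p \cdots}\, S = 1, \]
and cross multiplying gives
\[ S = \frac{2 \cdot 3 \cdot 5 \cdots p \cdots}{1 \cdot 2 \cdot 4 \cdots (p - 1) \cdots} = \sum_{n=1}^{\infty} \frac{1}{n} = \prod_{p} \left(1 - \frac{1}{p}\right)^{-1}, \]
which is (1.4.1).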
Euler generalized (1.4.1) for any real number $s > 1$ [4]
\[ \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p} \left(1 - \frac{1}{p^s}\right)^{-1}. \tag{1.4.4} \]
This product opened up many doors into exploring the distribution of the prime numbers.
The left hand side is the Riemann zeta-function evaluated at $s$. When Bernhard Riemann
(1826-1866) began plugging complex numbers in for $s$ in (1.4.4), a whole new mathematical
landscape opened up. Riemann found that information in the complex zeros of the
zeta-function could be used to find the best known formula for counting primes up to $N$,
and best of all, he found information about the zeros which corresponded to the exact error
in his formula for the number of primes up to $N$. He had found it: an exact formula for the
number of primes up to any given $N$. To complete this program, one needs to prove what
is now known as the “Riemann Hypothesis”, that is, the real part of any non-trivial zero of
the Riemann zeta-function is $\frac{1}{2}$ [17]. The proof of the hypothesis is so challenging that a
correct proof will win a $1,000,000 reward offered by the Clay Institute. This paper will
make no further attempt to describe the very deep topic of the Riemann Hypothesis. We
have noted it here because it is at the crux of the distribution of the primes, and it
illustrates the importance of Euler’s identity, which will come into play in some of the
proofs in this paper.
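Identity (1.4.4) can be checked numerically for a specific exponent. The sketch below takes $s = 2$, where the common value is $\pi^2/6$, and compares a truncated sum with a truncated product; the cutoff $10^4$ and the helper names are assumptions made only for this example.

```python
from math import pi


def primes_upto(N):
    """All primes <= N by a simple sieve."""
    flags = [True] * (N + 1)
    flags[0:2] = [False, False]
    for p in range(2, int(N ** 0.5) + 1):
        if flags[p]:
            for q in range(p * p, N + 1, p):
                flags[q] = False
    return [n for n, is_p in enumerate(flags) if is_p]


def partial_zeta(s, N):
    """Truncated left hand side of (1.4.4): sum of 1/n^s for n <= N."""
    return sum(1.0 / n ** s for n in range(1, N + 1))


def partial_euler_product(s, N):
    """Truncated right hand side of (1.4.4): product over primes p <= N."""
    product = 1.0
    for p in primes_upto(N):
        product *= 1.0 / (1.0 - p ** (-s))
    return product


N = 10**4
print(partial_zeta(2, N), partial_euler_product(2, N), pi ** 2 / 6)
```

Both truncations approach $\pi^2/6$ from below, the product somewhat faster than the sum at the same cutoff.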
The divergence of this series is related to the distribution of the primes. While we know
that the harmonic series is divergent, and from the prime number theorem, that primes
become increasingly rare, the divergence of (1.5.1) shows us the primes do not thin out fast
enough for the sum to converge. On the other hand, since (1.5.1) diverges, then in the sense
of contributing to a sum, there would have to be more prime numbers than square numbers,
since we know from another well known result of Euler [16] that
\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}. \]
Euler succeeded in showing the divergence of (1.5.1), and many mathematicians have
added more proofs. Each proof is interesting in its own right since each offers a different
strategy and insight to the problem. In this paper we will present many proofs of the
divergence of (1.5.1). We have selected proofs from: Euler, Erdös, Dux, Clarkson, Niven
and Chebyshev. Knowing the divergence of (1.5.1) is important. However, a theorem due
to the Polish mathematician Franz Mertens (1840-1927) not only shows the divergence of
(1.5.1), but explains exactly how (1.5.1) diverges. A section of this paper will be devoted to
proving this theorem. There are actually four results that are collectively referred to as
Mertens’ theorem: as N → ∞ we have
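\[ \sum_{n \le N} \frac{\Lambda(n)}{n} = \log N + O(1); \qquad \sum_{p \le N} \frac{\log p}{p} = \log N + O(1), \tag{1.5.2} \]
\[ \int_{1}^{N} \frac{\psi(t)}{t^2}\, dt = \log N + O(1), \tag{1.5.3} \]
\[ \sum_{p \le N} \frac{1}{p} = \log \log N + C + O\!\left(\frac{1}{\log N}\right), \tag{1.5.4} \]
\[ \prod_{p \le N} \left(1 - \frac{1}{p}\right) = \frac{e^{-\gamma}}{\log N}\,(1 + o(1)), \tag{1.5.5} \]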
where $\Lambda(N)$ is the “von Mangoldt” function, $\psi(N)$ is one of “Chebyshev’s” functions
(section 3), $\gamma$ is the Euler-Mascheroni constant, and $C$ is a constant (sections 4 & 5). A
function $f(N)$ is “big O” of $g(N)$, or $O(g(N))$, if and only if there exist constants $K > 0$
and $N_0$ such that $|f(N)| \le K|g(N)|$ for all $N \ge N_0$. A function $f(N)$ is “little o” of $g(N)$,
or $o(g(N))$, if and only if for any $K > 0$, there exists $N_0$ such that $\left|\frac{f(N)}{g(N)}\right| < K$ for all
$N \ge N_0$. Notice that theorem (1.5.4) shows for large $N$, the series of prime reciprocals
diverges as $\log \log N$ plus a constant $C$, where $C \approx 0.26$ [13]. Theorem (1.5.5) follows from
(1.5.4) and gives an intuitive probability of a random integer being prime, and will be
referred to as the “product form of Mertens’ theorem.” All of the theorems (1.5.2)-(1.5.5)
will be proved in sections 4 & 5 of this paper.
[Figure: plot of S against N, for N from 1 to 20.]
Fig.1.5.1: The stars are the graph of $\log \log(N + 1) + .26$, and the diamonds are the graph of
$\sum_{p \le N} \frac{1}{p}$, for $1 \le N \le 20$.
[Figure: plot of S against N, for N from 1 to 100.]
Fig.1.5.2: The stars are the graph of $\log \log(N + 1) + .26$, and the diamonds are the graph of
$\sum_{p \le N} \frac{1}{p}$, for $1 \le N \le 100$.
Fig.1.5.1 and Fig.1.5.2 provide evidence that the convergence of these functions
happens rather quickly, with the functions being very close at only N = 100. At N = 1, 000
the graphs are virtually indistinguishable.
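The comparison shown in the figures can be reproduced numerically. The sketch below evaluates both sides of (1.5.4) without the error term; the value `C = 0.2615` is an assumed four-digit approximation of the constant quoted above as $C \approx .26$, and `prime_reciprocal_sum` is an illustrative helper name.

```python
from math import log

C = 0.2615   # assumed approximation; the text quotes C as roughly .26


def prime_reciprocal_sum(N):
    """Sum of 1/p over primes p <= N, using a simple sieve."""
    flags = [True] * (N + 1)
    flags[0:2] = [False, False]
    total = 0.0
    for n in range(2, N + 1):
        if flags[n]:
            total += 1.0 / n
            for q in range(n * n, N + 1, n):
                flags[q] = False
    return total


for N in (10**2, 10**3, 10**4, 10**5):
    lhs = prime_reciprocal_sum(N)
    rhs = log(log(N)) + C
    print(N, lhs, rhs, lhs - rhs)
```

The difference in the last column shrinks as $N$ grows, consistent with the $O(1/\log N)$ error term in (1.5.4).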
§2: The Divergence of $\sum_p \frac{1}{p}$
hence
\[ \prod_{p \le N} \left(1 + \frac{1}{p} + \frac{1}{p^2} + \cdots + \frac{1}{p^m}\right) \ge \sum_{n=1}^{N} \frac{1}{n}. \tag{2.1.4} \]
Now, the harmonic series on the right hand side of (2.1.4) is divergent, as shown in section 1.4, which
shows $P(N)$ is divergent, but in order to show the divergence of the series, we need to use
the following from integral calculus,
\[ \sum_{n=1}^{N} \frac{1}{n} > \int_{1}^{N} \frac{dt}{t} = \log N. \tag{2.1.5} \]
From (2.1.3), (2.1.4) and (2.1.5), we obtain $P(N) > \log N$, hence $\prod_{p \le N} \left(1 - \frac{1}{p}\right)^{-1}$ diverges
as $N$ grows without bound.
To prove the divergence of the series, first consider the Maclaurin series expansion of
$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + \frac{(-1)^{n-1} x^n}{n} + \cdots$. Clearly
\[ \log \frac{1}{1 - x} = x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^n}{n} + \cdots \tag{2.1.6} \]
which is convergent by the ratio test for all $x$ satisfying $0 < x < 1$. Hence from (2.1.6) we
obtain
\[ \log \frac{1}{1 - x} - x = \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^n}{n} + \cdots < \frac{x^2}{2}\left(1 + x + x^2 + \cdots + x^n + \cdots\right), \tag{2.1.7} \]
and since the geometric series on the right hand side is convergent, we have
\[ \log \frac{1}{1 - x} - x < \frac{x^2}{2(1 - x)}, \qquad 0 < x < 1. \tag{2.1.8} \]
Now setting $x = \frac{1}{p}$ and adding the inequalities over all $p \le N$, we obtain
\[ \log P(N) - S(N) < \frac{1}{2} \sum_{p \le N} \frac{1}{p(p - 1)} < \frac{1}{2} \sum_{n=2}^{\infty} \frac{1}{n(n - 1)} = \frac{1}{2}. \tag{2.1.9} \]
The equality on the right hand side follows from the convergent telescoping series. Now
from $P(N) > \log N$ and rearranging (2.1.9) we get that
\[ \log \log N - \frac{1}{2} < \log P(N) - \frac{1}{2} < S(N), \]
so that $\sum_p \frac{1}{p}$ is divergent.
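The two inequalities that drive Euler's argument can be checked numerically. The sketch below assumes, following the way the symbols are used in (2.1.9), that $P(N) = \prod_{p \le N} (1 - 1/p)^{-1}$ and $S(N) = \sum_{p \le N} 1/p$; the helper names and sample values of $N$ are our own.

```python
from math import log


def primes_upto(N):
    """All primes <= N by a simple sieve."""
    flags = [True] * (N + 1)
    flags[0:2] = [False, False]
    for p in range(2, int(N ** 0.5) + 1):
        if flags[p]:
            for q in range(p * p, N + 1, p):
                flags[q] = False
    return [n for n, is_p in enumerate(flags) if is_p]


def P(N):
    """Assumed definition: product over p <= N of (1 - 1/p)^(-1)."""
    product = 1.0
    for p in primes_upto(N):
        product *= 1.0 / (1.0 - 1.0 / p)
    return product


def S(N):
    """Assumed definition: sum over p <= N of 1/p."""
    return sum(1.0 / p for p in primes_upto(N))


for N in (10, 100, 1000, 10000):
    gap = log(P(N)) - S(N)            # stays below 1/2, as in (2.1.9)
    print(N, P(N), log(N), gap, log(log(N)) - 0.5 < S(N))
```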
2.2 Erdös’ Proof
Paul Erdös (1913-1996) published the following proof in 1938 [6]. This proof is notable
for its lack of series manipulations. It is the first of four proofs by contradiction that I will
review. For the next four sections, slight variations of the following definitions will be used:
Let $a, b$ be given positive integers. Let
\[ P = \{p : a \le p \le b\}, \quad M = \{n \in \mathbb{Z}^+ : p \mid n \Rightarrow p \in P\}, \quad M_n = \{m \in M : m \le n\}. \]
Assume that $\sum_p \frac{1}{p}$ converges, take $a = 1$, and choose $b$ large enough that $\sum_{p > b} \frac{1}{p} < \frac{1}{2}$.
Let $m \in M_n$. From the prime factorization of $m$, and the parity of the exponents, we can always
write $m = k^2 r$ where $r$ is square-free. That is,
\[ r = \prod_{p \in S} p, \quad \text{where } S \subset P. \]
Given any finite set $A$ of $n$ elements, there are $2^n$ possible subsets. This is easily seen by
induction, because adding one more element to the set doubles the number of subsets.
Since $P$ is a finite set of primes, there are $2^{|P|} - 1$ non-empty subsets of $P$. Including the
possibility that $r = 1$, there are $2^{|P|}$ possible values for $r$. Since $m = k^2 r$, we have
$k^2 \le m \le n$, which implies $k \le \sqrt{m} \le \sqrt{n}$, so that there are at most $\sqrt{n}$ possible values of $k$, and
thus $|M_n| \le 2^{|P|} \sqrt{n}$.
It is clear that for any fixed prime $p$, there can be no more than $\frac{n}{p}$ positive integers not exceeding $n$
that are divisible by $p$. The positive integers not exceeding $n$ that are not in $M_n$ are exactly those
divisible by some prime $p > b$, so their number satisfies
\[ n - |M_n| \le \sum_{b < p \le n} \frac{n}{p} < \frac{n}{2}. \]
So that
\[ \frac{n}{2} < |M_n| \le 2^{|P|} \sqrt{n} \quad \text{or} \quad \sqrt{n} < 2^{|P|+1}, \]
which is clearly a contradiction for large enough $n$.
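The counting bound $|M_n| \le 2^{|P|} \sqrt{n}$ does not depend on the convergence assumption, so it can be tested on its own. The sketch below uses the sample set $P = \{2, 3, 5, 7\}$ (our choice) and counts $M_n$ by brute force; `in_M` and `count_Mn` are hypothetical helper names.

```python
from math import sqrt


def in_M(m, P):
    """True if every prime factor of m lies in the finite prime set P."""
    for p in P:
        while m % p == 0:
            m //= p
    return m == 1


def count_Mn(n, P):
    """|M_n|: the number of m <= n composed only of primes in P."""
    return sum(1 for m in range(1, n + 1) if in_M(m, P))


P = [2, 3, 5, 7]
for n in (10, 10**2, 10**3, 10**4):
    bound = 2 ** len(P) * sqrt(n)
    print(n, count_Mn(n, P), bound)
```

The count grows far more slowly than $n/2$, which is the tension the proof exploits.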
2.3 Dux’s Proof
This proof was published in 1956 by Erich Dux [6]. This proof is interesting since it
involves rearranging the harmonic series. Borrowing from Erdös’s proof, the sets used in
this proof are defined as:
\[ P = \{p : 1 < p \le b\}, \quad M = \{n : p \mid n \Rightarrow p \in P\}. \]
Assume that $\sum_p \frac{1}{p}$ is convergent, then there must be a $b$ large enough so that
$\sum_{p > b} \frac{1}{p} = A < 1$. Dux defines $M' = \{n' > 1 : p \mid n' \Rightarrow p > b\}$ and $M'' = \widetilde{M} \cap \widetilde{M'}$ (where
$\widetilde{M}$ is the complement of $M$), that is, $M''$ is all integers that have both a prime divisor in $P$
and a prime divisor greater than $b$. Since $P$ is a finite set of primes, we have
\[ \sum_{M} \frac{1}{n} \le \prod_{P} \left(1 + \frac{1}{p} + \frac{1}{p^2} + \cdots\right) = \prod_{P} \left(1 - \frac{1}{p}\right)^{-1} < \infty, \]
since each term in the sum on the left hand side must appear at least once as a term in
the expansion of the product on the right hand side. This is due to the definition of $M$ as
all integers divisible only by primes not exceeding $b$. By a similar argument, we find an upper
bound for sums over $M'$, that is
\[ \sum_{M'} \frac{1}{n'} \le \sum_{p > b} \frac{1}{p} + \left(\sum_{p > b} \frac{1}{p}\right)^{2} + \cdots = \frac{A}{1 - A} < \infty, \]
by the initial assumption of convergence, and the formula for the sum of an infinite
geometric series. Now, it follows that
\[ \sum_{M''} \frac{1}{n''} = \sum_{M} \frac{1}{n} \sum_{M'} \frac{1}{n'} - \sum_{M'} \frac{1}{n'} < \infty; \]
since $1 \in M$, Dux subtracts off reciprocals over $M'$ to ensure a sum over integers which are
in $M''$. Since $\mathbb{N} = M \cup M' \cup M''$ and the three sets are pairwise disjoint, we must have
\[ \sum_{n=1}^{\infty} \frac{1}{n} = \sum_{M} \frac{1}{n} + \sum_{M'} \frac{1}{n'} + \sum_{M''} \frac{1}{n''} < \infty, \]
which is a contradiction of the divergence of the harmonic series. Therefore our initial
assumption was false and $\sum_p \frac{1}{p}$ diverges.
2.4 Clarkson’s Proof
James Clarkson published the following proof in 1966 [6]. This proof is similar to the
proof by Erdös, but with an interesting twist. Clarkson’s proof employs the trick that
Euclid used in his proof that there are infinitely many primes. This proof only requires the
set $P$ from the previous two proofs, that is $P = \{p : a \le p \le b\}$. Start by assuming that
$\sum_p \frac{1}{p}$ is convergent. Therefore, there must be a large enough $a$ such that
\[ \sum_{P} \frac{1}{p} < \frac{1}{2} \]
for every $b$. Let $Q$ denote the product of all primes less than $a$. Then
for any fixed $r$, there is a large enough $b$ such that all primes which divide $1 + iQ$ for
$1 \le i \le r$ are in $P$, since as in Euclid’s proof, $p < a$ implies that $p \nmid 1 + iQ$. Each term
of
\[ \sum_{i=1}^{r} \frac{1}{1 + iQ} \]
which has a denominator which is a product of $j$ (not necessarily distinct) primes occurs
at least once in the expansion of
\[ \left(\sum_{P} \frac{1}{p}\right)^{j} < 2^{-j}, \]
where the inequality holds by assumption and repeated application of the multiplication property of inequalities.
Summing over all $j \ge 1$, we get
\[ \sum_{i=1}^{r} \frac{1}{1 + iQ} < \sum_{j \ge 1} 2^{-j} = 1, \]
since the right hand side is the geometric series with first term $2^{-1}$. On the other hand,
since $\frac{1}{1 + iQ} > \frac{1}{2Qi}$, we have
\[ \sum_{i=1}^{r} \frac{1}{1 + iQ} > \sum_{i=1}^{r} \frac{1}{2iQ} = \frac{1}{2Q} \sum_{i=1}^{r} \frac{1}{i}; \]
because $r$ was arbitrary, the series diverges as a fraction of the harmonic series, a
contradiction to the upper bound of 1 for large enough $r$.
2.5 Niven’s Proof
The following is a proof published by Ivan Niven in 1971 [14]. This proof is very short
and employs the series expansion for $e^x$. As we saw in Erdös’ proof, every positive integer
can be expressed as the product of a square-free integer $r$ and a square $k^2$. Let $n$ be any positive integer
and $S = \{r < n : r \text{ is square-free}\}$. Then we have
\[ \left(\sum_{S} \frac{1}{r}\right) \left(\sum_{j < n} \frac{1}{j^2}\right) \ge \sum_{q < n} \frac{1}{q}. \]
The inequality follows from the fact that every term on the right hand side will appear at
least once as a term in the expansion of the product on the left hand side. Now the second
sum is a partial sum of a convergent p-series, and is therefore bounded, but since the right hand side is the
unbounded harmonic series as $n \to \infty$, the first sum over square-free integers must be
unbounded. Now suppose that $\sum_p \frac{1}{p}$ converges to $\beta$. From the Maclaurin series expansion
$e^x = 1 + x + \frac{x^2}{2} + \ldots$, we obtain $e^x > 1 + x$ for $x > 0$. Now the last chain of inequalities
follows
\[ e^{\beta} > e^{\sum_{p < n} \frac{1}{p}} = \prod_{p < n} e^{\frac{1}{p}} > \prod_{p < n} \left(1 + \frac{1}{p}\right) \ge \sum_{S} \frac{1}{r}, \]
which contradicts the unboundedness of the series over square-free integers, hence $\sum_p \frac{1}{p}$
diverges.
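Both inequalities in Niven's argument hold for every $n$, independent of the convergence assumption, so they can be checked numerically. The sketch below does so for a few values of $n$; the helper names and the sample values are our own.

```python
def is_squarefree(r):
    """True if no square d^2 with d > 1 divides r."""
    d = 2
    while d * d <= r:
        if r % (d * d) == 0:
            return False
        d += 1
    return True


def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def niven_quantities(n):
    """Return (squarefree sum, sum of 1/j^2, harmonic sum, product of 1 + 1/p)."""
    squarefree_sum = sum(1.0 / r for r in range(1, n) if is_squarefree(r))
    square_sum = sum(1.0 / j ** 2 for j in range(1, n))
    harmonic_sum = sum(1.0 / q for q in range(1, n))
    product = 1.0
    for p in range(2, n):
        if is_prime(p):
            product *= 1.0 + 1.0 / p
    return squarefree_sum, square_sum, harmonic_sum, product


for n in (10, 100, 1000):
    sf, sq, h, prod_term = niven_quantities(n)
    print(n, sf * sq >= h, prod_term >= sf)
```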
§3: Chebyshev’s Theorem
and
\[ \psi(N) = \sum_{\substack{m \ge 1,\; p \\ p^m \le N}} \log p. \]
Here, if $m$ is the largest integer such that $p^m \le N$, then $\log p$ occurs exactly $m$ times in the
sum. For example, $\psi(9) = 3 \log 2 + 2 \log 3 + \log 5 + \log 7$. It can be easily seen that $e^{\theta(N)}$ is
the product of all primes less than or equal to $N$, and $e^{\psi(N)}$ is the least common multiple of
all positive integers $\le N$.
If we let $m$ be the largest integer such that $2^m \le N$, then no prime power $p^k \le N$ has
exponent $k > m$. This is so because $p^k \le N$ implies $k \le \frac{\log N}{\log p} \le \frac{\log N}{\log 2}$,
and $m$ is the largest integer not exceeding $\frac{\log N}{\log 2}$. Now, if we let $m - i_n$ be the largest
integer such that $p_n^{m - i_n} \le N$, then $p_n \le N^{\frac{1}{m - i_n}}$, where $p_n$ is the $n$th prime for which
$p_n \le N$. It follows that in the formula for $\psi(N)$, $\log p_n$ occurs exactly $m - i_n$ times. On the
other hand, $\log p_n$ occurs exactly $m - i_n$ times in the sum
\[ \theta(N) + \theta(N^{\frac{1}{2}}) + \theta(N^{\frac{1}{3}}) + \cdots + \theta(N^{\frac{1}{m - i_n}}) + \cdots + \theta(N^{\frac{1}{m}}), \]
hence
\[ \psi(N) = \theta(N) + \theta(N^{\frac{1}{2}}) + \theta(N^{\frac{1}{3}}) + \cdots + \theta(N^{\frac{1}{m}}). \tag{3.1.2} \]
This sum terminates since $\theta(N) = 0$ for $N < 2$, and specifically, $N^{\frac{1}{k}} < 2$ when $k > \frac{\log N}{\log 2}$.
If $p^m \le N < p^{m+1}$, $N \ge 1$, then $m \log p \le \log N < (m + 1) \log p$, and $m \le \frac{\log N}{\log p} < m + 1$,
hence $\log p$ occurs exactly $m$ times in $\psi(N)$, and $m = \left\lfloor \frac{\log N}{\log p} \right\rfloor$. Thus there is another way
to express $\psi(N)$, that is
\[ \psi(N) = \sum_{p \le N} \left\lfloor \frac{\log N}{\log p} \right\rfloor \cdot \log p. \tag{3.1.3} \]
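The identities in this section lend themselves to a direct check. The sketch below computes $\psi(N)$ from its definition, compares it with the sum of $\theta$ values in (3.1.2), and with the logarithm of the least common multiple of $1, \ldots, N$. All helper names are our own, and integer loops are used in place of $\lfloor \log N / \log p \rfloor$ to avoid floating point rounding at exact prime powers.

```python
from functools import reduce
from math import gcd, log


def primes_upto(n):
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def theta(x):
    """theta(x): sum of log p over primes p <= x."""
    return sum(log(p) for p in primes_upto(x))


def psi(N):
    """psi(N): log p counted once for every prime power p^m <= N."""
    total = 0.0
    for p in primes_upto(N):
        q = p
        while q <= N:
            total += log(p)
            q *= p
    return total


def iroot(N, k):
    """Largest integer r with r^k <= N."""
    r = int(round(N ** (1.0 / k)))
    while r ** k > N:
        r -= 1
    while (r + 1) ** k <= N:
        r += 1
    return r


def psi_from_theta(N):
    """Right hand side of (3.1.2); terms with N^(1/k) < 2 vanish."""
    total, k = 0.0, 1
    while 2 ** k <= N:
        total += theta(iroot(N, k))
        k += 1
    return total


def lcm_upto(N):
    return reduce(lambda a, b: a * b // gcd(a, b), range(1, N + 1), 1)


for N in (9, 100, 1000):
    print(N, psi(N), psi_from_theta(N), log(lcm_upto(N)))
```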
It turns out that Chebyshev’s functions are closely related to π(N ), as seen in the following
theorem, which is crucial in the proof of (3.1.1).
3.2 The Chebyshev Function Theorem
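Let
\[ l_1 = \liminf_{N \to \infty} \frac{\pi(N)}{N / \log N}, \qquad L_1 = \limsup_{N \to \infty} \frac{\pi(N)}{N / \log N}, \]
\[ l_2 = \liminf_{N \to \infty} \frac{\theta(N)}{N}, \qquad L_2 = \limsup_{N \to \infty} \frac{\theta(N)}{N}, \]
\[ l_3 = \liminf_{N \to \infty} \frac{\psi(N)}{N}, \qquad L_3 = \limsup_{N \to \infty} \frac{\psi(N)}{N}. \]
Then $l_1 = l_2 = l_3$, and $L_1 = L_2 = L_3$.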
Proof. From (3.1.2), we get that $\theta(N) \le \psi(N)$, and from (3.1.3) that
\[ \psi(N) \le \sum_{p \le N} \frac{\log N}{\log p} \cdot \log p = \log N \sum_{p \le N} 1, \]
that is $\psi(N) \le \pi(N) \log N$, therefore $\theta(N) \le \psi(N) \le \pi(N) \log N$. Dividing through by $N$
and taking the upper limit, we obtain
\[ L_2 \le L_3 \le L_1. \tag{3.2.1} \]
Next, choose a constant real number $\alpha$, such that $0 < \alpha < 1$. Let $N > 1$, then
\[ \theta(N) \ge \sum_{N^{\alpha} < p \le N} \log p, \]
which implies that $\theta(N) \ge \alpha \log N\,(\pi(N) - \pi(N^{\alpha}))$. Since there can be no more primes
than there are integers, we have $\pi(N^{\alpha}) < N^{\alpha}$, so that $\theta(N) > \alpha \pi(N) \log N - \alpha N^{\alpha} \log N$.
Dividing through by $N$, we get
\[ \frac{\theta(N)}{N} > \alpha\, \frac{\pi(N) \log N}{N} - \alpha\, \frac{\log N}{N^{1 - \alpha}}. \]
Now since $0 < \alpha < 1$, it follows that $\frac{\log N}{N^{1 - \alpha}} \to 0$ as $N \to \infty$. So it follows that $L_2 \ge \alpha L_1$
for every real number $\alpha$ such that $0 < \alpha < 1$. Therefore $L_2 \ge L_1$, and combining this with
(3.2.1) we can conclude that $L_1 = L_2 = L_3$.
The proof that $l_1 = l_2 = l_3$ is similar. From the same inequality by which we obtained
(3.2.1), we must have $l_2 \le l_3 \le l_1$. By the same steps as above, we obtain $l_2 \ge \alpha\, l_1$,
therefore $l_2 \ge l_1$ and thus $l_1 = l_2 = l_3$.
This theorem states that if any one of the three functions
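\[ l = \liminf_{N \to \infty} \frac{\pi(N)}{N / \log N}, \quad \text{and} \quad L = \limsup_{N \to \infty} \frac{\pi(N)}{N / \log N}, \]
then we will prove the theorem by showing $L \le 4 \log 2$ and $l \ge \log 2$. As we saw in the
proof of the previous theorem, these inequalities can be exchanged for the following
\[ L = \limsup_{N \to \infty} \frac{\theta(N)}{N} \le 4 \log 2, \tag{3.3.1} \]
\[ l = \liminf_{N \to \infty} \frac{\psi(N)}{N} \ge \log 2. \tag{3.3.2} \]
We will start by proving (3.3.1). We consider the binomial coefficient
\[ M = \binom{2m}{m} = \frac{(2m)!}{(m!)^2} = \frac{(m + 1)(m + 2) \cdots (2m)}{1 \cdot 2 \cdots m}. \]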
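\[ \left\lfloor \frac{N}{p} \right\rfloor + \left\lfloor \frac{N}{p^2} \right\rfloor + \cdots + \left\lfloor \frac{N}{p^s} \right\rfloor = \sum_{k \ge 1} \left\lfloor \frac{N}{p^k} \right\rfloor \tag{3.3.3} \]
where $s$ is the largest integer for which $p^s \le N$. This series terminates since once $p^k > N$,
$\left\lfloor \frac{N}{p^k} \right\rfloor = 0$. Therefore, the prime factorization of $N!$ is
\[ N! = \prod_{p \le N} p^{\left\lfloor \frac{N}{p} \right\rfloor + \left\lfloor \frac{N}{p^2} \right\rfloor + \cdots + \left\lfloor \frac{N}{p^s} \right\rfloor}. \tag{3.3.4} \]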
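Formula (3.3.4) can be verified against the factorial computed directly. In the sketch below, `legendre_exponent` is an illustrative name for the exponent sum (3.3.3), and the range of $N$ is arbitrary.

```python
from math import factorial


def legendre_exponent(N, p):
    """Exponent of the prime p in N!: floor(N/p) + floor(N/p^2) + ..."""
    exponent, power = 0, p
    while power <= N:
        exponent += N // power
        power *= p
    return exponent


def factorial_from_primes(N):
    """Rebuild N! from its prime factorization (3.3.4)."""
    result = 1
    for p in range(2, N + 1):
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):   # p is prime
            result *= p ** legendre_exponent(N, p)
    return result


for N in (5, 10, 20):
    print(N, factorial_from_primes(N), factorial(N))
```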
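where $s$ is the largest integer for which $p^s \le 2m$. Let $r \le s$ be the largest integer for which
$p^r \le m$, so that
\[ 2\left\lfloor \frac{m}{p^{r+1}} \right\rfloor + \cdots + 2\left\lfloor \frac{m}{p^{s}} \right\rfloor = 0. \]
In the prime factorization of $M$ above, we see the expression $\left\lfloor \frac{2m}{p^k} \right\rfloor - 2\left\lfloor \frac{m}{p^k} \right\rfloor$, $1 \le k \le s$,
occur many times as a term in the expression for the exponent. We can determine what
the values of this expression may be. We drop the floor notation and write $\left\lfloor \frac{m}{p^k} \right\rfloor = \frac{m}{p^k} - \epsilon$
for some $0 \le \epsilon < 1$, and $\left\lfloor \frac{2m}{p^k} \right\rfloor = \frac{2m}{p^k} - \delta$ for some $0 \le \delta < 1$. Then $2\left\lfloor \frac{m}{p^k} \right\rfloor = \frac{2m}{p^k} - 2\epsilon$ and we
get $\left\lfloor \frac{2m}{p^k} \right\rfloor - 2\left\lfloor \frac{m}{p^k} \right\rfloor = 2\epsilon - \delta$, thus
\[ -1 < \left\lfloor \frac{2m}{p^k} \right\rfloor - 2\left\lfloor \frac{m}{p^k} \right\rfloor < 2, \quad \text{hence} \quad \left\lfloor \frac{2m}{p^k} \right\rfloor - 2\left\lfloor \frac{m}{p^k} \right\rfloor = 0 \text{ or } 1. \tag{3.3.6} \]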
p p p p
From (3.3.5), M is always an even integer, because if r is the largest integer such that
2 ≤ m, then m < 2r+1 ≤ 2m, and 2m < 2r+2 ≤ 4m, thus s = r + 1 and
r
2m j m k
− 2 = 1.
2r+1 2r+1
M is also the largest integer in the expansion. Since (1 + 1)2m = 22m ; combining this and
the fact that there are 2m + 1 terms in the expansion of (1 + 1)2m , we obtain
19
From the product (3.3.5) we see that since the denominator of $M$ is composed of
$(m!)^2$, it can not be divisible by any primes $m < p \le 2m$. It is also clear from (3.3.5), and
the fact that 2 divides $M$, that $M \ge \prod_{m < p \le 2m} p$, thus
\[ \log M \ge \sum_{m < p \le 2m} \log p = \theta(2m) - \theta(m). \]
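we get
\[ \theta(2^k) - \theta(1) < \log 2 \sum_{r=1}^{k} 2^r < 2^{k+1} \log 2. \]
Now let $N > 1$, and let $k$ be a positive integer such that $2^{k-1} \le N < 2^k$. The
function $\theta$ is never decreasing, so (3.3.9) gives us
\[ \theta(N) \le \theta(2^k) < 2^{k+1} \log 2 = 4 \cdot 2^{k-1} \log 2 \le 4N \log 2. \]
Hence $\frac{\theta(N)}{N} < 4 \log 2$, which implies that
\[ L = \limsup_{N \to \infty} \frac{\theta(N)}{N} \le 4 \log 2, \]
thus by Theorem 3.2, $L = \limsup_{N \to \infty} \frac{\pi(N) \log N}{N} \le 4 \log 2$, which proves (3.3.1).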
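Two facts used in this part of the proof can be checked by machine: that the product of the primes in $(m, 2m]$ divides $\binom{2m}{m}$, and that $\theta(N) < 4N \log 2$. The sketch below does both over small ranges; the helper names and ranges are our own, and `math.comb` requires Python 3.8 or later.

```python
from math import comb, log, prod


def primes_upto(n):
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def product_of_primes_between(m):
    """Product of all primes p with m < p <= 2m (1 if there are none)."""
    return prod(p for p in primes_upto(2 * m) if p > m)


def theta_bound_holds(N_max):
    """Check theta(N) < 4 N log 2 for every 2 <= N <= N_max."""
    prime_set = set(primes_upto(N_max))
    theta = 0.0
    for N in range(2, N_max + 1):
        if N in prime_set:
            theta += log(N)
        if not theta < 4 * N * log(2):
            return False
    return True


divides = all(comb(2 * m, m) % product_of_primes_between(m) == 0
              for m in range(1, 200))
print(divides, theta_bound_holds(2000))
```

The divisibility check is carried out in exact integer arithmetic, which implies $\log M \ge \theta(2m) - \theta(m)$ without any rounding concerns.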
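We now begin to prove (3.3.2), the second part of Chebyshev’s theorem. To prove
(3.3.2) we need to consider the binomial coefficient from (3.3.5) and its prime factorization,
that is
\[ M = \binom{2m}{m} = \frac{(2m)!}{(m!)^2} = \prod_{p \le 2m} p^{\left(\left\lfloor \frac{2m}{p} \right\rfloor - 2\left\lfloor \frac{m}{p} \right\rfloor\right) + \left(\left\lfloor \frac{2m}{p^2} \right\rfloor - 2\left\lfloor \frac{m}{p^2} \right\rfloor\right) + \cdots + \left(\left\lfloor \frac{2m}{p^s} \right\rfloor - 2\left\lfloor \frac{m}{p^s} \right\rfloor\right)}. \]
This product shows that $M$ is divisible by $p$ exactly $v_p$ times, where $v_p$ can be written
\[ v_p = \sum_{k \ge 1} \left(\left\lfloor \frac{2m}{p^k} \right\rfloor - 2\left\lfloor \frac{m}{p^k} \right\rfloor\right). \]
Therefore
\[ M = \prod_{p \le 2m} p^{v_p}. \]
Now since $\left\lfloor \frac{2m}{p^k} \right\rfloor = \left\lfloor \frac{m}{p^k} \right\rfloor = 0$ when $p^k > 2m$, that is when $k > \frac{\log 2m}{\log p}$, we have
\[ v_p = \sum_{k=1}^{M_p} \left(\left\lfloor \frac{2m}{p^k} \right\rfloor - 2\left\lfloor \frac{m}{p^k} \right\rfloor\right), \qquad M_p = \left\lfloor \frac{\log 2m}{\log p} \right\rfloor. \tag{3.3.10} \]
Combining this result with (3.3.6) we get that $v_p \le M_p$, hence
\[ M = \prod_{p \le 2m} p^{v_p} \le \prod_{p \le 2m} p^{M_p}. \tag{3.3.11} \]
so it follows that
\[ e^{\psi(2m)} = \prod_{p \le 2m} p^{M_p}, \]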
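The factorization of $M$ and the bound (3.3.11) can be confirmed in exact integer arithmetic. In the sketch below, `v` and `M_p` mirror the symbols $v_p$ and $M_p$ from the text, while `check` and the remaining helper names are our own; $M_p$ is computed with an integer loop rather than with logarithms to avoid rounding.

```python
from functools import reduce
from math import comb, gcd


def primes_upto(n):
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def v(m, p):
    """v_p: exponent of p in C(2m, m), as in (3.3.10)."""
    total, power = 0, p
    while power <= 2 * m:
        total += (2 * m) // power - 2 * (m // power)
        power *= p
    return total


def M_p(m, p):
    """Largest k with p^k <= 2m, that is floor(log 2m / log p)."""
    k, power = 0, p
    while power <= 2 * m:
        k += 1
        power *= p
    return k


def check(m):
    """Return (M == prod p^v_p, lcm(1..2m) == prod p^M_p, all v_p <= M_p)."""
    primes = primes_upto(2 * m)
    binomial = reduce(lambda acc, p: acc * p ** v(m, p), primes, 1)
    bound = reduce(lambda acc, p: acc * p ** M_p(m, p), primes, 1)
    lcm_value = reduce(lambda a, b: a * b // gcd(a, b), range(1, 2 * m + 1), 1)
    exponents_ok = all(v(m, p) <= M_p(m, p) for p in primes)
    return binomial == comb(2 * m, m), bound == lcm_value, exponents_ok


print(all(all(check(m)) for m in range(1, 61)))
```

The middle check reflects the earlier remark that $e^{\psi(N)}$ is the least common multiple of the integers up to $N$, here with $N = 2m$.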
or
\[
\frac{1}{p_n} > \frac{a}{n \log p_n} > \frac{a}{2n \log n}
\]
for large enough $n$. Thus the series
\[
\sum_{n=1}^{\infty} \frac{1}{p_n}
\]
\[
a < \frac{\pi(N)}{N/\log N} < A.
\]
In this section we will do even better: we will show that if $\dfrac{\pi(N)}{N/\log N}$ tends to a limit as $N \to \infty$, then this limit is 1 [10], from which the prime number theorem would follow. This result will be readily shown by applying the asymptotic formulas from 3.2 and the following result which will be proven in 4.3, that is
\[
\int_1^N \frac{\psi(t)}{t^2}\,dt = \log N + O(1). \tag{3.4.1}
\]
\[
\liminf_{N\to\infty} \frac{\psi(N)}{N} \le 1,
\qquad
\limsup_{N\to\infty} \frac{\psi(N)}{N} \ge 1.
\]
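This proof is by contradiction. If we assume that $\liminf \dfrac{\psi(N)}{N} = 1 + \alpha$ for some $\alpha > 0$, then we have $\psi(N) > \left(1 + \frac{\alpha}{2}\right) N$ for all $N$ greater than some $N_0$. Therefore
\[
\int_1^N \frac{\psi(t)}{t^2}\,dt > \int_1^{N_0} \frac{\psi(t)}{t^2}\,dt + \int_{N_0}^{N} \frac{1 + \frac{\alpha}{2}}{t}\,dt > \left(1 + \frac{\alpha}{2}\right) \log N + O(1),
\]
which is a contradiction to (3.4.1), so that $\liminf \dfrac{\psi(N)}{N} \le 1$.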
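Now suppose that $\limsup \dfrac{\psi(N)}{N} = 1 - \alpha$, $\alpha > 0$. Then $\psi(N) < \left(1 - \frac{\alpha}{2}\right) N$ for all $N$ greater than some $N_0$. Therefore
\[
\int_1^N \frac{\psi(t)}{t^2}\,dt < \int_1^{N_0} \frac{\psi(t)}{t^2}\,dt + \int_{N_0}^{N} \frac{1 - \frac{\alpha}{2}}{t}\,dt < \left(1 - \frac{\alpha}{2}\right) \log N + O(1),
\]
a contradiction to (3.4.1), thus $\limsup \dfrac{\psi(N)}{N} \ge 1$. Therefore, if $\lim_{N\to\infty} \dfrac{\psi(N)}{N}$ exists then it is equal to 1. Now applying Theorem 3.2 we see that if the limit exists, then
\[
\lim_{N\to\infty} \frac{\pi(N)}{N/\log N} = 1.
\]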
Therefore, all that remains in proving the prime number theorem, that is, in showing that $\pi(N) \sim \dfrac{N}{\log N}$, is the existence of this limit. As stated in the introduction, this is quite difficult, and was not proven until independent proofs were given by Hadamard and de la Vallée Poussin in 1896.
§4: Mertens’ Theorem
4.1 Introduction
where C is a constant. The proof of this theorem will require several lemmas, which will be
the content of the next section.
The strong version of Stirling’s formula provides a better approximation to $m!$, but this is more than is needed in the proof of Mertens’ theorem. For a complete proof of Stirling’s formula see [20].
Proof:
To prove this result, we will show that $m \log m - \log m! < m$ for all $m \ge 2$. We do not need absolute value because this difference is always positive for $m \ge 2$, since for any such $m$,
\[
\underbrace{\log m + \log m + \cdots + \log m}_{m \text{ terms}} > \log m + \log (m - 1) + \cdots + \log 2 + \log 1.
\]
Since $\log t$ is increasing, $\int_{m-1}^{m} \log t\,dt < \log m$, hence $\int_1^m \log t\,dt < \log 2 + \cdots + \log m = \log m!$. Using integration by parts, one finds the antiderivative of $\log t$ to be $t \log t - t$. Using the fundamental theorem of calculus, we get $\int_1^m \log t\,dt = m \log m - m + 1$, so that $m \log m - \log m! < m - 1 < m$ for all $m \ge 2$. Therefore,
\[
\log m! = m \log m + O(m). \tag{4.2.1}
\]
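The two-sided bound $0 < m\log m - \log m! < m - 1$ obtained in the proof is easy to test numerically. The sketch below is an illustration only; it relies on the standard identity $\log m! = \texttt{lgamma}(m+1)$, and the name `stirling_gap` is ours.

```python
import math

def stirling_gap(m):
    """Return m*log(m) - log(m!), using lgamma(m + 1) = log(m!)."""
    return m * math.log(m) - math.lgamma(m + 1)

if __name__ == "__main__":
    for m in (2, 10, 100, 10_000):
        gap = stirling_gap(m)
        # The proof gives 0 < gap < m - 1 for every m >= 2.
        print(m, gap, 0 < gap < m - 1)
```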
The next result that will be needed is, as m → ∞
This result follows easily from the theorems proved in 3.2 and 3.3. By Chebyshev’s theorem, for large enough $m$, $a \le \dfrac{\pi(m) \log m}{m} \le A$, thus by Theorem 3.2 $\dfrac{\psi(m)}{m} \le A$, thus $\psi(m) \le Am$, and hence $\psi(m)$ is $O(m)$.
The final result that will be required in the proof of Mertens’ theorem is the formula for
“partial summation”, or “Abel summation”. This formula is useful to number theorists as
a systematic approach to the computation of finite sums of number theoretic functions. Let $0 \le \lambda_1 \le \lambda_2 \le \cdots$ be any divergent sequence of real numbers, and let $a_n$ be a sequence of complex numbers. Let
\[
A(N) := \sum_{\lambda_n \le N} a_n,
\]
and let $\phi(N)$ be a complex-valued function defined for $N \ge 0$. If $\phi(N)$ has a continuous derivative in $(0, \infty)$, and $N \ge \lambda_1$, then
\[
\sum_{\lambda_n \le N} a_n \phi(\lambda_n) = A(N)\phi(N) - \int_{\lambda_1}^{N} A(t)\phi'(t)\,dt. \tag{4.2.3}
\]
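\begin{align*}
A(\lambda_1) - A(\lambda_0) &= a_1 \\
A(\lambda_2) - A(\lambda_1) &= a_2 \\
&\;\;\vdots \\
A(\lambda_n) - A(\lambda_{n-1}) &= a_n
\end{align*}
\begin{align*}
\sum_{n=1}^{k} a_n \phi(\lambda_n)
&= (A(\lambda_1) - A(\lambda_0))\phi(\lambda_1) + (A(\lambda_2) - A(\lambda_1))\phi(\lambda_2) + \cdots + (A(\lambda_k) - A(\lambda_{k-1}))\phi(\lambda_k) \\
&= A(\lambda_k)\phi(\lambda_k) - \sum_{n=1}^{k-1} A(\lambda_n)\left(\phi(\lambda_{n+1}) - \phi(\lambda_n)\right) \\
&= A(\lambda_k)\phi(\lambda_k) - \sum_{n=1}^{k-1} A(\lambda_n) \int_{\lambda_n}^{\lambda_{n+1}} \phi'(t)\,dt \\
&= A(\lambda_k)\phi(\lambda_k) - \sum_{n=1}^{k-1} \int_{\lambda_n}^{\lambda_{n+1}} A(t)\phi'(t)\,dt \\
&= A(\lambda_k)\phi(\lambda_k) - \int_{\lambda_1}^{\lambda_k} A(t)\phi'(t)\,dt. \tag{4.2.4}
\end{align*}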
The last equation follows since $\phi(t)$ has a continuous derivative on $(0, \infty)$. Now, if we let $k$ be the largest integer such that $\lambda_k \le N$, then from integration by parts we have
\[
\int_{\lambda_k}^{N} A'(t)\phi(t)\,dt = A(N)\phi(N) - A(\lambda_k)\phi(\lambda_k) - \int_{\lambda_k}^{N} A(t)\phi'(t)\,dt.
\]
Since $A(t)$ is a step function, it is constant on $\lambda_k \le t < \lambda_{k+1}$, so the left hand side of the above equation is zero, and rearranging we obtain
\[
A(\lambda_k)\phi(\lambda_k) = A(N)\phi(N) - \int_{\lambda_k}^{N} A(t)\phi'(t)\,dt.
\]
Substituting this back into (4.2.4), and again using the fact that $A(t)$ is a step function, we obtain our result (4.2.3)
\[
\sum_{\lambda_n \le N} a_n \phi(\lambda_n) = A(N)\phi(N) - \int_{\lambda_1}^{N} A(t)\phi'(t)\,dt.
\]
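Formula (4.2.3) can be sanity-checked numerically. The sketch below is an illustration under our own naming (`abel_right_side`, `direct_sum`); it assumes the $\lambda_n$ are given in nondecreasing order and that $N \ge \lambda_1$. Since $A$ is a step function, the integral is evaluated exactly piece by piece with the fundamental theorem of calculus, so only $\phi$ itself is needed, not $\phi'$.

```python
import math

def abel_right_side(lams, a, phi, N):
    """Evaluate A(N)*phi(N) - integral_{lam_1}^{N} A(t)*phi'(t) dt.

    A is constant between consecutive lam_n, so each piece of the integral
    equals A(lam_n) * (phi(right endpoint) - phi(lam_n)).
    """
    pts = [(lam, coeff) for lam, coeff in zip(lams, a) if lam <= N]
    partial, integral = 0.0, 0.0
    for i, (lam, coeff) in enumerate(pts):
        partial += coeff  # partial is now A(lam_i)
        right = pts[i + 1][0] if i + 1 < len(pts) else N
        integral += partial * (phi(right) - phi(lam))
    return partial * phi(N) - integral

def direct_sum(lams, a, phi, N):
    """Left hand side of (4.2.3): the sum of a_n * phi(lam_n) over lam_n <= N."""
    return sum(coeff * phi(lam) for lam, coeff in zip(lams, a) if lam <= N)

if __name__ == "__main__":
    lams = list(range(1, 51))
    a = [1.0] * 50
    phi = lambda t: 1.0 / t
    print(direct_sum(lams, a, phi, 37.5), abel_right_side(lams, a, phi, 37.5))
```

With $a_n = 1$, $\lambda_n = n$ and $\phi(t) = 1/t$ both sides reduce to a partial sum of the harmonic series.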
We now have all the ingredients necessary to prove Mertens’ theorem. We will begin by proving (4.1.1), that is
\[
\sum_{n \le N} \frac{\Lambda(n)}{n} = \log N + O(1); \qquad \sum_{p \le N} \frac{\log p}{p} = \log N + O(1).
\]
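where $s$ is the largest integer for which $p^s \le N$. Taking logarithms of both sides of this formula we obtain
\[
\log N! = \sum_{(p,r):\; p^r \le N} \left\lfloor \frac{N}{p^r} \right\rfloor \log p = \sum_{n \le N} \left\lfloor \frac{N}{n} \right\rfloor \Lambda(n). \tag{4.3.1}
\]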
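As in the proof of Chebyshev’s theorem, we write $\left\lfloor \dfrac{N}{n} \right\rfloor = \dfrac{N}{n} - \epsilon_n$, where $0 \le \epsilon_n < 1$. Substitution in (4.3.1) gives us
\[
\log N! = N \sum_{n \le N} \frac{\Lambda(n)}{n} - \sum_{n \le N} \epsilon_n \Lambda(n),
\]
and since $0 \le \sum_{n \le N} \epsilon_n \Lambda(n) \le \psi(N) = O(N)$,
\[
\log N! = N \sum_{n \le N} \frac{\Lambda(n)}{n} + O(N).
\]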
By (4.2.1)
\[
\log N! = N \log N + O(N),
\]
and dividing through by $N$ we get
\[
\sum_{n \le N} \frac{\Lambda(n)}{n} = \log N + O(1), \tag{4.3.2}
\]
which is the first formula in (4.1.1). To get the second formula in (4.1.1) we bound the difference between (4.3.2) and our summation of interest by a constant, that is
\[
\sum_{n \le N} \frac{\Lambda(n)}{n} - \sum_{p \le N} \frac{\log p}{p}
\le \sum_{p \le N} \left( \frac{1}{p^2} + \frac{1}{p^3} + \cdots \right) \log p
< \sum_{p} \frac{\log p}{p(p - 1)}
< \sum_{n \ge 2} \frac{\log n}{n(n - 1)}. \tag{4.3.3}
\]
Here the inequality on the left hand side of (4.3.3) follows since the von Mangoldt function is nonzero only for $n = p^m$, hence all terms in both sums agree except those with denominators which are powers of primes with $m \ge 2$. The final series on the right hand side of (4.3.3) is convergent: since $\dfrac{1}{n(n - 1)} \le \dfrac{2}{n^2}$, and from the proof of Chebyshev’s theorem, $\log n < \sqrt{n}$ for $n \ge 2$, it follows that
\[
\sum_{n \ge 2} \frac{\log n}{n(n - 1)} < 2 \sum_{n \ge 2} \frac{\log n}{n^2} < 2 \sum_{n \ge 2} \frac{\sqrt{n}}{n^2} = 2 \sum_{n \ge 2} \frac{1}{n^{3/2}} = C < \infty.
\]
The last series is the convergent $p$-series with $p = \frac{3}{2}$. This shows the second formula in (4.1.1) holds since the difference between the two summations is always bounded above by this constant. That is, by (4.3.3),
\[
\sum_{n \le N} \frac{\Lambda(n)}{n} - \sum_{p \le N} \frac{\log p}{p} = O(1).
\]
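Now
\begin{align*}
\sum_{p \le N} \frac{\log p}{p}
&= \sum_{n \le N} \frac{\Lambda(n)}{n} + \left( \sum_{p \le N} \frac{\log p}{p} - \sum_{n \le N} \frac{\Lambda(n)}{n} \right) \\
&= \log N + O(1) + O(1) \\
&= \log N + O(1).
\end{align*}
Therefore
\[
\sum_{p \le N} \frac{\log p}{p} = \log N + O(1).
\]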
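Both formulas in (4.1.1) say that a certain error stays bounded. The sketch below, with helper names of our own, tabulates the von Mangoldt function and returns the two differences $\sum_{n\le N}\Lambda(n)/n - \log N$ and $\sum_{p\le N}\log p/p - \log N$, so that their boundedness can be observed for moderate $N$.

```python
import math

def von_mangoldt_table(limit):
    """Return (lam, primes) where lam[n] = Lambda(n), i.e. log p if n is a
    power of the prime p and 0 otherwise, and primes lists the primes <= limit."""
    lam = [0.0] * (limit + 1)
    primes = []
    is_composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            primes.append(p)
            for multiple in range(p * p, limit + 1, p):
                is_composite[multiple] = 1
            power = p
            while power <= limit:  # mark p, p^2, p^3, ...
                lam[power] = math.log(p)
                power *= p
    return lam, primes

def first_mertens_errors(N):
    """Return (sum Lambda(n)/n - log N, sum log(p)/p - log N) for n, p <= N."""
    lam, primes = von_mangoldt_table(N)
    lam_sum = math.fsum(lam[n] / n for n in range(2, N + 1))
    prime_sum = math.fsum(math.log(p) / p for p in primes)
    return lam_sum - math.log(N), prime_sum - math.log(N)

if __name__ == "__main__":
    for N in (100, 1_000, 10_000):
        print(N, first_mertens_errors(N))
```

The gap between the two returned values is exactly the contribution of the prime powers $p^m$ with $m \ge 2$, which is the quantity bounded in (4.3.3).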
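\[
\limsup_{N\to\infty} \frac{\pi(N)}{N/\log N} \ge 1 \quad\text{and}\quad \liminf_{N\to\infty} \frac{\pi(N)}{N/\log N} \le 1.
\]
Therefore, if
\[
\lim_{N \to \infty} \frac{\pi(N)}{N/\log N}
\]
exists, then the limit must be equal to 1. From this, we obtain the asymptotic formula in the prime number theorem.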
To begin the proof, recall the previously mentioned relation
\[
\psi(t) = \sum_{n \le t} \Lambda(n).
\]
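Now, when considering $t = 1, 2, 3, \ldots, N$ we see that we can rewrite
\[
\int_1^N \sum_{n \le t} \Lambda(n)\,\frac{dt}{t^2}
= \int_1^2 \Lambda(1)\,\frac{1}{t^2}\,dt + \int_2^3 \left(\Lambda(1) + \Lambda(2)\right) \frac{1}{t^2}\,dt + \cdots + \int_{N-1}^{N} \left(\Lambda(1) + \cdots + \Lambda(N - 1)\right) \frac{1}{t^2}\,dt
\]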
The integral on the right hand side of (4.3.4) is equal to $\dfrac{1}{n} - \dfrac{1}{N}$; substituting this back in and simplifying, we obtain
\[
\int_1^N \frac{\psi(t)}{t^2}\,dt = \sum_{n \le N} \frac{\Lambda(n)}{n} - \frac{\psi(N)}{N}.
\]
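The identity just obtained can be verified exactly for integer $N$, because $\psi$ is constant on each interval $[k, k+1)$. The sketch below uses our own helper names and evaluates the integral as a finite sum of the pieces $\psi(k)\left(\frac{1}{k} - \frac{1}{k+1}\right)$.

```python
import math

def lambda_and_psi(limit):
    """Return (lam, psi) with lam[n] = Lambda(n) and psi[k] = sum of Lambda(n), n <= k."""
    lam = [0.0] * (limit + 1)
    is_composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for multiple in range(p * p, limit + 1, p):
                is_composite[multiple] = 1
            power = p
            while power <= limit:
                lam[power] = math.log(p)
                power *= p
    psi, total = [0.0] * (limit + 1), 0.0
    for n in range(1, limit + 1):
        total += lam[n]
        psi[n] = total
    return lam, psi

def integral_psi_over_t_squared(psi, N):
    """Exact value of the integral of psi(t)/t^2 over [1, N] for integer N.

    psi is constant on [k, k+1), so each unit interval contributes
    psi(k) * (1/k - 1/(k+1)).
    """
    return math.fsum(psi[k] * (1.0 / k - 1.0 / (k + 1)) for k in range(1, N))

if __name__ == "__main__":
    N = 5_000
    lam, psi = lambda_and_psi(N)
    left = integral_psi_over_t_squared(psi, N)
    right = math.fsum(lam[n] / n for n in range(1, N + 1)) - psi[N] / N
    print(left, right, left - math.log(N))
```

The last printed value stays bounded as $N$ grows, which is what (3.4.1) asserts.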
The proof requires defining the proper sequences and functions for use in Abel’s summation (4.2.3). Let $\lambda_n = p_n$, where $p_n$ is the sequence of prime numbers in natural order. We define
\[
A(N) = \sum_{p_n \le N} a_n, \quad\text{where } a_n = \frac{\log p_n}{p_n}
\]
and $\phi(p_n) = \dfrac{1}{\log p_n}$. Abel summation gives us
\[
\sum_{p_n \le N} \frac{1}{p_n} = \frac{A(N)}{\log N} + \int_2^N \frac{A(t)}{t(\log t)^2}\,dt.
\]
Now from the second formula in (4.1.1), we let $A(N) = \log N + E(N)$, so that
\[
\sum_{p_n \le N} \frac{1}{p_n} = 1 + \frac{E(N)}{\log N} + \int_2^N \frac{\log t + E(t)}{t(\log t)^2}\,dt
\]
where by (4.1.1), $E(N)$ is $O(1)$, thus $|E(N)| \le K$ for all $N \ge 2$, where $K$ is some constant. We can move this around a little to get
\[
\sum_{p_n \le N} \frac{1}{p_n} = 1 + \frac{E(N)}{\log N} + \int_2^N \frac{dt}{t \log t} + \int_2^N \frac{E(t)}{t(\log t)^2}\,dt.
\]
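Using the fact that $E(N) = O(1)$, we have $\dfrac{|E(N)|}{\log N} \le \dfrac{K}{\log N}$. Since $\int_2^N \dfrac{dt}{t \log t} = \log\log N - \log\log 2$, it remains to show that
\[
\int_2^N \frac{E(t)}{t(\log t)^2}\,dt
\]
differs from a constant by $O\!\left(\dfrac{1}{\log N}\right)$. We write
\[
\int_2^N \frac{E(t)}{t(\log t)^2}\,dt = \int_2^\infty \frac{E(t)}{t(\log t)^2}\,dt - \int_N^\infty \frac{E(t)}{t(\log t)^2}\,dt,
\]
where
\[
\left| \int_N^\infty \frac{E(t)}{t(\log t)^2}\,dt \right| \le \frac{K}{\log N}.
\]
Hence
\[
\left| \frac{E(N)}{\log N} - \int_N^\infty \frac{E(t)}{t(\log t)^2}\,dt \right| \le \frac{2K}{\log N},
\]
for $N \ge 2$, which shows the error term is $O\!\left(\dfrac{1}{\log N}\right)$. Setting
\[
C = 1 - \log\log 2 + \int_2^\infty \frac{E(t)}{t(\log t)^2}\,dt,
\]
(4.1.3) follows. This completes the proof of Mertens’ theorem.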
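Mertens’ theorem can be observed numerically by tabulating $\sum_{p \le N} 1/p - \log\log N$ for increasing $N$. The sketch below is illustrative only, with helper names of our own. The reference value $0.2614972$ used in the comparison is the standard numerical value of this constant (often called the Meissel-Mertens constant); it is quoted from outside this paper.

```python
import math

def primes_up_to(n):
    """Return all primes <= n using the Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if is_prime[i]]

def mertens_gap(N):
    """Return sum_{p <= N} 1/p - log log N, which should approach the constant C."""
    return math.fsum(1.0 / p for p in primes_up_to(N)) - math.log(math.log(N))

# Assumed reference value of C, taken from standard tables rather than this paper.
C_REFERENCE = 0.2614972

if __name__ == "__main__":
    for N in (10**3, 10**4, 10**5):
        print(N, mertens_gap(N), mertens_gap(N) - C_REFERENCE)
```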
§5: The Product Form of Mertens’ Theorem
The goal of this section is to prove an important result that follows from the proof of Mertens’ theorem in 4.3, that is [10]
\[
\prod_{p \le N} \left(1 - \frac{1}{p}\right) = \frac{e^{-\gamma}}{\log N}\,(1 + o(1)) \sim \frac{e^{-\gamma}}{\log N}, \tag{5.0}
\]
The symbol $\gamma$ was first used by the geometer Lorenzo Mascheroni, who in 1790 used the symbol $\gamma$ instead of Euler’s $C$, hence the name Euler-Mascheroni. We can use the very useful Abel summation formula (4.2.3) to give us an approximation to $\gamma$. Let $a_n = 1$, so that $A(N) = \lfloor N \rfloor$ and let $\phi(t) = \dfrac{1}{t}$; then Abel summation gives us
\[
\sum_{n \le N} \frac{1}{n} = \frac{\lfloor N \rfloor}{N} + \int_1^N \frac{\lfloor t \rfloor}{t^2}\,dt. \tag{5.1.2}
\]
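where
\[
|E(N)| = \left| \int_N^\infty \frac{t - \lfloor t \rfloor}{t^2}\,dt - \frac{N - \lfloor N \rfloor}{N} \right|
\le \int_N^\infty \frac{t - \lfloor t \rfloor}{t^2}\,dt + \frac{N - \lfloor N \rfloor}{N} < \frac{2}{N}.
\]
Thus we have
\[
\sum_{n \le N} \frac{1}{n} - \log N - \frac{2}{N} < \gamma < \sum_{n \le N} \frac{1}{n} - \log N + \frac{2}{N}.
\]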
This inequality provides a method to compute the value of $\gamma$ to several places. To fifteen decimal places, the value of $\gamma$ is 0.577215664901533. [3]
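The two-sided inequality for $\gamma$ translates directly into a short computation. The following sketch, with a function name of our own, returns the lower and upper bounds for a given $N$; the width of the interval is $4/N$, so this is a slow but transparent way to approximate $\gamma$.

```python
import math

def gamma_bounds(N):
    """Return (lower, upper) with lower < gamma < upper, where the bounds are
    H_N - log N - 2/N and H_N - log N + 2/N and H_N is the N-th harmonic number."""
    harmonic = math.fsum(1.0 / n for n in range(1, N + 1))
    centre = harmonic - math.log(N)
    return centre - 2.0 / N, centre + 2.0 / N

if __name__ == "__main__":
    for N in (10, 1_000, 100_000):
        print(N, gamma_bounds(N))
```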
The constant $\gamma$ is closely connected to the Gamma function,
\[
\Gamma(N) = \int_0^\infty e^{-x} x^{N-1}\,dx. \tag{5.1.5}
\]
This connection is due to the mathematician Karl Weierstrass (1815-1897) and his formula
\[
\frac{1}{\Gamma(N)} = N e^{\gamma N} \prod_{k > 0} \left(1 + \frac{N}{k}\right) e^{-N/k}. \tag{5.1.6}
\]
It requires some complex analysis to show that (5.1.5) and (5.1.6) are equivalent representations for $\Gamma(N)$ [19]. Differentiating (5.1.5) with respect to $N$ we obtain
\[
\Gamma'(N) = \int_0^\infty e^{-x} x^{N-1} \log x\,dx, \qquad \Gamma'(1) = \int_0^\infty e^{-x} \log x\,dx. \tag{5.1.7}
\]
The product form of Mertens’ theorem verifies that this product tends to zero asymptotically as $\dfrac{1}{e^{\gamma} \log N}$. The product approaches zero surprisingly slowly: the product form of Mertens’ theorem shows that at $N = 100{,}000$, $\dfrac{1}{e^{.577216} \log 100{,}000} \approx .049$, thus according to this formulation, we still have about a 5% chance that the integer is prime.
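The comparison between the product and its asymptotic form can be reproduced with a few lines of code. The sketch below is an illustration with helper names of our own; the value of $\gamma$ is hard-coded.

```python
import math

GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for this sketch

def primes_up_to(n):
    """Return all primes <= n using the Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if is_prime[i]]

def mertens_product(N):
    """Product of (1 - 1/p) over primes p <= N."""
    product = 1.0
    for p in primes_up_to(N):
        product *= 1.0 - 1.0 / p
    return product

def mertens_estimate(N):
    """The asymptotic value e^{-gamma} / log N from the product form (5.0)."""
    return math.exp(-GAMMA) / math.log(N)

if __name__ == "__main__":
    for N in (10, 100, 10**5):
        print(N, mertens_product(N), mertens_estimate(N))
```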
[Fig. 5.2.1: The diamonds are the graph of $\prod_{p \le N+1} \left(1 - \frac{1}{p}\right)$, and the stars are the graph of $\dfrac{1}{e^{.577216} \log(N + 1)}$ for $1 \le N \le 10$. Vertical axis: $p$; horizontal axis: $N$.]

[Fig. 5.2.2: The diamonds are the graph of $\prod_{p \le N+1} \left(1 - \frac{1}{p}\right)$, and the stars are the graph of $\dfrac{1}{e^{.577216} \log(N + 1)}$ for $1 \le N \le 100$. Vertical axis: $p$; horizontal axis: $N$.]
The reader may notice a sort of “paradox” between this interpretation of (5.0) and the prime number theorem. The prime number theorem asserts that the density of prime numbers is $\dfrac{1}{\log N}$, whereas Mertens’ theorem gives the quite non-asymptotic $\dfrac{1}{e^{\gamma} \log N}$. In fact compared to the prime number theorem, since $e^{\gamma} \approx 1.78$, the probability underestimates the density of primes by nearly a factor of 2. One explanation is that the Sieve of Eratosthenes is more efficient than random. Recall from the introduction that we find all primes $p \le N$ by sieving with primes $p \le \sqrt{N}$. Thus a number $N$ can be determined to be prime by testing whether it is divisible by any of the primes $p \le \sqrt{N}$. If in our probability formulation, we take the probability of $N$ being prime to be
\[
\prod_{p \le \sqrt{N}} \left(1 - \frac{1}{p}\right) \sim \frac{2}{e^{\gamma} \log N} \approx \frac{1.12}{\log N},
\]
then Mertens’ theorem is much closer to $\dfrac{1}{\log N}$, but now overestimates by 12% the number of primes. Our intuition in formulating the probability in this section relied on the assumption of independent events, and when using the Sieve of Eratosthenes approach, these events are not completely independent. In light of the prime number theorem, $\dfrac{1}{\log N}$ is the true density of primes; but the density obtained by Mertens’ theorem is a decent estimate, and shows that the product tends to zero.
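The three candidate densities discussed above can be placed side by side numerically. The sketch below is illustrative and uses our own function names; it compares the sieve product over $p \le \sqrt{N}$ with $\frac{2}{e^{\gamma}\log N}$, with $\frac{1}{\log N}$, and with the observed proportion $\pi(N)/N$.

```python
import math

GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for this sketch

def primes_up_to(n):
    """Return all primes <= n using the Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if is_prime[i]]

def density_comparison(N):
    """Return (sieve product over p <= sqrt(N), 2e^{-gamma}/log N, 1/log N, pi(N)/N)."""
    primes = primes_up_to(N)
    root = math.isqrt(N)
    sieve_product = 1.0
    for p in primes:
        if p > root:
            break
        sieve_product *= 1.0 - 1.0 / p
    log_n = math.log(N)
    return sieve_product, 2.0 * math.exp(-GAMMA) / log_n, 1.0 / log_n, len(primes) / N

if __name__ == "__main__":
    print(density_comparison(10**6))
```

The ratio of the second entry to the third is $2e^{-\gamma} \approx 1.12$ for every $N$, which is the 12% overestimate mentioned in the text.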
Because $\dfrac{1}{\log N} = o(1)$, Mertens’ theorem can be restated as
\[
\sum_{p \le N} \frac{1}{p} = \log\log N + C + o(1), \tag{5.3.1}
\]
where
\[
C = 1 - \log\log 2 + \int_2^\infty \frac{E(t)}{t(\log t)^2}\,dt.
\]
(5.0) will follow from (5.3.1) and the following theorem which gives an alternate form of the constant $C$ in (5.3.1).
Theorem:
\[
C = \gamma + \sum_{p} \left[ \log\left(1 - \frac{1}{p}\right) + \frac{1}{p} \right]. \tag{5.3.2}
\]
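If $\alpha \ge 0$, we have, by (2.1.8)
\[
0 < -\log\left(1 - \frac{1}{p^{1+\alpha}}\right) - \frac{1}{p^{1+\alpha}} < \frac{1}{2p^{1+\alpha}(p^{1+\alpha} - 1)} \le \frac{1}{2p(p - 1)}.
\]
We saw in (2.1.8) that the series is convergent. Thus by the Weierstrass M-test, we can define for each $\alpha$ a uniformly convergent series
\[
F(\alpha) = \sum_{p} \left[ \log\left(1 - \frac{1}{p^{1+\alpha}}\right) + \frac{1}{p^{1+\alpha}} \right]. \tag{5.3.3}
\]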
Since the series is uniformly convergent for all $\alpha \ge 0$, $F(\alpha)$ is continuous, thus $F(\alpha) \to F(0)$ as $\alpha \to 0$ through positive values. If we now suppose that $\alpha > 0$, then from the equality between Euler’s product and the zeta-function discussed in the introduction, we have
\[
F(\alpha) = g(\alpha) - \log \zeta(1 + \alpha),
\]
where
\[
g(\alpha) = \sum_{p} \frac{1}{p^{1+\alpha}}.
\]
We again call upon the Abel summation formula (4.2.3), with the following definitions: let the sequence $\lambda_n$ be the sequence of positive integers with $\lambda_1 = 2$,
\[
a_n = \begin{cases} \dfrac{1}{n}, & \text{if } n \text{ is prime} \\ 0, & \text{otherwise} \end{cases}
\]
and $\phi(N) = \dfrac{1}{N^{\alpha}}$. Now, from the proof of Mertens’ Theorem in 4.3 we have that
\[
A(N) = \sum_{p \le N} \frac{1}{p} = \log\log N + C + E(N),
\]
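If we let $N \to \infty$, we have
\begin{align*}
g(\alpha) &= \alpha \int_2^\infty \frac{A(t)}{t^{1+\alpha}}\,dt \\
&= \alpha \int_2^\infty \frac{\log\log t}{t^{1+\alpha}}\,dt + \alpha C \int_2^\infty \frac{1}{t^{1+\alpha}}\,dt + \alpha \int_2^\infty \frac{E(t)}{t^{1+\alpha}}\,dt.
\end{align*}
Now,
\[
\alpha \int_2^\infty \frac{\log\log t}{t^{1+\alpha}}\,dt = \alpha \int_1^\infty \frac{\log\log t}{t^{1+\alpha}}\,dt - \alpha \int_1^2 \frac{\log\log t}{t^{1+\alpha}}\,dt.
\]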
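Making the change of variables $t = e^{u/\alpha}$,
\[
\alpha \int_1^\infty \frac{\log\log t}{t^{1+\alpha}}\,dt = \int_0^\infty e^{-u} \log\left(\frac{u}{\alpha}\right) du = -\gamma - \log \alpha
\]
by (5.1.9), and
\[
\alpha \int_1^\infty \frac{1}{t^{1+\alpha}}\,dt = 1.
\]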
Therefore
\[
g(\alpha) + \log \alpha - C + \gamma = \alpha \int_2^\infty \frac{E(t)}{t^{1+\alpha}}\,dt - \alpha \int_1^2 \frac{\log\log t + C}{t^{1+\alpha}}\,dt.
\]
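It was shown in 4.3 that $E(t) = O\!\left(\dfrac{1}{\log t}\right)$. Using this, and setting $T = e^{1/\sqrt{\alpha}}$,
\begin{align*}
\left| \alpha \int_2^\infty \frac{E(t)}{t^{1+\alpha}}\,dt \right|
&< K\alpha \int_2^T \frac{dt}{t} + \frac{K\alpha}{\log T} \int_T^\infty \frac{dt}{t^{1+\alpha}} \\
&< K\alpha \log T + \frac{K}{\log T} \le 2K\sqrt{\alpha} \to 0 \quad\text{as } \alpha \to 0.
\end{align*}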
We also have
\[
\left| \int_1^2 \frac{\log\log t + C}{t^{1+\alpha}}\,dt \right| < \int_1^2 \frac{|\log\log t| + |C|}{t}\,dt = K,
\]
since the integral converges at $t = 1$. Therefore $g(\alpha) + \log \alpha \to C - \gamma$ as $\alpha \to 0$.
Recall from section 1.2, the zeta-function, which can be written in the following form
\[
\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s} = \int_1^\infty x^{-s}\,dx + \sum_{n=1}^\infty \int_n^{n+1} \left(n^{-s} - x^{-s}\right) dx. \tag{5.3.4}
\]
Therefore
\[
\zeta(s) = \frac{1}{s - 1} + O(1),
\]
and on taking logarithms, for $|s - 1| < 1$,
\[
\log \zeta(s) = \log \frac{1}{s - 1} + \log\left[1 + O(s - 1)\right] = \log \frac{1}{s - 1} + O(s - 1). \tag{5.3.5}
\]
Now, from (5.3.5) we get log ζ(1 + α) + log α → 0 as α → 0, and so F (α) → C − γ.
Therefore
C = γ + F (0),
which is (5.3.2).
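Formula (5.3.2) gives a rapidly convergent way to approximate $C$, since the terms $\log(1 - 1/p) + 1/p$ are of size about $-1/(2p^2)$. The sketch below is an illustration with names of our own; the value of $\gamma$ is hard-coded, and the series is simply truncated at a chosen limit.

```python
import math

GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for this sketch

def primes_up_to(n):
    """Return all primes <= n using the Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if is_prime[i]]

def constant_from_series(limit):
    """Approximate C = gamma + sum over primes p of [log(1 - 1/p) + 1/p],
    truncating the sum at p <= limit."""
    primes = primes_up_to(limit)
    return GAMMA + math.fsum(math.log1p(-1.0 / p) + 1.0 / p for p in primes)

def constant_from_partial_sums(N):
    """Approximate C directly as sum_{p <= N} 1/p - log log N, as in (5.3.1)."""
    return math.fsum(1.0 / p for p in primes_up_to(N)) - math.log(math.log(N))

if __name__ == "__main__":
    print(constant_from_series(10**5), constant_from_partial_sums(10**5))
```

The two approximations agree to a few decimal places at this range, with the series form converging much faster.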
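It is now relatively easy to prove (5.0), by means of (5.3.1) and (5.3.2). To see this, using our new form of the constant $C$ in (5.3.2) in (5.3.1) we write
\[
\sum_{p \le N} \frac{1}{p} = \log\log N + \gamma + \sum_{p} \left[ \log\left(1 - \frac{1}{p}\right) + \frac{1}{p} \right] + o(1). \tag{5.3.6}
\]
The tail of the convergent series over $p > N$ is $o(1)$, thus
\[
\sum_{p \le N} \frac{1}{p} = \log\log N + \gamma + \sum_{p \le N} \log\left(1 - \frac{1}{p}\right) + \sum_{p \le N} \frac{1}{p} + o(1).
\]
Cancelling $\sum_{p \le N} 1/p$ and exponentiating we get
\[
e^{\sum_{p \le N} \log\left(1 - \frac{1}{p}\right)} = e^{-\log\log N - \gamma - o(1)},
\]
using properties of logarithms we obtain
\[
e^{\log \prod_{p \le N} \left(1 - \frac{1}{p}\right)} = e^{-\log\log N}\, e^{-\gamma}\, e^{-o(1)},
\]
finally
\[
\prod_{p \le N} \left(1 - \frac{1}{p}\right) = \frac{1}{\log N}\, e^{-\gamma} (1 + o(1)) \sim \frac{e^{-\gamma}}{\log N},
\]
which is (5.0).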
5.4 Another Proof of The Product Form of Mertens’ Theorem
The following proof uses many of the same steps and lemmas as the previous proof;
but the integral evaluations for finding the expressions in the sum F (α) are simplified by
an exponential substitution. This proof also does not require using the Gamma function
form of $\gamma$ given by (5.1.9), only the definition of $\gamma$. We begin by using (5.3.5) to write
\[
\log \zeta(1 + \alpha) = \log \frac{1}{\alpha} + O(\alpha).
\]
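From the Maclaurin series expansion
\begin{align*}
1 - e^{-\alpha} &= 1 - \left(1 - \alpha + \frac{\alpha^2}{2!} - \frac{\alpha^3}{3!} + \cdots \right) \\
&= \alpha - \frac{\alpha^2}{2!} + \frac{\alpha^3}{3!} - \cdots \\
&= \alpha \left(1 + \alpha\left(-\frac{1}{2!} + \frac{\alpha}{3!} - \cdots\right)\right) \\
&= \alpha\,(1 + O(\alpha)), \quad\text{as } \alpha \to 0.
\end{align*}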
the last line follows from the Maclaurin series expansion for − log(1 − x) as shown in
(2.1.8). This relationship is a key difference between the two proofs and simplifies the
analysis considerably.
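From (5.3.3) we have the second representation of $\log \zeta(1 + \alpha)$:
\[
\log \zeta(1 + \alpha) = \sum_{p} \frac{1}{p^{1+\alpha}} - F(\alpha).
\]
Now letting
\[
H(t) = \sum_{n \le t} \frac{1}{n}, \qquad P(t) = \sum_{p \le t} \frac{1}{p},
\]
by Abel summation
\[
\sum_{p} \frac{1}{p^{1+\alpha}} = \alpha \int_0^\infty P(e^t)\, e^{-\alpha t}\,dt;
\]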
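and
\[
\sum_{n=1}^\infty \frac{e^{-\alpha n}}{n} = \alpha \int_0^\infty H(t)\, e^{-\alpha t}\,dt.
\]
Therefore
\[
\log \zeta(1 + \alpha) = \alpha \int_0^\infty H(t)\, e^{-\alpha t}\,dt + O(\alpha)
\]
and
\[
\log \zeta(1 + \alpha) = \alpha \int_0^\infty P(e^t)\, e^{-\alpha t}\,dt - F(\alpha).
\]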
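By (5.1.1)-(5.1.4) we have
\[
H(t) = \log t + \gamma + O\!\left(\frac{1}{t}\right).
\]
We have from (5.3.1) that
\[
P(e^t) = \log t + C + o(1).
\]
Therefore
\begin{align*}
\alpha \int_0^\infty e^{-\alpha t} \left(H(t) - P(e^t)\right) dt
&= \alpha \int_0^\infty e^{-\alpha t} \left(\gamma - C + o(1)\right) dt \\
&= \alpha \left[ \frac{e^{-\alpha t}}{-\alpha} \right]_0^\infty \left(\gamma - C + o(1)\right) \\
&= \gamma - C + o(1)
\end{align*}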
§6: Conclusion
provided the connection between the zeta-function and the primes. We relied on Euler’s product in 5.3 when we proved the product form of Mertens’ theorem.
As was shown throughout this paper, the infinite sum of prime reciprocals is interwoven in the theory of the distribution of prime numbers. As we observed in section 2.1, Euler used his product to prove that
\[
\sum_{p \le N} \frac{1}{p} > \log\log N - \frac{1}{2}, \quad\text{for all } N \ge 2.
\]
In 3.2 the sum’s divergence was shown to be a consequence of Chebyshev’s theorem, which provides upper and lower bounds for $\pi(N)$. Further, we could not have proved the asymptotic formulas for the product form of Mertens’ theorem without the formula
\[
\sum_{p \le N} \frac{1}{p} = \log\log N + C + O\!\left(\frac{1}{\log N}\right),
\]
as we saw in 4.3. The divergence of the sum of prime reciprocals also shows that the primes are more numerous than the squares, and in fact more numerous than the integers of the form $n^{1+\alpha}$ for any $\alpha > 0$, since the sum of the reciprocals of such numbers always converges.
number divisible by 3. Therefore, there will be no other repetitions of twin primes in the
twin prime pairs. It is still only a conjecture that there are infinitely many twin prime
pairs. Most mathematicians believe that there are infinitely many, but a proof still remains
elusive.
In 1919, the Norwegian mathematician Viggo Brun (1885-1978) proved that, in stark contrast to the divergence of the sum of prime reciprocals,
\[
\sum_{p \text{ a twin prime}} \frac{1}{p} < \infty. \tag{6.2.1}
\]
where π2 (N ) is the number of primes p ≤ N for which p + 2 is also prime (the number of
twin prime pairs with first number in the pair less than or equal to N ). There is no known
easy proof of this result; perhaps the most accessible is a proof by Edmund Landau
(1877-1938) who proved this result by a detailed analysis of the properties of certain
alternating binomial coefficients [12]. Brun’s theorem at least tells us that there are not too
many twin prime pairs.
References
[1] Bays, C. and Hudson,R., (1999). A new bound for the smallest x with π(x) > li(x).
Mathematics of Computation vol.69, 231: pp.1285-1296.
[2] Bell, E.T., (1937). Men of Mathematics. Simon and Schuster Inc., New York, New
York.
[3] Brent, R.P., (1977). Computation of the regular continued fraction for Euler’s
constant. Mathematics of Computation vol. 31: pp. 771-777.
[5] Cojocaru, A. C, and Murty, M.R. (2006). An Introduction to Sieve Methods and their
Applications. Cambridge University Press, New York, New York.
[6] Eynden, C.V., (1980, May). Proofs that $\sum 1/p$ diverges. The American Mathematical
Monthly vol. 87, No.5: pp. 394-397.
[7] Goldstein, L.J., (1973). A history of the prime number theorem. American
Mathematical Monthly 80 (June-July): pp. 599-615.
[8] Goldston, D.A., Are there infinitely many twin primes? Article.
[9] H. Halberstam and H.-E. Richert (1974), Sieve Methods. Academic Press, New York.
[10] Hardy, G.H., and Wright, E.M., (1962). The Theory of Numbers, fourth edition.
Oxford University Press, Amen House, London.
[11] Hungerford, T., (1974). Algebra. Springer Science+Business Media, LLC, New York,
New York.
[12] Landau, E., (1958). Elementary Number Theory. Chelsea Publishing Company, New
York, N.Y.
[14] Niven, I., (1971, May). A proof of the divergence of $\sum 1/p$. The American Mathematical
Monthly vol. 78, No.3: pp. 272-273.
[15] Rosen, K.H., (2000). Elementary Number Theory, 4th edition. AT&T Laboratories
and Kenneth Rosen.
[16] Sandifer, E., (2006, March). How Euler did it. Mathematical Association of America
Online. June 27, 2008.
http://fermatslasttheorem.blogspot.com/2006/08/euler-product-formula.html
[17] Sautoy, M.D., (2003). Music of the Primes, 1st edition. Harper Collins Publishers Inc.,
New York, New York.
[18] The Great Internet Mersenne Prime Search. July 16, 2008 at
http://www.mersenne.org/prime.htm
[19] Titchmarsh, E.C., (1939). The Theory of Functions, second edition. Oxford University
Press, Amen House, London.
[20] Whittaker, E. T., and Watson, G. N., (1963). A Course in Modern Analysis, fourth
edition. Cambridge University Press, New York, New York.