Divergence of Prime Reciprocals

THE DIVERGENCE OF THE SUM OF RECIPROCALS OF PRIMES AND MERTENS’ THEOREM
A Writing Project
Presented to
The Faculty of the Department of Mathematics
San José State University
In Partial Fulfillment
of the Requirements for the Degree
Master of Arts
by
Simon A. Ward
May 2009

© 2009
Simon A. Ward
ALL RIGHTS RESERVED
APPROVED FOR THE DEPARTMENT OF MATHEMATICS
Dr. Daniel Goldston
Dr. Marylin Blockus
Dr. Mohammed Saleem
APPROVED FOR THE UNIVERSITY

ABSTRACT
THE DIVERGENCE OF THE SUM OF RECIPROCALS OF PRIMES AND MERTENS’ THEOREM
by Simon A. Ward
The divergence or convergence of the series of prime reciprocals was finally resolved
by Euler in 1744. Euler showed directly that this series is divergent, which shed some light
on the density of primes. The divergence shows that the primes are not so sparse that the sum
of their reciprocals converges. The asymptotic formula for the partial sums of reciprocals of
primes is the cornerstone of Mertens’ theorem. The formula is derived and the constant in the
formula is analyzed to show that the probability that a random integer is prime decreases to
zero with large numbers. This probability estimate is known as the product form of Mertens’
theorem. Many proofs of the divergence of the series of prime reciprocals are reviewed in
detail, including modern proofs by Erdös, Dux, Clarkson and Niven. Chebyshev’s theorem
provides an upper and lower bound for the number of primes up to any given number.
The proof of this theorem is explained in detail, and the divergence of the series of prime
reciprocals is shown to be a consequence. This paper details many of the important results
in the history of the study of the distribution of prime numbers. The divergence of the sum of prime
reciprocals is shown by many different approaches, and along the way many supporting
lemmas and other useful results in elementary number theory are discussed and proved. The
reader will come away with a good understanding of the problem of counting prime numbers
and a motivation for understanding and proving the prime number theorem.
ACKNOWLEDGEMENTS
I would like to thank Daniel Goldston for the guidance and resources he provided. Thank
you to committee members Marylin Blockus and Mohammed Saleem for many helpful suggestions
regarding this project. Thank you to my wife Nora and my family for their support
while I completed this project.
TABLE OF CONTENTS
§1: Introduction
1.1: Prime Numbers
1.2: A Formula for Prime Numbers
1.3: The Prime Number Theorem
1.4: Euler’s Product
1.5: The Divergence of the Sum of Prime Reciprocals
§2: The Divergence of $\sum_p \frac{1}{p}$
2.1: Euler’s Proof
2.2: Erdös’ Proof
2.3: Dux’s Proof
2.4: Clarkson’s Proof
2.5: Niven’s Proof
§3: Chebyshev’s Theorem
3.1: Theorem (Chebyshev)
3.2: The Chebyshev Function Theorem
3.3: Proof of Chebyshev’s Theorem
3.4: Another Small Step Towards the Prime Number Theorem
§4: Mertens’ Theorem
4.1: Introduction
4.2: Preliminary Lemmas
4.3: Proof of Mertens’ Theorem
§5: The Product Form of Mertens’ Theorem
5.1: The Euler-Mascheroni Constant
5.2: The Product Form as a Probability of a Number Being Prime
5.3: Proof of the Product Form of Mertens’ Theorem
5.4: Another Proof of the Product Form of Mertens’ Theorem
§6: Conclusion
6.1: Final Remarks
6.2: Further Results
References
§1: Introduction
1.1 Prime Numbers
The prime numbers are the natural numbers which have exactly two distinct positive divisors,
one and themselves. The Chinese thought of the prime numbers as “macho” numbers,
which resisted any attempt to break them down into a product of smaller
integers [17]. From the fundamental theorem of arithmetic, we can conclude that the prime
numbers are the most basic elements of the integers. That is, every positive integer greater
than 1 can be written uniquely as a product of primes, with the prime factors in the
product written in nondecreasing order [15]. A composite number is a number greater than
1 that is not prime. Thus 1 is the only natural number which is neither prime nor
composite. The integers also form an integral domain, whose field of quotients forms the
rational numbers, which in turn can be used to construct the real numbers [11]. The
fundamental importance of prime numbers to mathematics makes them worthy of special
study.
The ancient civilizations which emerged in Babylon, Egypt and China have left
evidence that they had an understanding of prime numbers. However, the properties of
prime numbers were first formally documented by the Pythagoreans before 500 B.C. in
Greece [21]. In the third century BC, the Greek Eratosthenes invented his famous sieve,
which could, with relative ease, determine all the primes up to a given number. For
example, suppose we want to create a list of primes up to a modest number, say 100. The
Sieve of Eratosthenes works by starting with 2, and deleting all numbers greater than 2
that are multiples of 2 (even numbers), all numbers greater than 3 that are multiples of 3,
and for each successive prime not exceeding $\sqrt{100} = 10$, deleting all multiples of that
prime. We do not need to consider a prime greater than 10, since any composite number
all of whose prime factors are greater than 10 would exceed 100 (the first such integer being
$11^2 = 121$). The numbers that remain in the list must be prime. This method allowed
mathematicians from antiquity to begin making tables of prime numbers [17]. Around the
same time as Eratosthenes, Euclid’s book Elements was published. This book contained a
proof that there are infinitely many primes. The elegance of the proof allows us to present
it here in one short paragraph.
Suppose there are only n primes, and consider the sequence of primes $2, 3, 5, \ldots, p_n$.
Take the product of these n primes, $2 \cdot 3 \cdots p_{n-1} \cdot p_n$, and then add one to it, forming the
integer $N = 2 \cdot 3 \cdots p_{n-1} \cdot p_n + 1$. By the fundamental theorem of arithmetic, N must have a
prime divisor. But if the prime divisor is any one of the $p_i$, then $p_i$ must divide
$N - 2 \cdot 3 \cdots p_{n-1} \cdot p_n = 1$. This implies $p_i$ divides one, which is a contradiction. Thus our
assumption that there are finitely many primes is false, and hence there must be infinitely
many primes. Euclid was responsible for the only known proof of the infinitude of the
primes for over 2,000 years!
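As an illustration, the Sieve of Eratosthenes described above might be sketched in Python as follows. This is only a minimal sketch and not part of the original text: the function name `sieve_of_eratosthenes` is an assumption made for illustration, and starting the deletions at $p^2$ rather than at $2p$ is a common shortcut that the description above does not mention.

```python
import math


def sieve_of_eratosthenes(limit):
    """Return the list of all primes p with 2 <= p <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    # Only primes not exceeding sqrt(limit) are needed to delete every composite.
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Start at p*p: smaller multiples of p have a smaller prime factor
            # and were already deleted (a shortcut, assumed here for brevity).
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


primes = sieve_of_eratosthenes(100)
print(len(primes), primes[-1])  # 25 primes up to 100, the largest being 97
```

For the limit 100, only the primes 2, 3, 5 and 7 are ever used for deleting, in agreement with the bound $\sqrt{100} = 10$.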
1.2 A Formula for Prime Numbers?

A natural question one might have when considering the sequence of prime numbers
$2, 3, 5, \ldots, p_n, \ldots$ is whether or not there is a formula that will yield the nth prime. That is,
can one find a formula involving n which will produce the nth prime? Mathematicians have
tried for centuries to find such a formula; however, no practical formula is known. It is highly likely
that no simple formula exists, since the gaps between consecutive primes are apparently
random. One might then search for a weaker function, that is, a function that assumes only
prime values, not necessarily sequentially. Over the course of mathematical history, there
have been a few formulas that have been conjectured to be prime valued. One such
function was conjectured by Pierre de Fermat (1605-1665) to always be prime valued, and
produces the nth “Fermat” number $F_n = 2^{2^n} + 1$. The first four Fermat numbers with $n \ge 1$,
namely 5, 17, 257 and 65,537, are all prime (as is $F_0 = 3$). The reader might recognize $F_n$ from a well known proof by Karl
Gauss (1777-1855), that if $F_n$ is prime, then a regular polygon of $F_n$ sides can be inscribed
in a circle with only a compass and straightedge. Fermat’s conjecture was proved false by
Euler in 1732, when he showed that 641 is a divisor of the fifth Fermat number
$F_5 = 2^{32} + 1 = 4,294,967,297$. Another theorem from Fermat provides a way to find
divisors of very large numbers [15].
Fermat’s Little Theorem: If p is a prime and a is a positive integer with $(a, p) = 1$, then
$a^{p-1} \equiv 1 \pmod{p}$, where $(m, n)$ denotes the greatest common divisor of m and n.
Proof: Consider the $p - 1$ integers $a, 2a, 3a, \ldots, (p-1)a$. None of these integers is
divisible by p, since if $p \mid ka$, then since $(a, p) = 1$, $p \mid k$. But $1 \le k \le p - 1$, and hence it is
not possible that $p \mid k$. Furthermore, no two of these integers are congruent mod p. For if
$sa \equiv ta \pmod{p}$ with $1 \le s < t \le p - 1$, then since $(a, p) = 1$, $s \equiv t \pmod{p}$, which is
impossible. Since none of the integers is divisible by p, and no two are congruent to
each other, they represent in some order the nonzero residue classes mod p. That is, they are
congruent, in some order, to $1, 2, \ldots, p - 1$. Therefore, by the multiplication properties of congruence, we
have $a^{p-1}(p-1)! \equiv (p-1)! \pmod{p}$. Now since $((p-1)!, p) = 1$, it follows that $a^{p-1} \equiv 1 \pmod{p}$.
The power of this theorem is that it gives information about the divisors of very large
integers, for Fermat’s little theorem states that $a^{p-1} - 1$ is divisible by p whenever
$(a, p) = 1$. For example, since $(6, 97) = 1$, $6^{96} - 1$ is divisible by 97, and therefore composite.
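The example with a = 6 and p = 97, and the key step of the proof, can be checked numerically. The following is a small sketch, not taken from the original text; the helper names `fermat_residue` and `multiples_mod_p` are assumptions chosen for illustration, while `pow` with three arguments is Python's built-in modular exponentiation.

```python
import math


def fermat_residue(a, p):
    """Return a^(p-1) mod p using built-in modular exponentiation."""
    return pow(a, p - 1, p)


def multiples_mod_p(a, p):
    """Residues of a, 2a, ..., (p-1)a mod p, sorted for comparison."""
    return sorted((k * a) % p for k in range(1, p))


assert math.gcd(6, 97) == 1
print(fermat_residue(6, 97))      # the theorem predicts 1
print((6 ** 96 - 1) % 97)         # so 97 divides 6^96 - 1
# The multiples a, 2a, ..., (p-1)a are a rearrangement of 1, 2, ..., p-1.
print(multiples_mod_p(6, 97) == list(range(1, 97)))
```

The last line reflects the step of the proof in which the multiples of a are shown to represent the nonzero residue classes in some order.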
Another formula that is used today to compute the largest known primes is the
formula $2^n - 1$. This function is certainly not always prime valued, since $2^4 - 1 = 15 = 3 \cdot 5$.
However, when this number is prime, then n must be a prime by the following [10].
Theorem: If $n > 1$ and $a^n - 1$ is prime, then $a = 2$, and n is prime.
Proof: $a^n - 1 = (a - 1)(a^{n-1} + a^{n-2} + \cdots + a + 1)$, and hence if $a > 2$ then both factors exceed 1 and $a^n - 1$ is
composite. Therefore $a = 2$. If $n = st$ where $1 < s \le t < n$, then
$2^n - 1 = (2^s)^t - 1 = (2^s - 1)((2^s)^{t-1} + (2^s)^{t-2} + \cdots + 2^s + 1)$. Since this number is prime, we
must have $2^s - 1 = 1$, that is $s = 1$. This is a contradiction, and hence n is prime.
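A brief numerical check of this theorem for small exponents might look as follows. This is a sketch under stated assumptions: `is_prime` is a plain trial-division test and `mersenne_exponents` is a hypothetical helper name, neither of which appears in the original text, and trial division is only practical here because the exponents are kept at 31 or below.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the small inputs used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def mersenne_exponents(max_n):
    """Exponents n in [2, max_n] for which 2^n - 1 is prime."""
    return [n for n in range(2, max_n + 1) if is_prime(2 ** n - 1)]


exponents = mersenne_exponents(31)
print(exponents)
# Every exponent found is itself prime, as the theorem requires ...
print(all(is_prime(n) for n in exponents))
# ... but a prime exponent does not guarantee a prime value: 2^11 - 1 = 23 * 89.
print(is_prime(2 ** 11 - 1), 23 * 89 == 2 ** 11 - 1)
```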
When $2^p - 1$ is prime, it is known as a “Mersenne” prime, named after the French
monk Marin Mersenne (1588-1648). Mersenne claimed that this formula gave prime
numbers for n = 2, 3, 5, 7, 13, 17, 19, 31, 67, 127 and 257. It turns out that 67 and 257 do not
give Mersenne primes. However, once a new prime P is found, then one knows that $2^P - 1$
is a good candidate for a large prime. The largest known prime to date is the Mersenne
prime $2^{43,112,609} - 1$ [18], which has about 13,000,000 digits!
Other formulas that have been conjectured to be only prime valued are $n^2 - n + 41$,
which is prime for all $0 \le n \le 40$, and $n^2 - 79n + 1601$, which is prime for all $0 \le n \le 79$.
One can stop the search for a prime valued polynomial function with integral coefficients
by the following [10].
Theorem:
No nonconstant polynomial f(n) with integral coefficients can be prime for all n, or for all
sufficiently large n.
Proof: We can assume that the leading coefficient of f(n) is positive, so that $f(n) \to \infty$ as
$n \to \infty$. Choose N so large that $f(n) > 1$ for all $n > N$, fix such an n, and write
$f(n) = a_s n^s + a_{s-1} n^{s-1} + \cdots + a_1 n + a_0 = M > 1.$
Then
$f(kM + n) = a_s (kM + n)^s + a_{s-1} (kM + n)^{s-1} + \cdots + a_1 (kM + n) + a_0$
is divisible by M for every integer k, since $f(kM + n) \equiv f(n) \equiv 0 \pmod{M}$. Because
$f(kM + n) \to \infty$ as $k \to \infty$, we have $f(kM + n) > M$ for all large k, so these values are proper multiples of M.
Hence f(n) is composite for infinitely many values. It follows that no such polynomial can
be strictly prime valued.
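The two quadratic polynomials mentioned above, and the divisibility step in the proof, can be illustrated with a short computation. This is a sketch only; the names `is_prime` and `f` are assumptions for illustration, and the choice n = 1 (so that M = f(1) = 41) is just one convenient instance of the argument.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the small inputs used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def f(n):
    """The polynomial n^2 - n + 41 from the text."""
    return n * n - n + 41


# Prime for 0 <= n <= 40, but f(41) = 41^2 is composite.
print(all(is_prime(f(n)) for n in range(0, 41)), f(41) == 41 ** 2)
# The second polynomial is prime for 0 <= n <= 79.
print(all(is_prime(n * n - 79 * n + 1601) for n in range(0, 80)))
# Step of the proof: with M = f(1), every value f(k*M + 1) is divisible by M.
M = f(1)
print(M, all(f(k * M + 1) % M == 0 for k in range(0, 50)))
```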
Not to be dissuaded, one might search for a function whose range contains infinitely
many primes. A trivial example is f(n) = n. G. Lejeune Dirichlet (1805-1859) has given us the
following.
Theorem:
If a and b are relatively prime, then there are infinitely many primes of the form an + b.
There is no known simple proof of this theorem. Dirichlet’s original proof used complex
variables. A very complicated elementary proof was found in 1949 by Atle Selberg
(1917-2007) [15]. We will prove a special case here, that is
Theorem: There are infinitely many primes of the form 4n + 3.
To prove this theorem we will require the following lemma.
Lemma: If a and b are integers of the form 4n + 1, then ab is also of this form.
Proof: If a = 4r + 1, and b = 4s + 1, then ab = 16rs + 4r + 4s + 1 = 4(4rs + r + s) + 1,
which is of the form 4n + 1.
Now we shall prove the desired result.
Proof: Suppose there are only finitely many primes of the form 4n + 3, say
$p_0 = 3, p_1, \ldots, p_k$. Let
$Q = 4 p_1 \cdot p_2 \cdots p_k + 3.$
Then by the fundamental theorem of arithmetic, Q must have a prime factorization. At
least one of the primes in this factorization must be of the form 4n + 3, because Q is odd and the odd
primes must be in either the residue class {1} or {3} mod 4. If none were in {3}, the
lemma above implies Q is also of the form 4n + 1, which would be a contradiction. Now,
none of the $p_0 = 3, p_1, \ldots, p_k$ divides Q. 3 does not divide Q, for if $3 \mid Q$ then
$3 \mid (4 p_1 \cdot p_2 \cdots p_k)$, which is impossible since each $p_i$ with $i \ge 1$ is a prime other than 3. Similarly, none of the $p_i$ with $i \ge 1$ can divide
Q, because if $p_i \mid Q$, then $p_i \mid 3$, which is clearly a contradiction. Thus Q has a prime factor of the form 4n + 3 that is not in our list. Therefore there are
infinitely many primes of the form 4n + 3. It is still an open question whether simple
quadratic forms such as $n^2 + 1$ assume infinitely many prime values.
In the last 50 years or so, several techniques have been developed for generating prime
numbers involving sieving and systems of Diophantine equations. However the formulas are
very complicated and not practical for use [9].
1.3 The Prime Number Theorem

As was shown in 1.2, finding a prime number formula, or even a formula that outputs
infinitely many primes is a very difficult problem. Mathematicians attempted to aim for a
more realistic goal. They began approximating the number of primes up to any given N .
The search for primes has gone on for thousands of years, and extensive tables have been
constructed. With centuries of tables of primes as empirical data, mathematicians had a
lot of evidence to work with. The famous mathematician Karl Gauss once told a friend
that when he had 15 minutes to spare, he would count the number of primes in a chiliad (a
range of 1,000 numbers). By the end of his mathematical career, Gauss had computed all
primes up to 3,000,000 [7]. In 1792, at the age of fifteen, Gauss made his now famous
prime number conjecture, which stated that if π(N) represented the number of primes less
than or equal to N, then $\pi(N) \sim \frac{N}{\log N}$. By ∼ we mean “asymptotic”. That is, the ratio of
π(N) to $\frac{N}{\log N}$ approaches 1 as N grows without bound. Gauss did not prove this, so a
conjecture it remained. The great mathematician Pafnuty Chebyshev (1821-1894) wrote a
proof now known in number theory simply as “Chebyshev’s theorem”. He showed that
there exist constants a and A, 0 < a < 1 < A, such that if N is sufficiently large, we have
$$a \frac{N}{\log N} < \pi(N) < A \frac{N}{\log N}.$$
This theorem was an important step in the direction of proving the prime number
conjecture. We will review “Chebyshev’s theorem” in section 3 of this paper, which will
also provide a direct proof that the series of prime reciprocals is divergent. Chebyshev’s
theorem was still a long way from proving the prime number theorem, which was finally
proven independently by Hadamard and de la Vallée Poussin in 1896 [7]. In making his conjecture,
Gauss actually used a different and more accurate estimation for π(N) known as the
logarithmic integral of N or Li(N). From his table of primes, Gauss noticed that given a
number N, the probability that a number sufficiently close to N is prime is approximately
$\frac{1}{\log N}$. Since the sum π(N) would increase by exactly one if N is prime, Gauss obtained
the estimation
$$\pi(N) \approx \mathrm{Li}(N) = \int_2^N \frac{1}{\log x}\,dx = \frac{N}{\log N} - \frac{2}{\log 2} + \int_2^N \frac{1}{(\log x)^2}\,dx.$$
This integral is much more accurate than $\frac{N}{\log N}$. However, as the prime number theorem
states, they are both asymptotic to π(N). The formula $\frac{N}{\log N}$ is easier to use, and when
dealing with large enough numbers is usually sufficient for the estimation of π(N).
Fig.1.3.1: $\frac{N+1}{\log(N+1)}$ (bottom), $\pi(N+1)$ (middle), $\mathrm{Li}(N+1)$ (top) for $1 \le N \le 1,000$.
Fig.1.3.2: $\frac{N+1}{\log(N+1)}$ (bottom), $\pi(N+1)$ (middle), $\mathrm{Li}(N+1)$ (top) for $1 \le N \le 5,000$.
It is interesting to notice that Li(N) is a much better estimate, and appears to slightly
overestimate π(N). Empirically, this holds for every N that has been checked, and it was at one
time widely conjectured that Li(N) always overestimates π(N). However, John Littlewood
(1885-1977) proved that Li(N) − π(N) changes sign infinitely often.
The best modern estimates of the first such N are of the
order $1.397 \cdot 10^{316}$! [1]
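The comparison between π(N), N/log N and Li(N) can be reproduced with a short computation. The following is a rough sketch, not the method used to produce the figures: `prime_count` is a simple sieve, and `li_from_2` approximates the integral from 2 to N by the composite Simpson's rule, which is an assumption made here for simplicity (both function names are hypothetical).

```python
import math


def prime_count(limit):
    """pi(limit): the number of primes <= limit, via a simple sieve."""
    if limit < 2:
        return 0
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return sum(is_prime)


def li_from_2(x, steps=20000):
    """Approximate the integral of dt/log t from 2 to x (steps must be even)."""
    if x <= 2:
        return 0.0
    h = (x - 2) / steps
    total = 1 / math.log(2) + 1 / math.log(x)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # Simpson weights 4, 2, 4, ..., 4
        total += weight / math.log(2 + i * h)
    return total * h / 3


for n in (100, 1000, 5000):
    print(n, prime_count(n), round(n / math.log(n), 1), round(li_from_2(n), 1))
```

For N = 1,000 this gives π(N) = 168, with N/log N below and Li(N) above, in line with the ordering of the curves in the figures.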
1.4 Euler’s Product
After the fall of Ancient Greece, history saw little advancement in the theory of prime
numbers until the 17th century with significant contributions from Fermat and Mersenne,
and later when the Swiss mathematician Leonhard Euler (1707-1783) published Variae
observationes circa series infinitas in 1744 [16]. Euler introduced the famous product that
now bears his name,
$$\sum_{n=1}^{\infty} \frac{1}{n} = \prod_{p} \left(1 - \frac{1}{p}\right)^{-1}. \tag{1.4.1}$$
Throughout this paper the index p runs over all the primes. Fortunately, there were a few
important results from medieval mathematics, and perhaps the most important result is
due to Nicole Oresme (1323-1382), that the summation on the left hand side of (1.4.1), or
the “harmonic” series, is divergent. Oresme noticed an inequality when he doubled the
index of the partial sums,
$$S_2 = 1 + \frac{1}{2}$$
$$S_4 = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} > 1 + \frac{1}{2} + \underbrace{\frac{1}{4} + \frac{1}{4}}_{\frac{1}{2}}$$
$$S_8 = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} > 1 + \frac{1}{2} + \underbrace{\frac{1}{4} + \frac{1}{4}}_{\frac{1}{2}} + \underbrace{\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}}_{\frac{1}{2}}$$
$$\vdots$$
$$S_{2^n} \ge 1 + \frac{n}{2}$$
and since n was arbitrary, the series diverges. From this known result, (1.4.1) yielded the
first new proof of the infinitude of prime numbers in two millennia. This is due to the
simple fact that for the product on the right hand side to diverge, we must necessarily have
infinitely many indices (primes) in the product.
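Oresme's doubling argument can be checked with exact rational arithmetic. The sketch below is an illustration only; the function name `harmonic` is an assumption, and `fractions.Fraction` from the Python standard library is used so that no rounding enters the comparison with the bound 1 + n/2.

```python
from fractions import Fraction


def harmonic(n):
    """Exact partial sum S_n = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


for n in range(1, 11):
    partial_sum = harmonic(2 ** n)
    bound = 1 + Fraction(n, 2)
    # Each doubling of the index adds at least 1/2 to the partial sum.
    print(n, float(partial_sum), float(bound), partial_sum >= bound)
```

The partial sums grow very slowly, but the lower bound 1 + n/2 shows that they eventually exceed any given number.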
Euler’s product (1.4.1) is actually not very difficult to prove. Euler
started by supposing that the harmonic sum converges to S, that is
$$S = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \cdots$$
and dividing by two
$$\frac{S}{2} = \frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \frac{1}{8} + \frac{1}{10} + \cdots$$
therefore, subtracting,
$$\frac{S}{2} = 1 + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \frac{1}{9} + \cdots \tag{1.4.2}$$
Since all the terms in (1.4.2) have odd denominators, in a similar fashion to the Sieve of
Eratosthenes, we can eliminate all denominators that are multiples of 3 by dividing (1.4.2)
by 3, to get
$$\frac{S}{2 \cdot 3} = \frac{1}{3} + \frac{1}{9} + \frac{1}{15} + \frac{1}{21} + \cdots \tag{1.4.3}$$
then subtracting (1.4.3) from (1.4.2) to obtain
$$\frac{1 \cdot 2}{2 \cdot 3} S = 1 + \frac{1}{5} + \frac{1}{7} + \frac{1}{11} + \frac{1}{13} + \cdots$$
Continuing this process indefinitely, Euler obtained
$$\frac{1 \cdot 2 \cdot 4 \cdots (p - 1) \cdots}{2 \cdot 3 \cdot 5 \cdots p \cdots} S = 1,$$
and cross multiplying gives
$$S = \frac{2 \cdot 3 \cdot 5 \cdots p \cdots}{1 \cdot 2 \cdot 4 \cdots (p - 1) \cdots} = \sum_{n=1}^{\infty} \frac{1}{n} = \prod_{p} \left(1 - \frac{1}{p}\right)^{-1},$$
which is (1.4.1).
Euler generalized (1.4.1) for any real number s > 1 [4]
$$\sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p} \left(1 - \frac{1}{p^s}\right)^{-1}. \tag{1.4.4}$$
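For s = 2, both sides of (1.4.4) converge, so the identity can be tested numerically by truncating the sum and the product. The following is a sketch only; the function names and the truncation points (100,000 terms and primes up to 10,000) are arbitrary assumptions made for illustration.

```python
import math


def primes_up_to(limit):
    """All primes <= limit, via a simple sieve."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def partial_zeta(s, terms):
    """Truncated left hand side of (1.4.4): sum of 1/n^s for n <= terms."""
    return sum(1 / n ** s for n in range(1, terms + 1))


def partial_euler_product(s, prime_limit):
    """Truncated right hand side of (1.4.4): product over primes <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product *= 1 / (1 - p ** (-s))
    return product


print(partial_zeta(2, 100000))
print(partial_euler_product(2, 10000))
print(math.pi ** 2 / 6)  # Euler's value of the sum for s = 2
```

Both truncations agree with each other and with $\pi^2/6$ to about four decimal places.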
This product opened up many doors into exploring the distribution of the prime numbers.
The left hand side is the Riemann zeta-function evaluated at s. When Bernhard Riemann
(1826-1866) began plugging complex numbers in for s in (1.4.4), a whole new mathematical
landscape opened up. Riemann found that information in the complex zeros of the
zeta-function could be used to find the best known formula for counting primes up to N,
and best of all, he found that the zeros corresponded to the exact error
in his formula for the number of primes up to N. He had found it: an exact formula for the
number of primes up to any given N. To complete this program, one needs to prove what
is now known as the “Riemann Hypothesis”, that is, the real part of any non-trivial zero of
the Riemann zeta-function is $\frac{1}{2}$ [17]. The proof of the hypothesis is so challenging that a
correct proof will win a one million dollar reward offered by the Clay Mathematics Institute. This paper will
make no further attempt to describe the very deep topic of the Riemann Hypothesis. We
have noted it here because it is at the crux of the distribution of the primes, and it
illustrates the importance of Euler’s identity, which will come into play in some of the
proofs in this paper.
1.5 The Divergence of the Sum of Prime Reciprocals

The central topic of this paper is the divergence of
$$\sum_{p} \frac{1}{p}. \tag{1.5.1}$$
The divergence of this series is related to the distribution of the primes. While we know
that the harmonic series is divergent, and from the prime number theorem, that primes
become increasingly rare, the divergence of (1.5.1) shows us the primes do not thin out fast
enough for the sum to converge. In particular, since (1.5.1) diverges, the primes are, in the sense
of contributing to a sum, more numerous than the square numbers,
since we know from another well known result of Euler [16] that
$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}.$$
Euler succeeded in showing the divergence of (1.5.1), and many mathematicians have
added more proofs. Each proof is interesting in its own right since each offers a different
strategy and insight to the problem. In this paper we will present many proofs of the
divergence of (1.5.1). We have selected proofs from: Euler, Erdös, Dux, Clarkson, Niven
and Chebyshev. Knowing the divergence of (1.5.1) is important. However, a theorem due
to the Polish mathematician Franz Mertens (1840-1927) not only shows the divergence of
(1.5.1), but explains exactly how (1.5.1) diverges. A section of this paper will be devoted to
proving this theorem. There are actually four results that are collectively referred to as
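Mertens’ theorem: as N → ∞ we have
$$\sum_{n \le N} \frac{\Lambda(n)}{n} = \log N + O(1); \qquad \sum_{p \le N} \frac{\log p}{p} = \log N + O(1), \tag{1.5.2}$$
$$\int_1^N \frac{\psi(t)}{t^2}\,dt = \log N + O(1), \tag{1.5.3}$$
$$\sum_{p \le N} \frac{1}{p} = \log \log N + C + O\left(\frac{1}{\log N}\right), \tag{1.5.4}$$
$$\prod_{p \le N} \left(1 - \frac{1}{p}\right) = \frac{e^{-\gamma}}{\log N}(1 + o(1)), \tag{1.5.5}$$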
where Λ(N) is the “von Mangoldt” function, ψ(N) is one of “Chebyshev’s” functions
(section 3), γ is the Euler-Mascheroni constant, and C is a constant (sections 4 & 5). A
function f(N) is “big O” of g(N), or O(g(N)), if and only if there exist constants K > 0
and $N_0$ such that $|f(N)| \le K|g(N)|$ for all $N \ge N_0$. A function f(N) is “little o” of g(N),
or o(g(N)), if and only if for any K > 0, there exists $N_0$ such that $\left|\frac{f(N)}{g(N)}\right| < K$ for all
$N \ge N_0$. Notice that theorem (1.5.4) shows that for large N, the partial sums of the series of prime reciprocals
grow as log log N plus a constant C, where C ≈ .26 [13]. Theorem (1.5.5) follows from
(1.5.4) and gives an intuitive probability of a random integer being prime, and will be
referred to as the “product form of Mertens’ theorem.” All of the theorems (1.5.2)-(1.5.5)
will be proved in sections 4 & 5 of this paper.
Fig.1.5.1: The stars are the graph of $\log \log(N + 1) + .26$, and the diamonds are the graph of $\sum_{p \le N} \frac{1}{p}$, for $1 \le N \le 20$.
Fig.1.5.2: The stars are the graph of $\log \log(N + 1) + .26$, and the diamonds are the graph of $\sum_{p \le N} \frac{1}{p}$, for $1 \le N \le 100$.
Fig.1.5.1 and Fig.1.5.2 provide evidence that the convergence of these functions
happens rather quickly, with the functions being very close at only N = 100. At N = 1, 000
the graphs are virtually indistinguishable.
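The agreement described above can be reproduced numerically. The sketch below is an illustration, not the computation behind the figures: the function names are hypothetical, and the values 0.2615 for C and 0.5772 for γ are commonly quoted approximations assumed here (the text itself only gives C ≈ .26).

```python
import math

MERTENS_C = 0.2615    # assumed approximate value of the constant C in (1.5.4)
EULER_GAMMA = 0.5772  # assumed approximate value of the Euler-Mascheroni constant


def primes_up_to(limit):
    """All primes <= limit, via a simple sieve."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def prime_reciprocal_sum(limit):
    """Left hand side of (1.5.4): sum of 1/p over primes p <= limit."""
    return sum(1 / p for p in primes_up_to(limit))


def mertens_product(limit):
    """Left hand side of (1.5.5): product of (1 - 1/p) over primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        product *= 1 - 1 / p
    return product


for n in (20, 100, 1000, 10000):
    estimate = math.log(math.log(n)) + MERTENS_C
    ratio = mertens_product(n) * math.log(n) * math.exp(EULER_GAMMA)
    print(n, round(prime_reciprocal_sum(n), 4), round(estimate, 4), round(ratio, 4))
```

The last column is the ratio of the two sides of (1.5.5), which should approach 1 as N grows.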
§2: The Divergence of $\sum_p \frac{1}{p}$
2.1 Euler’s Proof

Euler first proved that $\prod_p \left(1 - \frac{1}{p}\right)^{-1}$ diverges, then deduced from this that $\sum_p \frac{1}{p}$
diverges [1]. This is a direct proof and shows why the series is divergent. Define
$$P(N) = \prod_{p \le N} \left(1 - \frac{1}{p}\right)^{-1}, \qquad S(N) = \sum_{p \le N} \frac{1}{p}, \qquad N \ge 2.$$
Recall the formula for the sum of a finite geometric series
$$1 + x + x^2 + \cdots + x^m = \frac{1 - x^{m+1}}{1 - x} < \frac{1}{1 - x}, \quad \text{for } 0 < x < 1. \tag{2.1.1}$$
Letting $x = \frac{1}{p}$, then 0 < x < 1, and from (2.1.1) we get
$$1 + \frac{1}{p} + \frac{1}{p^2} + \ldots + \frac{1}{p^m} < \frac{1}{1 - \frac{1}{p}} = \left(1 - \frac{1}{p}\right)^{-1}. \tag{2.1.2}$$
Repeatedly applying the multiplication property for inequalities over all primes p ≤ N, we
get
$$\prod_{p \le N} \left(1 + \frac{1}{p} + \frac{1}{p^2} + \ldots + \frac{1}{p^m}\right) < P(N). \tag{2.1.3}$$
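Now each n ≤ N has a prime factorization $n = 2^{\alpha_1} \cdot 3^{\alpha_2} \cdots p_k^{\alpha_k}$ where $p_k \le N$. Therefore,
we must have for each $\alpha_i$, $p_i^{\alpha_i} \le N$, or $\alpha_i \le \frac{\log N}{\log p_i}$, hence if we choose m such that
$2^m \ge N$, then $\alpha_i \le m$ for every i, and the expansion of the product on the left hand side of (2.1.3) will contain at least all terms of
$$\sum_{n=1}^{N} \frac{1}{n},$$
hence
$$\prod_{p \le N} \left(1 + \frac{1}{p} + \frac{1}{p^2} + \cdots + \frac{1}{p^m}\right) \ge \sum_{n=1}^{N} \frac{1}{n}. \tag{2.1.4}$$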
Now, the harmonic series on the right hand side of (2.1.4) is divergent (section 1.4), which
shows P(N) is divergent, but in order to show the divergence of the series, we need to use
the following from integral calculus,
$$\sum_{n=1}^{N} \frac{1}{n} > \int_1^N \frac{dt}{t} = \log N. \tag{2.1.5}$$
From (2.1.3), (2.1.4) and (2.1.5), we obtain $P(N) > \log N$, hence $\prod_p \left(1 - \frac{1}{p}\right)^{-1}$ diverges
as N grows without bound.
To prove the divergence of the series, first consider the Maclaurin series expansion of
$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + \frac{(-1)^{n-1} x^n}{n} + \cdots$. Clearly
$$\log\left(\frac{1}{1 - x}\right) = x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^n}{n} + \cdots \tag{2.1.6}$$
which is convergent by the ratio test for all x satisfying 0 < x < 1. Hence from (2.1.6) we
obtain
$$\log\left(\frac{1}{1 - x}\right) - x = \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^n}{n} + \cdots < \frac{x^2}{2}\left(1 + x + x^2 + \cdots + x^n + \cdots\right), \tag{2.1.7}$$
and since the geometric series on the right hand side is convergent, we have
$$\log\left(\frac{1}{1 - x}\right) - x < \frac{x^2}{2(1 - x)}, \quad 0 < x < 1. \tag{2.1.8}$$
Now setting $x = \frac{1}{p}$ and adding the inequalities over all p ≤ N, we obtain
$$\log P(N) - S(N) < \frac{1}{2} \sum_{p \le N} \frac{1}{p(p - 1)} < \frac{1}{2} \sum_{n=2}^{\infty} \frac{1}{n(n - 1)} = \frac{1}{2}. \tag{2.1.9}$$
The equality on the right hand side follows from the convergent telescoping series. Now
from P(N) > log N and rearranging (2.1.9) we get that
$$\log \log N - \frac{1}{2} < \log P(N) - \frac{1}{2} < S(N),$$
so that $\sum_p \frac{1}{p}$ is divergent.
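The chain of inequalities in Euler's proof can be checked numerically for a few values of N. This is an illustrative sketch only; the function names `euler_product_P` and `reciprocal_sum_S` are assumptions chosen to mirror P(N) and S(N), and a finite check of course proves nothing about divergence.

```python
import math


def primes_up_to(limit):
    """All primes <= limit, via a simple sieve."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def euler_product_P(limit):
    """P(N): product of (1 - 1/p)^(-1) over primes p <= N."""
    product = 1.0
    for p in primes_up_to(limit):
        product *= 1 / (1 - 1 / p)
    return product


def reciprocal_sum_S(limit):
    """S(N): sum of 1/p over primes p <= N."""
    return sum(1 / p for p in primes_up_to(limit))


for n in (10, 100, 1000, 10000):
    P, S = euler_product_P(n), reciprocal_sum_S(n)
    print(
        n,
        P > math.log(n),                       # P(N) > log N
        math.log(P) - S < 0.5,                 # inequality (2.1.9)
        S > math.log(math.log(n)) - 0.5,       # the final lower bound for S(N)
    )
```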
2.2 Erdös’ Proof
Paul Erdös (1913-1996) published the following proof in 1938 [6]. This proof is notable
for its lack of series manipulations. It is the first of four proofs by contradiction that I will
review. For the next four sections, slight variations of the following definitions will be used:
Let a, b be given positive integers. Let
$$P = \{p : a \le p \le b\}, \quad M = \{n \in \mathbb{Z}^+ : p \mid n \Rightarrow p \in P\}, \quad M_n = \{m \in M : m \le n\}.$$
In words, M is the set of integers generated by P under multiplication, and $M_n$ is a
truncation of M up to and including n.
Now suppose that a = 1 and $\sum_p \frac{1}{p}$ is convergent; then there is a large enough b such that
$$\sum_{p > b} \frac{1}{p} < \frac{1}{2}.$$
Let $m \in M_n$. From the prime factorization of m, and the parity of the exponents, we can always
write $m = k^2 r$ where r is square-free. That is,
$$r = \prod_{p \in S} p, \quad \text{where } S \subseteq P.$$
Given any finite set A of n elements, there are $2^n$ possible subsets. This is easily seen by
induction, because adding one more element to the set doubles the number of subsets.
Since P is a finite set of primes, there are $2^{|P|} - 1$ nonempty subsets of P. Including the
possibility that r = 1, there are $2^{|P|}$ possible values for r. Since $m = k^2 r$, we have
$k^2 \le m \le n$, which implies $k \le \sqrt{m} \le \sqrt{n}$, so that there are at most $\sqrt{n}$ possible values of k, and
thus $|M_n| \le 2^{|P|} \sqrt{n}$.
It is clear that for any fixed prime p, there can be no more than $\frac{n}{p}$ positive integers not exceeding n
that are divisible by p. The integers not exceeding n that are not in $M_n$ are exactly those divisible by
some prime p > b, so their number satisfies
$$n - |M_n| \le \sum_{b < p \le n} \frac{n}{p} < \frac{n}{2}.$$
So that
$$\frac{n}{2} < |M_n| \le 2^{|P|} \sqrt{n}, \quad \text{or} \quad \sqrt{n} < 2^{|P|+1},$$
which is clearly a contradiction for large enough n.
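The two counting ingredients of this proof, the decomposition of an integer into a square times a square-free part and the resulting bound on the number of integers built only from primes up to b, can be illustrated as follows. This is a sketch under stated assumptions; the function names are hypothetical, and the brute-force counting is meant only for small n.

```python
import math


def primes_up_to(limit):
    """All primes <= limit, by trial division (fine for small limits)."""
    return [n for n in range(2, limit + 1)
            if all(n % d for d in range(2, math.isqrt(n) + 1))]


def square_free_decomposition(m):
    """Return (k, r) with m == k*k*r and r square-free."""
    k, r, d = 1, 1, 2
    while d * d <= m:
        exponent = 0
        while m % d == 0:
            m //= d
            exponent += 1
        k *= d ** (exponent // 2)   # even part of the exponent goes into k^2
        if exponent % 2:
            r *= d                  # one leftover factor goes into r
        d += 1
    return k, r * m                 # any remaining m is a single prime (or 1)


def smooth_count(n, b):
    """Number of integers in 1..n whose prime factors are all <= b."""
    primes = primes_up_to(b)
    count = 0
    for m in range(1, n + 1):
        for p in primes:
            while m % p == 0:
                m //= p
        count += (m == 1)
    return count


print(square_free_decomposition(72))  # 72 = 6^2 * 2
n, b = 1000, 7
bound = 2 ** len(primes_up_to(b)) * math.sqrt(n)
print(smooth_count(n, b), bound, smooth_count(n, b) <= bound)
```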
2.3 Dux’s Proof
This proof was published in 1956 by Erich Dux [6]. This proof is interesting since it
involves rearranging the harmonic series. Borrowing from Erdös’s proof, the sets used in
this proof are defined as:
$$P = \{p : 1 < p \le b\}, \quad M = \{n : p \mid n \Rightarrow p \in P\}.$$
Assume that $\sum_p \frac{1}{p}$ is convergent; then there must be a b large enough so that
$\sum_{p > b} \frac{1}{p} = A < 1$. Dux defines $M' = \{n' > 1 : p \mid n' \Rightarrow p > b\}$ and $M'' = \widetilde{M} \cap \widetilde{M'}$ (where
$\widetilde{M}$ is the complement of M), that is, $M''$ is all integers that have prime divisors both in P
and greater than b. Since P is a finite set of primes, we have
$$\sum_{n \in M} \frac{1}{n} \le \prod_{p \in P} \left(1 + \frac{1}{p} + \frac{1}{p^2} + \cdots\right) = \prod_{p \in P} \left(1 - \frac{1}{p}\right)^{-1} < \infty,$$
since each term in the sum on the left hand side must appear at least once as a term in
the expansion of the product on the right hand side. This is due to the definition of M as
all integers divisible only by primes not exceeding b. By a similar argument, we find an upper
bound for sums over $M'$, that is
$$\sum_{n' \in M'} \frac{1}{n'} \le \sum_{p > b} \frac{1}{p} + \left(\sum_{p > b} \frac{1}{p}\right)^2 + \cdots = \frac{A}{1 - A} < \infty,$$
by the initial assumption of convergence, and the formula for the sum of an infinite
geometric series. Now, it follows that
$$\sum_{n'' \in M''} \frac{1}{n''} = \sum_{n \in M} \frac{1}{n} \sum_{n' \in M'} \frac{1}{n'} - \sum_{n' \in M'} \frac{1}{n'} < \infty,$$
since $1 \in M$, Dux subtracts off the reciprocals over $M'$ to ensure a sum over integers which are
in $M''$. Since $\mathbb{N} = M \cup M' \cup M''$ and the sets $M$, $M'$, $M''$ are pairwise disjoint, we must have
$$\sum_{n=1}^{\infty} \frac{1}{n} = \sum_{n \in M} \frac{1}{n} + \sum_{n' \in M'} \frac{1}{n'} + \sum_{n'' \in M''} \frac{1}{n''} < \infty,$$
which contradicts the divergence of the harmonic series. Therefore our initial
assumption was false and $\sum_p \frac{1}{p}$ diverges.
2.4 Clarkson’s Proof
James Clarkson published the following proof in 1966 [6]. This proof is similar to the
proof by Erdös, but with an interesting twist. Clarkson’s proof employs the trick that
Euclid used in his proof that there are infinitely many primes. This proof only requires the
set P from the previous two proofs, that is $P = \{p : a \le p \le b\}$. Start by assuming that
$\sum_p \frac{1}{p}$ is convergent. Therefore, there must be a large enough a such that
$$\sum_{p \in P} \frac{1}{p} < \frac{1}{2}$$
for all b. Now define
$$Q = \prod_{p < a} p;$$
then for any fixed r, there is a large enough b such that all primes which divide 1 + iQ for
1 ≤ i ≤ r are in P, since as in Euclid’s proof, p < a implies that $p \nmid 1 + iQ$. Each term
of
$$\sum_{i=1}^{r} \frac{1}{1 + iQ}$$
whose denominator is a product of j (not necessarily distinct) primes occurs
at least once in the expansion of
$$\left(\sum_{p \in P} \frac{1}{p}\right)^j < 2^{-j},$$
by assumption and repeated application of the multiplication property of inequalities.
Summing over all j ≥ 1, we get
$$\sum_{i=1}^{r} \frac{1}{1 + iQ} < \sum_{j \ge 1} 2^{-j} = 1,$$
since the right hand side is the geometric series with first term $2^{-1}$. On the other hand,
since $\frac{1}{1 + iQ} > \frac{1}{2Qi}$,
$$\sum_{i=1}^{r} \frac{1}{1 + iQ} > \sum_{i=1}^{r} \frac{1}{2iQ} = \frac{1}{2Q} \sum_{i=1}^{r} \frac{1}{i};$$
because r was arbitrary, the sum on the left grows without bound as a fraction of the harmonic series, a
contradiction to the upper bound of 1 for large enough r.
2.5 Niven’s Proof
The following is a proof published by Ivan Niven in 1971 [14]. This proof is very short
and employs the series expansion for $e^x$. As we saw in Erdös’ proof, every positive integer
can be expressed as the product of a square-free integer r and a square $k^2$. Let n be any positive integer
and S = {r < n : r is square-free}. Then we have
$$\left(\sum_{r \in S} \frac{1}{r}\right) \left(\sum_{j < n} \frac{1}{j^2}\right) \ge \sum_{q < n} \frac{1}{q}.$$
The inequality follows from the fact that every term on the right hand side will appear at
least once as a term in the expansion of the product on the left hand side. Now the second
sum on the left is a partial sum of a convergent p-series, and is therefore bounded, but since the right hand side is the
unbounded harmonic series as n → ∞, the first sum over square-free integers must be
unbounded. Now suppose that $\sum_p \frac{1}{p}$ converges to β. From the Maclaurin series expansion
$e^x = 1 + x + \frac{x^2}{2} + \ldots$, we obtain $e^x > 1 + x$ for x > 0. Now the last chain of inequalities
follows
$$e^{\beta} > e^{\sum_{p < n} \frac{1}{p}} = \prod_{p < n} e^{\frac{1}{p}} > \prod_{p < n} \left(1 + \frac{1}{p}\right) \ge \sum_{m \in S} \frac{1}{m},$$
which contradicts the unboundedness of the series over square-free integers, hence $\sum_p \frac{1}{p}$
diverges.
§3: Chebyshev’s Theorem
As was stated in the introduction, Chebyshev made an important step toward proving the prime number
theorem, although a modest one relative to the eventual proof. He found upper and lower bounds
of the correct order of magnitude for the prime counting function $\pi(N)$ in terms of $\frac{N}{\log N}$;
that is, he proved the following theorem [4].
3.1 Theorem (Chebyshev)

There exist constants $a$ and $A$, $0 < a < 1 < A$, such that if $N$ is sufficiently large, we
have
$$a\,\frac{N}{\log N} < \pi(N) < A\,\frac{N}{\log N}. \tag{3.1.1}$$
It also follows from this theorem that there are infinitely many primes, and the sum of
reciprocals of primes is divergent. In order to prove (3.1.1), we will require some definitions
and lemmas.
Chebyshev defined the following useful functions:
$$\theta(N) = \sum_{p \le N} \log p, \qquad N > 0,\ p \text{ a prime},$$
and
$$\psi(N) = \sum_{\substack{m \ge 1,\ p \\ p^m \le N}} \log p.$$
Here, if $m$ is the largest integer such that $p^m \le N$, then $\log p$ occurs exactly $m$ times in the
sum. For example, $\psi(9) = 3 \log 2 + 2 \log 3 + \log 5 + \log 7$. It can be easily seen that $e^{\theta(N)}$ is
the product of all primes less than or equal to $N$, and $e^{\psi(N)}$ is the least common multiple of
all positive integers $\le N$.
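The two closing remarks about $e^{\theta(N)}$ and $e^{\psi(N)}$ can be checked in exact integer arithmetic. The following is only an illustrative sketch: the helper names `primes_up_to`, `theta_exp`, `psi_exp` and `lcm_up_to` are assumptions of this example and do not come from the paper.

```python
import math

def primes_up_to(n):
    """Return the list of primes <= n using the Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]

def theta_exp(n):
    """e^theta(n): the product of all primes <= n, as an exact integer."""
    return math.prod(primes_up_to(n))

def psi_exp(n):
    """e^psi(n): each prime p <= n raised to the largest m with p^m <= n."""
    result = 1
    for p in primes_up_to(n):
        power = p
        while power * p <= n:
            power *= p
        result *= power
    return result

def lcm_up_to(n):
    """Least common multiple of 1, 2, ..., n."""
    value = 1
    for k in range(2, n + 1):
        value = value * k // math.gcd(value, k)
    return value
```

For instance, `psi_exp(9)` returns $2^3 \cdot 3^2 \cdot 5 \cdot 7$, matching the example $\psi(9) = 3 \log 2 + 2 \log 3 + \log 5 + \log 7$, and it agrees with `lcm_up_to(9)`.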
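Let $m$ be the largest integer such that $2^m \le N$, that is, $m = \big\lfloor \frac{\log N}{\log 2} \big\rfloor$. For a prime
$p > 2$, if $p^k \le N$ then $k \le \frac{\log N}{\log p} < \frac{\log N}{\log 2}$, so $k \le m$; thus no prime power $p^k$ with
$k > m$ is at most $N$. Now let $p_n$ be the $n$th prime, with $p_n \le N$, and let $m - i_n$ (where $0 \le i_n < m$) be the largest
integer such that $p_n^{m - i_n} \le N$, or equivalently $p_n \le N^{\frac{1}{m - i_n}}$.
It follows that in the formula for $\psi(N)$, $\log p_n$ occurs exactly $m - i_n$ times. On the
other hand, $\log p_n$ occurs in $\theta(N^{\frac{1}{k}})$ exactly when $p_n \le N^{\frac{1}{k}}$, that is, when $k \le m - i_n$, so $\log p_n$ also occurs exactly $m - i_n$ times in the sum
$$\theta(N) + \theta(N^{\frac{1}{2}}) + \theta(N^{\frac{1}{3}}) + \cdots + \theta(N^{\frac{1}{m - i_n}}) + \cdots + \theta(N^{\frac{1}{m}}),$$
hence
$$\psi(N) = \theta(N) + \theta(N^{\frac{1}{2}}) + \theta(N^{\frac{1}{3}}) + \cdots + \theta(N^{\frac{1}{m}}). \tag{3.1.2}$$
This sum terminates since $\theta(N) = 0$ for $N < 2$, and specifically, $N^{\frac{1}{k}} < 2$ when $k > \frac{\log N}{\log 2}$.
If $p^m \le N < p^{m+1}$, $N \ge 1$, then $m \log p \le \log N < (m + 1) \log p$, and $m \le \frac{\log N}{\log p} < m + 1$,
hence $\log p$ occurs exactly $m$ times in $\psi(N)$, and $m = \big\lfloor \frac{\log N}{\log p} \big\rfloor$. Thus there is another way
to express $\psi(N)$, that is
$$\psi(N) = \sum_{p \le N} \Big\lfloor \frac{\log N}{\log p} \Big\rfloor \cdot \log p. \tag{3.1.3}$$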
It turns out that Chebyshev’s functions are closely related to π(N ), as seen in the following
theorem, which is crucial in the proof of (3.1.1).
3.2 The Chebyshev Function Theorem
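Let
$$l_1 = \liminf_{N \to \infty} \frac{\pi(N)}{N / \log N}, \qquad L_1 = \limsup_{N \to \infty} \frac{\pi(N)}{N / \log N},$$
$$l_2 = \liminf_{N \to \infty} \frac{\theta(N)}{N}, \qquad L_2 = \limsup_{N \to \infty} \frac{\theta(N)}{N},$$
$$l_3 = \liminf_{N \to \infty} \frac{\psi(N)}{N}, \qquad L_3 = \limsup_{N \to \infty} \frac{\psi(N)}{N}.$$
Then $l_1 = l_2 = l_3$, and $L_1 = L_2 = L_3$.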
Proof. From (3.1.2), we get that $\theta(N) \le \psi(N)$, and from (3.1.3) that
$$\psi(N) \le \sum_{p \le N} \frac{\log N}{\log p} \cdot \log p = \log N \sum_{p \le N} 1,$$
that is, $\psi(N) \le \pi(N) \log N$; therefore $\theta(N) \le \psi(N) \le \pi(N) \log N$. Dividing through by $N$
and taking the limit superior, we obtain
$$L_2 \le L_3 \le L_1. \tag{3.2.1}$$
Next, choose a constant real number $\alpha$, such that $0 < \alpha < 1$. Let $N > 1$, then
$$\theta(N) \ge \sum_{N^{\alpha} < p \le N} \log p,$$
and since $\log p > \log N^{\alpha} = \alpha \log N$ for these primes, we have
$$\theta(N) \ge \alpha \log N \sum_{N^{\alpha} < p \le N} 1,$$
which implies that $\theta(N) \ge \alpha \log N\,(\pi(N) - \pi(N^{\alpha}))$. Since there can be no more primes
than there are integers, we have $\pi(N^{\alpha}) < N^{\alpha}$, so that $\theta(N) > \alpha \pi(N) \log N - \alpha N^{\alpha} \log N$.
Dividing through by $N$, we get
$$\frac{\theta(N)}{N} > \alpha\,\frac{\pi(N) \log N}{N} - \alpha\,\frac{\log N}{N^{1-\alpha}}.$$
Now since $0 < \alpha < 1$, it follows that $\frac{\log N}{N^{1-\alpha}} \to 0$ as $N \to \infty$. So it follows that $L_2 \ge \alpha L_1$
for every real number $\alpha$ such that $0 < \alpha < 1$. Letting $\alpha \to 1$ gives $L_2 \ge L_1$, and combining this with
(3.2.1) we can conclude that $L_1 = L_2 = L_3$.
The proof that l1 = l2 = l3 is similar. From the same inequality by which we obtained
(3.2.1), we must have l2 ≤ l3 ≤ l1 . By the same steps as above, we obtain l2 ≥ α l1 ,
therefore l2 ≥ l1 and thus l1 = l2 = l3 .
This theorem states that if any one of the three functions
$$\frac{\pi(N)}{N / \log N}, \qquad \frac{\theta(N)}{N}, \qquad \frac{\psi(N)}{N}$$
has a limit as N tends to infinity, then so do the others, and all the limits are the same. In
other words, the three functions are asymptotic to each other. As was stated in the
introduction, the existence of the limit of any one of these functions is the real difficulty in
proving the prime number theorem.
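To see the three quotients of Theorem 3.2 side by side, one can tabulate them numerically. This is a minimal sketch under the assumption that floating-point sums of logarithms are accurate enough for illustration; the function name `chebyshev_ratios` is hypothetical.

```python
import math

def chebyshev_ratios(n):
    """Return (pi(n) log n / n, theta(n) / n, psi(n) / n) for an integer n >= 2."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    log_n = math.log(n)
    count, theta, psi = 0, 0.0, 0.0
    for p in range(2, n + 1):
        if sieve[p]:
            log_p = math.log(p)
            count += 1
            theta += log_p
            # largest m with p^m <= n, found with exact integer powers as in (3.1.3)
            m, power = 1, p
            while power * p <= n:
                power *= p
                m += 1
            psi += m * log_p
    return count * log_n / n, theta / n, psi / n
```

Calling `chebyshev_ratios(10**5)` returns three values that respect the ordering $\theta(N) \le \psi(N) \le \pi(N) \log N$ from the proof and are all fairly close to 1, which is consistent with, but of course does not prove, the existence of a common limit.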
3.3 Proof of Chebyshev’s Theorem

We are now ready to prove (3.1.1). If we let
$$l = \liminf_{N \to \infty} \frac{\pi(N)}{N / \log N}, \quad \text{and} \quad L = \limsup_{N \to \infty} \frac{\pi(N)}{N / \log N},$$
then we will prove the theorem by showing $L \le 4 \log 2$ and $l \ge \log 2$. By Theorem 3.2,
these inequalities can be exchanged for the following:
$$L = \limsup_{N \to \infty} \frac{\theta(N)}{N} \le 4 \log 2, \tag{3.3.1}$$
$$l = \liminf_{N \to \infty} \frac{\psi(N)}{N} \ge \log 2. \tag{3.3.2}$$
We will start by proving (3.3.1). We consider the binomial coefficient
$$M = \binom{2m}{m} = \frac{(2m)!}{(m!)^2} = \frac{(m + 1)(m + 2) \cdots (2m)}{1 \cdot 2 \cdots m}.$$
In order to understand the properties of $M$, it is useful to find the prime factorization of
$M$. We can find this factorization by the formula for the prime factorization of $N!$ for any
positive integer $N > 1$. To derive this formula, we first consider how many times a given
prime $p$ divides $N!$. The formula for this number is easily motivated by considering a
special case: take $N = 5$, then $5! = 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 2^3 \cdot 3 \cdot 5 = 120$. Note that
the exponent of 2 is determined by the number of multiples of 2 among the factors, that is $\lfloor \frac{5}{2} \rfloor = 2$, added to
the number of multiples of its square, 4, among the factors, $\lfloor \frac{5}{4} \rfloor = 1$. It can be shown by induction
on $N$ that in general, the number of times that a prime $p$ divides $N!$ is
$$\Big\lfloor \frac{N}{p} \Big\rfloor + \Big\lfloor \frac{N}{p^2} \Big\rfloor + \cdots + \Big\lfloor \frac{N}{p^s} \Big\rfloor = \sum_{k \ge 1} \Big\lfloor \frac{N}{p^k} \Big\rfloor, \tag{3.3.3}$$
where $s$ is the largest integer for which $p^s \le N$. This series terminates since once $p^k > N$,
$\big\lfloor \frac{N}{p^k} \big\rfloor = 0$. Therefore, the prime factorization of $N!$ is
$$N! = \prod_{p \le N} p^{\lfloor \frac{N}{p} \rfloor + \lfloor \frac{N}{p^2} \rfloor + \cdots + \lfloor \frac{N}{p^s} \rfloor}. \tag{3.3.4}$$
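Formulas (3.3.3) and (3.3.4) are easy to test against a direct computation of $N!$. The sketch below is illustrative only; the function names are assumptions of this example, and primality is decided by simple trial division.

```python
import math

def legendre_exponent(n, p):
    """Exponent of the prime p in n!: the sum over k >= 1 of floor(n / p^k), as in (3.3.3)."""
    exponent, power = 0, p
    while power <= n:
        exponent += n // power
        power *= p
    return exponent

def factorial_from_primes(n):
    """Rebuild n! from its prime factorization (3.3.4)."""
    product = 1
    for p in range(2, n + 1):
        # trial-division primality test, adequate for small n
        if all(p % d for d in range(2, math.isqrt(p) + 1)):
            product *= p ** legendre_exponent(n, p)
    return product
```

With $N = 5$ and $p = 2$ this reproduces the special case in the text: `legendre_exponent(5, 2)` is $2 + 1 = 3$, and `factorial_from_primes(5)` is 120.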
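We can use (3.3.4) to write
$$M = \binom{2m}{m} = \frac{(2m)!}{(m!)^2} = \prod_{p \le 2m} p^{\left(\lfloor \frac{2m}{p} \rfloor - 2 \lfloor \frac{m}{p} \rfloor\right) + \left(\lfloor \frac{2m}{p^2} \rfloor - 2 \lfloor \frac{m}{p^2} \rfloor\right) + \cdots + \left(\lfloor \frac{2m}{p^s} \rfloor - 2 \lfloor \frac{m}{p^s} \rfloor\right)} \tag{3.3.5}$$
where $s$ is the largest integer for which $p^s \le 2m$. Let $r \le s$ be the largest integer for which
$p^r \le m$, so that
$$2 \Big\lfloor \frac{m}{p^{r+1}} \Big\rfloor + \cdots + 2 \Big\lfloor \frac{m}{p^s} \Big\rfloor = 0.$$
In the prime factorization of $M$ above, we see the expression $\lfloor \frac{2m}{p^k} \rfloor - 2 \lfloor \frac{m}{p^k} \rfloor$, $1 \le k \le s$,
occur many times as a term in the expression for the exponent. We can determine what
the values of this expression may be. We drop the floor notation and write $\lfloor \frac{m}{p^k} \rfloor = \frac{m}{p^k} - \epsilon$
for some $0 \le \epsilon < 1$, and $\lfloor \frac{2m}{p^k} \rfloor = \frac{2m}{p^k} - \delta$ for some $0 \le \delta < 1$. Then $2 \lfloor \frac{m}{p^k} \rfloor = \frac{2m}{p^k} - 2\epsilon$ and we
get $\lfloor \frac{2m}{p^k} \rfloor - 2 \lfloor \frac{m}{p^k} \rfloor = 2\epsilon - \delta$, thus
$$-1 < \Big\lfloor \frac{2m}{p^k} \Big\rfloor - 2 \Big\lfloor \frac{m}{p^k} \Big\rfloor < 2, \quad \text{hence} \quad \Big\lfloor \frac{2m}{p^k} \Big\rfloor - 2 \Big\lfloor \frac{m}{p^k} \Big\rfloor = 0 \text{ or } 1. \tag{3.3.6}$$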
From (3.3.5), $M$ is always an even integer, because if $r$ is the largest integer such that
$2^r \le m$, then $m < 2^{r+1} \le 2m$, and $2m < 2^{r+2} \le 4m$, thus $s = r + 1$ and
$$\Big\lfloor \frac{2m}{2^{r+1}} \Big\rfloor - 2 \Big\lfloor \frac{m}{2^{r+1}} \Big\rfloor = 1.$$
Therefore 2 is always a factor of $M$, and hence $M$ is always even.

Since $2m$ is even, $M$ is the unique middle term in the binomial expansion
$$(1 + 1)^{2m} = 1 + 2m + \binom{2m}{2} + \cdots + \binom{2m}{m} + \cdots + \binom{2m}{2m - 2} + 2m + 1.$$
$M$ is also the largest term in the expansion. Since $(1 + 1)^{2m} = 2^{2m}$, combining this with
the fact that there are $2m + 1$ terms in the expansion of $(1 + 1)^{2m}$, we obtain
$$M < 2^{2m}, \quad \text{and} \quad 2^{2m} < (2m + 1) M. \tag{3.3.7}$$
From the quotient in (3.3.5) we see that the denominator $(m!)^2$ of $M$ is not divisible by any
prime $p$ with $m < p \le 2m$, while the numerator $(2m)!$ is divisible by each such prime. Hence every
prime $m < p \le 2m$ divides $M$, so that $M \ge \prod_{m < p \le 2m} p$, thus
$$\log M \ge \sum_{m < p \le 2m} \log p = \theta(2m) - \theta(m).$$
Now from (3.3.7) we obtain $\log M < 2m \log 2$, hence
$$\theta(2m) - \theta(m) < 2m \log 2. \tag{3.3.8}$$
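The facts about $M = \binom{2m}{m}$ used so far, namely (3.3.6), (3.3.7) and the divisibility behind (3.3.8), can be verified for small $m$ in exact integer arithmetic. The following is a hedged sketch; the function name `check_central_binomial` and its return value are choices made for this illustration.

```python
import math

def check_central_binomial(m):
    """Check (3.3.6), (3.3.7) and the divisibility behind (3.3.8) for one m >= 1."""
    M = math.comb(2 * m, m)
    primes = [p for p in range(2, 2 * m + 1)
              if all(p % d for d in range(2, math.isqrt(p) + 1))]
    # (3.3.6): each term floor(2m/p^k) - 2*floor(m/p^k) is 0 or 1
    for p in primes:
        power = p
        while power <= 2 * m:
            assert (2 * m) // power - 2 * (m // power) in (0, 1)
            power *= p
    # (3.3.7): M < 2^(2m) < (2m + 1) * M, in exact integers
    assert M < 4 ** m < (2 * m + 1) * M
    # every prime in (m, 2m] divides M, so their product is at most M
    upper_primes = math.prod(p for p in primes if p > m)
    assert M % upper_primes == 0
    return M, upper_primes
```

For $m = 5$ the function returns $(252, 7)$: the only prime in $(5, 10]$ is 7, and $252 = 7 \cdot 36$.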
If we set $m = 1, 2, 2^2, \ldots, 2^{k-1}$ in (3.3.8), we obtain a list of inequalities
$$\theta(2) - \theta(1) < 2 \log 2$$
$$\theta(4) - \theta(2) < 4 \log 2$$
$$\theta(8) - \theta(4) < 8 \log 2$$
$$\vdots$$
$$\theta(2^k) - \theta(2^{k-1}) < 2^k \log 2.$$
Adding these inequalities together, and using
$$\sum_{r=1}^{k} 2^r = 2^{k+1} - 2 < 2^{k+1},$$
we get
$$\theta(2^k) - \theta(1) < \log 2 \sum_{r=1}^{k} 2^r < 2^{k+1} \log 2.$$
Now, since $\theta(1) = 0$, this can be rewritten as
$$\theta(2^k) < 2^{k+1} \log 2. \tag{3.3.9}$$
Now let $N > 1$, and let $k$ be the positive integer such that $2^{k-1} \le N < 2^k$. The
function $\theta$ is non-decreasing, so (3.3.9) gives us
$$\theta(N) \le \theta(2^k) < 2^{k+1} \log 2 \le 4N \log 2.$$
Hence $\frac{\theta(N)}{N} < 4 \log 2$, which implies that
$$L = \limsup_{N \to \infty} \frac{\theta(N)}{N} \le 4 \log 2,$$
thus by Theorem 3.2, $L = \limsup_{N \to \infty} \frac{\pi(N) \log N}{N} \le 4 \log 2$, which proves (3.3.1).
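We now begin to prove (3.3.2), the second part of Chebyshev’s theorem. To prove
(3.3.2) we need to consider the binomial coefficient from (3.3.5) and its prime factorization,
that is
$$M = \binom{2m}{m} = \frac{(2m)!}{(m!)^2} = \prod_{p \le 2m} p^{\left(\lfloor \frac{2m}{p} \rfloor - 2 \lfloor \frac{m}{p} \rfloor\right) + \left(\lfloor \frac{2m}{p^2} \rfloor - 2 \lfloor \frac{m}{p^2} \rfloor\right) + \cdots + \left(\lfloor \frac{2m}{p^s} \rfloor - 2 \lfloor \frac{m}{p^s} \rfloor\right)}.$$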
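This product shows that $M$ is divisible by $p$ exactly $v_p$ times, where $v_p$ can be written
$$v_p = \sum_{k \ge 1} \Big( \Big\lfloor \frac{2m}{p^k} \Big\rfloor - 2 \Big\lfloor \frac{m}{p^k} \Big\rfloor \Big).$$
Therefore
$$M = \prod_{p \le 2m} p^{v_p}.$$
Now since $\lfloor \frac{2m}{p^k} \rfloor = \lfloor \frac{m}{p^k} \rfloor = 0$ when $p^k > 2m$, that is when $k > \frac{\log 2m}{\log p}$, we have
$$v_p = \sum_{k=1}^{M_p} \Big( \Big\lfloor \frac{2m}{p^k} \Big\rfloor - 2 \Big\lfloor \frac{m}{p^k} \Big\rfloor \Big), \qquad M_p = \Big\lfloor \frac{\log 2m}{\log p} \Big\rfloor. \tag{3.3.10}$$
Combining this result with (3.3.6) we get that $v_p \le M_p$, hence
$$M = \prod_{p \le 2m} p^{v_p} \le \prod_{p \le 2m} p^{M_p}. \tag{3.3.11}$$
Now we also have from (3.1.3) and (3.3.10) that
$$\psi(2m) = \sum_{p \le 2m} \Big\lfloor \frac{\log 2m}{\log p} \Big\rfloor \cdot \log p = \sum_{p \le 2m} M_p \log p,$$
so it follows that
$$e^{\psi(2m)} = \prod_{p \le 2m} p^{M_p},$$
and by (3.3.11), $\log M \le \psi(2m)$. It follows from (3.3.7) that
$\log M > 2m \log 2 - \log(2m + 1)$, therefore for every positive integer $m$, we have
$$\psi(2m) > 2m \log 2 - \log(2m + 1). \tag{3.3.12}$$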

To finish the proof, let $N$ be a positive integer, $N > 2$, and let $m = \lfloor \frac{N}{2} \rfloor \ge 1$; then
$m = \frac{N}{2} - \epsilon$, for $0 \le \epsilon < 1$. Thus $m > \frac{N}{2} - 1$, and $2m + 2\epsilon = N$, hence $2m \le N$. From
(3.3.12), we get
$$\psi(N) \ge \psi(2m) > (N - 2) \log 2 - \log(N + 1),$$
and dividing through by $N$,
$$\frac{\psi(N)}{N} > \frac{N - 2}{N} \log 2 - \frac{\log(N + 1)}{N},$$
hence
$$l = \liminf_{N \to \infty} \frac{\psi(N)}{N} \ge \log 2,$$
which combined with Theorem 3.2, proves Chebyshev’s theorem.
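The two explicit inequalities obtained in this proof, $\theta(N) < 4N \log 2$ and $\psi(N) > (N - 2) \log 2 - \log(N + 1)$, can be spot-checked over a range of $N$. The sketch below is illustrative; `theta_psi_table` is a hypothetical helper, and the sums are accumulated in floating point, which is ample here because the inequalities hold with a wide margin.

```python
import math

def theta_psi_table(limit):
    """Return lists theta[n] and psi[n] for 0 <= n <= limit (floating point)."""
    lam = [0.0] * (limit + 1)         # lam[n] = log p if n is a power of the prime p, else 0
    theta_step = [0.0] * (limit + 1)  # theta_step[n] = log n if n is prime, else 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(2 * p, limit + 1, p):
                is_prime[multiple] = False
            theta_step[p] = math.log(p)
            power = p
            while power <= limit:
                lam[power] = math.log(p)
                power *= p
    theta = [0.0] * (limit + 1)
    psi = [0.0] * (limit + 1)
    for n in range(1, limit + 1):
        theta[n] = theta[n - 1] + theta_step[n]
        psi[n] = psi[n - 1] + lam[n]
    return theta, psi
```

A table up to, say, 20000 can then be scanned to confirm both bounds for every $N$ in the range; such a scan illustrates the theorem but is no substitute for the proof.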
It follows from Chebyshev’s theorem that $\pi(N)$ diverges, which provides another proof
that $\sum_p \frac{1}{p}$, taken over all primes $p$, diverges. The divergence of $\pi(N)$ follows since there exists a
constant $0 < a < 1$ such that for sufficiently large $N$, $\pi(N) > a \frac{N}{\log N}$. For the divergence of the sum of
prime reciprocals, let $p_n$ be the $n$th prime, so that
$\pi(p_n) = n$, and recall that from Chebyshev’s theorem we have
$$\pi(N) > a\,\frac{N}{\log N}$$
for sufficiently large $N$. Temporarily assuming $n$ to be a real variable, and differentiating
$f(n) = \log n - \sqrt{n}$, we get $f'(n) = \frac{2 - \sqrt{n}}{2n} < 0$ for $n > 4$, and $f(4) = 2 \log 2 - 2 < 0$. Therefore
$\log n < \sqrt{n}$, or $\sqrt{n} \log n < n$, and hence $\frac{n}{\log n} > \sqrt{n}$ for all $n \ge 4$. Moreover $\frac{\sqrt{N}}{\log N} \to \infty$
as $N \to \infty$, so $a \frac{N}{\log N} > \sqrt{N}$ once $N$ is large enough. It follows that
$$n = \pi(p_n) > a\,\frac{p_n}{\log p_n} > \sqrt{p_n}$$
if $n$ is sufficiently large. Therefore $\log n > \frac{1}{2} \log p_n$, hence $\log p_n < 2 \log n$, and it follows that
$$a\,p_n < n \log p_n < 2n \log n,$$
or
$$\frac{1}{p_n} > \frac{a}{n \log p_n} > \frac{a}{2n \log n}$$
for large enough $n$. Thus the series
$$\sum_{n=1}^{\infty} \frac{1}{p_n}$$
diverges, by comparison with the divergent series
$$\sum_{n=2}^{\infty} \frac{1}{n \log n}.$$
3.4 Another Small Step Towards the Prime Number Theorem

In section 3.3 we showed that for large $N$, $\pi(N)$ does not stray too far from $\frac{N}{\log N}$ in an
asymptotic sense, that is, for sufficiently large $N$,
$$a < \frac{\pi(N)}{N / \log N} < A.$$
In this section we will do even better: we will show that if $\frac{\pi(N)}{N / \log N}$ tends to a limit as
$N \to \infty$, then this limit is 1 [10], from which the prime number theorem would follow. This
result will be readily shown by applying the asymptotic relations from 3.2 and the
following result, which will be proven in 4.3, that is
$$\int_1^N \frac{\psi(t)}{t^2}\,dt = \log N + O(1). \tag{3.4.1}$$
From (3.4.1) we will deduce that
$$\liminf_{N \to \infty} \frac{\psi(N)}{N} \le 1, \qquad \limsup_{N \to \infty} \frac{\psi(N)}{N} \ge 1.$$
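This proof is by contradiction. If we assume that $\liminf_{N \to \infty} \frac{\psi(N)}{N} = 1 + \alpha$, for some $\alpha > 0$,
then we have $\psi(N) > \big(1 + \frac{\alpha}{2}\big) N$ for all $N$ greater than some $N_0$. Therefore
$$\int_1^N \frac{\psi(t)}{t^2}\,dt > \int_1^{N_0} \frac{\psi(t)}{t^2}\,dt + \int_{N_0}^N \frac{1 + \frac{\alpha}{2}}{t}\,dt = \Big(1 + \frac{\alpha}{2}\Big) \log N + O(1),$$
which is a contradiction to (3.4.1), so that $\liminf_{N \to \infty} \frac{\psi(N)}{N} \le 1$.
Now suppose that $\limsup_{N \to \infty} \frac{\psi(N)}{N} = 1 - \alpha$, $\alpha > 0$. Then $\psi(N) < \big(1 - \frac{\alpha}{2}\big) N$ for all $N$ greater than
some $N_0$. Therefore
$$\int_1^N \frac{\psi(t)}{t^2}\,dt < \int_1^{N_0} \frac{\psi(t)}{t^2}\,dt + \int_{N_0}^N \frac{1 - \frac{\alpha}{2}}{t}\,dt = \Big(1 - \frac{\alpha}{2}\Big) \log N + O(1),$$
a contradiction to (3.4.1), thus $\limsup_{N \to \infty} \frac{\psi(N)}{N} \ge 1$. Therefore, if $\lim_{N \to \infty} \frac{\psi(N)}{N}$ exists then it is
equal to 1. Now applying Theorem 3.2 we see that if the limit exists, then
$$\lim_{N \to \infty} \frac{\pi(N)}{N / \log N} = 1.$$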
Therefore, all that remains in proving the prime number theorem, that is, in showing that
$\pi(N) \sim \frac{N}{\log N}$, is the existence of this limit. As stated in the introduction, this is quite
difficult, and was not proven until independent proofs were given by Hadamard and
de la Vallée Poussin in 1896.
§4: Mertens’ Theorem
4.1 Introduction
Franz Mertens (1840-1927) was a Polish mathematician. Mertens studied under

Kronecker and Kummer at the University of Berlin where he received his doctorate in
1865. In 1874, Mertens proved three famous results on the distribution of primes now
collectively referred to as “Mertens’ Theorem”. Before we state this theorem we will need a
few definitions. Hans von Mangoldt (1854-1925) was a German mathematician who made
contributions to the proof of the prime number theorem. He defined the following function now
known as the “von Mangoldt” function,
$$\Lambda(n) = \begin{cases} \log p, & \text{if } n = p^m,\ m \text{ a positive integer} \\ 0, & \text{otherwise.} \end{cases}$$
Notice how this function is related to the Chebyshev function,
$$\psi(N) = \sum_{\substack{m \ge 1,\ p \\ p^m \le N}} \log p = \sum_{n \le N} \Lambda(n).$$
With these definitions, we can now state [4]:

Mertens’ Theorem
As $N \to \infty$ we have
$$\sum_{n \le N} \frac{\Lambda(n)}{n} = \log N + O(1); \qquad \sum_{p \le N} \frac{\log p}{p} = \log N + O(1), \tag{4.1.1}$$
$$\int_1^N \frac{\psi(t)}{t^2}\,dt = \log N + O(1), \tag{4.1.2}$$
$$\sum_{p \le N} \frac{1}{p} = \log \log N + C + O\Big(\frac{1}{\log N}\Big), \tag{4.1.3}$$
where $C$ is a constant. The proof of this theorem will require several lemmas, which will be
the content of the next section.
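Before the proof, the second formula in (4.1.1) and formula (4.1.3) can be illustrated numerically: both differences computed below should stay nearly constant as $N$ grows. This is a rough sketch in floating point, and the name `mertens_sums` is an assumption of this example.

```python
import math

def mertens_sums(n):
    """Return (sum_{p<=n} log(p)/p - log n, sum_{p<=n} 1/p - log log n) for an integer n >= 3."""
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    log_sum = math.fsum(math.log(p) / p for p in range(2, n + 1) if is_prime[p])
    recip_sum = math.fsum(1.0 / p for p in range(2, n + 1) if is_prime[p])
    return log_sum - math.log(n), recip_sum - math.log(math.log(n))
```

Comparing `mertens_sums(10**4)` with `mertens_sums(10**5)` shows both components changing only slightly, in line with the $O(1)$ and $C + O(1/\log N)$ terms of the theorem.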
4.2 Preliminary Lemmas
We will begin with a weak form of Stirling’s formula, that is, as $m \to \infty$
$$\log m! = m \log m + O(m). \tag{4.2.1}$$
The strong version of Stirling’s formula provides a better approximation to $m!$, but this is
more than is needed in the proof of Mertens’ theorem. For a complete proof of Stirling’s
formula see [20].
Proof:
To prove this result, we will show that $0 < m \log m - \log m! < m$ for all $m \ge 2$. We do not
need absolute values because this difference is always positive for $m \ge 2$, since
$$\underbrace{\log m + \log m + \cdots + \log m}_{m \text{ terms}} > \log m + \log(m - 1) + \cdots + \log 2 + \log 1.$$
Since $\log t$ is increasing, $\int_{m-1}^{m} \log t\,dt < \log m$, hence $\int_1^m \log t\,dt < \log 2 + \cdots + \log m = \log m!$. Using
integration by parts, one finds the antiderivative of $\log t$ to be $t \log t - t$. Using the
fundamental theorem of calculus, we get $\int_1^m \log t\,dt = m \log m - m + 1$, so that
$m \log m - \log m! < m - 1 < m$ for all $m \ge 2$. Therefore, $\log m! = m \log m + O(m)$, which is (4.2.1).
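The bound $0 < m \log m - \log m! < m - 1$ from this proof can be checked directly. The sketch below assumes that the standard library function `math.lgamma`, which returns $\log \Gamma(x)$, is accurate enough for the comparison; `stirling_gap` is a name chosen for this illustration.

```python
import math

def stirling_gap(m):
    """Return m*log(m) - log(m!), which the proof bounds strictly between 0 and m - 1."""
    return m * math.log(m) - math.lgamma(m + 1)  # lgamma(m + 1) equals log(m!)
```

For example, `stirling_gap(2)` is $2 \log 2 - \log 2 = \log 2$, which lies between 0 and 1 as claimed.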
The next result that will be needed is, as $m \to \infty$
$$\psi(m) = O(m). \tag{4.2.2}$$
This result follows easily from the theorems proved in 3.2 and 3.3. By Chebyshev’s
theorem, for large enough $m$, $a \le \frac{\pi(m) \log m}{m} \le A$, and in the proof of Theorem 3.2 we saw that
$\psi(m) \le \pi(m) \log m$; thus $\frac{\psi(m)}{m} \le A$, so
$\psi(m) \le Am$, and hence $\psi(m)$ is $O(m)$.
The final result that will be required in the proof of Mertens’ theorem is the formula for
“partial summation”, or “Abel summation”. This formula is useful to number theorists as
a systematic approach to the computation of finite sums of number theoretic functions. Let
$0 \le \lambda_1 \le \lambda_2 \le \cdots$ be any sequence of real numbers tending to infinity, and let $a_n$ be a sequence of
complex numbers. Let
$$A(N) := \sum_{\lambda_n \le N} a_n,$$
and let $\varphi(N)$ be a complex-valued function defined for $N \ge 0$. Then if $\varphi(N)$ has a continuous
derivative in $(0, \infty)$, and $N \ge \lambda_1$, then
$$\sum_{\lambda_n \le N} a_n \varphi(\lambda_n) = A(N) \varphi(N) - \int_{\lambda_1}^{N} A(t) \varphi'(t)\,dt. \tag{4.2.3}$$
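Proof: Let us define $A(\lambda_0) = 0$, then
$$A(\lambda_1) - A(\lambda_0) = a_1$$
$$A(\lambda_2) - A(\lambda_1) = a_2$$
$$\vdots$$
$$A(\lambda_n) - A(\lambda_{n-1}) = a_n.$$
Substituting these expressions for $a_n$ into the sum, we see that
$$\begin{aligned}
\sum_{n=1}^{k} a_n \varphi(\lambda_n) &= (A(\lambda_1) - A(\lambda_0)) \varphi(\lambda_1) + (A(\lambda_2) - A(\lambda_1)) \varphi(\lambda_2) + \cdots + (A(\lambda_k) - A(\lambda_{k-1})) \varphi(\lambda_k) \\
&= A(\lambda_k) \varphi(\lambda_k) - \sum_{n=1}^{k-1} A(\lambda_n) (\varphi(\lambda_{n+1}) - \varphi(\lambda_n)) \\
&= A(\lambda_k) \varphi(\lambda_k) - \sum_{n=1}^{k-1} A(\lambda_n) \int_{\lambda_n}^{\lambda_{n+1}} \varphi'(t)\,dt \\
&= A(\lambda_k) \varphi(\lambda_k) - \sum_{n=1}^{k-1} \int_{\lambda_n}^{\lambda_{n+1}} A(t) \varphi'(t)\,dt \\
&= A(\lambda_k) \varphi(\lambda_k) - \int_{\lambda_1}^{\lambda_k} A(t) \varphi'(t)\,dt.
\end{aligned} \tag{4.2.4}$$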
The last equation follows since $\varphi(t)$ has a continuous derivative on $(0, \infty)$. Now, if we let $k$
be the largest integer such that $\lambda_k \le N$, then from integration by parts we have
$$\int_{\lambda_k}^{N} A'(t) \varphi(t)\,dt = A(N) \varphi(N) - A(\lambda_k) \varphi(\lambda_k) - \int_{\lambda_k}^{N} A(t) \varphi'(t)\,dt.$$
Since $A(t)$ is a step function, it is constant on $\lambda_k \le t < \lambda_{k+1}$, so the left hand side of the
above equation is zero, and rearranging we obtain
$$A(\lambda_k) \varphi(\lambda_k) = A(N) \varphi(N) - \int_{\lambda_k}^{N} A(t) \varphi'(t)\,dt.$$
Substituting this back into (4.2.4), and again using the fact that $A(t)$ is a step function, we
obtain our result (4.2.3)
$$\sum_{\lambda_n \le N} a_n \varphi(\lambda_n) = A(N) \varphi(N) - \int_{\lambda_1}^{N} A(t) \varphi'(t)\,dt.$$
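As a concrete instance of (4.2.3), take $a_n = 1$, $\lambda_n = n$ and $\varphi(t) = 1/t$, so that $A(t) = \lfloor t \rfloor$ and $\varphi'(t) = -1/t^2$. For integer $N$ both sides can then be evaluated exactly with rational arithmetic. This is only a sketch of one special case, and the function names are assumptions of the example.

```python
from fractions import Fraction

def abel_right_side(N):
    """A(N)*phi(N) - integral_1^N A(t)*phi'(t) dt for a_n = 1, lambda_n = n, phi(t) = 1/t.

    A(t) = floor(t) is constant on [k, k+1), so the integral of floor(t)/t^2
    over [k, k+1] equals k * (1/k - 1/(k+1)) exactly.
    """
    boundary = Fraction(N, N)  # A(N) * phi(N) = floor(N) / N for a positive integer N
    integral = sum(k * (Fraction(1, k) - Fraction(1, k + 1)) for k in range(1, N))
    return boundary + integral

def abel_left_side(N):
    """Direct evaluation of the sum of a_n * phi(lambda_n), that is, of 1/n for n <= N."""
    return sum(Fraction(1, n) for n in range(1, N + 1))
```

Both functions return the same rational number for every positive integer $N$ tested; for $N = 4$ the common value is $25/12$.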
4.3 Proof of Mertens’ theorem
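We now have all the ingredients necessary to prove Mertens’ theorem. We will begin by
proving (4.1.1), that is
$$\sum_{n \le N} \frac{\Lambda(n)}{n} = \log N + O(1); \qquad \sum_{p \le N} \frac{\log p}{p} = \log N + O(1).$$
Recall that in (3.3.4) we showed that
$$N! = \prod_{p \le N} p^{\lfloor \frac{N}{p} \rfloor + \lfloor \frac{N}{p^2} \rfloor + \cdots + \lfloor \frac{N}{p^s} \rfloor},$$
where $s$ is the largest integer for which $p^s \le N$. Taking logarithms of both sides of this
formula we obtain
$$\log N! = \sum_{(p, r):\ p^r \le N} \Big\lfloor \frac{N}{p^r} \Big\rfloor \log p = \sum_{n \le N} \Big\lfloor \frac{N}{n} \Big\rfloor \Lambda(n). \tag{4.3.1}$$
As in the proof of Chebyshev’s theorem, we write $\lfloor \frac{N}{n} \rfloor = \frac{N}{n} - \epsilon_n$, where $0 \le \epsilon_n < 1$.
Substitution in (4.3.1) gives us
$$\log N! = N \sum_{n \le N} \frac{\Lambda(n)}{n} - \sum_{n \le N} \epsilon_n \Lambda(n),$$
and by the bound $\psi(N) = O(N)$ proved in 4.2, we have
$$\sum_{n \le N} \epsilon_n \Lambda(n) < \sum_{n \le N} \Lambda(n) = \psi(N) = O(N),$$
thus
$$\log N! = N \sum_{n \le N} \frac{\Lambda(n)}{n} + O(N).$$
By (4.2.1)
$$\log N! = N \log N + O(N),$$
and dividing through by $N$ we get
$$\sum_{n \le N} \frac{\Lambda(n)}{n} = \log N + O(1), \tag{4.3.2}$$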
which is the first formula in (4.1.1). To get the second formula in (4.1.1) we bound the
difference between the sum in (4.3.2) and our summation of interest by a constant, that is
$$0 \le \sum_{n \le N} \frac{\Lambda(n)}{n} - \sum_{p \le N} \frac{\log p}{p} \le \sum_{p \le N} \Big( \frac{1}{p^2} + \frac{1}{p^3} + \cdots \Big) \log p < \sum_{p} \frac{\log p}{p(p - 1)} < \sum_{n \ge 2} \frac{\log n}{n(n - 1)}. \tag{4.3.3}$$
Here the first inequality in (4.3.3) follows since the von Mangoldt function
is nonzero only for $n = p^m$, hence all terms in both sums agree except those with denominators
which are powers of primes with $m \ge 2$. The final series on the right hand side of (4.3.3) is
convergent: since $\frac{1}{n(n - 1)} \le \frac{2}{n^2}$, and, as in the proof of Chebyshev’s theorem, $\log n < \sqrt{n}$
for $n \ge 2$, it follows that
$$\sum_{n \ge 2} \frac{\log n}{n(n - 1)} \le 2 \sum_{n \ge 2} \frac{\log n}{n^2} < 2 \sum_{n \ge 2} \frac{\sqrt{n}}{n^2} = 2 \sum_{n \ge 2} \frac{1}{n^{3/2}} = C < \infty.$$
The last series is the convergent p-series with $p = \frac{3}{2}$. This shows that the second formula in
(4.1.1) holds, since the difference between the two summations will always be bounded
above by this constant. That is,
$$\sum_{n \le N} \frac{\Lambda(n)}{n} - \sum_{p \le N} \frac{\log p}{p} = O(1).$$
Now, by (4.3.2),
$$\begin{aligned}
\sum_{p \le N} \frac{\log p}{p} &= \sum_{n \le N} \frac{\Lambda(n)}{n} + \Big( \sum_{p \le N} \frac{\log p}{p} - \sum_{n \le N} \frac{\Lambda(n)}{n} \Big) \\
&= \log N + O(1) + O(1) \\
&= \log N + O(1).
\end{aligned}$$
Therefore
$$\sum_{p \le N} \frac{\log p}{p} = \log N + O(1),$$
which completes (4.1.1).

Next we shall prove (4.1.2), that is
$$\int_1^N \frac{\psi(t)}{t^2}\,dt = \log N + O(1).$$
Recall that in section 3.4, we used this formula without proof to show that
$$\limsup_{N \to \infty} \frac{\pi(N)}{N / \log N} \ge 1 \quad \text{and} \quad \liminf_{N \to \infty} \frac{\pi(N)}{N / \log N} \le 1.$$
Therefore, if
$$\lim_{N \to \infty} \frac{\pi(N)}{N / \log N}$$
exists, then the limit must be equal to 1. From this, we obtain the asymptotic formula in
the prime number theorem.
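To begin the proof, recall the previously mentioned relation
$$\psi(t) = \sum_{n \le t} \Lambda(n).$$
We substitute this expression for $\psi(t)$ into the integral to obtain
$$\int_1^N \frac{\psi(t)}{t^2}\,dt = \int_1^N \sum_{n \le t} \Lambda(n)\,\frac{dt}{t^2}.$$
Now, taking $N$ to be an integer and splitting the range of integration at $t = 1, 2, 3, \ldots, N$, we see that we can rewrite
$$\int_1^N \sum_{n \le t} \Lambda(n)\,\frac{dt}{t^2} = \Lambda(1) \int_1^2 \frac{1}{t^2}\,dt + (\Lambda(1) + \Lambda(2)) \int_2^3 \frac{1}{t^2}\,dt + \cdots + (\Lambda(1) + \cdots + \Lambda(N - 1)) \int_{N-1}^{N} \frac{1}{t^2}\,dt,$$
which can be rewritten
$$\Lambda(1) \int_1^N \frac{1}{t^2}\,dt + \Lambda(2) \int_2^N \frac{1}{t^2}\,dt + \cdots + \Lambda(N - 1) \int_{N-1}^{N} \frac{1}{t^2}\,dt,$$
thus
$$\int_1^N \frac{\psi(t)}{t^2}\,dt = \int_1^N \sum_{n \le t} \Lambda(n)\,\frac{dt}{t^2} = \sum_{n \le N} \Lambda(n) \int_n^N \frac{dt}{t^2}. \tag{4.3.4}$$
The integral on the right hand side of (4.3.4) is equal to $\frac{1}{n} - \frac{1}{N}$; substituting this back in
and simplifying, we obtain
$$\int_1^N \frac{\psi(t)}{t^2}\,dt = \sum_{n \le N} \frac{\Lambda(n)}{n} - \frac{\psi(N)}{N}.$$
Now applying the first part of (4.1.1) and (4.2.2), we have
$$\int_1^N \frac{\psi(t)}{t^2}\,dt = \log N + O(1),$$
which is (4.1.2).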
Now we can prove (4.1.3), that is
$$\sum_{p \le N} \frac{1}{p} = \log \log N + C + O\Big(\frac{1}{\log N}\Big).$$
The proof requires defining the proper sequences and functions for use in Abel’s
summation (4.2.3). Let $\lambda_n = p_n$, where $p_n$ is the sequence of prime numbers in natural
order. We define
$$A(N) = \sum_{p_n \le N} a_n, \quad \text{where } a_n = \frac{\log p_n}{p_n},$$
and $\varphi(p_n) = \frac{1}{\log p_n}$. Abel summation gives us
$$\sum_{p_n \le N} \frac{1}{p_n} = \frac{A(N)}{\log N} + \int_2^N \frac{A(t)}{t (\log t)^2}\,dt.$$
Now from the second formula in (4.1.1), we let $A(N) = \log N + E(N)$, so that
$$\sum_{p_n \le N} \frac{1}{p_n} = 1 + \frac{E(N)}{\log N} + \int_2^N \frac{\log t + E(t)}{t (\log t)^2}\,dt,$$
where by (4.1.1), $E(N)$ is $O(1)$, thus $|E(N)| \le K$ for all $N \ge 2$, where $K$ is some constant.
We can rearrange this to get
$$\sum_{p_n \le N} \frac{1}{p_n} = 1 + \frac{E(N)}{\log N} + \int_2^N \frac{dt}{t \log t} + \int_2^N \frac{E(t)}{t (\log t)^2}\,dt.$$
Now the first integral is
$$\int_2^N \frac{dt}{t \log t} = \log \log N - \log \log 2,$$
and substituting this back in and rearranging, we see that
$$\sum_{p_n \le N} \frac{1}{p_n} = \log \log N + 1 - \log \log 2 + \frac{E(N)}{\log N} + \int_2^N \frac{E(t)}{t (\log t)^2}\,dt.$$
Using the fact that $E(N) = O(1)$, we have $\frac{|E(N)|}{\log N} \le \frac{K}{\log N}$. It remains to show that
$$\int_2^N \frac{E(t)}{t (\log t)^2}\,dt = \text{constant} + O\Big(\frac{1}{\log N}\Big).$$
We write
$$\int_2^N \frac{E(t)}{t (\log t)^2}\,dt = \int_2^{\infty} \frac{E(t)}{t (\log t)^2}\,dt - \int_N^{\infty} \frac{E(t)}{t (\log t)^2}\,dt.$$
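Now the improper integral
$$\int_2^{\infty} \frac{E(t)}{t (\log t)^2}\,dt$$
is convergent since
$$\int_2^{\infty} \frac{|E(t)|}{t (\log t)^2}\,dt \le K \int_2^{\infty} \frac{dt}{t (\log t)^2} = \frac{K}{\log 2}.$$
Now we rewrite once again
$$\sum_{p_n \le N} \frac{1}{p_n} = \log \log N + 1 - \log \log 2 + \int_2^{\infty} \frac{E(t)}{t (\log t)^2}\,dt + \frac{E(N)}{\log N} - \int_N^{\infty} \frac{E(t)}{t (\log t)^2}\,dt,$$
where
$$\Big| \int_N^{\infty} \frac{E(t)}{t (\log t)^2}\,dt \Big| \le \frac{K}{\log N}.$$
Hence
$$\Big| \frac{E(N)}{\log N} - \int_N^{\infty} \frac{E(t)}{t (\log t)^2}\,dt \Big| \le \frac{2K}{\log N},$$
for $N \ge 2$, which shows the error term is $O\big(\frac{1}{\log N}\big)$. Setting
$$C = 1 - \log \log 2 + \int_2^{\infty} \frac{E(t)}{t (\log t)^2}\,dt,$$
(4.1.3) follows. This completes the proof of Mertens’ theorem.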
§5: The Product Form of Mertens’ Theorem
The goal of this section is to prove an important result that follows from the proof of
Mertens’ theorem in 4.3, that is [10]
$$\prod_{p \le N} \Big( 1 - \frac{1}{p} \Big) = \frac{e^{-\gamma}}{\log N} (1 + o(1)) \sim \frac{e^{-\gamma}}{\log N}, \tag{5.0}$$
where $\gamma$ is the Euler-Mascheroni constant.

5.1 The Euler-Mascheroni Constant
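The Euler-Mascheroni or Euler’s constant was defined by Euler in 1734 as
$$\gamma = \lim_{N \to \infty} \Big[ \sum_{n=1}^{N} \frac{1}{n} - \log N \Big] = \int_1^{\infty} \Big( \frac{1}{\lfloor t \rfloor} - \frac{1}{t} \Big)\,dt. \tag{5.1.1}$$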
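The symbol $\gamma$ was first used by the geometer Lorenzo Mascheroni, who in 1790 used the
symbol $\gamma$ instead of Euler’s $C$, hence the name Euler-Mascheroni. We can use the very
useful Abel summation formula (4.2.3) to give us an approximation to $\gamma$. Let $a_n = 1$ and $\lambda_n = n$, so
that $A(N) = \lfloor N \rfloor$, and let $\varphi(t) = \frac{1}{t}$; then Abel summation gives us
$$\sum_{n \le N} \frac{1}{n} = \frac{\lfloor N \rfloor}{N} + \int_1^N \frac{\lfloor t \rfloor}{t^2}\,dt. \tag{5.1.2}$$
Letting $\lfloor N \rfloor = N - \epsilon_N$ and $\lfloor t \rfloor = t - \epsilon_t$ where $0 \le \epsilon_N < 1$, $0 \le \epsilon_t < 1$, and substituting into
(5.1.2) we get
$$\sum_{n \le N} \frac{1}{n} - \log N = 1 - \frac{\epsilon_N}{N} - \int_1^N \frac{\epsilon_t}{t^2}\,dt. \tag{5.1.3}$$
Letting $N \to \infty$ gives us another expression for $\gamma$,
$$\gamma = 1 - \int_1^{\infty} \frac{\epsilon_t}{t^2}\,dt, \tag{5.1.4}$$
thus we have $0 < \gamma \le 1$. Now if we subtract and add $\int_N^{\infty} \frac{\epsilon_t}{t^2}\,dt$ on the right hand side of
(5.1.3), we can use (5.1.4) to write
$$\sum_{n \le N} \frac{1}{n} = \log N + \gamma + E(N),$$
where
$$|E(N)| = \Big| \int_N^{\infty} \frac{\epsilon_t}{t^2}\,dt - \frac{\epsilon_N}{N} \Big| \le \int_N^{\infty} \frac{\epsilon_t}{t^2}\,dt + \frac{\epsilon_N}{N} < \frac{2}{N}.$$
Thus we have
$$\sum_{n \le N} \frac{1}{n} - \log N - \frac{2}{N} < \gamma < \sum_{n \le N} \frac{1}{n} - \log N + \frac{2}{N}.$$
This inequality provides a method to compute the value of $\gamma$ to several places. To fifteen
decimal places, the value of $\gamma$ is 0.577215664901533. [3]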
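The two-sided bound on $\gamma$ just derived translates directly into a small computation. The following sketch assumes double-precision arithmetic suffices for moderate $N$; `gamma_bracket` is a hypothetical name used only here.

```python
import math

def gamma_bracket(N):
    """Return (lower, upper) with lower < gamma < upper, using H_N - log N -/+ 2/N."""
    harmonic = math.fsum(1.0 / n for n in range(1, N + 1))
    centre = harmonic - math.log(N)
    return centre - 2.0 / N, centre + 2.0 / N
```

With $N = 10^5$ the bracket has width $4/N = 4 \times 10^{-5}$ and contains 0.5772156649, so this simple method already pins down the first four decimal places of $\gamma$.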
The constant $\gamma$ is closely connected to the Gamma function,
$$\Gamma(N) = \int_0^{\infty} e^{-x} x^{N-1}\,dx. \tag{5.1.5}$$
This connection is due to the mathematician Karl Weierstrass (1815-1897) and his formula
$$\frac{1}{\Gamma(N)} = N e^{\gamma N} \prod_{k > 0} \Big( 1 + \frac{N}{k} \Big) e^{-\frac{N}{k}}. \tag{5.1.6}$$
It requires some complex analysis to show that (5.1.5) and (5.1.6) are equivalent
representations for $\Gamma(N)$ [19]. Differentiating (5.1.5) with respect to $N$ we obtain
$$\Gamma'(N) = \int_0^{\infty} e^{-x} x^{N-1} \log x\,dx, \qquad \Gamma'(1) = \int_0^{\infty} e^{-x} \log x\,dx. \tag{5.1.7}$$
Logarithmically differentiating (5.1.6) we obtain
$$-\log(\Gamma(N)) = \log N + \gamma N + \sum_{k > 0} \Big[ \log\Big( 1 + \frac{N}{k} \Big) - \frac{N}{k} \Big],$$
$$-\frac{\Gamma'(N)}{\Gamma(N)} = \frac{1}{N} + \gamma + \sum_{k > 0} \Big( \frac{1}{k + N} - \frac{1}{k} \Big),$$
$$-\Gamma'(1) = \gamma, \tag{5.1.8}$$
since from (5.1.5), $\Gamma(1) = 1$, and the sum telescopes to $-1$ when $N = 1$. Therefore, from (5.1.7) and (5.1.8)
$$\gamma = -\Gamma'(1) = -\int_0^{\infty} e^{-x} \log x\,dx. \tag{5.1.9}$$
5.2 The Product Form as a Probability of a Number Being Prime

Intuitively, the product in (5.0) can be interpreted as the probability that a random
number $N$ is not divisible by any primes $p$ up to $N$; in other words, the product represents
the probability that $N$ is prime [8]. To see this, first consider the probability that a
number is divisible by 2, which is $\frac{1}{2}$; thus the probability that a number is not divisible by
2 is $1 - \frac{1}{2} = \frac{1}{2}$. The probability that a number is divisible by 3 is naturally $\frac{1}{3}$, and
therefore the probability that a number is not divisible by 3 is $1 - \frac{1}{3} = \frac{2}{3}$. Now, what is
the probability that a number is divisible by neither 2 nor 3? Define the two events A: a
number is divisible by 2; and B: a number is divisible by 3. As a special case, consider the
first 20 numbers, $1, 2, 3, \ldots, 20$; then the probability of A is $\frac{10}{20} = \frac{1}{2}$, and the probability of
B is $\frac{6}{20} = \frac{3}{10}$. The probability that a number is divisible by 3, given that it is divisible by
2, is $\frac{3}{10}$, which is just the probability of B, hence the events A and B are independent of
each other. Therefore the probability that a given number is divisible by neither 2 nor 3 is
$\big(1 - \frac{1}{2}\big)\big(1 - \frac{1}{3}\big) = \frac{1}{3}$. Similarly, the probability that a number is not divisible by 5 is
$1 - \frac{1}{5}$, therefore the probability that any random integer is not divisible by 2, 3 or 5 is
$\big(1 - \frac{1}{2}\big)\big(1 - \frac{1}{3}\big)\big(1 - \frac{1}{5}\big) = \frac{4}{15}$. Continuing this process for all primes $p \le N$ gives the
probability that $N$ is not divisible by any of the $p \le N$ as
$$\prod_{p \le N} \Big( 1 - \frac{1}{p} \Big).$$
The product form of Mertens’ theorem verifies that this product tends to zero
asymptotically as $\frac{1}{e^{\gamma} \log N}$. The product approaches zero surprisingly slowly,
since the product form of Mertens’ theorem shows that at $N = 100{,}000$,
$\frac{1}{e^{.577216} \log 100{,}000} \approx .049$; thus according to this formulation, we still have about a 5%
chance that the integer is prime.
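The comparison between the product and its asymptotic value is easy to reproduce. The sketch below is illustrative only: the constant `EULER_GAMMA` is hard-coded rather than derived, and the function names are assumptions of this example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, hard-coded for this sketch

def prime_product(n):
    """Return the product over primes p <= n of (1 - 1/p)."""
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    product = 1.0
    for p in range(2, n + 1):
        if is_prime[p]:
            product *= 1.0 - 1.0 / p
            # cross out the multiples of p starting at p*p
            is_prime[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return product

def mertens_estimate(n):
    """Return e^(-gamma) / log n, the asymptotic value in (5.0)."""
    return math.exp(-EULER_GAMMA) / math.log(n)
```

For small arguments the product reproduces the hand computations above, for example `prime_product(5)` is $4/15$, and at $N = 100{,}000$ the product and `mertens_estimate` agree to within about one percent.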
Fig.5.2.1: The diamonds are the graph of $\prod_{p \le N+1} \big( 1 - \frac{1}{p} \big)$, and the stars are the graph of
$\frac{1}{e^{.577216} \log(N + 1)}$ for $1 \le N \le 10$. (Horizontal axis: $N$; vertical axis: $p$.)
Fig.5.2.2: The diamonds are the graph of $\prod_{p \le N+1} \big( 1 - \frac{1}{p} \big)$, and the stars are the graph of
$\frac{1}{e^{.577216} \log(N + 1)}$ for $1 \le N \le 100$. (Horizontal axis: $N$; vertical axis: $p$.)
The reader may notice a sort of “paradox” between this interpretation of (5.0) and the
prime number theorem. The prime number theorem asserts that the density of prime
numbers is $\frac{1}{\log N}$, whereas Mertens’ theorem gives $\frac{1}{e^{\gamma} \log N}$, which is not asymptotic to it. In
fact, compared to the prime number theorem, since $e^{\gamma} \approx 1.78$, the probability
underestimates the density of primes by nearly a factor of 2. One explanation is that the
Sieve of Eratosthenes is more efficient than random sieving. Recall from the introduction that we
find all primes $p \le N$ by sieving with primes $p \le \sqrt{N}$. Thus a number $N$ can be
determined to be prime by testing whether it is divisible by any of the primes $p \le \sqrt{N}$. If
in our probability formulation, we take the probability of $N$ being prime to be
$$\prod_{p \le \sqrt{N}} \Big( 1 - \frac{1}{p} \Big) \sim \frac{2}{e^{\gamma} \log N} \approx \frac{1.12}{\log N},$$
then Mertens’ theorem is much closer to $\frac{1}{\log N}$, but now overestimates the number
of primes by 12%. Our intuition in formulating the probability in this section
relied on the assumption of independent events, and when using the Sieve of
Eratosthenes approach, these events are not completely independent. In light of the prime
number theorem, $\frac{1}{\log N}$ is the true density of primes; but the density obtained from Mertens’
theorem is a decent estimate, and shows that the product tends to zero.
5.3 Proof of the Product form of Mertens’ Theorem

Recall that in section 4.3 we found that
$$\sum_{p \le N} \frac{1}{p} = \log \log N + C + O\Big(\frac{1}{\log N}\Big).$$
Because $O\big(\frac{1}{\log N}\big) = o(1)$, Mertens’ theorem can be restated as
$$\sum_{p \le N} \frac{1}{p} = \log \log N + C + o(1), \tag{5.3.1}$$
where
$$C = 1 - \log \log 2 + \int_2^{\infty} \frac{E(t)}{t (\log t)^2}\,dt.$$
(5.0) will follow from (5.3.1) and the following theorem which gives an alternate form of
the constant $C$ in (5.3.1).
Theorem:
$$C = \gamma + \sum_{p} \Big[ \log\Big( 1 - \frac{1}{p} \Big) + \frac{1}{p} \Big]. \tag{5.3.2}$$
If $\alpha \ge 0$, we have, by (2.1.8)
$$0 < -\log\Big( 1 - \frac{1}{p^{1+\alpha}} \Big) - \frac{1}{p^{1+\alpha}} < \frac{1}{2 p^{1+\alpha} (p^{1+\alpha} - 1)} \le \frac{1}{2p(p - 1)}.$$
We saw in (2.1.8) that the series $\sum_p \frac{1}{2p(p-1)}$ is convergent. Thus by the Weierstrass M-test, we can
define for $\alpha \ge 0$ a uniformly convergent series
$$F(\alpha) = \sum_{p} \Big[ \log\Big( 1 - \frac{1}{p^{1+\alpha}} \Big) + \frac{1}{p^{1+\alpha}} \Big]. \tag{5.3.3}$$
Since the series is uniformly convergent for all $\alpha \ge 0$, we have that $F(\alpha)$ is continuous, thus
$F(\alpha) \to F(0)$ as $\alpha \to 0$ through positive values. If we now suppose that $\alpha > 0$, then from
the equality between Euler’s product and the zeta-function discussed in the introduction,
we have
$$F(\alpha) = g(\alpha) - \log \zeta(1 + \alpha),$$
where
$$g(\alpha) = \sum_{p} \frac{1}{p^{1+\alpha}}.$$
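We again call upon the Abel summation formula (4.2.3), with the following definitions:
let $\lambda_n$ run over the integers $n \ge 2$ (so that $\lambda_1 = 2$), let
$$a_n = \begin{cases} \dfrac{1}{n}, & \text{if } n \text{ is prime} \\ 0, & \text{otherwise} \end{cases}$$
and let $\varphi(N) = \frac{1}{N^{\alpha}}$. Now, from the proof of Mertens’ Theorem in 4.3 we have that
$$A(N) = \sum_{p \le N} \frac{1}{p} = \log \log N + C + E(N),$$
where $E(N) = O\big(\frac{1}{\log N}\big)$. With these definitions, (4.2.3) becomes
$$\sum_{p \le N} \frac{1}{p^{1+\alpha}} = \frac{A(N)}{N^{\alpha}} + \alpha \int_2^{N} \frac{A(t)}{t^{1+\alpha}}\,dt.$$
If we let $N \to \infty$, we have
$$\begin{aligned}
g(\alpha) &= \alpha \int_2^{\infty} \frac{A(t)}{t^{1+\alpha}}\,dt \\
&= \alpha \int_2^{\infty} \frac{\log \log t}{t^{1+\alpha}}\,dt + \alpha C \int_2^{\infty} \frac{1}{t^{1+\alpha}}\,dt + \alpha \int_2^{\infty} \frac{E(t)}{t^{1+\alpha}}\,dt.
\end{aligned}$$
Now,
$$\alpha \int_2^{\infty} \frac{\log \log t}{t^{1+\alpha}}\,dt = \alpha \int_1^{\infty} \frac{\log \log t}{t^{1+\alpha}}\,dt - \alpha \int_1^{2} \frac{\log \log t}{t^{1+\alpha}}\,dt.$$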
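Making the change of variables $t = e^{\frac{u}{\alpha}}$,
$$\alpha \int_1^{\infty} \frac{\log \log t}{t^{1+\alpha}}\,dt = \int_0^{\infty} e^{-u} \log\Big( \frac{u}{\alpha} \Big)\,du = -\gamma - \log \alpha$$
by (5.1.9), and
$$\alpha \int_1^{\infty} \frac{1}{t^{1+\alpha}}\,dt = 1.$$
Therefore
$$g(\alpha) + \log \alpha - C + \gamma = \alpha \int_2^{\infty} \frac{E(t)}{t^{1+\alpha}}\,dt - \alpha \int_1^{2} \frac{\log \log t + C}{t^{1+\alpha}}\,dt.$$
It was shown in 4.3 that $E(t) = O\big(\frac{1}{\log t}\big)$; using this, and setting
$T = e^{\frac{1}{\sqrt{\alpha}}}$,
$$\begin{aligned}
\Big| \alpha \int_2^{\infty} \frac{E(t)}{t^{1+\alpha}}\,dt \Big| &< K\alpha \int_2^{T} \frac{dt}{t} + \frac{K\alpha}{\log T} \int_T^{\infty} \frac{dt}{t^{1+\alpha}} \\
&< K\alpha \log T + \frac{K}{\log T} \le 2K\sqrt{\alpha} \to 0 \text{ as } \alpha \to 0.
\end{aligned}$$
We also have
$$\Big| \int_1^{2} \frac{\log \log t + C}{t^{1+\alpha}}\,dt \Big| < \int_1^{2} \frac{|\log \log t| + |C|}{t}\,dt = K,$$
since the integral converges at $t = 1$. Therefore $g(\alpha) + \log \alpha \to C - \gamma$
as $\alpha \to 0$.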
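Recall from section 1.2 the zeta-function, which can be written in the following form
$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \int_1^{\infty} x^{-s}\,dx + \sum_{n=1}^{\infty} \int_n^{n+1} (n^{-s} - x^{-s})\,dx. \tag{5.3.4}$$
Now, since $s > 1$, we have
$$\int_1^{\infty} x^{-s}\,dx = \frac{1}{s - 1}.$$
Also
$$0 < n^{-s} - x^{-s} = \int_n^{x} s t^{-s-1}\,dt < \int_n^{x} s t^{-2}\,dt = \frac{s(x - n)}{nx} < \frac{s}{n^2},$$
if $n < x < n + 1$, and so
$$0 < \int_n^{n+1} (n^{-s} - x^{-s})\,dx < \int_n^{n+1} \frac{s}{n^2}\,dx = \frac{s}{n^2};$$
and the last term in (5.3.4) is positive and less than
$$s \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{s \pi^2}{6}.$$
Therefore
$$\zeta(s) = \frac{1}{s - 1} + O(s),$$
where the $O(s)$ term is bounded for $1 < s < 2$, and on taking logarithms, for $|s - 1| < 1$,
$$\log \zeta(s) = \log \frac{1}{s - 1} + \log [1 + O(s - 1)] = \log \frac{1}{s - 1} + O(s - 1). \tag{5.3.5}$$
Now, from (5.3.5) we get $\log \zeta(1 + \alpha) + \log \alpha \to 0$ as $\alpha \to 0$, and so $F(\alpha) \to C - \gamma$.
Therefore
$$C = \gamma + F(0),$$
which is (5.3.2).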
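It is now relatively easy to prove (5.0), by means of (5.3.1) and (5.3.2). To see this,
using our new form of the constant $C$ in (5.3.2) in (5.3.1) we write
$$\sum_{p \le N} \frac{1}{p} = \log \log N + \gamma + \sum_{p} \Big[ \log\Big( 1 - \frac{1}{p} \Big) + \frac{1}{p} \Big] + o(1). \tag{5.3.6}$$
Now, (5.3.6) is equal to
$$\sum_{p \le N} \frac{1}{p} = \log \log N + \gamma + \sum_{p \le N} \Big[ \log\Big( 1 - \frac{1}{p} \Big) + \frac{1}{p} \Big] + \sum_{p > N} \Big[ \log\Big( 1 - \frac{1}{p} \Big) + \frac{1}{p} \Big] + o(1).$$
The sum over $p > N$ is the tail of a convergent series, so it is $o(1)$. Thus
$$\sum_{p \le N} \frac{1}{p} = \log \log N + \gamma + \sum_{p \le N} \log\Big( 1 - \frac{1}{p} \Big) + \sum_{p \le N} \frac{1}{p} + o(1).$$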
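Canceling and moving some terms around we get
$$\sum_{p \le N} \log\Big( 1 - \frac{1}{p} \Big) = -\log \log N - \gamma - o(1),$$
exponentiating we get
$$e^{\sum_{p \le N} \log\left( 1 - \frac{1}{p} \right)} = e^{-\log \log N - \gamma - o(1)},$$
using properties of logarithms we obtain
$$e^{\log \prod_{p \le N} \left( 1 - \frac{1}{p} \right)} = e^{-\log \log N} e^{-\gamma} e^{-o(1)},$$
finally
$$\prod_{p \le N} \Big( 1 - \frac{1}{p} \Big) = \frac{1}{\log N}\,e^{-\gamma} (1 + o(1)) \sim \frac{e^{-\gamma}}{\log N},$$
which is (5.0).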
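Formula (5.3.2) also gives a practical way to approximate the constant $C$, because the series $\sum_p [\log(1 - \frac{1}{p}) + \frac{1}{p}]$ converges quickly. The sketch below truncates the series at a finite bound, which is an approximation; the hard-coded `EULER_GAMMA` and the function name are assumptions of this example. The value of $C$ obtained this way is approximately 0.2615.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, hard-coded for this sketch

def mertens_constant_estimate(limit):
    """Approximate C = gamma + sum_p [log(1 - 1/p) + 1/p] using only the primes p <= limit."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    terms = []
    for p in range(2, limit + 1):
        if is_prime[p]:
            # log1p(-1/p) computes log(1 - 1/p) accurately for large p
            terms.append(math.log1p(-1.0 / p) + 1.0 / p)
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return EULER_GAMMA + math.fsum(terms)
```

Each omitted term is negative and smaller in size than $\frac{1}{2p(p-1)}$, so truncating at $10^5$ changes the result by less than $10^{-6}$.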
5.4 Another Proof of The Product Form of Mertens’ Theorem
The following proof uses many of the same steps and lemmas as the previous proof;
but the integral evaluations for finding the expressions in the sum $F(\alpha)$ are simplified by
an exponential substitution. This proof also does not require using the Gamma function
form of $\gamma$ given by (5.1.9), only the definition of $\gamma$. We begin by using (5.3.5) to write
$$\log \zeta(1 + \alpha) = \log \frac{1}{\alpha} + O(\alpha).$$
From the Maclaurin series expansion
$$\begin{aligned}
1 - e^{-\alpha} &= 1 - \Big( 1 - \alpha + \frac{\alpha^2}{2!} - \frac{\alpha^3}{3!} + \cdots \Big) \\
&= \alpha - \frac{\alpha^2}{2!} + \frac{\alpha^3}{3!} - \cdots \\
&= \alpha \Big[ 1 + \alpha \Big( -\frac{1}{2!} + \frac{\alpha}{3!} - \cdots \Big) \Big] \\
&= \alpha (1 + O(\alpha)), \quad \text{as } \alpha \to 0.
\end{aligned}$$
Taking logarithms we obtain
$$\begin{aligned}
\log \zeta(1 + \alpha) &= -\log\big( 1 - e^{-\alpha} \big) + O(\alpha) \\
&= \sum_{n=1}^{\infty} \frac{e^{-\alpha n}}{n} + O(\alpha),
\end{aligned}$$
where the last line follows from the Maclaurin series expansion for $-\log(1 - x)$ as shown in
(2.1.8). This relationship is a key difference between the two proofs and simplifies the
analysis considerably.
From (5.3.3) we have the second representation of log ζ(1 + α) :
X 1
log ζ(1 + α) = 1+α
− F (α).
p
p
Now letting
X1
H(t) =
n≤t
n
X1
P (t) =
p≤t
p
by Abel summation
X 1 Z ∞
1+α
=α P (et )e−αt dt;
p
p 0
39
and ∞ ∞
e−αn
X Z
=α H(t)e−αt dt.
n=1
n 0
Therefore
\[
\log \zeta(1 + \alpha) = \alpha \int_0^{\infty} H(t)\, e^{-\alpha t}\, dt + O(\alpha)
\]
and
\[
\log \zeta(1 + \alpha) = \alpha \int_0^{\infty} P(e^t)\, e^{-\alpha t}\, dt - F(\alpha).
\]
Subtracting to eliminate log ζ(1 + α):
\[
\alpha \int_0^{\infty} e^{-\alpha t} \left(H(t) - P(e^t)\right) dt = -F(\alpha) + O(\alpha). \tag{5.4.1}
\]
By (5.1.1)-(5.1.4) we have
\[
H(t) = \log t + \gamma + O\left(\frac{1}{t}\right).
\]
We have from (5.3.1), with N = e^t, that
\[
P(e^t) = \log t + C + o(1).
\]
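Therefore
\[
\begin{aligned}
\alpha \int_0^{\infty} e^{-\alpha t} \left(H(t) - P(e^t)\right) dt &= \alpha \int_0^{\infty} e^{-\alpha t} \left(\gamma - C + o(1)\right) dt \\
&= \alpha \left[\frac{e^{-\alpha t}}{-\alpha}\right]_0^{\infty} \left(\gamma - C + o(1)\right) \\
&= \gamma - C + o(1).
\end{aligned}
\]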
Thus, letting $\alpha \to 0^+$ in (5.4.1), we obtain
\[
\gamma - C = -F(0),
\]
or
\[
C = \gamma + F(0),
\]
which is (5.3.2). As in the previous proof, once this form of the constant C in Mertens’
theorem is known, the asymptotic formula follows.
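
The identity C = γ + F(0) can also be examined numerically. The sketch below is a rough illustration under stated assumptions: the function names are hypothetical, γ is hard-coded as a decimal, and F(0) is truncated to the primes p ≤ N, so both numbers are approximations. It computes C in two ways, once as γ plus the truncated sum for F(0) and once as the partial sum of 1/p minus log log N, as in (5.3.1).

```python
import math

# Euler-Mascheroni constant, hard-coded (an assumption of this sketch).
EULER_GAMMA = 0.5772156649015329


def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of primes p <= n."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if is_prime[i]]


def mertens_constant_two_ways(n):
    """Return (gamma + truncated F(0), sum_{p <= n} 1/p - log log n)."""
    primes = primes_up_to(n)
    # F(0) = sum_p [log(1 - 1/p) + 1/p], truncated to the primes p <= n.
    f0 = math.fsum(math.log1p(-1.0 / p) + 1.0 / p for p in primes)
    from_formula = EULER_GAMMA + f0
    # Partial sum of prime reciprocals minus log log n, as in (5.3.1).
    from_partial_sums = math.fsum(1.0 / p for p in primes) - math.log(math.log(n))
    return from_formula, from_partial_sums


if __name__ == "__main__":
    for n in (10**3, 10**5, 10**6):
        print(n, mertens_constant_two_ways(n))
```

The first value converges quickly, because the terms of F(0) are of size about 1/(2p²), while the second carries the slowly decaying error term of Mertens’ formula; the two should nevertheless agree to several decimal places for N around 10^6.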
6. Conclusion
6.1 Final Remarks

The distribution of the prime numbers throughout the integers is a fundamental
problem in number theory. While the prime number theorem gives us certainty that the
density is approximately $1/\log N$, this is only an asymptotic estimate. The actual difference
between $\pi(N)$ and $N/\log N$, for large $N$, can be very large. The better approximation
$\mathrm{Li}(N) = \int_2^N \frac{dt}{\log t}$ can still differ from $\pi(N)$ considerably, and how considerably is the
subject of a Clay Institute million dollar prize problem: the Riemann Hypothesis. Euler laid
the foundation for the modern analysis of prime numbers. His wonderful product
\[
\sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p} \left(1 - \frac{1}{p^s}\right)^{-1}
\]
provided the connection between the zeta-function and the primes. We relied on Euler’s
product in 5.3 when we proved the product form of Mertens’ theorem.
As was shown throughout this paper, the infinite sum of prime reciprocals is interwoven
in the theory of the distribution of prime numbers. As we observed in section 2.1, Euler
used his product to prove that
\[
\sum_{p \le N} \frac{1}{p} > \log\log N - \frac{1}{2}, \quad \text{for all } N \ge 2.
\]
In 3.2 the sum’s divergence was shown to be a consequence of Chebyshev’s theorem, which
provides upper and lower bounds for π(N). Further, we could not have proved the
asymptotic formula for the product form of Mertens’ theorem without the formula
\[
\sum_{p \le N} \frac{1}{p} = \log\log N + C + O\left(\frac{1}{\log N}\right),
\]
as we saw in 4.3. The divergence of the sum of prime reciprocals also shows that the primes
must be more numerous than the square numbers, and in fact more numerous than the numbers
of the form $n^{1+\alpha}$ where α > 0, since the sum of the reciprocals of such numbers always converges.
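
Euler’s lower bound quoted above lends itself to a direct check over a finite range. The following sketch is an illustration only, with a hypothetical function name `min_euler_margin`: for every integer N from 2 up to a chosen limit it evaluates the partial sum of prime reciprocals minus (log log N − 1/2) and returns the smallest value found. A positive result is consistent with the inequality on that range and says nothing beyond it.

```python
import math


def min_euler_margin(limit):
    """Smallest value of sum_{p <= N} 1/p - (log log N - 1/2) over integers 2 <= N <= limit."""
    if limit < 2:
        raise ValueError("limit must be at least 2")
    # Sieve of Eratosthenes up to `limit`.
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))

    running_sum = 0.0  # sum of 1/p over primes p <= N
    smallest = math.inf
    for n in range(2, limit + 1):
        if is_prime[n]:
            running_sum += 1.0 / n
        margin = running_sum - (math.log(math.log(n)) - 0.5)
        smallest = min(smallest, margin)
    return smallest


if __name__ == "__main__":
    print(min_euler_margin(10**5))
```

Because the partial sums behave like log log N + C with C positive, the margin is expected to stay well away from zero, which reflects how much room Euler’s bound leaves.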
6.2 Further Results

An intriguing continuation of the thesis of this paper is to attempt to do the same sort
of analysis, but instead summing over reciprocals of twin primes. Twin primes are primes
that have a difference of two. We commonly refer to twin primes as twin prime pairs,
(p, p + 2). The first five twin prime pairs are (3, 5), (5, 7), (11, 13), (17, 19), (29, 31). Notice
that 5 occurs in two pairs, and is the only instance of a prime occurring in two different
twin prime pairs. This is because there can be no other instances of “triple” primes aside
from 3, 5, 7: among any three consecutive odd numbers one is divisible by 3, and in every
other such triple that number exceeds 3 and so is composite. Therefore, no other prime is repeated in the
twin prime pairs. It is still only a conjecture that there are infinitely many twin prime
pairs. Most mathematicians believe that there are infinitely many, but a proof still remains
elusive.
In 1919, the Norwegian mathematician Viggo Brun (1885-1978) proved that, in stark
contrast to the divergence of the sum of prime reciprocals,
\[
\sum_{p \text{ a twin prime}} \frac{1}{p} < \infty. \tag{6.2.1}
\]
The proof of Brun’s Theorem is historically important because it introduced powerful
sieving techniques for estimating the size of a sifted set of integers (recall the Sieve of
Eratosthenes). It takes a lot of work to develop Brun’s sieve, but the key result that is
used to prove (6.2.1) is that
\[
\pi_2(N) = O\left(\frac{N}{(\log N)^2} (\log\log N)^2\right),
\]
where $\pi_2(N)$ is the number of primes p ≤ N for which p + 2 is also prime (the number of
twin prime pairs with first number in the pair less than or equal to N). There is no known
easy proof of this result; perhaps the most accessible is a proof by Edmund Landau
(1877-1938) who proved this result by a detailed analysis of the properties of certain
alternating binomial coefficients [12]. Brun’s theorem at least tells us that there are not too
many twin prime pairs.
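
To make π_2(N) and the sum in (6.2.1) concrete, here is a small Python sketch. The function names are hypothetical, and nothing here comes from Brun’s or Landau’s arguments: it simply lists the twin prime pairs (p, p + 2) with p ≤ N by sieving, sums the reciprocals of the distinct primes involved, and evaluates the expression N (log log N)² / (log N)² that appears inside the O-bound, without its implied constant.

```python
import math


def twin_prime_pairs(n):
    """Return all pairs (p, p + 2) of primes with p <= n."""
    limit = n + 2  # p + 2 may exceed n, so sieve slightly further
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [(p, p + 2) for p in range(3, n + 1) if is_prime[p] and is_prime[p + 2]]


def twin_reciprocal_sum(n):
    """Sum of 1/q over the distinct primes q in some pair (p, p + 2) with p <= n."""
    members = {q for pair in twin_prime_pairs(n) for q in pair}  # 5 is counted once
    return math.fsum(1.0 / q for q in members)


def brun_bound_shape(n):
    """N (log log N)^2 / (log N)^2, the shape of the bound without its constant."""
    return n * math.log(math.log(n)) ** 2 / math.log(n) ** 2


if __name__ == "__main__":
    print(twin_prime_pairs(31))
    for n in (10**3, 10**4, 10**5):
        print(n, len(twin_prime_pairs(n)), brun_bound_shape(n), twin_reciprocal_sum(n))
```

The partial sums grow very slowly, which is consistent with convergence but cannot establish it; that is exactly why the sieve estimate for π_2(N) is needed.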
References
[1] Bays, C. and Hudson, R., (1999). A new bound for the smallest x with π(x) > li(x).
Mathematics of Computation vol. 69, 231: pp. 1285-1296.
[2] Bell, E.T., (1937). Men of Mathematics. Simon and Schuster Inc., New York, New
York.
[3] Brent, R.P., (1977). Computation of the regular continued fraction for Euler’s
constant. Mathematics of Computation vol. 31: pp. 771-777.
[4] Chandrasekharan, K., (1968). Introduction to Analytic Number Theory. Berlin
Heidelberg New York: Springer-Verlag.
[5] Cojocaru, A. C, and Murty, M.R. (2006). An Introduction to Sieve Methods and their
Applications. Cambridge University Press, New York, New York.
[6] Eynden, C.V., (1980, May). Proofs that $\sum 1/p$ diverges. The American Mathematical
Monthly vol. 87, No. 5: pp. 394-397.
[7] Goldstein, L.J., (1973). A history of the prime number theorem. American
Mathematical Monthly 80 (June-July): pp. 599-615.
[8] Goldston, D.A., Are there infinitely many twin primes? Article.
[9] H. Halberstam and H.-E. Richert (1974), Sieve Methods. Academic Press, New York.
[10] Hardy, G.H., and Wright, E.M., (1962). The Theory of Numbers, fourth edition.
Oxford University Press, Amen House, London.
[11] Hungerford, T., (1974). Algebra. Springer Science+Business Media, LLC, New York,
New York.
[12] Landau, E., (1958). Elementary Number Theory. Chelsea Publishing Company, New
York, N.Y.
[13] Lindqvist, P. and Peetre, J. (1997). On the remainder in a series of Mertens.
Expositiones Mathematicae 15, 467-477.
[14] Niven, I., (1971, May). A proof of the divergence of $\sum 1/p$. The American Mathematical
Monthly vol. 78, No. 3: pp. 272-273.
[15] Rosen, K.H., (2000). Elementary Number Theory, 4th edition. AT&T Laboratories
and Kenneth Rosen.
[16] Sandifer, E., (2006, March). How Euler did it. Mathematical Association of America
Online. June 27, 2008.
http://fermatslasttheorem.blogspot.com/2006/08/euler-product-formula.html
[17] Sautoy, M.D., (2003). Music of the Primes, 1st edition. Harper Collins Publishers Inc.,
New York, New York.
[18] The Great Internet Mersenne Prime Search. July 16, 2008 at
http://www.mersenne.org/prime.htm
[19] Titchmarsh, E.C., (1939). The Theory of Functions, second edition. Oxford University
Press, Amen House, London.
[20] Whittaker, E. T., and Watson, G. N., (1963). A Course in Modern Analysis, fourth
edition. Cambridge University Press, New York, New York.
[21] Wolframscience.com. November 18th, 2008 at
http://www.wolframscience.com/nksonline/page-908b-text?firstview=1