
Multiplicative number theory

Andrew Granville
Université de Montréal

K. Soundararajan
Stanford University

2011

The Snowbird version


Please do not circulate

This is an early draft of the book we are writing on the subject.


You will find that the first eight or so chapters are written reasonably carefully.
After that, well, it's varied.
Hopefully the ideas are understandable,
even if the writing is not as clean as it might be.
This edition of the notes includes open problems;
you are invited to solve some of them!
We hope that the theory will be a lot cleaner by the time
that the book comes to print, thanks to the
efforts of some of the participants here at Snowbird.

If you do solve some of the questions here, you are welcome to go
ahead and publish your results, of course. However, please do not
reference this version of the book, since it will soon be outdated.
We can help you with references if need be.

PREFACE
Riemann's seminal 1860 memoir showed how questions on the distribution of prime numbers are more-or-less equivalent to questions on the distribution of zeros of the Riemann zeta function. This was the starting point for the beautiful theory which is at the heart of analytic number theory. Heretofore there has been no other coherent approach that was capable of addressing all of the central issues of analytic number theory.

In this book we present the pretentious view of analytic number theory, allowing us to recover the basic results of prime number theory without use of zeros of the Riemann zeta-function and related L-functions, and to improve various results in the literature. This approach is certainly more flexible than the classical approach since it allows one to work on many questions for which L-function methods are not suited. However there is no beautiful explicit formula that promises to obtain the strongest believable results (which is the sort of thing one obtains from the Riemann zeta-function). So why pretentious?

It is an intellectual challenge to see how much of the classical theory one can reprove without recourse to the more subtle L-function methodology. (For a long time, top experts had believed that it is impossible to prove the prime number theorem without an analysis of zeros of analytic continuations. Selberg and Erdős refuted this prejudice but until now, such methods had seemed ad hoc, rather than part of a coherent theory.)

Selberg showed how sieve bounds can be obtained by optimizing values over a wide class of combinatorial objects, making them a very flexible tool. Pretentious methods allow us to introduce analogous flexibility into many problems where the issue is not the properties of a very specific function, but rather of a broad class of functions.

This flexibility allows us to go further in many problems than classical methods alone, as we shall see in the latter chapters of this book.

The Riemann zeta-function $\zeta(s)$ is defined when $\mathrm{Re}(s) > 1$; and then it is given a value for each $s \in \mathbb{C}$ by the theory of analytic continuation. Riemann pointed to the study of the zeros of $\zeta(s)$ on the line where $\mathrm{Re}(s) = 1/2$. However we have few methods that truly allow us to say much so far away from the original domain of definition. Indeed almost all of the unconditional results in the literature are about understanding zeros with $\mathrm{Re}(s)$ very close to 1. Usually the methods used to do so can be viewed as an extrapolation of our strong understanding of $\zeta(s)$ when $\mathrm{Re}(s) > 1$. This suggests that, in proving these results, one can perhaps dispense with an analysis of the values of $\zeta(s)$ with $\mathrm{Re}(s) \le 1$, which is, in effect, what we do.

Our original goal in the first part of this book was to recover all the main results of Davenport's Multiplicative Number Theory [21] by pretentious methods, and then to prove as much as possible of the results of the classical literature, such as the results in [7]. It turns out that pretentious methods yield a much easier proof of Linnik's Theorem, and quantitatively yield much the same quality of results throughout the subject.

However Siegel's Theorem, giving a lower bound on $|L(1, \chi)|$, is one result that we have little hope of addressing without considering zeros of L-functions. The difficulty is that all proofs of his lower bound run as follows: Either the Generalized Riemann Hypothesis (GRH) is true, in which case we have a good lower bound, or the GRH is false, in which case we have a lower bound in terms of the first counterexample to GRH. Classically this explains the inexplicit constants in analytic number theory (evidently Siegel's lower bound cannot be made explicit unless another proof is found, or GRH is resolved) and, without a fundamentally different proof, we have little hope of avoiding zeros. Instead we give a proof, due to Pintz, that is formulated in terms of multiplicative functions and a putative zero.

Although this is the first coherent account of this theory, our work rests on ideas that have been around for some time, and the contributions of many authors. The central role in our development belongs to Halász's Theorem. Much is based on the results and perspectives of Paul Erdős and Atle Selberg. Other early authors include Wirsing, Halász, Daboussi and Delange. More recent influential authors include Elliott, Hall, Hildebrand, Iwaniec, Montgomery and Vaughan, Pintz, and Tenenbaum. In addition, Tenenbaum's book [101] gives beautiful insight into multiplicative functions, often from a classical perspective.

Our own thinking has developed in part thanks to conversations with our collaborators John Friedlander, Régis de la Bretèche and Antal Balog. We are particularly grateful to Dimitris Koukoulopoulos who has been working with us while we have worked on this book, and proved several results that we needed, when we needed them!

CONTENTS

1 The prime number theorem
  1.1 Partial Summation
  1.2 Chebyshev's elementary estimates
  1.3 Multiplicative functions and Dirichlet series
  1.4 The average value of the divisor function and Dirichlet's hyperbola method
  1.5 The prime number theorem and the Möbius function: proof of Theorem 1.10
  1.6 Selberg's formula

2 First results on multiplicative functions
  2.1 A heuristic
  2.2 Multiplicative functions close to 1
  2.3 Non-negative multiplicative functions

3 Integers without large prime factors
  3.1 Smooth or friable numbers
  3.2 Multiplicative functions which only vary at small prime factors

4 Distances and the Theorems of Delange, Wirsing and Halász
  4.1 The distance between two multiplicative functions
  4.2 Delange's Theorem
  4.3 A key example: the multiplicative function $f(n) = n^{i\alpha}$
  4.4 Halász's theorem

5 Proof of Delange's Theorem

6 Deducing the prime number theorem from Halász's theorem
  6.1 Real valued multiplicative functions: Deducing Wirsing's theorem
  6.2 Deducing the prime number theorem

7 Selberg's sieve and the Brun-Titchmarsh theorem
  7.1 The Brun-Titchmarsh theorem
  7.2 An alternative lower bound for a key distance

8 Halász's Theorem
  8.1 Averages of averages
  8.2 Applications of the Plancherel formula
  8.3 The key estimate
  8.4 Proof of Halász's theorem
  8.5 The logarithmic mean

9 Multiplicative functions
  9.1 Upper bounds by averaging further
  9.2 Convolutions of Sums
  9.3 A first Structure Theorem
  9.4 Bounding the tail of a sum
  9.5 Elementary proofs of the prime number theorem

10 Dirichlet Characters

11 Zeta Functions and Dirichlet series: A minimalist discussion
  11.1 Dirichlet characters and Dirichlet L-functions
  11.2 Dirichlet series just to the right of the 1-line

12 Halász's Theorem: Inverses and Hybrids
  12.1 Lower Bounds on mean values
  12.2 Tenenbaum (Selberg)

13 Distribution of values of a multiplicative function

14 Lipschitz bounds
  14.1 Consequences
  14.2 Truncated Dirichlet series

15 The structure theorem
  15.1 Best possible

16 The large sieve
  16.1 Prime moduli
  16.2 Other things to perhaps include on the large sieve

17 The Small Sieve
  17.1 List of sieving results used
  17.2 Shiu's Theorem
  17.3 Consequences

18 The Pretentious Large Sieve
  18.1 Mean values of multiplicative functions, on average

19 Multiplicative functions in arithmetic progressions

20 Primes in arithmetic progression

21 Linnik's Theorem

22 Binary Quadratic Forms
  22.1 The basic theory
  22.2 Prime values
  22.3 Finishing the proof of Linnik's Theorem

23 Exponential Sums
  23.1 Technical Lemmas
  23.2 The bound of Montgomery and Vaughan
  23.3 How good is this bound?
  23.4 When f is pretentious

24 The exponents k
  24.1 How to determine a better upper bound on k in general

25 Lower bounds on $L(1, \chi)$, and zeros; the work of Pintz
  25.1 Siegel's Theorem

26 The Siegel-Walfisz Theorem
  26.1 Primes well distributed implies...
  26.2 Main results
  26.3 Technical results
    26.3.1 Preliminaries
    26.3.2 Proofs

27 Primes in progressions, on average
  27.1 The Barban-Davenport-Halberstam-Montgomery-Hooley Theorem
  27.2 The Bombieri-Vinogradov Theorem

28 Integral Delay equations
  28.1 Remarks on (28.1)
  28.2 Inclusion-Exclusion inequalities
  28.3 A converse Theorem
  28.4 An example for Halász's Theorem

29 Laplace Transforms

30 The Spectrum
  30.1 The Mean Value Spectrum
  30.2 Factoring mean values
  30.3 The Structure of the Mean Value Spectrum
  30.4 The Euler product spectrum
  30.5 The Delay Equation Spectrum

31 Results on spectra
  31.1 The spectrum for real-valued multiplicative functions
  31.2 The number of mth power residues up to x
  31.3 An important example
  31.4 Open questions of interest

32 The number of unsieved integers up to x
  32.1 Reformulation in terms of integral equations
  32.2 An open problem or two
  32.3 Upper bounds for G(w) and Lipschitz estimates
  32.4 An improved upper bound: Proof of Theorem 2

33 The logarithmic spectrum
  33.1 Results for logarithmic means
  33.2 Bounding $\Gamma_0(S)$
  33.3 Negative truncations
  33.4 Convergence
  33.5 Upper bounds revisited

34 The Pólya-Vinogradov Inequality
  34.1 A lower bound on distances
  34.2 Using the Pretentious Generalized Riemann Hypothesis

References

1
THE PRIME NUMBER THEOREM
As a boy Gauss determined that the density of primes around $x$ is $1/\log x$, leading him to conjecture that the number of primes up to $x$ is well-approximated by the estimate
\[ \pi(x) := \sum_{p \le x} 1 \sim \frac{x}{\log x}. \tag{1.1} \]
It may seem less intuitive, but in fact it is simpler to weight each prime with $\log p$; and, as we have seen, it is natural to throw the prime powers into this sum, which has little impact on the size. Thus we define the von Mangoldt function
\[ \Lambda(n) := \begin{cases} \log p & \text{if } n = p^m, \text{ where } p \text{ is prime, and } m \ge 1, \\ 0 & \text{otherwise,} \end{cases} \tag{1.2} \]
and then, in place of (1.1), we conjecture that
\[ \psi(x) := \sum_{n \le x} \Lambda(n) \sim x. \tag{1.3} \]
The equivalent estimates (1.1) and (1.3), known as the prime number theorem, are difficult to prove. In this chapter we show how the prime number theorem is equivalent to understanding the mean value of the Möbius function. This will motivate our study of multiplicative functions in general, and provide new ways of looking at many of the classical questions in analytic number theory.
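The estimates (1.1) and (1.3) are easy to examine numerically. The following is a minimal sketch, not part of the original text: the helper names `primes_upto`, `prime_pi` and `psi` are our own, and the sieve is the simplest one that works for small ranges. It illustrates that $\psi(x)/x$ settles near 1 much faster than $\pi(x)/(x/\log x)$ does.

```python
import math

def primes_upto(n):
    """Return the list of primes <= n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            for m in range(i * i, n + 1, i):
                is_prime[m] = 0
    return [i for i in range(2, n + 1) if is_prime[i]]

def prime_pi(x):
    """pi(x): the number of primes p <= x, as in (1.1)."""
    return len(primes_upto(x))

def psi(x):
    """psi(x): the sum of log p over all prime powers p^m <= x, as in (1.3)."""
    total = 0.0
    for p in primes_upto(x):
        pm = p
        while pm <= x:
            total += math.log(p)
            pm *= p
    return total

if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        # Both ratios should tend to 1; the second converges visibly faster.
        print(x, prime_pi(x) / (x / math.log(x)), psi(x) / x)
```

Weighting by $\log p$ removes the slowly varying factor $1/\log x$, which is one reason $\psi(x)$ is the more convenient object.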
1.1 Partial Summation

We begin with a useful technique known as Abel's partial summation. Let $a_n$ be a sequence of complex numbers, and let $f : \mathbb{R} \to \mathbb{C}$ be some function. Set $S(t) = \sum_{k \le t} a_k$, and our goal is to understand
\[ \sum_{n=A+1}^{B} a_n f(n) \]
in terms of the partial sums $S(t)$. Let us first assume that $A < B$ are non-negative integers. Since $a_n = S(n) - S(n-1)$ we may write
\[ \sum_{n=A+1}^{B} a_n f(n) = \sum_{n=A+1}^{B} f(n)\big(S(n) - S(n-1)\big), \]
and with a little rearranging we obtain
\[ \sum_{n=A+1}^{B} a_n f(n) = S(B)f(B) - S(A)f(A) - \sum_{n=A}^{B-1} S(n)\big(f(n+1) - f(n)\big). \tag{1.4} \]

If now we suppose that $f$ is continuously differentiable on $[A, B]$ then we may write the above as
\[ \sum_{A < n \le B} a_n f(n) = S(B)f(B) - S(A)f(A) - \int_A^B S(t) f'(t)\,dt. \tag{1.5} \]
We leave to the reader to check that (1.5) continues to hold for all non-negative real numbers $A < B$. If we think of $\sum_{A<n\le B} a_n f(n)$ as the Riemann-Stieltjes integral $\int_{A^+}^{B^+} f(t)\,d(S(t))$ then (1.5) amounts to integration by parts.
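The discrete identity (1.4) can be checked directly on a computer. The sketch below is illustrative only; the function names are hypothetical and the test sequence $a_n = (-1)^n n$ with $f = \log$ is an arbitrary choice.

```python
import math

def partial_sums(a, B):
    """S[n] = a(1) + ... + a(n) for 0 <= n <= B, with S[0] = 0."""
    S = [0.0] * (B + 1)
    for n in range(1, B + 1):
        S[n] = S[n - 1] + a(n)
    return S

def direct_sum(a, f, A, B):
    """Left side of (1.4): the sum of a(n) f(n) over A < n <= B."""
    return sum(a(n) * f(n) for n in range(A + 1, B + 1))

def summed_by_parts(a, f, A, B):
    """Right side of (1.4), built only from the partial sums S(n)."""
    S = partial_sums(a, B)
    boundary = S[B] * f(B) - S[A] * f(A)
    differences = sum(S[n] * (f(n + 1) - f(n)) for n in range(A, B))
    return boundary - differences

if __name__ == "__main__":
    a = lambda n: (-1) ** n * n
    # The two printed values agree up to floating-point rounding.
    print(direct_sum(a, math.log, 3, 200), summed_by_parts(a, math.log, 3, 200))
```

Here $A \ge 1$ is used only so that $\log A$ is defined; the identity itself holds for any integers $0 \le A < B$ and any $f$ defined on that range.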
Exercise 1.1 Using partial summation show that (1.1) and (1.3) are equivalent, and that both are equivalent to
\[ \theta(x) = \sum_{p \le x} \log p = x + o(x). \tag{1.6} \]

Exercise 1.2 Using partial summation, prove that for any integer $N \ge 1$
\[ \sum_{n=1}^{N} \frac{1}{n} = \log N + 1 - \int_1^N \frac{\{t\}}{t^2}\,dt, \]
where throughout we write $[t]$ for the integer part of $t$, and $\{t\}$ for its fractional part (so that $t = [t] + \{t\}$). Deduce that for any real $x \ge 1$
\[ \sum_{n \le x} \frac{1}{n} = \log x + \gamma + O\Big(\frac{1}{x}\Big), \]
where $\gamma$ is the Euler-Mascheroni constant
\[ \gamma := \lim_{N \to \infty} \Big( \sum_{n=1}^{N} \frac{1}{n} - \log N \Big) = 1 - \int_1^\infty \frac{\{t\}}{t^2}\,dt. \]
Exercise 1.3 For an integer $N \ge 1$ show that
\[ \log N! = N \log N - N + 1 + \int_1^N \frac{\{t\}}{t}\,dt. \]
Using that $\int_1^x (\{t\} - 1/2)\,dt = (\{x\}^2 - \{x\})/2$ and integrating by parts, show that
\[ \int_1^N \frac{\{t\}}{t}\,dt = \frac{1}{2}\log N - \frac{1}{2}\int_1^N \frac{\{t\} - \{t\}^2}{t^2}\,dt. \]
Conclude that $N! \sim C\sqrt{N}\,(N/e)^N$. Here one also knows that
\[ C = \exp\Big( 1 - \frac{1}{2}\int_1^\infty \frac{\{t\} - \{t\}^2}{t^2}\,dt \Big) = \sqrt{2\pi}, \]
and the resulting asymptotic for $N!$ is known as Stirling's formula.
Recall that the Riemann zeta function is given by
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_p \Big(1 - \frac{1}{p^s}\Big)^{-1}. \]
Here the Dirichlet series and the Euler product both converge absolutely in the region $\mathrm{Re}(s) > 1$.

Exercise 1.4 Prove that for $\mathrm{Re}(s) > 1$
\[ \zeta(s) = s\int_1^\infty \frac{[y]}{y^{s+1}}\,dy = \frac{s}{s-1} - s\int_1^\infty \frac{\{y\}}{y^{s+1}}\,dy. \]
Observe that the right hand side above is an analytic function of $s$ in the region $\mathrm{Re}(s) > 0$ except for a simple pole at $s = 1$ with residue 1. Thus we have an analytic continuation of $\zeta(s)$ to this larger region, and near $s = 1$ we have the Laurent expansion
\[ \zeta(s) = \frac{1}{s-1} + \gamma + \ldots. \]
Adapting the argument in Exercise 1.3, obtain an analytic continuation of $\zeta(s)$ to the region $\mathrm{Re}(s) > -1$. Generalize.
1.2 Chebyshev's elementary estimates

Chebyshev made significant progress on the distribution of primes by showing that there are constants $0 < c < 1 < C$ with
\[ (c + o(1))\frac{x}{\log x} \le \pi(x) \le (C + o(1))\frac{x}{\log x}. \tag{1.7} \]
Moreover he showed that if
\[ \lim_{x \to \infty} \frac{\pi(x)}{x/\log x} \]
exists, then it must equal 1.

The key to obtaining such information is to write the prime factorization of $n$ in the form
\[ \log n = \sum_{d | n} \Lambda(d). \]
Summing both sides over $n$ (and re-writing $d|n$ as $n = dk$), we obtain that

\[ \sum_{n \le x} \log n = \sum_{n \le x} \sum_{n = dk} \Lambda(d) = \sum_{k=1}^{\infty} \psi(x/k). \tag{1.8} \]
Using Stirling's formula, Exercise 1.3, we deduce that
\[ \sum_{k=1}^{\infty} \psi(x/k) = x\log x - x + O(\log x). \tag{1.9} \]

Exercise 1.5 Deduce that
\[ \limsup_{x \to \infty} \frac{\psi(x)}{x} \ge 1 \ge \liminf_{x \to \infty} \frac{\psi(x)}{x}, \]
so that if $\lim_{x\to\infty} \psi(x)/x$ exists it must be 1.


To obtain Chebyshev's estimates (1.7), take (1.8), as evaluated in (1.9), at $2x$ and subtract twice that relation taken at $x$. This yields
\[ x\log 4 + O(\log x) = \psi(2x) - \psi(2x/2) + \psi(2x/3) - \psi(2x/4) + \ldots, \]
and upper and lower estimates for the right hand side above follow upon truncating the series after an odd or even number of steps, since the terms $\psi(2x/k)$ are non-increasing in $k$. In particular we obtain that
\[ \psi(2x) \ge x\log 4 + O(\log x), \]
which gives the lower bound of (1.7) with $c = \log 2$ a permissible value. And we also obtain that
\[ \psi(2x) - \psi(x) \le x\log 4 + O(\log x), \]
which, when used at $x/2, x/4, \ldots$ and summed, leads to $\psi(x) \le x\log 4 + O((\log x)^2)$. Thus we obtain the upper bound in (1.7) with $C = \log 4$ a permissible value.
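For a positive integer $x$ the alternating sum above can be evaluated exactly: by (1.8) it equals $\log (2x)! - 2\log x! = \log\binom{2x}{x}$, which is the classical route to Chebyshev's bounds. The following sketch, with helper names of our own choosing, checks this identity and the two truncation inequalities numerically.

```python
import math

def psi_table(N):
    """psi[m] = sum of Lambda(n) over n <= m, for 0 <= m <= N."""
    lam = [0.0] * (N + 1)
    is_comp = bytearray(N + 1)
    for p in range(2, N + 1):
        if not is_comp[p]:
            for m in range(2 * p, N + 1, p):
                is_comp[m] = 1
            pk = p
            while pk <= N:          # Lambda(p^k) = log p
                lam[pk] = math.log(p)
                pk *= p
    psi = [0.0] * (N + 1)
    for m in range(1, N + 1):
        psi[m] = psi[m - 1] + lam[m]
    return psi

def alternating_sum(x, psi):
    """psi(2x) - psi(2x/2) + psi(2x/3) - ... for a positive integer x."""
    # psi(y) depends only on the integer part of y, hence the floor division.
    return sum((-1) ** (k + 1) * psi[(2 * x) // k] for k in range(1, 2 * x + 1))

if __name__ == "__main__":
    x = 500
    psi = psi_table(2 * x)
    s = alternating_sum(x, psi)
    print(s, math.log(math.comb(2 * x, x)), x * math.log(4))
    print(psi[2 * x] - psi[x] <= s <= psi[2 * x])
```

The central binomial coefficient satisfies $4^x/(2x+1) \le \binom{2x}{x} \le 4^x$, which accounts for the main term $x \log 4$ and the error $O(\log x)$.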
Exercise 1.6 Using that $\psi(2x) - \psi(x) + \psi(2x/3) \ge x\log 4 + O(\log x)$, prove Bertrand's postulate that there is a prime between $N$ and $2N$.
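Returning to (1.8), we may recast it as
\[ \sum_{n \le x} \log n = \sum_{d \le x} \Lambda(d) \sum_{k \le x/d} 1 = \sum_{d \le x} \Lambda(d)\Big(\frac{x}{d} + O(1)\Big). \]
Using Stirling's formula, and the recently established $\psi(x) = O(x)$, we conclude that
\[ x\log x + O(x) = x\sum_{d \le x} \frac{\Lambda(d)}{d}, \]
or in other words
\[ \sum_{n \le x} \frac{\Lambda(n)}{n} = \sum_{p \le x} \frac{\log p}{p} + O(1) = \log x + O(1). \tag{1.10} \]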
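The estimate (1.10) says that $\sum_{n \le x} \Lambda(n)/n - \log x$ stays bounded. A quick numerical look, sketched below with a hypothetical helper `mangoldt_table`, shows how tame this difference is in practice.

```python
import math

def mangoldt_table(N):
    """lam[m] = log p if m is a power of the prime p, and 0 otherwise (0 <= m <= N)."""
    lam = [0.0] * (N + 1)
    is_comp = bytearray(N + 1)
    for p in range(2, N + 1):
        if not is_comp[p]:
            for m in range(2 * p, N + 1, p):
                is_comp[m] = 1
            pk = p
            while pk <= N:
                lam[pk] = math.log(p)
                pk *= p
    return lam

def weighted_difference(x):
    """Return sum_{n <= x} Lambda(n)/n - log x, which (1.10) asserts is O(1)."""
    lam = mangoldt_table(x)
    return sum(lam[n] / n for n in range(1, x + 1)) - math.log(x)

if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4, 10**5):
        print(x, weighted_difference(x))
```

The printed values do not merely stay bounded but appear to settle down; explaining why requires more than (1.10), which is the point of Exercise 1.7.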

Exercise 1.7 Show that (1.10) would follow from the prime number theorem and partial summation. Why does the prime number theorem not follow from (1.10) and partial summation? What stronger information on $\sum_{p \le x} \log p/p$ would yield the prime number theorem?

Exercise 1.8 Use (1.10) and partial summation to show that there is a constant $c$ such that
\[ \sum_{p \le x} \frac{1}{p} = \log\log x + c + O\Big(\frac{1}{\log x}\Big). \]
Deduce Mertens' Theorem, that there exists a constant $\gamma$ such that
\[ \prod_{p \le x} \Big(1 - \frac{1}{p}\Big) \sim \frac{e^{-\gamma}}{\log x}. \]
(In fact $\gamma$ is the Euler-Mascheroni constant. There does not seem to be a straightforward, intuitive proof known that it is indeed this constant.)
1.3 Multiplicative functions and Dirichlet series

The main objects of study in this book are multiplicative functions. These are functions $f : \mathbb{N} \to \mathbb{C}$ satisfying $f(mn) = f(m)f(n)$ for all coprime integers $m$ and $n$. If the relation $f(mn) = f(m)f(n)$ holds for all integers $m$ and $n$ we say that $f$ is completely multiplicative. If $n = \prod_j p_j^{\alpha_j}$ is the prime factorization of $n$, where the primes $p_j$ are distinct, then $f(n) = \prod_j f(p_j^{\alpha_j})$ for multiplicative functions $f$. Thus a multiplicative function is specified by its values at prime powers and a completely multiplicative function is specified by its values at primes.

A handy way to study multiplicative functions is through Dirichlet series. We let
\[ F(s) = \sum_{n=1}^{\infty} \frac{f(n)}{n^s} = \prod_p \Big(1 + \frac{f(p)}{p^s} + \frac{f(p^2)}{p^{2s}} + \ldots\Big). \]
The product over primes above is called an Euler product, and viewed formally the equality of the Dirichlet series and the Euler product above is a restatement of the unique factorization of integers into primes. If we suppose that the multiplicative function $f$ does not grow rapidly (for example, that $|f(n)| \ll n^A$ for some constant $A$) then the Dirichlet series and Euler product will converge absolutely in some half-plane with $\mathrm{Re}(s)$ suitably large.
Given any two functions $f$ and $g$ from $\mathbb{N} \to \mathbb{C}$ (not necessarily multiplicative), their Dirichlet convolution $f * g$ is defined by
\[ (f * g)(n) = \sum_{ab = n} f(a)g(b). \]
If $F(s) = \sum_{n=1}^{\infty} f(n)n^{-s}$ and $G(s) = \sum_{n=1}^{\infty} g(n)n^{-s}$ are the associated Dirichlet series, then the convolution $f * g$ corresponds to their product $F(s)G(s) = \sum_{n=1}^{\infty} (f * g)(n)n^{-s}$.

Here are some examples of the basic multiplicative functions and their associated Dirichlet series.

The function $\delta(1) = 1$ and $\delta(n) = 0$ for all $n \ge 2$ has the associated Dirichlet series 1.

The function $1(n) = 1$ for all $n \in \mathbb{N}$ has the associated Dirichlet series $\zeta(s)$ which converges absolutely when $\mathrm{Re}(s) > 1$, and whose analytic continuation we discussed in Exercise 1.4.

For a natural number $k$, the $k$-divisor function $d_k(n)$ counts the number of ways of writing $n$ as $a_1 \cdots a_k$. That is, $d_k$ is the $k$-fold convolution of the function $1(n)$, and its associated Dirichlet series is $\zeta(s)^k$. The function $d_2(n)$ is called the divisor function and denoted simply by $d(n)$. More generally, for any complex number $z$, the $z$-th divisor function $d_z(n)$ is defined as the $n$-th Dirichlet series coefficient of $\zeta(s)^z$.

The Möbius function $\mu(n)$ is defined to be 0 if $n$ is divisible by the square of some prime, and if $n$ is square-free $\mu(n)$ is $1$ or $-1$ depending on whether $n$ has an even or odd number of prime factors. The associated Dirichlet series is $\sum_{n=1}^{\infty} \mu(n)n^{-s} = \zeta(s)^{-1}$ so that $\mu$ is the same as $d_{-1}$.
The von Mangoldt function $\Lambda(n)$ is not multiplicative, but is of great interest to us. Its associated Dirichlet series is $-\zeta'/\zeta(s)$. The function $\log n$ has associated Dirichlet series $-\zeta'(s)$, and putting these facts together we see that
\[ \log n = (1 * \Lambda)(n) = \sum_{d | n} \Lambda(d), \quad \text{and} \quad \Lambda(n) = (\mu * \log)(n) = \sum_{ab = n} \mu(a)\log b. \tag{1.11} \]

Exercise 1.9 If $f$ and $g$ are functions from $\mathbb{N}$ to $\mathbb{C}$, show that the relation $f = 1 * g$ is equivalent to the relation $g = \mu * f$. This is known as Möbius inversion.
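Dirichlet convolution is simple to implement on tables of values, and doing so gives a concrete check of the identities $\mu * 1 = \delta$, $1 * 1 = d$ and (1.11). The code below is a minimal sketch; the names `dirichlet_convolve` and `mobius_table` are ours, and arithmetic functions are represented as lists indexed from 0 with the entry at index 0 unused.

```python
import math

def dirichlet_convolve(f, g, N):
    """Return h with h[n] = sum over ab = n of f[a] * g[b], for 1 <= n <= N."""
    h = [0] * (N + 1)
    for a in range(1, N + 1):
        if f[a] == 0:
            continue
        for b in range(1, N // a + 1):
            h[a * b] += f[a] * g[b]
    return h

def mobius_table(N):
    """mu[n] for 0 <= n <= N (mu[0] is a placeholder)."""
    mu = [1] * (N + 1)
    mu[0] = 0
    is_comp = bytearray(N + 1)
    for p in range(2, N + 1):
        if not is_comp[p]:
            for m in range(p, N + 1, p):
                if m > p:
                    is_comp[m] = 1
                mu[m] = -mu[m]          # one more prime factor flips the sign
            for m in range(p * p, N + 1, p * p):
                mu[m] = 0               # divisible by a square
    return mu

if __name__ == "__main__":
    N = 30
    one = [0] + [1] * N
    log = [0.0] + [math.log(n) for n in range(1, N + 1)]
    mu = mobius_table(N)
    print(dirichlet_convolve(mu, one, N)[1:8])    # delta: 1 followed by zeros
    print(dirichlet_convolve(one, one, N)[12])    # d(12)
    print(dirichlet_convolve(mu, log, N)[8], math.log(2))  # Lambda(8) = log 2
```

The double loop visits each pair $(a, b)$ with $ab \le N$ once, so the cost is about $N \log N$ operations.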
As mentioned earlier, our goal in this chapter is to show that the prime number theorem is equivalent to a statement about the mean value of the multiplicative function $\mu$. We now formulate this equivalence precisely.

Theorem 1.10 The prime number theorem, namely $\psi(x) = x + o(x)$, is equivalent to
\[ M(x) = \sum_{n \le x} \mu(n) = o(x). \tag{1.12} \]
Before we can prove this, we need one more ingredient: namely, we need to understand the average value of the divisor function.
1.4 The average value of the divisor function and Dirichlet's hyperbola method

We wish to evaluate asymptotically $\sum_{n \le x} d(n)$. An immediate idea gives
\[ \sum_{n \le x} d(n) = \sum_{n \le x} \sum_{d | n} 1 = \sum_{d \le x} \sum_{\substack{n \le x \\ d | n}} 1 = \sum_{d \le x} \Big[\frac{x}{d}\Big] = \sum_{d \le x} \Big(\frac{x}{d} + O(1)\Big) = x\log x + O(x). \]
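Dirichlet realized that one can substantially improve the error term above by pairing each divisor $a$ of an integer $n$ with its complementary divisor $b = n/a$; one minor exception is when $n = m^2$ and the divisor $m$ cannot be so paired. Since $a$ or $n/a$ must be $\le \sqrt{n}$ we have
\[ d(n) = \sum_{d | n} 1 = 2\sum_{\substack{d | n \\ d < \sqrt{n}}} 1 + \delta_n, \]
where $\delta_n = 1$ if $n$ is a square, and 0 otherwise. Therefore
\[ \sum_{n \le x} d(n) = 2\sum_{n \le x} \sum_{\substack{d | n \\ d < \sqrt{n}}} 1 + \sum_{\substack{n \le x \\ n = d^2}} 1 = \sum_{d \le \sqrt{x}} \Big(1 + 2\sum_{\substack{d^2 < n \le x \\ d | n}} 1\Big) = \sum_{d \le \sqrt{x}} \big(2[x/d] - 2d + 1\big), \]
and so
\[ \sum_{n \le x} d(n) = 2x\sum_{d \le \sqrt{x}} \frac{1}{d} - x + O(\sqrt{x}) = x\log x - x + 2\gamma x + O(\sqrt{x}), \tag{1.13} \]
by Exercise 1.2.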
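The hyperbola method is also a practical algorithm: it evaluates $\sum_{n \le x} d(n)$ exactly using about $\sqrt{x}$ operations rather than $x$. The sketch below (function names are ours) compares the naive count with the sum over $d \le \sqrt{x}$ of $2[x/d] - 2d + 1$, and with the main term in (1.13).

```python
import math

EULER_GAMMA = 0.5772156649015329  # numerical value of the Euler-Mascheroni constant

def divisor_sum_naive(x):
    """sum_{n <= x} d(n), computed as the sum over d <= x of [x/d]."""
    return sum(x // d for d in range(1, x + 1))

def divisor_sum_hyperbola(x):
    """The same sum via the hyperbola method: only d <= sqrt(x) is needed."""
    r = math.isqrt(x)
    return sum(2 * (x // d) - 2 * d + 1 for d in range(1, r + 1))

def main_term(x):
    """x log x + (2 gamma - 1) x, the main term of (1.13)."""
    return x * math.log(x) + (2 * EULER_GAMMA - 1) * x

if __name__ == "__main__":
    for x in (10**2, 10**4, 10**6):
        exact = divisor_sum_hyperbola(x)
        print(x, exact, (exact - main_term(x)) / math.sqrt(x))
```

The last printed column is the error normalized by $\sqrt{x}$; that it is small is consistent with the expectation, mentioned in the text, that the true error is much smaller than $\sqrt{x}$.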
The method described above is called the hyperbola method because we are trying to count the number of lattice points $(a, b)$ with $a$ and $b$ positive and lying on or below the hyperbola $ab = x$. Dirichlet's idea may be thought of as choosing parameters $A, B$ with $AB = x$, and dividing the points under the hyperbola according to whether $a \le A$ or $b \le B$ or both. We remark that an outstanding open problem, known as the Dirichlet divisor problem, is to show that the error term in (1.13) may be improved to $O(x^{1/4 + \epsilon})$.

For our subsequent work, we use Exercise 1.3 to recast (1.13) as
\[ \sum_{n \le x} \big(\log n + 2\gamma - d(n)\big) = O(\sqrt{x}). \tag{1.14} \]

Exercise 1.11 Given a natural number $k$, use the hyperbola method together with induction and partial summation to show that
\[ \sum_{n \le x} d_k(n) = xP_k(\log x) + O(x^{1 - 1/k + \epsilon}) \]
where $P_k(t)$ denotes a polynomial of degree $k - 1$ with leading term $t^{k-1}/(k-1)!$.
1.5 The prime number theorem and the Möbius function: proof of Theorem 1.10

First we show that the estimate $M(x) = \sum_{n \le x} \mu(n) = o(x)$ implies the prime number theorem $\psi(x) = x + o(x)$.

Define the arithmetic function $a(n) = \log n - d(n) + 2\gamma$, so that
\[ a(n) = (1 * (\Lambda - 1))(n) + 2\gamma. \]
When we convolve $a$ with the Möbius function we therefore obtain
\[ (\mu * a)(n) = (\mu * 1 * (\Lambda - 1))(n) + 2\gamma(\mu * 1)(n) = (\Lambda - 1)(n) + 2\gamma\delta(n), \]
where $\delta(1) = 1$, and $\delta(n) = 0$ for $n > 1$. Hence, when we sum $(\mu * a)(n)$ over all $n \le x$, we obtain
\[ \sum_{n \le x} (\mu * a)(n) = \sum_{n \le x} (\Lambda(n) - 1) + 2\gamma = \psi(x) - x + O(1). \]
On the other hand, we may write the left hand side above as
\[ \sum_{dk \le x} \mu(d)a(k), \]
and, as in the hyperbola method, split into terms where $k \le K$ or $k > K$ (in which case $d \le x/K$). Thus we find that
\[ \sum_{dk \le x} \mu(d)a(k) = \sum_{k \le K} a(k)M(x/k) + \sum_{d \le x/K} \mu(d) \sum_{K < k \le x/d} a(k). \]
Using (1.14) we see that the second term above is
\[ = O\Big(\sum_{d \le x/K} \sqrt{x/d}\Big) = O(x/\sqrt{K}). \]
Putting everything together, we deduce that
\[ \psi(x) - x = \sum_{k \le K} a(k)M(x/k) + O(x/\sqrt{K}). \]
If we now know that $M(x) = o(x)$, then by letting $K$ tend to infinity very slowly with $x$, we may conclude that $\psi(x) - x = o(x)$, obtaining the prime number theorem.
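Taking $K = x$ in the argument above, the second sum is empty and the relation becomes exact: for a positive integer $x$ one has $\sum_{k \le x} a(k)M(x/k) = \psi(x) - x + 2\gamma$. This is our own restatement of the identity $\mu * a = (\Lambda - 1) + 2\gamma\delta$, and the sketch below (with hypothetical helper names) verifies it numerically.

```python
import math

EULER_GAMMA = 0.5772156649015329  # numerical value of the Euler-Mascheroni constant

def tables(N):
    """Return (mu, d, lam): Mobius, divisor and von Mangoldt values for 0 <= n <= N."""
    d = [0] * (N + 1)
    for k in range(1, N + 1):
        for m in range(k, N + 1, k):
            d[m] += 1
    mu = [1] * (N + 1)
    mu[0] = 0
    lam = [0.0] * (N + 1)
    for p in range(2, N + 1):
        if d[p] == 2:                       # p is prime
            for m in range(p, N + 1, p):
                mu[m] = -mu[m]
            for m in range(p * p, N + 1, p * p):
                mu[m] = 0
            pk = p
            while pk <= N:
                lam[pk] = math.log(p)
                pk *= p
    return mu, d, lam

def both_sides(x):
    """Return (sum_{k<=x} a(k) M(x/k), psi(x) - x + 2*gamma) for an integer x >= 1."""
    mu, d, lam = tables(x)
    M = [0] * (x + 1)
    for n in range(1, x + 1):
        M[n] = M[n - 1] + mu[n]
    left = sum((math.log(k) - d[k] + 2 * EULER_GAMMA) * M[x // k]
               for k in range(1, x + 1))
    right = sum(lam) - x + 2 * EULER_GAMMA
    return left, right

if __name__ == "__main__":
    print(both_sides(2000))
```

The point of the proof is that one need not take $K$ this large: truncating at a slowly growing $K$ costs only $O(x/\sqrt{K})$, by (1.14).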

Now we turn to the converse. We must show that the prime number theorem implies that $M(x) = o(x)$. Consider the arithmetic function $-\mu(n)\log n$ which is the $n$-th Dirichlet series coefficient of $(1/\zeta(s))'$. Since
\[ \Big(\frac{1}{\zeta(s)}\Big)' = -\frac{\zeta'(s)}{\zeta(s)^2} = -\frac{\zeta'}{\zeta}(s)\,\frac{1}{\zeta(s)}, \]
we obtain the identity $\mu(n)\log n = -(\mu * \Lambda)(n)$. Since $\mu * 1 = \delta$, we find that
\[ \sum_{n \le x} (\mu * (\Lambda - 1))(n) = -\sum_{n \le x} \mu(n)\log n - 1. \tag{1.15} \]
The right hand side of (1.15) is
\[ -\log x \sum_{n \le x} \mu(n) + \sum_{n \le x} \mu(n)\log(x/n) - 1 = -(\log x)M(x) + O\Big(\sum_{n \le x} \log(x/n)\Big) = -(\log x)M(x) + O(x), \]
upon using Exercise 1.3. The left hand side of (1.15) is
\[ \sum_{ab \le x} \mu(a)(\Lambda(b) - 1) = \sum_{a \le x} \mu(a)\big(\psi(x/a) - [x/a]\big). \]
We are assuming the prime number theorem, which means that given $\epsilon > 0$, if $t \ge T_\epsilon$ is large enough then $|\psi(t) - t| \le \epsilon t$. Using this for $a \le x/T_\epsilon$ (so that $x/a \ge T_\epsilon$) and the Chebyshev estimate $|\psi(x/a) - x/a| \ll x/a$ for $x/T_\epsilon \le a \le x$, we find that the left hand side of (1.15) is
\[ \ll \sum_{a \le x/T_\epsilon} \epsilon x/a + \sum_{x/T_\epsilon \le a \le x} x/a \ll \epsilon x\log x + x\log T_\epsilon. \]
Combining these observations, we find that
\[ |M(x)| \ll \epsilon x + x\frac{\log T_\epsilon}{\log x} \ll \epsilon x, \]
if $x$ is sufficiently large. Since $\epsilon$ was arbitrary, we have demonstrated that $M(x) = o(x)$.

Exercise 1.12 Modify the above proof to show that if $M(x) \ll x/(\log x)^A$ then $\psi(x) - x \ll x(\log\log x)^2/(\log x)^A$. And conversely, if $\psi(x) - x \ll x/(\log x)^A$ then $M(x) \ll x/(\log x)^{\min\{1, A\}}$.
1.6 Selberg's formula

The elementary techniques discussed above were brilliantly used by Selberg to get an asymptotic formula for a suitably weighted sum of primes and products of two primes. Selberg's identity then led Erdős and Selberg to discovering elementary proofs of the prime number theorem. We will not discuss the elementary proof of the prime number theorem here, but let us see how Selberg's identity follows from the ideas developed so far.

Theorem 1.13 We have
\[ \sum_{p \le x} (\log p)^2 + \sum_{pq \le x} (\log p)(\log q) = 2x\log x + O(x). \]

Proof We define $\Lambda_2(n) := \Lambda(n)\log n + \sum_{\ell m = n} \Lambda(\ell)\Lambda(m)$. Thus $\Lambda_2(n)$ is the $n$-th Dirichlet series coefficient of
\[ \Big(\frac{\zeta'}{\zeta}\Big)'(s) + \Big(\frac{\zeta'}{\zeta}(s)\Big)^2 = \frac{\zeta''(s)}{\zeta(s)}, \]
so that $\Lambda_2 = (\mu * (\log)^2)$.

Our previous work exploited that $\Lambda = (\mu * \log)$ and that the function $d(n) - 2\gamma$ had the same average value as $\log n$. Now we search for a divisor type function which has the same average as $(\log n)^2$.

By partial summation we find that
\[ \sum_{n \le x} (\log n)^2 = x(\log x)^2 - 2x\log x + 2x + O((\log x)^2). \]

Using Exercise 1.11 we may find constants $c_2$ and $c_1$ such that
\[ \sum_{n \le x} \big(2d_3(n) + c_2 d(n) + c_1\big) = x(\log x)^2 - 2x\log x + 2x + O(x^{2/3 + \epsilon}). \]
Set $b(n) = (\log n)^2 - 2d_3(n) - c_2 d(n) - c_1$ so that the above relations give
\[ \sum_{n \le x} b(n) = O(x^{2/3 + \epsilon}). \tag{1.16} \]
Now consider $(\mu * b)(n) = \Lambda_2(n) - 2d(n) - c_2 - c_1\delta(n)$, and summing this over all $n \le x$ we get that
\[ \sum_{n \le x} (\mu * b)(n) = \sum_{n \le x} \Lambda_2(n) - 2x\log x + O(x). \]
The left hand side above is
\[ \sum_{k \le x} \mu(k) \sum_{l \le x/k} b(l) \ll \sum_{k \le x} (x/k)^{2/3 + \epsilon} \ll x, \]
and we conclude that
\[ \sum_{n \le x} \Lambda_2(n) = 2x\log x + O(x). \]
The difference between the left hand side above and the left hand side of our desired identity comes only from terms involving proper prime powers, and is $\ll x$; and so our Theorem follows. □
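Selberg's formula can be observed numerically, although the convergence is slow because the error term $O(x)$ is only smaller than the main term by a factor $\log x$. The sketch below is illustrative; `selberg_sum` is a name of our choosing, and the sum over $pq \le x$ is taken over ordered pairs of primes, matching the convolution $\Lambda * \Lambda$ in the proof.

```python
import math

def selberg_sum(x):
    """sum_{p<=x} (log p)^2 + sum_{pq<=x} (log p)(log q) over ordered prime pairs; x >= 2."""
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(x) + 1):
        if is_prime[i]:
            for m in range(i * i, x + 1, i):
                is_prime[m] = 0
    # theta[m] = sum of log p over primes p <= m
    theta = [0.0] * (x + 1)
    for m in range(2, x + 1):
        theta[m] = theta[m - 1] + (math.log(m) if is_prime[m] else 0.0)
    total = 0.0
    for p in range(2, x + 1):
        if is_prime[p]:
            lp = math.log(p)
            # (log p)^2 plus log p times the sum of log q over primes q <= x/p
            total += lp * lp + lp * theta[x // p]
    return total

if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        s = selberg_sum(x)
        print(x, s / (2 * x * math.log(x)), (s - 2 * x * math.log(x)) / x)
```

The first printed ratio creeps towards 1 while the second column, the error normalized by $x$, stays bounded, as the theorem asserts.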

Exercise 1.14 Recast Selberg's identity in the form
\[ (\psi(x) - x)\log x = -\sum_{n \le x} \Lambda(n)\Big(\psi\Big(\frac{x}{n}\Big) - \frac{x}{n}\Big) + O(x), \]
using (1.10) if necessary. Deduce that $a + A = 0$ where
\[ a = \liminf_{x \to \infty} \frac{\psi(x) - x}{x}, \quad \text{and} \quad A = \limsup_{x \to \infty} \frac{\psi(x) - x}{x}. \]

2
FIRST RESULTS ON MULTIPLICATIVE FUNCTIONS
As we have just seen, understanding the mean value of the Möbius function leads to the prime number theorem. Motivated by this, we now begin our study of mean values of multiplicative functions in general. We begin by giving in this chapter some basic examples and developing some preliminary results in this direction.
2.1 A heuristic

In Section 1.4 we saw that a profitable way of studying the mean value of the $k$-divisor function is to write $d_k$ as the convolution $1 * d_{k-1}$. Given a multiplicative function $f$ let us write $f$ as $1 * g$ where $g$ is also multiplicative. Then
\[ \sum_{n \le x} f(n) = \sum_{n \le x} \sum_{d | n} g(d) = \sum_{d \le x} g(d)\Big[\frac{x}{d}\Big]. \]
Since $[z] = z + O(1)$ we have
\[ \sum_{n \le x} f(n) = x\sum_{d \le x} \frac{g(d)}{d} + O\Big(\sum_{d \le x} |g(d)|\Big). \tag{2.1} \]
In several situations, for example in the case of the $k$-divisor function treated earlier, the remainder term in (2.1) may be shown to be small. Omitting this term, and thinking of $\sum_{d \le x} g(d)/d$ as being approximated by $\prod_{p \le x} (1 + g(p)/p + g(p^2)/p^2 + \ldots)$ we arrive at the following heuristic:
\[ \sum_{n \le x} f(n) \approx x\mathcal{P}(f; x) \tag{2.2} \]
where
\[ \mathcal{P}(f; x) = \prod_{p \le x} \Big(1 + \frac{g(p)}{p} + \frac{g(p^2)}{p^2} + \ldots\Big) = \prod_{p \le x} \Big(1 - \frac{1}{p}\Big)\Big(1 + \frac{f(p)}{p} + \frac{f(p^2)}{p^2} + \ldots\Big). \tag{2.3} \]
Consider the heuristic (2.2) in the case of the $k$-divisor function. The heuristic predicts that
\[ \sum_{n \le x} d_k(n) \approx x\prod_{p \le x} \Big(1 - \frac{1}{p}\Big)^{-(k-1)} \sim x(e^{\gamma}\log x)^{k-1}, \]
which is off from the true asymptotic formula $\sim x(\log x)^{k-1}/(k-1)!$ only by a constant factor.
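For $k = 2$ the comparison can be made concrete: the heuristic predicts $x\prod_{p \le x}(1 - 1/p)^{-1}$, whereas the true size is about $x\log x$, so the ratio should be close to $e^{\gamma} \approx 1.78$. The following sketch, with a function name of our own, computes both quantities.

```python
import math

def heuristic_vs_truth(x):
    """Return (sum_{n<=x} d(n), x * prod_{p<=x} (1 - 1/p)^(-1)) for an integer x >= 2."""
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(x) + 1):
        if is_prime[i]:
            for m in range(i * i, x + 1, i):
                is_prime[m] = 0
    product = 1.0
    for p in range(2, x + 1):
        if is_prime[p]:
            product *= 1.0 / (1.0 - 1.0 / p)   # the Euler factor of P(d; x)
    truth = sum(x // d for d in range(1, x + 1))  # exact value of sum d(n)
    return truth, x * product

if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        truth, heuristic = heuristic_vs_truth(x)
        print(x, heuristic / truth)
```

The ratio approaches $e^{\gamma}$ only slowly, because the lower order term $(2\gamma - 1)x$ in (1.13) is not negligible against $x \log x$ for moderate $x$.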

One of our aims will be to obtain results that are uniform over the class of all multiplicative functions. Thus for example we could consider $x$ to be large and consider the multiplicative function $f$ with $f(p^k) = 0$ for $p \le \sqrt{x}$ and $f(p^k) = 1$ for $p > \sqrt{x}$. In this case, apart from $f(1) = 1$, we have $f(n) = 1$ if $n$ is a prime between $\sqrt{x}$ and $x$ and $f(n) = 0$ for other $n \le x$. Thus, the heuristic suggests that
\[ \sum_{n \le x} f(n) = 1 + \pi(x) - \pi(\sqrt{x}) \approx x\prod_{p \le \sqrt{x}} \Big(1 - \frac{1}{p}\Big) \sim x\frac{e^{-\gamma}}{\log\sqrt{x}} = \frac{2e^{-\gamma}x}{\log x}. \]
Again this is comparable to the prime number theorem, but the heuristic is off by the constant $2e^{-\gamma} \approx 1.1\ldots$. This discrepancy is significant in prime number theory, and has been exploited beautifully by many authors starting with the pioneering work of Maier.

In the case of the Möbius function, the heuristic suggests comparing
\[ M(x) = \sum_{n \le x} \mu(n) \quad \text{with} \quad x\prod_{p \le x} \Big(1 - \frac{1}{p}\Big)^2 \sim \frac{xe^{-2\gamma}}{(\log x)^2}, \]
but in fact $\sum_{n \le x} \mu(n)$ is much smaller. The best bound that we know unconditionally is that $\sum_{n \le x} \mu(n) \ll x\exp(-c(\log x)^{3/5 - \epsilon})$, but we expect that it is as small as $x^{1/2 + \epsilon}$; this is equivalent to the Riemann Hypothesis. In any event, the heuristic certainly suggests the prime number theorem that $M(x) = o(x)$.
2.2 Multiplicative functions close to 1

The heuristic (2.2) is accurate and easy to justify when the function $g$ is small in size, or in other words, when $f$ is close to 1. We give a sample such result which is already quite useful.

Proposition 2.1 Let $f = 1 * g$ be a multiplicative function, and suppose that $0 \le \sigma \le 1$ is such that
\[ \sum_{d=1}^{\infty} \frac{|g(d)|}{d^{\sigma}} = \widetilde{G}(\sigma) \]
is convergent. Then, with $\mathcal{P}(f) = \mathcal{P}(f; \infty)$,
\[ \Big|\sum_{n \le x} f(n) - x\mathcal{P}(f)\Big| \le x^{\sigma}\widetilde{G}(\sigma). \]

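Proof The argument giving (2.1) yields that
\[ \Big|\sum_{n \le x} f(n) - x\sum_{d \le x} \frac{g(d)}{d}\Big| \le \sum_{d \le x} |g(d)|. \]
Since $\mathcal{P}(f) = \sum_{d \ge 1} g(d)/d$ we have that
\[ \Big|\sum_{d \le x} \frac{g(d)}{d} - \mathcal{P}(f)\Big| \le \sum_{d > x} \frac{|g(d)|}{d}. \]
Combining these two inequalities yields that
\[ \Big|\sum_{n \le x} f(n) - x\mathcal{P}(f)\Big| \le \sum_{d \le x} |g(d)| + x\sum_{d > x} \frac{|g(d)|}{d}. \]
The result follows from the following observation, which holds for any sequence of non-negative real numbers: If $a_n \ge 0$ for all $n \ge 1$ then for any $\sigma$, $0 \le \sigma \le 1$, we have
\[ \sum_{n \le x} a_n + x\sum_{n > x} \frac{a_n}{n} \le \sum_{n \le x} a_n\Big(\frac{x}{n}\Big)^{\sigma} + x\sum_{n > x} \frac{a_n}{n}\Big(\frac{n}{x}\Big)^{1 - \sigma} = x^{\sigma}\sum_{n \ge 1} \frac{a_n}{n^{\sigma}}. \tag{2.4} \]
□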
Exercise 2.2 If $g$ is multiplicative, show that the convergence of $\sum_{n=1}^{\infty} |g(n)|/n$ is equivalent to the convergence of $\sum_{p^k} |g(p^k)|/p^k$.

Exercise 2.3 If $f$ is a non-negative arithmetic function, and $\sigma > 0$ is such that $F(\sigma) = \sum_{n=1}^{\infty} f(n)n^{-\sigma}$ is convergent, then $\sum_{n \le x} f(n) \le x^{\sigma}F(\sigma)$. This simple observation is known as Rankin's trick, and is sometimes surprisingly effective.

Remark 2.4 If we are bounding the sum of $f(n)$ for $n \le x$ then the values of $f(p^k)$ for $p > x$ are not used in determining the sum, yet the $F(\sigma)$ in the upper bound in the previous exercise implicitly uses those values. This suggests that in order to optimize our bound we may select these $f(p^k)$ to be as helpful as possible, typically taking $f(p^k) = 1$ for all $p > x$, so that $g(p^k) = 0$.
Exercise 2.5 For any natural number $q$, prove that for any $0 \le \sigma \le 1$
\[ \Big|\sum_{\substack{n \le x \\ (n, q) = 1}} 1 - \frac{\phi(q)}{q}x\Big| \le x^{\sigma}\prod_{p | q} \Big(1 + \frac{1}{p^{\sigma}}\Big). \]
If one takes $\sigma = 0$, we obtain the sieve of Eratosthenes bound of $2^{\omega(q)}$ (where $\omega(q)$ is the number of distinct primes dividing $q$) for the right side above. A little calculus shows that, if $\sum_{p | q} (\log p)/(p + 1) \le \log x$, the choice of $\sigma$ that optimizes our bound is given by the relation $\sum_{p | q} (\log p)/(p^{\sigma} + 1) = \log x$.
Exercise 2.6 Let $\sigma(n) = \sum_{d | n} d$. Prove that
\[ \sum_{n \le x} \frac{\mu(n)^2\sigma(n)}{\phi(n)} = \frac{15}{\pi^2}x + O(\sqrt{x}\log x). \]
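A numerical experiment makes the constant in Exercise 2.6 plausible before one proves it. The sketch below is not a solution to the exercise; it simply tabulates $\mu(n)^2\sigma(n)/\phi(n)$, using that this function is multiplicative with value $(p+1)/(p-1)$ at primes and 0 at higher prime powers. The function name is hypothetical.

```python
import math

def weighted_squarefree_sum(x):
    """sum over n <= x of mu(n)^2 sigma(n)/phi(n), for an integer x >= 1."""
    val = [1.0] * (x + 1)
    val[0] = 0.0
    is_comp = bytearray(x + 1)
    for p in range(2, x + 1):
        if not is_comp[p]:
            for m in range(2 * p, x + 1, p):
                is_comp[m] = 1
            factor = (p + 1) / (p - 1)      # sigma(p)/phi(p)
            for m in range(p, x + 1, p):
                val[m] *= factor
            for m in range(p * p, x + 1, p * p):
                val[m] = 0.0                # n is not squarefree
    return sum(val)

if __name__ == "__main__":
    c = 15 / math.pi ** 2
    for x in (10**3, 10**4, 10**5):
        s = weighted_squarefree_sum(x)
        print(x, s / x, c, (s - c * x) / math.sqrt(x))
```

The constant arises as $\mathcal{P}(f) = \prod_p (1 + p^{-2}) = \zeta(2)/\zeta(4) = 15/\pi^2$, in line with Proposition 2.1.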

Exercise 2.7 Let $f = 1 * g$ be a multiplicative function and suppose $\sigma \in [0, 1)$ is such that $\sum_d |g(d)|d^{-\sigma} = \widetilde{G}(\sigma) < \infty$. Prove that for $x \ge \exp(1/(1 - \sigma))$
\[ \sum_{n \le x} \frac{f(n)}{n} = \mathcal{P}(f)(\log x + \gamma) - \sum_{d=1}^{\infty} \frac{g(d)}{d}\log d + O\big(x^{\sigma - 1}\log x\,\widetilde{G}(\sigma)\big). \]

Exercise 2.8 Let $f$ be multiplicative and write $f = d_k * g$ where $k \in \mathbb{N}$ and $d_k$ denotes the $k$-divisor function. Assuming that $|g|$ is small, as in Proposition 2.1, develop an asymptotic formula for $\sum_{n \le x} f(n)$.
pr2.1

E2.3

Now we refine Proposition 2.1 and establish the heuristic (2.3) under a less
restrictive hypothesis.
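Proposition 2.9 Let $f = 1 * g$ and suppose that
$$\sum_{n=1}^{\infty}\frac{|g(n)|}{n} = \widetilde{G}(1)$$
is convergent. Then
$$\lim_{x\to\infty}\frac{1}{x}\sum_{n\le x} f(n) = \mathcal{P}(f) = \sum_{d=1}^{\infty}\frac{g(d)}{d}.$$
Proof Recall (2.1) which gives $\sum_{n\le x} f(n) = x\sum_{d\le x} g(d)/d + O\big(\sum_{d\le x}|g(d)|\big)$. Now
$$\Big|\mathcal{P}(f) - \sum_{n\le x}\frac{g(n)}{n}\Big| \le \sum_{n>x}\frac{|g(n)|}{n} \to 0 \quad\text{as } x\to\infty,$$
and
$$\sum_{n\le x}|g(n)| = \int_0^x \sum_{t<n\le x}\frac{|g(n)|}{n}\,dt = o(x),$$
as $\sum_{n=1}^{\infty}|g(n)|/n$ is convergent (the integrand is bounded by $\widetilde{G}(1)$ and tends to $0$ as $t\to\infty$), and the result follows.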
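As an illustration of Proposition 2.9, take $f = \mu^2$, the indicator of the squarefree integers. Then $f = 1 * g$ with $g(m^2) = \mu(m)$ and $g(n) = 0$ otherwise, so $\sum |g(n)|/n$ converges and $\mathcal{P}(f) = \sum_m \mu(m)/m^2 = 6/\pi^2$. The sketch below (function name chosen for this example) compares the empirical mean value with this limit.

```python
import math

def squarefree_count(x):
    """Number of squarefree integers n <= x, by sieving out multiples of d^2."""
    flags = [True] * (x + 1)
    d = 2
    while d * d <= x:
        for m in range(d * d, x + 1, d * d):
            flags[m] = False
        d += 1
    return sum(flags[1:])

if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        print(x, squarefree_count(x) / x, 6 / math.pi ** 2)
```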

2.3 Non-negative multiplicative functions


Let us now consider our heuristic for the special case of non-negative multiplicative functions with suitable growth conditions. Here we shall see that the right side of our heuristic (2.2) is at least a good upper bound for $\sum_{n\le x} f(n)$.

Proposition 2.10 Let $f$ be a non-negative multiplicative function, and suppose there are constants $A$ and $B$ such that
$$\sum_{p^k\le z} f(p^k)\log(p^k) \le Az + B,$$
for all $z\ge 1$. Then for $x \ge e^{2B}$ we have
$$\sum_{n\le x} f(n) \le \frac{(A+1)x}{\log x - B}\prod_{p\le x}\Big(1 + \frac{f(p)}{p} + \frac{f(p^2)}{p^2} + \cdots\Big).$$

Proof Consider
$$\sum_{n\le x} f(n)\log x = \sum_{n\le x} f(n)\log n + \sum_{n\le x} f(n)\log(x/n).$$
The first term satisfies
$$\sum_{n\le x} f(n)\log n = \sum_{n\le x}\sum_{\substack{n = p^k r \\ (p,r)=1}} f(r)f(p^k)\log(p^k) \le \sum_{r\le x} f(r)\sum_{p^k\le x/r} f(p^k)\log(p^k) \le \sum_{r\le x} f(r)\Big(\frac{Ax}{r} + B\Big).$$
Since $\log t \le t$ the second term is $\le x\sum_{n\le x} f(n)/n$. We conclude that
$$\sum_{n\le x} f(n) \le \frac{x}{\log x}(A+1)\sum_{n\le x}\frac{f(n)}{n} + \frac{B}{\log x}\sum_{n\le x} f(n),$$
and since $\sum_{n\le x} f(n)/n \le \prod_{p\le x}(1 + f(p)/p + f(p^2)/p^2 + \ldots)$, the Proposition follows. □

Note that, by Mertens' Theorem, the upper bound in Proposition 2.10 is $\ll (A + 1 + o(1))\,x\,\mathcal{P}(f; x)$.
In Proposition 2.10 we have in mind a non-negative multiplicative function dominated by some $k$-divisor function, and in such a situation we have shown that $\sum_{n\le x} f(n)$ is bounded above by a constant times the heuristic prediction $x\mathcal{P}(f; x)$. For a non-negative multiplicative function bounded by 1, Propositions 2.9 and 2.10 establish the heuristic (2.3) in the limit $x \to \infty$.
Corollary 2.11 If $0 \le f(n) \le 1$ is a non-negative multiplicative function then
$$\sum_{n\le x} f(n) \ll x\mathcal{P}(f;x) \ll x\exp\Big(-\sum_{p\le x}\frac{1 - f(p)}{p}\Big) \tag{2.5}$$
with an absolute implied constant. Moreover we have
$$\lim_{x\to\infty}\frac{1}{x}\sum_{n\le x} f(n) = \mathcal{P}(f).$$

Proof The Chebyshev estimates give that
$$\sum_{p^k\le z} f(p^k)\log(p^k) \le \sum_{p^k\le z}\log(p^k) \le Az + B$$
with any constant $A > \log 4$ being permissible. The estimate (2.5) therefore follows from Proposition 2.10.
If $\sum_p (1 - f(p))/p$ diverges, then (2.5) shows that
$$\lim_{x\to\infty}\frac{1}{x}\sum_{n\le x} f(n) = 0 = \mathcal{P}(f).$$
Suppose now that $\sum_p (1 - f(p))/p$ converges. If we write $f = 1 * g$ then this condition assures us that $\sum_{p^k} |g(p^k)|/p^k$ converges, which in turn is equivalent to the convergence of $\sum_n |g(n)|/n$. Proposition 2.9 now finishes our proof. □

We would love to have a uniform result like (2.5) for real valued multiplicative functions with $-1 \le f(n) \le 1$ (and more generally for complex valued multiplicative functions), since that would immediately imply the prime number theorem. Establishing such a result will be one of our goals in the coming chapters. In particular, one may ask if $\lim_{x\to\infty} \frac{1}{x}\sum_{n\le x} f(n)$ exists (and equals $\mathcal{P}(f)$) for more general classes of multiplicative functions. Erdős and Wintner conjectured that this is so for real valued multiplicative functions with $-1 \le f(n) \le 1$, and this was established by Wirsing, whose proof also establishes that $\sum_{n\le x}\mu(n) = o(x)$. The work of Halász, which we shall focus on soon, considers the more general case of complex valued multiplicative functions taking values in the unit disc.

3 INTEGERS WITHOUT LARGE PRIME FACTORS

3.1 Smooth or friable numbers

Given a real number $y \ge 2$, we let $S(y)$ denote the set of natural numbers all of whose prime factors are at most $y$. Such natural numbers are called "smooth" in the English literature, and "friable" (meaning crumbly) in the French literature; the latter usage seems to be spreading, at least partly because the word "smooth" is already overused. Smooth numbers appear all over analytic number theory in connections ranging from computational number theory and factoring algorithms to Waring's problem. Our interest is in the counting function of smooth numbers:
$$\Psi(x,y) := \sum_{\substack{n\le x \\ n\in S(y)}} 1.$$
We can formulate this as a question about multiplicative functions by considering the multiplicative function given by $f(p^k) = 1$ if $p \le y$, and $f(p^k) = 0$ otherwise.
If $x \le y$ then clearly $\Psi(x,y) = [x] = x + O(1)$. Next suppose that $y \le x \le y^2$. If $n \le x$ is not $y$-smooth then it must be divisible by a unique prime $p \in (y, x]$. Thus
$$\Psi(x,y) = [x] - \sum_{y<p\le x}\sum_{\substack{n\le x \\ p\mid n}} 1 = x + O(1) - \sum_{y<p\le x}\Big(\frac{x}{p} + O(1)\Big) = x\Big(1 - \log\frac{\log x}{\log y}\Big) + O\Big(\frac{x}{\log y}\Big).$$
The formula above suggests writing $x = y^u$, and then for $1 \le u \le 2$ it gives
$$\Psi(y^u, y) = y^u(1 - \log u) + O\Big(\frac{y^u}{\log y}\Big).$$
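For small parameters $\Psi(x,y)$ can simply be counted. The sketch below (function names are illustrative) builds a table of largest prime factors by a sieve and compares $\Psi(y^u,y)/y^u$ with $1-\log u$ in the range $1\le u\le 2$; the agreement is only up to the $O(1/\log y)$ error, which is not small for such modest $y$.

```python
import math

def largest_prime_factor_table(x):
    """lpf[n] = largest prime factor of n for 2 <= n <= x, with lpf[1] = 1."""
    lpf = [1] * (x + 1)
    for p in range(2, x + 1):
        if lpf[p] == 1:              # no smaller prime divides p, so p is prime
            for m in range(p, x + 1, p):
                lpf[m] = p           # primes are visited in increasing order
    return lpf

def psi(x, y):
    """Psi(x, y): the number of n <= x all of whose prime factors are <= y."""
    lpf = largest_prime_factor_table(x)
    return sum(1 for n in range(1, x + 1) if lpf[n] <= y)

if __name__ == "__main__":
    y = 100
    for u in (1.25, 1.5, 2.0):
        x = int(round(y ** u))
        print(u, psi(x, y) / x, 1 - math.log(u))
```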

We can continue the process begun above, using the principle of inclusion and exclusion to evaluate $\Psi(y^u, y)$ by subtracting from $[y^u]$ the number of integers which are divisible by a prime larger than $y$, adding back the contribution from integers divisible by two primes larger than $y$, and so on. A result of this type for small values of $u$ may be found in Ramanujan's unpublished manuscripts (collected in The Lost Notebook), but the first published uniform results on this problem are due to Dickman and de Bruijn. The answer involves the Dickman-de Bruijn function $\rho(u)$ defined as follows. For $0 \le u \le 1$ let $\rho(u) = 1$, and let $\rho(u) = 1 - \log u$ for $1 \le u \le 2$. For $u > 1$ we define $\rho$ by means of the differential-difference equation
$$u\rho'(u) = -\rho(u-1),$$
or, equivalently, the integral equation
$$u\rho(u) = \int_{u-1}^{u}\rho(t)\,dt.$$
It is easy to check that the differential-difference equation above has a unique continuous solution, and that $\rho(u)$ is non-negative and decreases rapidly to 0 as $u$ increases. For example, note that $\rho(u) \le \rho(u-1)/u$ and iterating this we find that $\rho(u) \le 1/[u]!$.
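The differential-difference equation lends itself to a simple numerical scheme. The sketch below is one possible discretization (the trapezoid rule on a uniform grid; the function name and step size are assumptions made for illustration, not the authors' method). It tabulates $\rho$ and can be compared with the closed form $1-\log u$ on $[1,2]$ and with the bound $\rho(u)\le 1/[u]!$.

```python
import math

def dickman_rho_table(u_max, steps_per_unit=1000):
    """Tabulate rho on a grid of step h = 1/steps_per_unit over [0, u_max].

    Uses rho(u) = 1 on [0, 1] and, for u >= 1, the trapezoid rule applied to
    rho(u + h) - rho(u) = -int_u^{u+h} rho(t - 1)/t dt.
    Returns (table, h) with table[i] approximating rho(i * h).
    """
    h = 1.0 / steps_per_unit
    n = int(round(u_max * steps_per_unit))
    rho = [1.0] * (n + 1)
    for i in range(steps_per_unit, n):
        u0, u1 = i * h, (i + 1) * h
        left = rho[i - steps_per_unit] / u0
        right = rho[i + 1 - steps_per_unit] / u1
        rho[i + 1] = rho[i] - 0.5 * h * (left + right)
    return rho, h

if __name__ == "__main__":
    table, h = dickman_rho_table(5)
    for u in (1.5, 2, 3, 4, 5):
        print(u, table[int(round(u / h))], 1 / math.factorial(int(u)))
```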
Theorem 3.1 Uniformly for all $u \ge 1$ we have
$$\Psi(y^u, y) = \rho(u)y^u + O\Big(\frac{y^u}{\log y} + 1\Big).$$

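Proof Let $x = y^u$, and we start with
$$\Psi(x,y)\log x = \sum_{\substack{n\le x \\ n\in S(y)}}\log n + O\Big(\sum_{n\le x}\log(x/n)\Big) = \sum_{\substack{n\le x \\ n\in S(y)}}\log n + O(x).$$
Using $\log n = \sum_{d\mid n}\Lambda(d)$ we have
$$\sum_{\substack{n\le x \\ n\in S(y)}}\log n = \sum_{\substack{d\le x \\ d\in S(y)}}\Lambda(d)\Psi(x/d, y) = \sum_{p\le y}(\log p)\Psi(x/p, y) + O(x),$$
since the contribution of prime powers $p^k$ (with $k \ge 2$) is easily seen to be $O(x)$. Thus
$$\Psi(x,y)\log x = \sum_{p\le y}\log p\;\Psi\Big(\frac{x}{p}, y\Big) + O(x). \tag{3.1}$$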

Now we show that a similar equation is satisfied by what we think approximates $\Psi(x,y)$, namely $x\rho(u)$. Put $E(t) = \sum_{p\le t}\frac{\log p}{p} - \log t$ so that $E(t) = O(1)$ by (1.10). Now
$$\sum_{p\le y}\frac{\log p}{p}\,\rho\Big(\frac{\log(x/p)}{\log y}\Big) = \int_1^y \rho\Big(u - \frac{\log t}{\log y}\Big)\,d(\log t + E(t)),$$
and making a change of variables $t = y^{\nu}$ we find that
$$\int_1^y \rho\Big(u - \frac{\log t}{\log y}\Big)\,d(\log t) = (\log y)\int_0^1\rho(u-\nu)\,d\nu = (\log x)\rho(u).$$

Moreover, since $E(t) \ll 1$ and $\rho$ is monotone decreasing, integration by parts gives
$$\int_1^y \rho\Big(u - \frac{\log t}{\log y}\Big)\,d(E(t)) \ll \rho(u-1) + \int_1^y\Big|\frac{d}{dt}\rho\Big(u - \frac{\log t}{\log y}\Big)\Big|\,dt \ll \rho(u-1).$$
Thus we find that
$$(x\rho(u))\log x = \sum_{p\le y}\log p\;\frac{x}{p}\,\rho\Big(\frac{\log(x/p)}{\log y}\Big) + O(x). \tag{3.2}$$
Subtracting (3.2) from (3.1) we arrive at
$$|\Psi(x,y) - x\rho(u)|\log x \le \sum_{p\le y}\log p\,\Big|\Psi\Big(\frac{x}{p}, y\Big) - \frac{x}{p}\,\rho\Big(\frac{\log(x/p)}{\log y}\Big)\Big| + Cx, \tag{3.3}$$
for a suitable constant $C$.


Now suppose that the Theorem has been established for all values until $x/2$, and we now wish to establish it for $x$. We may suppose that $x \ge y^2$, and our induction hypothesis is that for all $t \le x/2$ we have
$$\Big|\Psi(t,y) - t\rho\Big(\frac{\log t}{\log y}\Big)\Big| \le C_1\Big(\frac{t}{\log y} + 1\Big),$$
for a suitable constant $C_1$. From (3.3) we obtain that
$$|\Psi(x,y) - x\rho(u)|\log x \le C_1\sum_{p\le y}\log p\,\Big(\frac{x}{p\log y} + 1\Big) + Cx \le C_1 x + O\Big(\frac{x}{\log y} + y\Big) + Cx.$$
Assuming, as we may, that $C_1 \ge 2C$ and that $y$ is sufficiently large, the right hand side above is $\le 2C_1 x$, and since $\log x \ge 2\log y$ we conclude that $|\Psi(x,y) - x\rho(u)| \le C_1 x/\log y$. This completes our proof. □

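Exercise 3.2 Let
$$\zeta(s,y) = \prod_{p\le y}(1 - 1/p^s)^{-1} = \sum_{n\in S(y)} n^{-s},$$
be the Dirichlet series associated with the $y$-smooth numbers. For any real numbers $x \ge 1$ and $y \ge 2$, show that the function $x^{\sigma}\zeta(\sigma,y)$ for $\sigma\in(0,\infty)$ attains its minimum at $\sigma = \alpha = \alpha(x,y)$ satisfying
$$\log x = \sum_{p\le y}\frac{\log p}{p^{\alpha} - 1}.$$
By Rankin's trick (see Exercise 2.3) conclude that
$$\Psi(x,y) \le x^{\alpha}\zeta(\alpha,y) = \min_{\sigma>0} x^{\sigma}\zeta(\sigma,y).$$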

Exercise 3.3 For any given $\eta$, $\frac{1}{\log y} \ll \eta < 1$, show that
$$\sum_{p\le y}\frac{1}{p^{1-\eta}} \le \log(1/\eta) + O\Big(\frac{y^{\eta}}{\log(y^{\eta})}\Big).$$
(Hint: Compare the sum for the primes with $p^{\eta} \ll 1$ to the sum of $1/p$ in the same range. Use upper bounds on $\pi(x)$ for those primes for which $p^{\eta} \gg 1$.)

Exercise 3.4 For $x = y^u$ with $y > (\log x)^{2+\epsilon}$ let $\sigma = 1 - \frac{\log(u\log u)}{\log y}$. Deduce from the last two exercises that there exists a constant $C > 0$ such that
$$\Psi(x,y) \ll \Big(\frac{C}{u\log u}\Big)^{u} x\log y.$$

Exercise 3.5 Prove that
$$\rho(u) = \Big(\frac{1 + o(1)}{u\log u}\Big)^{u}.$$
(Hint: Select $c$ maximal such that $\rho(u) \gg (c/u\log u)^u$. By using the functional equation for $\rho$ deduce that $c \ge 1$. Take a similar approach for the implicit upper bound.)

3.2 Multiplicative functions which only vary at small prime factors

Proposition 3.6 Suppose that $f(p^k) = 1$ for all $p > y$. Let $x = y^u$. Then
$$\frac{1}{x}\sum_{n\le x} f(n) = \mathcal{P}(f;x) + O(1/u^{u/3}).$$

Can we get an estimate of $\mathcal{P}(f;x)\{1 + O(1/u^u)\}$, and so generalize the Fundamental Lemma of Sieve Theory? We begin with a simple case that follows from the Fundamental Lemma of Sieve Theory:

Lemma 3.7 Suppose that $g(p^k) = 0$ for all $p \le y$, and $g(p^k) = 1$ for all $p > y$. Let $x = y^u$. Then
$$\sum_{n\le x} g(n) = x\prod_{p\le y}\Big(1 - \frac{1}{p}\Big)\{1 + O(1/u^u)\}.$$

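Proof of Proposition 3.6 Define
$$g(p^k) = \begin{cases} 0 & \text{if } p \le y \\ 1 & \text{if } p > y \end{cases} \qquad\text{and}\qquad h(p^k) = \begin{cases} f(p^k) & \text{if } p \le y \\ 0 & \text{if } p > y \end{cases}$$
so that $f = g * h$. Hence if $AB = x$ then
$$\sum_{n\le x} f(n) = \sum_{a\le A} h(a)\sum_{b\le x/a} g(b) + \sum_{b\le B} g(b)\sum_{A<a\le x/b} h(a).$$
Let $A = B = \sqrt{x}$ and use Lemma 3.7 on the first sum to obtain
$$\sum_{a\le A} h(a)\,\frac{x}{a}\,\delta_y\,\{1 + O(1/(u/2)^{u/2})\},$$
where $\delta_y := \prod_{p\le y}\big(1 - \frac{1}{p}\big)$. Hence we have a main term of
$$\delta_y\, x\sum_{a\ge 1}\frac{h(a)}{a} = x\mathcal{P}(h;y) = x\mathcal{P}(f;x),$$
plus an error term of
$$\delta_y\, x\sum_{a>A}\frac{|h(a)|}{a} + (u/2)^{-u/2}\,\delta_y\, x\sum_{a\le A}\frac{|h(a)|}{a} \ll \frac{x}{\log y}\sum_{\substack{a>A \\ a\in S(y)}}\frac{1}{a} + (u/2)^{-u/2}x \ll (u/2)^{-u/2}x,$$
as $|h(a)| \le 1$, using our estimate on the tail of sums over smooth numbers. We bound the second sum above using our knowledge of smooths to obtain
$$\ll \sum_{b\le B} g(b)\,\frac{x}{b}\,(u/2)^{-u/2} \ll x(u/2)^{1-u/2}.$$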

4 DISTANCES AND THE THEOREMS OF DELANGE, WIRSING AND HALÁSZ

In Chapter 2 we considered the heuristic that the mean value of a multiplicative function $f$ might be approximated by the Euler product $\mathcal{P}(f;x)$ (see (2.2) and (2.3)). We proved some elementary results towards this heuristic and were most successful when $f$ was "close to 1" (see §2.2) or when $f$ was non-negative (see §2.3). Even for nice non-negative functions the heuristic is not entirely accurate, as revealed by the example of smooth numbers discussed in Chapter 3. We now continue our study of this heuristic, and focus on whether the mean value can be bounded above by something like $|\mathcal{P}(f;x)|$. We begin by making precise the geometric language, already employed in §2.2, of one multiplicative function being "close" to another.
4.1 The distance between two multiplicative functions

The notion of a distance between multiplicative functions makes most sense in the context of functions whose values are restricted to the unit disc $\mathbb{U} = \{|z| \le 1\}$. In thinking of the distance between two such multiplicative functions $f$ and $g$, naturally we may focus on the difference between $f(p^k)$ and $g(p^k)$ on prime powers. An obvious candidate for quantifying this distance is
$$\sum_{p^k\le x}\frac{|f(p^k) - g(p^k)|}{p^k},$$
and implicitly it is this distance which is used in Proposition 2.9 (and a stronger form of such a distance is used in Proposition 2.1). However, it turns out that a better notion of distance involves $1 - \mathrm{Re}\big(f(p^k)\overline{g(p^k)}\big)$ in place of $|f(p^k) - g(p^k)|$.
Lemma 4.1 Suppose we have a sequence of functions $\eta_j : \mathbb{U}\times\mathbb{U}\to\mathbb{R}_{\ge 0}$ satisfying the triangle inequality
$$\eta_j(z_1, z_3) \le \eta_j(z_1, z_2) + \eta_j(z_2, z_3),$$
for all $z_1, z_2, z_3 \in \mathbb{U}$. Then we may define a metric on $\mathbb{U}^{\mathbb{N}} = \{z = (z_1, z_2, \ldots)\}$ by setting
$$d(z,w) = \Big(\sum_{j=1}^{\infty}\eta_j(z_j, w_j)^2\Big)^{\frac12},$$
assuming that the sum converges. This metric satisfies the triangle inequality
$$d(z,w) \le d(z,y) + d(y,w).$$

Proof Expanding out we have
$$d(z,w)^2 = \sum_{j=1}^{\infty}\eta_j(z_j,w_j)^2 \le \sum_{j=1}^{\infty}\big(\eta_j(z_j,y_j) + \eta_j(y_j,w_j)\big)^2$$
by the assumed triangle inequality for $\eta_j$. Now, using Cauchy-Schwarz, we have
$$\sum_{j=1}^{\infty}\big(\eta_j(z_j,y_j) + \eta_j(y_j,w_j)\big)^2 = d(z,y)^2 + d(y,w)^2 + 2\sum_{j=1}^{\infty}\eta_j(z_j,y_j)\eta_j(y_j,w_j)$$
$$\le d(z,y)^2 + d(y,w)^2 + 2\Big(\sum_{j=1}^{\infty}\eta_j(z_j,y_j)^2\Big)^{\frac12}\Big(\sum_{j=1}^{\infty}\eta_j(y_j,w_j)^2\Big)^{\frac12} = \big(d(z,y) + d(y,w)\big)^2,$$
which proves the triangle inequality. □

A nice class of examples is provided by taking $\eta_j(z,w)^2 = a_j\big(1 - \mathrm{Re}(z\overline{w})\big)$ for non-negative $a_j$, and we now check that this satisfies the hypothesis of Lemma 4.1.

Lemma 4.2 Define $\eta : \mathbb{U}\times\mathbb{U}\to\mathbb{R}_{\ge 0}$ by $\eta(z,w)^2 = 1 - \mathrm{Re}(z\overline{w})$. Then for any $z_1, z_2, z_3$ in $\mathbb{U}$ we have
$$\eta(z_1, z_3) \le \eta(z_1, z_2) + \eta(z_2, z_3).$$
Proof Without loss of generality we may suppose that $z_1 = \rho_1$, $z_2 = \rho_2 e^{i\theta_2}$ and $z_3 = \rho_3 e^{i\theta_3}$ with $\rho_1, \rho_2, \rho_3 \in [0,1]$ and $\theta_2, \theta_3 \in (-\pi, \pi]$. Our claim is that
$$(1 - \rho_1\rho_3\cos\theta_3)^{\frac12} \le (1 - \rho_1\rho_2\cos\theta_2)^{\frac12} + (1 - \rho_2\rho_3\cos(\theta_2 - \theta_3))^{\frac12}. \tag{4.1}$$
Suppose first that $\cos\theta_2$ and $\cos(\theta_2 - \theta_3)$ have the same sign. If they are both negative then the RHS of (4.1) is clearly $\ge 2$ and our claim holds. If they are both positive, then for fixed $\rho_1$ and $\rho_3$ the RHS of (4.1) is minimum for $\rho_2 = 1$ and our claim is then that
$$(1 - \rho_1\rho_3\cos\theta_3)^{\frac12} \le (1 - \rho_1\cos\theta_2)^{\frac12} + (1 - \rho_3\cos(\theta_2 - \theta_3))^{\frac12}. \tag{4.2}$$
To establish this we square both sides, write $\cos\theta_3 = \cos\theta_2\cos(\theta_2 - \theta_3) + \sin\theta_2\sin(\theta_2 - \theta_3)$, and use the inequality $(1 - r\cos\theta) \ge \frac12 r^2\sin^2\theta$ (valid for all $0 \le r \le 1$).
So we may assume that $\cos\theta_2$ and $\cos(\theta_2 - \theta_3)$ have opposite signs, so that one of the two must have opposite sign from $\cos\theta_3$. Suppose $\cos\theta_3$ and $\cos\theta_2$ have opposite signs. If $\cos\theta_3 \ge 0 \ge \cos\theta_2$ then it suffices to check (4.1) in the case $\rho_1 = 0$ and clearly this holds. If $\cos\theta_2 \ge 0 \ge \cos\theta_3$ then it suffices to check (4.1) in the case when $\rho_1 = 1$ and this may be verified in the same manner as (4.2). □

We can use the above remarks to define distances between multiplicative functions taking values in the unit disc. Taking $a_j = 1/p$ for each prime $p \le x$ we may define a distance (up to $x$) of the multiplicative functions $f$ and $g$ by
$$\mathbb{D}(f,g;x)^2 = \sum_{p\le x}\frac{1 - \mathrm{Re}\, f(p)\overline{g(p)}}{p}.$$
By Lemma 4.1 this satisfies the triangle inequality
$$\mathbb{D}(f,g;x) + \mathbb{D}(g,h;x) \ge \mathbb{D}(f,h;x). \tag{4.3}$$

It is natural to multiply multiplicative functions together, and we may wonder: if $f_1$ and $g_1$ are close to each other, and $f_2$ and $g_2$ are close to each other, does it then follow that $f_1f_2$ is close to $g_1g_2$? Indeed this variant of the triangle inequality holds, and we leave its proof as an exercise to the reader:
$$\mathbb{D}(f_1,g_1;x) + \mathbb{D}(f_2,g_2;x) \ge \mathbb{D}(f_1f_2, g_1g_2; x). \tag{4.4}$$
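The distance $\mathbb{D}(f,g;x)$ only involves values at primes, so it is straightforward to compute. The sketch below (all names are illustrative; functions are represented by their values at primes) evaluates the distance and can be used to spot-check the two triangle inequalities (4.3) and (4.4) on examples such as $1$, $\mu$ and $p^{i\alpha}$.

```python
import cmath
import math

def primes_up_to(x):
    """List of primes <= x, by the sieve of Eratosthenes."""
    flags = [True] * (x + 1)
    flags[0:2] = [False, False]
    for p in range(2, math.isqrt(x) + 1):
        if flags[p]:
            for m in range(p * p, x + 1, p):
                flags[m] = False
    return [p for p in range(2, x + 1) if flags[p]]

def distance(f, g, x):
    """D(f, g; x) = ( sum_{p <= x} (1 - Re f(p) conj(g(p))) / p )^(1/2)."""
    total = 0.0
    for p in primes_up_to(x):
        total += (1 - (complex(f(p)) * complex(g(p)).conjugate()).real) / p
    return math.sqrt(max(total, 0.0))

def one(p):
    return 1

def mobius(p):
    return -1          # mu(p) = -1 at every prime

def twist(alpha):
    """The function p -> p^{i alpha}."""
    return lambda p: cmath.exp(1j * alpha * math.log(p))

def product(f, g):
    return lambda p: f(p) * g(p)

if __name__ == "__main__":
    x = 10**4
    print(distance(one, mobius, x) ** 2, distance(one, twist(1.0), x) ** 2)
```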

Alternatively, we can take any $\sigma > 1$ and take the coefficients $a_j = 1/p^{\sigma}$ and $z_j = f(p)$ as $p$ runs over all primes. In this case we have
$$\mathbb{D}_{\sigma}(f,g)^2 = \sum_{p}\frac{1 - \mathrm{Re}\, f(p)\overline{g(p)}}{p^{\sigma}},$$
which obeys the analogs of (4.3) and (4.4).


Lemma 4.3 For any multiplicative functions $f$ and $g$ taking values in the unit disc we have
$$\mathbb{D}(f,g;x)^2 = \mathbb{D}_{\sigma}(f,g)^2 + O(1)$$
with $\sigma = 1 + 1/\log x$. Furthermore, if $f$ is completely multiplicative and $F(s) = \sum_{n=1}^{\infty} f(n)/n^s$ is the Dirichlet series associated to $f$ we have
$$|F(1 + 1/\log x)| \asymp \zeta(1 + 1/\log x)\exp\big(-\mathbb{D}(1,f;x)^2\big) \asymp \log x\,\exp\big(-\mathbb{D}(1,f;x)^2\big).$$
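Proof With $\sigma = 1 + 1/\log x$ we have
$$|\mathbb{D}(f,g;x)^2 - \mathbb{D}_{\sigma}(f,g)^2| \le 2\sum_{p\le x}\Big(\frac{1}{p} - \frac{1}{p^{\sigma}}\Big) + 2\sum_{p>x}\frac{1}{p^{\sigma}} = O(1),$$
proving our first assertion. The second statement follows since $\log|F(\sigma)| = \mathrm{Re}\sum_p f(p)/p^{\sigma} + O(1)$. □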
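Taking $g(n) = n^{it}$ we obtain, for $x \ge 2$,
$$\Big|\exp\Big(\sum_{p\le x}\frac{f(p)}{p^{1+it}}\Big)\Big| \asymp \Big|\sum_{n\ge 1}\frac{f(n)}{n^{1+\frac{1}{\log x}+it}}\Big| = \Big|F\Big(1 + \frac{1}{\log x} + it\Big)\Big|. \tag{4.5}$$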

4.2 Delange's Theorem

Theorem 4.4 Let $f$ be a multiplicative function taking values in the unit disc $\mathbb{U}$. Suppose that
$$\mathbb{D}(1,f;\infty)^2 = \sum_{p}\frac{1 - \mathrm{Re}\, f(p)}{p} < \infty.$$
Then as $x\to\infty$ we have
$$\sum_{n\le x} f(n) \sim x\mathcal{P}(f;x).$$
Delange's theorem may be seen as a refinement of Proposition 2.9. There the hypothesis is essentially that $\sum_p |1 - f(p)|/p < \infty$, which is a stronger requirement than Delange's hypothesis. We warn the reader that the hypothesis of Delange's theorem does not guarantee that $\mathcal{P}(f;x)$ tends to a limiting value $\mathcal{P}(f)$ as $x\to\infty$; the reader may have fun coming up with examples. We postpone the proof of Delange's theorem to the next chapter.
4.3 A key example: the multiplicative function $f(n) = n^{i\alpha}$

Delange's theorem gives a satisfactory answer in the case of multiplicative functions at a bounded distance from 1, and we are left to ponder what happens when $\mathbb{D}(1,f;x)\to\infty$ as $x\to\infty$. One would be tempted to think that in this case $\frac{1}{x}\sum_{n\le x} f(n)\to 0$ as $x\to\infty$ were it not for the following important counterexample. Let $\alpha \ne 0$ be a fixed real number and consider the completely multiplicative function $f(n) = n^{i\alpha}$. By partial summation we find that
$$\sum_{n\le x} n^{i\alpha} = \int_{1^-}^{x^+} y^{i\alpha}\,d[y] \sim \frac{x^{1+i\alpha}}{1+i\alpha}. \tag{4.6}$$
The mean-value at $x$ then is $\sim x^{i\alpha}/(1+i\alpha)$ which has magnitude $1/|1+i\alpha|$ but whose argument varies with $x$. In this example it seems plausible enough that $\mathbb{D}(1,p^{i\alpha};x)\to\infty$ as $x\to\infty$ and we now supply a proof of this important fact. We begin with a useful Lemma on the Riemann zeta function.
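The behaviour in (4.6) is easy to observe numerically. The sketch below (function names are illustrative) computes the mean value of $n^{i\alpha}$ up to $x$ directly and compares it with the prediction $x^{i\alpha}/(1+i\alpha)$: the modulus settles near $1/|1+i\alpha|$ while the argument keeps rotating with $\log x$.

```python
import cmath
import math

def mean_value_n_i_alpha(x, alpha):
    """(1/x) * sum_{n <= x} n^{i alpha}, computed term by term."""
    total = sum(cmath.exp(1j * alpha * math.log(n)) for n in range(1, x + 1))
    return total / x

def predicted_mean(x, alpha):
    """The main term x^{i alpha} / (1 + i alpha) from partial summation."""
    return cmath.exp(1j * alpha * math.log(x)) / (1 + 1j * alpha)

if __name__ == "__main__":
    alpha = 1.0
    for x in (10**3, 10**4, 2 * 10**4):
        m = mean_value_n_i_alpha(x, alpha)
        print(x, abs(m), cmath.phase(m), cmath.phase(predicted_mean(x, alpha)))
```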
Lemma 4.5 If $s = \sigma + it$ with $\sigma > 1$ then
$$\frac{|s|}{|s-1|} - |s| \le |\zeta(s)| \le \frac{|s|}{|s-1|} + |s|.$$
If in addition we have $|s - 1| \gg 1$ then
$$|\zeta(s)| \ll \log(2 + |s|).$$

Proof The first assertion follows easily from Exercise 1.4. To prove the second assertion, modify the argument of Exercise 1.4 to show that for any integer $N \ge 1$ we have
$$\zeta(s) = \sum_{n=1}^{N}\frac{1}{n^s} + \frac{N^{1-s}}{s-1} - s\int_N^{\infty}\frac{\{y\}}{y^{s+1}}\,dy.$$
Choose $N = [|s|] + 1$, and bound the sum over $n$ trivially to deduce the stated bound for $|\zeta(s)|$. □
Lemma 4.6 Let $\alpha$ be any real number. Then for all $x \ge 3$ we have
$$\mathbb{D}(1,p^{i\alpha};x)^2 = \log(1 + |\alpha|\log x) + O(1),$$
in the case $|\alpha| \le 1/10$. When $|\alpha| \ge 1/10$ we have
$$\mathbb{D}(1,p^{i\alpha};x)^2 \ge \log\log x - \log\log(2 + |\alpha|) + O(1).$$
Proof We have from Lemma 4.3
$$\mathbb{D}(1,p^{i\alpha};x)^2 = \log\frac{\log x}{|\zeta(1 + 1/\log x + i\alpha)|} + O(1).$$
Now use the bounds of Lemma 4.5. □

We shall find Lemma 4.6 very useful in our work. One important consequence of it and the triangle inequality is that a multiplicative function cannot pretend to be like two different problem examples $n^{i\alpha}$ and $n^{i\beta}$.
Corollary 4.7 Let $\alpha$ and $\beta$ be two real numbers and let $f$ be a multiplicative function taking values in the unit disc. Then
$$\big(\mathbb{D}(f,p^{i\alpha};x) + \mathbb{D}(f,p^{i\beta};x)\big)^2$$
exceeds
$$\log(1 + |\alpha - \beta|\log x) + O(1)$$
in case $|\alpha - \beta| \le 1/10$, and in the case $|\alpha - \beta| \ge 1/10$ it exceeds
$$\log\log x - \log\log(2 + |\alpha - \beta|) + O(1).$$
Proof Indeed the triangle inequality gives that $\mathbb{D}(f,p^{i\alpha};x) + \mathbb{D}(f,p^{i\beta};x) \ge \mathbb{D}(p^{i\alpha},p^{i\beta};x) = \mathbb{D}(1,p^{i(\alpha-\beta)};x)$ and we may now invoke Lemma 4.6. □
The problem example $n^{i\alpha}$ discussed above takes on complex values, and one might wonder if there is a real valued multiplicative function $f$ taking values in $[-1,1]$ for which $\mathbb{D}(1,f;x)\to\infty$ as $x\to\infty$ but for which the mean value does not tend to zero. A lovely theorem of Wirsing, a precursor to the important theorem of Halász that we shall next discuss, establishes that this does not happen.

Theorem 4.8 Let $f$ be a real valued multiplicative function with $|f(n)| \le 1$ and $\mathbb{D}(1,f;x)\to\infty$ as $x\to\infty$. Then as $x\to\infty$
$$\frac{1}{x}\sum_{n\le x} f(n) \to 0.$$

Note that Wirsing's theorem applied to $\mu(n)$ immediately yields the prime number theorem. We shall not directly discuss this theorem; instead we shall deduce it as a consequence of Halász's theorem.
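For $f = \mu$ the conclusion of Wirsing's theorem says that $\frac1x\sum_{n\le x}\mu(n)\to 0$. The following sketch (function names are illustrative) tabulates $\mu$ with a sieve and prints the normalized partial sums, which are already small for modest $x$; this is of course only numerical evidence, not a proof.

```python
def mobius_table(x):
    """mu[n] for 0 <= n <= x (with mu[0] = 0), computed by a sieve over primes."""
    mu = [1] * (x + 1)
    mu[0] = 0
    is_prime = [True] * (x + 1)
    for p in range(2, x + 1):
        if is_prime[p]:
            for m in range(p, x + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one more prime factor flips the sign
            for m in range(p * p, x + 1, p * p):
                mu[m] = 0               # not squarefree
    return mu

def mertens(x):
    """M(x) = sum_{n <= x} mu(n)."""
    return sum(mobius_table(x)[1:])

if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4, 10**5):
        print(x, mertens(x) / x)
```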
4.4 Halász's theorem

We saw in the previous section that the function $f(n) = n^{i\alpha}$ has a large mean value even though $\mathbb{D}(1,f;x)\to\infty$ as $x\to\infty$. We may tweak such a function at a small number of primes and expect a similar result to hold. More precisely, one can ask if an analog of Delange's result holds: that is, if $f$ is multiplicative with $\mathbb{D}(f,p^{i\alpha};\infty) < \infty$ for some $\alpha$, can we understand the behavior of $\sum_{n\le x} f(n)$? This is the content of the first result of Halász.
Exercise 4.9 If $f$ is a multiplicative function with $|f(n)| \le 1$ show that there is at most one real number $\alpha$ with $\mathbb{D}(f,p^{i\alpha};\infty) < \infty$.

Theorem 4.10 Let $f$ be a multiplicative function with $|f(n)| \le 1$ and suppose there exists $\alpha\in\mathbb{R}$ such that $\mathbb{D}(f,p^{i\alpha};\infty) < \infty$. Write $f(n) = g(n)n^{i\alpha}$. Then as $x\to\infty$
$$\sum_{n\le x} f(n) = \frac{x^{1+i\alpha}}{1+i\alpha}\,\mathcal{P}(g;x) + o(x).$$

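Proof We show how Halász's first theorem may be deduced from Delange's Theorem 4.4. By partial summation we have
$$\sum_{n\le x} f(n) = \int_{1^-}^{x} t^{i\alpha}\,d\Big(\sum_{n\le t} g(n)\Big) = x^{i\alpha}\sum_{n\le x} g(n) - i\alpha\int_1^x t^{i\alpha-1}\sum_{n\le t} g(n)\,dt.$$
Now $\mathbb{D}(1,g;\infty) = \mathbb{D}(f,p^{i\alpha};\infty) < \infty$ and so by Delange's theorem, if $t$ is sufficiently large then
$$\sum_{n\le t} g(n) = t\mathcal{P}(g;t) + o(t).$$
Therefore
$$\sum_{n\le x} f(n) = x^{1+i\alpha}\mathcal{P}(g;x) - i\alpha\int_1^x t^{i\alpha}\mathcal{P}(g;t)\,dt + o(x).$$
Now note that $\mathcal{P}(g;t)$ is slowly varying: $\mathcal{P}(g;t) = \mathcal{P}(g;x) + O(\log(ex/t)/\log x)$ and our result follows. □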
Applying Theorem 4.10 with $f$ replaced by $f(n)/n^{i\alpha}$ we obtain the following:

Corollary 4.11 Let $f$ be a multiplicative function with $|f(n)| \le 1$ and suppose there exists $\alpha\in\mathbb{R}$ such that $\mathbb{D}(f,p^{i\alpha};\infty) < \infty$. Then as $x\to\infty$
$$\frac{1}{x}\sum_{n\le x} f(n) = \frac{x^{i\alpha}}{1+i\alpha}\cdot\frac{1}{x}\sum_{n\le x}\frac{f(n)}{n^{i\alpha}} + o(1).$$

This will be improved considerably in Theorem 12.1.
The next result of Halász is central to our book, and it deals with the case when $\mathbb{D}(f,p^{i\alpha};\infty) = \infty$ for all $\alpha$. In fact Halász's result is more precise and quantitative.

Theorem 4.12 Let $f$ be a multiplicative function with $|f(n)| \le 1$ for all $n$ and let $1 \le T \le (\log x)^{10}$ be a parameter. Let
$$M(x,T) = M_f(x,T) = \min_{|t|\le T}\mathbb{D}(f,p^{it};x)^2. \tag{4.7}$$
Then
$$\frac{1}{x}\Big|\sum_{n\le x} f(n)\Big| \ll M(x,T)\exp(-M(x,T)) + \frac{1}{T}.$$

Corollary 4.13 If $f$ is multiplicative with $|f(n)| \le 1$ and $\mathbb{D}(f,p^{i\alpha};\infty) = \infty$ for all real numbers $\alpha$ then as $x\to\infty$
$$\frac{1}{x}\sum_{n\le x} f(n) \to 0.$$

Exercise 4.14 Show that if $T \ge 1$ then
$$\frac{1}{2T}\int_{-T}^{T}\mathbb{D}(f,p^{it};x)^2\,dt \le \log\log x + O(1).$$
Conclude that $M_f(x,T) \le \log\log x + O(1)$, and the bound in Halász's theorem is never better than $x\log\log x/\log x$.

Exercise 4.15 If $x \ge y$ show that
$$0 \le M_f(x,T) - M_f(y,T) \le 2\sum_{y<p\le x}\frac{1}{p} = 2\log\frac{\log x}{\log y} + O(1).$$

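5 PROOF OF DELANGE'S THEOREM

Theorem 4.4 Let $f$ be a multiplicative function taking values in the unit disc $\mathbb{U}$ for which $\mathbb{D}(1,f;\infty) < \infty$. Then as $x\to\infty$ we have
$$\sum_{n\le x} f(n) \sim x\mathcal{P}(f;x).$$
Let $y$ be large and
$$\epsilon(y) := \sum_{p\ge y}\frac{1 - \mathrm{Re}\, f(p)}{p} \tag{5.1}$$
so that, by hypothesis, $\epsilon(y)\to 0$ as $y\to\infty$. Since $|1 - z|^2 \le 2(1 - \mathrm{Re}\, z)$ for $z\in\mathbb{U}$ we have
$$\sum_{p\ge y}\frac{|1 - f(p)|^2}{p} \le 2\epsilon(y). \tag{5.2}$$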
Now we decompose the function $f$ as $f(n) = s(n)\ell(n)$ where $s(n) = s_y(n)$ is the multiplicative function defined by $s(p^k) = f(p^k)$ if $p \le y$ and $s(p^k) = 1$ otherwise. Correspondingly, $\ell(n) = \ell_y(n)$ is the multiplicative function defined by $\ell(p^k) = f(p^k)$ for $p > y$ and $\ell(p^k) = 1$ otherwise. Fixing $y$, Proposition 2.9 gives that as $x\to\infty$
$$\sum_{n\le x} s(n) = x\mathcal{P}(s;\infty) + o(x) = x\mathcal{P}(f;y) + o(x). \tag{5.3}$$
We shall prove Delange's theorem by showing that for large $x$ (henceforth assumed $> y^2$) the function $\ell(n)$ is more or less constant over $n \le x$.
Exercise 5.1 For any complex numbers $w_1, \ldots, w_k$ and $z_1, \ldots, z_k$ in the unit disc we have
$$|z_1\cdots z_k - w_1\cdots w_k| \le \sum_{j=1}^{k}|z_j - w_j|.$$
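The inequality of Exercise 5.1 can be spot-checked on random points of the unit disc. The sketch below (helper names and the sampling scheme are assumptions made for this illustration) returns both sides of the inequality for given tuples.

```python
import cmath
import math
import random

def random_unit_disc_point(rng):
    """A random point of the closed unit disc, uniform with respect to area."""
    r = math.sqrt(rng.random())
    return r * cmath.exp(2j * math.pi * rng.random())

def product_gap_and_bound(zs, ws):
    """Return (|z_1...z_k - w_1...w_k|, sum_j |z_j - w_j|)."""
    gap = abs(math.prod(zs) - math.prod(ws))
    bound = sum(abs(z - w) for z, w in zip(zs, ws))
    return gap, bound

if __name__ == "__main__":
    rng = random.Random(1)
    zs = [random_unit_disc_point(rng) for _ in range(5)]
    ws = [random_unit_disc_point(rng) for _ in range(5)]
    print(product_gap_and_bound(zs, ws))
```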

Define now $g(p) = 0$ if $p \le y$, $g(p) = f(p) - 1$ for $y < p \le \sqrt{x}$ and $g(p) = 0$ for $p > \sqrt{x}$. Then consider the additive function
$$g(n) = \sum_{p\mid n} g(p),$$
where the primes are counted without multiplicity. If $n \le x$ is not divisible by the square of any prime $> y$, using Exercise 5.1 we have
$$|\ell(n) - \exp(g(n))| \le \sum_{\substack{p\mid n \\ y<p\le\sqrt{x}}}|f(p) - \exp(f(p) - 1)| + \sum_{\substack{p\mid n \\ p>\sqrt{x}}}|f(p) - 1| \ll \sum_{\substack{p\mid n \\ p>y}}|1 - f(p)|^2 + \sum_{\substack{p\mid n \\ p>\sqrt{x}}}|f(p) - 1|.$$

Since the number of integers below $x$ that are divisible by the square of some prime $> y$ is $\le \sum_{p>y} x/p^2 \le x/y$, we conclude that
$$\sum_{n\le x}|\ell(n) - \exp(g(n))| \ll x\sum_{y<p\le\sqrt{x}}\frac{|1 - f(p)|^2}{p} + x\sum_{\sqrt{x}<p\le x}\frac{|1 - f(p)|}{p} + \frac{x}{y} \ll x\big(\sqrt{\epsilon(y)} + 1/y\big), \tag{5.4}$$
where the last step follows upon using Cauchy's inequality and (5.2).

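Proposition 5.2 Suppose that $g(.)$ is additive (as above) with each $|g(p)| \ll 1$. Let
$$\tilde{g} = \sum_{y<p\le\sqrt{x}}\frac{g(p)}{p}.$$
Then, for $x \ge y^2$,
$$\sum_{n\le x}|g(n) - \tilde{g}|^2 \le x\sum_{y<p\le\sqrt{x}}\frac{|g(p)|^2}{p} + O\Big(\frac{x}{(\log x)^2}\Big).$$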

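Proof Note that since $g(.)$ is additive, and $g(p) = 0$ for $p \le y$ and $p > \sqrt{x}$ we have
$$\sum_{n\le x} g(n) = \sum_{y<p\le\sqrt{x}} g(p)\Big(\frac{x}{p} + O(1)\Big) = x\tilde{g} + O(\pi(\sqrt{x})).$$
Hence, using $|\tilde{g}| \ll \log\log x$,
$$\sum_{n\le x}|g(n) - \tilde{g}|^2 = \sum_{n\le x}|g(n)|^2 - x|\tilde{g}|^2 + O\Big(\frac{\sqrt{x}\log\log x}{\log x}\Big).$$
Now, if $[p,q]$ is the least common multiple of $p$ and $q$ then
$$\sum_{n\le x}|g(n)|^2 = \sum_{y<p,q\le\sqrt{x}} g(p)\overline{g(q)}\sum_{\substack{n\le x \\ p,q\mid n}} 1 = \sum_{y<p,q\le\sqrt{x}} g(p)\overline{g(q)}\,\frac{x}{[p,q]} + O(\pi(\sqrt{x})^2)$$
$$= x|\tilde{g}|^2 + x\sum_{y<p\le\sqrt{x}}|g(p)|^2\Big(\frac{1}{p} - \frac{1}{p^2}\Big) + O\Big(\frac{x}{(\log x)^2}\Big)$$
and the result follows. □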

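Now we are ready to prove Delange's theorem. Using (5.4) we have
$$\sum_{n\le x} f(n) = \sum_{n\le x} s(n)\exp(g(n)) + O\big(x(\sqrt{\epsilon(y)} + 1/y)\big).$$
Now if $z$ and $w$ have negative real parts, $|\exp(z) - \exp(w)| \ll |z - w|$. Therefore
$$\sum_{n\le x} s(n)\exp(g(n)) = \exp(\tilde{g})\sum_{n\le x} s(n) + O\Big(\sum_{n\le x}|g(n) - \tilde{g}|\Big) = \exp(\tilde{g})\sum_{n\le x} s(n) + O\big(x(\sqrt{\epsilon(y)} + 1/\log x)\big),$$
upon using (5.2), Proposition 5.2 and Cauchy's inequality. Now using (5.3) we conclude that
$$\sum_{n\le x} f(n) = \exp(\tilde{g})\,x\mathcal{P}(s;x) + o(x) + O\big(x(\epsilon(y)^{\frac12} + y^{-\frac12})\big).$$
Now
$$\mathcal{P}(\ell;x) = \exp\Big(\tilde{g} + O\Big(\sum_{y<p\le\sqrt{x}}\frac{1}{p^2} + \sum_{\sqrt{x}<p\le x}\frac{|f(p) - 1|}{p}\Big)\Big) = \exp(\tilde{g})\Big(1 + O\Big(\frac{1}{y} + \sqrt{\epsilon(y)}\Big)\Big).$$
Since $\mathcal{P}(\ell;x)\mathcal{P}(s;x) = \mathcal{P}(f;x)$ we conclude that
$$\sum_{n\le x} f(n) = x\mathcal{P}(f;x) + o(x) + O\big(x(\epsilon(y)^{\frac12} + y^{-\frac12})\big).$$
Letting $y\to\infty$ so that $\epsilon(y)\to 0$, we obtain Delange's theorem.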

6 DEDUCING THE PRIME NUMBER THEOREM FROM HALÁSZ'S THEOREM

6.1 Real valued multiplicative functions: Deducing Wirsing's theorem

Let $f$ be a multiplicative function with $-1 \le f(n) \le 1$ for all $n$. It seems unlikely that $f$ can pretend to be a complex valued multiplicative function $n^{i\alpha}$. The triangle inequality allows us to make this intuition precise:

Lemma 6.1 Let $f$ be a multiplicative function with $-1 \le f(n) \le 1$ for all $n$. For any real number $|\alpha| \le (\log x)^{10}$ we have
$$\mathbb{D}(f,p^{i\alpha};x) \ge \min\Big(\frac12\sqrt{\log\log x} + O(1),\ \frac13\mathbb{D}(1,f;x) + O(1)\Big).$$

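Proof Since $\mathbb{D}(f,p^{i\alpha};x) = \mathbb{D}(f,p^{-i\alpha};x)$ the triangle inequality gives
$$\mathbb{D}(1,p^{2i\alpha};x) = \mathbb{D}(p^{-i\alpha},p^{i\alpha};x) \le 2\mathbb{D}(f,p^{i\alpha};x).$$
In the range $1/100 \le |\alpha| \le (\log x)^{10}$, we obtain from Lemma 4.6 that $\mathbb{D}(1,p^{2i\alpha};x)^2 \ge (1-\epsilon)\log\log x$, and so the lemma follows in this range.
Suppose now that $|\alpha| \le 1/100$. Then $\mathbb{D}(1,p^{2i\alpha};x) = \mathbb{D}(1,p^{i\alpha};x) + O(1)$ by Lemma 4.6. Thus, by the triangle inequality and our estimate above
$$\mathbb{D}(f,p^{i\alpha};x) \ge \mathbb{D}(1,f;x) - \mathbb{D}(1,p^{i\alpha};x) \ge \mathbb{D}(1,f;x) - 2\mathbb{D}(f,p^{i\alpha};x) + O(1)$$
so that
$$\mathbb{D}(f,p^{i\alpha};x) \ge \frac13\mathbb{D}(1,f;x) + O(1).$$
□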

Using the above Lemma and Halász's theorem with $T = (\log x)^{10}$ we deduce:

Corollary 6.2 If $f$ is a multiplicative function with $-1 \le f(n) \le 1$ then
$$\frac{1}{x}\Big|\sum_{n\le x} f(n)\Big| \ll \mathbb{D}(1,f;x)^2\exp\Big(-\frac19\mathbb{D}(1,f;x)^2\Big) + \frac{1}{(\log x)^{\frac14+o(1)}}.$$
Note that the above Corollary implies a quantitative form of Wirsing's Theorem 4.8. An optimal version of Corollary 6.2 has been obtained by Hall and Tenenbaum.

6.2 Deducing the prime number theorem

Using Corollary 6.2 with $f = \mu$ we get
$$\Big|\sum_{n\le x}\mu(n)\Big| \ll \frac{x}{(\log x)^{\frac29+o(1)}}$$
and then
$$\psi(x) = x + O\Big(\frac{x}{(\log x)^{\frac29+o(1)}}\Big)$$
by Exercise 1.12 of §1.5.
The classical proof of the Prime Number Theorem yields a much better error term than what we have obtained above; indeed one can obtain
$$\psi(x) = x + O\big(x\exp\big(-(\log x)^{3/5+o(1)}\big)\big).$$
There are also elementary proofs of the prime number theorem that yield an error term of $O\big(x\exp\big(-(\log x)^{1/2+o(1)}\big)\big)$. While we can make some small improvements (see Lemma 6.2 below) to the error term $O(x/(\log x)^{\frac29+o(1)})$ obtained by Halász's theorem, the methods from the study of multiplicative functions do not appear capable of giving an error better than $O(x/\log x)$. That is, our methods are very far, quantitatively, from what can be obtained by other methods.

7 SELBERG'S SIEVE AND THE BRUN-TITCHMARSH THEOREM

In order to develop the theory of mean values of multiplicative functions, we shall need an estimate for the distribution of primes in short intervals. We need only an upper estimate for the number of such primes, and this can be achieved by a simple sieve method and does not need results of the strength of the prime number theorem. We describe a beautiful method of Selberg which works well in this and many other applications, but there are many other sieves which would also work. The reader is referred to Friedlander and Iwaniec's Opera de Cribro for a thorough treatment of sieves in general and their many applications.
7.1 The Brun-Titchmarsh theorem

Let $a \pmod q$ be an arithmetic progression with $(a,q) = 1$ and let $\pi(x;q,a)$ denote the number of primes $p \le x$ with $p \equiv a \pmod q$. The Brun-Titchmarsh theorem gives an estimate for the number of primes in an interval $(x, x+y]$ lying in the arithmetic progression $a \pmod q$.
Let $\lambda_1 = 1$ and let $\lambda_d$ be a sequence of real numbers with $\lambda_d = 0$ if $d > R$ or if $d$ has a common factor with $q$. Selberg's sieve is based on the simple idea that squares are positive, and so
$$\Big(\sum_{d\mid n}\lambda_d\Big)^2 \text{ is } \begin{cases} = 1 & \text{if } n > R \text{ is prime} \\ \ge 0 & \text{always.}\end{cases}$$
Therefore, assuming for simplicity that $R \le x$,
$$\pi(x+y;q,a) - \pi(x;q,a) \le \sum_{\substack{x<n\le x+y \\ n\equiv a \pmod q}}\Big(\sum_{d\mid n}\lambda_d\Big)^2.$$
Expanding out the inner sum this is
$$\sum_{d_1,d_2}\lambda_{d_1}\lambda_{d_2}\sum_{\substack{x<n\le x+y \\ n\equiv a \pmod q \\ [d_1,d_2]\mid n}} 1,$$
where $[d_1,d_2]$ denotes the l.c.m. of $d_1$ and $d_2$. Since $\lambda_d = 0$ unless $(d,q) = 1$, the inner sum over $n$ above is over one congruence class $\pmod{q[d_1,d_2]}$, and therefore this inner sum is within 1 of $y/(q[d_1,d_2])$. We conclude that

$$\pi(x+y;q,a) - \pi(x;q,a) \le \frac{y}{q}\sum_{d_1,d_2}\frac{\lambda_{d_1}\lambda_{d_2}}{[d_1,d_2]} + \sum_{d_1,d_2}|\lambda_{d_1}\lambda_{d_2}| = \frac{y}{q}\sum_{d_1,d_2}\frac{\lambda_{d_1}\lambda_{d_2}}{[d_1,d_2]} + \Big(\sum_{d}|\lambda_d|\Big)^2. \tag{7.1}$$

The ingenious part of Selberg's argument is in determining the optimal choice of $\lambda_d$ so as to minimize the first term in (7.1). The second term in (7.1) may be viewed as an error term, arising from the error in counting integers in an interval, and this roughly places the restriction that $R$ is at most $\sqrt{y/q}$. In such a range of $R$, the first term in (7.1) is the more important main term, and observe that it is a quadratic form in the variables $\lambda_d$. The problem of minimizing this main term thus takes the shape of minimizing a quadratic form subject to the linear constraint $\lambda_1 = 1$. Selberg's quadratic form admits an elegant diagonalization which allows us to find the optimal choice for $\lambda_d$.
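Since $[d_1,d_2] = d_1d_2/(d_1,d_2)$, and $(d_1,d_2) = \sum_{\ell\mid(d_1,d_2)}\phi(\ell)$ we have
$$\sum_{d_1,d_2}\frac{\lambda_{d_1}\lambda_{d_2}}{[d_1,d_2]} = \sum_{d_1,d_2}\frac{\lambda_{d_1}\lambda_{d_2}}{d_1d_2}\sum_{\substack{\ell\mid d_1 \\ \ell\mid d_2}}\phi(\ell) = \sum_{\ell}\frac{\phi(\ell)}{\ell^2}\Big(\sum_{d}\frac{\lambda_{d\ell}}{d}\Big)^2.$$
If we set
$$\xi_\ell = \sum_{d}\frac{\lambda_{d\ell}}{d}$$
then we have diagonalized the quadratic form in our main term:
$$\sum_{d_1,d_2}\frac{\lambda_{d_1}\lambda_{d_2}}{[d_1,d_2]} = \sum_{\ell}\frac{\phi(\ell)}{\ell^2}\,\xi_\ell^2. \tag{7.2}$$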

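Note that like $\lambda_d$, we have that $\xi_\ell = 0$ if $\ell > R$ or if $(\ell,q) > 1$.
What does the constraint $\lambda_1 = 1$ mean for the new variables $\xi_\ell$? We must invert the linear change of variables that we made in going from the $\lambda$'s to the $\xi$'s, and this is easily done by Möbius inversion. Let $\delta(\ell) = \sum_{r\mid\ell}\mu(r)$ be 1 if $\ell = 1$ and 0 otherwise. Then
$$\lambda_d = \sum_{\ell}\frac{\lambda_{d\ell}}{\ell}\,\delta(\ell) = \sum_{\ell}\frac{\lambda_{d\ell}}{\ell}\sum_{r\mid\ell}\mu(r) = \sum_{r}\mu(r)\sum_{r\mid\ell}\frac{\lambda_{d\ell}}{\ell} = \sum_{r}\frac{\mu(r)}{r}\,\xi_{dr}.$$
In particular, the linear constraint $\lambda_1 = 1$ becomes
$$1 = \sum_{r}\frac{\mu(r)}{r}\,\xi_r. \tag{7.3}$$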

We have transformed our problem to minimizing the diagonal quadratic form in (7.2) subject to the linear constraint in (7.3). It is clear that the optimal choice is when $\xi_r$ is proportional to $\mu(r)r/\phi(r)$ for $r \le R$ and $(r,q) = 1$. The constant of proportionality can be determined from (7.3) and we conclude that the optimal choice is to take (for $r \le R$ and $(r,q) = 1$)
$$\xi_r = \frac{1}{L_q(R)}\,\frac{r\mu(r)}{\phi(r)} \qquad\text{where}\qquad L_q(R) = \sum_{\substack{r\le R \\ (r,q)=1}}\frac{\mu(r)^2}{\phi(r)}. \tag{7.4}$$

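For this choice, the quadratic form in (7.2) attains the minimum value which is $1/L_q(R)$. Note also that for this choice of $\xi$, we have (for $d \le R$ and $(d,q) = 1$)
$$\lambda_d = \frac{1}{L_q(R)}\sum_{\substack{r\le R/d \\ (r,q)=1}}\frac{d\,\mu(r)\mu(dr)}{\phi(dr)},$$
and so
$$\sum_{d\le R}|\lambda_d| \le \frac{1}{L_q(R)}\sum_{\substack{d,r \\ dr\le R \\ (dr,q)=1}}\frac{\mu(dr)^2 d}{\phi(dr)} = \frac{1}{L_q(R)}\sum_{\substack{n\le R \\ (n,q)=1}}\frac{\mu(n)^2\sigma(n)}{\phi(n)}, \tag{7.5}$$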

where $\sigma(n) = \sum_{d\mid n} d$.
Putting these estimates into (7.1) we deduce that for any arithmetic progression $a \pmod q$ with $(a,q) = 1$, and any $R \le x$, we have
$$\pi(x+y;q,a) - \pi(x;q,a) \le \frac{y}{qL_q(R)} + \frac{1}{L_q(R)^2}\Big(\sum_{\substack{n\le R \\ (n,q)=1}}\frac{\mu(n)^2\sigma(n)}{\phi(n)}\Big)^2. \tag{7.6}$$

This bound looks unwieldy but the techniques developed in Chapter 2 are enough
to estimate the sums above. We illustrate this in the case $q = 1$. Note that by
Exercise 2.6,
\[
\sum_{n \le R} \frac{\mu(n)^2 \sigma(n)}{\phi(n)} = \frac{15}{\pi^2} R + O(\sqrt{R} \log R) .
\]
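The two asymptotics used in the case q = 1 can be examined numerically. The sketch below sums over squarefree n up to a hypothetical cutoff `R = 100000` and compares with the main terms $15R/\pi^2$ and $\log R + \gamma + C$; the constant C is approximated by truncating its prime sum at R, and all names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329


def smallest_prime_factors(limit):
    """spf[n] = smallest prime factor of n, for 2 <= n <= limit."""
    spf = list(range(limit + 1))
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    return spf


def selberg_sums(R, spf):
    """Return (sum of mu^2 sigma / phi, sum of mu^2 / phi) over n <= R."""
    sigma_over_phi, one_over_phi = 0.0, 0.0
    for n in range(1, R + 1):
        m, ratio, inv_phi, squarefree = n, 1.0, 1.0, True
        while m > 1:
            p = spf[m]
            m //= p
            if m % p == 0:          # p^2 divides n, so mu(n) = 0
                squarefree = False
                break
            ratio *= (p + 1) / (p - 1)   # sigma(p) / phi(p)
            inv_phi /= (p - 1)           # 1 / phi(p)
        if squarefree:
            sigma_over_phi += ratio
            one_over_phi += inv_phi
    return sigma_over_phi, one_over_phi


R = 100_000  # hypothetical cutoff
spf = smallest_prime_factors(R)
total, L1 = selberg_sums(R, spf)

# C = sum over primes of log p / (p (p - 1)), truncated at R.
C = sum(math.log(p) / (p * (p - 1)) for p in range(2, R + 1) if spf[p] == p)

ratio = total / R
print(ratio, 15 / math.pi ** 2)
print(L1, math.log(R) + EULER_GAMMA + C)
print(C)
```

The printed pairs should be close but not equal, since the error terms are only of relative size about $R^{-1/2}$ up to logarithms.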

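Exercise 7.1 Using Exercise 2.7, or otherwise, show that
\[
L_1(R) = \log R + \gamma + C + O\Big( \frac{(\log R)^2}{\sqrt{R}} \Big) ,
\quad \text{where} \quad
C = \sum_{p} \frac{\log p}{p(p-1)} = 0.7553\ldots .
\]
Exercise 7.2 Taking $R = \sqrt{\alpha y}$ and choosing $\alpha$ optimally as $\pi^4/450$, prove that for
any $3 \le y \le x$ we have
\[
\pi(x + y) - \pi(x) \le \frac{2y}{\log y} + \frac{2y}{(\log y)^2} \Big( 1 - 2\gamma - 2C + \log \frac{450}{\pi^4} \Big) + O\Big( \frac{y}{(\log y)^3} \Big) .
\]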

In particular we have:
Theorem 7.1 (Brun-Titchmarsh) If $y \ge y_0$ is large enough then
\[
\pi(x + y) - \pi(x) \le \frac{2y}{\log y} .
\]
One can go much further than this, using (7.6), to obtain that if $y/q \ge y_0$
then
\[
\pi(x + y; q, a) - \pi(x; q, a) \le \frac{2y}{\phi(q) \log(y/q)} .
\]
Exercise 7.2 Prove this.
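7.2 An alternative lower bound for a key distance

Lemma 7.3 If $|t| \le x^{o(1)}$ then
\[
\mathbb{D}^2(\mu(n), n^{it}; x) \ge \Big\{ 1 - \frac{2}{\pi} + o(1) \Big\} \log\Big( \frac{\log x}{\log(2 + |t|)} \Big) .
\]
Proof Fix $\theta \in [0, 1)$ and $\epsilon > 0$. Let $P$ be the set of primes $p$ for which there
exists an integer $n$ such that $p \in I_n := [e^{2\pi(n+\theta)/|t|}, e^{2\pi(n+\theta+\epsilon)/|t|})$, so that
$\mathrm{Re}(p^{it})$ lies between $\cos(2\pi\theta)$ and $\cos(2\pi(\theta + \epsilon))$. We partition the intervals $I_n$
into subintervals of the form $[y, y + z]$, where $z = o(y)$ and $\log z \sim \log y$, which is
possible provided $|t| = o(n/\log n)$ (Exercise). The Brun-Titchmarsh Theorem
implies that the number of primes in each such interval is $\le \{2 + o(1)\} z/\log y$,
and so $\sum_{p \in I_n} 1/p \le \{2 + o(1)\} \log\big(1 + \frac{\epsilon}{n+\theta}\big)$, from which we deduce
\[
\sum_{\substack{x_0 < p \le x \\ p \in I_n \text{ for some } n}} \frac{1}{p} \le \{2 + o(1)\}\, \epsilon \log\Big( \frac{\log x}{\log x_0} \Big) + O(\epsilon) ,
\]
where $x_0 := (2 + |t|)^{\log u} e^{2\pi/|t|}$ and $2 + |t| = x^{1/u}$, as $u \to \infty$. Combining this
with (1.2.4), we deduce (exercise) that
\[
\sum_{x_0 < p \le x} \frac{1 + \cos(t \log p)}{p} \ge \{2 + o(1)\} \log\Big( \frac{\log x}{\log x_0} \Big) \int_{1/4}^{3/4} (1 + \cos(2\pi\theta))\, d\theta + O(1)
= \Big\{ 1 - \frac{2}{\pi} + o(1) \Big\} \log\Big( \frac{\log x}{\log x_0} \Big) + O(1) .
\]
The result follows if $|t| \ge 1$. If $|t| < 1$ then $\log\big(\frac{\log x}{\log x_0}\big) \sim \log(|t| \log x)$. However,
we also have
\[
\sum_{p \le e^{2\pi/3|t|}} \frac{1 + \cos(t \log p)}{p} \ge (1 + \cos(2\pi/3)) \sum_{p \le e^{2\pi/3|t|}} \frac{1}{p} \ge \frac{1}{2} \log \frac{1}{|t|} + O(1) ,
\]
by (1.2.4), and then adding these lower bounds gives the result.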

8
HALÁSZ'S THEOREM

In this chapter we develop the proof of Halász's Theorem (Theorem 4.12): if
$f$ is a multiplicative function with $|f(n)| \le 1$ for all $n$, and $1 \le T \le (\log x)^{10}$
is a parameter with $M(x, T) = M_f(x, T) = \min_{|t| \le T} \mathbb{D}(f, p^{it}; x)^2$, then
\[
\frac{1}{x} \Big| \sum_{n \le x} f(n) \Big| \ll M(x, T) \exp(-M(x, T)) + \frac{1}{T} .
\]
Throughout the chapter $f$ will be a multiplicative function with $|f(n)| \le 1$. The
sum $\sum_{n \le x} f(n)$ will be denoted by $S(x)$ and the Dirichlet series $\sum_{n=1}^{\infty} f(n) n^{-s}$
by $F(s)$.

8.1 Averages of averages

First we begin with an identity which generalizes the identity (3.1) for smooth
numbers.
Lemma 8.1 For any multiplicative function $f$ with $|f(n)| \le 1$ we have
\[
S(x) \log x = \sum_{p \le x} f(p) \log p \; S(x/p) + O(x) . \tag{8.1}
\]

Proof Note that
\[
S(x) \log x = \sum_{n \le x} f(n) \log n + O\Big( \sum_{n \le x} \log(x/n) \Big) = \sum_{n \le x} f(n) \log n + O(x) . \tag{8.2}
\]
Next, writing $\log n = \sum_{d|n} \Lambda(d)$ we have
\[
\sum_{n \le x} f(n) \log n = \sum_{n \le x} f(n) \sum_{d|n} \Lambda(d) = \sum_{d \le x} \Lambda(d) \sum_{\substack{n \le x \\ d|n}} f(n) .
\]
The last sum above has size $\le x/d$, and so the contribution from prime powers
$d = p^b$ with $b \ge 2$ is $\ll \sum_{p \le x} (\log p)(x/p^2) \ll x$. Further when $d = p$ the final
sum over $n$ equals $f(p) S(x/p) + O(x/p^2)$, where the error results from those $n$
that are divisible by $p^2$ and there are at most $x/p^2$ such terms. We thus conclude
that
\[
S(x) \log x = \sum_{p \le x} \log p \sum_{\substack{n \le x \\ p|n}} f(n) + O(x) = \sum_{p \le x} f(p) \log p \; S(x/p) + O(x) ,
\]
proving our Lemma.
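The identity (8.1) is easy to test numerically. The sketch below uses the Liouville function as a hypothetical test case (it is completely multiplicative with values of modulus 1) and reports the normalized defect $\big(S(x)\log x - \sum_{p \le x} f(p) \log p\, S(x/p)\big)/x$, which the Lemma says is bounded; the cutoff 20000 is an arbitrary choice.

```python
import math


def liouville(limit):
    """Return (lam, primes): Liouville's function on [0, limit] and the primes <= limit."""
    spf = list(range(limit + 1))
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0, 1] + [0] * (limit - 1)
    for n in range(2, limit + 1):
        lam[n] = -lam[n // spf[n]]   # removing one prime factor flips the sign
    primes = [n for n in range(2, limit + 1) if spf[n] == n]
    return lam, primes


def identity_defect(x):
    """(S(x) log x - sum over p <= x of f(p) log p S(x/p)) / x for f = Liouville."""
    lam, primes = liouville(x)
    S = [0] * (x + 1)
    for n in range(1, x + 1):
        S[n] = S[n - 1] + lam[n]
    lhs = S[x] * math.log(x)
    rhs = sum(lam[p] * math.log(p) * S[x // p] for p in primes)
    return (lhs - rhs) / x


defect = identity_defect(20_000)
print(defect)
```

For a completely multiplicative function with $|f(n)| \le 1$ the two error sources in the proof contribute at most $x$ and $x \sum_p \log p/(p(p-1))$ respectively, so the defect stays below 2 in absolute value.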

The next step is to bound $S(x)$ by an average involving $S(t)$ for all $t \le x$.

Proposition 8.2 With notations as above,
\[
\frac{1}{x} |S(x)| \ll \frac{1}{\log x} \int_1^x \frac{|S(t)|}{t} \frac{dt}{t} + \frac{1}{\log x} .
\]
Note that $|S(t)|/t$ is the average size of $f(n)$ for $n$ up to $t$, and so the Proposition bounds $|S(x)|$ by an average of averages.
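Proof Now, for $z = y + y^{1/2}$, using the Brun-Titchmarsh theorem,
\[
\sum_{y < p \le z} \log p\, \Big| S\Big(\frac{x}{p}\Big) \Big|
\le \sum_{y < p \le z} \log p \max_{y \le u \le z} \Big| S\Big(\frac{x}{u}\Big) \Big|
\ll (z - y) \max_{y \le u \le z} \Big| S\Big(\frac{x}{u}\Big) \Big|
\le \int_y^z \Big| S\Big(\frac{x}{t}\Big) \Big|\, dt + (z - y) \max_{y \le t, u \le z} \Big| S\Big(\frac{x}{t}\Big) - S\Big(\frac{x}{u}\Big) \Big| ,
\]
and
\[
\Big| S\Big(\frac{x}{t}\Big) - S\Big(\frac{x}{u}\Big) \Big| \ll \Big| \frac{x}{t} - \frac{x}{u} \Big| = x \frac{|u - t|}{tu} \le x \frac{z - y}{y^2} .
\]
Summing over such intervals between $y$ and $2y$ we obtain
\[
\sum_{y < p \le 2y} \log p\, \Big| S\Big(\frac{x}{p}\Big) \Big| \ll \int_y^{2y} \Big| S\Big(\frac{x}{t}\Big) \Big|\, dt + \frac{x}{y^{1/2}} ,
\]
which implies, by Lemma 8.1, that
\[
|S(x)| \ll \frac{1}{\log x} \int_1^x \Big| S\Big(\frac{x}{t}\Big) \Big|\, dt + \frac{x}{\log x}
= \frac{x}{\log x} \int_1^x |S(t)| \frac{dt}{t^2} + \frac{x}{\log x} .
\]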
8.2 Applications of the Plancherel formula

Proposition 8.3 Let $a_n$ be any sequence of complex numbers such that $A(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ converges absolutely in $\mathrm{Re}(s) > 1$. Define also $A(x) = \sum_{n \le x} a_n$.
For any $\alpha > 0$ we have
\[
\int_1^{\infty} \frac{|A(t)|^2}{t^{3+2\alpha}}\, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{|A(1 + \alpha + iy)|^2}{|1 + \alpha + iy|^2}\, dy .
\]

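Proof Consider the function $G(y) = A(e^y)/e^{(1+\alpha)y}$. Note that the Fourier
transform of $G$ is
\[
\hat{G}(\xi) = \int_{-\infty}^{\infty} G(y) e^{-i\xi y}\, dy = \sum_{n=1}^{\infty} a_n \int_{\log n}^{\infty} e^{-(1+\alpha+i\xi)y}\, dy = \frac{A(1 + \alpha + i\xi)}{1 + \alpha + i\xi} .
\]
Thus Plancherel's formula gives
\[
\frac{1}{2\pi} \int_{-\infty}^{\infty} \Big| \frac{A(1 + \alpha + i\xi)}{1 + \alpha + i\xi} \Big|^2 d\xi = \int_{-\infty}^{\infty} |G(y)|^2\, dy = \int_1^{\infty} \frac{|A(t)|^2}{t^{3+2\alpha}}\, dt ,
\]
upon making the substitution $t = e^y$.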
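The Plancherel identity of Proposition 8.3 can be tested on a finite sequence, where the left side is an exact finite sum and the right side is a one-dimensional integral. The sketch below assumes the hypothetical data $a = (1, -1, -1)$ and $\alpha = 1/2$, and uses the substitution $y = (1+\alpha)\tan\theta$ with a midpoint rule, which is an illustrative choice of quadrature rather than anything from the text.

```python
import cmath
import math


def lhs_integral(a, alpha):
    """Exact value of the integral of |A(t)|^2 t^(-3-2*alpha) over t >= 1."""
    e = 2 + 2 * alpha
    total, partial, N = 0.0, 0, len(a)
    for k in range(1, N + 1):
        partial += a[k - 1]                       # A(t) on [k, k+1)
        upper = (k + 1) ** (-e) if k < N else 0.0  # last piece runs to infinity
        total += abs(partial) ** 2 * (k ** (-e) - upper) / e
    return total


def rhs_integral(a, alpha, steps=200_000):
    """Midpoint rule for (1/2pi) * integral of |A(1+alpha+iy)|^2 / |1+alpha+iy|^2 dy."""
    c = 1 + alpha
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        theta = -math.pi / 2 + (j + 0.5) * h
        s = complex(c, c * math.tan(theta))       # y = c * tan(theta)
        A = sum(a_n * cmath.exp(-s * math.log(n)) for n, a_n in enumerate(a, start=1))
        total += abs(A) ** 2
    # dy / (c^2 + y^2) becomes d(theta) / c under the substitution.
    return total * h / (2 * math.pi * c)


a = [1, -1, -1]   # hypothetical coefficients a_1, a_2, a_3
alpha = 0.5
lhs = lhs_integral(a, alpha)
rhs = rhs_integral(a, alpha)
print(lhs, rhs)
```

The two values agree to a few decimal places; the quadrature is only approximate because the integrand oscillates rapidly near the ends of the $\theta$-interval.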

The Proposition connects weighted averages of $|S(t)|$ with the generating
function $F(s)$. It turns out that it is more fruitful to apply the Plancherel formula
not directly to $F$ but to $F'$. The bound that we thus derive is crucial to the proof
of Halász's theorem.
Proposition 8.4 Let $T \ge 1$ be a parameter. For any $1 > \alpha > 0$ we have
\[
\int_1^{\infty} \Big| \sum_{n \le t} f(n) \log n \Big|^2 \frac{dt}{t^{3+2\alpha}} \ll \frac{1}{\alpha} \Big( \max_{|y| \le T} |F(1 + \alpha + iy)|^2 + 1 + \frac{1}{(T\alpha)^2} \Big) .
\]
Proof We write $F(s) = G(s)H(s)$ where
\[
G(s) = \prod_{p} \Big( 1 - \frac{f(p)}{p^s} \Big)^{-1} = \sum_{n=1}^{\infty} \frac{\tilde{f}(n)}{n^s} .
\]
That is, $\tilde{f}(n)$ is the completely multiplicative function which matches $f$ on all
primes. Note that $H(s)$ is given then by an Euler product which converges absolutely in $\mathrm{Re}(s) > 1/2$ and that in the region $\mathrm{Re}(s) \ge 1$ we have $|H(s)|$ and
$|H'(s)| \ll 1$.
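The Plancherel formula gives
\[
\int_1^{\infty} \Big| \sum_{n \le t} f(n) \log n \Big|^2 \frac{dt}{t^{3+2\alpha}}
\ll \int_{-\infty}^{\infty} \Big| \frac{F'(1 + \alpha + iy)}{1 + \alpha + iy} \Big|^2 dy
\ll \int_{-\infty}^{\infty} \Big( \Big| \frac{G'H(1 + \alpha + iy)}{1 + \alpha + iy} \Big|^2 + \Big| \frac{GH'(1 + \alpha + iy)}{1 + \alpha + iy} \Big|^2 \Big) dy . \tag{8.3}
\]
Since $H'(1 + \alpha + iy) \ll 1$ the second term above is
\[
\ll \int_{-\infty}^{\infty} \Big| \frac{G(1 + \alpha + iy)}{1 + \alpha + iy} \Big|^2 dy \ll \int_1^{\infty} \Big| \sum_{n \le t} \tilde{f}(n) \Big|^2 \frac{dt}{t^{3+2\alpha}} \ll \frac{1}{\alpha} , \tag{8.4}
\]
upon using Plancherel again.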

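Now consider the first term in (8.3) and split it into the two regions $|y| \le T$
and $|y| > T$. Consider the contribution of the first region. This is
\[
\ll \max_{|y| \le T} |F(1 + \alpha + iy)|^2 \int_{|y| \le T} \Big| \frac{(G'/G)(1 + \alpha + iy)}{1 + \alpha + iy} \Big|^2 dy
\le \max_{|y| \le T} |F(1 + \alpha + iy)|^2 \int_{-\infty}^{\infty} \Big| \frac{(G'/G)(1 + \alpha + iy)}{1 + \alpha + iy} \Big|^2 dy .
\]
Now $(G'/G)(s) = -\sum_{n} \tilde{f}(n) \Lambda(n) n^{-s}$, and using Plancherel yet again we have
\[
\int_{-\infty}^{\infty} \Big| \frac{(G'/G)(1 + \alpha + iy)}{1 + \alpha + iy} \Big|^2 dy \ll \int_1^{\infty} \Big| \sum_{n \le t} \tilde{f}(n) \Lambda(n) \Big|^2 \frac{dt}{t^{3+2\alpha}} \ll \int_1^{\infty} \frac{dt}{t^{1+2\alpha}} \ll \frac{1}{\alpha} ,
\]
upon using the Chebyshev bound that $\psi(t) \ll t$. This is clearly acceptable.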
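It remains lastly to consider the contribution of the region $|y| \ge T$. Since
$H(1 + \alpha + iy) \ll 1$ we must bound
\[
\int_{|y| > T} \Big| \frac{G'(1 + \alpha + iy)}{1 + \alpha + iy} \Big|^2 dy \ll \int_{-\infty}^{\infty} \Big| \frac{G'(1 + \alpha + iy)}{T + 1 + \alpha + iy} \Big|^2 dy .
\]
Now $G'(1 + \alpha + iy)/(T + 1 + \alpha + iy)$ is the Fourier transform of the function
$-e^{-(T+1+\alpha)x} \sum_{n \le e^x} \tilde{f}(n) n^T \log n$ and so by Plancherel the above quantity is
\[
\ll \int_0^{\infty} e^{-2x(T+1+\alpha)} \Big| \sum_{n \le e^x} \tilde{f}(n) n^T \log n \Big|^2 dx \ll \int_0^{\infty} x^2 e^{-2x(T+1+\alpha)} \Big( \sum_{n \le e^x} n^T \Big)^2 dx .
\]
Now by splitting into the cases $e^x \le T$ and $e^x > T$ we can easily establish that
$\sum_{n \le e^x} n^T \ll e^{Tx} + e^{(T+1)x}/(T + 1)$. Therefore our integral above is
\[
\ll \int_0^{\infty} x^2 \Big( e^{-2x(1+\alpha)} + \frac{e^{-2\alpha x}}{(T + 1)^2} \Big) dx \ll 1 + \frac{1}{\alpha^3 T^2} .
\]
This completes our proof.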


8.3 The key estimate

Combining our work in the preceding two sections we arrive at the following key
estimate.

Proposition 8.5 With notations as above, we have for $x \ge 3$,
\[
\frac{1}{x} \Big| \sum_{n \le x} f(n) \Big| \ll \frac{1}{\log x} \int_{1/\log x}^{1} \max_{|t| \le T} |F(1 + \alpha + it)| \frac{d\alpha}{\alpha} + \frac{1}{T} + \frac{\log \log x}{\log x} .
\]

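Proof For any $x \ge y \ge 3$ from (8.2) we have
\[
S(y) = \frac{1}{\log y} \sum_{n \le y} f(n) \log n + O\Big( \frac{y}{\log y} \Big)
\ll \int_{1/\log x}^{1} \Big| \sum_{n \le y} f(n) \log n \Big| \frac{d\alpha}{y^{2\alpha}} + \frac{y}{\log y} .
\]
Therefore
\[
\int_2^x \frac{|S(y)|}{y^2}\, dy \ll \int_{1/\log x}^{1} \Big( \int_2^x \Big| \sum_{n \le y} f(n) \log n \Big| \frac{dy}{y^{2+2\alpha}} \Big) d\alpha + \log \log x . \tag{8.5}
\]
Applying Cauchy's inequality and Proposition 8.4 we get, for $1 \ge \alpha \ge 1/\log x$,
\[
\Big( \int_2^x \Big| \sum_{n \le y} f(n) \log n \Big| \frac{dy}{y^{2+2\alpha}} \Big)^2
\le \Big( \int_2^x \frac{dy}{y^{1+2\alpha}} \Big) \Big( \int_2^x \Big| \sum_{n \le y} f(n) \log n \Big|^2 \frac{dy}{y^{3+2\alpha}} \Big)
\ll \frac{1}{\alpha^2} \Big( \max_{|y| \le T} |F(1 + \alpha + iy)|^2 + 1 + \frac{1}{(T\alpha)^2} \Big) .
\]
Using this in (8.5) we conclude that
\[
\int_2^x \frac{|S(y)|}{y^2}\, dy \ll \int_{1/\log x}^{1} \frac{1}{\alpha} \Big( \max_{|y| \le T} |F(1 + \alpha + iy)| + 1 + \frac{1}{T\alpha} \Big) d\alpha + \log \log x
\ll \int_{1/\log x}^{1} \max_{|y| \le T} |F(1 + \alpha + iy)| \frac{d\alpha}{\alpha} + \frac{\log x}{T} + \log \log x .
\]
Inserting this bound in Proposition 8.2 we have completed the proof of our
Proposition.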
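8.4 Proof of Halász's theorem

We begin with the following general Lemma.

Lemma 8.6 Let $a_n$ be a sequence of complex numbers such that $\sum_{n=1}^{\infty} \frac{|a_n|}{n} < \infty$,
so that $A(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ is absolutely convergent in $\mathrm{Re}(s) \ge 1$. For all real
numbers $T \ge 1$, and all $0 \le \alpha \le 1$ we have
\[
\max_{|t| \le T} |A(1 + \alpha + it)| \le \max_{|u| \le 2T} |A(1 + iu)| + O\Big( \frac{\alpha}{T} \sum_{n=1}^{\infty} \frac{|a_n|}{n} \Big) .
\]
Exercise 8.7 Prove that, for any integer $n \ge 1$, we have
\[
n^{-\alpha} = \frac{1}{\pi} \int_{-T}^{T} \frac{\alpha}{\alpha^2 + \xi^2}\, n^{-i\xi}\, d\xi + O\Big( \frac{\alpha}{T} \Big) .
\]
(Hint: Show that $\frac{2\alpha}{\alpha^2 + \xi^2}$ is the Fourier transform of $e^{-\alpha|z|}$.)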
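Proof Multiplying the result in this exercise through by $a_n/n^{1+it}$, and summing over all $n$, we obtain
\[
A(1 + \alpha + it) = \frac{1}{\pi} \int_{-T}^{T} \frac{\alpha}{\alpha^2 + \xi^2}\, A(1 + it + i\xi)\, d\xi + O\Big( \frac{\alpha}{T} \sum_{n=1}^{\infty} \frac{|a_n|}{n} \Big) ,
\]
which yields the result when $|t| \le T$, since then $|u| \le |t| + |\xi| \le 2T$ for $u = t + \xi$,
and as $\frac{1}{\pi} \int_{-T}^{T} \frac{\alpha}{\alpha^2 + \xi^2}\, d\xi \le \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\alpha}{\alpha^2 + \xi^2}\, d\xi = 1$ by the exercise with $n = 1$.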
Now we are ready to complete the proof of Halász's theorem. We will bound
the terms in the integral in Proposition 8.5 using Lemma 8.6 above. Applying
Lemma 8.6 with $a_n = f(n) n^{-1/\log x}$ we obtain that for any $1/\log x \le \alpha \le 1$ we
have
\[
\max_{|y| \le T} |F(1 + \alpha + iy)| \le \max_{|y| \le 2T} |F(1 + 1/\log x + iy)| + O\Big( \frac{\alpha \log x}{T} \Big)
\ll (\log x) \exp(-M(x, 2T)) + \frac{\alpha \log x}{T} .
\]
Moreover we have
\[
\max_{|y| \le T} |F(1 + \alpha + iy)| \le \zeta(1 + \alpha) = \frac{1}{\alpha} + O(1) .
\]
Using the minimum of the two bounds above in Proposition 8.5 (in other words,
using the first bound for $\alpha \le \exp(M(x, 2T))/\log x$ and the second for larger
$\alpha$) we conclude that
\[
\frac{1}{x} \Big| \sum_{n \le x} f(n) \Big| \ll M(x, 2T) \exp(-M(x, 2T)) + \frac{1}{T} + \frac{\log \log x}{\log x} .
\]
By Exercise 4.14 we have $M(x, 2T) \exp(-M(x, 2T)) \gg \log \log x/\log x$, so that
the $\log \log x/\log x$ term above may be dropped. Now renaming $2T$ as $T$ we obtain
Theorem 4.12.
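8.5 The logarithmic mean
Since
\[
\int_1^x \sum_{n \le t} f(n) \frac{dt}{t^2} = \sum_{n \le x} f(n) \int_n^x \frac{dt}{t^2} = \sum_{n \le x} \frac{f(n)}{n} - \frac{1}{x} \sum_{n \le x} f(n) ,
\]
we deduce that
\[
\Big| \sum_{n \le x} \frac{f(n)}{n} \Big| \le \frac{|S(x)|}{x} + \int_1^x |S(t)| \frac{dt}{t^2} .
\]
By Proposition 8.2 and then (8.5) we then deduce
\[
\frac{1}{\log x} \Big| \sum_{n \le x} \frac{f(n)}{n} \Big| \ll (1 + M(x, T)) e^{-M(x,T)} + \frac{1}{T} . \tag{8.6}
\]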

9
MULTIPLICATIVE FUNCTIONS
This is where the book gets less organized. This section includes several useful
results that will be used, but at the moment are not well tied together in a
common theme.

9.1

Upper bounds by averaging further

Suppose that $0 \le h(p^a) \ll C^a$ for all prime powers $p^a$, where $C < 2$.

Exercise 9.1 Use this hypothesis to show that $\sum_{p^a \le x} h(p^a) \log p^a \ll x$. Give an
example to show that this fails for $C = 2$.

Therefore
\[
\sum_{n \le x} h(n) \log n = \sum_{n \le x} h(n) \sum_{p^a \| n} \log p^a = \sum_{m \le x} h(m) \sum_{\substack{p^a \le x/m \\ p \nmid m}} h(p^a) \log p^a \ll x \sum_{m \le x} \frac{h(m)}{m} ,
\]
by the Brun-Titchmarsh theorem. Moreover, since $\log(x/n) \le x/n$ whenever
$n \le x$, we have
\[
\sum_{n \le x} h(n) \log(x/n) \le x \sum_{m \le x} \frac{h(m)}{m} ,
\]
and adding these together gives
\[
\sum_{n \le x} h(n) \ll \frac{x}{\log x} \sum_{m \le x} \frac{h(m)}{m} . \tag{9.1}
\]
Using partial summation we deduce from (9.1) that for $1 \le y \le x^{1/2}$,
\[
\sum_{x/y < n \le x} \frac{h(n)}{n} \ll \frac{\log(2y)}{\log x} \sum_{n \le x} \frac{h(n)}{n} . \tag{9.2}
\]

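If $f = 1 * g$ and we proceed as in the proof of 2.1 then
\[
\Big| \frac{S(x)}{x} - \frac{S(x/y)}{x/y} \Big|
\le \Big| \frac{S(x)}{x} - \sum_{d \le x} \frac{g(d)}{d} \Big| + \Big| \sum_{x/y < d \le x} \frac{g(d)}{d} \Big| + \Big| \sum_{d \le x/y} \frac{g(d)}{d} - \frac{S(x/y)}{x/y} \Big|
\]
\[
\le \frac{1}{x} \sum_{d \le x} |g(d)| + \frac{1}{x/y} \sum_{d \le x/y} |g(d)| + \sum_{x/y < d \le x} \frac{|g(d)|}{d}
\ll \frac{\log(2y)}{\log x} \sum_{m \le x} \frac{|g(m)|}{m} \ll \frac{\log(2y)}{\log x} \exp\Big( \sum_{p \le x} \frac{|1 - f(p)|}{p} \Big) , \tag{9.3}
\]
by (9.1) and (9.2). Note that this holds trivially for $y > \sqrt{x}$. This result may be
regarded as a first Lipschitz type estimate, explored in more detail later on in
chapter 15.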
9.2 Convolutions of Sums

We introduce here an idea that will be of importance later, in which we develop
(8.1). If $f$ is totally multiplicative then
\[
\int_0^1 S(x^{1-t}) \sum_{r \le x^t} f(r) \Lambda(r)\, dt
= \sum_{mr \le x} f(mr) \Lambda(r) \int_{\log r/\log x}^{\log(x/m)/\log x} dt
= \sum_{n \le x} f(n) \log n\, \frac{\log(x/n)}{\log x}
= \int_1^x S(x/t)\, \frac{\log(x/t^2)}{\log x} \frac{dt}{t} .
\]
By (9.3) this equals
\[
S(x) + O\Big( \frac{|S(x)|}{\log x} + \frac{x}{\log x} \exp\Big( \sum_{p \le x} \frac{|1 - f(p)|}{p} \Big) \Big) . \tag{9.4}
\]

9.3 A first Structure Theorem

Given a multiplicative function $f$, define $g(p^k) = 1$, $h(p^k) = f(p^k)$ if $p \le y$ and
$g(p^k) = f(p^k)$, $h(p^k) = 1$ if $p > y$. Now $1 * f = g * h$ so that if $h = 1 * H$ then
$f = g * H$. Therefore
\[
\frac{1}{x} \sum_{n \le x} f(n) = \frac{1}{x} \sum_{ab \le x} H(a) g(b) = \sum_{a \le x} \frac{H(a)}{a} \cdot \frac{1}{x/a} \sum_{b \le x/a} g(b) .
\]
By (9.3) this is
\[
\sum_{a \le x} \frac{H(a)}{a} \cdot \frac{1}{x} \sum_{b \le x} g(b) + O\Big( \sum_{a \le x} \frac{|H(a)|}{a} \frac{\log(2a)}{\log x} \exp\Big( \sum_{p \le x} \frac{|1 - g(p)|}{p} \Big) \Big) .
\]

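We may extend both sums to be over all integers $a$ since the error term is trivially
bigger than the main term when $a > x$. Now
\[
\sum_{a \ge 1} \frac{|H(a)|}{a} \log a = \sum_{a \ge 1} \frac{|H(a)|}{a} \sum_{p^k \| a} k \log p
\le \sum_{\substack{p \le y \\ k \ge 1}} \frac{2k \log p}{p^k} \sum_{A \ge 1} \frac{|H(A)|}{A}
\ll \log y\, \exp\Big( \sum_{p \le y} \frac{|H(p)|}{p} \Big) ,
\]
writing $a = p^k A$ with $(A, p) = 1$ and then extending the sum to all $A$, since
$|H(p^k)| \le 2$. Now
\[
\sum_{p \le x} \frac{|1 - g(p)| + |H(p)|}{p} = \sum_{p \le x} \frac{|1 - f(p)|}{p} ,
\]
and so we have proved, applying Proposition 3.6,
\[
\frac{1}{x} \sum_{n \le x} f(n) = \frac{1}{x} \sum_{n \le x} g(n) \cdot \frac{1}{x} \sum_{n \le x} h(n) + O\Big( \frac{\log y}{\log x} \exp\Big( \sum_{p \le x} \frac{|1 - f(p)|}{p} \Big) \Big) . \tag{9.5}
\]
This is especially useful for understanding real valued $f$ whose mean-value is
large.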
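9.4 Bounding the tail of a sum

Lemma 9.2 If $f$ and $g$ are totally multiplicative, with $0 \le f(p) \le g(p) \le p$ for
all primes $p$, then
\[
\prod_{p \le y} \Big( 1 - \frac{f(p)}{p} \Big) \sum_{\substack{n \le x \\ P(n) \le y}} \frac{f(n)}{n} \ge \prod_{p \le y} \Big( 1 - \frac{g(p)}{p} \Big) \sum_{\substack{n \le x \\ P(n) \le y}} \frac{g(n)}{n} .
\]
Proof We prove this in the case that $f(q) < g(q)$ and $g(p) = f(p)$ otherwise,
since then the result follows by induction. Define $h$ so that $g = f * h$, so that
$h(q^{b+1}) = (g(q) - f(q)) g(q^b)$ for all $b \ge 0$, and $h(p^a) = 0$ otherwise. The left
hand side above equals $\prod_{p \le y} \big( 1 - \frac{g(p)}{p} \big)$ times
\[
\sum_{m \ge 1} \frac{h(m)}{m} \sum_{\substack{n \le x \\ P(n) \le y}} \frac{f(n)}{n}
\ge \sum_{\substack{N \le x \\ P(N) \le y}} \sum_{mn = N} \frac{h(m)}{m} \frac{f(n)}{n}
= \sum_{\substack{n \le x \\ P(n) \le y}} \frac{g(n)}{n} ,
\]
as desired.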

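Corollary 9.3 Suppose that $f$ is a totally multiplicative function, with $0 \le f(p) \le 1$ for all primes $p$. Then
\[
\prod_{p \le y} \Big( 1 - \frac{f(p)}{p} \Big) \sum_{\substack{n > x \\ p | n \implies p \le y}} \frac{f(n)}{n} \ll \Big( \frac{C}{u \log u} \Big)^u ,
\]
where $x = y^u$.
Proof If we take $x = \infty$, both sides equal 1 in the Lemma. Hence if we subtract
both sides from 1, and let $g = 1$, we obtain
\[
\prod_{p \le y} \Big( 1 - \frac{f(p)}{p} \Big) \sum_{\substack{n > x \\ p | n \implies p \le y}} \frac{f(n)}{n} \le \prod_{p \le y} \Big( 1 - \frac{1}{p} \Big) \sum_{\substack{n > x \\ p | n \implies p \le y}} \frac{1}{n} .
\]
By Mertens' theorem this is
\[
\sim \frac{e^{-\gamma}}{\log y} \int_x^{\infty} \frac{d\Psi(t, y)}{t} \le \frac{e^{-\gamma}}{\log y} \int_x^{\infty} \frac{\Psi(t, y)}{t^2}\, dt ,
\]
and the result follows from (3.3.3).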


9.5 Elementary proofs of the prime number theorem

In exercise 1.14, we rewrote Selberg's formula as
\[
(\psi(x) - x) \log x = - \sum_{p \le x} \log p \Big( \psi\Big(\frac{x}{p}\Big) - \frac{x}{p} \Big) + O(x) .
\]
There is an analogous formula for $\mu(n)$, derived from (3.1.1):
\[
M(x) \log x = - \sum_{p \le x} \log p \; M\Big(\frac{x}{p}\Big) + O(x) .
\]
Exercise 9.4 Show that
\[
\liminf_{x \to \infty} \frac{M(x)}{x} + \limsup_{x \to \infty} \frac{M(x)}{x} = 0 .
\]

10
DIRICHLET CHARACTERS
We give a concise introduction to Dirichlet characters. We wish to classify the
non-zero homomorphisms $\chi : \mathbb{Z}/q\mathbb{Z} \to \mathbb{C}$.
Suppose that $q = \prod_{j=1}^{k} p_j^{e_j}$. We define a homomorphism $\chi_j : \mathbb{Z}/p_j^{e_j}\mathbb{Z} \to \mathbb{C}$,
by taking $\chi_j(a) = \chi(A)$ where $A \equiv a \pmod{p_j^{e_j}}$ and $A \equiv 1 \pmod{q/p_j^{e_j}}$ (as
is possible by the Chinese Remainder Theorem). Moreover one can verify that
$\chi = \chi_1 \chi_2 \cdots \chi_k$, and so the characters mod $q$ can be determined by the characters
mod the prime power factors of $q$.
Now if $\chi_k = 1$ then $\chi = \chi_1 \cdots \chi_{k-1}$ is a homomorphism $\mathbb{Z}/(q/p_k^{e_k})\mathbb{Z} \to \mathbb{C}$.
Dirichlet characters are those $\chi$ that are not (also) a homomorphism $\mathbb{Z}/d\mathbb{Z} \to \mathbb{C}$
for some proper divisor $d$ of $q$ with $(d, q/d) = 1$. Hence we may assume that each
$\chi_j \ne 1$.
Now suppose that $q = p^e$. Since $\chi \ne 0$ there exists $a$ such that $\chi(a) \ne 0$.
Then $\chi(a) = \chi(a \cdot 1) = \chi(a)\chi(1)$ and so $\chi(1) = 1$. Since $\chi \ne 1$ there exists $b$
such that $\chi(b) \ne 1$. Then $\chi(0) = \chi(b \cdot 0) = \chi(b)\chi(0)$ and so $\chi(0) = 0$. But then
$\chi(p)^e = \chi(p^e) = \chi(q) = \chi(0) = 0$ and so $\chi(p) = 0$. Hence $\chi(a) = 0$ if $(a, p) > 1$.
Now let us return to arbitrary $q$. The last paragraph implies that $\chi(a) = 0$ if
$(a, q) > 1$, so we can think of $\chi$ as a homomorphism $(\mathbb{Z}/q\mathbb{Z})^* \to \mathbb{C}$. Now suppose
that $(\mathbb{Z}/q\mathbb{Z})^*$ is generated by $g_1, g_2, \ldots, g_\ell$ of orders $k_1, \ldots, k_\ell$, respectively. Any
$a$ with $(a, q) = 1$ can be written uniquely as $g_1^{a_1} \cdots g_\ell^{a_\ell} \pmod q$ where $0 \le a_i \le k_i - 1$ for each $i$, and so $\chi(a) = \chi(g_1)^{a_1} \cdots \chi(g_\ell)^{a_\ell}$ and therefore the values of
$\chi(g_1), \ldots, \chi(g_\ell)$ determine $\chi$. Now $\chi(g_i)^{k_i} = \chi(g_i^{k_i}) = 1$ and so $\chi(g_i)$ is a $k_i$th
root of unity, and in fact we can select any $k_i$th root of unity. Indeed let $\chi_j$ be
that character mod $q$ with $\chi_j(g_j) = e(1/k_j)$, and $\chi_j(g_i) = 1$ for $i \ne j$. Then the
set of possible characters mod $q$ is
\[
\{ \chi_1^{a_1} \cdots \chi_\ell^{a_\ell} \ \text{where} \ 0 \le a_i \le k_i - 1 \ \text{for each} \ i \}
\]
which, we see, can be viewed as a multiplicative group, isomorphic to $(\mathbb{Z}/q\mathbb{Z})^*$.
Exercise 10.1 Prove that if $(a, q) = 1$ but $a \not\equiv 1 \pmod q$ then there exists a character $\chi$
mod $q$ such that $\chi(a) \ne 1$.

We call $\chi_0$ the principal character if $\chi_0(a) = 1$ whenever $(a, q) = 1$. If $q = dm$
with $m > 1$ and $\chi = \psi\chi_0$ where $\psi$ is a character mod $d$ and $\chi_0$ is the principal
character mod $m$ then $\chi$ is induced by $\psi$. If $d$ is the smallest such integer then
$d$ is the conductor of $\chi$; if no such $d < q$ exists then $\chi$ is primitive.
The orthogonality relations are of central importance:
\[
\frac{1}{\phi(q)} \sum_{\chi \pmod q} \chi(m) = \begin{cases} 1 & \text{if } m \equiv 1 \pmod q, \\ 0 & \text{otherwise;} \end{cases} \tag{10.1}
\]
\[
\frac{1}{\phi(q)} \sum_{b \pmod q} \chi(b) = \begin{cases} 1 & \text{if } \chi = \chi_0, \\ 0 & \text{otherwise.} \end{cases} \tag{10.2}
\]
(10.1) is trivial if $m \equiv 1 \pmod q$. Otherwise select $\psi \pmod q$ for which $\psi(m) \ne 1$. As the
characters mod $q$ form a group, the set $\{ \psi\chi : \chi \pmod q \}$ is also the character
group, and so $\psi(m) \sum_{\chi} \chi(m) = \sum_{\chi} (\psi\chi)(m) = \sum_{\chi} \chi(m)$, and the result follows.

Exercise 10.2 Prove (10.2). (Hint: One proof is analogous to that of (10.1).)
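For a prime modulus the construction of the character group from a generator can be carried out explicitly, which gives a quick numerical check of both orthogonality relations and of the size of the Gauss sum for non-principal characters. The sketch below assumes the hypothetical modulus p = 13 and brute-forces a primitive root; the function names are illustrative.

```python
import cmath
import math


def primitive_root(p):
    """Smallest generator of the units mod an odd prime p (brute force)."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found")


def characters(p):
    """All Dirichlet characters mod p, each as a list chi[a] for a = 0..p-1."""
    g = primitive_root(p)
    index = {pow(g, k, p): k for k in range(p - 1)}   # a = g^k  ->  k
    chars = []
    for j in range(p - 1):
        chi = [0j] * p                                # chi(0) = 0
        for a, k in index.items():
            chi[a] = cmath.exp(2j * math.pi * j * k / (p - 1))
        chars.append(chi)
    return chars                                      # chars[0] is principal


def gauss_sum(chi, p):
    return sum(chi[a] * cmath.exp(2j * math.pi * a / p) for a in range(p))


p = 13  # hypothetical prime modulus
chars = characters(p)
phi_p = p - 1

# Averages over characters for each m, and over residues for each character.
orth1 = [sum(chi[m] for chi in chars) / phi_p for m in range(p)]
orth2 = [sum(chi) / phi_p for chi in chars]
gauss_abs_sq = [abs(gauss_sum(chi, p)) ** 2 for chi in chars[1:]]

print([round(abs(v), 6) for v in orth1])
print([round(abs(v), 6) for v in orth2])
print([round(v, 6) for v in gauss_abs_sq])
```

Every non-principal character modulo a prime is primitive, which is why the squared absolute values in the last line should all equal the modulus.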

For a given character $\chi \pmod q$, define the Gauss sum
\[
g(\chi) := \sum_{a \pmod q} \chi(a)\, e\Big( \frac{a}{q} \Big) .
\]
When $(m, q) = 1$ we can change the variable $a$ to $bm$, as $b$ varies through the
residues mod $q$, coprime to $q$, so that
\[
\overline{\chi}(m) g(\chi) = g(\chi, m), \quad \text{where} \quad g(\chi, m) := \sum_{b \pmod q} \chi(b)\, e\Big( \frac{bm}{q} \Big) . \tag{10.3}
\]
Select $b_j$ to be the inverse of $q/p_j^{e_j} \pmod{p_j^{e_j}}$ so that $1 \equiv \sum_j b_j q/p_j^{e_j} \pmod q$,
and therefore
\[
g(\chi) = \sum_{a \pmod q} (\chi_1 \cdots \chi_k)(a)\, e\Big( \sum_j \frac{a b_j}{p_j^{e_j}} \Big) = \prod_j g(\chi_j, b_j) = \prod_j \overline{\chi_j}(b_j)\, g(\chi_j) .
\]
This implies that $|g(\chi)| = \prod_j |g(\chi_j)|$, and so we may restrict our attention to
prime powers $q = p^e$:
Suppose that $\chi$ is a primitive character mod $q$. We have $g(\chi, 0) = 0$ by (10.2).
If $e > 1$ then $\chi(1 + q/p) \ne 1$, else $\chi$ is a character mod $q/p$. Now by writing
$a \equiv b(1 + q/p) \pmod q$, we have
\[
g(\chi, Mp) = \sum_{a \pmod q} \chi(a)\, e\Big( \frac{aM}{q/p} \Big) = \chi(1 + q/p) \sum_{b \pmod q} \chi(b)\, e\Big( \frac{bM}{q/p} \Big) = \chi(1 + q/p)\, g(\chi, Mp) ,
\]
so that $g(\chi, Mp) = 0$; that is $g(\chi, m) = 0$ whenever $(m, q) \ne 1$. Hence
\[
\phi(q) |g(\chi)|^2 = \sum_{m \pmod q} |g(\chi, m)|^2 = \sum_{a, b \pmod q} \chi(a) \overline{\chi}(b) \sum_{m \pmod q} e\Big( \frac{(a - b)m}{q} \Big) = q \sum_{a \pmod q} |\chi(a)|^2 = \phi(q) q ,
\]
so that $|g(\chi)| = \sqrt{q}$ for $q$ a prime power and, by the above, this follows for
primitive characters modulo composite $q$ as well.

11
ZETA FUNCTIONS AND DIRICHLET SERIES: A MINIMALIST
DISCUSSION
11.1

Dirichlet characters and Dirichlet L-functions

We define the Dirichlet L-function for the character $\chi \pmod q$ by
\[
L(s, \chi) = \sum_{n \ge 1} \frac{\chi(n)}{n^s}
\]
for $\mathrm{Re}(s) > 1$. One can verify using the fundamental theorem of arithmetic that
this has the Euler product expansion
\[
L(s, \chi) = \prod_{p \text{ prime}} \Big( 1 - \frac{\chi(p)}{p^s} \Big)^{-1}
\]
in the same range.

Exercise 11.1 If $\chi \pmod q$ is induced by $\psi \pmod m$ then determine $L(s, \chi)/L(s, \psi)$.
Remark 11.2 We will need to add a proof of Dirichlets class number formula,
perhaps a uniform version? (Since this can be used to establish the connection
between small class number and small numbers of primes in arithmetic progressions). We also need to discuss the theory of binary quadratic forms, at least
enough for the class number formula and to understand prime values of such
forms.
Lemma 11.3 For any non-principal Dirichlet character $\chi \pmod q$ and any
complex number $s$ with real part $> 0$, we can define
\[
L(s, \chi) = \lim_{N \to \infty} \sum_{n=1}^{N} \frac{\chi(n)}{n^s} ,
\]
since this limit exists.

The content of this result is that the right-side of the equation converges.
One usually uses the idea of analytic continuation to state that this equals the
left-side.

Proof [sketch] We will prove this by suitably bounding
\[
\sum_{n = N+1}^{\infty} \frac{\chi(n)}{n^s} ,
\]
for $N \ge q|s|$, where $s = \sigma + it$. If $n = N + j$ we replace the $n$ in the denominator
by $N$, incurring an error of
\[
\Big| \frac{1}{(N + j)^s} - \frac{1}{N^s} \Big| \ll \frac{1}{N^{\sigma}} \cdot \frac{|s| j}{N} \ll \frac{|s| q}{N^{1+\sigma}} ,
\]
for $1 \le j \le q$. Summing this over all $n$ in the interval $(N, N + q]$ gives
$N^{-s} \sum_n \chi(n) + O(|s| q^2/N^{1+\sigma}) \ll |s| q^2/N^{1+\sigma}$. Summing now over $N, N + q, N + 2q, \ldots$, we obtain a total error of $\ll |s| q/N^{\sigma}$, which implies the result.
11.2 Dirichlet series just to the right of the 1-line
Corollary 11.4 Suppose that there exists an integer $k \ge 1$ such that $f(p)^k = 1$
for all primes $p$. Then $\mathbb{D}(f(n), n^{it}; \infty) = \infty$ for every non-zero real $t$.
Examples of this include $f = \mu$ the Möbius function, $\chi$ a Dirichlet character
(though one needs to modify the result to deal with the finitely many primes $p$
for which $\chi(p) = 0$), and even $\mu\chi$.
Proof Suppose that there exists a real number $t \ne 0$ such that $\mathbb{D}(f(n), n^{it}; \infty) < \infty$. Then $\mathbb{D}(1, n^{ikt}; \infty) \le k\, \mathbb{D}(f(n), n^{it}; \infty) < \infty$ by the triangle inequality. Let
$s = 1 + \frac{1}{\log x} + ikt$. By (4.5), we have
\[
\log \zeta(s) = \sum_{p \le x} \frac{1}{p^{1+ikt}} + O(1) ,
\]
and so
\[
\log |\zeta(s)| = \mathrm{Re}(\log \zeta(s)) = \sum_{p \le x} \frac{\mathrm{Re}(p^{-ikt})}{p} + O(1)
= \sum_{p \le x} \frac{1}{p} - \mathbb{D}(1, n^{ikt}; x)^2 + O(1) = \log \log x + O_t(1) ,
\]
and therefore $|\zeta(s)| \gg \log x$. However exercise 1.4 yields that
\[
\zeta(s) = \frac{1}{s - 1} + O(1 + |t|) = \frac{1}{ikt} + O\Big( 1 + |t| + \frac{1}{|t|^2 \log x} \Big) ,
\]
a contradiction.
Koukoul

Lemma 11.5 If is a character mod q and x y q then




X (p)
log q(1 + |t|)
 log 2 +
.
p1+it
log y
y<px

54

Zeta Functions and Dirichlet series: A minimalist discussion

Proof
(Koukoulopoulos)
Taking absolute values we have the upper bound


log x
log log y . Let m be the product of the primes y that do not divide q. Write
sX := 1 + log1 X + it for all X > 0, and take s = sx for convenience. Taking
absolute values we obtain an acceptable upper bound for the primes in the sum
TruncRight
that are Y := (|sx |q)4 . We may therefore now assume that y Y . By (4.5)
with f = we have that

X (n)
X (p)

.
exp
p1+it
ns
y<px

n1
(n,m)=1

Take N y with H = qN 1/3 . For s = sx = 1 +

1
log x

+ it we have

X
N <nN +qH
(n,m)=1

(n)
1
= s
ns
N

X
N <nN +qH
(n,m)=1

(n) + O

X
N <nN +qH
(n,mq)=1



1

1 .
ns
s
N

Now |1/ns 1/N s |  (|s|qH/N )/N <s |s|qH/N 2 as |s|qH N , which leads
to a bound on the second
sum; and we bound the first sum by taking x = N, y =
FLS2
H, z = y in Corollary 17.3. Then, partitioning (N, 2N ] into intervals of length
qH, we obtain
X
N <n2N
(n,m)=1

1
1
|s|qH
1
1
(n)
1
1


+ +
+ 1.
ns
log y H log1 y
N log y
log y N log1 y
H
N6

Summing over N = y, 2y, 4y, 8y, . . . yields that our sum is bounded, and hence
the result.
2
TruncRight
P
(p)
By (4.5) with f = we have that log(L (sx , ) /L (sy , )) = y<px p1+it +
O(1), and so we deduce from the above that if is a character mod q and
x y q then



 





L 1 + 1 + it,  1 + log |t| L 1 + 1 + it, .




log x
log y
log y
There is a proof of this which uses the theory of analytic functions, which is too
beautiful to not include:
Proof It is well-known that the completed Dirichlet L-function has a Hadamard
factorization; that is if = (1 (1))/2 then
  s+


2

s+
(s, ) :=

L(s, ) = eA+Bs
q
2

Y
(,)=0

s
1

es/

Dirichlet series just to the right of the 1-line

where Re(B +

55

1/) = 0 (as in Chapter 12 of Davenport). We deduce that




( + it, )


( + it, ) =

Y
(,)=0



+ it


+ it .

Now if Re then | + it | | + it | by the (geometric) triangle inequality, and so the above product is 1 if 1 since we know
that Re() 1. Inserting this inequality into the definition of (s, ), we deduce the result from the fact that 0 (s)/(s) = log s + O(1/|s|) (as in (6) of
Chapter 10 of Davenport), which implies that the ratio of the Gamma factors is
 log |t|/ log y  1.
2
Exercise 11.6 The Riemann Hypothesis for L(s, ) states that if (, ) = 0
then Re() 1/2. Prove that this is equivalent to the conjecture that (s, ) is
increasing as one moves in the positive real direction along any horizontal line,
from the line Re(s) = 1/2.

12

HALASZS
THEOREM: INVERSES AND HYBRIDS
It is evidently useful to evaluate the mean value of $f(n)$ in terms of the mean
value of $f(n)/n^{it}$:

Theorem 12.1 Suppose $f(n)$ is a multiplicative function with $|f(n)| \le 1$ for all
$n$. If $t = t_f(x, \log x)$ then
\[
\sum_{n \le x} f(n) = \frac{x^{it}}{1 + it} \sum_{n \le x} \frac{f(n)}{n^{it}} + O\Big( x\, \frac{\log \log x}{(\log x)^{2 - \sqrt{3}}} \Big) .
\]
This also holds if we take $t = t_f(x^A, \log(x^A))$ for some $A$, $1 \le A \ll 1$.
This yields a hybrid version of Halász's theorem that takes into account the
point $1 + it$:
UBdt

Theorem 12.2 Let t = t(x, log x) and let L = L(x, log x). Then

L
1 X
2
log log x

.
f (n) 
log +

x
1 + |t|
L (log x)2 3

(12.1)

nx

We can obtain a better result when we have no useful information about the
size of L:
UBdHyb2

Theorem 12.3 Let f be a multiplicative function with |f (n)| 1 for all n.


Then

1
log log x
1 X

f (n) 
+
.

x
1 + |t| (log x)1 2
nx

UBdHyb2

Proof of Theorem 12.3 We may suppose that |t| 10. Let y = tf (x, |t| 2).
tRepulsion
2
By Lemma ?? and the definition of t, weHalExplic2
see that |F (1 + iy)|  (log x) , as
|y| |t| 2, and the result follows from (??) with T = |t| 2.
2
Exercise 12.1 Prove that if $|t| \ll m$ and $|\delta| \le 1/2$ then $2m^{it} = (m - \delta)^{it} + (m + \delta)^{it} + O(|t|/m^2)$. Deduce that
\[
\sum_{m \le z} m^{it} = \begin{cases} \dfrac{z^{1+it}}{1 + it} + O(1 + t^2) \\[2mm] O(z). \end{cases}
\]
Generalize this argument to sum other (carefully selected) functions over the integers.
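The first estimate in the exercise can be observed directly by summing $m^{it}$ and comparing with the main term $z^{1+it}/(1+it)$. The values `z = 10000` and `t = 2.0` below are hypothetical choices for illustration only.

```python
import cmath
import math


def power_sum(z, t):
    """Sum of m^(it) over integers 1 <= m <= z."""
    return sum(cmath.exp(1j * t * math.log(m)) for m in range(1, z + 1))


def main_term(z, t):
    """The main term z^(1+it) / (1+it)."""
    return cmath.exp((1 + 1j * t) * math.log(z)) / (1 + 1j * t)


z, t = 10_000, 2.0  # hypothetical test values
total = power_sum(z, t)
error = abs(total - main_term(z, t))
print(abs(total), error)
```

The sum itself has size comparable to $z/|1+it|$, while the difference from the main term stays bounded as $z$ grows with $t$ fixed, in line with the $O(1 + t^2)$ error term.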

We require the following lemma, which relates the mean value of $f(n)$ to the
mean value of $f(n) n^{-it}$.

Lemma 12.4 Suppose $f(n)$ is a multiplicative function with $|f(n)| \le 1$ for all
$n$. Then for any real number $t$ with $|t| \le x^{1/3}$ we have
\[
\sum_{n \le x} f(n) = \frac{x^{it}}{1 + it} \sum_{n \le x} \frac{f(n)}{n^{it}} + O\Big( \frac{x}{\log x} \log(2 + |t|) \exp\Big( \mathbb{D}(f(n), n^{it}; x) \sqrt{2 \log \log x} \Big) \Big) .
\]

Proof Let g and h denote the multiplicative functions


P defined by g(n) =
f (n)/nit , and h(pk ) = g(pk ) g(pk1 ), so that g(n) = d|n h(d). Then
X
X
X
X
X
X
f (n) =
g(n)nit =
nit
h(d) =
h(d)dit
mit .
nx

nx

nx

d|n

dx

mx/d
2

We use the first estimate in the exercise when d x/(1 + t ), and the second
estimate when x/(1 + t2 ) d x. This gives


X
X
X
x1+it X h(d)
|h(d)|
f (n) =
+ O (1 + t2 )
|h(d)| + x
.
1 + it
d
d
2
2
nx

dx

dx/(1+t )

x/(1+t )dx

Applying (2.4.5) and (2.4.6) we deduce that



X
X |h(d)| 
x
x1+it X h(d)
f (n) =
+O
log(2 + |t|)
1 + it
d
log x
d
nx
dx
dx


X |1 g(p)| 
x1+it X h(d)
x
=
+O
log(2 + |t|) exp
.
1 + it
d
log x
p
px

dx

We use this estimate twice, once as it is, and then with f (n) replaced by f (n)/nit ,
and t replaced by 0, so that g and h are the same in both cases.
Then, by the Cauchy-Schwarz inequality,

2
X |1 g(p)|
X 1 X 1 Re(g(p))

2D(g(n), 1; x)2 (log log x+O(1)),


p
p
p
px

px

px

and the result follows, since D(f (n), nit ; x)2 = D(g(n), 1; x)2  log log x.
AsympT2

UBdt

Proof
12.2 We may assume that M := Mf (x,
log x) >
of Theorems 12.1 andAsympT2
AsympT1
(2 3) log log x else Corollary 12.1 follows immediately
from
Lemma
12.4.
Now,

P
2 3
in this case nx f (n)  x log log x/(log x)
by Halaszs Theorem. Now let
P
g(n) = f (n)/nit . If |t| > 12 log x then |(xit /(1 + it)) nx g(n)| x/(1 + |t|) 
AsympT2
x/ log x and Corollary 12.1 follows. But if |t|AsympT2
> 21 log x then tg (x, 12 log x) = 0,
so that Mg (x, 21 log x) = M , and Corollary 12.1 follows from Halaszs Theorem
applied to g.
UBdt
AsympT2
Finally Theorem 12.2 follows from Corollary 12.1 by the definition of L.
It is left as an exercise for the reader to prove this for t = tf (xA , log(xA )).
2

58

12.1

Hal
aszs Theorem: Inverses and Hybrids

Lower Bounds on mean values

Hal
aszs Theorem states that

1 X

f (n)  L(x, T ) log(2/L(x, T )) + T 1 .

x
nx

We will see an example which shows that the L log(1/L) is necessary, but that is
for a very special function. Of more interest is whether we really need a function
like L in our upper bound for typical f .
LBdL

Theorem 12.5 Suppose that t = tf (x, T ) = 0 and let L = L(x, T ) with =


1/ log(1/L) and B = log(1/). There exists a constant c > 0 such that there
exists y in the range xL/C y xCB for which


X




 L(x, T )y.
f
(n)


ny

If f (n) 0 for all n then one can improve this to
X
f (n)  L(y, T )y.
ny

TruncRight

Proof By (4.5) we have


L log x 

n1

If y > x then (1/y)


rem, and so
Z
1
xCB


= 1+

X f (n)
n

1+ log1 x

ny

1
y

2+ log1 x

f (n)dy.

ny

f (n)  L(y) log(1/L(y))  L(x)/ by Halaszs Theo-

2+ log1 x

Z

1
log x

f (n)dy 

ny

L(x)

xCB

dy
y

1+ log1 x

L(x) log x
.
e(C1)B

Also taking L = L(x) we have


Z

xL/C

Now if (1/y)
Z

ny

xCB

xL/C

1
2+ log1 x

xL/C

f (n)dy
1

ny

dy
L log x

.
y
C

f (n)  L(x)/C for all y, xL/C < y < xCB we obtain


1

2+ log1 x

X
ny

L
f (n)dy 
C

xL/C

xCB

dy
y

1+ log1 x

L log x
.
C

Combining these estimates yields a contradiction if C is sufficiently large and so


implies our first result.

Tenenbaum (Selberg)

59

Now
Then L(xt ) L(x)/t, and hence if
P suppose that f (n) 0 for all n.
L/C
(1/y) ny f (n)  L(y)/C for all y, x
< y < xCB then
Z

xCB

xL/C

1
y

2+ log1 x

log x

f (n)dy  L/C
xL/C

ny

1+ log1 x

log y

xCB

dy

dy + L/C
x

1+ log1 x

L log x
,
C

which implies our second result.

Note that this cannot be much improved.


The example with f (p) = 1 for
P


p < xL and f (p) = 0 thereafter, yields ny f (n)  y/uu for y = xuL , so in
our first result we cannot improve the lower bound on the range for y to as much
as y L log(1/L) ; and in the second result to as much as y cL .
AsympT2

Exercise 12.6 Use Theorem 12.1 to obtain an analogous result when tf (x, T ) 6=
0.
12.2

Tenenbaum (Selberg)

Developing an idea of Selberg, Tenenbaum showed that if the mean value of f (p)
is z, where z 6= 0, 1, with very little variance, then


1
1X
f (n)
lim (s 1)z F (s) (log x)z1 .
x
(z) s1+
nx

Our expected mean value is the same quantity with (z) replaced by e(1z) .
Note that if z = 0, 1 then 1/(z) = 0 so we might expect a rather different
phenomenon there. Indeed one can show that in both those cases the mean
value is  1/(log x)2 . In the case z = 0 this singularity restricts how much we
might believe our heuristic about the mean value of a multiplicative function. In
particular when trying to prove a lower bound on the mean value like  L we
see that it is necessary to include at least a O(1/ log x) term.
This is a very delicate
Pkind of result for real z 0. Let z = , 0;
the above suggests that
nx f (n) = o(x/(log x)). If we now alter
P the multiplicative
function
f
on
the
primes
(x/2,
x]
only,
then
we
alter
nx f (n) by
P
0
x/2<px f (p)f (p) which can be selected to have any size as large as x/2 log x.
This implies that to prove the above result we need very precise distribution of
the f (p); not something of great general interest.

13
DISTRIBUTION OF VALUES OF A MULTIPLICATIVE
FUNCTION
Suppose that $f$ is a multiplicative function, with $|f(n)| = 1$ for all $n \ge 1$. Define
\[
R_f(N, \alpha, \beta) := \frac{1}{N} \# \Big\{ n \le N : \frac{1}{2\pi} \arg(f(n)) \in (\alpha, \beta] \Big\} .
\]
We say that the $f(n)$ are uniformly distributed on the unit circle if $R_f(N, \alpha, \beta) \to \beta - \alpha$
for all $0 \le \alpha < \beta < 1$. Jordan Ellenberg asked whether the values $f(n)$
are necessarily equidistributed on the unit circle according to some measure, and
if not whether their distribution is entirely predictable. We prove the following
response.
Theorem 13.1 Let $f$ be a completely multiplicative function such that each $f(p)$
is on the unit circle. Either the $f(n)$ are uniformly distributed on the unit circle,
or there exists a positive integer $k$ for which $(1/N) \sum_{n \le N} f(n)^k \not\to 0$. If $k$ is the
smallest such integer then
\[
R_f(N, \alpha, \beta) = \frac{1}{k} R_{f^k}(N, k\alpha, k\beta) + o_{N \to \infty}(1) \quad \text{for } 0 \le \alpha < \beta < 1 .
\]
Exercise 13.2 Deduce in the final case that $R(N, \alpha + \frac{1}{k}, \beta + \frac{1}{k}) = R(N, \alpha, \beta) + o_{N \to \infty}(1)$ for all $0 \le \alpha < \beta < 1$.
The last parts of the result tell us that if $f$ is not uniformly distributed on the unit circle, then its distribution function is $k$ copies of the distribution function for $f^k$, a multiplicative function whose mean value does not tend to 0. It is easy to construct examples of such functions $f^k = g$ whose distribution function is not uniform: For example, let $g(p) = 1$ for all odd primes $p$ and $g(2) = e(\sqrt{2})$, where $g$ is completely multiplicative.
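As a numerical sanity check of this example (a sketch only, not part of the argument), one can compute the mean value of $g$ directly. Since $g(n)$ depends only on the power of 2 dividing $n$, the mean value should approach $\sum_{j\ge 0} 2^{-j-1} e(j\sqrt2) = 1/(2 - e(\sqrt2)) \ne 0$. The helper names (`e`, `v2`, `g`, `mean_value`, `predicted_mean`) are ours, and the angle $\sqrt2$ for $g(2)$ is an assumption of the sketch; any irrational angle behaves similarly.

```python
import cmath
import math


def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)


def v2(n):
    """Exponent of the largest power of 2 dividing n."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k


THETA = math.sqrt(2)  # assumed angle: g(2) = e(sqrt(2))


def g(n):
    """Completely multiplicative: g(2) = e(THETA), g(p) = 1 for odd primes p."""
    return e(THETA * v2(n))


def mean_value(N, m=1):
    """(1/N) * sum_{n <= N} g(n)^m."""
    return sum(g(n) ** m for n in range(1, N + 1)) / N


def predicted_mean(m=1):
    """Limit of the mean value of g^m: sum_j 2^{-(j+1)} e(m*THETA*j)."""
    return 1 / (2 - e(m * THETA))


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, abs(mean_value(4096, m)), abs(predicted_mean(m)))
```

Every power $g^m$ with $m \ne 0$ has a non-zero mean value here, so by Weyl's criterion the values $g(n)$ cannot be uniformly distributed.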
To prove our distribution theorem we use
Weyl's theorem Let $\{\xi_n : n \ge 1\}$ be any sequence of points on the unit circle. The set $\{\xi_n : n \ge 1\}$ is uniformly distributed on the unit circle if and only if $\lim_{N\to\infty} (1/N) \sum_{n\le N} \xi_n^m$ exists and equals 0, for each non-zero integer $m$.
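Weyl's criterion is easy to test numerically. The sketch below (our own illustration; the two test sequences are assumptions chosen for convenience, not taken from the text) compares the power sums and the arc proportions for an equidistributed sequence, the points $e(n\sqrt2)$, and for a sequence taking only two values.

```python
import cmath
import math


def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)


def weyl_mean(points, m):
    """(1/N) * sum of xi^m over the given points on the unit circle."""
    return sum(xi ** m for xi in points) / len(points)


def proportion_in_arc(points, alpha, beta):
    """Proportion of points whose angle/(2*pi), reduced to [0, 1), lies in (alpha, beta]."""
    count = 0
    for xi in points:
        frac = (cmath.phase(xi) / (2 * math.pi)) % 1.0
        if alpha < frac <= beta:
            count += 1
    return count / len(points)


N = 2000
uniform = [e(n * math.sqrt(2)) for n in range(1, N + 1)]          # equidistributed
clustered = [e(math.sqrt(2) * (n % 2)) for n in range(1, N + 1)]  # only two values

if __name__ == "__main__":
    print(abs(weyl_mean(uniform, 1)), proportion_in_arc(uniform, 0.2, 0.5))
    print(abs(weyl_mean(clustered, 1)), proportion_in_arc(clustered, 0.2, 0.5))
```

For the first sequence the power sums are tiny and the arc $(0.2, 0.5]$ receives close to $0.3$ of the points; for the second the first power sum already stays bounded away from 0.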
We warm up for the proof of the distribution theorem by proving the following result:
Corollary 13.3 Let $f$ be a completely multiplicative function such that each $f(p)$ is on the unit circle. The following statements are equivalent:
(i) The $f(n)$ are uniformly distributed on the unit circle.
(ii) Fix any $t \in \mathbb{R}$. The $f(n)n^{it}$ are uniformly distributed on the unit circle.
(iii) For each fixed non-zero integer $k$, we have $\sum_{n\le N} f(n)^k = o(N)$.


Proof That (i) is equivalent to (iii) is given by Weyl's equidistribution theorem. By Halász's Theorem we find that (iii) does not hold for some given $k \ne 0$ if and only if $f(n)^k$ is $n^{iu}$-pretentious for some fixed $u$. But this holds if and only if $(f(n)n^{it})^k$ is $n^{i(u+kt)}$-pretentious for some fixed $u$. But then, by Theorem 12.5, we see that (iii) does not hold with $f(n)$ replaced by $f(n)n^{it}$, and hence the $f(n)n^{it}$ are not uniformly distributed on the unit circle. □
Proof of the distribution theorem The first part of the result follows from Corollary 13.3. If $k$ is the smallest positive integer for which $\sum_{n\le N} f(n)^k \gg N$ then, by Halász's Theorem we know that there exists $u_k \ll 1$ such that $D(f(n)^k, n^{iku_k}, \infty) < \infty$, and that $D(f^j, n^{iu}, \infty) = \infty$ for $1 \le j \le k-1$, whenever $|u| \ll 1$. (And note that $D(f^{-j}, n^{-iu}, \infty) = D(f^j, n^{iu}, \infty)$.) Write $f(p) = r(p)p^{iu_k} g(p)$, where $r(p)$ is chosen to be the nearest $k$th root of unity to $f(p)p^{-iu_k}$, so that $|\arg(g(p))| \le \pi/k$, and hence $1 - \mathrm{Re}(g(p)) \le 1 - \mathrm{Re}(g(p)^k)$. Therefore $D(1, g, \infty) \le D(g^k, 1, \infty) = D(f(n)^k, n^{iku_k}, \infty) < \infty$.
By the triangle inequality, $D(f^{mk}, n^{ikmu_k}, \infty) \le m D(f^k, n^{iku_k}, \infty) < \infty$, and $D(f^{mk+j}, n^{iu}, \infty) \ge D(f^j, n^{iv}, \infty) - D(f^{mk}, n^{ikmu_k}, \infty) = \infty$, where $v = u - kmu_k$, for $1 \le j \le k-1$ and any $|u| \ll 1$, and so $\sum_{n\le N} f(n)^{\ell} = o_{\ell}(N)$ if $k \nmid \ell$.
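The characteristic function of the interval $(\alpha, \beta)$ is
$$\sum_{m\in\mathbb{Z}} \frac{e(-m\alpha) - e(-m\beta)}{2i\pi m}\, e(mt),$$
where the term $m = 0$ is to be read as $\beta - \alpha$. We can take this sum in the range $|m| \le M$ with an error $\epsilon$. Hence
$$R(N, \alpha, \beta) = (\beta - \alpha) + \sum_{1\le |m|\le M} \frac{e(-m\alpha) - e(-m\beta)}{2i\pi m} \cdot \frac1N \sum_{n\le N} f(n)^m + O(\epsilon)$$
$$= (\beta - \alpha) + \sum_{1\le |r|\le R} \frac{e(-kr\alpha) - e(-kr\beta)}{2i\pi kr} \cdot \frac1N \sum_{n\le N} f(n)^{kr} + O(\epsilon)$$
writing $m = kr$ (since the other mean values are $o(1)$) and $R = [M/k]$. This formula does not change value when we change $\{\alpha, \beta\}$ to $\{\alpha + \frac1k, \beta + \frac1k\}$, nor when we change $\{f, \alpha, \beta\}$ to $\frac1k$ times the formula for $\{f^k, k\alpha, k\beta\}$, and hence the results. □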
It is an interesting problem to prove a uniform version of this result when N
is large.

14
LIPSCHITZ BOUNDS
We wish to determine how mean values of multiplicative functions vary in short intervals. Theorem 12.1 shows that this is not straightforward, for if the mean values of $f(n)$ at $x$ and $x/z$ are roughly the same and large, and similarly the mean values of $f(n)/n^{it}$ at $x$ and $x/w$, then Theorem 12.1 implies that $w^{it} \approx 1$, which is not necessarily true. However if we take the $t$ into account then we can prove such a result:
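Corollary 14.1 For $1 \le w \le x^{1-\epsilon}$, we have
$$\left| \frac{1}{x^{1+it}} \sum_{n\le x} f(n) - \frac{1}{(x/w)^{1+it}} \sum_{n\le x/w} f(n) \right| \ll \frac{1}{1+|t|} \left( \frac{\log 2w}{\log x} \right)^{\lambda} \log\left( \frac{\log x}{\log 2w} \right) + \frac{\log\log x}{(\log x)^{2-\sqrt3}},$$
where $t = t_f(x, \log x)$ if $|t_f(x, \log x)| < \frac12 \log x$, otherwise $t = 0$, and $\lambda := 1 - \frac{2}{\pi} = 0.36338\ldots$.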

Note that $2 - \sqrt3 = 0.267949\ldots$. When $f(n)$ is non-negative we can improve the $\lambda = 1 - 2/\pi$ in Corollary 14.1 to $1 - 1/\pi$, see [49].
As a consequence we can give the same upper bound on the absolute value
of the difference of the mean value of f up to x, and the mean value of f up to
x/w. However we can do better if f is real-valued:

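Exercise 14.2 Deduce that if $f(n) \in \mathbb{R}$ for all $n$ then
$$\left| \frac1x \sum_{n\le x} f(n) - \frac{1}{x/w} \sum_{n\le x/w} f(n) \right| \ll \left( \frac{\log 2w}{\log x} \right)^{\lambda} \log\left( \frac{\log x}{\log 2w} \right) + \frac{\log\log x}{(\log x)^{2-\sqrt3}}.$$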
We deduce Corollary 14.1 from the following:

Theorem 14.3 For any $x \ge 3$ and all $1 \le w \le x/10$, we have, with the same notation as Corollary 14.1,
$$\left| \frac1x \sum_{n\le x} \frac{f(n)}{n^{it}} - \frac{w}{x} \sum_{n\le x/w} \frac{f(n)}{n^{it}} \right| \ll \left( \frac{\log 2w}{\log x} \right)^{\lambda} \log\left( \frac{\log x}{\log 2w} \right).$$

We would like to increase the exponent $\lambda$ as much as possible. It must be $\le 1$ since, for the Dickman function $\rho$, we have $|\rho(1+\delta) - \rho(1)| = \log(1+\delta)$ for $0 \le \delta \le 1$.


Our proof is a modification of the proof of Halász's Theorem, so that the key is the appropriate modification of Proposition 8.5. We again define $S(N) := \sum_{n\le N} f(n)$.
If we use exercise in section 8.4 to establish that
$$\frac{1}{n^{\alpha}} \left( 1 - w^{-\alpha-iy} \right) = \frac{1}{\pi} \int_{-T}^{T} \frac{\alpha}{\alpha^2 + \xi^2}\, n^{-i\xi} \left( 1 - w^{-iy-i\xi} \right) d\xi + O\left( \frac{\alpha}{T} \right),$$
then we obtain a slight variant of Lemma 8.6:

Lemma 14.4 With the same hypothesis as Lemma ??, for all real numbers $T, w \ge 1$, and all $0 \le \alpha \le 1$ we have
$$\max_{|t|\le T} |A(1+\alpha+it)(1 - w^{-\alpha-it})| \le \max_{|u|\le 2T} |A(1+iu)(1 - w^{-iu})| + O\left( \frac{\alpha}{T} \sum_{n=1}^{\infty} \frac{|a_n|}{n} \right).$$

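Proposition 14.5 Let $f$, $T$, and $x$ be as in Proposition 8.5. Then for $1 \le w \le \sqrt{x}$, we have
$$\left| \frac{S(x)}{x} - \frac{S(x/w)}{x/w} \right| \ll \frac{1}{\log x} \int_{1/\log x}^{1} \max_{|y|\le T} |(1 - w^{-\alpha-iy}) F(1+\alpha+iy)| \frac{d\alpha}{\alpha} + \frac1T + \frac{\log 2w}{\log x} \log\left( \frac{\log x}{\log 2w} \right).$$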

Proof Since the proof is very similar to that of Proposition 8.5, we shall merely sketch it. Arguing as in the proof of Proposition 8.5, we get that
$$\int_{2w}^{x} \left| \frac{S(y)}{y} - \frac{S(y/w)}{y/w} \right| \frac{dy}{y} \ll \int_{1/\log x}^{1} \int_{2w}^{x} \left| \frac1y \sum_{n\le y} f(n)\log n - \frac{1}{y/w} \sum_{n\le y/w} f(n)\log n \right| \frac{dy}{y^{1+2\alpha}}\, d\alpha + \log 2w\, \log\left( \frac{\log x}{\log 2w} \right).$$
Using Cauchy's inequality, we obtain for $\alpha \ge 1/\log x$,
$$\left( \int_{2w}^{x} \left| \frac1y \sum_{n\le y} f(n)\log n - \frac{1}{y/w} \sum_{n\le y/w} f(n)\log n \right| \frac{dy}{y^{1+2\alpha}} \right)^2 \ll \frac{1}{\alpha} \int_{2w}^{x} \left| \frac1y \sum_{n\le y} f(n)\log n - \frac{1}{y/w} \sum_{n\le y/w} f(n)\log n \right|^2 \frac{dy}{y^{1+2\alpha}}.$$
As in the proof of Proposition 8.5, we extend the range of integration for $y$ to $\int_1^{\infty}$, substitute $y = e^t$, and use Plancherel's formula. The only difference is that $F'(1+\alpha+iy)/(1+\alpha+iy)$ in the right side there must be replaced by the Fourier transform of $e^{-(1+\alpha)t} \sum_{n\le e^t} f(n)\log n - w e^{-(1+\alpha)t} \sum_{n\le e^t/w} f(n)\log n$, which is $F'(1+\alpha+iy)(1 - w^{-\alpha-iy})/(1+\alpha+iy)$. We make this adjustment, and follow the remainder of the proof of Proposition 8.5. □

Proof of Theorem 14.3 We may assume that $|t| \le (\log x)/2$, else the result follows from Theorem 12.3. Let $g(n) = f(n)/n^{it}$, so that $G(s) = F(s+it)$; and therefore
$$|G(1)| = |F(1+it)| = \max_{|y|\le (\log x)/2} |G(1+iy)|.$$
By Proposition 14.5, with $f$ there replaced by $g$, $F$ by $G$, and $T = (\log x)/2$, we obtain the upper bound
$$\ll \frac{1}{\log x} \int_{1/\log x}^{1} \max_{|y|\le (\log x)/2} |G(1+\alpha+iy)(1 - w^{-\alpha-iy})| \frac{d\alpha}{\alpha} + \frac{\log 2w}{\log x} \log\left( \frac{\log x}{\log 2w} \right).$$
Let $a_n$ be the multiplicative function with $a_{p^k} = g(p^k)$ if $p \le x$ and $a_{p^k} = 0$ otherwise, so that $\sum_n |a_n|/n \le \prod_{p\le x} (1 - 1/p)^{-1} \ll \log x$. By Lemma 14.4 with $A(s) = G(s)$, and $T = (\log x)/2$, we have
$$\max_{|y|\le (\log x)/2} |G(1+\alpha+iy)(1 - w^{-\alpha-iy})| \le \max_{|y|\le \log x} |G(1+iy)(1 - w^{-iy})| + O(1).$$
Now $|G(1+iy)| \le \sum_n |a_n|/n \ll \log x$; and $|G(1+iy)| \ll (\log x)^{2/\pi} (1 + 1/|y|)^{1-2/\pi}$ by Lemma ??. Moreover, since $|1 - w^{-iy}| \ll \min(1, |y| \log 2w)$, we deduce that
$$\max_{|y|\le (\log x)/2} |G(1+\alpha+iy)(1 - w^{-\alpha-iy})| \ll (\log x)^{2/\pi} (\log 2w)^{1-2/\pi}.$$
In addition, we have the trivial estimate
$$\max_{|y|\le (\log x)/2} |G(1+\alpha+iy)(1 - w^{-\alpha-iy})| \ll \zeta(1+\alpha) \ll \frac{1}{\alpha}.$$
Using the first bound when $\alpha < 1/\big((\log x)^{2/\pi} (\log 2w)^{1-2/\pi}\big)$, and the second bound otherwise, in our integral, we obtain our result. □
Proof of Corollary 14.1 The result follows from Corollary 12.1 followed by Theorem 14.3. □
14.1 Consequences
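If $m$ is a squarefree integer $\le x^{1-\epsilon}$ we have, for $f$ totally multiplicative,
$$\sum_{\substack{n\le x \\ (n,m)=1}} f(n) = \sum_{n\le x} f(n) \sum_{d|(m,n)} \mu(d) = \sum_{d|m} \mu(d) f(d) \sum_{r\le x/d} f(r)$$
$$= \sum_{d|m} \frac{\mu(d) f(d)}{d^{1+it}} \sum_{n\le x} f(n) + O\left( \sum_{d|m} \frac{x}{d} \left( \left( \frac{\log 2m}{\log x} \right)^{\lambda} \log\left( \frac{\log x}{\log 2m} \right) + \frac{\log\log x}{(\log x)^{2-\sqrt3}} \right) \right)$$
$$= \prod_{p|m} \left( 1 - \frac{f(p)}{p^{1+it}} \right) \sum_{n\le x} f(n) + O\left( \frac{m}{\phi(m)}\, x \left( \left( \frac{\log 2m}{\log x} \right)^{\lambda} \log\left( \frac{\log x}{\log 2m} \right) + \frac{\log\log x}{(\log x)^{2-\sqrt3}} \right) \right) \tag{14.1}$$
by Corollary 14.1, as $1 + |t| \ge 1$. Combining this further with Corollary 14.1 we obtain, for $mw \le x^{1-\epsilon}$,
$$\sum_{\substack{n\le x/w \\ (n,m)=1}} f(n) = \frac{1}{w^{1+it}} \prod_{p|m} \left( 1 - \frac{f(p)}{p^{1+it}} \right) \sum_{n\le x} f(n) + O\left( \frac{m}{\phi(m)} \frac{x}{w} \left( \left( \frac{\log mw}{\log x} \right)^{1-2/\pi} \log\left( \frac{\log x}{\log mw} \right) + \frac{\log\log x}{(\log x)^{2-\sqrt3}} \right) \right). \tag{14.2}$$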

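Exercise 14.6 Verify that
$$\sum_{y\le n\le x} \frac{f(n)}{n} = \frac{S(x)}{x} - \frac{S(y)}{y} + \int_y^x \frac{S(z)}{z^2}\, dz.$$
Prove that if $\omega_t(w) = (1 - i/t)(1 - 1/w^{it})/\log w$ if $t \ne 0$, and $\omega_0(w) = 1$, then
$$\frac{1}{\log w} \sum_{x/w\le n\le x} \frac{f(n)}{n} = \omega_t(w)\, \frac1x \sum_{n\le x} f(n) + O\left( \frac{1}{1+|t|} \left( \frac{\log 2w}{\log x} \right)^{\lambda} \log\left( \frac{\log x}{\log 2w} \right) + \frac{\log\log x}{(\log x)^{2-\sqrt3}} \right).$$
Show that we may assume $t = 0$ if $f$ is real-valued.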


Up until this point in this book we have developed the theory for all multiplicative functions (which is necessary since we need to work with $\mu(n)$). It is
typically easier to develop the theory just for totally multiplicative functions.
The point of the next two exercises is to show that this can be done with little
loss of generality.
Exercise 14.7 Given $f$ define $g$ to be that totally multiplicative function with $g(p) = f(p)$ for all primes $p$. Prove that
$$\sum_{n\le x} f(n) = C_t(f) \sum_{n\le x} g(n) + O\left( x\, \frac{\log\log x}{(\log x)^{2-\sqrt3}} \right)$$
where $t = t_f(x, \log x) = t_g(x, \log x)$, and the correction factor
$$C_t(f) := \prod_p \left( 1 - \frac{f(p)}{p^{1+it}} \right) \left( 1 + \frac{f(p)}{p^{1+it}} + \frac{f(p^2)}{p^{2+2it}} + \ldots \right).$$
(Hint: Write $f = g * h$ and bound the size of $h(p^k)$.) Show that we may take $t = 0$ if $f$ is real-valued. Show that $C_t(f) = 0$ if and only if $f(2^k) = -2^{ikt}$ for all $k \ge 1$.

Exercise 14.8 Use the last two exercises to show that
$$\sum_{n\le x} \frac{f(n)}{n} = C_0(f) \sum_{n\le x} \frac{g(n)}{n} - \eta_t(f)\, \frac1x \sum_{n\le x} g(n) + O\left( \frac{\log\log x}{(\log x)^{2-\sqrt3}} \right)$$
where $\eta_t(f) = (1 - i/t)(C_0(f) - C_t(f))$ if $t \ne 0$, and
$$\eta_0(f) = C_0(f) \sum_{p \text{ prime}} \log p \left( \frac{\sum_{k\ge 0} k f(p^k)/p^k}{\sum_{k\ge 0} f(p^k)/p^k} - \frac{f(p)/p}{1 - f(p)/p} \right).$$
In the special case that $t = 0$ and $f(2^k) = -1$ for all $k \ge 1$ we have
$$\sum_{n\le x} \frac{f(n)}{n} = C_0'(f) \log 8 \cdot \frac1x \sum_{n\le x} g(n) + O\left( \frac{\log\log x}{(\log x)^{2-\sqrt3}} \right),$$
where $C_0'(f) = \prod_{p\ge 3} (1 - f(p)/p)(1 + f(p)/p + f(p^2)/p^2 + \ldots)$. Show that we may take $t = 0$ if $f$ is real-valued.
14.2

Truncated Dirichlet series

One can verify the identity (obtained through partial summation) that for every $\sigma > 0$ one has
$$\sum_{n\le x} \frac{f(n)}{n^{\sigma}} = \frac{S(x)}{x^{\sigma}} + \sigma \int_1^x \frac{S(z)}{z^{1+\sigma}}\, dz.$$
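Exercise 14.9 Use Corollary 14.1 to prove that if $(1-\sigma) \log x \to \infty$ then
$$\sum_{n\le x} \frac{f(n)}{n^{\sigma}} \Big/ \sum_{n\le x} \frac{1}{n^{\sigma}} = \frac{(1-\sigma)(1+it)}{1-\sigma+it} \cdot \frac{S(x)}{x} \left\{ 1 + O\left( \frac{1}{x^{1-\sigma}} \right) \right\} + O\left( \frac{\log\log x}{(1+|t|)(\log x)^{\lambda}} + \frac{\log\log x}{(\log x)^{2-\sqrt3}} \right).$$
In particular if $t = 0$ then this equals $S(x)/x + o(1)$.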
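Exercise 14.10 Show that if $\sigma > 1$ and $(\sigma - 1) \log x \to \infty$ then
$$\sum_{n\le x} \frac{f(n)}{n^{\sigma}} \sim \prod_{p \text{ prime}} \left( 1 + \frac{f(p)}{p^{\sigma}} + \frac{f(p^2)}{p^{2\sigma}} + \ldots \right).$$
In analogy to Proposition 2.9, establish that this can be re-written as
$$\sum_{n\le x} \frac{f(n)}{n^{\sigma}} \Big/ \sum_{n\le x} \frac{1}{n^{\sigma}} \sim \prod_{p\le x} \left( 1 - \frac{1}{p^{\sigma}} \right) \left( 1 + \frac{f(p)}{p^{\sigma}} + \frac{f(p^2)}{p^{2\sigma}} + \ldots \right).$$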

In the last two exercises we have seen that the value of the truncated Dirichlet series can be easily understood for all $\sigma \ge 0$ in terms of Euler products and $S(x)$, except in a small range around $\sigma = 1$. We write $s(t) := S(x^t)/x^t$. Substituting this into the above identity, we obtain for $\sigma = 1 + A/\log x$,
$$\sum_{n\le x} \frac{f(n)}{n^{\sigma}} = e^{-A} s(1) + (\log x + A) \int_0^1 e^{-At} s(t)\, dt.$$
If $A$ is bounded then this implies that
$$\sum_{n\le x} \frac{f(n)}{n^{\sigma}} \Big/ \sum_{n\le x} \frac{1}{n^{\sigma}} = \int_0^1 e^{-At} s(t)\, dt \Big/ \int_0^1 e^{-At}\, dt + O\left( \frac{1}{\log x} \right).$$
This seems to be rather more difficult to understand depending, as it does, on the vagaries of the mean value of $f$.
One can view all of these results as comparison of different weighted mean values.

15
THE STRUCTURE THEOREM
We have seen two types of mean values of multiplicative functions:
When $f(p) = 0$ if $p|m$ and $f(p) = 1$ otherwise then $\sum_{n\le x} f(n) \sim x\mathcal{P}(f; x)$.
When $f(p) = 1$ if $p \le y$ say, then the mean value of $f$ is obtained from an integral delay equation (as in section 3.1).
One might ask what other possibilities there are. The Structure Theorem tells us that all large mean values are the product of the two types, the first for the small prime factors, the latter for the large prime factors:
Given a multiplicative function $f$, let $t = t_f(x, \log x)$ and define
$$g(p^k) = \begin{cases} 1 & \text{if } p \le y \\ f(p^k)/(p^k)^{it} & \text{if } y < p \le x \end{cases} \quad \text{and} \quad h(p^k) = \begin{cases} f(p^k)/(p^k)^{it} & \text{if } p \le y \\ 1 & \text{if } y < p \le x. \end{cases}$$
If $t = 0$ then $h * g = 1 * f$.

Theorem 15.1 We have
$$\frac1x \sum_{n\le x} f(n) = \frac{x^{it}}{1+it} \cdot \frac1x \sum_{n\le x} g(n) \cdot \frac1x \sum_{n\le x} h(n) + O\left( \left( \frac{\log y}{\log x} \right)^{\theta} \right)$$
where $\theta = \lambda/(1+\lambda) < 0.2665288966\ldots$


Proof of Theorem 15.1 We begin our proof in the case that $t_f(x, \log x) = 0$. We let $I(x)$ equal
$$\sum_{n\le x} (g * h)(n) \log(x/n) = \sum_{ab\le x} g(a) h(b) \int_b^{x/a} \frac{dT}{T} = \int_1^x \sum_{a\le x/T} g(a) \sum_{b\le T} h(b)\, \frac{dT}{T}.$$
We split this integral into several intervals. First for $T \le y$ we simply use the trivial bounds to get $\ll x \log y$. For the remaining values of $T$ we simply take $f = h$ in Proposition 3.6 to obtain a main term, as $\mathcal{P}(h; T) = \mathcal{P}(h; x)$, of
$$\int_y^x \sum_{a\le x/T} g(a)\, \mathcal{P}(h; x)\, dT = \mathcal{P}(h; x)\, x \int_1^{x/y} \sum_{a\le A} g(a)\, \frac{dA}{A^2}$$
plus an error term, again using the trivial bound for $\sum_{a\le x/T} g(a)$, and writing $T = y^t$, $x = y^u$, of
$$\ll \log y \int_1^u x\, t^{-t}\, dt \ll x \log y,$$
by Proposition 3.6. Hence if $z = y^v$ where $1 \le v = u^{\theta} \le u$ then
$$\frac{I(x) - zI(x/z)}{x} = \mathcal{P}(h; x) \int_{x/yz}^{x/y} \frac1A \sum_{a\le A} g(a)\, \frac{dA}{A} + O(\log y)$$
$$= \log z\, \mathcal{P}(h; x) \left( \frac1x \sum_{n\le x} g(n) + O\left( 1/v + (v/u)^{\lambda} \right) \right)$$
$$= \log z \left( \frac1x \sum_{n\le x} g(n) \cdot \frac1x \sum_{n\le x} h(n) + O\left( 1/u^{\theta} \right) \right)$$
by Theorem 14.3, and then re-applying Proposition 3.6.
Now since $g * h = f * 1$ we can apply the same observations to the pair $f$ and $1$ (though we could easily obtain sharper estimates in this case); comparing the two evaluations of $I(x) - zI(x/z)$ yields the result in the case that $t_f(x, \log x) = 0$.
We now deduce the result when $t = t_f(x, \log x) \ne 0$ by comparing $f(n)$ to $F(n) := f(n)/n^{it}$ using Corollary 12.1; hence $t_F(x, \frac12 \log x) = 0$ and we can apply the above. The result follows. □
15.1

Best possible

Let $f(p) = -1$ if $y^{1/2} < p \le y$ or $x/y^{1/2} < p \le x$, and $f(p) = 1$ otherwise. Then
$$\frac1x \sum_{n\le x} h(n) = \frac12 + O((c/u)^u),$$
$$\frac1x \sum_{n\le x} g(n) = 1 - 2 \sum_{x/y^{1/2} < p \le x} \frac1p = 1 + 2\log(1 - 1/2u) = 1 - \frac1u + O\left( \frac{1}{u^2} \right),$$
$$\frac1x \sum_{n\le x} f(n) = \frac12 + O((c/u)^u) - 2 \sum_{x/y^{1/2} < p \le x} \frac1p = \frac12 - \frac1u + O\left( \frac{1}{u^2} \right).$$
Hence
$$\frac1x \sum_{n\le x} f(n) - \frac1x \sum_{n\le x} g(n) \cdot \frac1x \sum_{n\le x} h(n) = -\frac{1}{2u} + O\left( \frac{1}{u^2} \right)$$
so we see that we must have $\theta \le 1$ in Theorem 15.1.
One might hope for something like
$$\frac1x \sum_{n\le x} f(n) = \frac1x \sum_{n\le x} g(n) \cdot \frac1x \sum_{n\le x} h(n) \left( 1 + O\left( \frac{1}{u^c} \right) \right)$$
but it is not true in general. Try $f(p) = -1$ for $y^{\alpha} < p \le y$ or $x/y^{\alpha} < p \le x$, and $f(p) = 1$ otherwise. Then the means for $h$, $g$ and $f$ are $\alpha^2$, $1 - 2\alpha/u$ and $\alpha^2 - 2\alpha/u$, respectively. Taking $\alpha = 1/u$ gives mean values $1/u^2$, $1$ and $-1/u^2$ roughly; ie the above hoped-for estimate is ridiculous. This example does not work if we take $0$ instead of $-1$ since then the mean values are $\alpha$, $1 - \alpha/u$, $\alpha - \alpha/u$ respectively, so the last displayed equation with $c = 1$ is feasible. This would be a good research project (ie prove the last display for $f(n) \in [0, 1]$).

16
THE LARGE SIEVE
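We are interested in how a given sequence of complex numbers, $a_1, a_2, \ldots$, is distributed in arithmetic progressions mod $q$. By (10.1), when $(b, q) = 1$, we have
$$\sum_{n\equiv b \pmod q} a_n = \frac{1}{\phi(q)} \sum_{\chi \pmod q} \overline{\chi}(b) \sum_n a_n \chi(n).$$
Therefore, by using (10.2), we deduce that
$$\sum_{(b,q)=1} \Big| \sum_{n\equiv b \pmod q} a_n \Big|^2 = \frac{1}{\phi(q)} \sum_{\chi \pmod q} \Big| \sum_n a_n \chi(n) \Big|^2. \tag{16.1}$$
Now
$$\sum_{(b,q)=1} \Big| \sum_{\substack{n\le N \\ n\equiv b \pmod q}} a_n \Big|^2 \le \sum_{(b,q)=1} \left( \frac{N}{q} + 1 \right) \sum_{\substack{n\le N \\ n\equiv b \pmod q}} |a_n|^2 \le \left( \frac{N}{q} + 1 \right) \sum_n |a_n|^2,$$
so by (16.1) we deduce that
$$\frac{q}{\phi(q)} \sum_{\chi \pmod q} \Big| \sum_{n\le N} a_n \chi(n) \Big|^2 \le (q + N) \sum_{n\le N} |a_n|^2. \tag{16.2}$$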
Note that if $a_n = \overline{\chi}(n)$ for all $n$, then the term on the left-side of (16.2) corresponding to the character $\chi$ has size $\frac{\phi(q)}{q} N^2$, whereas the right-side of (16.2) is about $(q + N) \frac{\phi(q)}{q} N$. Hence if $q = o(N)$ then (16.2) is best possible and any of the terms on the left-side could be as large as the right side. It thus makes sense to remove the largest term on the left side (or largest few terms) to determine whether we can get a significantly better upper bound for the remaining terms.
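Both the identity (16.1) and the bound (16.2) can be verified numerically for small moduli. The sketch below is our own illustration and restricts to a prime modulus $q$, where the characters are easy to write down from a primitive root $g$ (supplied by the caller); the function names and the test sequence $a_n$ are assumptions of the sketch.

```python
import cmath
import math


def characters_mod_prime(q, g):
    """All Dirichlet characters mod a prime q, built from a primitive root g mod q.

    Each character is a dict n -> chi(n) for 1 <= n < q.
    """
    index = {}  # discrete logarithm: index[g^k mod q] = k
    value = 1
    for k in range(q - 1):
        index[value] = k
        value = value * g % q
    assert len(index) == q - 1, "g is not a primitive root mod q"
    return [
        {n: cmath.exp(2j * math.pi * j * index[n] / (q - 1)) for n in index}
        for j in range(q - 1)
    ]


def both_sides_of_16_1(a, q, g):
    """Return (left, right) of (16.1), where a[n-1] = a_n for 1 <= n <= N."""
    N = len(a)
    left = 0.0
    for b in range(1, q):
        left += abs(sum(a[n - 1] for n in range(1, N + 1) if n % q == b)) ** 2
    right = 0.0
    for chi in characters_mod_prime(q, g):
        s = sum(a[n - 1] * chi[n % q] for n in range(1, N + 1) if n % q != 0)
        right += abs(s) ** 2
    return left, right / (q - 1)


if __name__ == "__main__":
    q, g, N = 7, 3, 40
    a = [complex(math.cos(n * n), math.sin(3 * n)) for n in range(1, N + 1)]
    left, right = both_sides_of_16_1(a, q, g)
    energy = sum(abs(z) ** 2 for z in a)
    print(left, right)                # the two sides of (16.1)
    print(q * right, (q + N) * energy)  # the two sides of (16.2)
```

Here `q * right` equals $\frac{q}{\phi(q)} \sum_{\chi} |\sum a_n\chi(n)|^2$, the left side of (16.2).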
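This also has arithmetic meaning since the same argument used to prove (16.1) yields, for any choice of $\chi_1, \ldots, \chi_k$,
$$\sum_{(b,q)=1} \Big| \sum_{n\equiv b \pmod q} a_n - \frac{1}{\phi(q)} \sum_{i=1}^{k} \overline{\chi_i}(b) \sum_n a_n \chi_i(n) \Big|^2 = \frac{1}{\phi(q)} \sum_{\chi \ne \chi_1, \ldots, \chi_k} \Big| \sum_n a_n \chi(n) \Big|^2. \tag{16.3}$$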

Typically number theorists are interested in sequences where $a_n = 0$ or $1$ (which indicates a subset $A$ of the integers up to $N$), and which are dense, that is $A$ contains more than $N/(\log N)^k$ elements, or even a positive proportion of the integers up to $N$. Given $q$ it is easy enough to find a dense sequence $A$ that is not well distributed mod $q$ (for example let $A$ be the union of about $q/2$ arithmetic progressions mod $q$), or even one that is not well distributed modulo each $q$ in some finite set. Nonetheless we might expect that $A$ is well-distributed for almost all $q$ (say up to $\sqrt{N}$) though one needs to be cautious, for if $A$ is not well-distributed mod $m$ then it will not be well-distributed mod $n$ whenever $m$ divides $n$. To see this, suppose that there are at least $(1 + \delta)|A|/m$ elements of $A$ that are $\equiv b \pmod m$. By the pigeonhole principle there exists some residue class $B \pmod n$, with $B \equiv b \pmod m$, which contains at least $(1 + \delta)|A|/n$ elements of $A$. Thus we see it makes more sense to compare the number of elements of $A$ that are $\equiv B \pmod n$ with the number that are $\equiv B \pmod m$ for each proper divisor $m$ of $n$.
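Exercise 16.1 Show that the correct measure of how well the $a_n$ are distributed mod $q$ (with respect to the divisors of $q$) is
$$\sum_{d|q} \frac{\mu(q/d)\phi(d)}{\phi(q)} \sum_{\substack{n\le x \\ n\equiv b \pmod d \\ (n,q)=1}} a_n = \frac{1}{\phi(q)} \sum_{\substack{\chi \pmod q \\ \chi \text{ primitive}}} \overline{\chi}(b) \sum_{n\le x} a_n \chi(n).$$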

Summing the left-side of (16.2) over $q \le Q$ is important in applications, which yields a right-side with coefficient $Q^2/2 + QN$. However with the added restriction to primitive characters (which we saw is appropriate in exercise 16.1), we can use some simple linear algebra to improve this to obtain
The large sieve
$$\sum_{q\le Q} \frac{q}{\phi(q)} \sum_{\substack{\chi \pmod q \\ \chi \text{ primitive}}} \Big| \sum_{n=M+1}^{M+N} a_n \chi(n) \Big|^2 \le (N + Q^2 - 1) \sum_{n=M+1}^{M+N} |a_n|^2. \tag{16.4}$$
(We will prove this initially with $Q^2 - 1$ replaced by $3Q^2 \log Q$.)
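Theorem 16.1 (Duality) Let $x_{m,n} \in \mathbb{C}$ for $1 \le m \le M$, $1 \le n \le N$. For any constant $c$ we have
$$\sum_n \Big| \sum_m a_m x_{m,n} \Big|^2 \le c \sum_m |a_m|^2$$
for all $a_m \in \mathbb{C}$, $1 \le m \le M$ if and only if
$$\sum_m \Big| \sum_n b_n x_{m,n} \Big|^2 \le c \sum_n |b_n|^2$$
for all $b_n \in \mathbb{C}$, $1 \le n \le N$.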

Proof This can be rephrased as stating that for any $m$-by-$n$ matrix $X$ with complex entries (that is $X \in M_{m,n}(\mathbb{C})$), we have
$$\max_{a \in M_{1,m}(\mathbb{C})} \frac{|aX|}{|a|} = \max_{b \in M_{n,1}(\mathbb{C})} \frac{|Xb|}{|b|}.$$
To see this suppose that $|aX| \le \sqrt{c}\, |a|$ for all $a \in M_{1,m}(\mathbb{C})$. Given any $b \in M_{n,1}(\mathbb{C})$ let $a = (Xb)^*$, the conjugate transpose of $Xb$, so that
$$|a|\, |Xb| = |Xb|^2 = a \cdot Xb = (aX) b \le |aX|\, |b| \le \sqrt{c}\, |a|\, |b|,$$
and therefore either $|Xb| \le \sqrt{c}\, |b|$, or $a = Xb = 0$ which also yields $|Xb| \le \sqrt{c}\, |b|$. The reverse implication is analogous. □
Proposition 16.2 Let $a_n$, $M + 1 \le n \le M + N$ be a set of complex numbers, and $x_r$, $1 \le r \le R$ be a set of real numbers. Let $\delta := \min_{r\ne s} \|x_r - x_s\| \in [0, 1/2]$, where $\|t\|$ denotes the distance from $t$ to the nearest integer. Then
$$\sum_r \Big| \sum_{n=M+1}^{M+N} a_n e(n x_r) \Big|^2 \le \left( N + \frac{\log(e/\delta)}{\delta} \right) \sum_{n=M+1}^{M+N} |a_n|^2$$
where $e(t) = e^{2i\pi t}$.

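Proof For any $b_r \in \mathbb{C}$, $1 \le r \le R$, we have
$$\sum_n \Big| \sum_r b_r e(n x_r) \Big|^2 = \sum_{r,s} b_r \overline{b_s} \sum_{n=M+1}^{M+N} e(n(x_r - x_s)) = N \|b\|^2 + E,$$
since the inner sum is $N$ if $r = s$, where, for $L := M + \frac12 (N + 1)$,
$$E = \sum_{r\ne s} b_r \overline{b_s}\, e(L(x_r - x_s))\, \frac{\sin(\pi N (x_r - x_s))}{\sin(\pi (x_r - x_s))}.$$
Taking absolute values we obtain
$$|E| \le \sum_{r\ne s} \frac{|b_r b_s|}{|\sin(\pi (x_r - x_s))|} \le \sum_{r\ne s} \frac{|b_r b_s|}{2\|x_r - x_s\|} \le \sum_r |b_r|^2 \sum_{s\ne r} \frac{1}{2\|x_r - x_s\|}$$
since $2|b_r b_s| \le |b_r|^2 + |b_s|^2$. Now, for each $x_r$ the nearest two $x_s$ are at distance at least $\delta$ away, the next two at distance at least $2\delta$ away, etc, and so
$$|E| \le \sum_r |b_r|^2 \sum_{j=1}^{[1/\delta]} \frac{2}{2j\delta} \le \frac{\log(e/\delta)}{\delta} \sum_r |b_r|^2,$$
so that
$$\sum_n \Big| \sum_r b_r e(n x_r) \Big|^2 \le \left( N + \frac{\log(e/\delta)}{\delta} \right) \sum_r |b_r|^2.$$
The result follows by the duality principle. □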
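The two elementary facts used in this proof, namely the closed form of the geometric sum $\sum_{n=M+1}^{M+N} e(n\theta)$ and the inequality $|\sin(\pi\theta)| \ge 2\|\theta\|$, can be checked numerically. The following is a small sketch with function names of our own choosing.

```python
import cmath
import math


def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)


def inner_sum(M, N, theta):
    """sum_{n=M+1}^{M+N} e(n*theta), computed term by term."""
    return sum(e(n * theta) for n in range(M + 1, M + N + 1))


def closed_form(M, N, theta):
    """e(L*theta) * sin(pi*N*theta) / sin(pi*theta), with L = M + (N+1)/2."""
    L = M + (N + 1) / 2
    return e(L * theta) * math.sin(math.pi * N * theta) / math.sin(math.pi * theta)


def dist_to_nearest_integer(t):
    """||t||, the distance from t to the nearest integer."""
    return abs(t - round(t))


if __name__ == "__main__":
    M, N, theta = 10, 25, 0.137
    print(inner_sum(M, N, theta), closed_form(M, N, theta))
    print(abs(inner_sum(M, N, theta)), 1 / (2 * dist_to_nearest_integer(theta)))
```

The second printed line illustrates the bound $|\sum_n e(n\theta)| \le 1/(2\|\theta\|)$ that drives the estimate for $E$.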

We have $|E| \ll \sum_r |b_r|^2 / \min_{s\ne r} \|x_r - x_s\| \le \sum_r |b_r|^2/\delta$ by the strong Hilbert inequality (see section *), which leads to the constant $N + O(1/\delta)$ in the result above.
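Proof of (16.4). By (10.3) we have
$$\sum_{n=M+1}^{M+N} a_n \chi(n) = \frac{1}{g(\overline{\chi})} \sum_{b \pmod q} \overline{\chi}(b) \sum_{n=M+1}^{M+N} a_n e\left( \frac{bn}{q} \right),$$
where $g(\cdot)$ is the Gauss sum. Therefore, using (16.1),
$$\sum_{\substack{\chi \pmod q \\ \chi \text{ primitive}}} \Big| \sum_{n=M+1}^{M+N} a_n \chi(n) \Big|^2 = \frac1q \sum_{\substack{\chi \pmod q \\ \chi \text{ primitive}}} \Big| \sum_{b \pmod q} \overline{\chi}(b) \sum_{n=M+1}^{M+N} a_n e\left( \frac{bn}{q} \right) \Big|^2 \le \frac{\phi(q)}{q} \sum_{\substack{b \pmod q \\ (b,q)=1}} \Big| \sum_{n=M+1}^{M+N} a_n e\left( \frac{bn}{q} \right) \Big|^2.$$
We deduce that the left side of (16.4) is
$$\le \sum_{q\le Q} \sum_{\substack{b \pmod q \\ (b,q)=1}} \Big| \sum_{n=M+1}^{M+N} a_n e\left( \frac{bn}{q} \right) \Big|^2.$$
We now apply Proposition 16.2 with $\{x_r\} = \{b/q : (b, q) = 1, q \le Q\}$, so that
$$\delta \ge \min_{\substack{q, q' \le Q \\ b, b' \\ b/q \ne b'/q'}} \left| \frac{b}{q} - \frac{b'}{q'} \right| \ge \min_{q \ne q' \le Q} \frac{1}{qq'} \ge \frac{1}{Q(Q-1)},$$
and (16.4) follows.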
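The last step can be illustrated numerically. The sketch below (our own illustration, with hypothetical function names and an arbitrary test sequence $a_n$) builds the points $b/q$ with $q \le Q$, computes their minimal spacing exactly with rational arithmetic, and compares the sum over these points with the bounds $N + \log(e/\delta)/\delta$ of Proposition 16.2 and $N + Q^2 - 1$ of (16.4).

```python
import cmath
import math
from fractions import Fraction


def farey_points(Q):
    """The fractions b/q with 1 <= b <= q <= Q and gcd(b, q) = 1 (1/1 stands for 0 mod 1)."""
    return sorted({Fraction(b, q) for q in range(1, Q + 1)
                   for b in range(1, q + 1) if math.gcd(b, q) == 1})


def min_spacing(points):
    """delta = min over r != s of ||x_r - x_s||, with distances taken modulo 1."""
    best = Fraction(1, 2)
    for i, x in enumerate(points):
        for y in points[i + 1:]:
            d = (x - y) % 1
            best = min(best, d, 1 - d)
    return best


def large_sieve_lhs(a, M, points):
    """sum_r | sum_{n=M+1}^{M+N} a_n e(n x_r) |^2, where a[0] = a_{M+1}."""
    total = 0.0
    for x in points:
        s = sum(a_n * cmath.exp(2j * math.pi * (M + 1 + i) * float(x))
                for i, a_n in enumerate(a))
        total += abs(s) ** 2
    return total


if __name__ == "__main__":
    Q, M, N = 6, 0, 50
    a = [math.cos(n * n) for n in range(M + 1, M + N + 1)]  # arbitrary test sequence
    pts = farey_points(Q)
    delta = float(min_spacing(pts))
    energy = sum(abs(z) ** 2 for z in a)
    print(large_sieve_lhs(a, M, pts))
    print((N + math.log(math.e / delta) / delta) * energy)  # Proposition 16.2
    print((N + Q * Q - 1) * energy)                         # constant in (16.4)
```

For $Q = 6$ the minimal spacing is exactly $1/30 = 1/(Q(Q-1))$, attained by $1/5$ and $1/6$.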

16.1
Prime moduli
Primes are the only moduli for which the only imprimitive character is the principal character. Hence an immediate consequence of (16.4) is:
$$\sum_{\substack{p\le \sqrt{N} \\ p \text{ prime}}} p \sum_{(b,p)=1} \Big| \sum_{n\equiv b \pmod p} a_n - \frac{1}{p-1} \sum_{(n,p)=1} a_n \Big|^2 \ll N \sum_n |a_n|^2, \tag{16.5}$$
which can be re-written as
$$\sum_{\substack{p\le \sqrt{N} \\ p \text{ prime}}} p \sum_{(b,p)=1} \Big| \sum_{n\equiv b \pmod p} a_n - \frac1p \sum_n a_n \Big|^2 \ll N \sum_n |a_n|^2. \tag{16.6}$$
(Elliott showed how to also include the $b = 0$ congruence class in the sum.) Typically this corresponds to a massive saving. For example if $a_n = 1$ if $n$ is prime and $0$ otherwise, then this gives
$$\sum_{\substack{p\le \sqrt{x} \\ p \text{ prime}}} p \sum_{(b,p)=1} \left| \pi(x; p, b) - \frac{\pi(x)}{p-1} \right|^2 \ll x\pi(x);$$
and so
$$\sum_{\substack{Q < p \le \sqrt{x} \\ p \text{ prime}}} \sum_{(b,p)=1} \left| \pi(x; p, b) - \frac{\pi(x)}{p-1} \right|^2 \ll \frac{x^2}{Q \log x}.$$
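Schlage-Puchta [AA 2003] proved
$$\sum_{q\le Q} \sum_{\substack{\chi \pmod q \\ \chi \text{ primitive}}} \Big| \sum_{p\le N} a_p \chi(p) \Big|^2 \ll \frac{N}{\log N} \sum_{p\le N} |a_p|^2. \tag{16.7}$$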

16.2

Other things to perhaps include on the large sieve

Elliott [MR962733] proved that for $Q < x^{1/2-\epsilon}$, and $f$ multiplicative with $|f(n)| \le 1$,
$$\sideset{}{'}\sum_{p\le Q} (p-1) \max_{y\le x} \max_{(a,p)=1} \Big| \sum_{\substack{n\le y \\ n\equiv a \pmod p}} f(n) - \frac{1}{p-1} \sum_{\substack{n\le y \\ (n,p)=1}} f(n) \Big|^2 \ll \frac{x}{\log^A x},$$
where the sum is over all $p$ except one where there might be an exceptional character.
Consequences of the large sieve to be discussed: Least quadratic non-residue.

17
THE SMALL SIEVE
17.1

List of sieving results used

In this subsection we have collected together many of the simple sieve results that
we use. We will need to decide how to present this; whether to prove everything
or whether to quote, say, Opera di Cribro. This chapter probably should come
a lot earlier.
Lemma 17.1 (The Fundamental Lemma of Sieve Theory) If $(am, q) = 1$ and all of the prime factors of $m$ are $\le z$ then
$$\sum_{\substack{x < n \le x+qy \\ (n,m)=1 \\ n\equiv a \pmod q}} 1 = \left\{ 1 + O(u^{-u-2}) \right\} \frac{\phi(m)}{m}\, y + O(\sqrt{y}),$$
where $y = z^u$.
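The main term $\frac{\phi(m)}{m} y$ is easy to see in small cases: the integers $n \equiv a \pmod q$ in $(x, x+qy]$ are $y$ consecutive terms of an arithmetic progression whose common difference $q$ is coprime to $m$, so they run through the residues mod $m$ periodically. The brute-force sketch below (our own illustration, with hypothetical function names) counts the sieved set directly; when $m$ divides $y$ the count equals the main term exactly.

```python
import math


def sieved_count(x, y, q, a, m):
    """Number of n in (x, x + q*y] with n = a (mod q) and gcd(n, m) = 1."""
    return sum(1 for n in range(x + 1, x + q * y + 1)
               if n % q == a % q and math.gcd(n, m) == 1)


def euler_phi(m):
    """Euler's totient function, by direct count."""
    return sum(1 for k in range(1, m + 1) if math.gcd(k, m) == 1)


def main_term(y, m):
    """The main term (phi(m)/m) * y of the Fundamental Lemma."""
    return euler_phi(m) * y / m


if __name__ == "__main__":
    # (a*m, q) = (3*30, 7) = 1, as the lemma requires
    print(sieved_count(1000, 300, 7, 3, 30), main_term(300, 30))
    print(sieved_count(1000, 100, 7, 3, 30), main_term(100, 30))
```

The content of the lemma is of course the range in which $m$ has many prime factors up to $z$ and $y$ is far smaller than $m$, where no such exact periodicity is available.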
Corollary 17.2 If $(am, q) = 1$ and all of the prime factors of $m$ are $\le x^{1/u}$ then
$$\sum_{\substack{n\le x \\ (n,m)=1 \\ n\equiv a \pmod q}} \log n = \left\{ 1 + O(u^{-u-2}) \right\} \frac{\phi(m)}{m} \frac{x}{q} (\log x - 1) + O(\sqrt{x} \log x).$$
The proof of this and the subsequent corollaries are left as exercises. One approach here is to begin by writing $\log n = \int_1^n \frac{dt}{t}$ and then swap the order of the summation and the integral.

Corollary 17.3 If $\chi$ is a character mod $q$ and all of the prime factors of $m$ are $\le z = y^{1/u}$ and coprime with $q$, then
$$\sum_{\substack{x < n \le x+qy \\ (n,m)=1}} \chi(n) \ll \frac{1}{u^u} \frac{\phi(mq)}{mq}\, qy + q\sqrt{y}.$$

Let p(n), P (n) be the smallest and largest prime factors of n, respectively.

Corollary 17.4 If $(a, q) = 1$ and $z$ is chosen so that $q = z^{O(1)}$ and $z \le y$ then
$$\sum_{\substack{x < n \le x+qy \\ n\equiv a \pmod q \\ p(n) > z}} 1 \ll \frac{q}{\phi(q)} \frac{y}{\log z}.$$
17.2

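Shiu's Theorem

Suppose that $0 \le f(n) \le 1$. Corollary 2.11 states that the mean value of $f$ up to $x$ is $\ll \mathcal{P}(f; x)$. Shiu's Theorem states that an analogous result is true for the mean value of $f$ in short intervals, in arithmetic progressions, and even in both:

Theorem 17.5 If $(a, q) = 1$ then
$$\frac1y \sum_{\substack{x < n \le x+qy \\ n\equiv a \pmod q}} f(n) \ll \prod_{\substack{p\le y \\ p \nmid q}} \left( 1 - \frac1p \right) \left( 1 + \frac{|f(p)|}{p} \right).$$
This is $\ll \mathcal{P}(|f|\chi_0; y) \ll \exp\Big( -\sum_{p\le y,\ p\nmid q} \frac{1 - f(p)}{p} \Big)$.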
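Proof Let $g(p) = |f(p)|$ where $p \le y$, and $g(p^k) = 1$ otherwise. Then $|\sum_n f(n)| \le \sum_n |f(n)| \le \sum_n g(n)$, and proving the result for $g$ implies it for $f$.
Write $n = p_1^{k_1} p_2^{k_2} \ldots$ with $p_1 < p_2 < \ldots$, and let $d = p_1^{k_1} p_2^{k_2} \ldots p_r^{k_r}$ where $d \le y^{1/2} < d p_{r+1}^{k_{r+1}}$. Therefore $n = dm$ with $p(m) > z_d := \max\{P(d), y^{1/2}/d\}$, $(d, q) = 1$ and $g(n) \le g(d)$. Now, if we fix $d$ then $m$ is in an interval $(x/d, x/d + qy/d]$ of an arithmetic progression $a/d \pmod q$ containing $y/d + O(1)$ integers. Note that $z_d \le \max\{d, y^{1/2}/d\} \le y^{1/2} \le y/d$, and so we may apply Corollary 17.4 to show that there are $\ll qy/\big(d\phi(q) \log(P(d) + y^{1/2}/d)\big)$ such $m$. This implies that
$$\sum_{\substack{x < n \le x+qy \\ n\equiv a \pmod q}} g(n) \ll \frac{qy}{\phi(q)} \sum_{\substack{d\le y^{1/2} \\ (d,q)=1}} \frac{g(d)}{d \log(P(d) + y^{1/2}/d)}.$$
For those terms with $d \le y^{1/2-\epsilon}$ or $P(d) > y^{\epsilon}$, we have $\log(P(d) + y^{1/2}/d) \gg \log y$, and so they contribute
$$\ll \frac{qy}{\phi(q)} \prod_{p\le y} \left( 1 - \frac1p \right) \sum_{\substack{d\le y^{1/2} \\ (d,q)=1}} \frac{g(d)}{d} \ll y \prod_{\substack{p\le y \\ p\nmid q}} \left( 1 - \frac1p \right) \left( 1 + \frac{g(p)}{p} \right),$$
the upper bound claimed above. We are left with the $d > y^{1/2-\epsilon}$ for which $P(d) \asymp 2^r$ for some $r$, $1 \le r \le k = [\epsilon \log y]$. Hence we obtain an upper bound:
$$\frac{qy}{\phi(q)} \sum_{r=1}^{k} \frac1r \sum_{\substack{d > y^{1/2-\epsilon} \\ (d,q)=1 \\ P(d) \asymp 2^r}} \frac{g(d)}{d} \ll \frac{qy}{\phi(q)} \left( \frac1k \sum_{\substack{d > y^{1/2-\epsilon} \\ (d,q)=1 \\ P(d) \le 2^k}} \frac{g(d)}{d} + \sum_{r=1}^{k} \frac{1}{r^2} \sum_{\substack{d > y^{1/2-\epsilon} \\ (d,q)=1 \\ P(d) \le 2^r}} \frac{g(d)}{d} \right).$$
For the first term we proceed as above. For the remaining terms we use Corollary 3.4.2, with $u_r := (1/2 - \epsilon) \log y/(r \log 2)$, to obtain
$$\ll \frac{qy}{\phi(q)} \sum_{r=1}^{k} \frac{1}{r^2} \prod_{\substack{p\le 2^r \\ p\nmid q}} \left( 1 + \frac{g(p)}{p} \right) \frac{1}{u_r^{u_r+1}} \ll y \prod_{\substack{p\le y \\ p\nmid q}} \left( 1 - \frac1p \right) \left( 1 + \frac{g(p)}{p} \right) \sum_{r=1}^{k} \frac{1}{r u_r^{u_r}}.$$
Finally note that $u_r$ is decreasing, so that $\sum_{R/2 < r \le R} 1/(r u_r^{u_r}) \ll 1/u_R^{u_R}$; moreover $u_{2R} = u_R/2$ and so $\sum_{1\le r\le k} 1/(r u_r^{u_r}) \ll 1/u_k^{u_k} \ll 1$, and the result follows. □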
17.3

Consequences

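Define
$$\rho_q(f) := \prod_{\substack{p\le q \\ p\nmid q}} \left( 1 - \frac1p \right) \left( 1 + \frac{|f(p)|}{p} \right) \quad \text{and} \quad \rho_q'(f) = \frac{\phi(q)}{q} \rho_q(f).$$
(Note that $\rho_q(f)$ is an upper bound in Theorem 17.5 provided $y \ge q$.) We also define
$$\log_S(n) := \sum_{\substack{d \in S \\ d|n}} \Lambda(d),$$
where $S$ might be an interval $[a, b]$, and we might write "$\le Q$" in place of $[2, Q]$, or "$\ge R$" in place of $[R, \infty)$. Note that $\log n = \log_{[2,n]} n$.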
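The restricted logarithm is simple to compute for small $n$, which makes the identity $\log n = \log_{[2,n]} n$ (that is, $\log n = \sum_{d|n} \Lambda(d)$) and the splitting of $\log n$ according to the size of $d$ easy to test. The following is a minimal sketch; the function names are ours, and the endpoints of $S$ are assumed to be integers.

```python
import math


def von_mangoldt(d):
    """Lambda(d) = log p if d > 1 is a power of the prime p, and 0 otherwise."""
    if d < 2:
        return 0.0
    p = 2
    while p * p <= d:
        if d % p == 0:
            while d % p == 0:
                d //= p
            return math.log(p) if d == 1 else 0.0
        p += 1
    return math.log(d)  # d itself is prime


def log_S(n, lo, hi):
    """log_S(n) for S = [lo, hi] (integer endpoints): sum of Lambda(d) over d | n, d in S."""
    return sum(von_mangoldt(d)
               for d in range(max(lo, 1), min(hi, n) + 1) if n % d == 0)


if __name__ == "__main__":
    n, Q = 360, 6
    print(log_S(n, 2, n), math.log(n))            # log n = log_{[2,n]} n
    print(log_S(n, 2, Q) + log_S(n, Q + 1, n))    # split at Q
```

Splitting the range of $d$ into disjoint intervals splits $\log n$ into the corresponding pieces, which is how identities such as $\log x = \log(x/n) + \log_{\le Q} n + \log_{>x/Q} n + \log_{(Q,x/Q)} n$ arise.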
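Lemma 17.6 Suppose that $x \ge Q^{2+\epsilon}$ and $Q \ge q$. Then, for any character $\chi \pmod q$ and any $\alpha \in \mathbb{R}$,
$$\Big| \sum_{\substack{n \in \mathcal{N} \\ n\equiv a \pmod q}} f(n)\chi(n) e(n\alpha) L(n) \Big| \ll \rho_q(f) \frac{x}{q} = \rho_q'(f) \frac{x}{\phi(q)},$$
where $L(n) = 1$, $\log(x/n)$, $\frac{\log_{\le Q} n}{\log Q}$ or $\frac{\log_{\ge x/Q} n}{\log Q}$, and $\mathcal{N} = \{n : Y < n \le Y + x\}$ for $Y = 0$ in the second and fourth cases, and for any $Y$ in the other two cases.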

Proof The first estimate follows from Shiu's Theorem for $x \ge q^{1+\epsilon}$. One can deduce the second since $\sum_{n\le x} a_n \log(x/n) = \int_1^x \frac1T \sum_{n\le T} a_n\, dT$ for any $a_n$.
If $d$ is a power of the prime $p$ then let $f_d(n)$ denote $f(n/p^a)$ where $p^a \| n$, so that if $n = dm$ then $|f(n)| \le |f_d(m)|$. Therefore if $x > Qq^{1+\epsilon}$ then, for the third estimate, times $\log Q$, we have, again using Shiu's Theorem,
$$\sum_{\substack{Y < md \le Y+x \\ md\equiv a \pmod q \\ d\le Q}} |f(md)| \Lambda(d) \le \sum_{\substack{d\le Q \\ (d,q)=1}} \Lambda(d) \sum_{\substack{Y/d < m \le (Y+x)/d \\ m\equiv a/d \pmod q}} |f_d(m)| \ll \sum_{\substack{d\le Q \\ (d,q)=1}} \frac{\Lambda(d)}{d} \rho_q(f_d) \frac{x}{q} \ll \rho_q(f) \frac{x}{q} \log Q.$$
In the final case, writing $n = mp$ where $p$ is a prime $> x/Q$ (and note that $p \nmid m$ as $p > x/Q > \sqrt{x}$), we have
$$\sum_{\substack{m\le Q \\ (m,q)=1}} |f(m)| \sum_{\substack{x/Q < p \le x/m \\ p\equiv a/m \pmod q}} \log p \ll \sum_{\substack{m\le Q \\ (m,q)=1}} |f(m)| \frac{x/m}{\phi(q)} \ll \rho_q(f) \frac{x}{q} \log Q,$$
by the Brun-Titchmarsh theorem, and then applying partial summation to Shiu's Theorem. □
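By (16.1) we immediately deduce

Corollary 17.7 With the hypotheses of Lemma 17.6 we have
$$\sum_{\chi \pmod q} \Big| \sum_{n\le x} f(n)\chi(n) L(n) \Big|^2 \ll (\rho_q'(f) x)^2.$$

Lemma 17.8 If $\Delta > q^{1+\epsilon}$ then for any $D \ge 0$ we have
$$\sum_{\chi \pmod q} \Big| \sum_{D\le d\le D+\Delta} f(d)\chi(d)\Lambda(d) \Big|^2 \ll \Delta^2.$$

Proof We expand the left side using (16.1) to obtain
$$\phi(q) \sum_{(b,q)=1} \Big| \sum_{\substack{d\equiv b \pmod q \\ D\le d\le D+\Delta}} f(d)\Lambda(d) \Big|^2 \le \phi(q) \sum_{(b,q)=1} \Big| \sum_{\substack{d\equiv b \pmod q \\ D\le d\le D+\Delta}} \Lambda(d) \Big|^2 \ll \Delta^2,$$
by the Brun-Titchmarsh theorem.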

18
THE PRETENTIOUS LARGE SIEVE
18.1

Mean values of multiplicative functions, on average

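Define
$$S_{\chi}(x) := \sum_{n\le x} f(n)\chi(n),$$
and order the characters $\chi_1, \chi_2, \ldots \pmod q$ so that the $|S_{\chi_j}(x)|$ are in descending order. Our main result is an averaged version of (??) for $f$ twisted by all the characters $\chi \pmod q$, but with a better error term:

Corollary 18.1 Suppose that $x \ge Q^{2+\epsilon}$ and $Q \ge q^{2+\epsilon} \log x$. Then
$$\sum_{\substack{\chi \pmod q \\ \chi \ne \chi_1, \chi_2, \ldots, \chi_{k-1}}} \left| \frac1x S_{\chi}(x) \right|^2 \ll \left( e^{O(\sqrt{k})} \rho_q'(f) \left( \frac{\log Q}{\log x} \right)^{1 - \frac{1}{\sqrt{k}}} \log\left( \frac{\log x}{\log Q} \right) \right)^2,$$
where the implicit constants are independent of $f$. If $k = 1$, $f$ is real and $\chi_1$ is not, then we can replace the exponent $0$ with $1 - \frac{1}{\sqrt2}$.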
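Let $C_q$ be any subset of the set of characters $\pmod q$, and define
$$L = L(C_q) := \frac{1}{\log x} \max_{\chi \in C_q} \max_{|t| \le \log^2 x} |F_{\chi}(1 + it)|,$$
where
$$F_{\chi}(s) := \prod_{p\le x} \left( 1 + \frac{f(p)\chi(p)}{p^s} + \frac{f(p^2)\chi(p^2)}{p^{2s}} + \ldots \right).$$
Our main result is the following:

Theorem 18.2 Suppose that $x \ge Q^{2+\epsilon}$ and $Q \ge q^{2+\epsilon} \log x$. Then
$$\sum_{\chi \in C_q} \left| \frac1x S_{\chi}(x) \right|^2 \ll \left( \left( L(C_q) + \rho_q'(f) \frac{\log Q}{\log x} \right) \log\left( \frac{\log x}{\log Q} \right) \right)^2.$$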

Corollary 18.1 follows immediately from Theorem 18.2 and Proposition ??.
To prove Theorem 18.2 we begin with an averaged version of (??), which was used in the proof of Halász's Theorem. Notice that if we simply sum up the square of (??) for $S = S_{\chi}$, for each $\chi \pmod q$, then we would get the next lemma but with the much weaker error term $\phi(q)$.

Lemma 18.3 Suppose that $x \ge Q^{2+\epsilon}$ and $Q \ge q$. Then
$$\log^2 x \sum_{\chi \in C_q} \left| \frac1x S_{\chi}(x) \right|^2 \ll \sum_{\chi \in C_q} \left( \int_Q^{x/Q} \left| \frac1t S_{\chi}(t) \right| \frac{dt}{t} \right)^2 + \left( \rho_q'(f) \log Q \right)^2.$$

Proof Let $z = x/Q$. We follow the proof in section 8.1 for the main terms, but deal with the error terms differently. By Corollary 17.7 we have
$$\sum_{\chi \pmod q} \Big| \sum_{n\le x} f(n)\chi(n) \log(x/n) \Big|^2 \ll (\rho_q'(f) x)^2,$$
and
$$\sum_{\chi \pmod q} \Big| \sum_{n\le x} f(n)\chi(n) (\log_{\le Q} n + \log_{>x/Q} n) \Big|^2 \ll (\rho_q'(f) x \log Q)^2,$$
so that, using the identity $\log x = \log(x/n) + \log_{\le Q} n + \log_{>x/Q} n + \log_{(Q,x/Q)} n$,
$$\sum_{\chi \in C_q} |S_{\chi}(x) \log x|^2 \ll \sum_{\chi \in C_q} \Big| \sum_{n\le x} f(n)\chi(n) \log_{(Q,x/Q)} n \Big|^2 + (\rho_q'(f) x \log Q)^2.$$
Now for $g = f\chi$ we have
$$\sum_{n\le x} g(n) \log_{(Q,x/Q)} n - \sum_{Q<p<x/Q} g(p) \log p \sum_{m\le x/p} g(m) = \sum_{\substack{Q<p^k<x/Q \\ k\ge 2}} \log p \sum_{m\le x/p^k} g(mp^k) + \sum_{Q<p<x/Q} \log p \sum_{m\le x/p} (g(mp) - g(p)g(m)).$$
The last term is 0 unless $p | m$, so this last bound is, in absolute value,
$$\le x \sum_{\substack{Q<p^k<x/Q \\ k\ge 2}} \frac{\log p}{p^k} + 2x \sum_{Q<p<x/Q} \frac{\log p}{p^2} \ll \frac{x}{Q^{1/2}}.$$
We now bound our main term as in section 8.1; though now we let $z = y + \sqrt{y}$ so we obtain the error term $x/\sqrt{y}$ in the equation before (??). Summing over such dyadic intervals this yields
$$\Big| \sum_{Q<p<x/Q} g(p) \log p \sum_{m\le x/p} g(m) \Big| \ll \int_Q^{x/Q} |S_{\chi}(x/t)|\, dt + \frac{x}{Q^{1/2}}.$$
The result follows from the change of variable $t \to x/t$ since $Q \ge q$ and $\rho_q'(f) \log Q \gg 1$. □

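In the next Lemma we create a convolution to work with, as well as removing the small primes.

Lemma 18.4 Suppose that $x \ge Q^{2+\epsilon}$ and $Q \ge q^{2+\epsilon} \log x$. Then
$$\sum_{\chi \in C_q} \left( \int_Q^x \left| \frac1t S_{\chi}(t) \right| \frac{dt}{t} \right)^2 \ll \sum_{\chi \in C_q} \left( \int_Q^x \Big| \sum_{n\le t} f(n)\chi(n) \log_{>Q} n \Big| \frac{dt}{t^2 \log t} \right)^2 + \left( \rho_q'(f) \log Q \cdot \log\left( \frac{\log x}{\log Q} \right) \right)^2.$$

Proof We expand using the fact that $\log t = \log(t/n) + \log_{\le Q} n + \log_{>Q} n$; and the Cauchy-Schwarz inequality so that, for any function $c_{\chi}(t)$,
$$\sum_{\chi} \left( \int_Q^x |c_{\chi}(t)| \frac{dt}{t^2 \log t} \right)^2 \le \int_Q^x \frac{dt}{t \log t} \cdot \sum_{\chi} \int_Q^x |c_{\chi}(t)|^2 \frac{dt}{t^3 \log t}.$$
By Corollary 17.7 we then have
$$\sum_{\chi} \int_Q^x \Big| \sum_{m\le t} f(m)\chi(m) \log(t/m) \Big|^2 \frac{dt}{t^3 \log t} \ll \rho_q'(f)^2 \int_Q^x \frac{dt}{t \log t} \ll \rho_q'(f)^2 \log\left( \frac{\log x}{\log Q} \right)$$
and
$$\sum_{\chi} \int_Q^x \Big| \sum_{m\le t} f(m)\chi(m) \log_{\le Q} m \Big|^2 \frac{dt}{t^3 \log t} \ll \int_Q^x \left( \rho_q'(f)\, t \log Q \right)^2 \frac{dt}{t^3 \log t},$$
and the result follows. □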

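Now we prove the mean square version of Halász's Theorem, which is at the heart of the pretentious large sieve.

Proposition 18.5 If $x > Q^{1+\epsilon}$ and $Q \ge q^{1+\epsilon}$ then
$$\sum_{\chi \in C_q} \left( \int_Q^x \Big| \sum_{Q\le n\le t} f(n)\chi(n) \log_{>Q} n \Big| \frac{dt}{t^2 \log t} \right)^2 \ll \log\left( \frac{\log x}{\log Q} \right) \left( M^2 \log\left( \frac{\log x}{\log Q} \right) + \frac{\phi(q)}{T} \frac{\log Q}{Q} + \frac{\log^3 x}{T^2} \right)$$
where $M := \max_{\chi \in C_q} \max_{|u|\le 2T} |F_{\chi}(1 + iu)|$.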

Proof (Revisiting the proof of Halász's Theorem (particularly Proposition 8.5)). For a given $g = f\chi$ and $Q$ we define
$$h(n) = \sum_{\substack{md=n \\ d>Q}} g(m) g(d) \Lambda(d),$$
so that $-G(s)(G'_{>Q}(s)/G_{>Q}(s)) = \sum_{n\ge 1} h(n)/n^s$ for $\mathrm{Re}(s) > 1$. Now




X X log p
X
X
X

X
t log t
2

1 2t
log p
h(n)
g(n)
log
n


,
>Q


b+1
p
Q

nt
b
b
nt
nt
b1
p >Q
p >Q
pb+1 |n

pb+1 t

by the prime number theorem. This substitution leads to a total error, in our
estimate, of
Z x
2




t log t
dt
q
log x
1
log x
2
2
 |Cq |

log

log
,
Q t2 log t
Q2
log Q
q
log Q
Qq
which is smaller than the first term in the given upper bound, since M  1/ log q.
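Now we use the fact that
$$\int_{1/\log x}^{1/\log Q} \frac{d\alpha}{t^{2\alpha}} \asymp \frac{1}{\log t}$$
whenever $x \ge t \ge Q$, as $x > Q^{1+\epsilon}$, so that
$$\int_2^x \Big| \sum_{n\le t} h(n) \Big| \frac{dt}{t^2 \log t} \ll \int_{1/\log x}^{1/\log Q} \int_2^x \Big| \sum_{n\le t} h(n) \Big| \frac{dt}{t^{2+2\alpha}}\, d\alpha.$$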
keyProp

Now, Cauchying, but otherwise proceeding as in the proof of Proposition 8.5


(with f (n) log n there replaced by h(n) here), the square of the left side is
Z 1/ log Q
Z 1/ log Q
Z
0

d
1
G(G>Q /G>Q )(1 + + it) 2



dtd.

2
1 + + it
1/ log x
1/ log x
The integral in the region with |t| T is now
Z X
2 dt


max |G(1 + + it)|2
g(n)(n) 3+2 .

t
|t|T
1
Q<nt

If we take g = f and sum this over all characters Cq then we obtain an


error
Z
X
2 dt
X


2
max |F (1 + + it)|
f (n)(n)(n) 3+2

t
|t|T
Q

Cq

 max |F (1 + + it)|2
|t|T
Cq

Small.3

by Lemma 17.8 as t Q q 1+ .

(mod q) Q<nt

dt
1

max |F (1 + + it)|2 ,
t1+2
|t|T
Cq


For that part of the integral with $|t| > T$, summed over all twists of f by
characters $\chi \pmod q$, we now proceed as in the proof of Proposition 8.5. We
obtain $\phi(q)$ times (??), with $f(\ell) \log \ell$ replaced by $h(\ell)$ for $\ell = m$ and $n$, but now
with the sum over $m \equiv n \pmod q$ with $m, n \ge Q$. Observing that $|h(\ell)| \le \log \ell$,
we proceed analogously to obtain, in total


(q) (log Q)2


(q)
1
+
4 2.
T
Q
q
T
2

The result follows by collecting the above.


Proof of Theorem 18.2: The result follows by taking $T = \frac12 \log^2 x$ in Proposition 18.5, and then combining this with Lemmas 18.3 and 18.4, since $\rho'_q(f) \log q \gg 1$. □

Corollary 18.6 Fix  > 0. There exists an integer k  1/2 such that if x
q 4+5 then
X
(mod q)
6=1 ,2 ,...,k

2


1
S (y)  eO(1/)
y j

0q (f )

log Q
log y

1 !2
,

where Q = (q log x)2 , for any y in the range


log x log y log x

 
/2
log x
2
,
log Q

where the implicit constants are independent of f .

Proof Select k to be the smallest integer for which 1/ k < 3. Let Cq be the
C
set of all characters mod q except 1 , 2 , . . . , k . PLSG
Write x = QB , so that y = QM-Bds1
,
1 1/2
where B C 2kRepulsion
B
, and apply Theorem 18.2 with x = y. Then, by (??)
and Proposition ?? we have

Ly  Lx

log x
log y

2

 eO(1/) 0q (f )

1
B 13

B   eO(1/) 0q (f )

1
C 14

and the result follows. Note that by bounding Ly in terms of Lx , we can have
the same exceptional characters 1 , 2 , . . . , k for each y in our range.
2

19
MULTIPLICATIVE FUNCTIONS IN ARITHMETIC
PROGRESSIONS
It is usual to estimate the mean value of a multiplicative function in an arithmetic
progression in terms of the mean value of the multiplicative function on all the
integers. This approximation is the summand corresponding to the principal
character when we decompose our sum in terms of the Dirichlet characters mod
q. In what follows we will instead compare our mean value with the summands
for the k characters which best correlate with f . So define
$$E_f^{(k)}(x; q, a) := \sum_{\substack{n \le x \\ n \equiv a \pmod q}} f(n) - \frac{1}{\phi(q)} \sum_{j=1}^{k} \chi_j(a) \sum_{n \le x} f(n) \overline{\chi_j}(n).$$
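The decomposition behind this definition can be checked numerically. The following is a minimal sketch, not the authors' construction: it assumes q is an odd prime (so that characters can be built from a primitive root), takes f to be the Liouville function purely as an illustration, and interprets "the k characters which best correlate with f" as the k characters with the largest twisted sums in absolute value. The names `liouville`, `characters_mod_prime` and `error_term` are hypothetical helpers.

```python
import cmath

def liouville(n):
    """Liouville function: (-1)^(number of prime factors, with multiplicity)."""
    sign, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            sign = -sign
        p += 1
    return -sign if n > 1 else sign

def characters_mod_prime(q):
    """All q-1 Dirichlet characters modulo an odd prime q, as tables chi[n % q]."""
    # Brute-force search for a primitive root g, then chi_j(g^k) = e(jk/(q-1)).
    g = next(g for g in range(2, q)
             if len({pow(g, k, q) for k in range(q - 1)}) == q - 1)
    chars = []
    for j in range(q - 1):
        chi = [0j] * q
        for k in range(q - 1):
            chi[pow(g, k, q)] = cmath.exp(2j * cmath.pi * j * k / (q - 1))
        chars.append(chi)
    return chars

def error_term(f, x, q, a, k):
    """E_f^{(k)}(x; q, a), with chi_1..chi_k chosen by largest |sum f(n) conj(chi(n))|."""
    direct = sum(f(n) for n in range(1, x + 1) if n % q == a % q)
    twisted = []
    for chi in characters_mod_prime(q):
        s = sum(f(n) * chi[n % q].conjugate() for n in range(1, x + 1))
        twisted.append((abs(s), chi[a % q] * s))
    twisted.sort(key=lambda pair: pair[0], reverse=True)
    return direct - sum(term for _, term in twisted[:k]) / (q - 1)
```

With k equal to the number of characters, the error term vanishes by orthogonality (for a coprime to q); with k = 0 it is just the sum over the progression.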

The trivial upper bound $|E_f^{(k-1)}(x; q, a)| \ll k\rho'_q(f)x/\phi(q)$ can be obtained by
bounding each sum in the definition using the small sieve. We now improve this:

Theorem 19.1 For any given k 2 and sufficiently large x, if x X


max{x1/2 , q 6+7 } then
(k1)
|Ef
(X; q, a)|

e

0q (f )X
(q)

log Q
log x

1 1

log

log x
log Q


,

where Q = (q log x)5 and the implicit constants are independent of f and k. If f
is real and 1 is not then we can extend this to k = 1 with exponent 1 12 .
To prove this we need the following technical tool, deduced from Corollary 18.6.

Proposition 19.2 Fix > 0. For given x = q A there exists K  3 log log A
such that if x X x1/2 and Q = (q log x)5 then





1
X
X


1
log Q
O(1/) 0
1

f
(n)(n)
log
n

e

(f
)
.
[Q,x/Q]
q
X
log x
log x


(mod q)
6=j , j=1,...,K

nX

Proof Let log xi = 2(1+/3) +1 log q for 0 i IA, with I chosen to be the
smallest integer
for which xI > x/Q, so that I  (1/) log log A. In order to apply
PLSRange
Corollary 18.6 with x = xi we must exclude the characters j,i , 1 j k, for


1 i I. Let 1 , 2 , . . . , K be the union of these sets of characters, so that


K k(I + 1)  3 log log A. Therefore, for all y [Q, x/Q], we have

2

1 !2
X
1

log
Q
Sj (y)  eO(1/) 0q (f )
.
(19.1)
y

log y

PLSuniform

(mod q)
6=1 ,2 ,...,K

We rewrite the sum in the Proposition as








X
X


f (m)(m)f (d)(d)(d) ,


(mod q) dmX

6= , j=1,...,K Qdx/Q
j

and split this into subsums, depending on the size of d. This is bounded by a
sum of sums of the form




X
X
X


.
f
(d)(d)(d)
f
(m)(m)




DdD+
(mod q)
mX/d
6=j , j=1,...,K

log(X/D))
where Q D x/Q with D log(q
. If we approximate the last sum
q log(X/D)
here with the range m X/D, then we can Cauchy to obtain


2


X
X
X

f
(d)(d)(d)
f (m)(m)
(19.2)


(mod q) DdD+
mX/D
6=j , j=1,...,K

2


X



f (d)(d)(d)


(mod q) DdD+

(mod q)
6=j , j=1,...,k1


2
X




f (m)(m)

mX/D

(19.3)

 eO(1/)

0q (f )

Small.3

X
D

log Q
log(X/D)

1 !2
,

(19.4)

PLSuniform

by Lemma 17.8 and (19.1). Summing the square root of this over the D/ such
intervals for d in [D, 2D) yields an upper bound
1

log Q
O(1/) 0
;
e
q (f )X
log(X/D)
and then summing this over D = X/Q2j for 0 j J  log X we obtain the
claimed upper bound.


Finally the error in replacing the range m X/d by m X/D is


X
X
X

|f (m)|0 (m)
|f (m)|0 (m)  0q (f ) 2 ,
D
X/d<mx/D
(m,q)=1

X/(D+)<mx/D
(m,q)=1

so an upper bound for the contribution in [D, 2D) is


X(q) X (d)
log Q
 0q (f )
 0q (f )X
,
D
d
log(X/D)
Dd<2D

Proof of Theorem
19.1 : Fix  > 0 sufficiently small with 1/ k > . By
Small.1
applying Lemma 17.6 , with = 0 we have


X
X
x
log x
f (n) =
f (n) log[Q,x/Q] n + O 0q (f )
log Q .
(q)
which is smaller than the other error term.
FnsInAPs

na

nx
(mod q)

na

nx
(mod q)

Multiplying this by (a), and summing over a we obtain


X
X

log x
f (n)(n) =
f (n)(n) log[Q,x/Q] n + O 0q (f )x log Q ;
nx

nx

so that
(K)

Ef



(q)
X
X
log[Q,x/Q] n
x log Q
1
j (a)
f (n)j (n)
+ O K0q (f )
(q)
log x
(q) log x
j=K+1
nx


0q (f )x log Q 1
 eO(1/)
,
(q)
log x

(x; q, a) =

LinearPLS

by
Proposition 19.2 , where K  3 log log A. By Cauchying and then Corollary
PLSk
18.1, we obtain
(k)

(K)

|Ef (x; q, a) Ef

(x; q, a)|

K

1 X
Sj (x)
(q)
j=k+1
1/2

K
X


1
S (x) 2
K
j
(q)
j=k+1

 eO(

k) 0
q (f )

x
(q)

log Q
log x

1 1

1
since K  log log A, and 1 k+1
> 1 1k . Applying the same argument again,
we also obtain

1 1


0 (f )x
k
log Q
log x
q
(k1)
(k)
log
.
|Ef
(x; q, a) Ef (x; q, a)|  eC k
(q)
log x
log Q

The result follows from using the triangle inequality and adding the last three
inequalities.
2

20
PRIMES IN ARITHMETIC PROGRESSION
PNTapsk

Theorem 20.1 For any k 2 and x q 2 there exists an ordering 1 , . . . of the


non-principal characters (mod q) such that, for Q = (q log x)2 ,
X
na

(n)

ny

ny
(mod q)

e
PNTaps1

k1
X
1 X
1 X
(n)
j (a)
(n)j (n)
(q)
(q) j=1

x
(q)

ny

log Q
log x

1 1

log3

log x
log Q


.

Corollary 20.2 There exists a character (mod q) such that if x q 2 then


X
na

nx
(mod q)


1 1
2
x
1 X
(a) X
log Q
.
(n)
(n)
(n)(n) 
(q)
(q)
(q)
log x
nx

nx

where Q = (q log x)2 . We may remove the term unless is a real-valued


character.
Remark 20.3 Can we obtain the error in terms of $1/|L(1 + it, \chi)|/\log x$? And
when $\chi$ is real, probably $t = 0$.

Proof of Theorem 20.1 We may assume that $x \ge q^B$ for B sufficiently large,
else the result follows from the Brun-Titchmarsh Theorem.
Let g(.) be the totally multiplicative function for which $g(p) = 0$ for $p \le Q$
and $g(p) = 1$ for $p > Q$, and then $f = \mu g$, so that we have the following variant
of von Mangoldt's formula (1.11),
$$\Lambda_Q(n) := \sum_{dm=n} f(d)g(m) \log m = \begin{cases} \Lambda(n) & \text{if } p \mid n \implies p > Q, \\ 0 & \text{otherwise.} \end{cases}$$
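The identity for $\Lambda_Q$ is easy to test by brute force. The sketch below assumes the reading $f = \mu g$ with g the indicator of integers free of prime factors $\le Q$; the helper names are hypothetical and the trial-division routines are only meant for small n.

```python
from math import log

def prime_factors(n):
    """Prime factors of n with multiplicity, by trial division."""
    out, p = [], 2
    while p * p <= n:
        while n % p == 0:
            out.append(p)
            n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out

def mobius(n):
    ps = prime_factors(n)
    return 0 if len(ps) != len(set(ps)) else (-1) ** len(ps)

def g(n, Q):
    """Totally multiplicative: 1 if every prime factor of n exceeds Q, else 0."""
    return 1 if all(p > Q for p in prime_factors(n)) else 0

def von_mangoldt(n):
    ps = prime_factors(n)
    return log(ps[0]) if ps and len(set(ps)) == 1 else 0.0

def lambda_Q(n, Q):
    """Sum over dm = n of mu(d) g(d) g(m) log m."""
    return sum(mobius(d) * g(d, Q) * g(n // d, Q) * log(n // d)
               for d in range(1, n + 1) if n % d == 0)
```

For every n the value `lambda_Q(n, Q)` should agree with `von_mangoldt(n)` when all prime factors of n exceed Q, and with 0 otherwise.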
dm=n
Now
X
nb

((n) Q (n))

nx
(mod q)

X
nx
p|n = pQ

(d) 

X
pQ

log x  Q

log x
.
log Q

by the Brun-Titchmarsh theorem. Denote the left side of the equation in the
(k1)
Theorem as E,+
P (x; q, a), and note that all of these sums can be expressed as
mean-values of nx, nb (mod q) (n), as b varies. Hence



(k1)

(k1)

E,+ (x; q, a) EQ ,+ (x; q, a)  Q


log x
.
log Q

Now
X

Q (n) =

nx
na (mod q)

f (d)

dx
(d,q)=1

g(m) log m.

(20.1)

mx/d
ma/d (mod q)

P
(k1)
Similar decompositions for the n Q (n)j (n) imply that EQ ,+ (x; q, a) equals
the sum of f (d) over d x with (d, q) = 1, times
X

g(m) log m

mx/d
ma/d (mod q)

k1
X
1 X
j (a/d)
j (b)
(q) j=0
(b,q)=1

g(m) log m.

mx/d
mb (mod q)

By Corollary 17.2 (with m the product of the primes $\le Q$ that do not divide q)
this last quantity is
r 

k
x
x

+k
log x/d
u+2
u
d(q) log Q
d
where x/d = Qu . Let R be the product of the primes Q. We deduce that the
sum over d in a range x/Q2u < d x/Qu with f (d) 6= 0, is
r 

X
x
1
x
k x
kux
k
+
log x/d  u
+ u/2
u+2 d(q) log Q
u
d
u
(q)
Q
2u
u
x/Q

<dx/Q
(d,R)=1

FLS3

by Corollary 17.4 (for the sum over d), provided u := log

log x
log Q

. Summing

this up over u = 2, 4, 8, . . . , , the sum over d in the range Q < d x/Q2 is




x
(q)

log Q
log x

2
.

The same argument works to give a much better upper bound for the terms with
d Q2 , though removing the condition (d, R) = 1 in the sum above. Hence we
are left to deal with those d LQexpand
> x/Q , which implies that m x/d < Q .
The remaining sum in (20.1) is
X
X
g(m) log m
f (d).
m<Q
(m,q)=1

x/Q <dx/m
da/m (mod q)
(k1)

There are analogous sums for the remaining terms in EQ ,+ (x; q, a) and so we
need to bound


(k1)

(k1)

g(m) log m (Ef,+ (x/m; q, a/m) Ef,+ (x/Q ; q, a/m)).

m<Q
(m,q)=1

To do so we need to apply Theorem 18.2 with $C_q$ to be the set of all characters
mod q, less $\chi_0, \chi_1, \ldots, \chi_{k-1}$. Then we can deduce Corollary 18.1 though now
with $\chi \ne \chi_0, \ldots, \chi_{k-1}$ as the condition on the sum (but otherwise the same).
We can then similarly modify Corollary 18.6 and finally obtain Theorem 19.1
with $E_f^{(k-1)}$ replaced by $E_{f,+}^{(k-1)}$. Therefore we obtain the bound
X

(k1)

(k1)

g(m) log m |Ef,+ (x/m; q, a/m) Ef,+ (x/Q ; q, a/m)|

m<Q
(m,q)=1

e

e

0q (f )x
(q)

0q (f )x
(q)

log Q
log x

1 1

log Q
log x

1 1

g(m)

m<Q
(m,q)=1
k

log m
m

( log Q)2
.
log Q

FLS3

by Corollary 17.4, and the result follows since 0q (f )  1/ log Q. (This means we
need to change the sieving to go up to Q throughout rather than q.)
2
Proof of Corollary 20.2 We let k = 2 in Theorem 20.1 to deduce the first
part. If $\chi$ is not real valued, then we know that






X
X


=
|E (3) (x; q, a) E (2) (x; q, a)|
(n)(n)
(n)(n)




nx
nx

and the result follows from Theorem 20.1.

21
LINNIK'S THEOREM
In this section we complete the proof of Linnik's famous theorem:

Theorem 21.1 There exist constants c, L > 0 such that for any coprime integers a and q there is a prime $\equiv a \pmod q$ that is $< cq^L$.
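For intuition only, the quantity bounded in the theorem can be computed directly for small moduli. This is a brute-force sketch with hypothetical helper names; it says nothing about the constants c and L.

```python
from math import gcd

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    p = 2
    while p * p <= n:
        if n % p == 0:
            return False
        p += 1
    return True

def least_prime_in_ap(a, q):
    """Smallest prime congruent to a modulo q, for coprime a and q."""
    if gcd(a, q) != 1:
        raise ValueError("a and q must be coprime")
    n = a % q
    while not is_prime(n):
        n += q
    return n

def worst_case(q):
    """Largest least-prime over all reduced residue classes modulo q."""
    return max(least_prime_in_ap(a, q) for a in range(1, q + 1) if gcd(a, q) == 1)
```

For example, modulo 7 the slowest class is $a = 1$, whose least prime is 29.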
There are several proofs of this in the literature, none easy. Here we present
a new proof as a consequence of the Pretentious Large Sieve, as developed in
the previous few sections. Corollary 20.2 implies that if there are no primes $\equiv a \pmod q$
up to x, a large power of Q, then the vast majority of primes satisfy
$\chi(p) = -\chi(a)$. The difficult part of our current proof is to now show that $\chi(a) = 1$
(which surely should not be difficult!):

Proposition 21.2 Suppose that x q A where A is chosen sufficiently large. If








X
X


1
x

(n)
(n) 

(q)
(q)


nx
na nx

(mod q)
then there exists a real character (mod q) such that (a) = 1, and


X 1
log x
 log log
.
p
log Q
Q<px
(p)=1

Corollary 21.3 If there are no primes $p \equiv a \pmod q$ with $Q < p \le x$ then
there exists a real character $\chi \pmod q$ such that $\chi(a) = 1$, and
$$\sum_{\substack{Q < p \le x \\ \chi(p) = 1}} \frac{1}{p} \ll 1.$$

Lemma 21.4 (Halász's Theorem for sieved functions) Let f be a multiplicative
function with the property that $f(p^k) = 0$ whenever $p \le Q$. If $x \ge Q$ then



1 X

1
1
1
1
log x


f (n) 
(1 + M )eM + +
1+
log
.

x
log Q
T
log x
log Q
log Q
nx

where M := min|t|T

Q<px

1Re(f (p)pit )
.
p


Proof (sketch) We suitably modify the proof


of Halaszs Theorem (??). We
keyProp
begin by following the proof of Proposition 8.5. First note that S(N ) = 1 for all
N Q, so wekeyProp
can reduce the range in the integral for , throughout the proof of
Proposition 8.5, to log1 x log1 Q . Moreover in the first displayed equation we
can change the error term from  logNN to  log1 Q logNN for N Q by sieving.
This allows us to replace theerror term in the second displayed equation from
keyProp
log x
 log log x to  1 + log1 Q log log
Q . Hence we can restate Proposition 8.5 with
the range for , and the log log x in the error term, changed
 in this way.
log x
Now we use the bound |F (1 + + it)| |F (1 + iu)| + O T log
Q throughout
OffLineOn

this range, as in Lemma ??; and we also note that, in our range
for , |F (1 + +
HalExplic2
it)|  1/( log Q). We then proceed as in the proof of (??), but now splitting
the integral at 1/L log Q log x to obtain the result, since L log Q  eM .
2


LinkNoSieg
log x
Proof of Proposition 21.2 Write := log log
Q . We return to the proof
PNTapsk
of Theorem 20.1, and show, under our hypothesis here, that there exists y in the
range x1/2 < y x for which




X

y

.
f (n)(n)  2

log
Q


ny

For, if not, the proof there implies that




X





x


(n)(n) = o
,

(q)
nx

PNTaps1

which, by Corollary 20.2, contradicts


our hypothesis.
HalRevisited
Taking f = f in Lemma 21.4, and comparing our upper and lower bounds
for S (y) we deduce that
X 1 + Re ((p)pit )
 log .
p

Q<px

4
5
Let T := {z : |z| = 1, and 3 < arg(z) < 2
3 or 3 < arg(z) < 3 }. We must
it
it
have |t|  / log x else p T (and hence (p)p T ) for enough of the primes
in (xc/ , x] that the previous estimate cannot hold. Therefore

X 1
X 1 + Re((p)pit ) + |pit 1|
1 X 1 + Re((p))
=

 log .
p
2
p
p

Q<px
(p)=1

Q<px

Q<px


Proof of Corollary 21.3 By Corollary 20.2 we know that for all y in the
range Q y x we have
X


(n)((p) + (a))  y

py

log Q
log y

1/5
.

By partial summation, we deduce that


X (a) + (p)
 1.
p

Q<px

LinkNoSieg

Comparing this to the conclusion of Proposition 21.2, we deduce that (a) = 1


and we obtain the result.
2
Proposition 21.5 If the hypotheses of Corollary 21.3 hold for $x = q^A$ where
A is sufficiently large, and if $\chi(a) = 1$ then there are primes $\le x$ that are $\equiv a \pmod q$.

22
BINARY QUADRATIC FORMS
22.1 The basic theory

Suppose that a, b, c are integers for which $b^2 - 4ac = d$ and define the binary
quadratic form $F(x, y) := ax^2 + bxy + cy^2$, which has discriminant d. We will
study the values $am^2 + bmn + cn^2$ when m and n are integers, and in particular
the prime values. We say that F represents the integer N if there exist integers
m, n such that $F(m, n) = N$.
Exercise 22.1 Prove that if there is an invertible linear transformation (over
Z) between two binary quadratic forms then they represent the same integers;
indeed there is a 1-1 correspondence between representations. Show also that the
two forms have the same discriminant. These results suggest that we study the
equivalence classes of binary quadratic forms of a given discriminant.
Now $d = b^2 - 4ac \equiv b^2 \equiv 0$ or $1 \pmod 4$. For such integers d there is always
at least one binary quadratic form of discriminant d:
$$x^2 - (d/4)y^2 \quad \text{when } d \equiv 0 \pmod 4,$$
$$x^2 + xy - ((d-1)/4)y^2 \quad \text{when } d \equiv 1 \pmod 4.$$

The key result is that there are only finitely many equivalence classes of binary
quadratic forms of each discriminant d, and we denote this quantity by h(d).
We now prove this when d < 0: The idea is that every binary quadratic form
of negative discriminant is equivalent to a semi-reduced form, one for which
$|b| \le a \le c$. In that case $|d| = 4ac - b^2 \ge 4a^2 - a^2 = 3a^2$ and so $a \le \sqrt{|d|/3}$,
and so for a given d there are only finitely many possibilities since $|b| \le a \le \sqrt{|d|/3}$
and once these are chosen $c = (b^2 - d)/4a$. Gauss's proof that every
form is equivalent to a semi-reduced form goes as follows: If $c < a$ then the
transformation $(x, y) \to (y, -x)$ swaps a and c; hence we may assume that $a \le c$.
If $|b| > a$ then let $B \equiv b \pmod{2a}$ with $-a < B \le a$, so that there exists an
integer k with $B = b + 2ka$. The transformation $(x, y) \to (x + ky, y)$ changes F to
$ax^2 + Bxy + Cy^2$ where $C = (B^2 - d)/4a$. Either this is semi-reduced or $C < a$
in which case we repeat the above process. If we do need to repeat, then our
new pair (C, a) is smaller than our old pair (a, c), so the algorithm must terminate
in finitely many steps.
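The reduction procedure just described translates directly into code. The sketch below assumes a positive definite form (d < 0 < a). The function `class_number` goes slightly beyond the text: it counts primitive forms under the standard stricter normalisation ($-a < b \le a \le c$, with $b \ge 0$ when $a = c$), which picks one form from each class, whereas `semi_reduced_forms` lists exactly the forms with $|b| \le a \le c$ used in the finiteness argument. All names are hypothetical.

```python
from math import gcd

def reduce_form(a, b, c):
    """Apply the swap and translation steps until |b| <= a <= c."""
    d = b * b - 4 * a * c
    if not (d < 0 < a):
        raise ValueError("need a positive definite form")
    while not (abs(b) <= a <= c):
        if c < a:
            a, b, c = c, -b, a            # (x, y) -> (y, -x)
        else:
            k = (a - b) // (2 * a)        # the integer with -a < b + 2ka <= a
            b = b + 2 * k * a             # (x, y) -> (x + ky, y)
            c = (b * b - d) // (4 * a)
    return a, b, c

def semi_reduced_forms(d):
    """All (a, b, c) with b^2 - 4ac = d < 0 and |b| <= a <= c."""
    forms, a = [], 1
    while 3 * a * a <= -d:                # a <= sqrt(|d|/3)
        for b in range(-a, a + 1):
            if (b * b - d) % (4 * a) == 0:
                c = (b * b - d) // (4 * a)
                if c >= a:
                    forms.append((a, b, c))
        a += 1
    return forms

def class_number(d):
    """Count primitive reduced forms: -a < b <= a <= c, and b >= 0 if a == c."""
    return sum(1 for a, b, c in semi_reduced_forms(d)
               if b > -a and (a != c or b >= 0) and gcd(gcd(a, b), c) == 1)
```

For instance `reduce_form(2, 5, 6)` performs one translation and returns `(2, 1, 3)`, a form of discriminant $-23$.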
Before we count representations, let's note that given one representation,
one can often find a second trivially (the automorphs), for example $F(m, n) = F(-m, -n)$.

Exercise 22.2 Show that the only other automorphs when d < 0 occur for $d = -3$
and $d = -4$. We denote the number of automorphs by w(d). Deduce that
$w(-4) = 4$, $w(-3) = 6$ and $w(d) = 2$ for all other negative discriminants d.
The key result in the theory of binary quadratic forms is to show that there is
a 1-1 correspondence between the inequivalent representations of a given integer
N by the set of binary quadratic forms of discriminant d, and the
solutions to $x^2 \equiv d \pmod{4N}$. Once this is established one knows that the total
number of representations is
$$R(N) = w(d) \sum_{k | N} \left(\frac{d}{k}\right).$$

Dirichlet had the idea to simply sum R(N) over all $N \le x$ since the sum equals
the total number of values up to x of the inequivalent binary quadratic forms F
of discriminant d < 0.
Exercise 22.3 Show that the number of pairs m, n of integers for which
$am^2 + bmn + cn^2 \le x$ can be approximated by the area of this shape, with an error term
proportional to the perimeter, that is $2\pi x/\sqrt{|d|} + O(\sqrt{x})$.
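A quick numerical check of this exercise: the sketch below counts lattice points inside the ellipse by brute force and compares with the area $2\pi x/\sqrt{|d|}$. It assumes d < 0 < a and an integer x; the function name is hypothetical.

```python
from math import isqrt, pi, sqrt

def count_lattice_points(a, b, c, x):
    """Number of integer pairs (m, n) with a m^2 + b m n + c n^2 <= x."""
    d = b * b - 4 * a * c
    if not (d < 0 < a):
        raise ValueError("need a positive definite form")
    # Completing the square gives n^2 <= 4ax/|d| and m^2 <= 4cx/|d|.
    n_max = isqrt(4 * a * x // (-d))
    m_max = isqrt(4 * c * x // (-d))
    return sum(1
               for m in range(-m_max, m_max + 1)
               for n in range(-n_max, n_max + 1)
               if a * m * m + b * m * n + c * n * n <= x)
```

For $m^2 + n^2$ (so $d = -4$) the main term is $\pi x$, the classical Gauss circle count.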
Hence
$$\sum_{N \le x} R(N) = h(d) \left( \frac{2\pi x}{\sqrt{|d|}} + O(\sqrt{x}) \right).$$

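On the other hand
$$\sum_{N \le x} R(N) = w(d) \sum_{N \le x} \sum_{k | N} \left(\frac{d}{k}\right) = w(d) \sum_{ab \le x} \left(\frac{d}{a}\right).$$
The main term comes from summing over $a \le \sqrt{x}$, since the number of b is
$x/a + O(1)$, to obtain
$$\sum_{a \le \sqrt{x}} \left(\frac{d}{a}\right) \frac{x}{a} + O(\sqrt{x}) = x \sum_{a \ge 1} \left(\frac{d}{a}\right) \frac{1}{a} + O\left( x \Big| \sum_{a > \sqrt{x}} \left(\frac{d}{a}\right) \frac{1}{a} \Big| + \sqrt{x} \right) = x L(1, (d/.)) + O(|d| \sqrt{x}),$$
by partial summation since the sum of (d/a), over any interval of length 4|d|,
equals 0. For the same reason
$$\sum_{b \le \sqrt{x}} \Big| \sum_{\sqrt{x} < a \le x/b} \left(\frac{d}{a}\right) \Big| \le 4|d| \sqrt{x}.$$
Dividing through by x, and then letting $x \to \infty$, we obtain Dirichlet's class
number formula: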

$$h(d) = \frac{w(d) \sqrt{|d|}}{2\pi} L(1, (d/.)), \quad \text{when } d < 0.$$
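The class number formula can be sanity-checked numerically. The sketch below restricts to $d = -p$ with p a prime congruent to 3 modulo 4, an assumption made only so that $(d/n)$ coincides with the Legendre symbol $(n/p)$ and can be computed by Euler's criterion. It truncates the series for $L(1, (d/.))$ after a whole number of periods, so the output is an approximation, not an exact value. Function names are hypothetical.

```python
from math import pi, sqrt

def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def L1_partial(p, periods=2000):
    """Partial sum of sum_{n>=1} (n/p)/n over the first `periods` full periods."""
    chi = [legendre(n, p) for n in range(p)]
    return sum(chi[n % p] / n for n in range(1, p * periods + 1))

def class_number_estimate(p):
    """w(d) sqrt(|d|) L(1, chi) / (2 pi) for d = -p, with p = 3 (mod 4) prime."""
    w = 6 if p == 3 else 2
    return w * sqrt(p) / (2 * pi) * L1_partial(p)
```

The estimates for $p = 7$ and $p = 23$ come out close to 1 and 3, matching the counts of reduced forms of discriminants $-7$ and $-23$.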

When d < 0 the binary quadratic forms are positive definite and so can only take
each value finitely often. When d > 0 there is no obvious limitation on how often a
given integer can be represented, and indeed integers can be represented infinitely
often. The reason for this is that there are infinitely many automorphs for each
d. Fortunately the automorphs can all be generated by two transformations:
$F(m, n) = F(-m, -n)$ and $F(m, n) = F(\alpha m + \beta n, \gamma m + \delta n)$ for some linear
transformation of infinite order. After taking due consideration this leads to
$$h(d) R_d = \sqrt{d}\, L(1, (d/.)), \quad \text{when } d > 0,$$
for some constant $R_d$. In fact $R_d = \log \epsilon_d$ where $\epsilon_d = (x + y\sqrt{d})/2$ corresponds to
the smallest solution with x, y > 0 to $x^2 - dy^2 = 4$.
22.2 Prime values

Let us suppose that $\chi$ is induced from the quadratic character (./D) so that D
must be squarefree. We re-write this as (d/.) = (./D) where $d = (-1)^{(D-1)/2} D$,
so that $d \equiv 1 \pmod 4$. To begin with we look at divisibility. For a binary
quadratic form $ax^2 + bxy + cy^2$, we know that $(a, b, c)^2 \mid d$, which is squarefree, and
so (a, b, c) = 1. Also note that $(m, n)^2$ divides $am^2 + bmn + cn^2$, so we proceed
by replacing m by m/(m, n), and n by n/(m, n), and hence we may assume that
m and n are coprime.
We now show that if an odd prime p divides $am^2 + bmn + cn^2$ then (d/p) = 0
or 1. If p divides n then $0 \equiv am^2 + bmn + cn^2 \equiv am^2 \pmod p$ and so p divides
a as (m, n) = 1. Therefore $d = b^2 - 4ac \equiv b^2 \pmod p$ and hence (d/p) = 0 or 1.
If $p \nmid n$ then 4ap divides $4a(am^2 + bmn + cn^2) = (2am + bn)^2 - dn^2$, and so
$$\left(\frac{2am + bn}{p}\right)^2 = \left(\frac{(2am + bn)^2}{p}\right) = \left(\frac{dn^2}{p}\right) = \left(\frac{d}{p}\right)\left(\frac{n}{p}\right)^2 = \left(\frac{d}{p}\right),$$
implying that (d/p) = 0 or 1.


Exercise 22.1 Show that if p is an odd prime then
$$1 - \frac{1}{p^2} \#\{m, n \pmod p : am^2 + bmn + cn^2 \equiv 0 \pmod p\} = \left(1 - \frac{1}{p}\right)\left(1 - \frac{(d/p)}{p}\right).$$
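The local density in this exercise can be verified by exhaustive counting for small primes. A minimal sketch, using exact rational arithmetic and hypothetical function names:

```python
from fractions import Fraction

def legendre(d, p):
    """Legendre symbol (d/p) for an odd prime p; returns -1, 0 or 1."""
    r = pow(d % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def density_by_counting(a, b, c, p):
    """1 - #{(m, n) mod p : a m^2 + b m n + c n^2 = 0 mod p} / p^2."""
    zeros = sum(1 for m in range(p) for n in range(p)
                if (a * m * m + b * m * n + c * n * n) % p == 0)
    return 1 - Fraction(zeros, p * p)

def density_by_formula(a, b, c, p):
    """(1 - 1/p)(1 - (d/p)/p) with d = b^2 - 4ac."""
    d = b * b - 4 * a * c
    return (1 - Fraction(1, p)) * (1 - Fraction(legendre(d, p), p))
```

The two functions should agree for every odd prime p, including primes dividing d, provided p does not divide all of a, b, c.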

We wish to show that $am^2 + bmn + cn^2$ takes on many prime values, that
is not many composite values. If $am^2 + bmn + cn^2 \le x$ is composite then it
certainly has a prime factor $\le \sqrt{x}$ so we will count the number of such values
with no small prime factor. To explain our method in an intuitive fashion we
will proceed assuming that d < 0 < a (so that $am^2 + bmn + cn^2$ only takes
non-negative values); when we give the actual proof we will use sieve weights
that are easier to work with but more difficult to understand.

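The small sieve shows us that if $x = y^u$ then for $M = \prod_{p \le y} p$,
$$\#\{m, n \in \mathbb{Z} : N := am^2 + bmn + cn^2 \le x,\ (N, M) = 1\} = \{1 + O(u^{-u})\} \prod_{p \le y} \left(1 - \frac{1}{p}\right)\left(1 - \frac{(d/p)}{p}\right) X + O(\sqrt{X}),$$
where $X := \#\{m, n \in \mathbb{Z} : N := am^2 + bmn + cn^2 \le x\} = 2\pi x/\sqrt{|d|} + O(\sqrt{x})$.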


We will use this estimate when y is a small power of x, and then obtain a
lower bound by subtracting the number of such integers divisible by a prime in
$(y, x^{1/2}]$.
The trick is that if prime $\ell$ is in this range with $(d/\ell) = 1$ then $\ell$ can be
written as the value of a binary quadratic form of discriminant d in one of two
(essentially different) ways, and then $N/\ell$ similarly. Hence to count the number
of such $N/\ell$ we can use the same estimate, though
in this case we use the
above simply as an upper bound, particularly as N/` x. Hence
#{m, n Z : N := am2 + bmn + cn2 x, (N, M ) = 1, `|N }


Y

py

1
p



(d/p) X
.
1
p
`

Hence in total, we have


#{m, n Z : N := am2 + bmn + cn2 x, N is prime}

Y

X
y<`x1/2
(d/`)=1

2
1
1


`
p

py



(d/p)
1
X,
p

where say u  1/.


LinkSiegCond
From the first equation in the proof of Corollary 21.3 we deduce that that if
there are no primes a (mod q) up to x then
X
1/2

1

`

log Q
log y

1/5
;

y<`x
(d/`)=1

P
hence if x = q L where L/u is sufficiently large then ypx1/2 (1 + (d/p))/p
1/2; and so, from the above, we know that there are many prime values of our
binary quadratic form.


22.3 Finishing the proof of Linnik's Theorem

To obtain a complete proof without proving all sorts of results about binary
quadratic forms (and of positive and negative discriminant), we can proceed
working (more-or-less) only with the character $\chi$, though based on what we
know about binary quadratic forms. The extra observation to add to the analysis
of the previous section is that we should work with the values of all binary
quadratic forms of discriminant d, simultaneously, since Gauss showed that the
total number of inequivalent representations of n is then $\sum_{m | n} \chi(m)$. Hence
let $w(n) = \sum_{m | n} \chi(m)$, so that $w(p) = 1 + \chi(p)$. We define
$$A(x; q, a) = \sum_{\substack{n \le x \\ n \equiv a \pmod q}} w(n).$$

Exercise 22.2 Show that if f is totally multiplicative and $g = 1 * f$ then
$$g(mn) = \sum_{d | (m, n)} \mu(d) f(d) g(m/d) g(n/d).$$
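This identity can be tested numerically before attempting the proof. The sketch below takes f to be the non-principal character modulo 4, chosen here only as a convenient totally multiplicative example, so that $g = 1 * f$ plays the role of w. Function names are hypothetical.

```python
from math import gcd

def mobius(n):
    """Mobius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def chi(n):
    """Non-principal character mod 4 (totally multiplicative)."""
    return (0, 1, 0, -1)[n % 4]

def w(n):
    """w = 1 * chi, i.e. the sum of chi(d) over divisors d of n."""
    return sum(chi(d) for d in range(1, n + 1) if n % d == 0)

def w_of_product(m, n):
    """Right-hand side: sum over d | gcd(m, n) of mu(d) chi(d) w(m/d) w(n/d)."""
    common = gcd(m, n)
    return sum(mobius(d) * chi(d) * w(m // d) * w(n // d)
               for d in range(1, common + 1) if common % d == 0)
```

The check is that `w(m * n)` equals `w_of_product(m, n)` for all small m and n, coprime or not.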

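As usual $A_m(x; q, a) := \sum_n w(n)$ where the sum is over $n \le x$ with $m | n$ and
$n \equiv a \pmod q$. Hence, using the exercise with $f = \chi$, if (m, q) = 1 then
$$A_m(x; q, a) = \sum_{\substack{N \le x/m \\ N \equiv a/m \pmod q}} w(mN) = \sum_{\substack{N \le x/m \\ N \equiv a/m \pmod q}} \sum_{d | (m, N)} \mu(d) \chi(d) w(m/d) w(N/d)$$
$$= \sum_{d | m} \mu(d) \chi(d) w(m/d) \sum_{\substack{N \le x/m \\ N \equiv a/m \pmod q \\ d | N}} w(N/d) = \sum_{d | m} \mu(d) \chi(d) w(m/d) A(x/md; q, a/md).$$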

Now w(n) =

m|n, m n

(m) +

m|n, m< n

(n/m). Therefore

A(x; q, a) =

nx
na (mod q)

(m)

mx
(m,q)=1

1
q

m x
(m,q)=1

(m) +

m|n, m n

m|n, m n

1 + (a)

mx
(m,q)=1

m nx
na (mod q)
m|n

((m) + (a)(m))

x
m

(n/m)
X

(m)

m <nx
na (mod q)
m|n


m + O(1) .


P
P
P
Now m (mod q) (kq+m)(m) = m (mod q) m(m)  q 3/2 . Moreover mM (m)/m =

L(1, ) + O(q/M ), and so A(x; q, a) = (1 + (a))L(1, )x/q + O(q x) since is


real. Hence, if m is squarefree and coprime to q, and (a) = 1 then

X
X
p
x
(d)(d)
q
Am (x; q, a) = L(1, )
w(m/d)(1 + (a/md)) + O
w(m/d) mx/d
mq
d
m
d|m
d|m




Y
Y

x
1
q

= 2L(1, )
1 + (p) 1
+O
x
(1 + (1 + (p)) p) .
mq
p
m
p|m

p|m

Hence if we write Am (x; q, a) = (g(m)/m)A(x;


q,

 a) + rm (x; q, a) then g is a
1
multiplicative function with g(p) = 1 + (p) 1 p and
X
mM

Sieving Lemma

X 1 Y

(1 + (p) + 1/ p)  q M x log2 M.
|rm (x; q, a)|  q M x
m
mM

p|m

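Lemma 22.4 (Standard sieving lemma) Suppose that $a_n$ are a set of real weights
supported on a finite set of integers n. Let $A(x) = \sum_n a_n$ and suppose that there
exists a non-negative multiplicative function g(.) such that
$$A_m(x) = \sum_{n :\ m | n} a_n = \frac{g(m)}{m} A(x) + r_m(x)$$
for all squarefree m, for which there exists $K, \kappa > 0$ such that
$$\prod_{y < p \le z} \left(1 - \frac{g(p)}{p}\right)^{-1} \le K \left(\frac{\log z}{\log y}\right)^{\kappa},$$
for all $2 \le y < z \le x$. Let P be a given set of primes, and P(z) be the product
of the elements of P that are $\le z$. Then
$$\sum_{\substack{n \le x \\ (n, P(z)) = 1}} a_n = \left(1 + O_{K,\kappa}(e^{-u})\right) \prod_{\substack{p \in P \\ p \le z}} \left(1 - \frac{g(p)}{p}\right) \sum_{n \le x} a_n + O\Big( \sum_{\substack{m | P(z) \\ m \le z^u}} |r_m(x)| \Big).$$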

Above we let x 
q 5 /L(1, )2 and z = x , with u large and u small, and
Sieving Lemma
then apply Lemma 22.4 with = 2 to obtain



X

Y
1
(p)
u
w(n) = 1 + O(e )
1
1
A(x; q, a).
p
p
nx
na (mod q)
(n,P (z))=1

pz

Now for each primes p, z < p x we must remove from the left side those n
divisible by p. For each prime p write n = N p and so we get an upper bound from


w(p) times the


sum of w(N ) over N x/p, N a/p (mod q) and (N, P (z)) = 1.
Since x/p x, we can get an upper bound from the same estimate, of the right
side with x/p in place of x; that is divided by p. Hence we deduce that




X
X 1 + (p) Y 
1
(p)

1
w(p) = 1 + O eu +
1
A(x; q, a).

p
p
p

px
pa (mod q)
p prime

In the last section we explained that


we have proved that
(x; q, a) = {1 + oL (1)}

1+(p)
z<p x
p

Y
pz

where x = q L and z = q

pz

z<p x

1
p

log Q
log z

1/5

, and hence



(p)
x
1
L(1, ) ,
p
q

23
EXPONENTIAL SUMS
Given a real number $\alpha$ we consider rational approximations a/q with (a, q) = 1
such that
$$\left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2}. \qquad (23.1)$$
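Approximations satisfying (23.1) are supplied by continued fractions: every convergent a/q of $\alpha$ has this property. This is a standard fact not spelled out in the text, and the sketch below is one way to produce such a/q, assuming $\alpha$ is given as an exact `Fraction` so that the inequality can be checked without rounding. The function name is hypothetical.

```python
from fractions import Fraction
from math import floor

def convergents(alpha, max_terms=20):
    """Continued fraction convergents of alpha (a Fraction), in lowest terms."""
    p_prev, q_prev = 1, 0
    p, q = floor(alpha), 1
    out = [Fraction(p, q)]
    x = alpha
    for _ in range(max_terms - 1):
        frac = x - floor(x)
        if frac == 0:              # alpha is rational and fully expanded
            break
        x = 1 / frac
        a_k = floor(x)             # next partial quotient
        p, p_prev = a_k * p + p_prev, p
        q, q_prev = a_k * q + q_prev, q
        out.append(Fraction(p, q))
    return out
```

For a decimal approximation to $\pi$ the first few convergents are the familiar 3, 22/7, 333/106 and 355/113.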
23.1 Technical Lemmas

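We will work with exponential sums.

Exercise 23.1 Define $e(t) := e^{2\pi i t}$. Let $\|t\|$ be the distance from t to the nearest
integer. Prove that for any real $\alpha$ we have
$$\Big| \frac{1}{2M} \sum_{A - M \le m < A + M} e(m\alpha) \Big| \ll \min\Big\{1, \frac{1}{M\|\alpha\|}\Big\};$$
and
$$\Big| \frac{1}{2M} \sum_{A - M \le m < A + M} \Big(1 - \frac{|m - A|}{M}\Big) e(m\alpha) \Big| \ll \min\Big\{1, \frac{1}{M\|\alpha\|}\Big\}^2.$$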
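Both bounds in the exercise can be observed numerically. The sketch below evaluates the two normalised sums directly; the comparison with implied constant 1 is an assumption that happens to hold because the geometric sum is at most $1/(2\|\alpha\|)$ in absolute value and the weighted sum is a Fejér kernel. Function names are hypothetical.

```python
import cmath

def e(t):
    """e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)

def dist(t):
    """Distance from t to the nearest integer."""
    return abs(t - round(t))

def flat_sum(alpha, A, M):
    """|(1/2M) sum_{A-M <= m < A+M} e(m alpha)|."""
    return abs(sum(e(m * alpha) for m in range(A - M, A + M))) / (2 * M)

def weighted_sum(alpha, A, M):
    """|(1/2M) sum_{A-M <= m < A+M} (1 - |m-A|/M) e(m alpha)|."""
    return abs(sum((1 - abs(m - A) / M) * e(m * alpha)
                   for m in range(A - M, A + M))) / (2 * M)

def bound(alpha, M):
    """min{1, 1/(M ||alpha||)}, read as 1 when alpha is an integer."""
    d = dist(alpha)
    return 1.0 if d == 0 else min(1.0, 1 / (M * d))
```

The smoothed sum decays like the square of the bound for the unsmoothed one, which is the saving used later in the chapter.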

We begin by proving the following:


q
2M N

M log
q



1 + q log N
X
1
min 1,
 N M

M kn + k
log 2M

nN

M
N
N
+
q
M log M

N
q
 1+
+
+
q
M

M, N q
N q<M
M q<N
q < M, N

N
log(2M N/q)
M
if
if
if
if

(23.2)

(23.3)

if $q \le MN$ (and if $q > MN$ the case $\alpha = 1/q$ yields the trivial bound N). In
each case there are examples for which these bounds cannot be improved. We
proceed by writing $\alpha = a/q + \beta$ and $n = mq + r$ with $-q/2 < r \le q/2$ so that
$n\alpha = mq\alpha + ra/q + r\beta$ with $|r\beta| \le (q/2)(1/q^2) = 1/2q$. Hence, for each m these
points are well distributed around the circle (in that for each b, $0 \le b \le q - 1$
there is at most one such point in the arc of length 1/q centered on $mq\alpha + b/q \pmod 1$).
Hence in such an interval our sum is
min{N,q/2}

X
`=0


min 1,

1
M |`/q|


1+

q
q
+
log
M
M

min{N, q}
q
1+ M


,


and summing up over such intervals we obtain the result. Since


min{N,q/2}


min 1,

`=0

1
M |`/q|

2
1+

q
,
M

we also deduce that



2


X
N
1
N 
q 
q
N
=1+
min 1,
 1+
1+
+
+
.
M knk
q
M
q
M
M

(23.4)

expsum2

nN

This usually wins a log(2MN/q) over the first moment, which can be important.
Now we wish to do the same for prime differences. That is, instead of summing
over $n \le N$, we sum over $p, p' \le N$ and let $n = p - p'$. We get $\pi(N) \sim N/\log N$
copies of 0 and the number of prime pairs p, p + n is $\ll (n/\phi(n)) N/\log^2 N$. Now
$n/\phi(n) \ll \sum_{m | n,\ m < \sqrt{n}} \mu^2(m)/m$. Hence

min 1,

X
p,p0 N

1
M k(p p0 )k

2



2 X 2
X
1
N
N
(m)
min
1,
+
log N
M knk
m
log2 N nN
m|n

m< n

N
N
+
log N
log2 N

X 2 (m)
m

m< N


min 1,

m<kN/m

1
M kkmk

2

writing n =expsum2
mk. Now if = a/q then m = b/r (mod 1) where r = q/(q, m).
Hence by (23.4) this sum is

X 2 (m) 
N (m, q)
q
N
N
q
N

1+
+
+
 1+
+
log N + ;
m
mq
M
(m,
q)
mM
(q)
M
M

m< N

and so we deduce that



2
X
1
N2
N2 
q 
0
+
+
1
+
N log N.
min 1,
(d)(d
)

M k(d d0 )k
(q)
M
M
0
d,d N

(23.5)
23.2 The bound of Montgomery and Vaughan

We begin by proving Montgomery and Vaughan's celebrated result that if (23.1)
holds then
$$\sum_{n \le x} f(n) e(n\alpha) \ll \frac{x}{\sqrt{\phi(q)}} + \frac{x}{\log x} + \sqrt{qx \log x}. \qquad (23.6)$$
(The last term can be removed if $q \le x/(\log x)^3$.)


Montgomery and Vaughan proceeded by multiplying through by
log x; converting this to log n brings in an error of O(x). Then writing $\log n = \sum_{d | n} \Lambda(d)$,
we find ourselves with the sum
$$\sum_{dm \le x} f(dm) e(dm\alpha) \Lambda(d).$$

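We break this into intervals (assuming f is totally multiplicative for simplicity)
to get sums of the form
$$\sum_m |f(m)| \, \Big| \sum_d f(d) e(dm\alpha) \Lambda(d) \Big|$$
and apply the Cauchy-Schwarz inequality, so that the square is
$$\le \sum_m |f(m)|^2 \sum_m \Big| \sum_d f(d) e(dm\alpha) \Lambda(d) \Big|^2 \le M \sum_{d_1, d_2} \Lambda(d_1) \Lambda(d_2) f(d_1) \overline{f(d_2)} \sum_m e((d_1 - d_2) m\alpha)$$
$$\ll M \sum_{d_1, d_2} \Lambda(d_1) \Lambda(d_2) \min\Big\{ M, \frac{1}{\|(d_1 - d_2)\alpha\|} \Big\}.$$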

This can be improved by a minor modification. If the range for m is $A - M/2 <
m < A + M/2$ then we bound the top line above by multiplying the mth term
by $2(1 - |m - A|/M)$ (which is $\ge 1$ in this range), and then extend the sum to
all m in the range $A - M < m < A + M$. By the second part of the exercise we
then obtain the bound

2
X
1
x2
x2
 M2
(d1 )(d2 ) max 1,

+ +(M +q)x log(x/M )
M k(d1 d2 )k
(q) M
d1 ,d2 D

by (23.5), as $MD \le x$. We take the square root (since we applied Cauchy-Schwarz) and sum
this up over $1 \le M = 2^i \le x$ to obtain a total upper bound
$$\frac{x \log x}{\sqrt{\phi(q)}} + x + q^{1/2} x^{1/2} \log^{3/2} x,$$
from which (23.6) follows.


23.3 How good is this bound?
If we let $f = \chi$, a character mod q with $\alpha = 1/q$, and x a multiple of q, then
$$\sum_{n \le x} \chi(n) e(n/q) = \frac{x}{q} \sum_{n \le q} \chi(n) e(n/q) = \frac{g(\chi)}{q} x,$$
where $g(\chi)$ is the Gauss sum. We saw earlier that $|g(\chi)| = \sqrt{q}$, and so if q is
prime then this is $\asymp x/\sqrt{\phi(q)}$. Hence the first term in (23.6) needs to be there.
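This extremal example is easy to reproduce. The sketch below takes $\chi$ to be the Legendre symbol modulo an odd prime q, a concrete choice made here for illustration, and checks both $|g(\chi)| = \sqrt{q}$ and the periodicity that gives the factor x/q. Function names are hypothetical.

```python
import cmath
from math import sqrt

def legendre(n, q):
    """Legendre symbol (n/q) for an odd prime q."""
    r = pow(n % q, (q - 1) // 2, q)
    return -1 if r == q - 1 else r

def e(t):
    return cmath.exp(2j * cmath.pi * t)

def gauss_sum(q):
    """g(chi) = sum_{n mod q} chi(n) e(n/q) for chi the Legendre symbol."""
    return sum(legendre(n, q) * e(n / q) for n in range(1, q))

def twisted_sum(x, q):
    """sum_{n <= x} chi(n) e(n/q)."""
    return sum(legendre(n, q) * e(n / q) for n in range(1, x + 1))
```

When x is a multiple of q, the twisted sum is exactly x/q copies of the Gauss sum, so its size is $x/\sqrt{q}$.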


Given the values of f(p) for $p \le x/2$ let $\sum_n f(n) e(n\alpha) = r e(\theta)$ with
$r \in \mathbb{R}_{\ge 0}$, where the sum is over all $n \le x$ other than the primes in (x/2, x]. Now
consider the multiplicative function f where $f(p) = e(\theta - p\alpha)$ for all primes
p, $x/2 < p \le x$. Then $\sum_{n \le x} f(n) e(n\alpha) = (r + \pi(x) - \pi(x/2)) e(\theta)$;
in particular $|\sum_{n \le x} f(n) e(n\alpha)| \gg x/\log x$. Hence the second term in (23.6) needs to be there.
In both cases we do not need to take f to be exactly the functions described,
f should just be pretentious in that way. In the latter case one can most easily
avoid such problems by removing all integers that have
some large prime factor:

As shown by La Breteche one has, for q < y + x/e2 log x ,


X

f (n)e(n) 

nx
P (n)y

x
xy +
q

log2 x +

log x

(23.7)

In this case we do not multiply through by log x but rather write each n = dm
where (d, m) = 1 and d is a power of the largest prime dividing n. Hence
X

f (n)e(n) =

nx
P (n)y

X
d=pk
p prime,y

f (d)

f (m)e(dm).

mx/d
P (m)<p

Taking absolute values, we first deal with the term where d is a prime power.
This gives
X

(x/pk , p).
d=pk , k2
p prime

Using our estimate


(*) for (x, y) it is an exercise to show that this 2is 
x/ exp({2 + o(1)} log x log
log x) then main contribution coming from p values around exp({1 + o(1)} log x log log x). We shall similarly approach those
terms where d = p T is small: they contribute

(x/p, p)  x/ exp({ 2 + o(1)} log x log log x)

pT
p prime

q
where T = exp( 12 log x log log x).
To bound the remaining terms we forget that d should only be prime, and
arrive at






X X

f (m)e(dm) .


T <dy mx/d

P (m)<d
Cauchying for the terms with T < d D and m  M where DM  x we obtain


D

D

f (m)f (m0 )

m,m0 2M

min

m,m0 2M

e(d(m m0 ))

P (m),P (m0 )<dx/ max{m,m0 }


x
1
,
0
max{m, m } k(m m0 )k


.

The m = m0 terms yield  Dx log M . Otherwise let k = min{m, m0 } and


k + j = max{m, m0 }, so our sum becomes
X

D


min

j,k2M

x
1
,
k + j kjk

For each k we partition the j-values into intervals [1, k] and [2i k, 2i+1 k)expsum1
for
i = 0, 1, 2, . . . , I where I is minimal such that 2I k > 2M , and then apply (23.3)
assuming x q. We obtain


X x x
+
+ q log(M/k) + M log(2x/q)
k
q
k2M




x
+ q DM + DM 2 log(2x/q).
 xD log 2M +
q

D

Now we take the square root


and sum this over all M = 2j X/T with D =
LaBret1
min{y, x/M } to obtain (23.7).
23.4 When f is pretentious

We have seen that Montgomery and Vaughan's bound can be considerably
improved if one removes the effect of the large prime factors, unless f is pretentious for some character of modulus q. Here we will be interested in
obtaining better estimates in this special case.
We deduce
X

log x
nx

f (n)e(n) =

f (n)e(n) log[Q,x/Q] n + O ((f )x log Q) .

nx

(mod q)

(23.8)

Small.1

FirstRedn

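from Lemma 17.6. If (b, d) = 1 then
$$e\left(\frac{b}{d}\right) = \sum_{j=0}^{d-1} e\left(\frac{j}{d}\right) \frac{1}{\phi(d)} \sum_{\chi \bmod d} \overline{\chi}(b) \chi(j) = \frac{1}{\phi(d)} \sum_{\chi \bmod d} \overline{\chi}(b) g(\chi); \qquad (23.9)$$
therefore if (a, q) = 1 then, writing n = mq/d when (n, q) = q/d (so that
(m, d) = 1),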
ExpSums2Chars

106

Exponential Sums


f (n)e

nx

an
q


=

(d)

f (mq/d)e

 am 

d|q mx/(q/d)
(m,d)=1

X f (q/d)
d|q

(a)g()

mod d

f (m)(m)

mx/(q/d)

FnsInAPs
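The identity (23.9), which writes e(b/d) as an average of Gauss sums over the characters mod d, can be checked numerically. The following is a minimal sketch, assuming d is an odd prime (so the characters can be built from a primitive root) and assuming the normalization g(chi) = sum over j mod d of chi(j) e(j/d); the helper names are hypothetical and not taken from the text.

```python
import cmath
from math import pi


def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * pi * x)


def primitive_root(d):
    """Smallest primitive root modulo an odd prime d (brute force)."""
    for g in range(2, d):
        if len({pow(g, j, d) for j in range(d - 1)}) == d - 1:
            return g
    raise ValueError("no primitive root found; d must be an odd prime")


def characters(d):
    """All Dirichlet characters modulo a prime d, each as a dict n -> chi(n)."""
    g = primitive_root(d)
    index = {pow(g, j, d): j for j in range(d - 1)}  # discrete logarithm table
    chars = []
    for k in range(d - 1):
        chi = {0: 0j}
        for n, j in index.items():
            chi[n] = e(j * k / (d - 1))
        chars.append(chi)
    return chars


def gauss_sum(chi, d):
    """g(chi) = sum_{j mod d} chi(j) e(j/d) (assumed normalization)."""
    return sum(chi[j] * e(j / d) for j in range(d))


def rhs(b, d):
    """Right-hand side of (23.9): average of conj(chi(b)) * g(chi) over chi mod d."""
    chars = characters(d)
    total = sum(chi[b % d].conjugate() * gauss_sum(chi, d) for chi in chars)
    return total / (d - 1)  # phi(d) = d - 1 for prime d


d = 7
for b in range(1, d):
    assert abs(e(b / d) - rhs(b, d)) < 1e-9
```

The check only exercises prime moduli; for composite d one would need a general construction of the characters, which this sketch does not attempt.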

Proceeding as in the proof of Theorem 19.1 we have










X
X
X
X

an
f (q/d)


f
(n)e
(a)g()
f
(m)(m)



q
(d)
nx

mod
d
mx/(q/d)
d|q


induces some j , 1jk

X

X d
X


f (m)(m)

(d)
mx/(q/d)

mod d
d|q
induces some j , k+1jK

X
d|q

d
(d)

X
mod d
induces j , j>K



X


log[Q,x/Q] m

,
f (m)(m)


log x
mx/(q/d)

LinearPLS

since all the prime factors of q/d are < Q. By Proposition 19.2 x = QA , the
second error term is



X d 0 (f ) x
x Y
1
2
1
0



(f
)
1
+
+
1 .

1
(d) A
q/d
q
p p
A
d|q

p|q

For the first error


term we Cauchy it, in two parts, and proceed as in the proof
FnsInAPs
of Theorem 19.1 to obtain


1
2
log A
x Y
1+ +
1 1
 0 (f )
q
p p
k+1
A
p|q

which dominates.
We now deal with the main terms: Suppose that the primitive character $\psi \pmod r$ induces some $\chi_j \pmod q$. If $\chi \pmod{kr}$ is induced from $\psi \pmod r$ then $g(\chi) = \mu(k)\psi(k)g(\psi)$, so we may assume $(k,r)=1$ else $g(\chi)=0$. Therefore the total contribution is
\[ = \overline{\psi}(a) g(\psi) \sum_{\substack{k|q/r \\ (k,r)=1}} \frac{f(q/kr)}{\varphi(kr)} \mu(k)\psi(k) \sum_{\substack{m\le krx/q \\ (m,k)=1}} f(m)\overline{\psi}(m). \]
By (14.2) the error terms add up to

When f is pretentious

r X (k)2 k
krx


(r)
(k) (k)
q

2(q/r) r

x
(q)

(log x)2

as 1 2/ > 2
terms add up to

log log x
(log x)2

log log x

k|q/r
(k,r)=1

log q/r
log x

107

log q/r
log x

12/


log

12/


log

log x
log q/r

log x
log q/r

!

!


q
log log x
,
x
(q) (log x)2 3

3, and since the maximum is attained when r  q. The main


Y
X f (q/kr)
F (p) X
1
1 1+it
F (n)
(a)g()
(k)(k)
(kr)
(q/kr)1+it
p
nx

p|k

k|q/r
(k,r)=1

with F (n) = f (n)(n); and this equals


(f, , t; q)

(a)g() X
F (n)
(q)
nx

where
Y

(f, , t; q) :=

(p)e ((F (p)pit )e (F (p)pit )e1 )

(f (p)/pit )e .

p kq/r
p-r

p kq/r
p|r

Hence in total we have


X
nx


f (n)e

an
q


=

k
X

(f, j , t; q)

j=1

j (a)g(j ) X
f (n) j (n)
(q)
nx


Y
q
1
1
log
A
log
log
x
0
+ (f )
+O
x
1+ +
1 1 .
(q)
p p
k+1
(log x)2 3
A
p|q

In particular if we have log q = (log x)o(1) then since 1 2/, 1 1/ 2 > 2 3


we deduce that

  X


k
X
X
j (a)g(j )
an
x

f (n)e

(f, j , t; q)
f (n) j (n) + O 2(q/r)

q
(q)
(log x)2 3+o(1)
j=1
nx
nx

q
x

(q) (log x)1 12 +o(1)
if log q  (log log x)2 .


In the special case that q is prime we have that

(q)
x

nx f (n)e





f (q) 1 X
log log x
log A
+
1 it
f (n) + O
2
q
x
(log x)2 3
A1
nx
if some j = 1; plus (a)g() times


f (q) 1 X
log log x

f
(n)(n)
+
O
q it x
(log x)2 3
nx

1 k+1
for each = j of conductor q, plus O( q(log A)/A
).

an
q

equals

24
THE EXPONENTS K
We wish to find the largest exponents $\eta_1\ge\eta_2\ge\ldots$ that can be used in Proposition ??; that is if $\chi_1,\chi_2,\ldots,\chi_k$ are distinct characters mod $q$, with $q<Q<x$
then


X 1 Re((f j )(p)/pitj )
log x
{1 k + o(1)} log
max
+ Ok (1),
1jk
p
log Q
Qpx

kRepulsion

where the implicit constants are independent of f . Proposition ?? shows that


k 1k .
It is evident that 1 = 1 taking the example f (n) = 1 along with = 0 .
eta2

Proposition 24.1 We have 2 1/3. In fact 2 = 1/3 assuming that




 
(x)
log q
(x; q, a) =
1+O
(q)
log x
for x q 2 .
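Proof To prove the lower bound, suppose that $\psi \pmod q$ has order 3, and let $\omega$ be a primitive cube root of unity. Define
\[ f(p) = \begin{cases} 1 & \text{if } \psi(p)=1, \\ -1 & \text{if } \psi(p)=\omega \text{ or } \omega^2, \end{cases} \]
so that
\[ 1 - \mathrm{Re}\, f(p)\psi(p) = 1 - \mathrm{Re}\, f(p)\overline{\psi}(p) = \frac{1-\mathrm{Re}\,\psi(p)}{3}. \]
Therefore our two sums are
\[ \frac13 \sum_{Q\le p\le x} \frac{1-\mathrm{Re}\,\psi(p)}{p} = \frac13 \log\Big(\frac{\log x}{\log Q}\Big) + O(1), \]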

as in the proof of Proposition ??, and so 2 1/3.


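To prove our lower bound it suffices, as in the proof of Proposition ??, to suitably bound
\[ \sum_{Q<p\le x} \frac1p \Big| \frac12 \sum_{j=1}^{2} \frac{\chi_j(p)}{p^{it_j}} \Big| = \frac12 \sum_{Q<p\le x} \frac1p \big| 1 + \psi(p)p^{2it} \big| \]
where $\psi = \chi_1\overline{\chi_2}$ and $t=(t_2-t_1)/2$. We will suppose $t>0$ (if $t<0$ we simply replace $\psi(p)p^{2it}$ by $\overline{\psi}(p)p^{-2it}$). By our assumption on $\pi(x;q,a)$, this equals
\[ \frac{1}{2\varphi(q)} \sum_{\substack{a \pmod q \\ (a,q)=1}} \int_{\log Q}^{\log x} \big| 1+\psi(a)e^{2itv} \big| \frac{dv}{v} + O(1). \]
If $\psi$ has order $m>1$ then for each $j \pmod m$ there are exactly $\varphi(q)/m$ values of $a \pmod q$ for which $\psi(a)=e^{2i\pi j/m}$, and so our integral equals
\[ \frac1m \sum_{j=0}^{m-1} \int_{\log Q}^{\log x} |\cos(tv+\pi j/m)| \frac{dv}{v} = \frac1m \sum_{j=0}^{m-1} \int_{t\log Q}^{t\log x} |\cos(\pi j/m+\theta)| \frac{d\theta}{\theta}. \]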
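We handle that part of the integral with $\theta\ge 1$ using the first part of exercise ??. When $\theta<1$ we substitute $|\cos(\pi j/m+\theta)| = |\cos(\pi j/m)| + O(\theta)$. Hence our integral equals
\[ \frac2\pi \int_{\max\{1,t\log Q\}}^{\max\{1,t\log x\}} \frac{d\theta}{\theta} + c_m \int_{\min\{1,t\log Q\}}^{\min\{1,t\log x\}} \frac{d\theta}{\theta} + O(1) \]
where $c_m := \frac1m \sum_{j=0}^{m-1} |\cos(\pi j/m)|$ equals
\[ \frac1m \sum_{-m/2<j\le m/2} \cos(\pi j/m) = \begin{cases} \dfrac{1}{m\sin(\pi/2m)} & \text{if $m$ is odd,} \\[2mm] \dfrac{1}{m\tan(\pi/2m)} & \text{if $m$ is even.} \end{cases} \]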

The maximum of $c_m$ thus occurs for $m=3$, and equals $2/3$. Therefore, since $\frac2\pi<\frac23$, our integral is
\[ \le \frac23 \int_{t\log Q}^{t\log x} \frac{d\theta}{\theta} = \frac23 \log\Big(\frac{\log x}{\log Q}\Big) + O(1) \]
as claimed.
When $t=0$ we can simplify the above proof to obtain $c_m \log\big(\frac{\log x}{\log Q}\big) + O(1)$.
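The claims about the constants c_m (the closed forms for odd and even m, and the maximum 2/3 at m = 3) are easy to confirm numerically. The sketch below assumes only the definition c_m = (1/m) * sum over j = 0..m-1 of |cos(pi j/m)| for m > 1; the function names are illustrative.

```python
from math import cos, sin, tan, pi


def c_direct(m):
    """c_m computed straight from the definition."""
    return sum(abs(cos(pi * j / m)) for j in range(m)) / m


def c_closed(m):
    """Closed form: 1/(m sin(pi/2m)) for odd m, 1/(m tan(pi/2m)) for even m."""
    if m % 2 == 1:
        return 1 / (m * sin(pi / (2 * m)))
    return 1 / (m * tan(pi / (2 * m)))


# Orders m > 1 only, as in the text; the range bound 200 is an arbitrary cut-off.
values = {m: c_direct(m) for m in range(2, 200)}
best = max(values, key=values.get)

for m in values:
    assert abs(values[m] - c_closed(m)) < 1e-10
assert best == 3
assert abs(values[3] - 2 / 3) < 1e-12
```

For large m both closed forms tend to 2/pi, from above when m is odd and from below when m is even, which is why only small odd m compete for the maximum.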

Proposition 24.2 We have

2 m

< m

1
m

for all m 1.
Proof The upper bound was obtained in Proposition ??. For the lower bound, let $\ell$ be the smallest prime in $(2m,4m]$, say $\ell = 2k+1$.
Suppose that $\psi \pmod q$ has prime order $\ell$. Define $f(p)=1$ if $\psi(p)=1$, and
\[ f(p) = \Big(\frac a\ell\Big) \quad\text{when}\quad \psi(p) = e\Big(\frac a\ell\Big) \]
whenever $a\not\equiv 0 \pmod \ell$. In this case we note that


f (p)g` =
n

(mod `)

where g` = g


.
`

a1 n
`

  
n
=
e
`

X
m

(mod `)

 m   am 
e
=
`
`


X
m

(mod `)

m
`

, so that

`1
`1
1 X m m
1 X m
f (p) =
(p) +
(p),
g` m=0 `
` m=0

kRepulsion

for all p. As in the proof of Proposition ?? we deduce that




X Re(f (p)j (p))   1   j  1 
log x
= Re
+
log
+ Oq (1).
p
g`
`
`
log q
qpx

Now g` R if and only if l 1 (mod 4). Moreover there are exactly `1


2 = k

values of j (mod `) for which j` has the same sign as g` , and for these the
above implies that
m k

1
1
1
1
1
1
1
+ = + =
+
> .
g`
`
`
2k
+
1
2 m
2k + 1
`
2

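24.1

How to determine a better upper bound on $\eta_k$ in general

We may proceed much as in the proofs above. Given the $\chi_j$ and $t_j$ we select $f(p)$ to have size 1 in the same direction as $\sum_{j=1}^k \chi_j(p)p^{it_j}$ so that
\[ \sum_{j=1}^k \mathrm{Re}\big((f\overline{\chi_j})(p)/p^{it_j}\big) = \mathrm{Re}\Big( f(p) \sum_{j=1}^k \overline{\chi_j}(p)/p^{it_j} \Big) = \Big| \sum_{j=1}^k \chi_j(p)p^{it_j} \Big| . \]
Hence in this case
\[ \sum_{j=1}^k \sum_{Q\le p\le x} \frac{\mathrm{Re}((f\overline{\chi_j})(p)/p^{it_j})}{p} = \sum_{Q\le p\le x} \frac{\big|\sum_{j=1}^k \chi_j(p)p^{it_j}\big|}{p}. \]
Using the hypothesis of Proposition 24.1, this equals
\[ \frac{1}{\varphi(q)} \sum_{\substack{a \pmod q \\ (a,q)=1}} \int_{\log Q}^{\log x} \Big| \sum_{j=1}^k \chi_j(a)e^{it_j v} \Big| \frac{dv}{v} + O(1). \]
It is not so easy to proceed as before since the quantity inside $|\cdot|$ is no longer periodic. Certainly one can do something similar but not the exact same thing.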

m (p),


One important special case is where each $t_j=0$, since this was the worst case when $k=2$. In this case suppose that each $\chi_j^m=1$, that $q$ is prime and if $g$ has order $m$ mod $q$ then $\chi_j(g)=e(b_j/m)$. (Here the $b_j$ must be distinct mod $m$, as the $\chi_j$ are distinct.) Hence the above becomes
\[ \frac1m \sum_{n=0}^{m-1} \Big| \sum_{j=1}^k e(nb_j/m) \Big| \log\Big(\frac{\log x}{\log Q}\Big) + O(1). \]
We therefore wish to find the maximum of this as we vary over all possible $b_j$.
By computer we found optimal examples for $2\le k\le6$ by an exhaustive search. Writing the example as $[b_1,\ldots,b_k;m]$, we have $[0,1;3]$, $[0,1,3;7]$, $[0,1,3,9;13]$, $[0,1,4,14,16;21]$, $[0,1,3,8,12,18;31]$. One observes that $m=k^2-k+1$ and that these are all perfect difference sets; that is the numbers $\{b_i-b_j \pmod m: 1\le i\ne j\le k\}=\{\ell \pmod m: 1\le\ell\le m-1\}$. This case is easy to analyze because then we have
\[ \Big|\sum_{j=1}^k e(nb_j/m)\Big|^2=\sum_{1\le i,j\le k}e(n(b_i-b_j)/m)=k+\sum_{1\le\ell\le m-1}e(n\ell/m)=k-1, \]
if $n\ne0$. Therefore
\[ \frac1m\sum_{n=0}^{m-1}\Big|\sum_{j=1}^ke(nb_j/m)\Big|=\frac{(m-1)\sqrt{k-1}+k}{m}. \]

Exercise 24.3 Use the Cauchy-Schwarz inequality to show that this is indeed maximal. (Hint: Under what circumstances do we get equality in the Cauchy-Schwarz inequality?)
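The listed examples can be verified directly. The following sketch checks that each example [b_1, ..., b_k; m] is a perfect difference set with m = k^2 - k + 1, and that the average of |sum_j e(n b_j/m)| over n mod m matches the closed form ((m-1) sqrt(k-1) + k)/m. The function names are hypothetical; only the example sets come from the text.

```python
import cmath
from math import pi, sqrt

# Examples [b_1, ..., b_k; m] as listed in the text.
EXAMPLES = [
    ([0, 1], 3),
    ([0, 1, 3], 7),
    ([0, 1, 3, 9], 13),
    ([0, 1, 4, 14, 16], 21),
    ([0, 1, 3, 8, 12, 18], 31),
]


def is_perfect_difference_set(b, m):
    """True if the differences b_i - b_j (i != j) hit each nonzero residue mod m exactly once."""
    diffs = sorted((x - y) % m for x in b for y in b if x != y)
    return diffs == list(range(1, m))


def average_abs_sum(b, m):
    """(1/m) * sum_{n mod m} | sum_j e(n b_j / m) |."""
    total = 0.0
    for n in range(m):
        total += abs(sum(cmath.exp(2j * pi * n * bj / m) for bj in b))
    return total / m


for b, m in EXAMPLES:
    k = len(b)
    assert m == k * k - k + 1
    assert is_perfect_difference_set(b, m)
    predicted = ((m - 1) * sqrt(k - 1) + k) / m
    assert abs(average_abs_sum(b, m) - predicted) < 1e-9
```

This only confirms the stated examples; it does not reproduce the exhaustive search for optimality mentioned in the text.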
Although there are perfect difference sets for $k=2,3,4,5$ and $6$, there is none for $k=7$ (there is no projective plane of order 6). The existence of a perfect difference set is equivalent to the existence of a cyclic projective plane mod $m=k^2-k+1$.$^1$ There are always perfect difference sets when $k-1$ is a prime power.
$^1$ This is Theorem 2.1 in "Cyclic projective planes" by Marshall Hall Jr, Duke 14 (1947) 1079-1090.

The next question is to understand the size of the individual sums, if we want a lower bound. What we get is that
\[ \sum_{Q\le p\le x} \frac{\mathrm{Re}((f\overline{\chi_i})(p))}{p} = c_i \log\Big(\frac{\log x}{\log Q}\Big) + O(1), \]
where
\[ c_i = \frac1m \sum_{n=0}^{m-1} \mathrm{Re}\Big( e\Big(-\frac{nb_i}{m}\Big) \sum_{j=1}^k e\Big(\frac{nb_j}{m}\Big) \Big) \Big/ \Big| \sum_{j=1}^k e\Big(\frac{nb_j}{m}\Big) \Big| = \frac1m \Big( 1 + \frac{1}{\sqrt{k-1}} \sum_{j=1}^k \sum_{n=1}^{m-1} e\Big(\frac{n(b_j-b_i)}{m}\Big) \Big) = \frac1m \Big( 1 + \frac{m-k}{\sqrt{k-1}} \Big) > \frac{1}{\sqrt{k+1}}, \]
for $k>1$. Evidently, because of equalities throughout, this is best possible (when the $t_j=0$). It also supplies us with a lower bound in general, at least if $k$ is prime.
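The individual constants c_i can also be computed numerically for a perfect difference set. The sketch below assumes, as in the discussion of the case t_j = 0, that f points in the direction of S_n = sum_j e(n b_j/m) and that the i-th character takes the value e(n b_i/m); it compares the resulting average with (1/m)(1 + (m-k)/sqrt(k-1)). The function names are hypothetical.

```python
import cmath
from math import pi, sqrt


def individual_constant(b, m, i):
    """(1/m) * sum_n Re( conj(e(n b_i/m)) * S_n / |S_n| ), with S_n = sum_j e(n b_j/m)."""
    total = 0.0
    for n in range(m):
        s = sum(cmath.exp(2j * pi * n * bj / m) for bj in b)
        direction = s / abs(s)  # plays the role of f(p), of size 1
        total += (direction * cmath.exp(-2j * pi * n * b[i] / m)).real
    return total / m


def closed_form(k, m):
    """(1/m) * (1 + (m - k)/sqrt(k - 1)), valid for a perfect difference set with k > 1."""
    return (1 + (m - k) / sqrt(k - 1)) / m


for b, m in [([0, 1], 3), ([0, 1, 3], 7), ([0, 1, 3, 9], 13)]:
    k = len(b)
    for i in range(k):
        assert abs(individual_constant(b, m, i) - closed_form(k, m)) < 1e-9
```

The value is the same for every i, reflecting the symmetry of a perfect difference set; for a general set of b_j the constants would differ with i and S_n could vanish, which this sketch does not handle.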
We can use short gaps between primes to extend this to all k. For example,
the prime number theorem implies that there is always a prime in [m, m + o(m)),
and so
1
k .
k

25
LOWER BOUNDS ON L(1, ), AND ZEROS; THE WORK OF
PINTZ
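Exercise 25.1 For $\eta$, $0<\eta<1$ show that
\[ \lim_{x\to\infty} \Big( \sum_{m\le x} \frac{1}{m^{1-\eta}} - \frac{x^\eta-1}{\eta} \Big) \]
exists, and call it $\gamma_\eta$. Prove that
\[ \Big| \sum_{m\le x} \frac{1}{m^{1-\eta}} - \frac{x^\eta-1}{\eta} - \gamma_\eta \Big| \le \frac{1}{x^{1-\eta}}. \]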
Proposition 25.2 Suppose that $L(s,\chi)\ne 0$ for real $s$, $1-\frac{1}{\log q}\le s\le 1$, where $\chi$ is a real non-principal character mod $q$. Then $L(1,\chi)\gg 1/\log q$.
Proof Let $\eta=c/\log q$. For any real character $\chi$ define $g=1*\chi$ so that $g(n)\ge 0$ for all $n$ and $g(m^2)\ge 1$. Hence
m2 x

1
m22

X g(n)
X (d) X
1
=
1
n
d1
m1
nx
dx
mx/d

X (d)  (x/d) 1
d1

+ + 1
d1

x
dx

X
X
(d)
1
(d)
1 X
x
+
+ 1
(d)
=
1

d
x
dx
dx
dx


x
1

L(1, ) +
L(1 , ) + O(q/x1 )

P
since d>x (d)/d  q/x for all > 0. Now as there are no zeros in [1 , 1]
hence L(1 , ) > 0 (like L(1, )) and < 1/ so that term is < 0. Taking
x = q 2 we obtain the result.
2
25.1

Siegel's Theorem

If $L(s,\chi)\ne0$ for real $s$, $1-\frac1{\log q}\le s\le1$, for all real quadratic characters $\chi \pmod q$, for all $q$ then we can use the above Proposition. Otherwise we suppose that there exists a character $\psi \pmod k$ and a real number $\rho$ such that $L(\rho,\psi)=0$. Now
X (a)(a) (b)
X (1 )(n)(n)
=

n
a
b
nx

abx

X (a)(a) X (b)
X (b)
X
(a)(a)
+
a
b
b
a

bx/a
a x
b x
x<ax/b

X (a)(a)
X 1
X 1
qk
 qk .
=
L(, ) + O k
+ /2

a
x
x1/2

x
b

a x

a x

b x

Now (1)(n) = 1 if n = m2 , and 0 otherwise, where is Liouvilles function.


We write = h (so that h(pk ) = (pk )(1 + (p))). Now
1

X
m2 x
(m,k)=1

m22

X (n)(1 )(n)
X (n)(1 h)(n)
X (a)(1 )(a) (b)h(b)
=
=

.
n
n
a
b

nx

nx

abx

Now the terms with b x are, since |h(b)| 2(b) , and using the bound above,




X 2(b)
X |h(b)| X (1 )(a)(a)
X 2(b)
qk
qk
qk log x




 3/4 .

(x/b)1/2
1/2
1/2
b
a
b
x
b
x

ax/b

b x
b x
b x
The remaining terms , since |(1 )(a)| d(a), 0 |h(b)| (1 )(b) and
1/(ab) x1 /ab, are
x1

X d(a) X
a

a x

x<bx

X d(a)
X
(1 )(b)
= x1
b
a

a x

x<mnx

(m)
.
mn

The first sum here is  x(1)/2 log x. For the second we have
X 1
X
1
(m)
+
n
n 1/3
m
x/m<nx/m
mx1/3
nx2/3
x
<mx/n

X (m) 1
X 1
X 1 q

+O
=
log x + O
1/3
m
2
n
x
x
mx1/3
mx1/3
nx2/3


1
1
= L(1, ) log x + O
2
x1/6
X

(m)
m

if x > q 7 . Combining the above we obtain, provided 9/10 and taking x = q 7 ,


that
L(1, )  1/q 11(1) .

26
THE SIEGEL-WALFISZ THEOREM
We saw in our discussion of Selberg-Tenenbaum that if the mean value of $f(p)$ is about $\alpha$, with $\alpha\ne0,-1$ then the mean value of $f(n)$ for $n\le x$ is about $c_f/(\log x)^{1-\alpha}$. In the two cases $\alpha=0$ and $-1$ one can show that the mean value of $f(n)$ is $\ll_f1/(\log x)^2$. In our first subsection we shall sketch an argument to show that if the mean value of $f(p)$ is about 0 then the mean value of $f(n)$ is correspondingly small. The case when the mean value of $f(p)$ is $-1$ is rather more difficult but fortunately featured in [?]. This is relevant to a strong version of the prime number theorem, since their argument can be used to bound the mean value of $\mu(n)$. In a future version we shall give a stronger version of their argument.
The main point of this section is to prove a strong converse theorem when
the mean value of f (n) is around 0 and the mean value of f (p) cannot be close
to 1. Since this is what we know about Dirichlet characters this will lead us to
a pretentious proof of the Siegel-Walfisz Theorem. This proof is due to Dimitris
Koukoulopoulos. In this version of the book we include a preliminary version of
his paper; he will present a more complete version at the Snowbird meeting.
26.1 Primes well distributed implies...
Let $S(x)=\sum_{n\le x} f(n)$ and $P(x)=\sum_{d\le x}\Lambda(d)f(d)$. Assume $|P(x)|\le cx/(\log x)^A$ with $A>2$.
Select $B$ in the range $2<B<A$ and then $c_B>0$ minimal for which there exists a constant $x_B$ such that if $x\ge x_B$ then $|S(x)|\le c_Bx/(\log x)^B$. Let $D=x^{\delta}$ with $\delta>0$ so that $(B-1)(1-\delta)^{B-1}>1$.
Suppose $f$ is totally multiplicative. Then
X
X
X
X
f (n) log n =
f (md)(d) =
(d)S(x/d)+
f (m)(P (x/m)P (D)).
nx

dmx

dD

mx/D

The second term is, in absolute value,


X
2c
x
x
x


.
2c
A
A1
m log(x/m)
A 1 (log D)
(log x)A1
mx/D

If our bound is proved up to x/2 then we can insert into the first term to obtain
X
x
cB
x
x
cB
(d)

< (1 2)cB
.
B
B1
d log(x/d)
B 1 (log x/D)
(log x)B1
dD

Hence the total bound that we get is |S(x)| (1 )cB x/(log x)B .

We use this argument several times:
1) To show that $|S(x)|\ll x/(\log x)^B$.
2) Letting $c_B=\liminf |S(x)|\big/\big(x/(\log x)^B\big)$ to show that $c_B=0$.
Hence we have proved that for any $B<A$ we have $S(x)=o(x/(\log x)^B)$.
I suspect that the argument can be used to show that S(x)  x(x)/(log x)A
where (x) is any function going monotonically to infinity, no matter how slowly.
Note that we need to have A > 2 for this argument to work, which seems to
fit the sort of things we know from Selberg-Delange-Tenenbaum.
The above argument is written in a uniform manner. I am interested in
what happens if, say, P (x)  x/e log x . The key remark is that we can take
= log(B/2)/B roughly. To make the argument then work we need
A(log D)A1  B(log x)B1
If say A is roughly (log x) and B is roughly this size we get something like
(1 )(A 1)  B from the powers of log x; that is B is roughly (1 )A. One
can be precise, I think, and show that one can obtain
S(x)  xAA /(log x)A

provided A . Hence if P (x)  x/e2

log x

and S(x)  x/e

log x

26.2 Main results

For an arithmetic function $f:\mathbb N\to\mathbb C$ we set
\[ L(s,f)=\sum_{n=1}^\infty \frac{f(n)}{n^s} \quad\text{and}\quad L_y(s,f)=\sum_{P^-(n)>y}\frac{f(n)}{n^s}, \]
provided the series converge. We will use pretentious methods to prove:

Corollary 26.1 Let $x\ge1$ and $(a,q)=1$ such that
\[ \frac{\log q}{L_q(1,\chi)} \le c\sqrt{\log x} \]
for all real characters $\chi$ mod $q$, for some sufficiently small $c>0$. Then
\[ \pi(x;q,a)=\frac{\pi(x)}{\varphi(q)}+O\Big(\frac{x}{e^{c\sqrt{\log x}}}\Big). \]
Using Siegel's Theorem this allows us to recover the Siegel-Walfisz Theorem. That is:
The Siegel-Walfisz Theorem. Fix $A>0$. Uniformly we have
\[ \pi(x;q,a)=\frac{\pi(x)}{\varphi(q)}+O\Big(\frac{x}{(\log x)^A}\Big). \]
If $L_y(1+it,f)$ converges for all $t\in\mathbb R$ and all $y\ge1$, we set
\[ L_y^{(1)}(f)=\min_{|t|\le y}|L_{y+|t|}(1+it,f)| \quad\text{and}\quad L_y^{(2)}(f)=\min_{|t|>y}|L_{y+|t|}(1+it,f)|. \]
In the special case that $y=1$ we omit the subscript $y$.


Theorem 26.2 Let f : N D be a completely multiplicative function such that



X


f (n) x1 (x Q)
(26.1)

e0

nx
(j)

for some  > 0 and some Q 2. Then we have LQ (f )  1 for j = 1, 2.


(2)

Furthermore, if we assume that LQ (f ) , then there are constants c1 and c2 ,


depending at most on  and , such that
 log Q 2
X
x
f (p)  c log x whenever log x c2 (1)
.
e1
LQ (f )
px
char

Theorem e0
26.3 Let f : N D be a completely multiplicative function which
satisfies (26.1) for some  > 0 and some Q 2.
e0

(j)

1. If (26.1) holds for f 2 as well, then LQ (f )  1 for j = 1, 2.


(1)

(2)

2. If f (p) {1, +1} for p > Q, then LQ (f )  LQ (1, f ) and LQ (f )  1.


P

Furthermore, LQ (1, f ) 6= 0 if the sum n=1 f (n)/ n converges. Lastly, if


log Q 2/ and L(, f ) 6= 0 for 1 1/ log Q 1, then LQ (f )  1.
26.3 Technical results
Let $Q\ge3$, $k\in\mathbb N\cup\{0\}$, $A\ge2$ and $M:[(\log Q)/3,+\infty)\to(0,+\infty)$ a differentiable function such that
\[ \frac1A\le M(u)\le e^{2u/3} \quad (u\ge Q) \]
and for $j\in\{0,1,\ldots,k\}$ the function $M(u)/u^j$ increases for $u\ge Aj$. We call $(Q,k,A,M)$ an admissible quadruple. Given such a quadruple and $t\in\mathbb R$ we define $Q_t$ by
\[ Q_t=\min\{z\ge Q: M((\log z)/3)\ge|t|\}. \tag{26.2} \]
Also, we let $\mathcal F(Q,k,A,M)$ be the family of completely multiplicative functions $f:\mathbb N\to\mathbb D:=\{z\in\mathbb C:|z|\le1\}$ such that
\[ \Big|\sum_{n\le x}f(n)\Big|\le\frac{x}{(\log x)^2M(\log x)}\quad(x\ge Q). \]
For such an $f$ we define
\[ L^+(f;Q,k,A,M)=\min_{|t|\ge M((\log Q)/3)}|L_{Q_t}(1+it,f)| \]
and
\[ L(f;Q,k,A,M)=\min_{t\in\mathbb R}|L_{Q_t}(1+it,f)|. \]
The notation
\[ g(x)\ll_{a,b,\ldots}h(x)\quad(x\ge x_0) \]
means that $|g(x)|\le Ch(x)$ for $x\ge x_0$, where $C$ is a constant which depends at most on $a,b,\ldots$ Lastly, the letters $c$ and $C$ denote generic constants, possibly different at each case and possibly depending on certain parameters which will always be specified, e.g. by $c=c(a,b,\ldots)$.
main

Theorem 26.4 Let (Q, k, A, M ) be an admissible


quadruple and consider f
qt
F(Q, k, A, M ). For t R define Qt by (26.2).
1. We have
 k1
 k1


ck log Q
1X
ck 2
2
2
f (p) A
+
x
L(f ; Q, k, A, M ) log x
L+ (f ; Q, k, A, M ) log x

(x Q)

px

for some constant c = c(A).


2. We have the estimate L(f ; Q, k, A, M ) A 1. Moreover, if for some t R
we have
X
Qt <pz

 log z 
1 + <(f (p)pit )
log
C
p
log Qt

(z Qt , t R), (26.3)

where > 0 and C 0 are some constants, then |LQt (1 + it, f )| A,,C 1.
3. If f 2 F(Q, k, A, M ) as well, then L(f ; Q, k, A, M ) A 1.
4. If f (p) {1, +1} for all primes p > Q, then
L+ (f ; Q, k, A, M ) A 1

and

L(f ; Q, k, A, M ) A LQ (1, f ).
main

The key estimate in proving Theorem 26.4 is the following theorem.


derivative

Theorem 26.5 Let (Q, k, A, M ) be an admissible quadruple and consider f


F(Q, k, A, M ). For x y Q we have
X f (p) logm p
p

p1+1/ log x

A

 c m log y m
|Ly (1, f )|

(1 m k)

for some constant c = c(A). Moreover, |Ly (1, f )| A 1.


26.3.1

Preliminaries

Lemma 26.6 Let $\{a_n\}_{n=1}^\infty$ be a sequence of elements of $\mathbb D$. If $\sum_{n=1}^\infty a_n/n$ converges, then
\[ \lim_{\sigma\to0^+}\sum_{n=1}^\infty\frac{a_n}{n^{1+\sigma}}=\sum_{n=1}^\infty\frac{a_n}{n}. \]
Proof The lemma follows by an easy partial summation argument.
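A quick numerical illustration of Lemma 26.6 takes the (assumed, purely illustrative) sequence a_n = (-1)^(n+1), for which the series of a_n/n converges to log 2. The sketch below averages two consecutive partial sums, a standard trick for alternating series that makes the truncation error negligible, and watches the value approach log 2 as the exponent shift tends to 0 from above.

```python
from math import log


def alternating_series(sigma, terms=20000):
    """Approximate sum_{n>=1} (-1)^(n+1) / n^(1+sigma) by averaging two consecutive partial sums."""
    s = 0.0
    prev = 0.0
    for n in range(1, terms + 1):
        prev = s
        s += (-1) ** (n + 1) / n ** (1 + sigma)
    return (s + prev) / 2


limit_value = alternating_series(0.0)           # approximates log 2
gaps = [abs(alternating_series(sigma) - log(2)) for sigma in (0.1, 0.01, 0.001)]

assert abs(limit_value - log(2)) < 1e-6
assert gaps[0] > gaps[1] > gaps[2]
assert gaps[2] < 1e-3
```

The number of terms (20000) and the sample exponents are arbitrary choices; the point is only that the gap shrinks with the shift, as the lemma predicts for any convergent series with coefficients in the unit disc.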

The following result is Lemma 5 in [?].

Lemma 26.7 Let $y\ge2$ and $D=y^s$ with $s\ge2$. Let $1[P^-(n)>y]$ denote the indicator function of integers $n$ all of whose prime factors are greater than $y$. Then there exist two sequences $\{\lambda^\pm(d)\}_{d\le D}$ whose elements lie in $[-1,1]$ and such that
\[ (\lambda^-*1)(n)\le 1[P^-(n)>y]\le(\lambda^+*1)(n). \]
Moreover, if $f:\mathbb N\to[0,1]$ is a multiplicative function then
\[ \sum_{d\le D}\frac{\lambda^\pm(d)f(d)}{d}=(1+O(e^{-s}))\prod_{p\le y}\Big(1-\frac{f(p)}{p}\Big). \]


Lemma 26.8 Let (Q, k, A, M ) be an admissible quadruple.


1. For 0 j k and max{j, log Q} we have
Z j2
u
Aj
du
.
M (u)
M ()
1
2. For > 0 and max{k, log Q} we have
Z
Ak (2 + log max{1, })
uk1
du

.
M ()
M (u)eu/
1
Proof (a) If j = 0, then the result follows immediately because M is increasing.
Fix 1 j k and max{j, log Q}. Then
Z A
Z
Z j2
1
du
Ak
u
Ak
du

.
uj2 du +
2
M (u)
M () 1
M (A) A u
M ()
1
(b) It suffices to consider the case 1. So fix such a and some
max{log Q, k}. If k = 0, the result follows immediately by the fact that M is
increasing and by the estimate
Z
Z
du
du

log

+
log + 1.
u/ u
u/ u
e
e
1

So assume that k 1. Then


Z
Z

Ak
Ak
Ak
Ak  1
uk1
du
du

+log
+1
,
k M () M (A) A eu/ u
M ()
M () k
M (u)eu/
1
which completes the proof.

sieve

Lemma 26.9 Let (Q, k, A, M ) be an admissible quadruple and consider f


F(Q, k, A, M ). For x y Q and 0 m k we have
X f (n) logm n



 A(2A(m + 1) log y)m .
1+1/
log
x
n

P (n)>y

Also, we have
X f (n) logk+1 n

 log x 


k+1

(2A(k
+
1)
log
y)
1
+
A
log
.


log y
n1+1/ log x

P (n)>y


fund

Proof Let 0 m k + 1. Lemma 26.7 with D = x and x y 2m+2 implies


that
X

X
X
f (n) =
f (n)(+ 1)(n) + O
(+ 1 1)(n)
nx

nx
P (n)>y

nx

xe log x/2 log y


x log y
+
.
(log x)2 M (log x/2)
log y

By partial summation then we find that


X f (n) logm n
P (n)>y

n1+1/ log x
 (2(m + 1) log y)m +

m1

y 2m+2
m

m logm1 u + logm u
log y
du
(log u)2 M (log u/2)
u1+1/ log x

u + log u e log u/2 log y


du
log y
u1+1/ log x
y 2m+2
Z
tm2
dt.
 (2(m + 1) log y)m + (2(m + 1) log y)m
(2m+2) log y
1
e log x t M ((m + 1)(log y)t)
Z

m log

quadruple

Lemma 26.8 and our assumption that M (log Q) 1/A then complete the proof
of the lemma.
2
distance

Lemma 26.10 Let (Q, k, A, M ) be an admissible quadruple and consider f


F(Q, k, A, M ). Let y2 y1 y0 Q. Assume that
X 1 + <(f (p))
log z
log
C
p
log y

(z y1 )

y0 <pz

for some > 0 and C 0. Then


X


y1 <py2

f (p)
A,,C 1.
p
sieve

Proof By our assumption and Lemma 26.9 we have

X
f (n)(n) X
f (n)(n) X
(n)f (n) X f (n) log n

+







n1+1/ log x
n1+1/ log x
n1+1/ log x
n1+1/ log x
n=1
P (n)y0
P (n)>y0
P (n)>y0
 log x 1 X f (n) log n


C log y0 +


1+1/ log x
log y0
n

P (n)>y0

 log x 1

 log x 
A log y0 +
(log y0 ) 1 + log
log y0
log y0
 log x 1/2
 (log y0 )
.
log y0


So we deduce that
X
y1 <py2

n 
n 
f (p)
1 o
1 o
= O(1) + log F 1 +
log F 1 +
p
log y2
log y1
Z y2

0
1
du
F
1+
= O(1) +
F
log u u log2 u
y1
Z y2
du
A,,C 1 + (log y0 )/2
 1.
1+/2
y1 u(log u)
2

This completes the proof of the lemma.


26.3.2

Proofs
derivative

Proof [Proof of Theorem 26.5] (a) We have that


X f (p) logm p



m

(c
m
log
y)
+



p1+1/ log x
p

X
P (n)>y

f (n)(n) logm1 n
.
n1+1/ log x

Set F (s) = Ly (s, f ) and note that


X
P (n)>y

f (n)(n) logm1 n  F 0 (m1) 


1 
1
+
.
=
F
log x
n1+1/ log x

Moreover, we have
 F 0 (m1)
F

(s) = m!

(1 + a1 + a2 + + )!  F 0 (s) a1  F 00 (s) a2

a1 !a2 !
1!F (s)
2!F (s)
+=m

X
a1 +2a2

(26.4)

sieve

Lemma 26.9 implies that




(j)
1+
F

1 
 (j log y)j
log x

(1 j m).

In addition, for every x0 x we have that




X <(f (p))
1


= O(1) + log Lx 1 +
,
f
C1
0
p
log
x
0

x<px

sieve

for some constant C1 = C1 (A), by Lemma 26.9. Therefore




F 1 +

n X <(f (p)) o
n X <(f (p)) o
1 
A exp
 exp
log x
p
p
y<px
y<px0
X
f (n)


|Ly (1, f )|
n1+1/ log x0

P (n)>y

identity


series

identity

as x0 , by Lemma 26.6. Inserting the above estimates into 26.4 with s =


1 + 1/ log x and observing that
|Ly (1, f )| = lim |Ly (1 + , f )| A,,C 1,
0+

sieve

by Lemma 26.9, yields


X f (p) logm p
p1+1/ log x

A

 C m log y m
2
|Ly (1, f )|
a

1 +2a2 +=m

for some C2 = C2 (A). To complete the proof of part (a), note that
X

1=

a1 +2a2 +=m

I{1,...,m}

Ym

I{1,...,m} iI

iai =m
ai 1 (iI)
iI

2m

(2e)m
mm
 ,
m!
m
2

by Stirlings formula.
main

Proof [Proof of Theorem 26.4] (a) We may assume that x Q1 . For every
T 1 we have
Z
X f (p)(log p)k1
x1/ log x+it
1
1X
f (p)(log p)k1 log(x/p) =
1+1/
log
x+it
x
2i |t|T p p
(1 + 1/ log x + it)2
px

+O

 (k log x)k1 
T

.
(26.5)

Call I(T ) the integral above. By partial summation, we have that


Z x X
 du
X
(1 + |t|)x
f (n)nit  x1/3 + (1 + |t|)
f (n) 2 
u
M ((log x)/3)
x1/3
nx

(x Q),

nu

that is


M (/3) 
f (n)nit F Qt , k, c1 A,
c1 (1 + |t|)

for some absolute constant c1 1. For |t| T := M ( log3 x ) we have


X f (p)(log p)k1
p

p1+1/ log x+it

A

c2 k log Qt k1
|LQt (1 + it, f )|

for some c2 = c2 (A). So


c2 k log Q k1
I(T ) 
+
L(f ; Q, A, M )


M ( log3 x ) 

M ( log3 Q )

c2 k log Qt k1 dt
.
L+ (f ; Q, A, M )
t2

e3


Making the change of variable t = M (u), we find that log Qt = 3u and thus
Z log3 x k1 0
Z M (log x/3)
dt
u
M (u)
(log Qt )k1 2 = 3k1
du
2
log Q
t
(M
(u))
M (log Q/3)
3
=

We have
Z

log x
3

max{k, log3 Q }

((log Q)/3)k1
((log x)/3)k1

M ((log Q)/3)
M ((log x)/3)
Z log3 x k2
u
+ (k 1)
du.
log Q
M (u)
3

(26.6)

uk2
du (A max{log Q, k})k ,
M (u)

quadruple

by Lemma 26.8. Also, if k log Q/3, then


Z k
uk2
du k k1 .
log Q M (u)
3
e3

e5

Combining the above inequalities with (26.5) and (26.6) yields


 c k log Q k1 
k1
1X
c3 k 2
3
f (p)(log p)k1 log(x/p) A
+
x
L(f ; Q, A, M )
L+ (f ; Q, A, M )
px

for some c3 = c3 (A). By a standard differentiation argument, this implies



 c k log Q  k1
 k1
c3 k 2
1X
2
2
3
k1
f (p)(log p)
A
+
.
x
L(f ; Q, A, M )
L+ (f ; Q, A, M )
px

Finally, summing by parts completes the proof of part (a).


distance

(b) For y2 y1 Qt Lemma 26.10 implies that



n X <(f (p)) o


1


, f  exp
A,,C 1.
Ly1 1 +
log y2
p
y1 <py2

series

Setting y1 = Qt , letting y2 and applying Lemma 26.6 completes the proof.


sieve

(c) By Lemma 26.9, for t R and z y Qt we have




X <(f 2 (p)p2it )
1


= O(1) + log Ly 1 +
+ it C1
p
log z
y<pz

for some absolute constant C1 . So we find that


X <(f (p)pit )
1  X 1 1/2  X <2 (f (p)pit ) 1/2

p
2
p
p
y<pz
y<pz
y<pz

2
log z

log
O(1),
2
log y
distance

since cos2 x = (1 + cos(2x))/2. Lemma 26.10 then completes the proof.

e5


(d) For |t| 1/ log Q we have


X
X cos(2t log p)

= O(1) + log
p

y<pz

P (n)>y



C2
1+2it+1/ log z
fund

for some absolute constant C2 , by partial summation and Lemma 26.7. So an


argument as the one in part (c) shows that |LQt (1 + it, f )|  1 for |t| 1/ log Q.
Fix now t R with 1/ log x |t| 1/ log Q. We claim that
X
e1/|t| <px

1 + cos(t log p)
c log(|t| log x) c0
p

(26.7)

for some gs
appropriate constants c and c0 . In order to show this we use the argument in [?, Lemma 4.2.1]. Fix some  (1/10 log Q, 1/3) to be chosen later and
let P be the set of primes for which there exists an integer n with
p In := [e

(n)
|t|

,e

(n+)
|t|

(n N {0}).

Since /|t| 1/10, Mertens theorem yields


X
e1/|t| <px, pP

1
  log(|t| log x).
p

Thus
X
e1/|t| <px

1 + <((p)pit )

X
e1/|t| <px
pP
/

1 + <(f (p)pit )

p

X
e1/|t| <px
pP
/

1
p

(1 O()) log(|t| log x).


e4

distance

e4

Choosing a small enough  proves (26.7). Next, notice that Lemma 26.10 and (26.7)
yield that
X
f (p)
 1.
p1+it
1/|t|
e

<px

Therefore for x e1/|t| Q we have


X <(f (p)pit )
=
p

Q<px

X
Q<pe1/|t|

sieve

X f (p)
f (p)
+ O(1)
+ O(1),
p
p
Q<px

by Lemma 26.9. This completes the proof of the theorem.


bounded

Proof [Proof of Theorem 26.2] For the function M (u) = eu main
the quadruple
(Q,
k,
1/,
M
)
is
admissible
for
all
k

N.
Applying
Theorem
26.4
with k 

log x proves the desired result.


2

e4

27
PRIMES IN PROGRESSIONS, ON AVERAGE
Suppose that the character (mod q) is induced from the primitive character
(mod r). Then we write cond = q and cond = r.
We shall use the Siegel-Walfisz Theorem which states that for any fixed
A, B > 0 one has
(N ; q, a)

N
(N )

,
(q)
(q) logB N

uniformly for q  logA N and (a, q) = 1. This may also be phrased as


X

(n)(n) 

nN

N
,
logB N

for all primitive characters (mod q), uniformly for q  logA N . We also make
use of a strong form of the prime number theorem: For any fixed A > 0 we have
(N ) N 

N
.
(log N )A

All of these estimates were proved in the previous section.


27.1

The Barban-Davenport-Halberstam-Montgomery-Hooley
Theorem

The first result shows that the mean square of the error term in the prime number
theorem for arithmetic progressions can be well understood.
BDHLMH

Theorem 27.1 If N/(log N )C Q N then


2
X X

(N ; q, a) (N ) = N Q log N + O (N Q log(N/Q)) .

(q)

qQ (a,q)=1

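We begin with a technical lemma; most of the proof is left as an exercise.

Lemma 27.2 Let $c:=\prod_p\big(1+\frac{1}{p(p-1)}\big)$ and $\gamma':=\gamma-\sum_p\frac{\log p}{p^2-p+1}$. Then
\[ \sum_{r\le R}\frac{1}{\varphi(r)}=c\log R+c\gamma'+O\Big(\frac{\log R}{R}\Big), \]
\[ \sum_{r\le R}\frac{r}{\varphi(r)}=cR+O(\log R), \]
\[ \sum_{r\le R}\frac{r^2}{\varphi(r)}=\frac c2R^2+O(R\log R). \]
Also
\[ \sum_{\substack{r\le R\\ m|r}}\frac1{\varphi(r)}=\frac{1}{\varphi(m)}\prod_{p\nmid m}\Big(1+\frac1{p(p-1)}\Big)\Big(\log\frac Rm+\gamma-\sum_{p\nmid m}\frac{\log p}{p^2-p+1}\Big)+O\Big(\frac{\log R/m}{R}\Big). \]
Proof We can write $r/\varphi(r)=\sum_{d|r}\mu^2(d)/\varphi(d)$ to obtain in the first case
\[ \sum_{r\le R}\frac1{\varphi(r)}=\sum_{r\le R}\frac1r\sum_{d|r}\frac{\mu^2(d)}{\varphi(d)}=\sum_{d\le R}\frac{\mu^2(d)}{\varphi(d)}\sum_{\substack{r\le R\\ d|r}}\frac1r=\sum_{d\le R}\frac{\mu^2(d)}{d\varphi(d)}\Big(\log\frac Rd+\gamma+O\Big(\frac dR\Big)\Big)=c(\log R+\gamma')+O\Big(\frac{\log R}{R}\Big), \]
by (1.2.1). The next two estimates follow analogously but more easily. The last estimate is an easy generalization of the first.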
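The first three estimates of Lemma 27.2 can be sanity-checked numerically. The sketch below computes Euler's totient by a sieve, truncates the product for c and the sum in gamma' at the same bound R (an assumption that introduces a small extra error), and compares each sum with its main term. The bound R = 20000 and the tolerance 0.01 are arbitrary illustrative choices, not values from the text.

```python
from math import log

EULER_GAMMA = 0.5772156649015329


def totients(limit):
    """Euler's phi for 0..limit via a sieve."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # p is prime: phi[p] has not been reduced yet
            for k in range(p, limit + 1, p):
                phi[k] -= phi[k] // p
    return phi


def constants(phi):
    """Truncated c = prod_p (1 + 1/(p(p-1))) and gamma' = gamma - sum_p log p/(p^2-p+1)."""
    c, gamma_prime = 1.0, EULER_GAMMA
    for p in range(2, len(phi)):
        if phi[p] == p - 1:  # p is prime
            c *= 1 + 1 / (p * (p - 1))
            gamma_prime -= log(p) / (p * p - p + 1)
    return c, gamma_prime


R = 20000
phi = totients(R)
c, gamma_prime = constants(phi)

s1 = sum(1 / phi[r] for r in range(1, R + 1))
s2 = sum(r / phi[r] for r in range(1, R + 1))
s3 = sum(r * r / phi[r] for r in range(1, R + 1))

assert abs(s1 - (c * log(R) + c * gamma_prime)) < 0.01
assert abs(s2 / R - c) < 0.01
assert abs(s3 / R ** 2 - c / 2) < 0.01
```

Here c comes out close to 1.9436, which agrees with the known closed form zeta(2) zeta(3)/zeta(6) for this Euler product.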

Proposition 27.3
X
qQ

1
(q)

2
X+N


X+N

X
X
N


an (n) 
|an |2 .
log Q + Q log log Q



R

(mod q) n=X+1
cond R

n=X+1

Proof Suppose that the character $\chi \pmod q$ is induced from the primitive character $\psi \pmod r$. Let $m$ be the product of the primes that divide $q$ but not $r$ and write $q=rm\ell$ so that $(r,m)=1$, and $p|\ell \implies p|rm$. Hence $\varphi(q)=\varphi(r)\varphi(m)\ell$ and
X
X
an (n) =
an (n);
n

n: (n,m)=1

and therefore the left side of the above equation equals



2



X 2 (m)
X
X
X
X


1
1


a
(n)
.
n


(m)
(r)
`Q/rm `
mQ
RrQ/m
(mod r) X<nX+N
p|` = p|rm
(n,m)=1
(r,m)=1


.


The last sum is

r
(r)

m
(m) .

We partition the sum over r into dyadic intervals

y < r 2y; in such an interval we have


the above becomes
 log log Q

X 2 (m)m
(m)2

mQ

 log log Q

r
(r)2

r
(r)

X
y=2i R, i=0,...I
2I R:=Q/m

log log y
,
y

LargeSieve

and so by (16.4)

X+N
X
1
(N + y 2 )
|an |2
y
n=X+1

 X+N
X 2 (m)m  N
X
Q
|an |2 ,
+
2
(m)
R
m
n=X+1

mQ

which implies the result.


Let
(R) (x; q, a) = (x; q, a)

1 X
(q)

(a)

rR (mod q)
r|q
cond =r

CorLS3

(n)(n),

nx

)
so that (1) (N ; q, a) = (N ; q, a) (N
(q) .

Corollary 27.4 For log N R Q with Q N we have


2
X X
X 1

(N ; q, a) (N )  log Q


(q)
(r)

qQ (a,q)=1

rR

+O


2


X



(n)(n)



(mod r) nN

N 2 log2+o(1) N
+ QN log N log log N
R

SumSqk

PropLS2

Proof By (16.3), and taking an = (n), X = 0 in Proposition 27.3, we deduce


that


2
X X
N

log N + Q N log N log log N.
(R) (N ; q, a) 
R
qQ (a,q)=1

by using the prime number theorem. Now, if (mod q) is induced from


(mod r) then
X
nN

(n)(n) =

X
nN

(n)(n)

(pa ) log p,

p N
p|q, p-r

hence the error term in replacing by here is  ((q) (r)) log N , and in
the square is  ((q) (r))N log N , Therefore the total such error is


X X (q) (r)
N log N  N log N log Q log R(log log Q)2  N (log N )2+ ,
(q)

rR qQ
r|q

SumSqs

which is smaller than the above. What remains is, by (16.1),


2


X

X 1
X 1 X
X
X


=

(n)(n)


(q)
(r)


qQ

rR

(mod r) nN

rR
r|q

2

X
X


(r)


(n)(n)
,


(q)
qQ
(mod r) nN
r|q

and the result follows.

Using this we can now prove the Barban-Davenport-Halberstam-Lavrik-Montgomery-Hooley theorem.


BDHLMH

Proof of Theorem 27.1 Let Q0 = Q/ log2 N and R = (N log3 N )/Q, and use
the
Siegel-Walfisz Theorem with A = 2C + 6 and B = C + 2 so that Corollary
CorLS3
27.4 yields
2
X X

(N ; q, a) (N )  QN.

(q)
qQ0 (a,q)=1

We are left with the sum for Q0 < q Q, which we will treat as the sum for
Q0 < q N , minus the sum for Q < q N . We describe only how we manipulate
the second sum, as the first is entirely analogous.
Now the qth term in our sum equals
X

log2 p + 2

log p1 log p2

p1 <p2 N
p2 p1 (mod q)

pN

(N )2
,
(q)

plus a small, irrelevant error term made up of contributions from prime powers
that divide q. We sum the middle term over all q in the range Q < q N .
Writing p2 = p1 + qr we have r N/q < N/Q, so that p2 p1 (mod r) with
N p2 p1 + Qr, and therefore the sum equals
X
X
2
{(N ; r, p) (p + Qr; r, p)} log p
rN/Q pN Qr

X
rN/Q

2
(r)

(N p Qr) log p + O

pN Qr

X (N Qr)2
=
+ O (N Q)
(r)
rN/Q

= cN 2 (log N/Q + 0 3/2) + O(QN log(N/Q)),


LemLS1

(r)
logB N
rN/Q
X

by the Siegel-Walfisz theorem and Lemma 27.2. We deduce that the sum of the
middle terms over all q in the range Q0 < q Q is therefore cN 2 log Q/Q0 +


O(QN log(N/Q)). OnLemLS1


the other hand the sum of the final term over all Q0 <
q Q is, by Lemma 27.2, c(N )2 log Q/Q0 + O(N 2 /Q0 log N ). Using the strong
version of the prime number theorem these two terms sum to O(QN log(N/Q)).
By the prime number theorem with error term O(N/ log N ), the first term sums
to QN log N + O(QN log(N/Q)), yielding the result.
2
27.2

The Bombieri-Vinogradov Theorem

This is an extremely useful tool in analytic number theory, showing that the
primes up to x are well distributed in arithmetic progressions mod q, on average
over q x1/2+o(1) .
The Bombieri-Vinogradov Theorem. For any fixed A > 0 there exists B =
B(A) > 0 such that


X

x
(x)


max (x; q, a)
(q)
(log x)A
(a,q)=1
qQ

where Q =

x/(log x)B . In fact one can take any fixed B > A + 3.

PSelect 1 to be that primitive character with conductor in (1, R] for which


| nx (n)(n)| is maximized. The strong form of the Siegel-Walfisz Theorem
(which needs to be given in the previous
section) states that if primitive 6= 1

P
or 1 then | nx (n)(n)|  x/e4c log x .

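Corollary 27.5 If $x^{1/2}/e^{c\sqrt{\log x}} \le Q \le x^{1/2}$ then
\[ \sum_{q\le Q}\ \max_{(a,q)=1} \left| \psi(x;q,a) - \frac{\psi(x)}{\phi(q)} - \overline{\chi_1}(a)\,\frac{\psi(x,\chi_1)}{\phi(q)} \right| \ll Q\sqrt{x}\,\log^{3+o(1)} x, \]
where the $\chi_1$ term is only included if $\mathrm{cond}(\chi_1) \mid q$.

With $Q = x^{1/2}/e^{c\sqrt{\log x}}$ we see that we get a much stronger bound than in
the Bombieri-Vinogradov Theorem, at the cost of including the $\chi_1$ terms.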
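In order to prove these results we continue to develop the large sieve.

Proposition 27.6 We have
\[ \sum_{q\le Q}\frac{1}{\phi(q)} \sum_{\substack{\chi \ (\mathrm{mod}\ q)\\ \mathrm{cond}\,\chi > R}} \Big|\sum_{m=X+1}^{X+M} a_m\chi(m)\Big|\ \Big|\sum_{n=Y+1}^{Y+N} b_n\chi(n)\Big| \]
\[ \ll \Big( \frac{\sqrt{MN}}{R}\log Q + Q + (\sqrt{M} + \sqrt{N})\log^2 Q\,\log\log Q \Big) \sqrt{\sum_{m=X+1}^{X+M}|a_m|^2\ \sum_{n=Y+1}^{Y+N}|b_n|^2}\ . \]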

Exercise 27.7 Prove this result. Remarks: If one applies the Cauchy-Schwarz inequality to the
result in Proposition 27.3 one obtains a weaker result, with $\log^2 Q$ replaced by $(Q/R) \log Q$ as
the coefficient of $\sqrt{M} + \sqrt{N}$ in the bound given. To prove the above one proceeds
analogously to the proof of Proposition 13.2. One can apply Cauchy-Schwarz in this exercise
with $m$ fixed, to obtain the result given here.
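Proposition 27.8 Suppose that $a_n, b_n$ are given sequences with $a_n, b_n = 0$ for
$n \le R^2$, and $|a_n| \le a_0$, $|b_n| \le b_0$ for all $n \le x$. If $c_N := \sum_{mn=N} a_m b_n$ then
\[ \sum_{q\le Q}\frac{1}{\phi(q)}\sum_{\substack{\chi\ (\mathrm{mod}\ q)\\ \mathrm{cond}\,\chi>R}} \Big|\sum_{N\le x} c_N\chi(N)\Big| \ll a_0b_0\Big(\frac{x}{R} + Q\sqrt{x}\Big)\log^2 x\,\log\log x. \]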

Proof We begin by noting that
\[ \sum_{N\le x} c_N\chi(N) = \sum_{mn\le x} a_m\chi(m)\, b_n\chi(n). \]
We will partition the pairs $m, n$ with $mn \le x$ in order to apply Proposition 27.6.
For the intervals $X < m \le X + M$, $Y < n \le Y + N$, Proposition 27.6 yields the
upper bound
\[ \ll a_0b_0\sqrt{MN}\,\Big( \frac{\sqrt{MN}}{R}\log Q + Q + (\sqrt{M} + \sqrt{N})\log^2 Q\,\log\log Q \Big). \]
We now describe the partition for $m$ in the range $X < m \le 2X$. Let $Y = x/X$.
We begin with all $X < m \le 2X$, $n \le Y/2$. Then in step $k$, with $k = 1, 2, \ldots, K$,
we take
\[ \Big(1+\frac{2j}{2^k}\Big)X < m \le \Big(1+\frac{2j+1}{2^k}\Big)X, \qquad Y\Big(1+\frac{2j+2}{2^k}\Big)^{-1} < n \le Y\Big(1+\frac{2j+1}{2^k}\Big)^{-1}, \]
for $0 \le j \le 2^{k-1} - 1$. The total upper bound from all these terms is
\[ \ll a_0b_0\sqrt{XY}\,\Big( \frac{\sqrt{XY}}{R}\log Q + KQ + (\sqrt{X} + \sqrt{Y})\log^2 Q\,\log\log Q \Big). \]
Let $K$ be such that $2^K \asymp Y$. Then, for each $m$, $X < m < 2X$, there are $\ll 1$ values
of $n \le x/m$ not yet accounted for. Hence these missing pairs contribute $\ll a_0b_0QX$,
and so in our construction we interchange $X$ and $Y$ to guarantee that $X \le Y$.
Hence the total error from these unaccounted-for points is $\ll Q\sqrt{x}$ in total.
We now sum up the upper bound over $X = 2^jR^2$ for $j = 0, 1, 2, \ldots, J$ where
$2^J = x/R^4$ (since if $m < R^2$ then $a_m = 0$, and if $m > x/R^2$ then $n < R^2$ and so
$b_n = 0$), to obtain the claimed upper bound. □
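The partition used in this proof is easy to get wrong by an index, so here is a small brute-force check. It is only a sketch under our reading of the construction (the rectangles of step $k$ are as displayed above, with exact rational endpoints); the names `dyadic_rectangles`, `inside` and `coverage` are ours. It counts how often each pair $(m,n)$ with $X < m \le 2X$ and $mn \le XY$ is covered.

```python
def dyadic_rectangles(X, Y, K):
    """Rectangles (m_lo, m_hi, n_lo, n_hi), meaning m_lo < m <= m_hi and
    n_lo < n <= n_hi, with each endpoint an exact fraction (num, den)."""
    rects = [((X, 1), (2 * X, 1), (0, 1), (Y, 2))]   # X < m <= 2X, n <= Y/2
    for k in range(1, K + 1):
        D = 2 ** k
        for j in range(2 ** (k - 1)):
            rects.append((
                (X * (D + 2 * j), D), (X * (D + 2 * j + 1), D),
                (Y * D, D + 2 * j + 2), (Y * D, D + 2 * j + 1),
            ))
    return rects

def inside(v, lo, hi):
    """True if lo < v <= hi, where lo and hi are fractions (num, den)."""
    return v * lo[1] > lo[0] and v * hi[1] <= hi[0]

def coverage(X, Y, K):
    """For each (m, n) with X < m <= 2X and mn <= XY, count covering rectangles."""
    rects = dyadic_rectangles(X, Y, K)
    x = X * Y
    counts = {}
    for m in range(X + 1, 2 * X + 1):
        for n in range(1, x // m + 1):
            counts[(m, n)] = sum(
                inside(m, r[0], r[1]) and inside(n, r[2], r[3]) for r in rects
            )
    return counts

counts = coverage(X=32, Y=32, K=5)
missed = sum(1 for c in counts.values() if c == 0)
print("pairs under the hyperbola:", len(counts))
print("covered more than once:", sum(1 for c in counts.values() if c > 1))
print("not covered:", missed)
```

The top right corner of every rectangle lies on the hyperbola $mn = XY$, which is why no pair with $mn > x$ is ever included.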
We now prove a version of the Bombieri-Vinogradov Theorem:

Corollary 27.9 If $R \le e^{\sqrt{\log x}}$ and $Q \le x^{1/2}$ then
\[ \sum_{q\le Q}\ \max_{(a,q)=1} |\psi^{(R)}(x;q,a)| \ll \Big(\frac{x}{R} + Q\sqrt{x}\Big)\log^3 x\,\log\log x. \]

Proof The left side is evidently
\[ \le \sum_{q\le Q}\frac{1}{\phi(q)} \sum_{\substack{\chi\ (\mathrm{mod}\ q)\\ \mathrm{cond}\,\chi > R}} |\psi(x,\chi)|. \]
Our goal is to bound this using Proposition 27.8 so, as in the proof of Theorem
20.1, we let $g$ be totally multiplicative with $g(p) = 0$ if $p \le R^2$ and $g(p) = 1$
otherwise. Then we let $a_n = g(n)\mu(n)$ for $n > 1$ and $b_m = g(m) \log m$. To be able
to apply Proposition 27.8 we are forced to take $a_1 = 0$ (rather than 1 as in the
proof of Theorem 20.1), and so $(a * b)(n) = \Lambda_{R^2}(n) - g(n) \log n$. We substitute
this into Proposition 27.8, and bound the contribution of the powers of the
primes $\le R^2$ by $\le Q\sum_{p\le R^2} \log x \ll QR^2 \log x/\log R \ll Q\sqrt{x}$, which yields the
above upper bound for the sum of $|\psi(x,\chi) - G(x,\chi)|$, and hence for the sum of
$\max_{(a,q)=1} |\psi^{(R)}(x;q,a) - G^{(R)}(x;q,a)|$ where $G(x,\chi) := \sum_{n\le x} \chi(n)g(n)\log n$
and $G(x;q,a) := \sum_{n\le x,\ n\equiv a \ (\mathrm{mod}\ q)} g(n)\log n$.
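Now, by the small sieve, we know that
\[ G^{(1)}(x;q,a) \ll \frac{x\log x}{\phi(q)\log R}\cdot\frac{1}{u^u} + \Big(\frac{x}{q}\Big)^{1/2} \]
where $x = R^{2u}$, so that this is $\ll x/(\phi(q)e^{4\sqrt{\log x}})$. We immediately deduce that
\[ \sum_{q\le Q}\ \max_{(a,q)=1} |G^{(R)}(x;q,a)| \ll \sum_{q\le Q}\ \sum_{\substack{r\le R\\ r\mid q}} \phi(r)\,\frac{x}{\phi(q)e^{4\sqrt{\log x}}} \ll \frac{x}{e^{\sqrt{\log x}}}, \]
and the result follows.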


Corollary 27.10 Fix $A > 0$. If $x^{1/2}/\log^A x < Q \le x^{1/2}$ then
\[ \sum_{q\le Q}\ \max_{(a,q)=1} \Big| \psi(x;q,a) - \frac{\psi(x)}{\phi(q)} \Big| \ll Q\sqrt{x}\,\log^3 x\,\log\log x. \]
Proof Let $R = \log^{A+1} x$ in Corollary 27.9, and bound $|\psi^{(R)}(x;q,a) - \psi^{(1)}(x;q,a)|$
by the Siegel-Walfisz Theorem. □
The Bombieri-Vinogradov Theorem is an immediate consequence of this result.
Proof of Corollary 27.5 Let $R = e^{c\sqrt{\log x}}$ in Corollary 27.9. There are at most
$R^2$ characters $\chi \ne 1$ or $\chi_1$ in the sum $\psi^{(R)}(x;q,a)$, and hence their contribution
is $\ll (R^2/\phi(q))\,x/e^{4c\sqrt{\log x}}$. Summing over all $q \le Q$, their total contribution is
$\ll x/R$. The result follows. □

28 INTEGRAL DELAY EQUATIONS

We have seen two basic examples of multiplicative functions:
• Those for which $f(p) = 1$ for all primes $p > y$, and typically the mean value
of $f(n)$ tends to $\mathcal{P}(f, x)$, an Euler product (see Proposition 3.6).
• Those for which $f(p) = 1$ for all primes $p \le y$. We saw the example of the
smooth numbers, for which the mean value up to $y^u$ is given by $\rho(u)$, a function
which we defined in terms of an integral delay equation. We will now show that
this is typical.
Proposition 28.1 Suppose that $f$ is a totally multiplicative function with $f(p) = 1$
for all $p \le y$. Define
\[ \chi(t) := \frac{1}{\psi(y^t)} \sum_{d\le y^t} f(d)\Lambda(d), \]
(where $\psi(x) = \sum_{m\le x} \Lambda(m)$ as usual) so that $|\chi(t)| \le 1$ for all $t$, and $\chi(t) = 1$ if
$t \le 1$. Let $\sigma(t) = 1$ if $t \le 1$, and
\[ \sigma(u) = \frac{1}{u}\int_0^u \sigma(u-t)\chi(t)\,dt \quad\text{for all } u \ge 1. \qquad (28.1) \]
(Typically one writes $(g * h)(u) := \int_0^u g(u-t)h(t)\,dt$ for the (integral) convolution
of the two functions $g$ and $h$.) Then, for $x = y^u$, we have
\[ \frac{1}{x}\sum_{m\le x} f(m) = \sigma(u) + O\Big(\frac{u}{\log y}\Big). \qquad (28.2) \]
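Equation (28.1) is straightforward to solve numerically, which is a useful sanity check when experimenting with particular $\chi$. The following is a minimal sketch, with a discretization of our own choosing (it is not a scheme taken from the text): on each cell of width $h$ we sample $\chi$ at the midpoint and average $\sigma$ over the two endpoints. We compare with two cases that can be solved by hand for $1 \le u \le 2$, namely $\sigma(u) = 1 - (1-\alpha)\log u$ when $\chi(t) = \alpha$ for $t > 1$.

```python
from math import log

def solve_sigma(chi, u_max, steps_per_unit=200):
    """Approximate sigma on the grid u_i = i*h from u*sigma(u) = (sigma*chi)(u).

    The convolution integral is split into cells of width h; on each cell chi
    is sampled at the midpoint and sigma is averaged over the two endpoints.
    """
    h = 1.0 / steps_per_unit
    n = int(round(u_max * steps_per_unit))
    chi_mid = [chi((k + 0.5) * h) for k in range(n)]
    sigma = [1.0 + 0.0j] * (n + 1)          # sigma(0) = 1
    for i in range(1, n + 1):
        known = chi_mid[0] * sigma[i - 1] / 2
        for k in range(1, i):
            known += chi_mid[k] * (sigma[i - k] + sigma[i - k - 1]) / 2
        # sigma[i] also appears in the k = 0 cell, so we solve for it.
        sigma[i] = h * known / (i * h - h * chi_mid[0] / 2)
    return sigma

def chi_const(alpha):
    """chi(t) = 1 for t <= 1 and chi(t) = alpha for t > 1."""
    return lambda t: 1.0 if t <= 1 else alpha

steps = 200
dickman = solve_sigma(chi_const(0.0), 2.0, steps)
spiral = solve_sigma(chi_const(1j), 2.0, steps)
print("alpha = 0: sigma(2)   ~", dickman[2 * steps].real, " vs 1 - log 2 =", 1 - log(2))
print("alpha = i: sigma(1.5) ~", spiral[3 * steps // 2], " vs", 1 - (1 - 1j) * log(1.5))
```

The cost is quadratic in the number of grid points, which is harmless for the small values of $u$ of interest here.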

Exercise 28.2 Convince yourself that the functional equation for estimating
smooth numbers, that we gave earlier, is a special case of this result.
Proof Define $s(t) := S(y^t)/y^t = y^{-t}\sum_{m\le y^t} f(m)$ so that $s(t) = 1 + O(y^{-t})$ if
$t \le 1$. We note that $\sum_{p\le x} |1 - f(p)|/p \le 2\sum_{y<p\le x} 1/p \le 2\log u + O(1/\log y)$.
Then, by (??) and the prime number theorem in the form $\psi(D) = D + O(D/(\log D)^{1+\epsilon})$,
we obtain
\[ s(u) = \frac{1}{u}\int_0^u s(u-t)\chi(t)\,dt + O\Big(\frac{u}{\log y}\Big). \]
Now if $\Delta(u) = |s(u) - \sigma(u)|$ then we deduce that $\Delta(t) \ll y^{-t}$ if $t \le 1$, and
\[ \Delta(u) \le \frac{1}{u}\int_0^u \Delta(u-t)\,dt + \frac{Cu}{\log y}, \]
for some constant $C > 0$. We claim that $\Delta(v) < 2Cv/\log y$ for all $v > 0$ for if
not, let $u$ be minimal for which $\Delta(u) \ge 2Cu/\log y$, so that
\[ \frac{Cu}{\log y} \le \Delta(u) - \frac{Cu}{\log y} < \frac{1}{u}\int_0^u \frac{2C(u-t)}{\log y}\,dt = \frac{Cu}{\log y}, \]
a contradiction. Thus (28.2) follows. □

This result shows that the mean value of every such multiplicative function
can be determined in terms of an integral delay equation.

28.1 Remarks on (28.1)

We shall suppose that $\chi$ is a measurable function $\chi : \mathbb{R}_{\ge 0} \to \mathbb{U}$ with $\chi(t) = 1$
for $0 \le t \le 1$, and then define $\sigma(t)$ as in Proposition 28.1. We make a few
straightforward observations:
• Since each $|\chi(t)| \le 1$, hence $|\sigma(u)| \le M(u) := \frac{1}{u}\int_0^u |\sigma(t)|\,dt$.
• $|\sigma(u)| \le 1$ for all $u \ge 0$ for, if not, there exists $u > 1$ for which $|\sigma(u)| \ge |\sigma(t)|$
for all $0 \le t \le u$ and hence $|\sigma(u)| \ge M(u)$. But this would imply $|\sigma(u)| =
M(u) = |\sigma(t)|$ for all $0 \le t \le u$, and in particular $|\sigma(u)| = 1$.
• $M(u)$ is a non-increasing function since $M'(u) = (|\sigma(u)| - M(u))/u \le 0$.
We will now show that there is a unique solution $\sigma(u)$ to (28.1) which can
be given as follows: Define $I_0(u) = I_0(u;\chi) = 1$, and for $k \ge 1$,
\[ I_k(u) = I_k(u;\chi) = \int_{\substack{t_1,\ldots,t_k\ge 1\\ t_1+\ldots+t_k\le u}} \frac{1-\chi(t_1)}{t_1}\cdots\frac{1-\chi(t_k)}{t_k}\,dt_1\ldots dt_k. \]
Define for all $k \ge 0$,
\[ \sigma_k(u) = \sum_{j=0}^{k}\frac{(-1)^j}{j!}\,I_j(u;\chi), \qquad\text{and}\qquad \sigma_\infty(u) = \sum_{j=0}^{\infty}\frac{(-1)^j}{j!}\,I_j(u;\chi). \]
Our goal is to show that $\sigma = \sigma_\infty$. We will see how this representation of $\sigma$ is a
manifestation of the inclusion-exclusion principle.
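Exercise 28.3 Show that for all $j \ge 1$,
\[ uI_j(u) = (1 * I_j)(u) + j\,((1-\chi) * I_{j-1})(u). \]
Deduce that $u\sigma_k(u) = (1 * \sigma_k)(u) - ((1-\chi) * \sigma_{k-1})(u)$. Then show that $\sigma_\infty(u) = 1$
for $0 \le u \le 1$, and that $u\sigma_\infty(u) = (\sigma_\infty * \chi)(u)$ for $u > 0$.
To show that $\sigma_\infty$ is the unique such function, suppose that we have another
solution $\sigma$. Note that $|\sigma(u) - \sigma_\infty(u)| = 0$ for $0 \le u \le 1$ and
\[ u|\sigma(u) - \sigma_\infty(u)| = \Big|\int_0^u (\sigma(t) - \sigma_\infty(t))\chi(u-t)\,dt\Big| \le \int_0^u |\sigma(t) - \sigma_\infty(t)|\,dt. \]
Exercise 28.4 Modify the proof given above that $|\sigma(u)| \le 1$ to now
prove that $|\sigma(u) - \sigma_\infty(u)| = 0$ for all $u \ge 0$.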
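Exercise 28.5 Suppose that $\chi$ and $\chi'$ are two such functions, and let $\sigma$ and $\sigma'$
be the corresponding solutions to (28.1). Prove that $\sigma(u) - \sigma'(u)$ equals
\[ \sum_{j=1}^{\infty}\frac{(-1)^j}{j!}\int_{\substack{t_1,\ldots,t_j\ge 1\\ t_1+\ldots+t_j\le u}} \frac{\chi'(t_1)-\chi(t_1)}{t_1}\cdots\frac{\chi'(t_j)-\chi(t_j)}{t_j}\ \sigma'(u-t_1-\ldots-t_j)\,dt_1\ldots dt_j. \]
Deduce that if $|\chi(t) - \chi'(t)| \le \epsilon$ for all $t$ then $|\sigma(u) - \sigma'(u)| \le u^{\epsilon} - 1$, for all
$u \ge 1$.
Exercise 28.6 Suppose that $\chi$ and $\chi'$ are two such functions with $\chi(t) = \chi'(t)$
for $0 \le t \le u/2$. Deduce that
\[ \sigma(u) - \sigma'(u) = \int_{u/2\le t\le u} \frac{\chi(t)-\chi'(t)}{t}\,\sigma'(u-t)\,dt. \]

28.2 Inclusion-Exclusion inequalities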

Our formula for $\sigma$ $(= \sigma_\infty)$ looks like an inclusion-exclusion type identity. For a
real-valued function $\chi$, we now show how to obtain inclusion-exclusion inequalities for $\sigma$.
Proposition 28.7 Suppose that $\chi(t) \in \mathbb{R}$ for each $t$. Then, for all integers $k \ge 0$,
and all $u \ge 0$, we have
\[ \sigma_{2k+1}(u) \le \sigma(u) \le \sigma_{2k}(u). \]
Proof In (28.1) we had $u\sigma = 1 * \sigma - (1-\chi) * \sigma$. Subtracting $u\sigma_k = 1 * \sigma_k -
(1-\chi) * \sigma_{k-1}$ (which we proved in Exercise 28.3) we obtain
\[ u\Delta_k = 1 * \Delta_k + (1-\chi) * \Delta_{k-1}, \]
where $\Delta_k(u) = (-1)^{k+1}(\sigma(u) - \sigma_k(u))$. We wish to prove that $\Delta_k(u) \ge 0$ for all
$u \ge 0$, for each $k \ge 0$. For $k = 0$ we have $\sigma_0 = 1$ and so $\Delta_0(u) = 1 - \sigma(u) \ge 0$
by the above. Then, by induction, we have that $u\Delta_k(u) \ge (1 * \Delta_k)(u)$ as
$1 - \chi(t),\ \Delta_{k-1}(u-t) \ge 0$, and then we deduce our result as in the proof given to
show that $|\sigma(u)| \le 1$. □
Remark 28.8 It would be good to improve Proposition 3.6 to an estimate like
$(1 + o_u(1))\mathcal{P}(f;x) + o(1/(\log x)^A)$, as in the Fundamental Lemma of Sieve
Theory. The proof there works for $f$ with $0 \le f(n) \le 1$. As a first goal we could
aim for all real-valued $f$, that is where $-1 \le f(n) \le 1$, for all $n$. Perhaps this Proposition
can help us use the technology of sieve theory to do this?

28.3 A converse Theorem

We now show that for every (appropriate) such integral delay equation there is
an appropriate multiplicative function whose mean value can be determined in
terms of that integral delay equation.

Proposition 28.9 Let $S$ be a closed subset of $\mathbb{U}$ and suppose that $\chi$ is a measurable function whose values lie in $K(S)$, the convex hull of $S$, with $\chi(t) = 1$
for all $t \le 1$. Given $\epsilon > 0$ and $u \ge 1$ there exist arbitrarily large $y$ and $f \in \mathcal{F}(S)$
with $f(n) = 1$ for $n \le y$ and
\[ \Big|\chi(t) - \frac{1}{\psi(y^t)}\sum_{m\le y^t} f(m)\Lambda(m)\Big| \le \epsilon \quad\text{for almost all } 0 \le t \le u. \]
Consequently, if $\sigma(u)$ is the solution to (28.1) for this $\chi$ then
\[ \frac{1}{y^u}\sum_{n\le y^u} f(n) = \sigma(u) + O(u^{\epsilon} - 1) + O\Big(\frac{u}{\log y}\Big). \]

In particular if  = (u)/u log u then




(u)
u
1 X
f
(n)
=
(u)
+
O
+
.
yu
u
log y
u
ny

Proof Since $\chi$ is measurable and $\chi(t)$ belongs to the convex hull of $S$, we can
find a step function $\chi_1$ within the convex hull of $S$ such that $\chi_1(t) = 1$ for $t \le 1$,
and $|\chi(t) - \chi_1(t)| \le \epsilon/2$ for almost all $t \in [0, u]$.$^2$
Now $\chi_1(t)$ belongs to the convex hull of $S$ and so can be arbitrarily well-approximated by (integral) linear combinations of elements of $S$. Hence if $\chi_1(t)$
has a fixed value in $(t_1, t_2)$ then we can select the set of values of $f(p) \in S$ when
$y^{t_1} < p < y^{t_2}$ to reflect such a linear combination, and therefore if
\[ \chi'(t) := \frac{1}{\theta(y^t)}\sum_{p\le y^t} f(p)\log p, \]
then $|\chi'(t) - \chi_1(t)| \le \epsilon/2$ for almost all $t \in [0, u]$. Hence $|\chi(t) - \chi'(t)| \le \epsilon$ for
almost all $t \in [0, u]$. The proof in Exercise 28.5 then implies that $|\sigma(u) - \sigma'(u)| \le
u^{\epsilon} - 1$, for all $u \ge 1$, and the result then follows from (28.2). □
$^2$ By almost all, we mean that the inequality is only violated on a set of measure 0.

28.4 An example for Halász's Theorem

Now suppose that $\chi'(t) = 1$ if $t \le 1$, $\chi'(t) = i$ if $1 < t \le u/2$ and $\chi'(t) = 0$ if
$t > u/2$. We let $\chi(t) = \chi'(t)$ for $t \le u/2$. Suppose that $\sigma'(u) = e^{i\theta}|\sigma'(u)|$. For
$u/2 < t \le u$ we let $\chi(t) = e^{i(\theta-\theta_t)}$ where $\sigma'(u-t) = e^{i\theta_t}|\sigma'(u-t)|$. Hence, by the
previous exercise
\[ \sigma(u) = \sigma'(u) + \int_{u/2\le t\le u}\frac{\chi(t)}{t}\,\sigma'(u-t)\,dt = e^{i\theta}\Big(|\sigma'(u)| + \int_{u/2\le t\le u}\frac{|\sigma'(u-t)|}{t}\,dt\Big). \]
Let $\alpha$ be a complex number with $\mathrm{Re}(\alpha) < 1$, and let $\rho_\alpha$ denote the unique
continuous solution to $u\rho_\alpha'(u) = -(1-\alpha)\rho_\alpha(u-1)$, for $u \ge 1$, with the initial
condition $\rho_\alpha(u) = 1$ for $u \le 1$ (The Dickman-De Bruijn function is the case
$\alpha = 0$).$^3$ For $\alpha \in [0, 1]$, Goldston and McCurley [5] gave an asymptotic expansion
of $\rho_\alpha$,$^4$ and showed that when $\alpha$ is not an integer
\[ \rho_\alpha(u) \sim \frac{e^{\gamma(1-\alpha)}}{\Gamma(\alpha)\,u^{1-\alpha}}, \]
as $u \to \infty$.$^5$ In our example $\sigma'(v) = \rho_i(v)$ for $v \le u/2$, and so, for $c = e^{\gamma}/|\Gamma(i)| =
3.414868086\ldots$ we have $|\sigma'(v)| \sim c/v$. Hence taking $v = u - t$ above
\[ |\sigma(u)| \gtrsim \int_{1\le v\le u/2}\frac{c}{v(u-v)}\,dv = \frac{c\log(u-1)}{u} \gg \frac{\log u}{u} \gg Me^{-M}, \]
since, in this example we have
\[ M(x,T) \ge \min_{y\in\mathbb{R}}\Big(\int_0^1\frac{1-\cos(vy)}{v}\,dv + \int_1^{u/2}\frac{1-\sin(vy)}{v}\,dv\Big) \]
\[ \ge \log(u/2) + \min_{y\in\mathbb{R}}\int_0^y\frac{1-\cos t+\sin t}{t}\,dt - \max_{Y\in\mathbb{R}}\int_0^Y\frac{\sin t}{t}\,dt \ \ge\ \log u - O(1). \]

$^3$ We will discuss this example in more detail a little later. Perhaps we should combine the
two discussions.
$^4$ Their proof is in fact valid for all complex $\alpha$ with $\mathrm{Re}(\alpha) < 1$.
$^5$ Just as we saw in the Selberg-Tenenbaum Theorem, when $\alpha$ is an integer the behaviour
of $\rho_\alpha$ is very different; in fact $\rho(u) = 1/u^{u+o(u)}$. Exercise: Use the Structure theorem to
compare these results.

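29 LAPLACE TRANSFORMS

For a measurable function $g : [0, \infty) \to \mathbb{C}$ we will denote the Laplace transform
of $g$ by $\mathcal{L}(g, s) := \int_0^\infty g(t)e^{-st}\,dt$. If $g$ is integrable and grows sub-exponentially
(that is, for every $\epsilon > 0$, $|g(t)| \ll_\epsilon e^{\epsilon t}$ almost everywhere) then the Laplace
transform is well defined for all complex numbers $s$ with $\mathrm{Re}(s) > 0$. We begin
with (28.1). Multiplying through by $ue^{-su}$ and integrating over all $u \ge 0$ yields
\[ -\mathcal{L}'(\sigma, s) = \int_0^\infty u\sigma(u)e^{-su}\,du = \int_{t=0}^{\infty}\int_{u=t}^{\infty}\chi(t)e^{-st}\,\sigma(u-t)e^{-s(u-t)}\,du\,dt = \mathcal{L}(\chi, s)\,\mathcal{L}(\sigma, s). \]
Dividing through by $\mathcal{L}(\sigma, s)$ and integrating yields
\[ \mathcal{L}(\sigma, w) = \mathcal{L}(\sigma, 0)\exp\Big(-\int_0^w \mathcal{L}(\chi, s)\,ds\Big). \]
Exercise 29.1 Show that
\[ \mathcal{L}(I_k(u,\chi), s) = \frac{1}{s}\,\mathcal{L}\Big(\frac{1-\chi(v)}{v}, s\Big)^{k}. \]
Since $\mathcal{L}(\sigma, s) = \mathcal{L}(\sigma_\infty, s) = \sum_{k\ge 0}(-1)^k\,\mathcal{L}(I_k(u,\chi), s)/k!$ deduce that
\[ \mathcal{L}(\sigma, s) = \frac{1}{s}\exp\Big(-\mathcal{L}\Big(\frac{1-\chi(v)}{v}, s\Big)\Big). \]
Deduce, or use Exercise 28.5 to show that, more generally
\[ \mathcal{L}(\sigma_1, s) = \mathcal{L}(\sigma_2, s)\exp\Big(\mathcal{L}\Big(\frac{\chi_1(v)-\chi_2(v)}{v}, s\Big)\Big). \]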
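We define
\[ E(u) = E_\chi(u) := \exp\Big(\int_0^u\frac{1-\chi(t)}{t}\,dt\Big). \]
Lemma 29.2 Suppose that $\chi(t) = 1$ for $t \le 1$ and $0 \le \chi(t) \le 1$ for all $t$. Given
$u$ define $\hat\chi(t) = \chi(t)$ for $t \le u$ and $\hat\chi(t) = 0$ for $t > u$. We have
\[ \sigma(u) \le \frac{e^{\gamma}}{E(u)} - \frac{1}{u}\int_u^\infty \hat\sigma(t)\,dt. \]
Proof By definition $\hat\sigma(v) = \sigma(v)$ and $E_{\hat\chi}(v) = E_\chi(v)$ for $v \le u$. Now
\[ \sigma(u) = \hat\sigma(u) = \frac{1}{u}\int_0^u \hat\chi(t)\hat\sigma(u-t)\,dt \le \frac{1}{u}\int_0^u \hat\sigma(t)\,dt = \frac{1}{u}\int_0^\infty \hat\sigma(t)\,dt - \frac{1}{u}\int_u^\infty \hat\sigma(t)\,dt. \]
For $s$ a small positive real $< 1/u$, we have
\[ \mathcal{L}\Big(\frac{1-\hat\chi(t)}{t}, s\Big) - \log E(u) = \int_0^\infty\frac{1-\hat\chi(t)}{t}\,e^{-st}\,dt - \int_0^u\frac{1-\chi(t)}{t}\,dt \]
\[ = \int_0^u\frac{1-\chi(t)}{t}\,(e^{-st}-1)\,dt + \int_u^\infty\frac{e^{-st}}{t}\,dt = -\log(us) - \gamma + O(us), \]
since $\gamma = \int_0^1\frac{1-e^{-t}}{t}\,dt - \int_1^\infty\frac{e^{-t}}{t}\,dt$. Hence
\[ \frac{1}{u}\lim_{s\to 0}\mathcal{L}(\hat\sigma, s) = \lim_{s\to 0}\frac{1}{us}\exp\Big(-\mathcal{L}\Big(\frac{1-\hat\chi(t)}{t}, s\Big)\Big) = \frac{e^{\gamma}}{E(u)}, \]
and so we have the result.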

30 THE SPECTRUM

30.1 The Mean Value Spectrum

We are interested in what are the possible mean values of multiplicative functions
in certain classes; for example, characters of order $m$. To this end we let $S$ be a
given subset of the unit disc $\mathbb{U}$, and let $\mathbb{T}$ be the unit circle. Let $\mathcal{F}(S)$ denote the
class of completely multiplicative functions $f$ such that $f(p^k) \in S$ for all prime
powers $p^k$.$^6$ Our main concern is:
What numbers arise as mean-values of functions in $\mathcal{F}(S)$?
That is, we define $\Gamma_N(S) = \big\{\frac{1}{N}\sum_{n\le N} f(n) : f \in \mathcal{F}(S)\big\}$ and then seek to
understand the (mean value) spectrum
$\Gamma(S) = \lim_{N\to\infty}\Gamma_N(S)$.$^7$
The case of most interest to us is $S_m$, defined as 0 together with the $m$th roots
of unity, because $\mathcal{F}(S_m)$ yields the possible character sums of characters of order
$m$. We begin by making some simple observations.
We see that $\Gamma(\{1\}) = \{1\}$, and if $S_1 \subset S_2$ then $\Gamma(S_1) \subset \Gamma(S_2)$. Moreover
$\Gamma(S)$ is a closed subset of the unit disc $\mathbb{U}$, and $\Gamma(S) = \Gamma(\overline{S})$ where $\overline{S}$ denotes the
closure of $S$, and so, henceforth, we shall assume that $S$ is closed.
The hypothesis implies that the set $S$ is closed under taking integer powers,
for if $\alpha \in S$ then let $f(p) = \alpha$ and so $\alpha^k = f(p^k) \in S$ for all $k$.
Exercise 30.1 Deduce that if there exists $\alpha \in S$ with $\alpha \ne 1$ then there exists a
real number $\beta \in K(S)$, the convex hull of $S$.

Lemma 30.2 $\Gamma(S) = \mathbb{U}$ or $S \cap \mathbb{T}$ is finite and only contains roots of unity.

Proof If $\alpha \in \mathbb{T}$ but is not a root of unity then the set $\{\alpha^k : k \ge 1\}$ is (well-known to be) dense on the unit circle, $\mathbb{T}$. Hence if $\alpha \in S$ then the closure of
$\{\alpha^k : k \ge 1\} \subset S$ is $\mathbb{T}$, and so $\mathbb{T} \subset S$, since $S$ is closed.
But then the multiplicative function $f(n) = n^{it}$ belongs to $\mathcal{F}(S)$ and has mean value
$\sim N^{it}/(1 + it)$ up to $N$. As we let $N \to \infty$ we deduce that $\Gamma(S)$ contains the circle
$\{z : |z| = 1/|1 + it|\}$. By letting $t$ range in $(0, \infty)$ we deduce that $\Gamma(S) = \mathbb{U}$.
Hence if $\Gamma(S) \ne \mathbb{U}$ then all elements of $S \cap \mathbb{T}$ are roots of unity. Moreover
there are only finitely many, else they have an accumulation point which is not
a root of unity, and since $S$ (and hence $S \cap \mathbb{T}$) is closed, this point belongs to $S$. □

$^6$ One can develop this theory under the less stringent conditions that (i) $f$ is multiplicative
but not necessarily completely multiplicative; (ii) $f(p) \in S$ for all primes $p$, but not necessarily
for prime powers. Change (i) requires only minor adjustments, whereas change (ii) makes the
theory somewhat more complicated.
$^7$ Here and henceforth, if we have a sequence of subsets $J_N$ of the unit disc $\mathbb{U} := \{|z| \le 1\}$,
then by writing $\lim_{N\to\infty} J_N = J$ we mean that $z \in J$ if and only if there is a sequence of
points $z_N \in J_N$ with $z_N \to z$ as $N \to \infty$.
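The mean value of $n^{it}$ used in this proof is easy to observe numerically. The following sketch (the function name `mean_n_it` and the parameters $N = 20000$, $t = 2$ are our own choices) compares the mean value with $N^{it}/(1+it)$, whose absolute value is $1/|1+it|$ whatever $N$ is.

```python
import cmath

def mean_n_it(N, t):
    """Mean value (1/N) * sum_{n <= N} n^{it}."""
    return sum(cmath.exp(1j * t * cmath.log(n)) for n in range(1, N + 1)) / N

N, t = 20000, 2.0
observed = mean_n_it(N, t)
predicted = cmath.exp(1j * t * cmath.log(N)) / (1 + 1j * t)
print("observed :", observed)
print("predicted:", predicted)
print("|predicted| =", abs(predicted), "  1/|1+it| =", 1 / abs(1 + 1j * t))
```

As $N$ varies the predicted value rotates around the circle of radius $1/|1+it|$, which is how the whole circle arises in the limit.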
Henceforth we assume that $S \cap \mathbb{T}$ is finite and only contains roots of unity.

Exercise 30.3 Show that if $1 \in S$ then $1 \in \Gamma(S)$.

Exercise 30.4 Suppose that $S \cap \mathbb{T}$ is finite. Fix $\epsilon > 0$. Show that there exists
$\delta > 0$ such that if $z \in \mathbb{T}$ and $|z - s| \ge \epsilon$ for all $s \in S \cap \mathbb{T}$, then $|z - s| \ge \delta$ for
all $s \in S$.
Exercise 30.5 Show that if there exists $s \in S$ such that $s \ne 1$ then $0 \in \Gamma(S)$.
Show that if $1 \notin S$ then $\Gamma(S) = \{0\}$. (Hint: Use Proposition 3.6).
Henceforth we may assume that there exists $1, \alpha \in S$ with $\alpha \ne 1$ and therefore
there exists a real number $\beta \in K(S)$ by Exercise 30.1.
If $z \in \mathbb{U} \setminus \{1\}$ then define $\mathrm{Ang}(z) = \arg(1-z)$, so that $-\pi/2 < \mathrm{Ang}(z) < \pi/2$.
For any $V \subset \mathbb{U}$, define $\mathrm{Ang}(V)$ to be the supremum of $|\arg(1-v)|$ as we range
over all $v \in V$ with $v \ne 1$. We will obtain the following improvement of the last
lemma:

Proposition 30.6 $\Gamma(S) = \mathbb{U}$ if and only if $\mathrm{Ang}(S) = \pi/2$. If $\Gamma(S) \ne \mathbb{U}$ then
there exists an integer $m$ such that $S$ lies within the convex hull of the $m$th roots
of unity; that is $S \subset K(S) \subset K(S_m)$.

Exercise 30.7 Suppose that $\mathrm{Ang}(S) = \frac{\pi}{2} - \delta$. Prove that $S$ is contained in the
convex hull of $\{1\} \cup \{e^{i\theta} : 2\delta \le |\theta| \le \pi\}$.
30.2 Factoring mean values

Our first step in understanding the spectrum is to prove that when $S \cap \mathbb{T}$ is finite
a version of Theorem 15.1 holds with $t = 0$:

Theorem 30.8 Suppose that $S$ is a closed, proper subset of $\mathbb{U}$, and that $f \in \mathcal{F}(S)$.
Let $g(p^k) = 1$, $h(p^k) = f(p^k)$ if $p \le y$, and $g(p^k) = f(p^k)$, $h(p^k) = 1$ if
$p > y$, for all $k \ge 1$. If $x = y^u$ then
\[ \frac{1}{x}\sum_{n\le x} f(n) = \frac{1}{x}\sum_{n\le x} g(n)\cdot\frac{1}{x}\sum_{n\le x} h(n) + o_{u,y}(1). \]
Throughout this section we will suppose that the mean value of $f$, up to $N$,
is $\gg \delta$ in absolute value. Then $|t_f(x, \log x)| \ll 1/\delta$ by Theorem 15.1. We can also
obtain upper bounds on the mean value directly from Halász's Theorem:

$^8$ Had we required all $f(n) \in S$ then $S$ would be closed under multiplication, and so $S \cap \mathbb{T}$
would be the set of $m$th roots of unity, for some integer $m \ge 1$.

Proposition 30.9 Suppose that $S$ is a proper, closed subset of $\mathbb{U}$. Define
\[ C_S := \int_0^1 \min_{s\in S}\big(1 - \mathrm{Re}(se^{2i\pi\theta})\big)\,d\theta. \]
If $t = t_f(x, \log x)$ then
\[ \frac{1}{x}\sum_{n\le x} f(n) \ll_{C_S} \frac{|\log(|t|\log x)|}{(|t|\log x)^{C_S}} + \frac{\log\log x}{\log x}. \]
Proof To obtain this from (??) we need to bound $D(f, n^{it}, x)$ from below. We
may assume that $t \ge 1/\log x$ else the result is trivial. Now
\[ D(f, n^{it}, x)^2 = \sum_{p\le x}\frac{1-\mathrm{Re}(f(p)p^{-it})}{p} \ge \sum_{e^{1/t}<p\le x}\min_{s\in S}\frac{1-\mathrm{Re}(sp^{-it})}{p} \]
\[ = \int_1^{t\log x}\min_{s\in S}\frac{1-\mathrm{Re}(se^{-iv})}{v}\,dv + O(1) = C_S\log(t\log x) + O(1). \]
□
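The constant $C_S$ is easy to evaluate numerically for a finite set $S$. The sketch below (the function names are ours) uses the midpoint rule for $S = S_m$; for comparison we use the closed form $1 - \frac{m}{\pi}\sin\frac{\pi}{m}$, which comes from our own elementary calculation (the maximum over $k$ of $\cos(2\pi\theta + 2\pi k/m)$ has period $1/m$ in $\theta$) and is not a formula quoted from the text.

```python
import cmath
from math import pi, sin

def C(S, samples=20000):
    """Midpoint-rule approximation to C_S for a finite set S of complex numbers."""
    total = 0.0
    for k in range(samples):
        theta = (k + 0.5) / samples
        rotation = cmath.exp(2j * pi * theta)
        total += min(1 - (s * rotation).real for s in S)
    return total / samples

def roots_of_unity_with_zero(m):
    """The set S_m: 0 together with the m-th roots of unity."""
    return [0j] + [cmath.exp(2j * pi * k / m) for k in range(m)]

for m in (2, 3, 4):
    print(m, C(roots_of_unity_with_zero(m)), 1 - (m / pi) * sin(pi / m))
```

For $m = 2$ this is the familiar constant $1 - 2/\pi$, and $C_{S_m}$ decreases to 0 as $m$ grows, so the bound of the Proposition weakens as $S_m$ fills out the unit circle.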
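Exercise 30.10 Show that
\[ C_S = \int_0^1 \min_{z\in K(S)}\big(1 - \mathrm{Re}(ze^{2i\pi\theta})\big)\,d\theta. \]
Suppose that $|\sum_{n\le x} f(n)| > \delta x$. Our proposition yields that $|t| \ll_S 1/\log x$.
Now for small $t$,
\[ D(f, n^{it}, x)^2 - D(f, 1, x)^2 = \sum_{p\le x}\frac{\mathrm{Re}(f(p)(1-p^{-it}))}{p}, \]
which is at most $\sum_{p\le x}|1-p^{it}|/p$ in absolute value; this is $\ll |t|\log x$ if $|t| \le 1/\log x$,
and otherwise $\le 2\log(|t|\log x) + O(1)$.
Therefore we deduce that $D(f, 1, x)^2 \ll_S \log(1/\delta)$. This implies that $1 \in S$.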
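Remark 30.11 In a similar vein to the Proposition, Hall [] asked for the largest
constant $\kappa$ such that $D(f, n^{it}, x)^2 \ge \kappa D(f, 1, x)^2$ whenever $f(p) \in S$. This can be
re-expressed as
\[ \sum_{p\le x}\frac{\mathrm{Re}((1-f(p))(\kappa-p^{-it}))}{p} \le \sum_{p\le x}\frac{1-\mathrm{Re}(p^{-it})}{p}; \]
and then, by the prime number theorem, as
\[ \int_0^1 \min_{s\in S}\mathrm{Re}\big((1-s)(\kappa-e^{2i\pi\theta})\big)\,d\theta \le 1 + O(1/\log(t\log x)). \]
To approximate this we define $\kappa(S)$ to be the maximum $\kappa$ for which
\[ \int_0^1 \min_{s\in S}\mathrm{Re}\big((1-s)(\kappa-e^{2i\pi\theta})\big)\,d\theta \le 1; \]
then $D(f, n^{it}, x)^2 \ge \kappa(S)D(f, 1, x)^2 + O(1)$.

Exercise 30.12 Prove that $\kappa(S) \ge \frac{1}{2}\big(1 - \frac{\lambda(S)}{2\pi}\big)$ where $\lambda(S)$ is the length of the
perimeter of $S$.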
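Proof of Theorem 30.8. Suppose that $|\sum_{n\le x} f(n)| > \delta x$ and let $t = t_f(x, \log x)$. We have just proved
that $|t| \ll_S 1/\log x$ and $D(f, 1, x)^2 \ll_S \log(1/\delta)$. Taking $y = x^{\epsilon}$ where $\epsilon \to 0$
very slowly, Theorem 15.1 then implies that
\[ \frac{1}{x}\sum_{n\le x} f(n) \sim x^{it}\ \frac{1}{x}\sum_{n\le x} g_t(n)\cdot\frac{1}{x}\sum_{n\le x} h_t(n), \]
where $g_t(p^k) = 1$, $h_t(p^k) = f(p^k)/p^{ikt}$ if $p \le y$, and $g_t(p^k) = f(p^k)/p^{ikt}$, $h_t(p^k) = 1$
if $p > y$. Hence the mean values of $g_t$ and of $h_t$ are both $\gg \delta$ in
absolute value.
We focus first on the mean value of $h_t(n)$. By Proposition 3.6 we see that
\[ \frac{1}{x}\sum_{n\le x} h_t(n) \sim \mathcal{P}(h_t; x) = \mathcal{P}(h_t; y) \sim \mathcal{P}(f, y) = \mathcal{P}(h, y) \sim \frac{1}{x}\sum_{n\le x} h(n), \]
since
\[ \Big|\Big(1 + \frac{h_t(p)}{p} + \frac{h_t(p^2)}{p^2} + \ldots\Big) - \Big(1 + \frac{f(p)}{p} + \frac{f(p^2)}{p^2} + \ldots\Big)\Big| \ll \frac{|t|\log p}{p}, \]
and hence
\[ \frac{\mathcal{P}(h_t; y)}{\mathcal{P}(f, y)} = \prod_{p\le y}\Big(1 + O\Big(\frac{|t|\log p}{p}\Big)\Big) = 1 + O(|t|\log y) \sim 1. \]
Now, by Lemma 12.4 we see that
\[ x^{it}\ \frac{1}{x}\sum_{n\le x} g_t(n) \sim \frac{1}{x}\sum_{n\le x} g_t(n)n^{it} = \frac{1}{x}\sum_{n\le x} g(n) + O\Big(\frac{1}{x}\sum_{n\le x}|g(n) - g_t(n)n^{it}|\Big). \]
Now if $|y| = |z| \le 1$ then $|y - z| \le |1 - z/y| = |1 - e^{\log(z/y)}| \ll |\log(z/y)|$, and so
\[ \sum_{n\le x}|g(n) - g_t(n)n^{it}| \ll \sum_{n\le x}|\log(g(n)/g_t(n)n^{it})| \le \sum_{n\le x}\sum_{\substack{p^k\| n\\ p\le y}} k|t|\log p \ll |t|\sum_{\substack{p\le y\\ k\ge 1}}\log p\ \frac{x}{p^k} \ll x|t|\log y = o(x), \]
since $g(p^k)/g_t(p^k)(p^k)^{it} = (p^{-it})^k$ if $p \le y$ and $= 1$ if $p > y$. The result follows. □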

30.3 The Structure of the Mean Value Spectrum

Theorem 30.8 allows us to factor the spectrum $\Gamma(S)$ into two parts:
• The first corresponds to mean values for multiplicative functions that only
vary from 1 on the small primes. These mean values can be realized in terms of
Euler products. We denote this Euler product spectrum by $\Gamma_P(S)$.
• The second corresponds to mean values for multiplicative functions that
only vary from 1 on the large primes. These mean values can be realized in
terms of solutions to integral delay equations. We denote this delay equation
spectrum by $\Lambda(S)$.
Hence Theorem 30.8 implies that$^9$
\[ \Gamma(S) = \Gamma_P(S)\,\Lambda(S). \]
We will now be more precise in analyzing the sets $\Gamma_P(S)$ and $\Lambda(S)$.
30.4 The Euler product spectrum

$\Gamma_P(S)$ is the set of mean values $\frac{1}{x}\sum_{n\le x} f(n)$ where $f \in \mathcal{F}(S)$ with $f(p^k) = 1$ if
$p > y$, and $\frac{\log x}{\log y} \to \infty$. Proposition 3.6 implies that this is the same as the set of
(finite) Euler products $\mathcal{P}(f; x)$ where $f \in \mathcal{F}(S)$.

Proposition 30.13
\[ \Gamma_P(S) = \{e^{-(1-\alpha)t} : t \ge 0,\ \alpha \in K(S)\}. \]
Proof Since $f$ is totally multiplicative then
\[ \mathcal{P}(f; x) = \prod_{p\le x}\Big(1-\frac{1}{p}\Big)\Big(1-\frac{f(p)}{p}\Big)^{-1} \]
and so
\[ \log\mathcal{P}(f; x) = \sum_{p\le x}\sum_{m\ge 1}\frac{f(p^m)-1}{mp^m} = (1-\alpha)\sum_{p\le x}\log\Big(1-\frac{1}{p}\Big) \]
for some $\alpha \in K(S)$ since each $f(p^m) \in S$. Hence $\Gamma_P(S) \subset \{e^{-(1-\alpha)t} : t \ge 0,\ \alpha \in K(S)\}$.
In the other direction, for a given $\alpha$ and $t$, select $y < x$ very large so that
$\sum_{y<p\le x} 1/p = t + O(1/y)$, and then the $f(p) \in S$ so that $\sum_{y<p\le x} f(p)/p = \alpha t +
O(1/y)$, which is certainly possible if $y$ is sufficiently large. But then $\log\mathcal{P}(f; x) =
-(1-\alpha)t + O(1/y)$. Letting $y \to \infty$ gives the result. □
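The second half of this proof can be watched numerically in the simplest case where $f(p) = \alpha$ is constant on the primes in $(y, x]$. The following is only an illustrative sketch with parameters of our own choosing ($\alpha = i$, $y = 1000$, $x = 200000$); the helper names are ours.

```python
import cmath

def primes_up_to(x):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [p for p in range(2, x + 1) if sieve[p]]

def euler_product(alpha, y, x):
    """P(f;x) for totally multiplicative f with f(p)=1 (p<=y), f(p)=alpha (y<p<=x).

    Also returns t = sum of 1/p over y < p <= x.
    """
    product, t = 1.0 + 0.0j, 0.0
    for p in primes_up_to(x):
        if p > y:                      # the factors with f(p) = 1 equal 1
            product *= (1 - 1 / p) / (1 - alpha / p)
            t += 1 / p
    return product, t

alpha = 1j
product, t = euler_product(alpha, y=1000, x=200000)
print("t =", t)
print("Euler product     :", product)
print("exp(-(1 - alpha)t):", cmath.exp(-(1 - alpha) * t))
```

The discrepancy between the two values comes from the terms with $m \ge 2$ in $\log\mathcal{P}(f;x)$, which is why $y$ must be taken large.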
If $\alpha \notin \mathbb{R}$ then $\{e^{-(1-\alpha)t},\ t \ge 0\}$ is a spiral which begins at 1 and ends at 0.
Exercise 30.14 Deduce that $\Gamma_P(S) = \Gamma_P(K(S))$.
Exercise 30.15 Prove that $\Gamma_P(S) \cdot \Gamma_P(S) = \Gamma_P(S)$.

$^9$ We define the product of two sets $A, B \subset \mathbb{C}$ to be $AB = \{ab : a \in A,\ b \in B\}$.

Exercise 30.16 We showed above that we can assume that there exists real
$\alpha \ne 1$ with $\alpha \in S \cap \mathbb{R}$. Deduce that $\Gamma_P(S)$ contains the straight-line connecting
0 to 1. Deduce further that $\Gamma_P(S)$ is starlike; that is the straight-line connecting
any point of $\Gamma_P(S)$ to the origin, lies entirely inside $\Gamma_P(S)$.

Exercise 30.17 Prove that if $z \in \Gamma_P(S)$ then $|z| \le \exp(-|\arg(z)|\cot(\mathrm{Ang}(S)))$.
Corollary 30.18 Ang(P (S)) Ang((S)) Ang(S).
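Proof Suppose that $\alpha \in S$ with $|\arg(1-\alpha)| \ge \mathrm{Ang}(S) - \epsilon$. Now if $v_t =
e^{-(1-\alpha)t} \in \Gamma_P(S)$ then $1 - v_t = (1-\alpha)t - (1-\alpha)^2t^2/2 + \ldots$. If $t$ is sufficiently
small then $|\arg(1-v_t)| \ge |\arg(1-\alpha)| - \epsilon \ge \mathrm{Ang}(S) - 2\epsilon$, and the result follows. □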
i

The interior of the unit disk U is {ere

: /2 /2, r > 0}

Corollary 30.19 Select s+ , s S \ {1} so that + = arg(1 s+ ) is maximal


and = arg(1 s ) is minimal, with 2 < 0 + < 2 . Then
i

P (S) = 1 {ere

: + , r > 0}.

This can be written as the boundary and interior of the curves given by
et cos (cos + +i sin + ) and et cos + (cos +i sin ) for 0 t

2
.
sin(+ )

In particular, the circle of radius eN , where N = 2/| tan + tan |, centered


at 0, is the largest such circle which lies inside P (S).
Proof By definition for every point s of S we may write 1 s = rei with
+ for some r > 0. This is therefore true for every K(S).
Now, if = arg(1 s) where s = x + iy then (1 s) = (1 x)(1 + i tan ).
n
Therefore 1x
(1 s) = n + in tan . Hence every point on the line between
n + in tan + and n + in tan takes the form t(1 ) with t 0 and K(S)
for each n 0. This completes the proof of the first part of our result.
Now if n N then all numbers ofEulSpec
the form n + in tan i, with 0 2
are of this form and so Proposition 30.13 yields that P (S) contains the circle
of radius en .
2
Exercise 30.20 Prove that $\mathrm{Ang}(S) = \pi/2$ if and only if there is an infinite
sequence of points $r_ne^{i\theta_n} \in S$ such that $\theta_n \to 0$ as $n \to \infty$ with $r_n \le 1$ and
$r_n = 1 + o(\theta_n)$.

Proof of Proposition 30.6 Suppose that $\mathrm{Ang}(S) = \pi/2$. Take the points
$r_ne^{i\theta_n} \in S$ from the last exercise. By Proposition 30.13, the points on the spiral
$e^{-(1-r_ne^{i\theta_n})t} \in \Gamma(S)$ for $t \ge 0$. For each $\theta \in (-\pi, \pi]$ the consecutive points on the
spiral with argument $\theta$ differ by a multiplicative distance $e^{-2\pi(1-r_n\cos\theta_n)/|\sin\theta_n|} =
1 + o_n(1)$, and hence as $n \to \infty$ we see that every point on this ray is a limit
point of $\Gamma(S)$. Hence $\Gamma(S) = \mathbb{U}$.
Now suppose that $\mathrm{Ang}(S) < \pi/2$. Note that if there exists $\alpha \in S \cap \mathbb{T}$ which
is not a root of unity then $\mathbb{T} \subset S$ (as in the proof of Lemma 30.2) and so
$\mathrm{Ang}(S) = \pi/2$. Hence we may assume that $S \cap \mathbb{T}$ is finite and consists only of
roots of unity.
Now suppose that $\alpha \in S \cap \mathbb{T}$ which is an $m$th root of unity. We now show that
$\mathrm{Ang}(\overline{\alpha}S) < \pi/2$. If not then we have a sequence of points $r_ne^{i\theta_n} \in \overline{\alpha}S$ with $\theta_n \to 0$
as $n \to \infty$ with $r_n \le 1$ and $r_n = 1 + o(\theta_n)$. Then $r_n^me^{im\theta_n} = (\alpha r_ne^{i\theta_n})^m \in S$
but here $m\theta_n \to 0$ as $n \to \infty$ with $r_n^m \le 1$ and $r_n^m = 1 + o(m\theta_n)$, so that
$\mathrm{Ang}(S) = \pi/2$, by the previous exercise, a contradiction.
Let us suppose that every element of $S \cap \mathbb{T}$ is an $m$th root of unity, and
select $M$ divisible by $m$, so that $\mathrm{Ang}(S_M) > \max_{\alpha\in S\cap\mathbb{T}}\mathrm{Ang}(\overline{\alpha}S)$ and sufficiently
large that the largest distance from $\mathbb{T}$ to the perimeter of $K(S_M)$ is $< \delta$; then
$K(S) \subset K(S_M)$ by Exercise 30.4. □
To simplify our treatment of P (S) we shall now restrict attention to totally
multiplicative functions. Hence we define (S) to be the spectrum of meanvalues of totally multiplicative functions, and similarly P (S) and (S). All of
the above proofs are still valid, and so we deduce that (S) = P (S) (S) and
(S) = (S).
Exercise 30.21 (Open problem) Define (S) to be the spectrum of meanvalues of all multiplicative functions with f (pk ) S (but not necessarily totally
multiplicative). Similarly define P (S). It is evident that P (S) P (S). Can
you find elements of P (S) that do not belong to P (S)? Can you determine
P (S)?10
30.5 The Delay Equation Spectrum

Let $\Lambda(S)$ denote the values $\sigma(u) = \sigma_\chi(u)$ obtained from (28.1) when $\chi$ is a
measurable function with $\chi(t) \in K(S)$ for all $t \ge 0$, with $\chi(t) = 1$ for $0 \le t \le 1$.
Proposition 28.1 implies that any mean value of a multiplicative function that
only varies from 1 on the large primes, belongs to $\Lambda(S)$. On the other hand if
$\sigma(u) \in \Lambda(S)$ then there is an $f \in \mathcal{F}(S)$ whose mean value up to $x$ is $\sim \sigma(u)$, by
Proposition 28.9.
Exercise 30.22 Explain why $\Lambda(S) = \Lambda(K(S))$. Deduce that $\Gamma(S) = \Gamma(K(S))$,
so we can assume throughout that $S$ is a convex, closed, proper subset of $\mathbb{U}$.
Lemma 30.23
\[ \Gamma_P(S)\,\Lambda(S) \subset \Gamma(S). \]
Exercise 30.24 Deduce that (S) and (S) are all also starlike. Then deduce
that (S) = (S).
10 The easiest examples arise by simply taking the p = 2 term. If S = S let f (2k ) = i so
4
. However the spirals in S4 look like etit , so with angle
that (1 12 )(1 + 2i + 4i + . . .) = 1+i
2
/4 the maximum in size is e/4 (1 + i) and e/4 = 0.4559381277 < 1/2.

Proof Suppose that we are given $e^{-(1-\alpha)t} \in \Gamma_P(S)$ and $\sigma(u) \in \Lambda(S)$. Let $x$
be sufficiently large that we can choose $f(p)$ with $z < p \le y = x^{1/u}$, as in the
proof of Proposition 30.13, for which $\mathcal{P}(f, y) = e^{-(1-\alpha)t} + O(1/z)$. We select $f(p)$
for $y < p \le x$, as in Proposition 28.9. We let $f(p) = 1$ for all $p \le z$. Applying
Theorem 30.8 and then Proposition 3.6 we deduce that the mean value of $f$ is
$\sim e^{-(1-\alpha)t}\sigma(u)$. Now applying Theorem 30.8 again, but this time with $y$ there
equal to $z$ here, we find that $h = 1$, $g = f$ and so the mean value of $f$ belongs to
$\Gamma(S)$. □
This result implies that $\Gamma_P(S) \subset \Gamma(S)$. Are there elements of $\Gamma(S)$ that do
not belong to $\Gamma_P(S)$? In general, the spectrum contains more elements than
simply the Euler products. For example, the spectrum of Euler products for
$S = [-1, 1]$ is simply the interval $[0, 1]$, whereas negative numbers are part of
$\Gamma(S)$. We have seen that $\Gamma_P(S)$ is straightforward to fully understand, whereas
$\Lambda(S)$ remains somewhat mysterious. We will discuss this in more detail in the
next chapter.

31 RESULTS ON SPECTRA

31.1 The spectrum for real-valued multiplicative functions

The spectrum has been fully determined in only one interesting case, where
$S = [-1, 1]$; that is real-valued multiplicative functions. In that case, in [GS], we
proved that $\Gamma([-1, 1]) = [\delta_1, 1]$ where
\[ \delta_1 = 1 - 2\log(1+\sqrt{e}) + 4\int_1^{\sqrt{e}}\frac{\log t}{t+1}\,dt = -0.656999\ldots. \]
In other words, for any real-valued completely multiplicative function $f$ with
$-1 \le f(n) \le 1$, we have
\[ \sum_{n\le x} f(n) \ge (\delta_1 + o(1))x; \]
with equality if and only if $D(f, f_1, x) = o(1)$ where
\[ f_1(p) = \begin{cases} 1 & \text{for primes } p \le x^{1/(1+\sqrt{e})} \\ -1 & \text{for primes } x^{1/(1+\sqrt{e})} \le p \le x. \end{cases} \]
Applying this to the totally multiplicative function $f(n) = \big(\frac{n}{p}\big)$, for some
prime $p$, we deduce that the number of integers below $x$ that are quadratic
residues (mod $p$) is
\[ \ge \frac{1}{2}\sum_{n\le x}\Big(1 + \Big(\frac{n}{p}\Big)\Big) \ge \frac{1+\delta_1}{2}\,x + o(x) = (\delta_0 + o(1))x, \]
where $\delta_0 = 0.171500\ldots$.$^{11}$ More colloquially we have:
If $x$ is sufficiently large then, for all primes $p$, more than
17.15% of the integers up to $x$ are quadratic residues (mod $p$).
Exercise 31.1 Prove that the constant $\delta_0$ here is best possible.

$^{11}$ One can derive the following curious expression for $\delta_0$ (from the definition of $\delta_1$):
\[ \delta_0 = 1 - \frac{\pi^2}{6} - \log(1+\sqrt{e})\,\log\frac{e}{1+\sqrt{e}} + 2\sum_{n=1}^{\infty}\frac{1}{n^2(1+\sqrt{e})^n}. \]
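These constants are easily checked by machine. The sketch below evaluates the integral defining $\delta_1$ with Simpson's rule (our choice of quadrature, and our own helper `simpson`), sets $\delta_0 = (1+\delta_1)/2$, and compares with the series expression in the footnote truncated after 199 terms.

```python
from math import e, log, pi, sqrt

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

root_e = sqrt(e)
delta_1 = 1 - 2 * log(1 + root_e) + 4 * simpson(lambda t: log(t) / (t + 1), 1, root_e)
delta_0 = (1 + delta_1) / 2

# The series expression for delta_0 quoted in the footnote.
series = (1 - pi ** 2 / 6
          - log(1 + root_e) * log(e / (1 + root_e))
          + 2 * sum(1 / (n * n * (1 + root_e) ** n) for n in range(1, 200)))

print("delta_1 =", delta_1)
print("delta_0 =", delta_0)
print("series  =", series)
```

The terms of the series decay like $(1+\sqrt{e})^{-n}$, so the truncation error is far below double precision.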

31.2 The number of mth power residues up to x

We now establish that similar results hold for $m$-th power residues. For each
integer $m \ge 2$, define the minimal density and minimal logarithmic density of
$m$th power residues modulo primes, to be
\[ \delta_m = \liminf_{x\to\infty}\ \inf_{\ell\ \mathrm{prime}}\ \frac{1}{x}\sum_{\substack{n\le x\\ n\equiv a^m\ (\mathrm{mod}\ \ell)}} 1, \qquad\text{and}\qquad \delta'_m = \liminf_{x\to\infty}\ \inf_{\ell\ \mathrm{prime}}\ \frac{1}{\log x}\sum_{\substack{n\le x\\ n\equiv a^m\ (\mathrm{mod}\ \ell)}}\frac{1}{n}. \]
We already know that $\delta_2 = \delta_0$ and we will see that $\delta'_2 = 1/2$. For $m \ge 3$ we show
that





1 X km
1
1
1
0
min
m/e .
0 < m (m) m < m1 m
0 e
m
2
(km)!
e
k=0

0
for any m 3. Calculating the
We do not know the exact values of m and m
0
minimum over , we found that 3 0.3245, 40 0.2187, 50 0.14792, and
60 0.1003. However we do obtain the following consequence:

For any given integer m 2, there exists a constant m > 0 such


that if x is sufficiently large then, for all primes p, more than
m % of the integers up to x are mth power residues (mod p).

31.3 An important example

Consider the multiplicative function $f$ with $f(p) = 1$ for $p \le y = x^{1/u}$ and
$f(p) = \alpha \in S$ for $y < p \le x$. Write $\sigma(u) = \rho_\alpha(u)$ which satisfies the integral
delay equation
\[ u\rho_\alpha(u) = \int_{u-1}^{u}\rho_\alpha(t)\,dt + \alpha\int_0^{u-1}\rho_\alpha(t)\,dt, \]
and therefore $\rho_\alpha'(u) = -(1-\alpha)\rho_\alpha(u-1)/u$. The case $\alpha = 0$ has already been
discussed in detail. In general we can compute the mean value for small $u$, using
our results that $\sigma = \sigma_\infty(u)$, and $I_j(u, \chi) = 0$ if $j > u$. Hence:
If 1 u 2 then the mean value is 1 (1 ) log u. Therefore if = rei
K(S) (with 6= 1) then z = 1 (1 )v (S) for 0 v log 2. If z = mei
then one can showsizez
that m = sin / sin( + ). On the other hand, if z P (S)
then by exercise 30.17
|z| exp( cot )

1
1
sin
<
=
,
1 + cot
cos + sin cot
sin( + )

which is a contradiction. Hence z is in (S) but not in P (S).

If $2 \le u \le 3$ then the mean value is
$$1 - (1-\alpha)\log u + \frac{(1-\alpha)^2}{2}\int_{\substack{t_1,t_2\ge 1\\ t_1+t_2\le u}}\frac{dt_1\,dt_2}{t_1t_2}.$$

For $\alpha = -1$ we see that $\rho_{-1}(\sqrt{e}) = 0$ and hence $\rho_{-1}'(1+\sqrt{e}) = 0$. In fact $\rho_{-1}(1+\sqrt{e}) = \delta_1$ and one can show that this is the absolute minimum value $\rho_{-1}$ takes. Moreover, by continuity, $\rho_{-1}(u)$ takes on all values in the interval $[\delta_1, 1]$, showing that $\Gamma(S) \supseteq [\delta_1, 1]$. This leads us to the multiplicative function $f_1$ in the first section of this chapter.
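The small-$u$ formulas can be checked numerically. The sketch below assumes the mean value is $1-(1-\alpha)\log u$ on $[1,2]$ and picks up the extra term $\tfrac{(1-\alpha)^2}{2}\iint dt_1\,dt_2/(t_1t_2)$ on $[2,3]$, where the inner integration reduces the double integral to $\int_1^{u-1}\log(u-t)/t\,dt$. The helper names `simpson` and `rho_alpha` are hypothetical, and only real $\alpha$ is handled.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def rho_alpha(u, alpha):
    """Mean value for 1 <= u <= 3 and real alpha, from the two formulas above."""
    if not 1.0 <= u <= 3.0:
        raise ValueError("only the range 1 <= u <= 3 is covered")
    value = 1.0 - (1.0 - alpha) * math.log(u)
    if u > 2.0:
        # double integral over t1, t2 >= 1, t1 + t2 <= u, after integrating out t2
        double = simpson(lambda t: math.log(u - t) / t, 1.0, u - 1.0)
        value += 0.5 * (1.0 - alpha) ** 2 * double
    return value

root_e = math.sqrt(math.e)
print(rho_alpha(root_e, -1.0))        # zero of rho_{-1}
print(rho_alpha(1.0 + root_e, -1.0))  # value at the critical point 1 + sqrt(e)
print(rho_alpha(3.0, 0.0))            # Dickman's rho(3), as a sanity check
```

The second printed value should be compared with $\delta_1$; the third is the classical value $\rho(3) \approx 0.0486$.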
Exercise 31.2 Show that for any $\sigma, \chi$ satisfying (28.1) (that is, $u\sigma(u) = (\sigma * \chi)(u)$ for all $u \ge 0$) we have $u(1*\sigma)(u) = ((1*\sigma)*(1+\chi))(u)$. Go on to show that if $\sigma_j, \chi_j$ satisfy (28.1) for $j = 1, 2$ then $u\sigma(u) = (\sigma*\chi)(u)$ for all $u \ge 0$, where $\chi = \chi_1 + \chi_2$ and $\sigma = \sigma_1 * \sigma_2$.
Define $M(u) = M_\chi(u) := \int_0^u \sigma(t)\,dt$; that is, $M = (1*\sigma)$. If $\chi(t) = -1$ for all $t > 1$ then $M_{-1}(u) = u$ for $0 \le u \le 1$ and, by exercise 31.2,
$$uM_{-1}(u) = 2\int_{u-1}^{u} M_{-1}(t)\,dt \quad\text{for } u > 1.$$
This is much like the functional equation for $\rho(u)$ and can be analyzed in much the same way:
Exercise 31.3 Prove that $M_{-1}(u) = ((2e + o(1))/u\log u)^u$. Use the fact that this is decreasing so fast to deduce that for all sufficiently large $v$ there exists $u$
with v < u < v + v/ log v such that 1 (u)  ((2e + o(1))/u log u)u .
31.4 Open questions of interest

What is ([, 1])? That is the spectrum for real-valued f with the each f (p)
[, 1]. An easy Corollary of Corollary 27.18 is that if S contains a non-real point
then $\Gamma(S)$ contains a negative real number. We want to know here if this is true when $S$ is real but contains negative real numbers. Evidently $[0, 1] \subset \Gamma(S)$ in this case.
What is the spectrum for the mean value of real-valued multiplicative functions up to $x$, when $f(p) = 0$ for all $p \le y$? We will see that this is useful in understanding the distribution of quadratic residues.

32
THE NUMBER OF UNSIEVED INTEGERS UP TO X
This is the original article, more-or-less unedited.
One expects around $x\prod_{p\notin P,\ p\le x}(1-1/p)$ integers up to $x$, all of whose prime factors come from the set $P$. Of course for some choices of $P$ one may get rather more integers, and for some choices of $P$ one may get rather fewer. Hall [4] showed that one never gets more than $e^\gamma + o(1)$ times the expected amount (where $\gamma$ is the Euler-Mascheroni constant), which was improved slightly by Hildebrand [5]. Hildebrand [6] also showed that for a given value of $\prod_{p\notin P,\ p\le x}(1-1/p)$, the smallest count that you get (asymptotically) is when $P$ consists of all the primes up to a given point. In this paper we shall improve Hildebrand's upper bound, obtaining a result close to optimal, and also give a substantially shorter proof of Hildebrand's lower bound. As part of the proof we give an improved Lipschitz-type bound for such counts.
Define
$$G(w) := \limsup_{x\to\infty}\frac{1}{x}\sum_{n\le x} f(n), \qquad\text{and}\qquad g(w) := \liminf_{x\to\infty}\frac{1}{x}\sum_{n\le x} f(n),$$
where both limits are taken over the class of multiplicative functions $f$ with $P(f, x) = 1/w + o(1)$.
If $f$ is completely multiplicative with $f(p) = 1$ for $p \le x^{1/w}$ and $f(p) = 0$ for $x^{1/w} \le p \le x$ then $P(f; x) = 1/w + o(1)$ and $\sum_{n\le x} f(n) = \Psi(x, x^{1/w}) \sim x\rho(w)$. Hence $g(w) \le \rho(w)$ and A. Hildebrand [6] established that in fact $g(w) = \rho(w)$. Since $\rho(w) = w^{-w+o(w)}$ note that $g(w)$ decays very rapidly as $w$ increases.

Regarding $G(w)$, R. Hall [4] established that $G(w) \le e^\gamma/w$ and Hildebrand [5] improved this slightly by showing that $G(w) \le \frac{1}{w}\int_0^w \rho(t)\,dt$. Since $\int_0^\infty \rho(t)\,dt = e^\gamma$ this does mark an improvement over Hall's result, but the difference from $e^\gamma/w$ is $\frac{1}{w}\int_w^\infty \rho(t)\,dt = w^{-w+o(w)}$ which is very small. In this paper we shall prove that $G(w) = e^\gamma/w - 1/w^{2+o(1)}$, but it remains to determine $G(w)$ more precisely. We shall also give a shorter proof of Hildebrand's result that $g(w) = \rho(w)$.
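The identity $\int_0^\infty \rho(t)\,dt = e^\gamma$ and the smallness of the tail can be seen numerically. The sketch below tabulates the Dickman function from $u\rho(u) = \int_{u-1}^u \rho(t)\,dt$ with the trapezoid rule; the grid size, the cut-off at $u = 12$ and the function names are assumptions chosen for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329

def dickman_table(u_max=12.0, n=500):
    """Tabulate rho(k/n) for 0 <= k/n <= u_max using u*rho(u) = int_{u-1}^{u} rho(t) dt."""
    h = 1.0 / n
    size = int(round(u_max * n)) + 1
    rho = [1.0] * size                                # rho(u) = 1 for 0 <= u <= 1
    for k in range(n + 1, size):
        interior = sum(rho[k - n + 1:k])              # interior nodes of the window [u-1, u]
        # trapezoid rule with the unknown endpoint rho[k] moved to the left-hand side
        rho[k] = h * (0.5 * rho[k - n] + interior) / (k * h - 0.5 * h)
    return rho

def trapezoid(values, h, start, stop):
    """Trapezoid rule over the grid indices start..stop (inclusive)."""
    return h * (sum(values[start:stop + 1]) - 0.5 * (values[start] + values[stop]))

GRID = 500
rho = dickman_table(12.0, GRID)
total = trapezoid(rho, 1.0 / GRID, 0, len(rho) - 1)
tail_at_5 = trapezoid(rho, 1.0 / GRID, 5 * GRID, len(rho) - 1)
print(total, math.exp(EULER_GAMMA), tail_at_5 / 5.0)
```

The last printed number is the gap between Hall's bound and Hildebrand's bound at $w = 5$, which is already far below $e^\gamma/5$.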
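Theorem 32.1 For all $w \ge 1$ we have that
$$G(w) \ge \max_{w\ge\Delta\ge0}\Big(\rho(w+\Delta) + \int_0^\Delta \frac{\rho(t)}{w+\Delta-t}\,dt\Big).$$
When $w$ is large, the maximum is attained for $\Delta \sim \log w/\log\log w$, and yields
$$G(w) \ge \frac{e^\gamma}{w} - \frac{(e^\gamma+o(1))\log w}{w^2\log\log w}.$$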
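The lower bound of Theorem 32.1 is easy to evaluate for moderate $w$. The sketch below reads the quantity being maximized as $\rho(w+\Delta) + \int_0^\Delta \rho(t)/(w+\Delta-t)\,dt$, which is what the construction with $x = y^{w+\Delta}$ produces; that reading, the grid sizes and the helper names are assumptions for illustration.

```python
import math

def dickman_table(u_max=5.0, n=400):
    """Tabulate rho(k/n) using u*rho(u) = int_{u-1}^{u} rho(t) dt (trapezoid rule)."""
    h = 1.0 / n
    size = int(round(u_max * n)) + 1
    rho = [1.0] * size
    for k in range(n + 1, size):
        interior = sum(rho[k - n + 1:k])
        rho[k] = h * (0.5 * rho[k - n] + interior) / (k * h - 0.5 * h)
    return rho

def lower_bound(rho, n, w, delta):
    """rho(w+delta) + int_0^delta rho(t)/(w+delta-t) dt, with w and delta on the grid."""
    h = 1.0 / n
    top = int(round((w + delta) * n))
    d = int(round(delta * n))
    if d == 0:
        return rho[top]
    vals = [rho[j] / ((top - j) * h) for j in range(d + 1)]
    integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return rho[top] + integral

N_GRID = 400
RHO = dickman_table(5.0, N_GRID)
at_w_15 = lower_bound(RHO, N_GRID, 1.5, 1.5)                     # Delta = w
best_w_2 = max(lower_bound(RHO, N_GRID, 2.0, 0.05 * i) for i in range(41))
print(at_w_15, 1.0 - math.log(1.5) + math.log(1.5) ** 2 / 2.0)
print(best_w_2)
```

At $w = 1.5$ with $\Delta = w$ the value agrees with $1 - \log w + (\log w)^2/2$ to the accuracy of the grid, and at $w = 2$ the maximum over $\Delta$ is attained a little below $\Delta = 2$.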

Theorem 32.2 For all large $w$ we have
$$G(w) \le \frac{e^\gamma}{w} - \frac{1}{w^2\exp\big(c(\log w)^{2/3}(\log\log w)^{1/3}\big)}$$
for a positive constant $c$.

We also give an explicit upper bound for $G(w)$ valid for all $w$.

Theorem 32.3 For $1 \le w$ we have that $G(w) \le 1 - \log w + (\log w)^2/2$ and equality holds here for $1 \le w \le 3/2$. For $w \ge 1$ put $\Lambda(w) := \frac12(w + 1/w) + \frac{\log w}{2}(w - 1/w)$. Then $G(w) \le \Lambda(w)\log(1 + e^\gamma/(w\Lambda(w)))$.

The first bound in Theorem 3 is better than the second for $w \le 3.21\ldots$, when the second bound takes over. Note that the second bound in Theorem 3 equals $e^\gamma/w - (e^{2\gamma} + o(1))/w^3\log w$, only a little weaker than the bound in Theorem 2, while being totally explicit.
In the range $1 \le w \le 3/2$ we may check that the right side of (1.3) equals $1 - \log w + (\log w)^2/2 = G(w)$. Perhaps it is true that $G(w)$ is given by the right side of (1.3) for all $w$.
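The crossover near $3.21$ between the two explicit upper bounds can be located by bisection. The sketch below takes $\Lambda(w) = \frac12(w+1/w) + \frac{\log w}{2}(w-1/w)$; the function names and the bracketing interval $[3, 3.5]$ are assumptions made for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329

def first_bound(w):
    """1 - log w + (log w)^2 / 2."""
    log_w = math.log(w)
    return 1.0 - log_w + 0.5 * log_w * log_w

def big_lambda(w):
    """Lambda(w) = (w + 1/w)/2 + (log w)(w - 1/w)/2."""
    return 0.5 * (w + 1.0 / w) + 0.5 * math.log(w) * (w - 1.0 / w)

def second_bound(w):
    """Lambda(w) * log(1 + e^gamma / (w * Lambda(w)))."""
    lam = big_lambda(w)
    return lam * math.log(1.0 + math.exp(EULER_GAMMA) / (w * lam))

def crossover(lo=3.0, hi=3.5, iterations=60):
    """Bisection for first_bound = second_bound; assumes one sign change on [lo, hi]."""
    def gap(w):
        return first_bound(w) - second_bound(w)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gap(lo) * gap(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

w_star = crossover()
print(w_star, first_bound(w_star), second_bound(w_star))
```

Below the crossover the first bound is the smaller of the two, and above it the second bound takes over.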
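We end this section by giving a simple construction that proves Theorem 1.
Proof of Theorem 1 Let $y$ be large and consider the completely multiplicative function $f$ defined by $f(p) = 0$ for $p \in [y, y^w]$ and $f(p) = 1$ for all other primes $p$. Put $x = y^{w+\Delta}$ where $0 \le \Delta \le w$ and note that $P(f, x) = \prod_{y\le p\le y^w}(1 - 1/p) \sim 1/w$. An integer $n \le x$ with $f(n) = 1$ has at most one prime factor between $y^w$ and $x$, and all its other prime factors are below $y$. Hence
$$\sum_{n\le x} f(n) = \Psi(x, y) + \sum_{y^w\le p\le x}\Psi(x/p, y),$$
and using (1.2) and the prime number theorem this is
$$\sim x\rho(w+\Delta) + x\sum_{y^w\le p\le x}\frac{1}{p}\,\rho\Big(w+\Delta-\frac{\log p}{\log y}\Big) \sim x\Big(\rho(w+\Delta) + \int_0^\Delta\frac{\rho(t)}{w+\Delta-t}\,dt\Big),$$
which gives the lower bound (1.3) for $G(w)$. For large $w$ we see that
$$\rho(w+\Delta) + \int_0^\Delta\frac{\rho(t)}{w+\Delta-t}\,dt = \frac{1}{w+\Delta}\int_0^\Delta\rho(t)\,dt + \int_0^\Delta\frac{t\rho(t)}{(w+\Delta)(w+\Delta-t)}\,dt + \rho(w+\Delta),$$
and since $\int_0^\infty t\rho(t)\,dt < \infty$ and $\int_0^\Delta\rho(t)\,dt = e^\gamma - \Delta^{-\Delta(1+o(1))}$ the above is
$$\frac{1}{w+\Delta}\Big(e^\gamma - \Delta^{-\Delta(1+o(1))}\Big) + O\Big(\frac{1}{w^2}\Big).$$
The quantity above attains a maximum for $\Delta = (1+o(1))\log w/\log\log w$, completing the proof of Theorem 1.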

We noted above that $G(w) = 1 - \log w + (\log w)^2/2$ for $1 \le w \le 1.5$ (with the maximum attained in (1.3) at $\Delta = w$). Next we record the bounds obtained for $1.5 \le w \le 2$ (though here the maximum is attained with $\Delta$ a little smaller than $w$).

w            1.5       1.6       1.7       1.8       1.9       2.0
G(w) >=   .676735   .640255   .608806   .581685   .557392   .535905
G(w) <=   .676736   .640449   .610155   .584960   .564135   .547080

The upper and lower bounds for $G(w)$ given by Theorems 1 and 3.

32.1 Reformulation in terms of integral equations
Note that $P(f, y^u) \sim 1/E(u)$. Analogously to $g(w)$ and $G(w)$ we may define
$$\tilde g(w) = \liminf_{\substack{u,\chi\\ E_\chi(u)=w}}\sigma(u), \qquad\text{and}\qquad \tilde G(w) = \limsup_{\substack{u,\chi\\ E_\chi(u)=w}}\sigma(u),$$
where the limits are taken over all pairs $u, \chi$ with $u \ge 1$, where $\chi$ is a measurable function for which $\chi(t) = 1$ for $t \le 1$ and $\chi(t) \in [0, 1]$ for all $t$, and with $E_\chi(u) = w$. We shall show that these quantities are in fact equal to $g(w)$ and $G(w)$ respectively. Something similar was stated (but not very precisely) by Hildebrand in his discussion paper [7].

Theorem 32.4 We have $\tilde g(w) = g(w)$ and $\tilde G(w) = G(w)$.


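To prove Theorem 2.2 we need to know how small primes affect the mean values of multiplicative functions.
Prove that
$$\tilde g(w) \ge g(w) \ge \min_{w\ge v\ge1}\frac{1}{v}\,\tilde g\Big(\frac{w}{v}\Big), \qquad\text{and}\qquad \tilde G(w) \le G(w) \le \max_{w\ge v\ge1}\frac{1}{v}\,\tilde G\Big(\frac{w}{v}\Big).$$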
32.2 An open problem or two
Fix , 0 < < 1. Let f be a multiplicative function such that 0 f (n) 1,
and
X f (p) log p
= ( + o(1)) log x.
p
px

Prove that
1
Z 1/
X f (n)
Y
f (p)
(e + o(1))
(t)dt
1
,
n
p
0

nx

px

R +
where is the Dickman-de Bruijn function. (Note that 0 (t)dt = e ). This
inequality is sharp. To see that take f such that f (p) = 1 for all primes p x
and f (p) = 0 otherwise.
R u We can reformulate this in terms of integral equations. Define (u) :=
(t)dt, then Halls conjecture is the following
0


Conjecture 32.5
Z u

Z
(t)dt

u
(u)

Z

(t)dt exp
1


(t)
dt .
t

A stronger conjecture asserts that



(u)

u
(u)


.

P
If true , this implies the result of Hildebrand that lim inf x x1 nx f (n) exists
and is equal to (), where the limit is taken over the class of multiplicative
functions f with


Y
1
f (p) f (p2 )
1
1
+
+
...
1+
= + o(1).
2
p
p
p

px

32.3 Upper bounds for G(w) and Lipschitz estimates


We are able to improve $1 - 2/\pi$ to $1 - 1/\pi$ in the special case that $\chi(t) \in [0, 1]$ for all $t$.

Theorem 32.6 Let $\chi$ be a measurable function with $\chi(t) = 1$ for $t \le 1$ and $\chi(t) \in [0, 1]$ for $t > 1$, and let $\sigma$ denote the corresponding solution to (2.1). Then
$$|\sigma(u) - \sigma(v)| \ll \Big(\frac{u-v}{u}\Big)^{1-1/\pi}\Big(1 + \log\frac{u}{u-v}\Big) \quad\text{whenever } 1 \le v \le u.$$
Theorem 4 follows immediately from the stronger but more complicated Proposition 4.2 below, and the fact that $|\sigma(u) - \sigma(v)| \le 3(u-v)/u$ whenever $v \le u(1 - 1/E(u))$. This is trivial for $v \le 2u/3$, whereas for larger $v$ in the range, we obtain
$$|\sigma(u) - \sigma(v)| \le \frac{e^\gamma}{E(v)} \le \frac{ue^\gamma}{vE(u)} \le \frac{3(u-v)}{u},$$
using Hall's result that $\sigma(u) \le e^\gamma/E(u)$.

Using (3.3) in (3.2) leads to the bound $G(w) \le e^\gamma/w - C_\kappa/(w^{1+1/\kappa}\log w)$ for some positive constant $C_\kappa$. Thus if (3.3) holds with $\kappa = 1$ then we would be able to deduce that $G(w) = e^\gamma/w - (\log w)^{O(1)}/w^2$ by Theorem 1.
In order to prove Theorem 3 we give the following explicit Lipschitz estimate (see also Proposition 4.1 of [2]).

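Proposition 32.7 Let $\chi$ be a measurable function with $\chi(t) = 1$ for $t \le 1$ and $\chi(t) \in [0, 1]$ for all $t$, and let $\sigma(u)$ denote the corresponding solution to (2.1). Then for all $u \ge 1$ and $1 \ge \delta > 0$ we have
$$-\log(1+\delta)\Big(\frac{E(u)+1/E(u)}{2} + \log E(u)\,\frac{E(u)-1/E(u)}{2}\Big) \le \sigma(u(1+\delta)) - \sigma(u),$$
and
$$\sigma(u(1+\delta)) - \sigma(u) \le \log(1+\delta)\Big(\frac{E(u)-1/E(u)}{2} + \log E(u)\,\frac{E(u)+1/E(u)}{2}\Big).$$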

Proof We shall only prove the lower bound; the proof of the upper bound is similar. From (2.2a,b) we see that
$$\sigma(u(1+\delta)) - \sigma(u) \ge -\sum_{\substack{j=1\\ j \text{ odd}}}^{\infty}\frac{1}{j!}\big(I_j(u(1+\delta);\chi) - I_j(u;\chi)\big).$$
By symmetry we see that $I_j(u(1+\delta);\chi) - I_j(u;\chi)$ equals
$$j\int_{t_1,\ldots,t_{j-1}\ge1}\ \int_{\substack{t_j\ge\max(t_1,\ldots,t_{j-1},\,u-t_1-\cdots-t_{j-1})\\ t_j\le u(1+\delta)-t_1-\cdots-t_{j-1}}}\frac{1-\chi(t_1)}{t_1}\cdots\frac{1-\chi(t_{j-1})}{t_{j-1}}\,\frac{1-\chi(t_j)}{t_j}\,dt_1\cdots dt_j.$$
The integral over $t_j$ is
$$\le \log\Big(\frac{u/j+\delta u}{u/j}\Big) = \log(1+j\delta) \le j\log(1+\delta),$$
since $\max(t_1,\ldots,t_{j-1},u-t_1-\cdots-t_{j-1}) \ge u/j$. Further, since $\delta \le 1$ we have $t_1,\ldots,t_{j-1} \le u$ and so these integrals contribute $\le (\log E(u))^{j-1}$. Thus we have
$$\sigma(u(1+\delta)) - \sigma(u) \ge -\sum_{\substack{j=1\\ j \text{ odd}}}^{\infty}\frac{1}{j!}\,j^2\log(1+\delta)(\log E(u))^{j-1},$$
and the result follows easily.
Proof of Theorem 2.2 Fix $w \ge v \ge 1$. Suppose $\chi(t) = 1$ for $t \le 1$ and $\chi(t) \in [0, 1]$ for all $t$ and let $\sigma(u)$ denote the corresponding solution to (2.1) (we will think of $\chi$ as giving the optimal function for either $\tilde g(w/v)$ or $\tilde G(w/v)$). Let $U \ge 1$ be a parameter which we will let tend to infinity. Put $\chi_1(t) = \chi(t/U)$ and note that the corresponding solution to (2.1) is $\sigma_1(u) = \sigma(u/U)$. Define $\chi_2(t) = 0$ for $1 \le t \le v$ and $\chi_2(t) = \chi_1(t)$ for all other $t$, and let $\sigma_2(u)$ denote the corresponding solution to (2.1). By Lemma 2.5 we see that for $U \ge v$
$$\sigma_2(uU) = \sigma_1(uU) + \sum_{j=1}^{\infty}\frac{(-1)^j}{j!}\int_{\substack{v\ge t_1,\ldots,t_j\ge1\\ t_1+\cdots+t_j\le uU}}\frac{1}{t_1}\frac{1}{t_2}\cdots\frac{1}{t_j}\,\sigma_1(uU-t_1-\cdots-t_j)\,dt_1\cdots dt_j.$$
By Proposition 3.1 we know that
$$\sigma_1(uU-t_1-\cdots-t_j) = \sigma_1(uU) + O\Big(\min\Big(1,\ E_\chi(u)\log E_\chi(u)\,\frac{jv}{uU}\Big)\Big).$$
Using this above we see easily that for large $U$ with $u, v, w$ fixed we have $\sigma_2(uU) \sim \sigma_1(uU)/v = \sigma(u)/v$, and note further that $E_{\chi_2}(uU) = vE_{\chi_1}(uU) = vE_\chi(u)$.
This scaling argument shows that for $1 \le v \le w$ we have $\tilde g(w/v) \ge v\,\tilde g(w)$ and that $\tilde G(w/v) \le v\,\tilde G(w)$. Using these inequalities in (2.4a) we deduce that $g(w) \ge \tilde g(w)$ and that $G(w) \le \tilde G(w)$, and combining this with (2.4b) we obtain Theorem 2.2.
Now that Theorem 2.2 has been established, to prove Theorem 3 it suffices to establish the analogous bounds for $\tilde G(w)$ and we establish these next.
Proof of Theorem 3 Using the inclusion-exclusion upper bound (2.5) with $n = 2$ we see that $\sigma(u) \le 1 - \log E(u) + (\log E(u))^2/2$. It follows that $G(w) = \tilde G(w) \le 1 - \log w + (\log w)^2/2$. If $w \le 3/2$ then consider $\chi(t) = 0$ for $1 \le t \le w$ and $\chi(t) = 1$ for all other $t$. Then we see that the corresponding solution $\sigma(u)$ satisfies $\sigma(u) = 1 - \log w + (\log w)^2/2$ for $3 \ge u \ge 2w$. Thus $\tilde G(w) = 1 - \log w + (\log w)^2/2$ for $1 \le w \le 3/2$.
We now establish the second bound of the Theorem. As noted in the introduction the second bound is worse than the first for $w \le 3.21$ and so we may suppose that $w \ge 2$. With $\hat\chi$, $\hat\sigma$ as above, note that $\hat\sigma(t) \ge 0$ for all $t$, and $\hat\sigma(u(1+\delta)) \ge \sigma(u) - \Lambda(E(u))\log(1+\delta)$ for $0 \le \delta \le 1$ by Proposition 3.1. If $E(u) \ge 2$ then $\Lambda(E(u)) \ge 7/4 > 1/\log 2$ so that $\exp(\sigma(u)/\Lambda(E(u))) - 1 < 1$. Hence we obtain that
$$\frac{1}{u}\int_u^\infty\hat\sigma(t)\,dt \ge \int_0^{\exp(\sigma(u)/\Lambda(E(u)))-1}\big(\sigma(u) - \Lambda(E(u))\log(1+\delta)\big)\,d\delta \qquad (32.1)$$
$$= -\sigma(u) + \Lambda(E(u))\Big(\exp\Big(\frac{\sigma(u)}{\Lambda(E(u))}\Big) - 1\Big), \qquad (32.2)$$
and inserting this into (3.2) we get the Theorem.


32.4 An improved upper bound: Proof of Theorem 2

Our proof of Theorem 2 is also based on (3.2) and obtaining lower bounds for $\frac{1}{u}\int_u^\infty\hat\sigma(t)\,dt$. However Theorem 4 is not quite strong enough to obtain this conclusion and so, in this section, we develop a hybrid Lipschitz estimate which for our problem is almost as good as (3.3) with $\kappa = 1$. We begin with the following Proposition (compare Lemma 2.2 and Proposition 3.3 of [3]).
Proposition 32.8 Let $\chi$ be a measurable function with $\chi(t) = 1$ for $t \le 1$ and $\chi(t)$ in the unit disc for all $t$. Let $\sigma$ be the corresponding solution to (2.1). Let $1 \le v \le u$ be given real numbers, and put $\delta = u - v$. Define
$$F := \max_{y\in\mathbb{R}}\ \exp\Big(-\mathrm{Re}\int_0^u\frac{1-\chi(t)e^{-ity}}{t}\,dt\Big)\,|1 - e^{-iy\delta}|.$$
Then
$$|\sigma(u) - \sigma(v)| \le \frac{\delta}{u}\log\frac{eu}{\delta} + F + F\int_0^{2/(uF)}\frac{1-e^{-2xu}}{x}\,dx \qquad (32.4)$$
$$\le \frac{\delta}{u}\log\frac{eu}{\delta} + F\log\frac{e^3}{F}. \qquad (32.5)$$

Proof As in the proof of Theorem 3 take (t)


= (t) for t u and (t)
=0
for t > u, and let
be the corresponding solution to (2.1). Set (t) =
(t) = 0
for t < 0. Note that

Z u


(t)(
(u t)
(v t))dt (32.7)
|u(u) v(v)| = |u
(u) v
(v)| =
0
Z u
Z u
Z


|
(t)
(t )|dt =
2t|
(t)
(t )|
e2xt dx dt
0

(32.8)
Z

2
0

{|t
(t) (t )
(t )| + |
(t )|}e2tx dtdx

(32.9)
Z

I(x)dx +
0

2e2tx dtdx = log

u
+

I(x)dx,
0

(32.10)
(32.11)
where

Z
I(x) =

2|t
(t) (t )
(t )|e2tx dt.

As |(u) (v)| u1 (|u(u) v(v)| + |(v)|) u + u1 |u(u) v(v)|, it follows


that
Z

eu 1
|(u) (v)| log
+
I(x)dx.
u

u 0
By Cauchys inequality
 Z u
 Z u

I(x)2 4
e2tx dt
|t
(t) (t )
(t )|2 e2tx dt
0
0
 1 e2xu  Z

2
|t
(t) (t )
(t )|2 e2tx dt .
x
0

(32.12)
(32.13)
(32.14)

By Plancherels formula the second term above is


Z
Z
1
1
=
|L(t
(t)(t)
(t), x+iy)|2 dy =
|L(t
(t), x+iy)|2 |1e(x+iy) |2 dy.
2
2


From (2.1) we see that L(t


(t), x + iy) = L(
, x + iy)L(,
x + iy) and so the
above equals
Z
Z
1
2
(x+iy) 2
2 1
|L(
, x+iy)L(,
x+iy)| |1e
| dy F (x)
|L(,
x+iy)|2 dy
2
2
where
F (x) := max |1 e(x+iy) ||L(
, x + iy)|.
yR

Now, using Plancherels formula again,


Z
Z
Z u
1 e2xu
1
|L(,
x + iy)|2 dy =
|(t)|
2 e2tx dt
e2tx dt =
,
2
2x
0
0
and so

1 e2xu
F (x).
x
We now demonstrate that F (x) is a decreasing function of x. Suppose that
|z|

is real, and recall that the Fourier transform of k(z) :=


is k()
=
R > 0|z|iz
R e
2

1
z
e
dz = 2 +2 . Hence e
= k(z) = k(z) = 2 +2 eiz dz

by Fourier inversion for z > 0. It follows that for + t > 0 we have


Z
1

(1 e(x++iy) )et(x++iy) =
et(x+iy+i) (1 e(x+iy+i) )d.
2 + 2
I(x)

Multiplying both sides by


(t), and integrating t from 0 to , we deduce that
Z

1
L(
, x + iy + i)(1 e(x+iy+i) )d
(1 e(x++iy) )L(
, x + +iy) =
2 + 2
(32.15)

1 Z

max |(1 e(x+iy) )L(


, x + iy)|
d,
yR
2 + 2
(32.16)
(32.17)
and so F (x + ) F (x) as claimed. Therefore F (x) limx0+ F (x).
Now if s = x + iy with x > 0 then
 Z 


Z vs
1 (v)
1 (v)eivy
e
evx
vx
L
,s =
e
dv +
dv
v
v
v
0
0
(32.18)

Z 
1 (v)eivy
=
evx dv + log(x/s),
(32.19)
v
0
(32.20)

so that

 Z 


1
1 (v)eivy
vx
L(, s) = exp
e
dv .
x
v
0

Using this for


we have
|L(
, x + iy)| =


1
exp
x

etx
dt
t

Re
0

 1 (t)eity 
t


etx dt .

For x  1/u we get


Z tx
Z t
Z t
Z 1 t
e
e
e
e 1
1
1
dt =
dt =
dt+
dt+log
= +log
+O(ux),
t
t
t
t
ux
ux
u
ux
1
ux
R1
R t
t
since = 0 1et dt 1 e t dt, so that


 Z u

1 (t)eity

|L(
, x + iy)| = e u exp
Re
dt + O(ux) .
t
0
Note that this is u 1, so that the maximum of |1 e(x+iy) ||L(
, x + iy)|
cannot occur with ky/2k 0 as x 0+ (here ktk denotes the distance from
the nearest integer to t), else F (x) u x + ky/2k 0 as x 0+ , implying
that F (x) = 0 which is ridiculous. Thus the maximum occurs with ky/2k  1
as x 0+ so that 1 e(x+iy) = 1 eiy + O(x) = (1 eiy ){1 + O(x)},
so that




Z u
1 (t)eity
(x+iy)
iy
|1e
||L(
, x+iy)| = u|1e
| exp
dt + O(ux) .
Re
t
0
Therefore F (x) uF {1 + O(ux)} for sufficiently small x; and so F (x) uF .
Also F (x) 2 maxyR |L(
, x + iy)| 2/x. Therefore, by (4.2), we get that
(
1e2xu
uF if x 2/uF
I(x) 2 x
if x > 2/uF,
x2
which when inserted in (4.1) yields the first estimate in the Proposition.
Now if F 1 then
Z 2/(uF )
Z 2/u
Z 2/(uF )
1 e2xu
1
1 e2xu
dx
dx +
dx 2 + log(1/F ),
x
x
x
0
2/u
0
and so we deduce the second estimate of Proposition 4.1. If F > 1 this holds
trivially since |(u) (v)| 2.
2
As an application of this Proposition, we establish the following strangelooking Lipschitz estimate in the case that (t) [0, 1] for all t 1.


Proposition 32.9 Let be a measurable function with (t) = 1 for t 1 and


(t) [0, 1] for t > 1, and let denote the corresponding solution to (2.1). Let
1 v u be given and write E(u) = (u/(u v))P for P > 0. Then
 u v min{1,1 1 sin(P)} 
u 
.
|(u) (v)| 
1 + log
u
uv
Ru
Proof Let = u v and A = 0 1(t)
dt = log E(u). We will show that
t
A
 Z u 1 (t) cos(ty) 
 min{1,1 1 sin( log(u/)
)}
exp
,
dt min(1, y) 
t
u
0
for all positive y. The result then follows from Proposition 4.1 since F  Left
side of (4.3).
If y e/u then the left side of (4.3) is e/u and the result follows. Henceforth we may suppose that y > e/u. Since cos(x) = 1 + O(x2 ), we get that
R 1/y 1(t) cos(ty)
R 1/y
Ru
dt = 0 1(t)
dt + O(1). Thus if we let z := 1/y 1(t)
dt
t
t
t
0
then
Z u
Z u
1 (t) cos(ty)
1 (t) cos(ty)
dt = A z + O(1) +
dt
(32.21)
t
t
0
1/y
Z u
Z u
1 (t)
1 cos(ty)
dt +
cos(ty)dt
= A z + O(1) +
t
t
1/y
1/y
(32.22)
Z uy
1 (t/y)
= A z + log(uy) + O(1) +
cos(t)dt,
t
1
(32.23)
(32.24)
by making a change of variables, and since (integrating by parts)
Z u
Z u
sin(ty) u
sin(ty)
cos(ty)
dt =
+
dt = O(1).

t
yt
yt2
1/y
1/y
1/y
By periodicity
Z uy
Z
1 (t/y)
cos(t)dt =
G(P) cos P dP, where G(P) :=
t
1
0

X
tP2Z
1tuy

1 (t/y)
t

and the sum over t above is over real values of t in the range [1, uy] such that
t P is an integer multiple of 2. Note that
1
0 G(P) log(uy) + O(1) for all P,

Z
Z u
1 (t)
and
G(P)dP =
dt = z.
t
0
1/y

(32.25)
(32.26)
(32.27)


R
Consider the problem of minimizing 0 G(P) cos PdP over all functions G satisfying these two constraints. Since cos P decreases from 1 to 1 in the range [0, ],
we see that this is achieved by taking G(P) = 0 for P [0, P0 ], and G(P) =
1
1
log(uy) + O(1) for P [ P0 , ], where P0 satisfies P0 ( log(uy) + O(1)) = z.
We conclude that
Z
Z

1
1
log(uy) + O(1) dP = log(uy) sin P0 + O(1)
G(P) cos PdP
cos P

0
P0
(32.28)


1
z
= log(uy) sin
+ O(1)
(32.29)

log(uy) + O(1)


1
z
= log(uy) sin
+ O(1),
(32.30)

log(uy)
(32.31)
since 0 z log(uy). Therefore
Z
0


 z 
1 (t) cos(ty)
1
dt A z + log(uy) 1 sin
+ O(1).
t

log(uy)

In the domain 0 z log(uy), the right side of (4.4) is a non-increasing function


of z, so that it is greater than the value with z replaced by log(uy), that is, it is
> A+O(1). Therefore the left side of (4.3) is  eA min(1, y), which is /u if
A log(uy), as required. If A < log(uy) then the right side of (4.4) is greater than
the value with z replaced by A, which is log(uy) log(uy)
sin(A/ log(uy))+O(1),

so that the left side of (4.3) is




A
1
min(1, y)
(uy) sin( log(uy) ) .
uy

This function is maximized when y = 1/ in the range log(uy) A, at which


point it yields the right side of (4.3), completing the proof.
2
Proof of Theorem 2 Let = E(u) = eA . We may assume that is large,
and that (u) 1/, else our result follows trivially. Let v = (1 + e )u for
some parameter > A, and select (t)
= (t) for t u and (t)
= 0 for t > u,
as earlier. Using Proposition 4.2 we deduce that there is a constant C such that
 A 


.
|
(u)
(v)| C(1 + ) exp + sin

If 2A, then this is C(1 + ) exp((1 1/)) which is easily verified to


be 1/(2) if is sufficiently large. If A < 2A, then the right side of (4.5)
A
is 2C(1 + A) exp( +
sin( )), which is a decreasing function of in our
range. For = A + where := cA2/3 (log A)1/3 , with c > (6/ 2 )1/3 , this equals


 4 

 A 
2 3

1
A+
= 2C(1+A) exp A
+O

2C(1+A) exp A+
sin
.

A+
6 A2
A3
2


Thus we have proved that |


(u)
(v)| 1/(2) for all A + , which implies
that
(v) 1/(2) for u v u(1 + eA ). Therefore
1
u

(t)dt
u

1
u

u(1+eA )

(v)dv
u

1
1
1
ueA
>
,
u
2
22 exp()

which implies the theorem, by (3.2).


2

33
THE LOGARITHMIC SPECTRUM

We saw in section ?? that for any fixed > 0, the spectrum of

X f (n)  X 1

:
f

F(S)
lim
x

n
n
nx

nx

is easily understood in terms of Euler products and (S), except when = 1,


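in which case we have the logarithmic spectrum, $\Gamma_0(S)$, which is easier to study than $\Gamma(S)$: The fact that, for $x = y^u$,
$$\frac{1}{\log x}\sum_{n\le x}\frac{f(n)}{n} = \frac{1}{u}\int_0^u\frac{1}{y^t}\sum_{n\le y^t}f(n)\,dt + \frac{1}{x\log x}\sum_{n\le x}f(n),$$
implies that $\Gamma_0(S) \subset K(\Gamma(S))$, the convex linear combinations of elements of $\Gamma(S)$. Similarly we deduce that
$$\Lambda_0(S) = \Big\{\frac{1}{u}\int_0^u\sigma(t)\,dt:\ \sigma, \chi \text{ as in (28.1)},\ u \ge 1\Big\},$$
so that $\Lambda_0(S) \subset K(\Lambda(S))$. We need to see how much of the theory for $\Gamma(S)$ carries over to $\Gamma_0(S)$: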
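33.1 Results for logarithmic means

Proposition 33.1 Let $f$ be a multiplicative function with $|f(n)| \le 1$ for all $n$, and put $g(n) = \sum_{d|n}f(d)$. Then
$$\frac{1}{\log x}\Big|\sum_{n\le x}\frac{f(n)}{n}\Big| \ll \exp\Big(-\frac12\sum_{p\le x}\frac{1-\mathrm{Re}(f(p))}{p}\Big).$$
Proof Let $g = 1 * f$. Since
$$\sum_{n\le x}g(n) = \sum_{n\le x}\sum_{d|n}f(d) = \sum_{d\le x}f(d)\Big(\frac{x}{d} + O(1)\Big) = x\sum_{d\le x}\frac{f(d)}{d} + O(x),$$
we see that
$$\frac{1}{\log x}\Big|\sum_{n\le x}\frac{f(n)}{n}\Big| \le \frac{1}{x\log x}\sum_{n\le x}|g(n)| + O\Big(\frac{1}{\log x}\Big) \ll \frac{1}{\log^2x}\sum_{n\le x}\frac{|g(n)|}{n} + \frac{1}{\log x} \ll \exp\Big(\sum_{p\le x}\frac{|1+f(p)|-2}{p}\Big) + \frac{1}{\log x}$$
by (9.1), and then Mertens' theorem. Now $\frac12(1-\mathrm{Re}(z)) \le 2 - |1+z| \le 1 - \mathrm{Re}(z)$ whenever $|z| \le 1$, and so the result follows.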
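The elementary inequality $\frac12(1-\mathrm{Re}(z)) \le 2 - |1+z| \le 1 - \mathrm{Re}(z)$ for $|z| \le 1$, which closes the proof of Proposition 33.1, can be spot-checked on random points of the unit disc. This is only a numerical sanity check under a small floating-point tolerance, not a proof; the function name `check` and the sampling scheme are illustrative choices.

```python
import cmath
import math
import random

def check(z, tol=1e-12):
    """Test (1 - Re z)/2 <= 2 - |1 + z| <= 1 - Re z, up to a rounding tolerance."""
    lower = 0.5 * (1.0 - z.real)
    middle = 2.0 - abs(1.0 + z)
    upper = 1.0 - z.real
    return lower - tol <= middle <= upper + tol

random.seed(0)
samples = [
    cmath.rect(math.sqrt(random.random()), random.uniform(0.0, 2.0 * math.pi))
    for _ in range(100000)
]  # sqrt of a uniform radius gives points uniform in area on the disc
all_ok = all(check(z) for z in samples)
print(all_ok)
```

Both inequalities have short proofs: the right-hand one is $\mathrm{Re}(1+z) \le |1+z|$, and the left-hand one reduces to $(1-\mathrm{Re}\,z)^2 \ge 0$ after squaring and using $|z| \le 1$.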
Now (8.6) together with Proposition 33.1 implies that $t = t_f(x, \log x)$ is small if the mean value is large. Indeed if $|\sum_{n\le x}f(n)/n| \ge (\log x)^{1-\epsilon}$ then $D^2(f, n^{it}, x) \le (\epsilon + o(1))\log\log x$ and $D^2(f, 1, x) \le (2\epsilon + o(1))\log\log x$, so that
$$\log(1 + |t|\log x) + O(1) = D^2(1, n^{it}, x) \le \big(D(f, n^{it}, x) + D(f, 1, x)\big)^2 \le \big((1+\sqrt2)^2\epsilon + o(1)\big)\log\log x.$$
Hence
$$|t_f(x, \log x)| \ll \frac{1}{(\log x)^{1-6\epsilon}}.$$

it
It is not entirely
P surprising that t must be small since if f (n) = n with |t|
1/ log x then nx f (n)/n = log(1/|t|) + O(1).

Exercise 33.2 Prove the Lipschitz-type estimate
$$\Big|\frac{1}{\log x}\sum_{n\le x}\frac{f(n)}{n} - \frac{1}{\log(x/y)}\sum_{n\le x/y}\frac{f(n)}{n}\Big| \ll \frac{\log 2y}{\log x},$$
for all functions $f$ with $|f(n)| \le 1$. (Hint: Do this from first principles.)
Exercise 33.3 By using partial summation in Proposition 3.6 or otherwise, show that if $f(p^k) = 1$ for all primes $p > y$ then for $x = y^u$ we have
$$\sum_{n\le x}\frac{f(n)}{n} = P(f; y)\log x + O(\log y).$$
This implies that the Euler product spectrum here is the same as before (see, e.g., Proposition 30.13).
Proposition 33.4 Let $f$ be any multiplicative function with $|f(n)| \le 1$. Let $g$ be the completely multiplicative function defined by $g(p) = 1$ for $p \le y$ and $g(p) = f(p)$ for $p > y$. If $x = y^u$ then
$$\frac{1}{\log x}\sum_{n\le x}\frac{f(n)}{n} = P(f, y)\,\frac{1}{\log x}\sum_{n\le x}\frac{g(n)}{n} + O\Big(\frac{1}{u^{1/2}}\Big).$$

Proof Let f 1 =g h so that h(p) = 1 for all p > y. Using the last exercise
we obtain, for v = u and x = y u ,
X g(a) X h(b)
X h(b)
X (g h)(n)
=
+
n
a
b
b
v
v

nx

ax/y

bx/a

by

X
x/y v <ax/b

g(a)
a

X g(a)
X log(y v /b)

=
(P(h; y) log x/a + O(log y)) + O
a
b
v
v
by
ax/y
Z xX
g(a) dt
= P(h; y)
+ O(u log2 y),
a t
1
at

extending the sum over a to all a x. Since g h = f 1 and P(1, y) = 1 we


deduce that
Z xX
Z xX
g(a) dt
f (a) dt
= P(f ; y)
+ O(u log2 y).
a
t
a
t
1
1
at

at

We subtract
the expression for x from that for xy w , with w = u, and then use
LogLipsch
exercise 33.2 to note that

Z xyw X
Z xyw
X f (n)
f (a) dt
dt
log
t

=
+ O(log(t/x))
a
t
log
x
n
t
x
x
at

nx

1 X f (n)
= uw log2 y
+ O(u log2 y).
log x
n
nx

Combining all this information yields the Proposition.

This is our structure theorem, and allows us to assert that $\Gamma_0(S) = \Gamma_P(S)\,\Lambda_0(S)$, and hence $\Gamma_0(S)$ is starlike.
Exercise 33.5 Prove that $\Gamma_0(S) = \Lambda_0(S)$.
One can easily show that there are elements of $\Lambda_0(S)$ that do not belong to $\Gamma_P(S)$: Let $\chi(t) = 1$ for $t \le 1$ and $\chi(t) = \alpha \in S$ for $t > 1$, so that $\sigma(t) = 1 - (1-\alpha)\log t$ for $1 \le t \le 2$, and hence
$$\frac{1}{u}\int_0^u\sigma(t)\,dt = 1 - (1-\alpha)\frac{1}{u}\int_1^u\log t\,dt = 1 - (1-\alpha)\Big(\log u - 1 + \frac{1}{u}\Big)$$
for $1 \le u \le 2$. This implies that $\{1 - (1-\alpha)t : 0 \le t \le \log 2 - 1/2\} \subset \Lambda_0(S)$ and if $z$ belongs to this set then $\arg(1-z) = \arg(1-\alpha)$. In particular this implies that $\mathrm{Ang}(\Lambda_0(S)) \ge \mathrm{Ang}(S)$. Moreover if $0 < \mathrm{Ang}(S) < \pi/2$ then one can show, as in section 31.3, that $z \notin \Gamma_P(S)$, and hence we have proved that there are elements of $\Lambda_0(S)$ that are not in $\Gamma_P(S)$.

166

The logarithmic spectrum

The elements of
0 (S) are of the form (1 )(u)/u, which arise naturally, as
AverageIntEqn
we saw in exercise 31.2. Let us suppose that S = [1, 1] and F(S). Then
1 + (t) = 2 for all t 1 and 1 + (t) 0 for all t 1. Now we know that
M
(u) = M (u) = u for 0 u 1. If M (t) 0 for all t < u then, by exercise
AverageIntEqn
31.2,
Z
Z
u

M (u t)(1 + )(t)dt 2

uM (u) =
0

Gamma0-11

M (u t)dt.
0

Exercise 33.6 Use this functional equation to prove that $M(u) > 0$ for all $u > 0$, or even $M(u) \gg ((2 + o(1))/u\log u)^u$. Deduce that $\Gamma_0([-1, 1]) = [0, 1]$.
33.2

Bounding 0 (S)

We are able to say much more about the structure of 0 (S) thanks to the following result:
HullOfG0(S)

Proposition 33.7 Suppose S is a closed subsetQof U with 1 S. Then 0 (S)


n
i
R, the closure of the convex hull of the points i=1 1+s
2 , for all n 1, and all
choices of points s1 , . . ., sn lying in he convex hull of S.
AngMaxRegion

By exercise 30.7 we know that the elements of K(S) are all convex linear
combinations of the points ei with = 0 or 2 || . Hence 0 (S) is a
Qn
ij
where 2 |j | , with
subset of the convex hull
of the points j=1 1+e2
HullOfG0(S)
n 0, by Theorem 33.7. Such a product has magnitude (cos )n cos if
n 1, and so 0 (S) is a subset of the convex hull of {1} {|z| cos }. Now,
if |z| cos then one can show that Ang(z) arcsin(|z|) 2 , and so it
follows that Ang(0 (S)) 2 =Ang(S). In the previous section we showed
that Ang(0 (S)) Ang(S), and so we can now deduce that
Ang(0 (S)) = Ang(S).
HullOfG0(S)

CompareTwoChi

Proof of Proposition 33.7 By exercise 28.5 with 0 (t) = 1 for all t 1 we


have that (u) equals
1 (u) +

X
1
k!

k
Y
1 + (ti )

ti

t1 ,...,tk 1
t1 +...+tk u i=1

k=1

1 (u t1 . . . tk )dt1 . . . dtk .

Integrating yields that M (u) equals


M1 (u)+

X
1
k!

k=1

k
Y
1 + (ti )

t1 ,...,tk 1
t1 +...+tk u i=1

ti

M1 (ut1 . . .tk )dt1 . . . dtk . (33.1)

We have shown that M1 (v) > 0 for all v > 0, so this is a linear combination of
elements of R, with non-negative coefficients. The sum of those coefficients is
M1 (u) +

X
1
k!

k=1

t1 ,...,tk 1
t1 +...+tk u

k
Y
2
M1 (u t1 . . . tk )dt1 . . . dtk ,
t
i=1 i

LinearCombo

Negative truncations

167

which equals M1 (u), that is the case that (t) = 1 for all t; that is (t) = 1 for
all t, and so M
1 (u) = u. Hence we have proved that M (u)/u, which equals the
LinearCombo
quantity in (33.1) divided by u, lies in the convex hull of R, as desired.
2
33.3 Negative truncations

In exercise 33.6 we saw that $\Gamma_0([-1, 1]) = [0, 1]$, which might mislead one into surmising that $\sum_{n\le N}f(n)/n \ge 0$ whenever $f \in F([-1, 1])$; however all one can deduce is that $\sum_{n\le N}f(n)/n \ge -o_N(\log N)$. In 1958 Haselgrove showed that $\sum_{n\le N}\lambda(n)/n$ gets negative, where $\lambda(p^k) = (-1)^k$, and recently it was shown [8] that the first such value is $N = 72185376951205$. Moreover the sum equals $-2.075\ldots\times10^{-9}$ when $N = 72204113780255$.
This leads to several questions:
What is the minimum possible value of $\sum_{n\le N}f(n)/n$ for each large $N$? For any $N$? To begin with we show that this is easily bounded below: If $g(n) = \sum_{d|n}f(d)$ then each $g(n) \ge 0$ and so
$$0 \le \sum_{n\le x}g(n) = \sum_{d\le x}f(d)\Big[\frac{x}{d}\Big] \le \sum_{d\le x}\Big(x\,\frac{f(d)}{d} + 1\Big),$$
and hence for any $f \in F([-1, 1])$ and any $N \ge 1$ we have
$$\sum_{n\le N}\frac{f(n)}{n} \ge -1.$$
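For the Liouville function the ingredients of this argument can be tabulated on a small range. The sketch below checks that $g(n) = \sum_{d|n}\lambda(d)$ is the indicator of the squares (so in particular $g \ge 0$) and that the partial sums $\sum_{n\le N}\lambda(n)/n$ stay above $-1$; in fact they stay positive on any range a computer can reach by brute force, since the first negative value occurs only near $7.2\times10^{13}$. The limit $10^4$ and the function names are illustrative choices.

```python
import math

def liouville_table(limit):
    """lambda(n) = (-1)^{Omega(n)} for 0 <= n <= limit, via a smallest-prime-factor sieve."""
    spf = list(range(limit + 1))
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:                              # p is prime
            for multiple in range(p * p, limit + 1, p):
                if spf[multiple] == multiple:
                    spf[multiple] = p
    lam = [0, 1] + [0] * (limit - 1)
    for n in range(2, limit + 1):
        lam[n] = -lam[n // spf[n]]                   # removing one prime factor flips the sign
    return lam

LIMIT = 10000
lam = liouville_table(LIMIT)

# g = 1 * lambda, computed by adding lambda(d) to every multiple of d
g = [0] * (LIMIT + 1)
for d in range(1, LIMIT + 1):
    for multiple in range(d, LIMIT + 1, d):
        g[multiple] += lam[d]

partial, smallest = 0.0, float("inf")
for n in range(1, LIMIT + 1):
    partial += lam[n] / n
    smallest = min(smallest, partial)
print(smallest)
```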

This can be somewhat improved:

Proposition 33.8 If $f \in F([-1, 1])$ and $g(n) = \sum_{d|n}f(d)$ then
$$\sum_{n\le x}\frac{f(n)}{n} = \frac{1}{x}\sum_{n\le x}g(n) + (1-\gamma)\frac{1}{x}\sum_{n\le x}f(n) + O\Big(\frac{(\log\log x)^2}{(\log x)^{2-\sqrt3}}\Big).$$

Proof Proceeding as above we have


X

g(n) =

nx

X
dx

nxo
hxi
X f (d) X
f (d)
=x

f (d)
,
d
d
d
dx

dx

and so for K = [log x], we have


K
X f (n) X
X
x

g(n) =
n
nx

nx

k=1 x/(k+1)<mx/k

We can rewrite each such sum as

x/k

f (m)
m

x
dt + O(x/K).
t2

168

x/k

The logarithmic spectrum

f (m)

x/(k+1) x/(k+1)<mt

x
dt
t2

!

Z x/k
Z x/k 
1X
x log log x
x
x
x

=
dt + O
dt
f (n)
t
x
k + 1 t2
k(log x)2 3 x/(k+1) t2
x/(k+1)
nx
 




X
k+1
1
log log x

=
f (n) log

+O
k
k+1
k(log x)2 3
nx

RealLipsch

by
PKexercise 14.2. Summing up over all k, 1 k K yields the result since
2
k=1 1/(k + 1) = log(K + 1) + 1 + O(1/K).
Exercise 33.9 Modify the above proof, using Corollary 14.1, to show that for any totally multiplicative $f$ with $|f(n)| \le 1$ we have the same estimate but with $1 - \gamma$ replaced by
$$c_t := (1+it)\int_1^\infty\frac{\{z\}}{z^{2+it}}\,dz, \quad\text{where } t = t_f(x, \log x).$$
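At $t = 0$ the constant reduces to $\int_1^\infty \{z\}/z^2\,dz$, which should equal $1 - \gamma$. A quick check, integrating exactly over each unit interval: on $[k, k+1]$ the integrand is $(z-k)/z^2$, whose integral is $\log\frac{k+1}{k} - \frac{1}{k+1}$. The truncation point $K$ and the function name are assumptions made for the sketch.

```python
import math

EULER_GAMMA = 0.5772156649015329

def frac_part_integral(K):
    """int_1^{K+1} {z}/z^2 dz, summed exactly over the unit intervals [k, k+1]."""
    return math.fsum(math.log1p(1.0 / k) - 1.0 / (k + 1) for k in range(1, K + 1))

approx = frac_part_integral(200000)
print(approx, 1.0 - EULER_GAMMA)
```

The neglected tail is at most $1/(K+1)$, which matches the $O(1/K)$ error in $\sum_{k\le K}1/(k+1) = \log(K+1) + \gamma - 1 + O(1/K)$.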

Proposition 33.10 There exists a constant $c > 0$ such that if $x$ is sufficiently large then there exists $f = f_x \in F([-1, 1])$ for which
$$\sum_{n\le x}\frac{f(n)}{n} \le -\frac{c}{\log x}.$$
Proof We discussed above that there exists an integer $N$ such that $\sum_{n\le N}\lambda(n)/n = -\delta$ for some $\delta > 0$. Now let $x > N^2$ be large and define $f(p) = 1$ if $x/(N+1) < p \le x/N$ and $f(p) = -1$ for all other $p$. If $n \le x$ then we see that $f(n) = \lambda(n)$ unless $n = p\ell$ for a (unique) prime $p \in (x/(N+1), x/N]$, in which case $f(n) = \lambda(\ell) = \lambda(n) + 2\lambda(\ell)$. Therefore
$$\sum_{n\le x}\frac{f(n)}{n} = \sum_{n\le x}\frac{\lambda(n)}{n} + 2\sum_{x/(N+1)<p\le x/N}\frac{1}{p}\sum_{\ell\le x/p}\frac{\lambda(\ell)}{\ell} = \sum_{n\le x}\frac{\lambda(n)}{n} - 2\delta\sum_{x/(N+1)<p\le x/N}\frac{1}{p} \sim -\frac{2\delta}{N\log x},$$
by the prime number theorem.

This next part needs editing:
5
Set u = px (1 f (p))/p. By Theorem 2 of A. Hildebrand [?] (with f there
being our function g, K = 2, K2 = 1.1, and z = 2) we obtain that
 
 X max(0, 1 g(p)) 
Y
1 
g(p) g(p2 )
1X
g(n) 
1
1+
+ 2 +. . . exp
x
p
p
p
p
nx

px

px

+ O(exp((log x) )),

Negative truncations

169

where is some positive constant and () = () with being the Dickman


function12 . Since max(0, 1 g(p)) (1 f (p))/2 we deduce that
1X
g(n)  (eu log x)(eu/2 (eu/2 )) + O(exp((log x) ))
x

(33.2)

nx

 eue

u/2

(log x) + O(exp((log x) )),

since () = +o() .
3
On the other hand, a special case of the main result in [?] implies that

1 X

(33.3)
f (n)  eu ,

x
nx

where = 0.32867 . . .. Combining Proposition 3.1 with (3.5) and (3.6) we immediately get that
(x) c/(log log x) for any < 2. This completes the proof
1
of Theorem 32.1.
Remark 33.11 The bound (3.5) is attained only in certain very special cases,
u
that is when there are very few primes p > xe
for which f (p) = 1 + o(1).
In this case
one
can
get
a
far
stronger
bound
than
(3.6).
Since the first part of
1
Theorem 32.1 depends on an interaction between these two bounds, this suggests
that one might be able to improve Theorem 1 significantly by determining how
(3.5) and (3.6) depend upon one another.
Now what about the class of all multiplicative functions, not necessarily
totally multiplicative, with values in [1, 1]? We will P
sketch a proof that we
have the same lower bound  1/(log log x)3/5 unless k1 (1 + f (2k ))/2k 
P
1/(log x)1/20 . Now nx f (n)/n 1 log 2 + o(1), with equality if and only
P
P
k
k
if D (f, f2 ; x) = o(1) where D (f, g; x) :=
px
k1 (1 (f g)(p ))/p , and
k
f2 (2 ) = 1 for all k 1, otherwise f2 (.) is totally multiplicative with f2 (p) =
f1 (p) for all p 3.
UseTotally2FSieved1

Exercise 33.12 Use the special case of exercise 14.8 and (14.1) to prove that


X f2 (n)
1X
log log x

= log 2
f1 (n) + O
.
n
x
(log x)2 3
nx
nx
UseTotally2
GenTruncPrecise

Combining exercises 14.8 and 33.9: Given |f (n)| P


1, let g is totally multiplicative with g(p) = f (p) for all primes p, and G(n) = d|n g(d) then
 x(log log x)2 
X f (n)
1X
1X

.
= C0 (f )
G(n) + (ct C0 (f ) t (f ))
g(n) + O
n
x
x
(log x)2 3

nx

12 The

u 1.

nx

3.5

nx

Dickman function is defined as (u) = 1 for u 1, and (u) = (1/u)

Ru
u1

(t)dt for

3.6

170

The logarithmic spectrum

If 1 f (n) 1 then t = 0 and so we have


 x(log log x)2 
X f (n)
1X
1X

.
= C0 (f )
G(n)+((1)C0 (f )0 (f ))
g(n)+O
n
x
x
(log x)2 3
nx
nx
nx
P
For nx f (n)
value, C0 (f ) must be small else
n to have a not-too-small negative
P
one can argue as above. If C0 (f ) is small then k1 (1 + f (2k ))/2k must be very
small and the main term comes from 0 (f ) times the mean value of g(n). We
can easily then show that the largest negative value comes from when g is close
to f1 .
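33.4 Convergence

We observe that if $\sum_n f(n)/n$ converges to a nonzero limit then $\sum_{p\le x} f(p)/p$ is bounded. To see this we begin by observing that, given $\epsilon > 0$, if $t \ge t_0$ then $|E(t)| \le \epsilon$ where $E(t) := \sum_{n>t} f(n)/n$. This implies, by partial summation, that if $N \ge t_0$ then
$$\sum_{n> N}\frac{f(n)}{n^{1+\frac{1}{\log x}}} = -\int_{t\ge N}\frac{dE(t)}{t^{\frac{1}{\log x}}} \ll \epsilon.$$
Hence if $\log^2 N < \epsilon \log x$ then
$$\sum_{n\le N}\frac{f(n)}{n} = \sum_{n\ge 1}\frac{f(n)}{n^{1+\frac{1}{\log x}}} + O(\epsilon),$$
and so the value of $\sum_n f(n)/n$ is simply the limit of $\sum_{n\ge 1} f(n)/n^{1+\sigma}$ as $\sigma \to 0^+$. Taking logarithms and limits (which is where we use that the limit is nonzero) this means that $\lim \sum_p f(p)/p^{1+\sigma}$ exists as $\sigma \to 0^+$. Now if $\sigma = 1/\log x$ then we have seen that this equals $\sum_{p\le x} f(p)/p + O(1)$. The result follows.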
33.5 Upper bounds revisited

Let us suppose that $t = t_f(x, T)$, and write $f(n) = \sum_{ab=n} a^{it} g(b)$.

Exercise 33.13 Show that $1/a^{1+it} = \int_{a-1/2}^{a+1/2} du/u^{1+it} + O((1 + |t|)^2/a^3)$, and deduce that if $x \ge A \ge 1 + |t|$ then
$$\sum_{x>a\ge A}\frac{1}{a^{1-it}} = \frac{x^{it} - A^{it}}{it} + O\Bigl(\frac{(1 + |t|)^2}{A^2}\Bigr).$$
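The first claim of Exercise 33.13 is the midpoint rule applied to $u^{-(1+it)}$. The following minimal Python sketch compares the sum over integers $A \le a \le B$ with the integral over $[A - 1/2, B + 1/2]$, so that the unit intervals around each integer tile the range exactly. The function name `midpoint_gap` and the sample values of `A`, `B`, `t` are illustrative assumptions, not part of the text.

```python
import cmath

def midpoint_gap(A, B, t):
    """Return |sum_{a=A}^{B} a^{-(1+it)} - int_{A-1/2}^{B+1/2} u^{-(1+it)} du|.

    Assumes integers 1 <= A <= B and a real t != 0.
    """
    s = complex(1.0, t)
    # Each term a^{-s} is computed as exp(-s * log a).
    total = sum(cmath.exp(-s * cmath.log(a)) for a in range(A, B + 1))

    def antiderivative(u):
        # d/du [ u^{-it} / (-it) ] = u^{-(1+it)}
        return cmath.exp(-1j * t * cmath.log(u)) / (-1j * t)

    integral = antiderivative(B + 0.5) - antiderivative(A - 0.5)
    return abs(total - integral)

if __name__ == "__main__":
    for A, t in [(20, 0.5), (50, 3.0), (200, 10.0)]:
        gap = midpoint_gap(A, 50 * A, t)
        print(A, t, gap, (1 + abs(t)) ** 2 / A ** 2)
```

Summing the per-term error $O((1+|t|)^2/a^3)$ over $a \ge A$ gives a total of size $O((1+|t|)^2/A^2)$, which is what the printed pairs compare.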

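We therefore deduce, for $A \ge (1 + |t|)^2$,
$$\sum_{n\le x}\frac{f(n)}{n} = \sum_{ab\le x}\frac{1}{a^{1-it}}\,\frac{g(b)}{b} = \sum_{b\le x/A}\frac{g(b)}{b}\sum_{a\le x/b}\frac{1}{a^{1-it}} + \sum_{a\le A}\frac{1}{a^{1-it}}\sum_{x/A<b\le x/a}\frac{g(b)}{b}$$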



xit X g(b)
g(b)
(1 + |t|)2 X |g(b)| X 1
=
1+it + O
+
it
b
b
x
(x/b)
a
=

bx/A

bx/A

aA

X
x/A<bx

|g(b)|
.
b

(3.2.2)

The second-to-last term is $\ll 1$. The last term is $\ll \frac{(\log A)^2}{\log x}\sum_{b\le x}\frac{|g(b)|}{b}$ by (9.2). Just taking absolute values above, with $A = (1 + |t|)^2$, we deduce when $T \le \log x$
$$\sum_{n\le x}\frac{f(n)}{n} \ll \Bigl(\frac{1}{|t|} + \frac{(\log\log x)^2}{\log x}\Bigr)\sum_{b\le x}\frac{|g(b)|}{b}.$$
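34
THE PÓLYA-VINOGRADOV INEQUALITY

By (10.3) we have for a primitive character $\chi \pmod q$,
$$\sum_{n=M+1}^{M+N}\chi(n) = \frac{1}{g(\overline{\chi})}\sum_{a \pmod q}\overline{\chi}(a)\sum_{n=M+1}^{M+N} e\Bigl(\frac{an}{q}\Bigr) = \frac{1}{g(\overline{\chi})}\sum_{a \pmod q}\overline{\chi}(a)\, e\Bigl(\frac{a(2M + N + 1)}{2q}\Bigr)\frac{\sin(\pi N a/q)}{\sin(\pi a/q)}.$$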

Taking absolute values, we obtain
$$\Bigl|\sum_{n=M+1}^{M+N}\chi(n)\Bigr| \le \frac{1}{\sqrt{q}}\sum_{(a,q)=1}\frac{1}{|\sin(\pi a/q)|} \ll \sqrt{q}\,\log q.$$

Exercise 34.1 Justify this last step. Indicate how one might improve this to $\le (2/\pi + o(1))\sqrt{q}\,\log q$.
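As a numerical illustration of the bound just obtained, here is a minimal Python sketch, assuming $q$ is an odd prime and taking the Legendre symbol as the primitive character. It compares the largest partial sum with $q^{-1/2}\sum_{a} 1/|\sin(\pi a/q)|$, and compares that sine sum with $(2/\pi)\, q\log q$. The function names are illustrative, not notation from the text.

```python
import math

def legendre(n, q):
    """Legendre symbol (n/q) for an odd prime q, via Euler's criterion."""
    r = pow(n % q, (q - 1) // 2, q)
    return -1 if r == q - 1 else r  # r is 0, 1 or q - 1

def max_partial_sum(q):
    """max over N < q of |sum_{n <= N} (n/q)|."""
    best = running = 0
    for n in range(1, q):
        running += legendre(n, q)
        best = max(best, abs(running))
    return best

def sine_sum(q):
    """sum_{a=1}^{q-1} 1/|sin(pi a / q)| (every a is coprime to a prime q)."""
    return sum(1.0 / abs(math.sin(math.pi * a / q)) for a in range(1, q))

if __name__ == "__main__":
    for q in (101, 1009, 10007):
        s = sine_sum(q)
        print(q, max_partial_sum(q), s / math.sqrt(q), s / (q * math.log(q)))
```

The last printed column drifts slowly towards $2/\pi \approx 0.6366$, which is the source of the constant in Exercise 34.1.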


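There are various ways one can develop the series above. The most useful is due to Pólya:

Exercise 34.2 Prove that if $0 < \alpha \le 1$ and $\chi$ is a primitive character mod $q$ then
$$\sum_{n\le \alpha q}\chi(n) = \frac{g(\chi)}{2i\pi}\sum_{1\le |n|\le N}\frac{\overline{\chi}(n)}{n}\bigl(1 - e(-n\alpha)\bigr) + O\Bigl(1 + \frac{q\log q}{N}\Bigr) \qquad (34.1)$$
for any $N \ge 1$. (Hint: Think: Fourier analysis.)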
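The expansion (34.1) can be checked numerically. The sketch below is a hedged illustration only: it assumes $q$ is an odd prime, uses the Legendre symbol (a real primitive character, so $\overline{\chi} = \chi$), and computes the Gauss sum $g(\chi) = \sum_a \chi(a)e(a/q)$ directly. The names `polya_rhs` and `char_sum` and the sample parameters are assumptions made for the example.

```python
import cmath
import math

def legendre(n, q):
    """Legendre symbol (n/q) for an odd prime q, via Euler's criterion."""
    r = pow(n % q, (q - 1) // 2, q)
    return -1 if r == q - 1 else r

def e(x):
    """e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)

def gauss_sum(q):
    return sum(legendre(a, q) * e(a / q) for a in range(1, q))

def polya_rhs(q, alpha, N):
    """Main term of (34.1): g(chi)/(2 i pi) * sum_{1<=|n|<=N} chi(n)(1 - e(-n alpha))/n."""
    total = 0
    for n in range(1, N + 1):
        for m in (n, -n):
            total += legendre(m, q) * (1 - e(-m * alpha)) / m
    return gauss_sum(q) * total / (2j * math.pi)

def char_sum(q, alpha):
    """Left-hand side: sum of chi(n) over 1 <= n <= alpha * q."""
    return sum(legendre(n, q) for n in range(1, math.floor(alpha * q) + 1))

if __name__ == "__main__":
    q, alpha = 23, 0.3
    for N in (100, 1000, 20000):
        print(N, char_sum(q, alpha), polya_rhs(q, alpha, N))
```

As $N$ grows the truncated series approaches the exact character sum, provided $\alpha q$ is not an integer (at a jump the Fourier series converges to the average of the two one-sided values).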


Exercise 34.3 Deduce that
$$\sum_{n<q/2}\chi(n) = \frac{g(\chi)}{2i\pi}\,(2 - \overline{\chi}(2))(1 - \chi(-1))\,L(1, \overline{\chi}) + O(1).$$

Exercise 34.4 Using (23.9) deduce that if $(b, r) = 1$ then
$$\sum_{n\le N}\frac{f(n)e(bn/r)}{n} = \sum_{d|r}\frac{f(d)}{d\,\phi(r/d)}\sum_{\psi \pmod{r/d}}\overline{\psi}(b)\,g(\psi)\sum_{n\le N/d}\frac{f(n)\overline{\psi}(n)}{n}.$$

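Exercise 34.5 Deduce that if $(b, r) = 1$ then
$$\sum_{n\le bq/r}\chi(n) = \frac{g(\chi)}{2i\pi}\Bigl\{(1 - \chi(-1))L(1, \overline{\chi}) + 2\chi(-1)\sum_{d|r}\frac{\overline{\chi}(d)}{d\,\phi(r/d)}\sum_{\substack{\psi \pmod{r/d}\\ (\chi\psi)(-1)=-1}}\overline{\psi}(b)\,g(\psi)\,L(1, \overline{\chi}\,\overline{\psi})\Bigr\} + O(1).$$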

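Let $X = x/\log^A x$. If $0 < \alpha \le 1$ then Dirichlet's approximation theorem tells us that there exists a rational number $b/r$ with $1 \le r \le X$ such that $|\alpha - b/r| \le 1/rX$. Therefore if $n \le R := 1/|\alpha - b/r|$ then $|e(\alpha n) - e(bn/r)| \ll n|\alpha - b/r|$; and otherwise $|e(\alpha n) - e(bn/r)| \ll 1$. Hence
$$\Bigl|\sum_{n\le x}\frac{f(n)(e(\alpha n) - e(bn/r))}{n}\Bigr| \ll \sum_{n\le \min\{R,x\}}|\alpha - b/r| + \sum_{\min\{R,x\}<n\le x}\frac{1}{n} \ll \log(1 + |\alpha - b/r|\,x) \ll \log\log x.$$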
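To make the use of Dirichlet's approximation theorem concrete, here is a small Python sketch that finds, by exhaustive search, a fraction $b/r$ with $1 \le r \le X$ and $|\alpha - b/r| \le 1/(rX)$. The search strategy and the name `dirichlet_approx` are assumptions made for illustration (continued fractions would be the efficient route); $X$ is taken to be a positive integer.

```python
from fractions import Fraction
import math

def dirichlet_approx(alpha, X):
    """Return (b, r) with 1 <= r <= X and |alpha - b/r| <= 1/(r*X).

    alpha may be a float or a Fraction; X is a positive integer.
    Exact rational arithmetic avoids rounding issues in the comparison.
    """
    alpha = Fraction(alpha)
    for r in range(1, X + 1):
        b = round(alpha * r)  # nearest integer to alpha * r
        if abs(alpha - Fraction(b, r)) <= Fraction(1, r * X):
            return b, r
    # Dirichlet's theorem guarantees a solution, so this is never reached.
    raise AssertionError("no approximation found")

if __name__ == "__main__":
    b, r = dirichlet_approx(math.pi - 3, 100)
    print(b, r, abs((math.pi - 3) - b / r), 1 / (r * 100))
```

Because the search returns the smallest admissible $r$, the fraction $b/r$ it produces is automatically in lowest terms.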


Select $\psi_1 \pmod{r/d_1}$ as that character with conductor dividing $r$ for which $M = M_{f\overline{\psi}_1}(x, \log x)$ is minimal.$^{13}$ We now bound the contribution of the other terms in exercise 34.4. For the other characters $\psi_j$ we know that $M_{f\overline{\psi}_j}(x, \log x) \ge (2/3 - o(1))\log\bigl(\frac{\log x}{\log r}\bigr) + O(1)$ by Proposition 24.1; and that if $k$ is sufficiently large then $M_{f\overline{\psi}_j}(x, \log x) \ge (1 - \epsilon)\log\bigl(\frac{\log x}{\log r}\bigr) + O(1)$ by Proposition 24.2. Substituting these bounds into (8.6), and then the bounds from there into exercise 34.4, we obtain$^{14}$
$$\sum_{n\le x}\frac{f(n)e(bn/r)}{n} - \frac{f(d_1)\,\overline{\psi}_1(b)\,g(\psi_1)}{d_1\,\phi(r/d_1)}\sum_{n\le x/d_1}\frac{f(n)\overline{\psi}_1(n)}{n} \ll r^{1/2}(\log x)^{1/3} + r^{1/2}(r\log x)^{o(1)}.$$

34.1 A lower bound on distances

When $\chi$ has given order $g > 1$, we wish to bound
$$D(\chi(n), \psi(n)n^{it}, x)^2 = \sum_{p\le x}\frac{1 - \mathrm{Re}\bigl((\chi\overline{\psi})(p)/p^{it}\bigr)}{p}$$
from below, where $|t| < (\log x)^2$. The smallest the $p$th term can be, for given $\psi(p)$ and $p^{it}$, is when $\chi(p)$ is that $g$th root of unity nearest to $\psi(p)p^{it}$. If $\psi$ is a character mod $r$ the Siegel-Walfisz Theorem tells us that there are roughly equal numbers of primes $p \equiv h \pmod r$ for each $(h, r) = 1$ in the interval $[z, z + z/(\log z)^{3A}]$ provided $\log\log z > (1/A)\log\log x$. If $\psi$ has order $k$ we may write each $\psi(p) = e(\ell/k)$, the $\ell$ depending on the arithmetic progression that $p$ belongs to mod $k$. Also $p^{it} = z^{it} + o(1) = e^{2i\pi\theta} + o(1)$ where $\theta := (t\log z)/2\pi$. Hence
$$\sum_{p}\frac{1 - \mathrm{Re}\bigl((\chi\overline{\psi})(p)/p^{it}\bigr)}{p} \ge \sum_{p}\frac{1}{p}\Bigl(\frac{1}{k}\sum_{\ell=0}^{k-1}\min_{0\le a\le g-1}\Bigl\{1 - \cos 2\pi\Bigl(\frac{a}{g} + \frac{\ell}{k} + \theta\Bigr)\Bigr\} + o(1)\Bigr)$$
where the sum is over the primes in $[z, z + z/(\log z)^{3A}]$.


13 This is not quite correct. We need to work ex 31.4 by writing it in terms of primitive
characters and then use those. Nonetheless the calculations done here are the correct ones.
14 Can we improve the last term using the Pretentious large sieve?


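Exercise 34.6 Show that if $L = [g, k]$ and $L/g$ is even then
$$\frac{1}{k}\sum_{\ell=0}^{k-1}\min_{0\le a\le g-1}\Bigl\{1 - \cos 2\pi\Bigl(\frac{a}{g} + \frac{\ell}{k} + \theta\Bigr)\Bigr\} = 1 - \frac{g\,\sin(\pi/g)}{L\,\sin(\pi/L)}\cos\Bigl(\frac{2\pi}{L}\Bigl(\{L\theta\} - \frac{1}{2}\Bigr)\Bigr).$$
Show that if $L/g$ is odd then we replace $\{L\theta\} - 1/2$ by $\{L\theta\}$.

Exercise 34.7 Deduce that if $L/g$ is even then the mean of this last function, for $\theta \in [0, u]$, is $1 - \frac{g}{\pi}\sin\frac{\pi}{g}$ for all $u$. Deduce that if $r < (\log x)^2$ with $A > 1/\epsilon$ and $\psi$ has even order $k$ then
$$\frac{D(\chi(n), \psi(n)n^{it}, x)^2}{\log\log x} \ge 1 - \frac{g}{\pi}\sin\frac{\pi}{g} - O(\epsilon).$$
(Also deduce that if $L/g$ is odd and $D(\chi(n), \psi(n)n^{it}, x)^2 = o(\log\log x)$ then $D(\chi(n), \psi(n), x)^2 = o(\log\log x)$ and $g = L$ (i.e. $k$ divides $g$).)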
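The constant $1 - \frac{g}{\pi}\sin\frac{\pi}{g}$ of Exercise 34.7 can be sanity-checked numerically in the simplest case, where $\psi$ is trivial ($k = 1$) and only the minimum over $g$th roots of unity remains. The Python sketch below averages $\min_a\{1 - \cos 2\pi(a/g + \theta)\}$ over $\theta \in [0,1)$ with a midpoint Riemann sum; restricting to $k = 1$, the function names and the step count are assumptions made for this illustration.

```python
import math

def distance_constant(g):
    """1 - (g/pi) * sin(pi/g)."""
    return 1.0 - (g / math.pi) * math.sin(math.pi / g)

def averaged_min(g, steps=20000):
    """Midpoint Riemann sum over theta in [0, 1) of
    min over 0 <= a < g of (1 - cos(2 pi (a/g + theta)))."""
    total = 0.0
    for j in range(steps):
        theta = (j + 0.5) / steps
        total += min(1.0 - math.cos(2 * math.pi * (a / g + theta))
                     for a in range(g))
    return total / steps

if __name__ == "__main__":
    for g in (2, 3, 5, 7):
        print(g, distance_constant(g), averaged_min(g))
```

For $g = 3$ the constant is roughly $0.173$, and it tends to $0$ as $g$ grows, so the saving over the trivial bound is largest for characters of small order.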
We deduce from the above that if ($g \ge 3$ is odd and)




X (n)e(bn/r)
g

r

 1
(log x) sin g +o(1) + r1/2 (r log x)o(1) .


n
rd1 (r)
nx

We apply this bound when $r \le \log x$. By partial summation on (23.6) for the sum between $r^{1+\epsilon}$ and $x$, for $x \ge r^2$, we obtain
$$\sum_{n\le x}\frac{f(n)e(bn/r)}{n} \ll \log r + \frac{\log x}{\sqrt{\phi(r)}} + \log\log x.$$
We use this bound for $r > \log x$.
Combining the above (and this needs tidying up) we obtain that if $\chi$ is a primitive character of order $g$ then, by (34.1), for any $N \ge 1$ we have
$$\sum_{n\le N}\chi(n) \ll \sqrt{q}\,(\log q)^{\frac{g}{\pi}\sin\frac{\pi}{g} + o(1)}. \qquad (34.2)$$
We believe that this exponent is best possible with this method (this needs some explanation!).
34.2 Using the Pretentious Generalized Riemann Hypothesis
