
18.330 Lecture Notes: Chebyshev Spectral Methods

Homer Reid

April 29, 2014

Contents

1 The question
2 The classical answer
3 The modern answer for periodic functions
4 The modern answer for non-periodic functions
5 Chebyshev polynomials
6 Chebyshev spectral methods

1 The question

In these notes we will concern ourselves with the following basic question: Given a function f(x) on an interval x ∈ [a, b],

1. How accurately can we characterize f using only samples of its value at N sample points {x_n} in the interval [a, b]?

2. What is the optimal way to choose the N sample points {x_n}?

What does it mean to characterize a function f(x) over an interval [a, b]? There are at least three possible answers:
1. We may want to evaluate the integral ∫_a^b f(x) dx. In this case, the problem of characterizing f from N function samples is the problem of designing an N-point quadrature rule.
2. We may want to evaluate the derivative of f at each of our sample points using the information contained in the sample values. This is the problem of constructing a differentiation stencil, and it arises when we try to solve ODEs or PDEs: in that case we are trying to reconstruct f(x) given knowledge of its derivative, so generally upon constructing the differentiation stencil we will want to invert it.
3. We may want to construct an interpolant f^interp(x) that agrees with f(x) at the sample points but smoothly interpolates between those points in a way that mimics the original function f(x) as closely as possible. For example, f(x) may be the result of an experimental measurement or of a costly numerical calculation, and we might want to accelerate the calculation of f(x) at arbitrary values of x by precomputing f(x_n) at just the sample points {x_n} and then interpolating to get values at intermediate points x.
In a sense, the first half of our course was devoted to studying the answer
to this question furnished by classical numerical analysis, while the second half
has been focused on the modern answer. Let's begin by reviewing what the classical approach had to offer.

2 The classical answer

Classical numerical analysis answers the question of how to choose the sample
points {xn } in the simplest possible way: We simply take the sample points to
be evenly spaced throughout the interval [a, b]:¹

$$ x_n = a + n\Delta, \qquad n = 0, 1, \cdots, N, \qquad \Delta = \frac{b-a}{N}. $$

In this case:

• The quadrature rules one obtains are the usual Newton-Cotes quadrature rules, which we studied in the first and second weeks of our course. These work by fitting polynomials through the function samples and then integrating those polynomials to approximate the integral of the function.
• The differentiation stencils one obtains are the usual finite-difference stencils, which we studied in the third and fourth weeks of our course. These may again be interpreted as a form of polynomial interpolation: we are essentially constructing and differentiating a low-degree approximation to the Taylor-series polynomial.
• The interpolant one constructs is the unique Nth-degree polynomial P^interp(x) that agrees with the values of the underlying function f(x) at the N+1 sample points. Although we didn't get to this in the first unit of our course, it turns out to be easy to write down a formula for this polynomial in terms of the sample points {x_n} and the values of f at those points, {f_n} ≡ {f(x_n)}. For example, for the cases N = 1, 2, 3 we have²

$$ P_1^{\rm interp}(x) = f_1 \frac{(x-x_2)}{(x_1-x_2)} + f_2 \frac{(x-x_1)}{(x_2-x_1)} $$

$$ P_2^{\rm interp}(x) = f_1 \frac{(x-x_2)(x-x_3)}{(x_1-x_2)(x_1-x_3)} + f_2 \frac{(x-x_1)(x-x_3)}{(x_2-x_1)(x_2-x_3)} + f_3 \frac{(x-x_1)(x-x_2)}{(x_3-x_1)(x_3-x_2)} $$

$$ P_3^{\rm interp}(x) = f_1 \frac{(x-x_2)(x-x_3)(x-x_4)}{(x_1-x_2)(x_1-x_3)(x_1-x_4)} + f_2 \frac{(x-x_1)(x-x_3)(x-x_4)}{(x_2-x_1)(x_2-x_3)(x_2-x_4)} + f_3 \frac{(x-x_1)(x-x_2)(x-x_4)}{(x_3-x_1)(x_3-x_2)(x_3-x_4)} + f_4 \frac{(x-x_1)(x-x_2)(x-x_3)}{(x_4-x_1)(x_4-x_2)(x_4-x_3)} $$

The formula of this type for general N is called the Lagrange interpolation formula; it constructs an Nth-degree polynomial passing through N+1 fixed data points (x_n, f_n).
¹ Technically we have here a set of N+1 points, not N points as we stated above.

² Do you see the pattern here? The general expression for P_N includes one term for each sample point x_m. The numerator of this term is a product of N linear factors which are constructed to ensure that the numerator vanishes whenever x equals one of the other sample points (x = x_n, n ≠ m). The denominator of this term is just a constant chosen to replicate the value of the numerator at x = x_m, which ensures that the fraction evaluates to 1 at x = x_m. Then we just multiply by f_m to obtain a term which yields f_m at x_m and vanishes at all the other sample points. Summing all such terms for each sample point, we obtain an Nth-degree polynomial which yields f_n at each sample point x_n.
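To make the pattern in footnote 2 concrete, here is a minimal Julia sketch of the Lagrange interpolation formula; the function name lagrange_interp and the test data are illustrative choices of ours, not anything defined in the notes.

function lagrange_interp(xs::AbstractVector, fs::AbstractVector, x)
    P = zero(float(x))
    for m in eachindex(xs)
        term = float(fs[m])
        for n in eachindex(xs)
            n == m && continue
            # this factor vanishes at x = xs[n] and equals 1 at x = xs[m]
            term *= (x - xs[n]) / (xs[m] - xs[n])
        end
        P += term   # contributes fs[m] at xs[m], zero at the other nodes
    end
    return P
end

# Usage: the unique cubic through 4 samples of f(x) = x^3 is x^3 itself
xs = [-1.0, -0.5, 0.5, 1.0]
fs = xs .^ 3
@assert abs(lagrange_interp(xs, fs, 0.3) - 0.3^3) < 1e-12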

Performance of the classical approach on general functions


How well does the classical approach work?

• Integration: If we divide our interval into N subintervals and approximate the integral over each subinterval using a pth-order Newton-Cotes quadrature rule, then we saw in Unit 1 that for general functions the error decays like 1/N^{p+1}, i.e. algebraically with N (as opposed to exponentially with N).
• Differentiation: If we estimate derivative values via a pth-order finite-difference stencil using function samples at points spaced by multiples of Δ, then the error decays like Δ^p, or like 1/N^p. [For example, the forward finite-difference approximation f′(x) ≈ (f(x+Δ) − f(x))/Δ has error proportional to Δ, while the centered finite-difference f′(x) ≈ (f(x+Δ) − f(x−Δ))/(2Δ) has error proportional to Δ².] Thus here again we find convergence algebraic in N, not exponential in N. (A quick numerical check of these scalings appears just after this list.)
• Interpolation: Polynomial interpolation in evenly-spaced sample points is a notoriously badly-behaved procedure due to the Runge phenomenon (we will discuss it briefly in an appendix). The Runge phenomenon is so severe that, in some cases, the polynomial interpolant through N evenly-spaced function samples doesn't just converge slowly as N → ∞. It doesn't converge at all!
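Here is the quick numerical check of the differentiation error scalings promised above, a hedged Julia sketch with sin(x) as an arbitrary test function of our choosing:

f(x)  = sin(x)
fp(x) = cos(x)                  # exact derivative, for comparison
for Δ in (0.1, 0.05, 0.025)
    fwd = (f(1 + Δ) - f(1)) / Δ            # error ∝ Δ
    ctr = (f(1 + Δ) - f(1 - Δ)) / (2Δ)     # error ∝ Δ²
    println("Δ=$Δ: forward err=$(abs(fwd - fp(1))), centered err=$(abs(ctr - fp(1)))")
end
# halving Δ roughly halves the forward error but quarters the centered error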
To summarize the results of the classical approach:

    Classical approach: To characterize a function over an interval using N function samples, choose the sample points to be evenly spaced and construct polynomial interpolants. The approach in general yields convergence algebraic in N for integration and differentiation, but does not converge for interpolation of some functions.

Performance of the classical approach on periodic functions


However, as we saw already in PSet 1, there is one exception to the general rule of algebraic convergence: if the function we are integrating is periodic over the interval in question, then simple Newton-Cotes quadrature using evenly-spaced function samples achieves convergence exponential in N (although differentiation and interpolation continue to behave as above even for periodic functions). This observation forms the basis of the modern approach, to which we now turn.

3 The modern answer for periodic functions

The classical approach (use evenly-spaced function samples and construct polynomials) yields slow convergence in general and non-convergence (of the polynomial interpolant) in some cases.

The modern approach, for periodic functions, retains the evenly-spaced sample points of the classical approach but throws out the idea of using polynomials to interpolate them, choosing instead to construct trigonometric interpolants consisting of linear combinations of sinusoids of various frequencies.³
Performance of the modern approach for periodic functions

The performance of the modern approach for periodic functions follows logically by aggregating a series of observations we made in our discussion of Fourier analysis:

• If a function f(t) is periodic with period T, then it has a Fourier-series representation of the form

$$ f(t) = \sum_{n=-\infty}^{\infty} \widetilde f_n\, e^{i n \omega_0 t}, \qquad \omega_0 = \frac{2\pi}{T}. $$

The upshot:

    Modern approach, periodic functions: To characterize a periodic function over an interval using N function samples, choose the sample points to be evenly spaced throughout the interval and construct a trigonometric interpolant consisting of a sum of N sinusoids. The approach in general yields convergence exponential in N for integration, differentiation, and interpolation.

³ Linear combinations of sinusoids like Σ_n [a_n sin nω₀t + b_n cos nω₀t] are sometimes called trigonometric polynomials, since they are in fact polynomials in the variable e^{iω₀t}, but I personally find this terminology a little confusing.

4 The modern answer for non-periodic functions

The modern answer to the characterization problem (sample at evenly-spaced points and construct a trigonometric interpolant) works very well for periodic functions. What do we do if we have a non-periodic function? Easy: we make it into a periodic function. When you have such a powerful hammer, treat everything like a nail! Let's review how this construction works.

Construct a smooth periodic version of f(x)

To construct a periodic function out of a non-periodic function f(x), we restrict our attention to the interval x ∈ [−1, 1] (if you need to consider a different interval, just shift and scale variables accordingly) and define

$$ g(\theta) = f(\cos\theta). $$

This is a smooth⁴ periodic function. As θ varies from 0 to π, g(θ) traces out the behavior of f(x) over the interval [−1, 1] [that is, g(θ) traces out f(x) backwards]. When θ crosses π and continues on to 2π, g(θ) turns around and begins to retrace its steps, going backwards over the same terrain it covered between θ = 0 and π. Figure 1 (which also appeared in our notes on Clenshaw-Curtis quadrature) shows an example of a non-periodic function f(x) and the periodic function g(θ) that captures the behavior of f over the interval [−1, 1]. (A tiny numerical illustration follows.)
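A two-line numerical illustration of the construction in Julia; the particular f here is an arbitrary smooth, non-periodic test function of our choosing:

f(x) = exp(x) * sin(3x)      # any smooth, non-periodic function on [-1, 1]
g(θ) = f(cos(θ))             # the periodized version
@assert abs(g(0.7) - g(0.7 + 2π)) < 1e-12   # g is 2π-periodic
@assert abs(g(0.7) - g(-0.7)) < 1e-12       # g is even in θ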

Write down a Fourier cosine series for g(θ)

Because g(θ) is 2π-periodic and even, it has a Fourier cosine series of the form

$$ g(\theta) = \frac{\widetilde a_0}{2} + \sum_{\ell=1}^{\infty} \widetilde a_\ell \cos(\ell\theta) \qquad (1) $$

with coefficients

$$ \widetilde a_\ell = \frac{2}{\pi} \int_0^{\pi} g(\theta)\cos(\ell\theta)\, d\theta. \qquad (2) $$

Sample g(θ) at N+1 evenly-spaced points and construct an interpolant

Now consider sampling the function g(θ) at N+1 evenly-spaced points distributed throughout the interval [0, π], including the endpoints:

$$ g_n \equiv g(n\Delta) = g\Big(\frac{n\pi}{N}\Big), \qquad n = 0, 1, \cdots, N. \qquad (3) $$

⁴ Assuming f is smooth. The construction of the g function doesn't do anything to smooth out discontinuities in f or any of its derivatives; it only smoothes out the discontinuities arising from the mismatch at the endpoints.

Figure 1: (a) A function f(t) that we want to integrate over the interval [−1, 1]. (b) The function g(θ) = f(cos θ). Note the following facts: (1) g(θ) is periodic with period 2π. (2) g(θ) is an even function of θ. (3) Over the interval 0 ≤ θ ≤ π, g(θ) traces out the behavior of f(t) as t varies from 1 to −1 [i.e. g(θ) traces out f(t) backwards]. However, (4) g(θ) knows nothing about what f(t) does outside the range −1 < t < 1, which can make it a little tricky to compare the two plots. For example, g(θ) has local minima at θ = 0, π even though f(t) does not have local minima at t = −1, 1.

The discrete Fourier transform of the set of samples {g_n} yields a set of Fourier coefficients {g̃_ℓ}:

$$ \{g_n\} \xrightarrow{\text{DFT}} \{\widetilde g_\ell\} $$

From the {g̃_ℓ} coefficients we can reconstruct the original {g_n} samples through the magic of the inverse DFT:

$$ \{\widetilde g_\ell\} \xrightarrow{\text{IDFT}} \{g_n\} $$

where the specific form of the reconstruction is

$$ g_n = \sum_{\ell=0}^{N} \widetilde g_\ell\, e^{i\ell\theta_n}, \qquad \theta_n = \frac{n\pi}{N}. \qquad (4) $$

Now proceeding exactly as in our discussion of trigonometric interpolation, we continue equation (4) from the integer variable n to a real-valued variable θ:

$$ g^{\rm interp}(\theta) = \sum_{\ell=0}^{N} \widetilde g_\ell\, e^{i\ell\theta}. \qquad (5) $$

Note that g^interp(θ) is (in general) not the same function as the original g(θ); the difference is that the sum in (5) is truncated at ℓ = N, whereas the Fourier series for the full function g(θ) will in general contain infinitely many terms.
The form of (5) may be simplified by noting that, because g(θ) is an even function of θ, its Fourier series includes only cosine terms:

$$ g^{\rm interp}(\theta) = \frac{\widetilde a_0}{2} + \sum_{\ell=1}^{N/2} \widetilde a_\ell \cos(\ell\theta) \qquad (6) $$

where the ã_ℓ coefficients are related to the g̃_ℓ coefficients computed by the DFT according to

$$ \widetilde a_0 = 2\widetilde g_0, \qquad \widetilde a_\ell = (\widetilde g_\ell + \widetilde g_{-\ell}) = 2\widetilde g_\ell. $$

[The last equality here follows from the fact that, for an even function g(θ), the Fourier-series coefficients for positive and negative ℓ are equal, g̃_{−ℓ} = g̃_ℓ.]

The procedure we have outlined above uses general DFT techniques for computing the numbers ã_ℓ. In this particular case, because g(θ) is an even function, it is possible to accelerate the calculation by a factor of 4 using the discrete cosine transform, a specialized version of the discrete Fourier transform. We won't elaborate on this detail here beyond the short sketch below.
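For the curious, here is one way the cosine coefficients might be computed in Julia with a type-I DCT. This is a hedged sketch: it assumes the FFTW.jl package (whose REDFT00 transform matches the endpoint-inclusive sampling used here), and the test function is our own.

using FFTW   # assumes the FFTW.jl package is installed

N  = 16
θs = (0:N) .* (π / N)                        # N+1 evenly-spaced samples of [0, π]
g  = cos.(3 .* θs) .+ 0.5 .* cos.(5 .* θs)   # test function: ã₃ = 1, ã₅ = 0.5
ã  = FFTW.r2r(g, FFTW.REDFT00) ./ N          # DCT-I; dividing by N matches (2)
println(round.(ã[1:8]; digits = 10))          # ã[ℓ+1] ≈ coefficient of cos(ℓθ)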

Express g^interp(θ) in terms of the variable x

Finally, let's now ask what equation (6) looks like in terms of the original variable x. If we recall the original definition

$$ g(\theta) \equiv f(\cos\theta) \qquad (7) $$

we can manipulate this to read

$$ f(x) = g(\arccos x). \qquad (8) $$

Now plugging in the approximation (6) yields an approximation to f:

$$ f^{\rm interp}(x) = \frac{\widetilde a_0}{2} + \sum_{n=1}^{N/2} \widetilde a_n \cos(n \arccos x). \qquad (9) $$

Equation (9) would appear at first blush to define a horribly ugly function of x. It took the twisted⁵ genius of the Russian mathematician P. L. Chebyshev to figure out that in fact equation (9) defines a polynomial function of x. To understand how this could possibly be the case, we must now make a brief foray into the world of the Chebyshev polynomials.

⁵ We intend this adjective in the most admiring possible sense.

5 Chebyshev polynomials

Trigonometric definition

The definition of the Chebyshev polynomials is inspired by the observation, from high-school trigonometry, that cos(nθ) is a polynomial in cos θ for any n. For example,

$$ \cos 2\theta = 2\cos^2\theta - 1 $$
$$ \cos 3\theta = 4\cos^3\theta - 3\cos\theta $$
$$ \cos 4\theta = 8\cos^4\theta - 8\cos^2\theta + 1 $$

The polynomials on the RHS of these equations define the Chebyshev polynomials for n = 2, 3, 4. More generally, the nth Chebyshev polynomial T_n(x) is defined by the equation

$$ \cos n\theta = T_n(\cos\theta) $$

and the first few Chebyshev polynomials are

$$ T_0(x) = 1 $$
$$ T_1(x) = x $$
$$ T_2(x) = 2x^2 - 1 $$
$$ T_3(x) = 4x^3 - 3x $$
$$ T_4(x) = 8x^4 - 8x^2 + 1. $$
Figure 2 plots the first several Chebyshev polynomials. Notice the following important fact: for all n and all x ∈ [−1, 1], we have −1 ≤ T_n(x) ≤ 1. This boundedness property of the Chebyshev polynomials turns out to be quite useful in practice.

On the other hand, the Chebyshev polynomials are not bounded between −1 and 1 for values of x outside the interval [−1, 1] (nor, being polynomials, could they possibly be). Figure 3 shows what happens to T_15(x) as soon as we get even the slightest little bit outside the range x ∈ [−1, 1]: the polynomial takes off to ±∞. In almost all situations involving Chebyshev polynomials we will be interested in their behavior within the interval [−1, 1].
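For numerical work it is handy to evaluate T_n(x) without any trigonometry, via the standard three-term recurrence T_{n+1}(x) = 2x T_n(x) − T_{n−1}(x) (the recurrence is standard but not derived in these notes). A Julia sketch, with a check against the trigonometric definition:

function chebT(n::Integer, x)
    n == 0 && return one(x)
    Tprev, T = one(x), x          # T₀(x) = 1, T₁(x) = x
    for _ in 2:n
        Tprev, T = T, 2x * T - Tprev
    end
    return T
end

θ = 0.4
@assert abs(chebT(15, cos(θ)) - cos(15θ)) < 1e-12   # T₁₅(cos θ) = cos 15θ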

Completeness and Orthogonality

The Chebyshev polynomials constitute our first example of an orthogonal family of polynomials. We will have more to say about this idea later, but for the time being the salient points are the following:

1. The Chebyshev polynomials are complete: any Nth-degree polynomial can be expressed exactly (and uniquely) as a linear combination of T_0(x), T_1(x), …, T_N(x). Thus the set of N+1 functions {T_n}, n = 0, …, N, forms a basis of the (N+1)-dimensional vector space of polynomials of degree at most N.

Figure 2: The Chebyshev polynomials T_0(x) through T_4(x), and T_15(x), plotted over the interval [−1, 1].

Figure 3: The Chebyshev polynomials T_n(x) take off to ±∞ for values of x outside the range [−1, 1]. Shown here is the case T_15(x).

2. The Chebyshev polynomials are orthogonal with respect to the following inner product:⁶

$$ \langle f, g\rangle \equiv \int_{-1}^{1} \frac{f(x)\,g(x)\,dx}{\sqrt{1-x^2}}. $$

Orthogonality means that if we insert T_n and T_m in the inner product we get zero unless n = m:

$$ \langle T_n, T_m \rangle = \frac{\pi}{2}\,\delta_{nm}. \qquad (10) $$

(Strictly speaking the n = m = 0 case is special, ⟨T_0, T_0⟩ = π rather than π/2; we will quietly account for this extra factor of 2 where it matters below.)
Taken together, these two properties furnish a convenient way to represent arbitrary functions as linear combinations of Chebyshev polynomials. The first property tells us that, given any function f(x), we can write f(x) in the form

$$ f(x) = \sum_{n=0}^{\infty} C_n T_n(x). \qquad (11) $$

The second property gives us a convenient way to extract the C_n coefficients: just take the inner product of both sides of (11) with T_m(x). Because of orthogonality (equation 10), every term on the RHS dies except for the one involving C_m, and we find

$$ \langle f, T_m\rangle = \frac{\pi}{2}\, C_m $$

[where the π/2 factor here comes from equation (10)]. In other words, the Chebyshev expansion coefficients of a general function f(x) are

$$ C_m = \frac{2}{\pi} \int_{-1}^{1} \frac{f(x)\, T_m(x)}{\sqrt{1-x^2}}\, dx. \qquad (12) $$

Equations (11) and (12) amount to what we might refer to as the forward and inverse Chebyshev transforms of a function f(x).

⁶ An inner product on a vector space V is just a rule that assigns a real number to any pair of elements in V. (Mathematicians would say it is a map V × V → ℝ.) The rule has to be linear (the inner product of a linear combination is a linear combination of the inner products) and non-degenerate, meaning no non-zero element has vanishing inner product with itself.

6 Chebyshev spectral methods

Chebyshev spectral methods furnish the second half of the modern solution to
the problem we posed at the beginning of these notes, namely, how best to
characterize a function using samples of its value at N points.
Recall that the first half of the modern solution went like this:

    Modern approach, periodic functions: To characterize a periodic function over an interval using N function samples, choose the sample points to be evenly spaced throughout the interval and construct a trigonometric interpolant consisting of a sum of N sinusoids. The approach in general yields convergence exponential in N for integration, differentiation, and interpolation.

The second half of the modern solution now reads like this:

    Modern approach, non-periodic functions: To characterize a non-periodic function over an interval using N function samples, map the interval into [−1, 1], choose the sample points to be Chebyshev points, and construct a polynomial interpolant consisting of a sum of N Chebyshev polynomials. The approach in general yields convergence exponential in N for integration, differentiation, and interpolation.
Let's now investigate how Chebyshev spectral methods work for each of the various aspects of the characterization problem we considered above.

Chebyshev approximation
As we saw previously, a function f(x) on the interval [−1, 1] may be represented exactly as a linear combination of Chebyshev polynomials:

$$ f(x) = \sum_{n=0}^{\infty} C_n T_n(x). \qquad (13) $$

One way to obtain a formula for the C_n coefficients in this expansion is to take the inner product of both sides with T_m(x) and use the orthogonality of the T functions:

$$ C_m = \frac{\langle f, T_m\rangle}{\langle T_m, T_m\rangle} = \frac{2}{\pi}\int_{-1}^{1} \frac{f(x)\,T_m(x)}{\sqrt{1-x^2}}\,dx. \qquad (14) $$

However, there are better ways to compute these coefficients, as discussed below.

If we restrict the sum in (13) to include only its first N+1 terms, we obtain an approximate representation of f(x), the Nth Chebyshev approximant:

$$ f^{\rm approx}(x) = \sum_{n=0}^{N} C_n T_n(x). \qquad (15) $$

Chebyshev interpolation

The coefficients C_n in formula (15) for the Chebyshev approximant may be computed using the integral formula (14), but there are easier ways to get them. These are based on the following alternative characterization of (15):

    The Nth Chebyshev approximant (15) is the unique Nth-degree polynomial that agrees with f(x) at the N+1 Chebyshev points x_n = cos(nπ/N), n = 0, 1, …, N.

Thus, when we construct (15), we are really constructing an interpolant that smoothly connects N+1 samples of f(x) evaluated at the Chebyshev points. In particular, the values of f at the Chebyshev points are the only data we need to construct f^approx in (15). This is not obvious from expression (14), which would seem to suggest that we need to know f throughout the interval [−1, 1].
How do we use this characterization of (15) to compute the Chebyshev expansion coefficients {C_n} in (15)? There are at least two ways to proceed:

1. We could use the Lagrange interpolation formula to construct the unique Nth-degree polynomial running through the data points {x_n, f(x_n)} for the N+1 Chebyshev points x_n = cos(nπ/N), n = 0, 1, …, N.

2. We could observe that the C_n coefficients are the coefficients in the Fourier cosine series of the even 2π-periodic function g(θ) = f(cos θ). The samples of g(θ) at evenly-spaced points, g(nπ/N), are precisely the samples of f(x) at the Chebyshev points cos(nπ/N), and the Fourier cosine-series coefficients may be computed by taking the discrete cosine transform of the set of numbers {f_n}:

$$ \{f_n\} \xrightarrow{\text{DCT}} \{C_n\} $$

where

$$ f_n = f\Big(\cos \frac{n\pi}{N}\Big), \qquad n = 0, 1, \cdots, N. $$

Option 1 here is discussed in Trefethen, Spectral Methods in MATLAB, Chapter 6 (see particularly Exercise 6.1).

Here we will focus on option 2. The numbers C_n are just the Fourier cosine-series coefficients of g(θ), i.e. the numbers we called ã_ℓ in equation (2):

$$ C_n = \frac{2}{\pi} \int_0^{\pi} f(\cos\theta)\cos(n\theta)\,d\theta. $$

We compute the integral using a simple (N+1)-point trapezoidal rule:

$$ C_n = \frac{2}{N}\Big[\frac12 f_0 + f_1\cos\frac{n\pi}{N} + f_2\cos\frac{2n\pi}{N} + \cdots + f_{N-1}\cos\frac{(N-1)n\pi}{N} + \frac12 f_N \cos n\pi \Big] \qquad (16) $$

where

$$ f_n \equiv f\Big(\cos\frac{n\pi}{N}\Big). $$

If we write out equation (16) for all of the C_n coefficients at once, we have an (N+1)-dimensional linear system relating the sets of numbers {f_n} and {C_n}:

$$ \frac{2}{N} \begin{pmatrix} \frac12 & 1 & 1 & \cdots & \frac12 \\ \frac12 & \cos\frac{\pi}{N} & \cos\frac{2\pi}{N} & \cdots & \frac12\cos\pi \\ \frac12 & \cos\frac{2\pi}{N} & \cos\frac{4\pi}{N} & \cdots & \frac12\cos 2\pi \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \frac12 & \cos\pi & \cos 2\pi & \cdots & \frac12\cos N\pi \end{pmatrix} \begin{pmatrix} f_0 \\ f_1 \\ f_2 \\ \vdots \\ f_N \end{pmatrix} = \begin{pmatrix} C_0 \\ C_1 \\ C_2 \\ \vdots \\ C_N \end{pmatrix} $$

which we could write in the form

$$ \Xi\, \mathbf{f} = \mathbf{C} \qquad (17) $$

where f and C are the (N+1)-dimensional vectors of function samples at Chebyshev points and Chebyshev expansion coefficients, respectively, and the elements of the Ξ matrix are

$$ \Xi_{nm} = \begin{cases} \dfrac{1}{N}, & m = 0 \\[4pt] \dfrac{2}{N}\cos\dfrac{nm\pi}{N}, & m = 1, \cdots, N-1 \\[4pt] \dfrac{1}{N}\cos n\pi, & m = N \end{cases} $$

where the n, m indices run from 0 to N.
Using equation (17) directly is actually not a good way to compute the C coefficients from the f samples, because the computational cost of the matrix-vector multiplication scales like N², whereas FFT techniques (the fast cosine transform) can perform the same computation with cost scaling like N log N. However, the existence of the Ξ matrix is useful for deriving Clenshaw-Curtis quadrature rules and Chebyshev differentiation matrices, as we will now see. (A direct implementation of Ξ appears below.)

Chebyshev integration

The Chebyshev spectral approach to integrating a function f(x) goes like this:

1. Construct the Nth Chebyshev approximant f^approx(x) to f(x) [equation (15)].

2. Integrate the approximant and take this as an approximation to the integral.

In symbols, we have

$$ \int_{-1}^{1} f(x)\,dx \approx \int_{-1}^{1} f^{\rm approx}(x)\,dx. $$

Insert equation (15):

$$ = \sum_{m=0}^{N} C_m \int_{-1}^{1} T_m(x)\,dx. \qquad (18) $$

But the integrals of the Chebyshev polynomials can be evaluated in closed form, with the result

$$ \int_{-1}^{1} T_m(x)\,dx = \begin{cases} \dfrac{2}{1-m^2}, & m \text{ even} \\[4pt] 0, & m \text{ odd.} \end{cases} \qquad (19) $$

Thus equation (18) reads

$$ \int_{-1}^{1} f(x)\,dx \approx \sum_{\substack{m=0 \\ m\ \text{even}}}^{N} \frac{2\, C_m}{1-m^2}. \qquad (20) $$

Does this expression look familiar? It is exactly what we found in our discussion of Clenshaw-Curtis quadrature, except there we interpreted the integral (19) in the equivalent form

$$ \int_{-1}^{1} T_m(x)\,dx = \int_0^{\pi} \cos(m\theta)\sin\theta\, d\theta. $$
Thus the Chebyshev spectral approach to integration is just Clenshaw-Curtis quadrature. As we have observed, the C_m coefficients may be computed exactly up to m = N using N+1 samples of the function f(x) (where the samples are taken at the Chebyshev points). Indeed, we can write (20) in the form of a vector-vector product involving the vector C of Chebyshev expansion coefficients:

$$ \int_{-1}^{1} f(x)\,dx \approx \mathbf{W}^T \mathbf{C}, \qquad \mathbf{W} = \Big(2,\ 0,\ \frac{2}{1-2^2},\ 0,\ \frac{2}{1-4^2},\ \cdots,\ \frac{2}{1-N^2}\Big)^T. $$

Now plugging in equation (17) yields

$$ \int_{-1}^{1} f(x)\,dx \approx \mathbf{W}^T \Xi\, \mathbf{f} \qquad (21) $$
$$ \equiv \mathbf{w}^T \mathbf{f} \qquad (22) $$

which just illustrates that the weights of the (N+1)-point Clenshaw-Curtis quadrature rule are the elements of the vector w = Ξᵀ W. (A sketch of this computation follows.)
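The following Julia sketch assembles these weights, reusing chebXi from the sketch above. One bookkeeping subtlety glossed over in the prose (related to the ⟨T₀, T₀⟩ = π caveat noted after equation (10)): the m = 0 and m = N coefficients produced by Ξ are twice the corresponding coefficients of the interpolant Σ C_m T_m, so the matching entries of W are halved here.

function clenshaw_curtis(N::Integer)
    W = [iseven(m) ? 2.0 / (1 - m^2) : 0.0 for m in 0:N]   # ∫₋₁¹ Tₘ(x) dx
    W[1]   /= 2     # m = 0 entry halved (endpoint bookkeeping)
    W[end] /= 2     # m = N entry halved
    return transpose(chebXi(N)) * W     # w = Ξᵀ W, the quadrature weights
end

N  = 8
xs = cos.((0:N) .* (π / N))
w  = clenshaw_curtis(N)
println(sum(w .* exp.(xs)))    # ≈ 2.3504024 ≈ ∫₋₁¹ eˣ dx = e - 1/e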

Chebyshev differentiation

In the first unit of our course we saw how to use finite-difference techniques to approximate derivative values from function values. For example, if f_even is a vector of function samples taken at evenly-spaced points in an interval [a, b], i.e. if

$$ \mathbf{f}_{\rm even} = \begin{pmatrix} f(a) \\ f(a+\Delta) \\ f(a+2\Delta) \\ \vdots \\ f(b) \end{pmatrix} $$

then the vector of derivative values at the sample points may be represented in the centered-finite-difference approximation as a matrix-vector product of the form

$$ \mathbf{f}^{\prime}_{\rm even} = \mathbb{D}^{\rm CFD}\, \mathbf{f}_{\rm even} $$

where⁷

$$ \mathbb{D}^{\rm CFD} = \frac{1}{2\Delta}\begin{pmatrix} 0 & 1 & 0 & 0 & \cdots & 0 & 0 \\ -1 & 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & -1 & 0 & 1 & \cdots & 0 & 0 \\ & & & \ddots & & & \\ 0 & 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & 0 & \cdots & -1 & 0 \end{pmatrix}. $$
As we saw in our discussion of finite-difference techniques, this approximation will converge like 1/N², i.e. the error between our approximate derivative and the actual derivative will decay like 1/N².

Now that we are equipped with Chebyshev spectral methods, we can write a numerical differentiation stencil whose errors will decay exponentially⁸ in N. Indeed, following the general spirit of Chebyshev spectral methods, all we have to do is:

1. Construct the Nth Chebyshev approximant f^approx(x) to f(x) [equation (15)].

2. Differentiate the approximant and take this as an approximation to the derivative.
The Nth Chebyshev approximant to f(x) is

$$ f^{\rm approx}(x) = \sum_{m=0}^{N} C_m T_m(x). $$

Differentiating, we find

$$ f^{\prime}_{\rm approx}(x) = \sum_{m=0}^{N} C_m T^{\prime}_m(x). $$

If we evaluate this formula at each of the (N+1) Chebyshev points x_n = cos(nπ/N), n = 0, 1, …, N, we obtain a vector f′_cheb whose entries are approximate values of the derivative of f at the Chebyshev points, and which is related to the vector C of Chebyshev coefficients via a matrix-vector product relationship:

$$ \underbrace{\begin{pmatrix} f^{\prime}(x_0) \\ f^{\prime}(x_1) \\ f^{\prime}(x_2) \\ \vdots \\ f^{\prime}(x_N) \end{pmatrix}}_{\mathbf{f}^{\prime}_{\rm cheb}} = \underbrace{\begin{pmatrix} T_0^{\prime}(x_0) & T_1^{\prime}(x_0) & T_2^{\prime}(x_0) & \cdots & T_N^{\prime}(x_0) \\ T_0^{\prime}(x_1) & T_1^{\prime}(x_1) & T_2^{\prime}(x_1) & \cdots & T_N^{\prime}(x_1) \\ T_0^{\prime}(x_2) & T_1^{\prime}(x_2) & T_2^{\prime}(x_2) & \cdots & T_N^{\prime}(x_2) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ T_0^{\prime}(x_N) & T_1^{\prime}(x_N) & T_2^{\prime}(x_N) & \cdots & T_N^{\prime}(x_N) \end{pmatrix}}_{\mathbb{T}^{\prime}} \underbrace{\begin{pmatrix} C_0 \\ C_1 \\ C_2 \\ \vdots \\ C_N \end{pmatrix}}_{\mathbf{C}} \qquad (23) $$
⁷ We are here assuming that f vanishes to the left and right of the endpoints; as we saw earlier in the course, it is easy to generalize to arbitrary boundary values of f.

⁸ Technically: faster than any power of 1/N.

Let's abbreviate this equation by writing

$$ \mathbf{f}^{\prime}_{\rm cheb} = \mathbb{T}^{\prime}\, \mathbf{C} $$

where T′ is the (N+1)×(N+1)-dimensional matrix in (23). If we now plug in C = Ξ f_cheb [equation (17)], we get

$$ \mathbf{f}^{\prime}_{\rm cheb} = \underbrace{\mathbb{T}^{\prime}\,\Xi}_{\mathbb{D}^{\rm cheb}}\,\mathbf{f}_{\rm cheb}. $$

This equation identifies the (N+1)×(N+1) matrix

$$ \mathbb{D}^{\rm cheb} = \mathbb{T}^{\prime}\,\Xi $$

as the matrix that operates on a vector of f samples at Chebyshev points to yield a vector of f′ samples at Chebyshev points.
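Here is a hedged Julia sketch of D^cheb = T′Ξ, reusing chebXi from above. The entries of T′ are evaluated via T′_m(cos θ) = m sin(mθ)/sin θ (differentiate T_m(cos θ) = cos mθ by the chain rule), with the endpoint limits T′_m(±1) = (±1)^{m+1} m². As in the quadrature sketch, the m = 0 and m = N rows of Ξ are halved so that we differentiate the actual interpolant.

function cheb_diff_matrix(N::Integer)
    θs = (0:N) .* (π / N)
    Tp = zeros(N + 1, N + 1)       # Tp[n+1, m+1] = T′ₘ(xₙ), where xₙ = cos θₙ
    for n in 0:N, m in 0:N
        Tp[n+1, m+1] = n == 0 ? Float64(m^2) :            # T′ₘ(+1) = m²
                       n == N ? (-1.0)^(m + 1) * m^2 :    # T′ₘ(-1) = (-1)^{m+1} m²
                                m * sin(m * θs[n+1]) / sin(θs[n+1])
    end
    scale = ones(N + 1); scale[1] = scale[end] = 0.5   # endpoint bookkeeping
    return Tp * (scale .* chebXi(N))                   # D = T′ · diag(scale) · Ξ
end

N  = 8
xs = cos.((0:N) .* (π / N))
D  = cheb_diff_matrix(N)
println(maximum(abs.(D * (xs .^ 3) .- 3 .* xs .^ 2)))  # ~1e-14: exact for x³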

Second derivatives

What if we need to compute second derivatives? Easy! Just go like this:

$$ \mathbf{f}^{\prime\prime}_{\rm cheb} = \mathbb{D}^{\rm cheb}\, \mathbf{f}^{\prime}_{\rm cheb} = \mathbb{D}^{\rm cheb}\, \mathbb{D}^{\rm cheb}\, \mathbf{f}_{\rm cheb} = \big(\mathbb{D}^{\rm cheb}\big)^2\, \mathbf{f}_{\rm cheb}. $$

This equation identifies the (N+1)×(N+1) matrix (D^cheb)², i.e. just the square of the matrix D^cheb, as the matrix that operates on a vector of f samples at Chebyshev points to yield a vector of f″ samples at Chebyshev points.

Chebyshev Boundary-Value Problems

Earlier in the course we used finite-difference differentiation matrices to solve boundary-value problems, with errors decaying like 1/N^p where N is the number of sample points and p is some integer power. Now that we are equipped with Chebyshev spectral methods, we can use Chebyshev differentiation matrices like D^cheb to solve boundary-value problems with errors decaying exponentially rapidly with N.

Although this process is conceptually just as straightforward as was our use of finite-difference stencils to solve boundary-value problems earlier in the course, there are a couple of minor technical details to consider that slightly complicate the story. To illustrate how these can be tamed, let's work out a Chebyshev algorithm for solving a boundary-value problem of the form

$$ f^{\prime\prime}(x) + \alpha f^{\prime}(x) + \beta^2 f(x) = g(x), \qquad f\big(x_L\big) = f_L, \qquad f\big(x_R\big) = f_R. \qquad (24) $$

In this equation, α and β are fixed parameters, g(x) is a known forcing function, the x variable ranges over the interval [x_L, x_R], and the boundary values of f at the left and right endpoints are f_L, f_R. (The subscripts L and R stand for "left" and "right.")

Rescaling to [−1, 1]

Our boundary-value problem is defined on the interval [x_L, x_R], but Chebyshev spectral methods are nicest when we are working on the interval [−1, 1]. Thus, before we do anything else, let's redefine our problem so that the independent variable runs over [−1, 1]. That is, we will write x as a linear function of a new variable τ (i.e. x = Aτ + B for constants A, B to be determined) such that x runs from x_L to x_R as τ runs from −1 to 1. As you can easily check, the unique choice that works is

$$ x(\tau) = W\tau + x_M, \qquad W \equiv \frac{x_R - x_L}{2}, \qquad x_M \equiv \frac{x_R + x_L}{2}. $$

(Note that W is just half the width of the interval, while x_M is the midpoint of the interval. Here W stands for "width," while M stands for "midpoint.")

I will use the symbols F(τ) and G(τ) to denote new functions of τ obtained by evaluating the old functions f(x) and g(x) at the point x = x(τ):

$$ F(\tau) \equiv f\big(x(\tau)\big) = f\big(W\tau + x_M\big), \qquad G(\tau) \equiv g\big(x(\tau)\big) = g\big(W\tau + x_M\big). $$

One consequence of the change of variables is that derivatives with respect to x acquire factors⁹ of W when we write them in terms of derivatives with respect to τ:

$$ f(x) = F(\tau), \qquad f^{\prime}(x) = \frac{1}{W} F^{\prime}(\tau), \qquad f^{\prime\prime}(x) = \frac{1}{W^2} F^{\prime\prime}(\tau). $$

Now we just rewrite the differential equation (24) in terms of the new variable τ:

$$ \frac{1}{W^2} F^{\prime\prime}(\tau) + \frac{\alpha}{W} F^{\prime}(\tau) + \beta^2 F(\tau) = G(\tau), \qquad F(-1) = f_L, \qquad F(+1) = f_R. \qquad (25) $$

We now have a differential equation defined on the interval [−1, 1], and we can apply Chebyshev spectral methods.
Discretization

The next step is to discretize. Fix a value of N and consider the set of (N+1) Chebyshev points¹⁰ in the interval [−1, 1]:

$$ \Big\{\tau_n\Big\} = \Big\{\cos\frac{n\pi}{N}\Big\}, \qquad n = 0, 1, \cdots, N \qquad \text{(a total of } N+1 \text{ sample points).} \qquad (26) $$

⁹ You can use dimensional analysis as a mnemonic device to help you remember where the W factors go: think of x as a quantity with units of length (so W, the width of an interval in x, has units of length too), while τ is dimensionless. We know that x derivatives like df/dx have units of inverse length [and d²f/dx² has units of (inverse length)²], but τ derivatives like dF/dτ are dimensionless, so to recover a quantity like df/dx from a quantity like dF/dτ we have to divide the latter by a quantity with units of length, i.e. by one factor of W. Alternatively, you can think in terms of this symbolic identity:

$$ \frac{d}{dx} = \frac{1}{W}\frac{d}{d\tau}. $$

¹⁰ As usual in Fourier and Chebyshev methods, there is some annoying confusion here over precisely what N means, and failure to get this minor point straight can lead to annoying errors that result from being off by 1. The way we have written things (which is the conventional formulation of Chebyshev spectral methods), N is the number of angular segments into which the upper half-circle is split, which means that the number of sample points is actually one larger than N; this is because the index n in equation (26) needs to run from 0 to N inclusive, because we need to include sample points at both τ₀ = cos(0·π/N) = 1 and τ_N = cos(Nπ/N) = −1. This is straightforward enough, but it differs from the convention typically used in DFT/FFT methods, where N (not N+1) is the number of sample points. This corresponds to the fact that, in DFT/FFT methods, the index n only runs from 0 to N−1, not all the way to N. One way to think about the distinction is that, in DFT/FFT methods, the point n = N is equivalent to n = 0 (it corresponds to one full lap around the unit circle in the complex plane), so including it would be redundant. On the other hand, in Chebyshev methods [more broadly, in discrete sine/cosine transform (DST/DCT) methods as opposed to discrete Fourier transform methods], the point n = N corresponds to one half lap around the unit circle, taking us to θ = π; this is inequivalent to n = 0 and thus the corresponding sample point must be retained.

Let F, F′, F″, and G be vectors of length N+1 containing samples of F, derivatives of F, and G at the Chebyshev points:

$$ \mathbf{F} = \begin{pmatrix} F_0 \\ F_1 \\ F_2 \\ \vdots \\ F_{N-1} \\ F_N \end{pmatrix}, \quad \mathbf{F}^{\prime} = \begin{pmatrix} F_0^{\prime} \\ F_1^{\prime} \\ F_2^{\prime} \\ \vdots \\ F_{N-1}^{\prime} \\ F_N^{\prime} \end{pmatrix}, \quad \mathbf{F}^{\prime\prime} = \begin{pmatrix} F_0^{\prime\prime} \\ F_1^{\prime\prime} \\ F_2^{\prime\prime} \\ \vdots \\ F_{N-1}^{\prime\prime} \\ F_N^{\prime\prime} \end{pmatrix}, \quad \mathbf{G} = \begin{pmatrix} G_0 \\ G_1 \\ G_2 \\ \vdots \\ G_{N-1} \\ G_N \end{pmatrix} \qquad (27) $$

where

$$ F_n \equiv F(\tau_n), \qquad F_n^{\prime} \equiv F^{\prime}(\tau_n), \qquad F_n^{\prime\prime} \equiv F^{\prime\prime}(\tau_n), \qquad G_n \equiv G(\tau_n). $$

The vectors F′ and F″ can be obtained by operating on F with the Chebyshev differentiation matrices we constructed earlier:

$$ \mathbf{F}^{\prime} = \mathbb{D}\,\mathbf{F}, \qquad \mathbf{F}^{\prime\prime} = \mathbb{D}^2\,\mathbf{F} $$

where D is what we earlier called D^cheb.


In discretized form, the boundary-value problem (25) thus becomes a linearalgebra problem involving some matrices and some vectors:
 1


2
2
D
+
D
+

F =G
(28)
2
|W
{zW
}
M

where M is just a convenient name that we have assigned to the (N +1)(N +1)
matrix in parentheses.
Handling of boundary values

The only remaining complication is to account for the boundary values. To do this, note that the first and last entries in the vector F are actually known, not unknown, quantities: they are simply¹¹ given by the boundary conditions, i.e. we have

$$ F_0 \equiv f_R, \qquad F_N \equiv f_L. \qquad (29) $$
This means that equation (28), which consists of N+1 simultaneous linear equations, actually gives us more equations than we need; we want to eliminate the first and last of those equations and solve a reduced (N−1)-dimensional system for just the unknown (N−1) quantities F_1, …, F_{N−1}.

To separate out what is known from what is unknown on the LHS of equation (28), let's write the N+1 equations implicit in that statement in a {1, (N−1), 1} block form:

$$ \begin{pmatrix} M_{00} & \mathbf{v}_1^T & M_{0N} \\ \mathbf{v}_2 & \mathbb{M}^{\rm int} & \mathbf{v}_3 \\ M_{N0} & \mathbf{v}_4^T & M_{NN} \end{pmatrix} \begin{pmatrix} F_0 \\ \mathbf{F}^{\rm int} \\ F_N \end{pmatrix} = \begin{pmatrix} G_0 \\ \mathbf{G}^{\rm int} \\ G_N \end{pmatrix}. \qquad (30) $$
In this equation, F^int and G^int are the "interior" portions of the F and G vectors, containing just the values of F and G at the N−1 interior Chebyshev points:

$$ \mathbf{F}^{\rm int} = \begin{pmatrix} F_1 \\ F_2 \\ \vdots \\ F_{N-2} \\ F_{N-1} \end{pmatrix}, \qquad \mathbf{G}^{\rm int} = \begin{pmatrix} G_1 \\ G_2 \\ \vdots \\ G_{N-2} \\ G_{N-1} \end{pmatrix}. $$
Also, in equation (30), v_1, v_2, v_3, v_4 are (N−1)-dimensional vectors obtained by slicing out chunks of the original matrix M, and M^int is the (N−1)×(N−1) interior chunk of M. In a high-level language like Julia these may be extracted from M using the following commands:

MInt = M[ 2:end-1, 2:end-1 ];
v1   = M[ 1,       2:end-1 ];
v2   = M[ 2:end-1, 1       ];
v3   = M[ 2:end-1, end     ];
v4   = M[ end,     2:end-1 ];

¹¹ Careful! In Chebyshev spectral methods, the angle θ = nπ/N [the argument of the cosine in equation (26)] runs from θ = 0 to θ = π as the index n runs from 0 to N. This means that the τ variable winds up running backwards from 1 to −1 as n runs from 0 to N, i.e.

n = 0 corresponds to τ = +1,    n = N corresponds to τ = −1.

Looking at equation (28), this yields the at-first-surprising conclusion that the boundary value at the right endpoint, f_R, wants to go in the first slot of the vector F, while f_L wants to go in the last slot of the vector, as in (29).

The portion of (30) that we now want to solve is the interior portion, i.e. the innermost (N−1)×(N−1) chunk of the system, which reads

$$ F_0 \mathbf{v}_2 + \mathbb{M}^{\rm int}\, \mathbf{F}^{\rm int} + F_N \mathbf{v}_3 = \mathbf{G}^{\rm int} $$

or, swinging all known quantities over to the RHS so that we have a linear system relating unknowns to knowns,

$$ \mathbb{M}^{\rm int}\, \mathbf{F}^{\rm int} = \mathbf{G}^{\rm int} - F_0 \mathbf{v}_2 - F_N \mathbf{v}_3, $$

in terms of which our solution vector will be

$$ \mathbf{F}^{\rm int} = \big(\mathbb{M}^{\rm int}\big)^{-1} \Big[ \mathbf{G}^{\rm int} - F_0 \mathbf{v}_2 - F_N \mathbf{v}_3 \Big]. \qquad (31) $$

This equation gives us only the innermost (N−1) entries in our solution vector F; the outer 2 entries are obtained by just plugging in the given boundary conditions.
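Putting the pieces together, here is a hedged end-to-end Julia sketch of the procedure, reusing cheb_diff_matrix from above. Everything else (variable names, the test problem f″ = g with exact solution sin(πx) on [0, 2]) is our own illustrative choice, not from the notes.

using LinearAlgebra   # for the identity I

N      = 24
xL, xR = 0.0, 2.0
fL, fR = sin(π * xL), sin(π * xR)        # boundary values of the exact solution
α, β   = 0.0, 0.0                        # our test problem is just f'' = g
W, xM  = (xR - xL) / 2, (xR + xL) / 2

τ = cos.((0:N) .* (π / N))               # Chebyshev points; τ runs from +1 to -1
x = W .* τ .+ xM                         # mapped sample points
G = -π^2 .* sin.(π .* x)                 # forcing chosen so f(x) = sin(πx)

D = cheb_diff_matrix(N)
M = D^2 ./ W^2 .+ (α / W) .* D .+ β^2 .* I(N + 1)    # equation (28)

Mint = M[2:end-1, 2:end-1]
v2   = M[2:end-1, 1]
v3   = M[2:end-1, end]

F            = similar(x)
F[1], F[end] = fR, fL                    # right value first: τ runs backwards!
F[2:end-1]   = Mint \ (G[2:end-1] .- fR .* v2 .- fL .* v3)   # equation (31)

println(maximum(abs.(F .- sin.(π .* x))))   # tiny, and drops rapidly with N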
If this procedure seems complicated, it's actually nothing more than what we did earlier in our treatment of finite-difference solutions to boundary-value problems. For example, in the section titled "Finite-differencing as matrix-vector multiplication" in the Numerical Differentiation lecture notes, the RHS of the boundary-value problem involved a vector that depended on the boundary values. That vector is equivalent to the vector F_0 v_2 + F_N v_3 that appears on the RHS of (31). The only difference is that in the finite-difference case this vector is sparse (almost all of its entries are zero), whereas here the vector is dense.

This reflects the fact that finite-differencing is essentially a local procedure, which estimates derivatives from function samples only at immediately adjacent points; in contrast, Chebyshev differentiation is inherently global, with each sample of the derivative needing to know information about the entire set of function samples. This non-locality makes Chebyshev methods more costly for a given number of samples, but is also responsible for their dramatically accelerated convergence properties.
