Hello, I need help with Problems 3 and 4. This is a Numerical Analysis course that requires Python.
Problem 3 asks you to write a function in Python, and Problem 4 requires using that function for an application and some analysis.
Interpolation

Hector D. Ceniceros*

1 Approximation Theory

Given $f \in C[a,b]$, we would like to find a "good" approximation to it by "simpler functions", i.e. functions in a given class (or family) $\Phi$. For example, $\Phi = \mathcal{P}_n = \{\text{all polynomials of degree} \le n\}$.

A natural problem is that of finding the best approximation to $f$ by functions in $\Phi$. But how do we measure the accuracy of an approximation? That is, what norm¹ do we use? We have several choices of norms for functions. The most commonly used are:

1. The max or infinity norm: $\|f\|_\infty = \sup_{x\in[a,b]} |f(x)|$.
2. The 2-norm: $\|f\|_2 = \left(\int_a^b f^2(x)\,dx\right)^{1/2}$.
3. The p-norm: $\|f\|_p = \left(\int_a^b |f(x)|^p\,dx\right)^{1/p}$.

Later, we will need to consider weighted norms: for some positive function $\omega(x)$ in $[a,b]$ (it could be zero at a finite number of points) we define

(1)   $\|f\|_{\omega,2}^2 = \int_a^b \omega(x) f^2(x)\,dx$.
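As a quick numerical sanity check, the three norms above can be estimated on a uniform grid (a rough sketch only; the function name `norms_on_grid`, the grid size, and the trapezoidal quadrature are my choices, not part of the notes):

```python
import numpy as np

def _trap(y, dx):
    """Composite trapezoidal rule for samples y with spacing dx."""
    return dx * (y[0] / 2 + y[1:-1].sum() + y[-1] / 2)

def norms_on_grid(f, a, b, p=3, m=10_001):
    """Crude estimates of ||f||_inf, ||f||_2 and ||f||_p on [a, b]:
    the sup is replaced by a max over grid points and the integrals
    by the trapezoidal rule."""
    x = np.linspace(a, b, m)
    fx = f(x)
    dx = (b - a) / (m - 1)
    max_norm = np.max(np.abs(fx))
    two_norm = np.sqrt(_trap(fx**2, dx))
    p_norm = _trap(np.abs(fx)**p, dx) ** (1.0 / p)
    return max_norm, two_norm, p_norm

# f(x) = x on [0, 1]: ||f||_inf = 1 and ||f||_2 = 1/sqrt(3)
m_inf, m2, mp = norms_on_grid(lambda x: x, 0.0, 1.0)
```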
* These are lecture notes for Math 104 A. These notes and all course materials are protected by United States Federal Copyright Law and the California Civil Code. UC Policy 102.23 expressly prohibits students (and all other persons) from recording lectures or discussions and from distributing or selling lecture notes and all other course materials without the prior written permission of the instructor.
¹ A norm $\|\cdot\|$ is a real-valued function on a vector space such that (1) $\|f\| > 0$ for $f \not\equiv 0$, (2) $\|\lambda f\| = |\lambda|\,\|f\|$ for any scalar $\lambda$, and (3) $\|f + g\| \le \|f\| + \|g\|$.

Then, by best approximation in $\Phi$ we mean a function $p \in \Phi$ such that

$\|f - p\| \le \|f - q\|, \quad \forall q \in \Phi$.
Computationally, it is often more efficient to seek not the best approximation but one that is sufficiently accurate and converges fast to $f$. The central building block for this approximation is the problem of interpolation.

2 Interpolation

Let us focus on the case of approximating a given function by a polynomial of degree at most $n$. Then the interpolation problem can be stated as follows: given $n+1$ distinct points $x_0, x_1, \ldots, x_n$, called nodes, and corresponding values $f(x_0), f(x_1), \ldots, f(x_n)$, find a polynomial of degree at most $n$, $P_n(x)$, which satisfies (the interpolation property)

$P_n(x_0) = f(x_0)$
$P_n(x_1) = f(x_1)$
$\quad\vdots$
$P_n(x_n) = f(x_n)$.
Let us represent such a polynomial as $P_n(x) = a_0 + a_1 x + \cdots + a_n x^n$. Then the interpolation property means

$P_n(x_0) = f(x_0), \quad P_n(x_1) = f(x_1), \quad \ldots, \quad P_n(x_n) = f(x_n)$,

which implies

$a_0 + a_1 x_0 + \cdots + a_n x_0^n = f(x_0)$
$a_0 + a_1 x_1 + \cdots + a_n x_1^n = f(x_1)$
$\quad\vdots$
$a_0 + a_1 x_n + \cdots + a_n x_n^n = f(x_n)$.

This is a linear system of $n+1$ equations in $n+1$ unknowns (the polynomial coefficients $a_0, a_1, \ldots, a_n$). In matrix form:

(2)   $\begin{bmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^n \\ 1 & x_1 & x_1^2 & \cdots & x_1^n \\ \vdots & & & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^n \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} f(x_0) \\ f(x_1) \\ \vdots \\ f(x_n) \end{bmatrix}$
Does this linear system have a solution? Is the solution unique? The answer is yes to both. Here is a simple proof. Take $f \equiv 0$; then $P_n(x_j) = 0$ for $j = 0, 1, \ldots, n$. But $P_n$ is a polynomial of degree $\le n$; it cannot have $n+1$ zeros unless $P_n(x) \equiv 0$, which implies $a_0 = a_1 = \cdots = a_n = 0$. That is, the homogeneous problem associated with (2) has only the trivial solution, so the matrix in (2) is nonsingular. Therefore, (2) has a unique solution.

In general, the values to interpolate might not come from a function; they may just be data supplied to us. We will often write $(x_0, f_0)$, $(x_1, f_1)$, etc., to emphasize this more general setting.
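The linear system (2) can be solved directly, for instance with NumPy (a small sketch; the function name `interp_coeffs` is mine, and note that the Vandermonde matrix becomes badly conditioned as $n$ grows, so this is illustrative rather than a recommended method):

```python
import numpy as np

def interp_coeffs(x, f):
    """Solve the Vandermonde system (2) for the coefficients
    a_0, ..., a_n of P_n(x) = a_0 + a_1 x + ... + a_n x^n."""
    V = np.vander(x, increasing=True)  # rows [1, x_j, x_j^2, ..., x_j^n]
    return np.linalg.solve(V, f)

# Interpolating f(x) = x^2 at three nodes must recover a = [0, 0, 1]
a = interp_coeffs(np.array([0.0, 1.0, 2.0]), np.array([0.0, 1.0, 4.0]))
```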
Example 1. As an illustration, let us consider interpolation by a linear polynomial, $P_1(x)$. Suppose we are given $(x_0, f_0)$ and $(x_1, f_1)$. We have written $P_1(x)$ explicitly in the Introduction. We write it now in a different form:

(3)   $P_1(x) = \frac{x - x_1}{x_0 - x_1} f_0 + \frac{x - x_0}{x_1 - x_0} f_1$.

Clearly, this polynomial has degree at most 1 and satisfies the interpolation property:

(4)   $P_1(x_0) = f_0$,
(5)   $P_1(x_1) = f_1$.
Example 2. Given $(x_0, f_0)$, $(x_1, f_1)$, $(x_2, f_2)$, let us construct $P_2(x)$, the polynomial of degree at most 2 which interpolates these points. The way we have written $P_1(x)$ in (3) is suggestive of how to explicitly write $P_2(x)$:

$P_2(x) = \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)} f_0 + \frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)} f_1 + \frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)} f_2$.

If we define

(6)   $l_0^{(2)}(x) = \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)}$,
(7)   $l_1^{(2)}(x) = \frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)}$,
(8)   $l_2^{(2)}(x) = \frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)}$,

then we simply have

(9)   $P_2(x) = l_0^{(2)}(x) f_0 + l_1^{(2)}(x) f_1 + l_2^{(2)}(x) f_2$.
Note that each of the polynomials (6), (7), and (8) is exactly of degree 2 and they satisfy $l_j^{(2)}(x_k) = \delta_{jk}$². Therefore, it follows that $P_2(x)$ given by (9) satisfies the interpolation property

(10)   $P_2(x_0) = f_0$,
(11)   $P_2(x_1) = f_1$,
(12)   $P_2(x_2) = f_2$.

We can now write down the polynomial (of degree at most $n$) which interpolates $n+1$ given values $(x_0, f_0), \ldots, (x_n, f_n)$, where the interpolation nodes $x_0, \ldots, x_n$ are assumed distinct.
Define

(13)   $l_j^{(n)}(x) = \frac{(x - x_0) \cdots (x - x_{j-1})(x - x_{j+1}) \cdots (x - x_n)}{(x_j - x_0) \cdots (x_j - x_{j-1})(x_j - x_{j+1}) \cdots (x_j - x_n)} = \prod_{k=0,\,k\ne j}^{n} \frac{x - x_k}{x_j - x_k}$

for $j = 0, 1, \ldots, n$. These are called the elementary Lagrange polynomials of degree $n$. Note that $l_j^{(n)}(x_k) = \delta_{jk}$. Therefore,

(14)   $P_n(x) = l_0^{(n)}(x) f_0 + l_1^{(n)}(x) f_1 + \cdots + l_n^{(n)}(x) f_n = \sum_{j=0}^{n} l_j^{(n)}(x) f_j$

interpolates the given data, i.e., it satisfies the interpolation property $P_n(x_j) = f_j$ for $j = 0, 1, \ldots, n$. Relation (14) is called the Lagrange form of the interpolating polynomial. The following result summarizes our discussion.
Theorem 1. Given the $n+1$ values $(x_0, f_0), \ldots, (x_n, f_n)$, with $x_0, x_1, \ldots, x_n$ distinct, there is a unique polynomial of degree at most $n$, $P_n(x)$, such that $P_n(x_j) = f_j$ for $j = 0, 1, \ldots, n$.

Proof. $P_n(x)$ in (14) is of degree at most $n$ and interpolates the data. Uniqueness follows from the fundamental theorem of algebra: suppose there is another polynomial $Q_n(x)$ of degree at most $n$ such that $Q_n(x_j) = f_j$ for $j = 0, 1, \ldots, n$. Consider $W(x) = P_n(x) - Q_n(x)$. This is a polynomial of degree at most $n$, and $W(x_j) = P_n(x_j) - Q_n(x_j) = f_j - f_j = 0$ for $j = 0, 1, \ldots, n$; a nonzero polynomial of degree at most $n$ cannot have $n+1$ zeros, so $W(x) \equiv 0$, which implies $Q_n = P_n$.
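The Lagrange form (14) translates directly into code (a minimal sketch; the function name `lagrange_eval` is mine):

```python
def lagrange_eval(x_nodes, f_vals, x):
    """Evaluate P_n(x) = sum_j l_j(x) f_j, where
    l_j(x) = prod_{k != j} (x - x_k)/(x_j - x_k), as in (13)-(14)."""
    p = 0.0
    n = len(x_nodes)
    for j in range(n):
        lj = 1.0
        for k in range(n):
            if k != j:
                lj *= (x - x_nodes[k]) / (x_nodes[j] - x_nodes[k])
        p += lj * f_vals[j]
    return p

# P_2 through (0,1), (1,2), (2,5) is 1 + x^2; at x = 3 it gives 10
val = lagrange_eval([0.0, 1.0, 2.0], [1.0, 2.0, 5.0], 3.0)
```

Since the data come from the degree-2 polynomial $1 + x^2$, the interpolant reproduces it exactly, consistent with the remark that interpolating a polynomial of degree at most $n$ returns the polynomial itself.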
² $\delta_{jk}$ is the Kronecker delta, i.e. $\delta_{jk} = 0$ if $k \ne j$ and $1$ if $k = j$.

3 Connection to Best Approximation

We can view interpolation as a linear operator. Suppose that we have $n+1$ distinct nodes $x_0, x_1, \ldots, x_n$ contained in an interval $[a,b]$. Let $f$ and $g$ be two continuous functions in $[a,b]$ and $\alpha$ and $\beta$ two scalars. Then the interpolating polynomial for $\alpha f(x) + \beta g(x)$ is $P(x) = \alpha P_n(x) + \beta Q_n(x)$, where $P_n(x)$ and $Q_n(x)$ are the interpolating polynomials of $f$ and $g$, respectively. This follows immediately from (14). Also, note that if $f$ is a polynomial of degree at most $n$, its interpolating polynomial is $f$ itself, i.e. $P_n(x) = f(x)$.
Now suppose that $P_n^*(x)$ is the best polynomial approximation of $f$ in the max or uniform norm, i.e.

(15)   $\min_{p \in \mathcal{P}_n} \|f - p\|_\infty = \|f - P_n^*\|_\infty$,

where $\mathcal{P}_n = \{\text{all polynomials of degree} \le n\}$. Let $P_n(x)$ be the interpolating polynomial of $f$ at $x_0, x_1, \ldots, x_n$. Then

$\|f - P_n\|_\infty = \|f - P_n^* - (P_n - P_n^*)\|_\infty \le \|f - P_n^*\|_\infty + \|P_n - P_n^*\|_\infty$.

But $P_n(x) - P_n^*(x)$ is a polynomial of degree at most $n$ which interpolates $f - P_n^*$, that is,

$P_n(x) - P_n^*(x) = \sum_{j=0}^{n} l_j^{(n)}(x)\,\bigl(f(x_j) - P_n^*(x_j)\bigr)$.

Therefore,

(16)   $\|P_n - P_n^*\|_\infty \le \Lambda_n \|f - P_n^*\|_\infty$,

where

(17)   $\Lambda_n = \max_{a \le x \le b} \sum_{j=0}^{n} |l_j^{(n)}(x)|$

is called the Lebesgue constant. Using this in (16) we obtain

(18)   $\|f - P_n\|_\infty \le (1 + \Lambda_n)\,\|f - P_n^*\|_\infty$.

This inequality connects the interpolation error $\|f - P_n\|_\infty$ with the best approximation error $\|f - P_n^*\|_\infty$. Let us see if we can extract more information from this connection.
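The Lebesgue constant (17) can be estimated numerically by maximizing $\sum_j |l_j^{(n)}(x)|$ over a dense grid (a sketch; the function name and grid size are my choices):

```python
import numpy as np

def lebesgue_constant(x_nodes, m=5_000):
    """Estimate Lambda_n = max_x sum_j |l_j^{(n)}(x)| over
    [min node, max node] by a max over a dense uniform grid."""
    x = np.linspace(x_nodes.min(), x_nodes.max(), m)
    s = np.zeros_like(x)
    for j, xj in enumerate(x_nodes):
        lj = np.ones_like(x)
        for k, xk in enumerate(x_nodes):
            if k != j:
                lj *= (x - xk) / (xj - xk)
        s += np.abs(lj)
    return s.max()

# With two nodes, |l_0| + |l_1| = 1 between them, so Lambda_1 = 1
lam1 = lebesgue_constant(np.array([0.0, 1.0]))
# For equispaced nodes Lambda_n grows quickly with n
lam10 = lebesgue_constant(np.linspace(-1.0, 1.0, 11))
```

Running this shows `lam10` is already an order of magnitude larger than `lam1`, illustrating why $\Lambda_n$ being unbounded in $n$ matters in the discussion that follows.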
There is a fundamental result in approximation theory which states that any continuous function can be approximated uniformly, with arbitrary accuracy, by a polynomial. This is the celebrated Weierstrass theorem.

Theorem 2 (Weierstrass). Let $f$ be a continuous function in $[a,b]$. Given $\epsilon > 0$ there is a polynomial $P$ such that

$\|f - P\|_\infty < \epsilon$.

The Weierstrass theorem implies that as we increase the degree, the best approximation polynomial converges uniformly to $f$, that is, $\|f - P_n^*\|_\infty \to 0$ as $n \to \infty$. However, because the Lebesgue constant $\Lambda_n$ is not bounded in $n$, we cannot conclude that $\|f - P_n\|_\infty \to 0$ as $n \to \infty$, i.e. that the interpolating polynomial, as we add more and more nodes, converges uniformly to $f$. That depends on the regularity of $f$ and on the distribution of the nodes. We will discuss this further later.

4 Barycentric Formula

The Lagrange form of the interpolating polynomial is not convenient for computations. If we want to increase the degree of the polynomial, we cannot reuse the work done in constructing and evaluating a lower-degree one. However, we can obtain a very efficient formula by rewriting the interpolating polynomial as follows.
Let

(19)   $w(x) = (x - x_0)(x - x_1) \cdots (x - x_n)$.

The numerator in $l_j^{(n)}(x)$ is $w(x)/(x - x_j)$. Thus, we can write

(20)   $l_j^{(n)}(x) = \frac{w(x)}{x - x_j}\,\lambda_j^{(n)}, \qquad \lambda_j^{(n)} = \frac{1}{\prod_{k=0,\,k\ne j}^{n} (x_j - x_k)}$.

Therefore,

(21)   $P_n(x) = \sum_{j=0}^{n} w(x)\,\frac{\lambda_j^{(n)}}{x - x_j}\,f(x_j) = w(x) \sum_{j=0}^{n} \frac{\lambda_j^{(n)}}{x - x_j}\,f(x_j)$.

Now, note that from (14) with $f(x) \equiv 1$ it follows that

(22)   $\sum_{j=0}^{n} l_j^{(n)}(x) = w(x) \sum_{j=0}^{n} \frac{\lambda_j^{(n)}}{x - x_j} = 1$,

and dividing (21) by (22) we get the so-called Barycentric Formula for interpolation:

(23)   $P_n(x) = \dfrac{\sum_{j=0}^{n} \dfrac{\lambda_j^{(n)}}{x - x_j}\,f(x_j)}{\sum_{j=0}^{n} \dfrac{\lambda_j^{(n)}}{x - x_j}}, \qquad x \ne x_j,\ j = 0, 1, \ldots, n$.

For $x = x_j$, $j = 0, 1, \ldots, n$, the interpolation property should be used: $P_n(x_j) = f(x_j)$.
The numbers $\lambda_j^{(n)}$ depend only on the nodes $x_0, x_1, \ldots, x_n$ and not on the given values $f(x_0), f(x_1), \ldots, f(x_n)$. We can precompute them efficiently as follows:

λ_0^{(0)} = 1;
for m = 1 : n
    for j = 0 : m−1
        λ_j^{(m)} = λ_j^{(m−1)} / (x_j − x_m);
    end
    λ_m^{(m)} = 1 / ∏_{k=0}^{m−1} (x_m − x_k);
end
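The precomputation loop above and the evaluation of formula (23) can be sketched in Python as follows (the names `bary_weights` and `bary_eval` are mine):

```python
def bary_weights(x):
    """Build lambda_j^{(n)} with the update loop: when node x_m is
    added, each existing lambda_j is divided by (x_j - x_m)."""
    lam = [1.0]
    for m in range(1, len(x)):
        for j in range(m):
            lam[j] /= (x[j] - x[m])
        prod = 1.0
        for k in range(m):
            prod *= (x[m] - x[k])
        lam.append(1.0 / prod)
    return lam

def bary_eval(x_nodes, f_vals, lam, x):
    """Evaluate P_n(x) with the Barycentric Formula (23)."""
    num = den = 0.0
    for xj, fj, lj in zip(x_nodes, f_vals, lam):
        if x == xj:  # at a node, use the interpolation property
            return fj
        t = lj / (x - xj)
        num += t * fj
        den += t
    return num / den

lam = bary_weights([0.0, 1.0, 2.0])  # weights 1/2, -1, 1/2
val = bary_eval([0.0, 1.0, 2.0], [1.0, 2.0, 5.0], lam, 3.0)
```

As with the Lagrange form, the data come from $1 + x^2$, so the barycentric evaluation at $x = 3$ returns 10.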
If we want to add one more point $(x_{n+1}, f(x_{n+1}))$, we just extend the $m$-loop to $n+1$ to generate $\lambda_0^{(n+1)}, \lambda_1^{(n+1)}, \ldots, \lambda_{n+1}^{(n+1)}$.

For equidistributed points, $x_j = x_0 + jh$, $j = 0, 1, \ldots, n$, we have

$\lambda_j^{(n)} = \dfrac{1}{(x_j - x_0) \cdots (x_j - x_{j-1})(x_j - x_{j+1}) \cdots (x_j - x_n)}$
$\qquad = \dfrac{1}{(jh)[(j-1)h] \cdots (h)\,(-h)(-2h) \cdots [(j-n)h]}$
$\qquad = \dfrac{(-1)^{n-j}}{h^n [j(j-1) \cdots 1][1 \cdot 2 \cdots (n-j)]}$
$\qquad = \dfrac{(-1)^{n-j}}{h^n\, j!\,(n-j)!}$
$\qquad = \dfrac{(-1)^{n-j}}{h^n\, n!} \binom{n}{j}$
$\qquad = \dfrac{(-1)^{n}}{h^n\, n!}\,(-1)^{j} \binom{n}{j}$.

We can omit the factor $\frac{(-1)^n}{h^n n!}$ because it cancels out in the Barycentric Formula. Thus, for equidistributed points we can use

(24)   $\lambda_j^{(n)} = (-1)^j \binom{n}{j}, \qquad j = 0, 1, \ldots, n$.
j j = 0, 1, . . . n. Newton’s Form and Divided Differences There is another representation of the interpolating polynomial which is both
very efficient computationally and very convenient in the derivation of numerical methods based on interpolation. The idea of this representation, due
to Newton, is to use successively lower order polynomials for constructing
Pn (x).
Suppose we have obtained $P_{n-1}(x)$, the interpolating polynomial of $f$ at $x_0, \ldots, x_{n-1}$, and we would like to obtain $P_n(x)$, the interpolating polynomial of $f$ at $x_0, \ldots, x_n$, by reusing $P_{n-1}(x)$. Write $P_n(x) - P_{n-1}(x) = R(x)$, where $R(x)$ is a polynomial of degree at most $n$. Moreover, for $j = 0, \ldots, n-1$,

(25)   $R(x_j) = P_n(x_j) - P_{n-1}(x_j) = f(x_j) - f(x_j) = 0$.

Therefore, $R(x)$ can be factored as

(26)   $R(x) = c_n (x - x_0)(x - x_1) \cdots (x - x_{n-1})$.

The constant $c_n$ is called the $n$-th divided difference of $f$ with respect to $x_0, x_1, \ldots, x_n$ and is usually denoted $f[x_0, \ldots, x_n]$. Thus, we have

(27)   $P_n(x) = P_{n-1}(x) + f[x_0, \ldots, x_n](x - x_0)(x - x_1) \cdots (x - x_{n-1})$.

By the same argument, we have

(28)   $P_{n-1}(x) = P_{n-2}(x) + f[x_0, \ldots, x_{n-1}](x - x_0)(x - x_1) \cdots (x - x_{n-2})$,

etc. So we arrive at Newton's Form of $P_n(x)$:

(29)   $P_n(x) = f[x_0] + f[x_0, x_1](x - x_0) + \cdots + f[x_0, \ldots, x_n](x - x_0) \cdots (x - x_{n-1})$.
Note that for $n = 1$:

$P_1(x) = f[x_0] + f[x_0, x_1](x - x_0)$,
$P_1(x_0) = f[x_0] = f(x_0)$,
$P_1(x_1) = f[x_0] + f[x_0, x_1](x_1 - x_0) = f(x_1)$.

Therefore,

(30)   $f[x_0] = f(x_0)$,
(31)   $f[x_0, x_1] = \dfrac{f(x_1) - f(x_0)}{x_1 - x_0}$,

and

(32)   $P_1(x) = f(x_0) + \dfrac{f(x_1) - f(x_0)}{x_1 - x_0}\,(x - x_0)$.
x1 − x0 Define f [xj ] = f (xj ) for j = 0, 1, ...n. The following identity will allow us
to compute all the required divided differences.
Theorem 3.
(33) f [x0 , x1 , ..., xk ] = f [x1 , x2 , ..., xk ] − f [x0 , x1 , ..., xk−1 ]
.
xk − x0 9 Proof. Let Pk−1 be the interpolating polynomial of degree at most k − 1 of f
at x1 , . . . , xk and Qk−1 the interpolating polynomial of degree at most k − 1
of f at x0 , . . . , xk−1 . Then
(34) P (x) = Pk−1 (x) + x − xk
[Pk−1 (x) − Qk−1 (x)].
xk − x0 is a polynomial of degree at most k and for j = 1, 2, ....k − 1
Pk (xj ) = f (xj ) + xj − xk
[f (xj ) − f (xj )] = f (xj ).
xk − x0 Moreover, Pk (x0 ) = Qk−1 (x0 ) = f (x0 ) and Pk (xk ) = Pk−1 (xk ) = f (xk ).
Therefore, P is the interpolation poly of f at x0 , ..., xk . The leading order
coefficient of Pk (x) is f [x0 , ..., xk ] and equating this with the leading order
[x
coefficient of P , f [x1 ,...,xk ]−f−x00,x1 ,...xk−1 ] , gives (33).
xk
To use (33) to obtain the divided differences, we construct a table. Setting $f_j = f(x_j)$, this divided difference table is built column by column, as illustrated below for $n = 3$:

x_j | 0th order | 1st order      | 2nd order          | 3rd order
x_0 | f_0       |                |                    |
    |           | f[x_0, x_1]    |                    |
x_1 | f_1       |                | f[x_0, x_1, x_2]   |
    |           | f[x_1, x_2]    |                    | f[x_0, x_1, x_2, x_3]
x_2 | f_2       |                | f[x_1, x_2, x_3]   |
    |           | f[x_2, x_3]    |                    |
x_3 | f_3       |                |                    |
Example 3. Let $f(x) = 1 + x^2$ and $x_j = j$ for $j = 0, \ldots, 3$. Then

x_j | 0th order | 1st order          | 2nd order         | 3rd order
0   | 1         |                    |                   |
    |           | (2−1)/(1−0) = 1    |                   |
1   | 2         |                    | (3−1)/(2−0) = 1   |
    |           | (5−2)/(2−1) = 3    |                   | (1−1)/(3−0) = 0
2   | 5         |                    | (5−3)/(3−1) = 1   |
    |           | (10−5)/(3−2) = 5   |                   |
3   | 10        |                    |                   |

so $P_3(x) = 1 + 1(x - 0) + 1(x - 0)(x - 1) + 0(x - 0)(x - 1)(x - 2) = 1 + x^2$.
After computing the divided differences, we need to evaluate $P_n$ at a given point $x$. This can be done efficiently by suitably factoring it. For example, for $n = 3$ we have

$P_3(x) = c_0 + c_1(x - x_0) + c_2(x - x_0)(x - x_1) + c_3(x - x_0)(x - x_1)(x - x_2)$
$\qquad = c_0 + (x - x_0)\{c_1 + (x - x_1)[c_2 + (x - x_2)c_3]\}$.

For general $n$ we can use the following Horner-like scheme to get $p = P_n(x)$:

p = c_n;
for k = n−1 : −1 : 0
    p = c_k + (x − x_k) * p;
end
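Both pieces, building the divided-difference table via (33) and the Horner-like evaluation, can be sketched in Python (names `newton_coeffs` and `newton_eval` are mine; the table is stored in a single array that is overwritten column by column):

```python
def newton_coeffs(x, f):
    """Divided-difference table built column by column with (33);
    returns c_k = f[x_0, ..., x_k], the Newton-form coefficients (29)."""
    n = len(x)
    c = list(f)
    for k in range(1, n):
        # update from the bottom so lower-order entries survive
        for j in range(n - 1, k - 1, -1):
            c[j] = (c[j] - c[j - 1]) / (x[j] - x[j - k])
    return c

def newton_eval(x, c, t):
    """Horner-like evaluation of the Newton form at the point t."""
    p = c[-1]
    for k in range(len(c) - 2, -1, -1):
        p = c[k] + (t - x[k]) * p
    return p

# Example 3: f(x) = 1 + x^2 at x = 0, 1, 2, 3 gives c = [1, 1, 1, 0]
c = newton_coeffs([0.0, 1.0, 2.0, 3.0], [1.0, 2.0, 5.0, 10.0])
```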
Note that

(35)   $P_k(x) = P_{k-1}(x) + f[x_0, \ldots, x_k](x - x_0) \cdots (x - x_{k-1})$,

where $P_{k-1}(x)$ is a polynomial of degree $\le k-1$. Then the leading order coefficient of $P_k(x)$ is $f[x_0, \ldots, x_k]$, consequently $P_k^{(k)}(x) = k!\,f[x_0, \ldots, x_k]$, and thus

(36)   $f[x_0, \ldots, x_k] = \dfrac{1}{k!}\,P_k^{(k)}(x)$.

6 Cauchy's Remainder

In the Introduction we proved that if $x_0$, $x_1$, and $x$ are in $[a,b]$ then

$f(x) - P_1(x) = \dfrac{1}{2}\,f''(\xi(x))(x - x_0)(x - x_1)$,

where $\xi(x) \in (a,b)$. The general result about the interpolation error is the following theorem.

Theorem 4. Let $f \in C^{n+1}[a,b]$, let $x_0, x_1, \ldots, x_n, x$ be contained in $[a,b]$, and let $P_n(x)$ be the interpolation polynomial of degree $\le n$ of $f$ at $x_0, \ldots, x_n$. Then

(37)   $f(x) - P_n(x) = \dfrac{1}{(n+1)!}\,f^{(n+1)}(\xi(x))(x - x_0)(x - x_1) \cdots (x - x_n)$,

where $\min\{x_0, \ldots, x_n, x\} < \xi(x) < \max\{x_0, \ldots, x_n, x\}$.
Proof. The right hand side of (37) is known as the Cauchy remainder, and the following proof is due to Cauchy.

For $x$ equal to one of the nodes $x_j$ the result is trivially true. Take $x$ fixed, not equal to any of the nodes, and define

(38)   $\phi(t) = f(t) - P_n(t) - [f(x) - P_n(x)]\,\dfrac{(t - x_0)(t - x_1) \cdots (t - x_n)}{(x - x_0)(x - x_1) \cdots (x - x_n)}$.

Clearly, $\phi \in C^{n+1}[a,b]$ and vanishes at $t = x_0, x_1, \ldots, x_n, x$. That is, $\phi$ has at least $n+2$ zeros. Applying Rolle's theorem $n+1$ times, we conclude that there exists a point $\xi(x) \in (a,b)$ such that $\phi^{(n+1)}(\xi(x)) = 0$. Therefore,

$0 = \phi^{(n+1)}(\xi(x)) = f^{(n+1)}(\xi(x)) - [f(x) - P_n(x)]\,\dfrac{(n+1)!}{(x - x_0)(x - x_1) \cdots (x - x_n)}$,

from which (37) follows. Note that the repeated application of Rolle's theorem implies that $\xi(x)$ is between $\min\{x_0, x_1, \ldots, x_n, x\}$ and $\max\{x_0, x_1, \ldots, x_n, x\}$.
We can use the same argument to relate divided differences to the derivatives of $f$. Let $P_{n+1}(t)$ be the interpolating polynomial of $f$ at $x_0, \ldots, x_n, x$. Then $f(t) - P_{n+1}(t)$ vanishes at $t = x_0, \ldots, x_n, x$. By repeated application of Rolle's theorem there is $\xi(x)$ between $\min\{x_0, x_1, \ldots, x_n, x\}$ and $\max\{x_0, x_1, \ldots, x_n, x\}$ such that

$0 = f^{(n+1)}(\xi(x)) - P_{n+1}^{(n+1)}(\xi) = f^{(n+1)}(\xi(x)) - (n+1)!\,f[x_0, \ldots, x_n, x]$.

Therefore,

(39)   $f[x_0, \ldots, x_n, x] = \dfrac{f^{(n+1)}(\xi(x))}{(n+1)!}$.

Similarly, if $P_k$ is the interpolating polynomial of $f$ at $x_0, x_1, \ldots, x_k$, then

(40)   $f[x_0, \ldots, x_k] = \dfrac{1}{k!}\,f^{(k)}(\xi)$,

where $\min\{x_0, \ldots, x_k\} < \xi < \max\{x_0, \ldots, x_k\}$. Suppose that we now let $x_1, \ldots, x_k \to x_0$. Then $\xi \to x_0$ and

(41)   $\lim_{x_1, \ldots, x_k \to x_0} f[x_0, \ldots, x_k] = \dfrac{1}{k!}\,f^{(k)}(x_0)$.

We can use this relation to define a divided difference with "coincident" nodes; for example, $f[x_0, x_1]$ when $x_1 = x_0$ is defined by $f[x_0, x_0] = f'(x_0)$, etc. This is going to be very useful for the following interpolation problem.
7 Hermite Interpolation

The Hermite interpolation problem is: given values of $f$ and some of its derivatives at the nodes $x_0, x_1, \ldots, x_n$, find the polynomial of smallest degree interpolating those values. This polynomial is called the Hermite interpolation polynomial, and it can be obtained with a minor modification of the Newton's form representation.
For example, suppose we look for a polynomial $P$ of lowest degree which satisfies the interpolation conditions

$P(x_0) = f(x_0)$,
$P'(x_0) = f'(x_0)$,
$P(x_1) = f(x_1)$,
$P'(x_1) = f'(x_1)$.

We can view this problem as a limiting case of polynomial interpolation of $f$ at two pairs of coincident nodes, $x_0, x_0, x_1, x_1$, and we can use Newton's interpolation form to obtain $P$. The table of divided differences, in view of (41), is

(42)
x_0 | f(x_0)  |             |                    |
    |         | f'(x_0)     |                    |
x_0 | f(x_0)  |             | f[x_0, x_0, x_1]   |
    |         | f[x_0, x_1] |                    | f[x_0, x_0, x_1, x_1]
x_1 | f(x_1)  |             | f[x_0, x_1, x_1]   |
    |         | f'(x_1)     |                    |
x_1 | f(x_1)  |             |                    |

and

$P(x) = f(x_0) + f'(x_0)(x - x_0) + f[x_0, x_0, x_1](x - x_0)^2 + f[x_0, x_0, x_1, x_1](x - x_0)^2(x - x_1)$.
Example 4. Let $f(0) = 1$, $f'(0) = 0$, and $f(1) = \sqrt{2}$. Find the Hermite interpolation polynomial.

We construct the table of divided differences as follows:

(43)
0 | 1        |                  |
  |          | f'(0) = 0        |
0 | 1        |                  | $\sqrt{2} - 1$
  |          | $\sqrt{2} - 1$   |
1 | $\sqrt{2}$ |                |

and therefore

$P(x) = 1 + 0(x - 0) + (\sqrt{2} - 1)(x - 0)^2 = 1 + (\sqrt{2} - 1)x^2$.
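Example 4 can be reproduced by running the divided-difference recursion on the coincident-node table, injecting $f[x_0, x_0] = f'(x_0)$ as in (41) (a small sketch; variable names are mine):

```python
import math

# Hermite interpolation via divided differences with coincident nodes
# 0, 0, 1, as in table (43): the first-order entry at the repeated
# node is the derivative f'(0) rather than a difference quotient.
s2 = math.sqrt(2.0)
f00 = 0.0                        # f[x0, x0] = f'(0)
f01 = (s2 - 1.0) / (1.0 - 0.0)   # f[x0, x1]
f001 = (f01 - f00) / (1.0 - 0.0) # f[x0, x0, x1] = sqrt(2) - 1

def P(x):
    # Newton form on the nodes 0, 0, 1: 1 + f'(0) x + f[0,0,1] x^2
    return 1.0 + f00 * x + f001 * x * x

# Interpolation conditions: P(0) = 1, P'(0) = f00 = 0, P(1) = sqrt(2)
```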