Example 4.1.1

In the theory of vibrations of a circular drum, the displacement of the drumhead can be expressed in terms of pure harmonic modes,

$$J_m(\omega_{k,m} r) \cos(m\theta) \cos(c \omega_{k,m} t),$$

where $(r,\theta)$ are polar coordinates, $0\le r\le 1$, $t$ is time, $m$ is a positive integer, $c$ is a material parameter, and $J_m$ is a Bessel function of the first kind. The quantity $\omega_{k,m}$ is a resonant frequency and is a positive root of the equation

$$J_m(\omega_{k,m}) = 0,$$

which states that the drumhead is clamped around the rim. Tabulating approximations to the zeros of Bessel functions has occupied countless mathematician-hours throughout the centuries.

From the graph we see roots near 6, 10, 13, 16, and 19. We use nlsolve from the NLsolve package to find these roots accurately. (It uses vector variables, so we have to adapt it for use with scalars.)
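As a minimal sketch of that adaptation (assuming the SpecialFunctions package for `besselj`, and taking $m=3$, which is consistent with the roots listed above):

```julia
using SpecialFunctions, NLsolve

m = 3                        # mode number; an assumption consistent with roots near 6, 10, 13, 16, 19
J(x) = besselj(m, x)         # scalar residual function

# nlsolve expects vector input and output, so wrap the scalar problem in 1-vectors.
omega = [nlsolve(x -> [J(x[1])], [guess]).zero[1] for guess in [6.0, 10.0, 13.0, 16.0, 19.0]]
```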

Example 4.1.2

Consider first the function

At the root $r=1$, we have $f'(r)=-1$. If the values of $f$ were perturbed at any point by noise of size, say, $0.05$, we can imagine finding the root of the function as though drawn with a thick line, whose edges we show here.

The possible values for a perturbed root all lie within the interval where the black lines intersect the $x$ axis. The width of that zone is about the same as the vertical distance between the lines.

By contrast, consider the function

Now $f'(1)=-0.01$, and the graph of $f$ will be much shallower near $x=1$. Look at the effect this has on our thick rendering:

The vertical displacements in this picture are exactly the same as before. But the potential horizontal displacement of the root is much wider. In fact, if we perturb the function upward by the amount drawn here, the root disappears entirely!
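To put a number on the contrast, a first-order estimate is that perturbing the function values by $\epsilon$ moves a simple root $r$ by about

$$|\Delta r| \approx \frac{\epsilon}{|f'(r)|}.$$

With $\epsilon=0.05$, the first function ($f'(1)=-1$) allows a shift of only about $0.05$, while the second ($f'(1)=-0.01$) allows a shift of about $5$, or the loss of the root altogether.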

Example 4.2.1

Let's convert the roots of a quadratic polynomial $f(x)$ to a fixed point problem.

We'll define $g(x)=x-f(x)$. Intersections of its graph with the line $y=x$ are fixed points of $g$ and thus roots of $f$. (Only one is shown in the chosen plot range.)
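Here is a minimal sketch of that setup. The particular quadratic is an assumption, chosen so that its roots $2\pm\sqrt{1/2}\approx 2.71,\,1.29$ match the fixed points discussed below.

```julia
using Polynomials

f = Polynomial([3.5, -4, 1])    # assumed quadratic: f(x) = x^2 - 4x + 3.5
g(x) = x - f(x)                 # fixed points of g are roots of f
```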

If we evaluate $g(2.1)$, we get a value of almost 2.6.

So $g(x)$ is considerably closer to a fixed point than $x$ was. The value $y=g(x)$ ought to become our new $x$ value! Changing the $x$ coordinate in this way is the same as following a horizontal line over to the graph of $y=x$.

Now we can compute a new value for $y$. We leave $x$ alone here, so we travel along a vertical line to the graph of $g$.

You see that we are in a position to repeat these steps as often as we like. Let's apply them a few times and see the result.
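A sketch of those repeated steps, continuing with the `g` defined above and the starting value $x=2.1$:

```julia
x = [2.1]                     # store the whole sequence of iterates
for k in 1:12
    push!(x, g(x[end]))       # each new iterate is g of the previous one
end
x[end]                        # heads toward the fixed point near 2.71
```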

The process spirals in beautifully toward the fixed point we seek. Our last estimate has almost 4 accurate digits.

Now let's try to find the other fixed point $\approx 1.29$ in the same way. We'll use 1.3 as a starting approximation.

Example 4.2.3

This time, the iteration is pushing us away from the correct answer.

Here is the fixed point iteration. This time we keep track of the whole sequence of approximations.

It's easiest to construct and plot the sequence of errors.

It's quite clear that the convergence quickly settles into a linear rate. We could estimate this rate by doing a least-squares fit to a straight line. Keep in mind that the values for small $k$ should be left out of the computation, as they don't represent the linear trend.
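One way to sketch such a fit, assuming the errors are stored in a vector `err` and discarding the first few values:

```julia
k = 5:length(err)               # leave out the earliest iterations
A = [ones(length(k)) k]         # columns for the intercept and the slope
p = A \ log.(abs.(err[k]))      # least-squares fit of log error to a straight line
slope = p[2]
```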

We can exponentiate the slope to get the convergence constant $\sigma$.

The numerical values of the error should decrease by a factor of $\sigma$ at each iteration. We can check this easily with an elementwise division.
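Continuing the sketch, the constant from the fit and the elementwise check look like this:

```julia
sigma = exp(slope)                    # convergence constant estimated from the fit
ratios = err[2:end] ./ err[1:end-1]   # successive error ratios should settle near sigma
```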

The methods for finding $\sigma$ agree well.

Example 4.3.1

Suppose we want to find a root of the function

From the graph, it is clear that there is a root near $x=1$. So we call that our initial guess, $x_1$.

Next, we can compute the tangent line at the point $\bigl(x_1,f(x_1)\bigr)$, using the derivative.

In lieu of finding the root of $f$ itself, we settle for finding the root of the tangent line approximation, which is trivial. Call this $x_2$, our next approximation to the root.
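A rough sketch of this step, assuming the function is the same $f(x)=xe^x-2$ that is revisited in Example 4.3.2:

```julia
f(x) = x * exp(x) - 2
dfdx(x) = exp(x) * (x + 1)    # derivative of f

x1 = 1.0                      # initial guess read off the graph
x2 = x1 - f(x1) / dfdx(x1)    # root of the tangent line at (x1, f(x1))
f(x2)                         # residual at the new estimate
```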

The residual (value of $f$) is smaller than before, but not zero. So we repeat the process with a new tangent line based on the latest point on the curve.

We appear to be getting closer to the true root each time.

Example 4.3.2

We again look at finding a solution of $xe^x=2$ near $x=1$. To apply Newton's method, we need to calculate values of both the residual function $f$ and its derivative.

We don't know the exact root, so we use nlsolve (from NLsolve) to determine the "true" value.

We use $x_1=2$ as a starting guess and apply the iteration in a loop, storing the sequence of iterates in a vector.
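A sketch of that loop, reusing `f` and `dfdx` from the sketch above:

```julia
x = [2.0]                                        # starting guess x₁ = 2
for k in 1:6
    push!(x, x[end] - f(x[end]) / dfdx(x[end]))  # Newton iteration
end
```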

Here is the sequence of errors.

Glancing at the exponents of the errors, we see that they roughly form a neat doubling sequence until the error becomes comparable to machine precision. We can see this more precisely by taking logs.

Quadratic convergence isn't as graphically distinctive as linear convergence.

This looks faster than linear convergence, but it's not easy to say more. If we could use infinite precision, the curve would continue to steepen forever.

Example 4.3.3

Suppose we want to solve $e^x=x+c$ for multiple values of $c$. We can create functions for $f$ and $f'$ in each case.

There's a subtlety about the definition of f: it uses whatever value is assigned to c at the moment f is called. (This is unlike MATLAB, which locks in the value of c at the moment the function is defined.) If we later change the value assigned to c, the behavior of the function changes as well.
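A brief sketch that illustrates the behavior with a global parameter:

```julia
c = 2
f(x) = exp(x) - x - c    # the body refers to whatever c is bound to when f is called
f(1.0)                   # evaluated with c = 2

c = 4
f(1.0)                   # same call now gives a different value, since c has changed
```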

Example 4.4.1

We return to finding a root of the equation $xe^x=2$.

From the graph, it's clear that there is a root near $x=1$. To be more precise, there is a root in the interval $[0.5,1]$. So let us take the endpoints of that interval as two initial approximations.

Instead of constructing the tangent line by evaluating the derivative, we can construct a linear model function by drawing the line between the two points $\bigl(x_1,f(x_1)\bigr)$ and $\bigl(x_2,f(x_2)\bigr)$. This is called a secant line.

As before, the next value in the iteration is the root of this linear model.

For the next linear model, we use the line through the two most recent points. The next iterate is the root of that secant line, and so on.
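A sketch of the first couple of secant steps for $f(x)=xe^x-2$:

```julia
f(x) = x * exp(x) - 2
x1, x2 = 0.5, 1.0                                # the two initial approximations
x3 = x2 - f(x2) * (x2 - x1) / (f(x2) - f(x1))    # root of the secant line through x1, x2
x4 = x3 - f(x3) * (x3 - x2) / (f(x3) - f(x2))    # next step uses the two most recent points
```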

Example 4.4.2

We check the convergence of the secant method from the previous example.

We don't know the exact root, so we use nlsolve to get a substitute.

Here is the sequence of errors.

It's not so easy to see the convergence rate by looking at these numbers. But we can check the ratios of the logs of successive errors.
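A sketch of that check, assuming the errors are stored in a vector `err`:

```julia
logerr = log.(abs.(err))
ratios = [logerr[k+1] / logerr[k] for k in 1:length(logerr)-1]
```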

It seems to be heading toward a constant ratio of about 1.6, consistent with the theoretical order $(1+\sqrt{5})/2 \approx 1.618$ of the secant method, before it bumps up against machine precision.

Example 4.4.3

Here we look for a root of $x+\cos(10x)$ that is close to 1.

We choose three values to get the iteration started.

If we were using "forward" interpolation, we would ask for the polynomial interpolant of $y$ as a function of $x$. But that parabola has no real roots.

To do inverse interpolation, we swap the roles of $x$ and $y$ in the interpolation.

We seek the value of $x$ that makes $y$ zero. This means evaluating $q$ at zero.
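A sketch of this step using `fit` from the Polynomials package, with `x` and `y` holding the three current points:

```julia
using Polynomials

q = Polynomials.fit(y, x, 2)    # interpolate x as a quadratic function of y
xnew = q(0)                     # the next estimate is the value of x at y = 0
```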

We repeat the process a few more times.

Here is the sequence of errors.

The error seems to be superlinear, but subquadratic.

Example 4.5.2

Let us use Newton's method on the system defined by the function

Here is a function that computes the Jacobian matrix.

(These functions could be written as separate files, or embedded within another function as we have done here.) Our initial guess at a root is the origin.

We need the value (residual) of the nonlinear function, and its Jacobian, at this value for $\mathbf{x}$.

We solve for the Newton step and compute the new estimate.
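A sketch of one such step, with `func` and `jac` standing in for whichever residual and Jacobian functions define the system:

```julia
y = func(x)      # residual at the current estimate
J = jac(x)       # Jacobian at the current estimate
s = -(J \ y)     # Newton step: solve J*s = -y
x = x + s        # updated estimate
```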

Here is the new residual.

We don't seem to be especially close to a root. Let's iterate a few more times.

We find the infinity norm of the residuals.

We don't know an exact answer, so we will take the last computed value as its surrogate.

The following will subtract r from every column of x.

Now we take infinity norms and check for quadratic convergence.
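A sketch of these last two steps, with the iterates stored as columns of `x` and `r` as the surrogate root (the LinearAlgebra standard library supplies `norm`):

```julia
using LinearAlgebra

e = x .- r                                           # subtracts r from every column of x
errs = [norm(e[:, k], Inf) for k in 1:size(e, 2)-1]  # skip the last column, which defined r
logerr = log.(errs)
ratios = logerr[2:end] ./ logerr[1:end-1]            # quadratic convergence doubles the exponent
```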

For a brief time, we see ratios around 2, but then the limitation of double precision makes it impossible for the doubling to continue.

Example 4.5.3

As before, the system is defined by its residual and Jacobian, but this time we implement them as a single function.

Our initial guess is the origin. The output has one column per iteration.

The last column contains the final Newton estimate. We'll compute the residual there in order to check the quality of the result.

Let's use the convergence to the first component of the root as a proxy for the convergence of the vectors.

The exponents approximately double, as is expected of quadratic convergence.

Example 4.6.1

To solve a nonlinear system, we need to code only the function defining the system (residual vector), and not its Jacobian.

In all other respects usage is the same as for the newtonsys function.

It's always a good idea to check the accuracy of the root, by measuring the residual (backward error).
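For instance, a sketch of that check, with `func` as the residual function and the final iterate in the last column of `x`:

```julia
using LinearAlgebra

r = x[:, end]                       # accept the final iterate as the root estimate
backward_err = norm(func(r), Inf)   # size of the residual there
```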

Looking at the convergence of the first component, we find a subquadratic convergence rate, just as with the secant method.

Example 4.7.1

Inhibited enzyme reactions often follow what are known as Michaelis–Menten kinetics, in which a reaction rate $v$ follows a law of the form

$$v(x) = \frac{V x}{K_m + x},$$

where $x$ is the concentration of a substrate. The real values $V$ and $K_m$ are parameters that are free to fit to data. For this example we cook up some artificial data with $V=2$ and $K_m=1/2$.

The idea is to pretend that we know nothing of the origins of this data and use nonlinear least squares on the misfit to find the parameters in the theoretical model function $v(x)$. Note in the Jacobian that the derivatives are not with respect to $x$, but with respect to the two parameters, which are contained in the vector c.
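As a sketch of what that misfit and Jacobian might look like (the data vectors `x` and `y`, and the routine that minimizes the misfit, are assumed from context; `c = [V, K_m]`):

```julia
function misfit(c)
    V, Km = c
    return @. V * x / (Km + x) - y    # model prediction minus observed data
end

function misfitjac(c)
    V, Km = c
    J1 = @. x / (Km + x)              # derivative with respect to V
    J2 = @. -V * x / (Km + x)^2       # derivative with respect to K_m
    return [J1 J2]
end
```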

The final values are close to the noise-free values of $V=2$, $K_m=0.5$ that we used to generate the data. We can calculate the amount of misfit at the end, although it's not completely clear what a "good" value would be. Graphically, the model looks reasonable.

For this model, we also have the option of linearizing the fit process. Rewrite the model as $1/v = (\alpha/x) + \beta$ for the new parameters $\alpha=K_m/V$ and $\beta=1/V$. This corresponds to the misfit function whose entries are

$$f_i(\alpha,\beta) = \alpha \cdot \frac{1}{x_i} + \beta - \frac{1}{y_i},$$

for $i=1,\ldots,m$. Although the misfit is nonlinear in $x$ and $y$, it's linear in the unknown parameters $\alpha$ and $\beta$, and so can be posed and solved as a linear least-squares problem.
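A sketch of that linear least-squares formulation:

```julia
A = [(1 ./ x) ones(length(x))]   # columns multiplying α and β
u = 1 ./ y                       # transformed data
alpha, beta = A \ u              # linear least-squares solution
V = 1 / beta                     # recover the original parameters
Km = alpha / beta
```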

The two fits are different, because they do not optimize the same quantities.