Newton’s method works for convex real functions
Theorem 1.
Let be a convex differentiable functionon an interval , with at least one root.Then the following sequence obtained from Newton’s method,
will converge to a root of , provided that and for the given starting point .
Obviously, “convex” can be replaced by “concave” in Theorem 1.
The proof will proceed in several steps. First, a simple converse result:
Theorem 2.
Let be a differentiable convex function,and be the sequence in Theorem 1.If it is convergent to a number ,then is necessarily a root of .
Proof.
We have
(The first limit is zero by local boundedness of at ,which follows from being finite and monotone11Actually, a differentiable convex function must necessarily have a continuous
derivative
,for the derivative is increasing, and it cannot have any jump discontinuities byDarboux’s Theorem (http://planetmath.org/DarbouxsTheorem)..)∎
Proof of Theorem 1
Case A: when , , and
We claim that whenever , then too.Recall that the graph of a convex function always liesabove any of its tangent lines.So in particular, the point is higher than the point, which is on the tangent line by definition.By induction
, we have for all .
We also have for all .By hypothesis the slope of the tangent line is positive,and lies above the horizontal axis.Then the intersection
point of the tangent line and the horizontal axis,which is , must be to the left of .
So the sequence is decreasing. It converges to a real number ordiverges to .If the limit is a real number, by Theorem 2 that numberis a root of , and we are done.
If the limit is , then we must have for all . Hence for all , as is monotoneand can be arbitrarily large negative.In other words, has no root to begin with.
Case B: when , , and
This situation is the (left-right) mirror of Case A,except that because the slope of the curve is negative,the sequence is increasing this time. We still have ,so the sequence either converges to a root of , or diverges to (in which case has no root).
Case C: when , , and
To begin, note that the first part of Case A, asserting that for all , still goes through only assuming — in which case the sequence would not be defined after the point .We show that, in fact, for all ,so the rest of Case A goes through.
Suppose is the smallest integerfor which .Also assume , otherwise we would be done anyway.The reasoning in Case A shows .
The function is strictly positive on ,for the graph of lies above the tangent line segment extendingfrom to , which in turn liesstrictly above the horizontal axis.Since is decreasing to the left of and increasing to the right, never touches the horizontal axis anywhere.This contradicts our hypothesis that has roots.
Case D: when , , and
This situation is the (left-right) mirrorof Case C. The same argument shows that for all (or for some ), and theargument of Case B applies.
Case E: for general intervals
The only concern is that the iteration might go outside the interval . We show that it does not.
Suppose .Then as we have proved, for all .Suppose but .Then has no root, because the graph of lies abovethe tangent line from to which lies above the horizontal axis.
The limit of could conceivably go just outsidethe interval .But this case is similar to the case already considered in Case A: possesses no root then.
Finally, suppose , the case we have ignored all along.Our hypotheses assume that is defined.We immediately have ,since the graph of lies above the tangent linefrom to .But this just brings us back to the other cases.
Examples
Newton’s method for finding square roots of positive numbers (which reduces to the ancient divide-and-average method) alwaysconverges, for the target function is convex.
References
- 1 Michael Spivak. Calculus, third edition. Publish or Perish, 1994.