Practical implementation of the Householder's numerical method for root finding in polynomials.
Abstract
The Householder's method is actually a set of higher order iterative root finding methods. For order 1, it reduces to Newton-Rhapson iterations, and for order 2 reduces to the the Halley iterations, it can be generalized for higher orders. The convergence order is O+1 where O it's the order method.
Introduction
Polynomials are one of the most important mathematical objects. They serve in a vast field of engineering, mathematics, and physics, for example many solutions of differential equations in physics pass through using orthogonal polynomials (Hermite to describe quantum states, Legendre in spherical harmonics and so on). As far as engineering is concerned, polynomials are essential tool to solve many applied problems. For example polynomials are the key piece constructing RF filters and diplexers (see bibliography [4], [5]) or even analyzing LTI (Linear Time Invariant) systems since associated integro-differential equations to rational functions of polynomials by Laplace Transform. The set of applications is huge, they are even used in coding theory, error code detection and error code correction applied to telecommunications and large data storage systems since there is a particular family of polynomials related to Galois fields. The solution to many of these problems pass through solve the roots of the polynomials, but as far as we know, there is not direct solution for polynomials of degree 5 or greather. The solution then is to use iterative methods to get an approximation to these roots.
The well known set of methods such as Newton Rhapson, Müller or Bairstow [4], and eigenvalue method (this last in Matlab ®) performs well the root finding task. But all of these methods loses the convergence properties when the roots have multiplicity order greather than 1. The search for algorithms that overcome this problem has led me to investigate further and I have come across a generalization of certain higher order iterative methods among which Newton Raphson belongs to this family.
When I learned about the Householder iterations, I saw that the code was implemented with symbolic calculation. I set out to find a way to do it purely numerically and ran into several problems that I try to summarize in this article. Finally I include in my Github repository (please, see below) the python codes that demonstrate the validity of the practical implementation of the method.
Problem statement
In the search for roots of the above equation.
There is a family of iterative methods consisting of the following iteration scheme:
Note that for r = 1 the Newton-Rhapson method is obtained:
And for r = 2 this lead us to Halley's method:
We are interested here in creating an automatic method to generate Householder iterations with an arbitrary order and specifically for polynomials, instead of having to operate manually. For this we must be able to mechanically obtain the k derivatives of
up to an arbitrary order.
Algorithm implementation
To differentiate k times a function of the last form, we can use the well-known differentiation rule for the quotient of functions:
If we apply the differentiation on this function iteratively, he expression becomes quite complex, however we can use the following recursive scheme. Let:
We have the following recurrence relationship:
Let's iterate two times:
The last expression can be simplified as:
Lastly if we iterate a third time this lead us to:
At this point, the recursive scheme becomes clear by induction:
If we come back to the Householder iterations in their most general form, we have for the k-th order:
Numerical issues
Being
the resulting polynomial of µk increases in degree N-1.
Furthermore, an explosion of degree and of the values of the coefficients can occur when convolving v0 with the preceding derivatives of µk. In addition to this, an explosion of degree and of the values of the coefficients can occur when convolving v0 with the preceding derivatives of µk.
Let us designate the operator:
as a function that returns the degree of a polynomial, we have then the following properties:
Let us then take the evolution of the µk throughout the iterations of (13):
Furthermore, the derivation of the polynomials alters the coefficients as follows:
With all this, it is possible to guess (and in fact it is verified as it will be shown later), an extremely large explosion in degree and values of the coefficients that can lead to overflow in the digital machine as well as convergence problems.
If we take for example the following polynomial:
We have the following sequence:
If we take the following polynomial:
The 8th term is:
Recursive calculation scheme
Let us consider the possibility of calculating numerically the values µk, µ'k and v0 without evaluating intermediate polynomials in order to avoid the aforementioned explosion of polynomial coefficients.
We must Keep in mind that the final result of the polynomial in Horner's scheme, is the product of a sequence of consecutive additions and multiplications, the result of which an intermediate calculation may exceed the final result.
We therefore consider how to obtain the recursive scheme (11) from initial values µ0, v0:
To obtain the term µ1 we need µ0 and µ'0, then for µ2, we need µ1 and µ'1 which in turn requires µ0, µ'0 and µ''0. In general, by induction for the k-th term, 2 terms would be needed derived from k-1 th iteration, 3 from k-2 th, 4 from k-3 th and so on until reaching k + 1 terms derived from µ0. If we group them, there is an upper triangular matrix:
Let's take a look at (22) and consider the computation graph generated for each product when we apply a derivative:
For each product, the derivative expands the calculation graph by a factor of 2, and from successive expansions we can see that there is a general behavior:
That is, due to the combinatorial relationship, the exponential expansion O(2^n) of the branches, contracts to a linear expansion O(n) (we can see here Newton's binomial coefficients).
This recursive operation is equivalent to searching the coefficients of the matrix (23) whose rows and columns are constructed from:
The construction is therefore carried out from top to bottom.
We will also need the successive derivatives of v0 up to order k + 1 . Since µ0 = 1 all the elements below diagonal of the (23) are zero, and the same remains for:
Algorithm
At this point, we are ready to build the generic algorithm for Householder avoiding the creation of polynomials:
Where N is the order of method, x0 initial guess point, maxiter the maximum number of iterations, and epsilon_x the desired tolerance.
Practical code in Python
The following Python code has been created according with the pseudocode above and successfully tested. This code does not currently cover efficiency, but rather consists of a proof of concept and could be improved in many aspects. For example, every time the matrix calculation in (23) is invoked, the binomial coefficients are reconstructed, but in fact they remain fixed during all iterations.
The same goes for the calculation of v0 and its derivatives in terms of optimization by constructing Horner's scheme. Leaving all these suboptimal aspects aside, in the calculation, the expressions type (24) in the form of scalar products have been compacted into a matrix and vector form.
Code 1: Binomial coefficients in matrix form
Code 2: Matrix calculation defined in expression (23)
Code 3: Proposed algorithm iteration example
Code 4: Householder's iterations convolving polynomials
The following code is not part of the exposed algorithm, it aims to justify the reasons for the study in improving precision by establishing a comparison (benchmark) between applying the colvonving polynomials method and the proposed algorithm. In this case the benchmark or reference algorithm is packed into a class.
And the instantiation is as follows:
Code 5: Householder's convolving polynomials class iterations
Some tests and results
The following tables shows us a comparison between the convolutional (HH Conv) algorithm and the new proposed algorithm (HH Prop). Where x0 is the starting point or initial guess point for some root of polynomial P. xk is the last point, converged point or root approximation for P. And finally epsilon_k is the error estimation (whose log10(|eps_k|) indicates the stable number of decimal places achieved in the last iteration).
Target polynomial:
Table 1: Householder's iterations order 6
Table 2: Householder's iterations order 3
Discussion
Tables 1 and 2 shows clearly convergence issues evaluating large polynomials and how these issues get worse when we increase the order of the iterations. As we can see at table 1 (Order 6 iterations), there is no convergence in many cases for convolutional algorithm against proposed algorithm.
There is also a case that may surprise in table 1 for the starting point 1.01 + i, both algorithms converge and have same error. Being theoretically equivalent except for the propagation of calculation errors, this should not surprise us. There may be conditions under which both algorithms behave equivalently. In the rest of cases we can see how convergence even manages to double the number of decimal places for the same number of iterations.
The convergence to different points in table 2 for an initial point x0 = 2, may be indicative of convergence problems for attractor 1.0 against point -1.0, this remains for further study
Why the proposed algorithm works better?
The algorithm proposed here in ùatrix form has a recursive scheme for calculating the higher-order coefficients for the iterations, which are obtained from the zero-order coefficients, this largely remedies this problem, although in its calculations it may also have relatively large numbers . In fact, in the evaluation of polynomials we have:
If we consider a small error in x:
However in evaluating a quotient:
Here if the two numbers u and v are of similar order, these errors tend to be "desensitized" against the previuos case.
Overcoming other convergence issues
The two examples considered (order 6 and order 3) for the chosen polynomial represent, a vague frontier where the polynomial convolution algorithm begins to fail when exceeding the precision of the numerical representation in the machine.
Coefficients normalization
Due to floating point arithmetic in a computer, it is recommended that this renormalization or scaling factor, be a positive power (or negative if the values are very small) of 2, in this way, only the exponent part of the of the coefficients in floating point will change while the mantissa remains unchanged.
I have found that a Coefficients normalization helps, but does not solve the problem as efficiently. The ideal would be to use both techniques, where the renormalization of coefficients is a pre-processing of the polynomial. Pre-processing a polynomial or it's domain in complex plane it's not new, in [4] no-linear complex mapping is used to separate clustered roots.
Good guess initial point in complex plane
Generally speaking, evaluating polynomials with values far away from it's roots implies also large values, even some methods only have convergence guaranteed if the starting point is in some region near to a root, [3] estimates the upper and lower bound of the radii in the complex plane where the roots can be found:
It's higly recommended to use these bounds. Futhermore if our target polynomial is an Hurwitz polynomial, like those that appear in passive radio frequency filters [4] and [5], the roots will be restricted to the left half plane in complex domain.
Github Code repository
Here you can download Jupyter notebooks of tested examples.
Bibliography
- [Oscar Veliz, 2019] Householder's Method www.youtube.com/watch?v=F9DFewL0mho (In the comments below the video I discuss with the author about some practical issues of the method).
- [Wikipedia] Householder's Method en.wikipedia.org/wiki/Householder%27s_method
- [Demidovich I.A Maron, 1984] Cálculo numérico fundamental ISBN 10: 842830887X, ISBN 13: 9788428308878
- [Sedra 1978] Filter theory and design: Active and passive (Matrix series in circuits and systems). ISBN 10: 0916460142, ISBN 13: 9780916460143
- [Cameron, Raafat, Chandra 2007] Microwave Filters for Communication Systems: Fundamentals, Design and Applications. ISBN: 978-1-119-29238-8
Online preview of notebooks https://nbviewer.jupyter.org/github/byteptr/householder_root_finding/blob/master/hh_raw.ipynb https://nbviewer.jupyter.org/github/byteptr/householder_root_finding/blob/master/hh_matrix.ipynb