There are a few problems that are more or less unique to
polynomial regression, and each of them has approaches for controlling
it. These are nonlinear bias, polynomial wiggle, and explosive behavior.
Nonlinear bias is an inevitable result of using nonlinear models
with data containing noise. For example, we have an excellent model
to "predict" the area of a circle, A, using the diameter,
D: A = πD²/4. A circle 2 meters in diameter is accurately
computed to have an area A = 3.14159 square meters. Now, suppose
we have two measurements. One measurement is off by 10% on the high
side, and the other is 10% low (i.e. 2.2 m and 1.8 m). They average
to the true value of 2.0 m, and thus are an unbiased sample. However,
the areas computed with these two values are 3.17301 m2
and 2.54469 m2. They average to 3.17301 m2,
which is 1% too high!
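The arithmetic above can be verified with a short script, using the same diameters as in the text:

```python
import math

# Two diameter measurements of a circle whose true diameter is 2.0 m:
# one 10% high, one 10% low.
diameters = [2.2, 1.8]

# The measurements themselves are unbiased: they average to the truth.
mean_diameter = sum(diameters) / len(diameters)          # 2.0

# But the areas computed from them are biased high.
areas = [math.pi * d ** 2 / 4 for d in diameters]
mean_area = sum(areas) / len(areas)
true_area = math.pi * 2.0 ** 2 / 4

print(f"mean diameter: {mean_diameter:.1f} m")           # 2.0 m
print(f"mean of computed areas: {mean_area:.5f} m^2")    # 3.17301 m^2
print(f"true area: {true_area:.5f} m^2")                 # 3.14159 m^2
print(f"bias: {100 * (mean_area / true_area - 1):.2f}%") # +1.00%
```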
The fact that the formula for the area of a circle exhibits nonlinear
bias does not move us to replace it with a linear approximation.
That would produce a different type of bias of a worse sort, in
which the model is inherently incapable of describing the behavior
of the underlying relationship. Instead, we generally live with
the problem as the better solution available. Furthermore, there
are mathematical methods available that can compensate for nonlinear
bias.
The same is true for Multivariate Polynomial Regression models.
Although including nonlinear effects in a model introduces nonlinear
bias, this is preferable to ignoring the nonlinear effects.
If a set of data containing noise is fit to a polynomial with too
few degrees of freedom, the resulting polynomial can produce errors
much greater than those of the data. This situation is called polynomial
wiggle. The extreme case is interpolation, in which there are as
many parameters in the polynomial as there are data. Recall from
the Basic Introduction to MPR that the degrees of freedom, df, is
equal to the number of data points, n, minus the number of parameters,
p (df = n - p). Thus interpolation is the situation in which df
= 0. A low number of degrees of freedom may be acceptable if the
noise is sufficiently small; otherwise, the wiggle problem may occur.
To illustrate, we start with the equation y = e^x (see Figure below).
The dark blue line is the plot of the equation over the range from
0.0 to 0.4. Next we picked four points along the line, plus one
that is displaced by noise (e.g. measurement error). With five points
it is possible to exactly fit (interpolate) a fourth-order polynomial.
The red line shows the resulting polynomial. Notice that it goes
through each point exactly. You can see that the largest error in
the curve is greater than the individual error at the one "wrong"
point. This is polynomial wiggle.
This problem can be minimized by ensuring that there are adequate
degrees of freedom. If we fit the same five points using a quadratic
polynomial, we obtain the model shown by the green line. Although
the model no longer fits the four "good" points exactly,
the overall fit is much improved. A quadratic model has three parameters,
so the degrees of freedom in this example is 5 - 3 = 2.
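The wiggle effect and its remedy can be sketched numerically. The point locations and noise level below are illustrative choices, not the exact values used in the Figure:

```python
import numpy as np

# Five equally spaced points on [0, 0.4] from y = e^x, with the middle
# point displaced by a fixed "measurement error" of 0.05.
x = np.linspace(0.0, 0.4, 5)
y = np.exp(x)
y[2] += 0.05

# Interpolation: 5 points, 5 parameters, so df = 5 - 5 = 0.
interp = np.poly1d(np.polyfit(x, y, 4))
# Quadratic fit: 3 parameters, so df = 5 - 3 = 2.
quad = np.poly1d(np.polyfit(x, y, 2))

# Compare each model to the true curve on a fine grid inside the data range.
xx = np.linspace(0.0, 0.4, 401)
err_interp = np.max(np.abs(interp(xx) - np.exp(xx)))
err_quad = np.max(np.abs(quad(xx) - np.exp(xx)))

# The interpolant reproduces the noisy point exactly, so its error is at
# least the size of the noise; the quadratic smooths it out.
print(f"max error, 4th-order interpolant (df = 0): {err_interp:.4f}")
print(f"max error, quadratic fit (df = 2):         {err_quad:.4f}")
```

The quadratic cannot pass through all five points, so the single bad point is averaged against its neighbors instead of being reproduced exactly, and the overall error shrinks.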
A related control measure is to keep the maximum exponent low,
say 2 or 3.
Another problem is explosive behavior, which is characterized by
extremely high errors when the model is used outside the range of
the data used in generating the model.
The Figure below shows an example based on the same exponential
equation as shown above in the discussion of polynomial wiggle.
The blue line, again, is a plot of the exact exponential equation.
The red diamonds show points that have had noise added to each of
them, representing measurement error.
Again, we took the five points and fitted a fourth-order polynomial
to them. This resulted in the red line, which goes through each
data point exactly. However, as soon as you use the model to make
a prediction outside the range of the data points used to create
this model, the prediction becomes wildly wrong.
The preferred way to deal with this problem is simply not to use
the model outside the range of the data. This is true for any empirical
model, not just for polynomials. Even linear models can be very
unreliable outside the range for which they were validated. When
the TaylorFit software saves the model specification in a "pfm"
file, it also saves the ranges of all the variables used in the
model, so the user can check that new inputs fall within the
proper range.
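As a minimal sketch of the kind of check this enables (the variable names and ranges here are hypothetical, not the actual "pfm" file format):

```python
# Hypothetical variable ranges, as might be saved alongside a fitted model.
ranges = {"D": (1.8, 2.2), "T": (10.0, 35.0)}

def in_range(inputs, ranges):
    """Return True only if every input falls within its training range."""
    return all(lo <= inputs[name] <= hi for name, (lo, hi) in ranges.items())

print(in_range({"D": 2.0, "T": 20.0}, ranges))   # True
print(in_range({"D": 2.5, "T": 20.0}, ranges))   # False: D is extrapolating
```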
A second, but riskier, strategy is the same as the treatment for
polynomial wiggle: Maintain adequate degrees of freedom. When we
fit a quadratic equation to the five points (df = 2) we get a model
shown by the green line in the Figure. However, this only widens
the range before the model explodes.
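The explosion is easy to reproduce numerically; the points and noise below are illustrative, chosen in the same way as in the wiggle example:

```python
import numpy as np

# Five points on [0, 0.4] from y = e^x, middle point displaced by 0.05.
x = np.linspace(0.0, 0.4, 5)
y = np.exp(x)
y[2] += 0.05

interp = np.poly1d(np.polyfit(x, y, 4))  # interpolation, df = 0
quad = np.poly1d(np.polyfit(x, y, 2))    # quadratic fit, df = 2

# Extrapolate well outside the fitted range [0, 0.4].
x_out = 1.0
truth = np.exp(x_out)
print(f"true value at x = 1.0:  {truth:.3f}")          # 2.718
print(f"quadratic prediction:   {quad(x_out):.3f}")
print(f"4th-order prediction:   {interp(x_out):.3f}")
```

The quadratic drifts from the true value, but the fourth-order interpolant is off by more than an order of magnitude at x = 1.0: the high-order terms that bent the curve through the noisy point dominate as soon as x leaves the data range.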
It's interesting to note that if we fit the fourth-order polynomial
to the data without noise, we get a good fit over a much wider range
than the data, even wider than the quadratic fit to the noisy data.
But eventually, it, too, explodes. Of course, Taylor's Theorem
proves that it is possible to attain any degree of accuracy over
any range desired, if one chooses a polynomial of high enough degree.
This is the basis for the claim that the MPR model is capable of
describing any functional relationship.