Technical Description of Multivariate Polynomial Regression
for Time-Series Analysis

We consider data formed from measurements of nx input, state, and output variables, sampled at equal time intervals, TAU. We do not distinguish among these types of variables here. The data point x[i,j] is the measurement of variable i taken at time step j. In general, each variable may depend on previous measurements of itself and of the other variables, except that input variables do not depend on state or output variables.

A type of ARMA model applied to such systems is the vector autoregressive (VAR) model, in which a prediction is found by a linear combination of previous (lagged) measurements, x[i,j-l]:

$$x[i,j] = \sum_{p} a_p \, x[k_p,\, j - l_p] + e[i,j] \tag{1}$$

where 1 ≤ i ≤ nx, 1 ≤ kp ≤ nx; kp ≠ i if lp = 0; lm+1 ≤ j ≤ nd; lm ≥ 1 is the maximum lag; and e[i,j] is the error in the model prediction. The parameters a are determined by fitting the model to a set of data. An identification process is used to select which of the possible terms in equation 1 contribute significantly to the model, and only those terms are retained.
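As a minimal sketch of fitting a model of this kind, the lagged measurements can be stacked into a design matrix and the parameters estimated by ordinary least squares. The estimator is an assumption here (the text does not say which one is used), as are the function name `fit_lagged_linear` and the `(k, l)` term list. Indices are zero-based in the code, unlike the one-based indices in the text.

```python
import numpy as np

def fit_lagged_linear(x, i, terms):
    """Least-squares fit of x[i, j] ~ sum_p a_p * x[k_p, j - l_p].

    x     : array of shape (nx, nd), one row per variable (zero-based)
    i     : index of the variable being predicted
    terms : list of (k, l) pairs: variable index k and lag l >= 0
    Returns (a, residuals), where residuals are the errors e[i, j].
    """
    x = np.asarray(x, dtype=float)
    for k, l in terms:
        if l == 0 and k == i:
            raise ValueError("variable i at lag 0 cannot predict itself")
    lm = max(l for _, l in terms)      # maximum lag in the term list
    nd = x.shape[1]
    j = np.arange(lm, nd)              # time steps with all lags available
    # One column per retained term: x[k_p, j - l_p]
    design = np.column_stack([x[k, j - l] for k, l in terms])
    target = x[i, j]
    a, *_ = np.linalg.lstsq(design, target, rcond=None)
    return a, target - design @ a

# Synthetic, noise-free example: variable 0 depends on its own lag 1
# and on variable 1 at lag 2.
rng = np.random.default_rng(0)
x = np.zeros((2, 200))
x[1] = rng.normal(size=200)
for t in range(2, 200):
    x[0, t] = 0.5 * x[0, t - 1] + 2.0 * x[1, t - 2]
a, e = fit_lagged_linear(x, 0, [(0, 1), (1, 2)])
```

Because the synthetic series is generated without noise, the fitted parameters should match the generating coefficients to within floating-point error.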

More complex behaviors, such as coupled sensitivities between variables or curvature in the responses, could be included in a VAR model by adding polynomial terms to equation 1:
[4a]

[4b]

The additional parameters of this model compared to equation 1 are bp,m, the (usually positive integer) exponents for each multiplicand in each term, and nm, the maximum number of multiplicands in each term of the model. The indices i, j, k, and l are defined as for equation 1.
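Taken together, these definitions suggest a model of the following general form. This is a reconstruction from the surrounding text and may differ in detail from equations 4a and 4b; the upper summation limit n_p (the number of retained terms) is an assumed symbol.

```latex
x[i,j] = \sum_{p=1}^{n_p} a_p \prod_{m=1}^{n_m}
         x[k_{p,m},\, j - l_{p,m}]^{\,b_{p,m}} + e[i,j]
```

A multiplicand with exponent b_{p,m} = 0 contributes a factor of one, so a term may effectively use fewer than n_m multiplicands.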

The model is made tractable by restricting the values that can be taken on by the exponents, b, the lags, and the value of nm, and by including in the model only those terms that contribute significantly to the fit. The fitting procedure involves a stepwise selection process, described below, in which a set of candidate terms is tested for inclusion in the model.
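One common way to realize a selection of this kind is forward stepwise regression. The sketch below is a generic illustration, not the procedure used by the software: it assumes NumPy, ordinary least squares, and a hypothetical stopping rule (stop when the best remaining candidate lowers the residual sum of squares by no more than a fraction `tol` of its starting value). The name `forward_select` is also hypothetical.

```python
import numpy as np

def forward_select(candidates, y, tol=1e-3, max_terms=None):
    """Greedy forward selection of columns of `candidates` to predict y.

    candidates : array (n_samples, n_candidates), one column per term
    y          : array (n_samples,), the dependent variable
    tol        : assumed stopping rule; minimum drop in the residual sum
                 of squares (RSS), as a fraction of the empty-model RSS
    Returns (selected_column_indices, coefficients in selection order).
    """
    candidates = np.asarray(candidates, dtype=float)
    y = np.asarray(y, dtype=float)
    n_cand = candidates.shape[1]
    max_terms = n_cand if max_terms is None else max_terms

    def fit(cols):
        coef, *_ = np.linalg.lstsq(candidates[:, cols], y, rcond=None)
        resid = y - candidates[:, cols] @ coef
        return float(resid @ resid), coef

    rss0 = float(y @ y)                 # RSS of the model with no terms
    selected, coef, rss = [], np.array([]), rss0
    while len(selected) < max_terms:
        best = None
        for c in range(n_cand):         # try each unused candidate
            if c in selected:
                continue
            trial_rss, trial_coef = fit(selected + [c])
            if best is None or trial_rss < best[0]:
                best = (trial_rss, trial_coef, c)
        # Stop when no candidate is left or the gain is too small.
        if best is None or rss - best[0] <= tol * rss0:
            break
        rss, coef = best[0], best[1]
        selected.append(best[2])
    return selected, coef
```

Each column of `candidates` would hold one candidate polynomial term evaluated at every usable time step. Other criteria, such as significance tests on each coefficient, could replace the RSS threshold without changing the overall structure.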

The restricted set of candidate terms is formed as follows. First, a list of ne candidate exponents is selected, not including zero, which is always assumed. Then a list of lags to be considered is formed. Forming this list may be an iterative process in which lags are added sequentially until the model cannot be improved. In some cases, discontinuous lags may be added to the list to represent expected seasonal effects. If lag 0 is included in the list, then variable i is being correlated with "current" values of the other variables, and variable i at lag 0 must be excluded from the candidate terms. The total number of lags in the list, which may include lag zero, is nl.

The stepwise procedure then selects a set of polynomial terms from the candidates that optimizes the fitting criteria. The resulting MPR model can thus be completely specified by a table containing the following information for each term:

kp,1, lp,1, bp,1; kp,2, lp,2, bp,2; . . . ; kp,nm, lp,nm, bp,nm; ap
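To illustrate how such a table specifies a model, the sketch below evaluates the prediction for one time step. The data layout (a list of terms, each a coefficient plus a list of `(k, l, b)` triples) and the name `predict_mpr` are assumptions made for illustration, and indices are zero-based.

```python
import numpy as np

def predict_mpr(x, terms, j):
    """Evaluate sum_p a_p * prod_m x[k, j - l] ** b at time step j.

    x     : array (nx, nd) of measurements, zero-based indices
    terms : list of (a_p, [(k, l, b), ...]); an empty factor list
            gives a constant term
    j     : time step to predict; must be >= the largest lag used
    """
    x = np.asarray(x, dtype=float)
    total = 0.0
    for a_p, factors in terms:
        product = 1.0
        for k, l, b in factors:
            if j - l < 0:
                raise IndexError("time step j is smaller than lag l")
            product *= x[k, j - l] ** b
        total += a_p * product
    return total

# Hypothetical two-variable series and a three-term model.
x = np.array([[1.0, 2.0, 3.0, 4.0],
              [0.5, 1.0, 1.5, 2.0]])
terms = [
    (1.0, []),                        # constant term
    (2.0, [(0, 1, 1)]),               # 2 * x[0, j-1]
    (0.5, [(0, 1, 1), (1, 2, 2)]),    # 0.5 * x[0, j-1] * x[1, j-2]**2
]
prediction = predict_mpr(x, terms, j=3)   # 1 + 2*3 + 0.5*3*1**2 = 8.5
```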

The Number of Candidate Terms
Adding lagged values increases the number of "independent variables." The total number of independent variables, nv, is:
[6]

where l0 equals one if the list of lags includes zero, and equals zero otherwise, and v0 equals one if lagged dependent variables are included as independent variables, and zero otherwise.
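One expression consistent with these definitions, offered as a reconstruction rather than as the exact form of equation 6, counts the other n_x - 1 variables at every lag in the list and, when v_0 = 1, adds the dependent variable at every lag except zero:

```latex
n_v = (n_x - 1)\, n_l + v_0\, (n_l - l_0)
```

For n_x = 4 and a lag list containing only lag zero (n_l = 1, l_0 = 1), this gives n_v = 3 regardless of v_0.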
The maximum value that nm can take is nv. If nm = nv, then the total number of candidate terms is:
$$n_t = (n_e + 1)^{n_v} \tag{5}$$

This may result in a large number of terms to be tested for selection into the model. For example, with three independent variables (nx = 4), no lags (nl = 1), and ten exponents (ne = 10), there are nt = 1331 possible terms. Experience with a wide variety of datasets has shown that nm can often be restricted to two or three. For nm ≤ nv:
$$n_t = \sum_{m=0}^{n_m} C(n_v, m)\, n_e^{\,m} \tag{7}$$

where C(nv, m) is the number of combinations of nv objects taken m at a time. For the example above, if nm = 2, the number of candidate terms drops to 331.
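The counts quoted in this example can be checked with a few lines of Python. The sketch assumes the closed form n_t = sum over m from 0 to nm of C(nv, m) * ne^m, which reduces to (ne + 1)^nv when nm = nv and reproduces the figures 1331 and 331. The function names are hypothetical, and a brute-force enumeration of exponent assignments serves as a cross-check.

```python
from itertools import product
from math import comb

def count_terms(nv, ne, nm=None):
    """Number of candidate terms with at most nm multiplicands.

    nv : number of independent variables (including lagged ones)
    ne : number of candidate nonzero exponents
    nm : maximum multiplicands per term; defaults to nv
    """
    nm = nv if nm is None else min(nm, nv)
    return sum(comb(nv, m) * ne**m for m in range(nm + 1))

def enumerate_terms(nv, exponents, nm):
    """Yield each exponent assignment (0 = variable absent) that has
    at most nm nonzero entries. `exponents` must not contain zero."""
    choices = (0,) + tuple(exponents)
    for combo in product(choices, repeat=nv):
        if sum(b != 0 for b in combo) <= nm:
            yield combo

print(count_terms(nv=3, ne=10))         # 1331
print(count_terms(nv=3, ne=10, nm=2))   # 331
```

The enumeration grows as (ne + 1)^nv, so it is practical only for small cases; the closed form is what one would use to size a real candidate set.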