Simetrica, LLC - Mathematical Modeling Software and Consultant Services

Data Types Needed for MPR

This is a frequently asked question. The short answer is as follows:
Any numerical data that can be arranged in rows and columns can be modeled by MPR.

Each row represents a single "data point." Each data point is a single measurement of the dependent variable and its associated independent variables. Each column represents a different variable. To take the retail store example, each store is represented by a row in the dataset. Each variable (gross margin, supplies-to-equipment ratio, market share, etc.) occupies a different column.
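As a minimal sketch of this layout, the retail store dataset could be held as a list of rows with one shared list of column names. The numbers, the column names, and the helper `column` are hypothetical, and gross margin is treated as the dependent variable purely for illustration.

```python
# Hypothetical retail-store dataset: one row per store, one column per variable.
columns = ["gross_margin", "supplies_to_equipment_ratio", "market_share"]
rows = [
    [0.31, 1.8, 0.12],  # store 1
    [0.27, 2.1, 0.09],  # store 2
    [0.35, 1.5, 0.15],  # store 3
    [0.29, 1.9, 0.11],  # store 4
]


def column(rows, columns, name):
    """Return every measurement of one variable (one column of the dataset)."""
    j = columns.index(name)
    return [row[j] for row in rows]


# Illustrative choice only: use gross margin as the dependent variable.
dependent = column(rows, columns, "gross_margin")
```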

There should be more data points than variables (more rows than columns), although exceptions are possible. The final model can have no more terms or coefficients than there are data points in the dataset used for fitting.
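These two size rules can be checked before fitting. The sketch below is not part of MPR; the function name `shape_problems` and its arguments are assumptions made for illustration, and the first check is only a warning, since exceptions to it are possible.

```python
def shape_problems(n_points, n_variables, n_terms):
    """Return a list of messages describing violated size rules (empty if none).

    n_points    -- number of rows (data points) used for fitting
    n_variables -- number of columns (variables)
    n_terms     -- number of terms/coefficients in the candidate model
    """
    problems = []
    if n_points <= n_variables:
        # Rule of thumb only: exceptions are possible.
        problems.append("warning: not more data points than variables")
    if n_terms > n_points:
        # Hard limit: the model cannot have more terms than data points.
        problems.append("error: more model terms than data points")
    return problems
```

For example, `shape_problems(10, 3, 11)` reports only the second problem, because 11 terms cannot be fitted to 10 data points.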

If a data point is missing the value of one or more variables, either the data point must be removed or the missing values must be filled in. A simple way to fill in a missing value is to use the mean of that variable over the rest of the data points.
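Both options can be sketched in a few lines. Here a missing value is marked with `None`, which is an assumption of this example rather than an MPR convention, and the function names are hypothetical.

```python
def drop_incomplete(rows):
    """Option 1: remove every data point that has a missing value."""
    return [row for row in rows if None not in row]


def fill_with_column_means(rows):
    """Option 2: replace each missing value with the mean of its column,
    computed from the data points where that variable is present."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        present = [row[j] for row in rows if row[j] is not None]
        if not present:
            raise ValueError(f"column {j} has no observed values")
        means.append(sum(present) / len(present))
    return [
        [means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]
```

Removing rows keeps only measured values but shrinks the dataset; filling with the mean keeps every row at the cost of introducing values that were never measured.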

If the dataset is a time series, then there can be no missing data points. That is, no measurements can have been skipped. If any are missing, they must be filled in, either with the mean as described above or by interpolating between the measurements before and after the gap.
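A minimal sketch of the interpolation option follows. It assumes equally spaced measurements, marks skipped measurements with `None`, and uses straight-line (linear) interpolation; all three are choices made for this example. Gaps at the very start or end of the series have no neighbor on one side, so they are left unfilled.

```python
def interpolate_gaps(series):
    """Fill interior gaps (None) by linear interpolation between the
    nearest known measurements before and after each gap."""
    filled = list(series)
    known = [i for i, value in enumerate(filled) if value is not None]
    for left, right in zip(known, known[1:]):
        # Constant step between two known neighbors (equal spacing assumed).
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled
```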

The data must be numerical. However, some qualitative variables can be transformed into numbers. In the retail stores example, some variables had a yes/no quality. These could be represented numerically as a 1 or a 0. Similarly, male/female, treated/untreated, or any other two-way distinction could be represented this way. Variables coded in this manner are called coded variables or dummy variables.
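The coding step itself is simple. In the sketch below, the function name `to_dummy` and the sample labels are hypothetical; which of the two labels is coded as 1 is an arbitrary choice, as long as it is applied consistently.

```python
def to_dummy(values, one_label):
    """Code a two-way qualitative variable as 1 (one_label) or 0 (the other label)."""
    labels = set(values)
    if len(labels) > 2:
        raise ValueError(f"expected a two-way distinction, got {sorted(labels)}")
    return [1 if value == one_label else 0 for value in values]


# Hypothetical yes/no column coded with yes -> 1, no -> 0.
coded = to_dummy(["yes", "no", "yes", "yes"], one_label="yes")
```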

Sometimes the dependent variable itself is a dummy variable. In the wavemaker machine example, failure of the machine was coded as 1 and nonfailure as 0. When a prediction is then made, the output should be rounded off to 0 or 1. A different type of regression, called logistic regression, is designed for this situation. MPR can be used in its place, and it brings along its advantage of being able to describe nonlinearities, including interactions.
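Rounding the output might look like the sketch below. The function name `to_class` and the 0.5 cutoff are assumptions of this example; the sample predictions are made up. A regression output is not confined to the range 0 to 1, so values above 1 or below 0 are mapped to the nearer code as well.

```python
def to_class(prediction, threshold=0.5):
    """Round a continuous model output to the dummy codes 1 (failure) or 0 (nonfailure)."""
    return 1 if prediction >= threshold else 0


# Hypothetical raw model outputs and their rounded classes.
raw_outputs = [0.93, 0.12, 1.20, -0.05]
classes = [to_class(p) for p in raw_outputs]
```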