The Multivariate Regression Composer (MVR) is open-source software for generating and selecting non-linear regression models in Matlab. The basic version of the MVR Composer can be downloaded as a zipped file (release 61) or as the latest version from sourceforge using Subversion. In this article, I explain how to install and use the MVR Composer.
The input data needed to build a model:
- the regression sample set: dependent and independent variable values,
- a set of primitive functions (optional),
- a set of initial models (optional).
The result is the non-linear regression model of the optimal structure:
- the model as a formula, a symbolic description to be used in further research and publications,
- the vector of model parameters to be used in forecasting,
- a plot of the model in .png, or in .eps for TeX publications.
Multivariate Regression Composer readme
MVR generates and selects non-linear regression models. It is written in the Matlab language and distributed as open-source code.
Introduction
This software is intended as a curve-fitting tool. The models (curves) are generated using a set of primitive functions. More information on the algorithms can be found in the presentation and the paper. The complete documentation in English is forthcoming. The fields of application include biology, physics, ecology, and economics.
Mathematical modeling has two tasks: first, to create a model of a dynamic system from expert knowledge, and second, to discover a model, and the expertise behind it, from measured data. Accordingly, there are model-driven and data-driven approaches, each with strengths and weaknesses. The first gives models that experts in the field of application can interpret, but these models usually have poor prediction quality. The second provides models of good quality, but they are often too complex for experts to interpret. The suggested approach combines the strengths of both: the resulting model can be explained, and it relies on the measured data. It yields a model with acceptable quality and generalization ability compared to universal models.
A model is selected from an inductively generated set of trial models according to its adequacy: the model must be simple, stable, and precise. These criteria serve as target functions, and they are assigned according to the given data. The given data are assumed to carry information on both the sought model and the noise. The hypothesis about the probability distribution function defines a data generation hypothesis and, consequently, the target functions.
The outline of the automatic model creation is as follows. A data sample consisting of several independent variables and one dependent variable is given. Experts provide a set of terminal functions. The trial models are arbitrary superpositions, inductively generated from the terminal functions. Experts can also provide initial models for inductive modification. Once the generated models are tuned, a model of the optimal structure is selected.
Thus, the result is the non-linear regression model of the optimal structure, delivered as:
- the model as a formula, a symbolic description to be used in further research and publications,
- the vector of model parameters to be used in forecasting,
- a plot of the model in .png, or in .eps for TeX publications.
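As a rough illustration of the outline above, the following Matlab sketch builds one trial model as a superposition of two primitive functions and tunes its parameters on a synthetic sample. It is not MVR's generator or optimizer: the handles linear_ and parabola_, the synthetic data, and the use of fminsearch with a sum-of-squares target are all assumptions made for this example.

```matlab
% Primitive functions in the form y = f(w, x). Anonymous handles are used
% here only to keep the sketch in one file (these definitions are hypothetical).
linear_   = @(w, x) w(1) * x;
parabola_ = @(w, x) w(1) + w(2)*x + w(3)*x.^2;

% One trial model: the superposition parabola_(linear_(x)).
% Its parameter vector concatenates the parameters of both primitives.
model = @(w, x) parabola_(w(1:3), linear_(w(4), x));

% Initial parameters chosen so that every primitive starts as the identity.
w0 = [0 1 0 1];

% Synthetic sample, made up for illustration only.
x = linspace(-2, 2, 41);
y = 1 + 0.5 * x.^2;

% Tuning: minimize the sum of squared residuals (an assumed target function).
sse  = @(w) sum((model(w, x) - y).^2);
wFit = fminsearch(sse, w0);

fprintf('SSE before tuning: %g, after tuning: %g\n', sse(w0), sse(wFit));
```

With these initial parameters the superposition reduces to y = x before tuning, so the search starts from a neutral model.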
Installation
This software is a Matlab toolbox, so you need Matlab installed. There are two ways to get the software:
A. Download it from /files/mvr61.zip, unzip it, and run "main.m".
B. Get the latest version. Connect your SVN shell extension to sourceforge.net. To do that, do the following:
- Download and install TortoiseSVN;
- Make the folder somedrive:\somefolder\mvr;
- Right-click the folder to open the context menu and choose Tortoise->Checkout;
- Enter the repository URL https://mvr.svn.sourceforge.net/svnroot/mvr;
- This will download the software.
Run demo project
To watch the MVR demo, run main.m with a demo project. There are three examples:
- main('demo.prj.txt'): two-variate regression,
- main('sinc.prj.txt'): one-variate regression,
- main('options.prj.txt'): stock-market options (Brent Crude Oil) modelling.
Do your own project
To make your own project, do the following.
- Make the data file "filename.dat.txt" with the following content:
  y, x1, x2, ..., xn
  ...
  y, x1, x2, ..., xn
  See, for example, "demo.dat.txt" or "sinc.dat.txt".
- Make the registry file "filename.reg.txt" with the following content:
  function_, n, m, [opt initial parameters], [domain]
  Here n is the number of arguments of the function and m is the number of its parameters. Choose the initial parameters so that the function is the identity, if possible, for example:
  parabola_, 1, 3, [0 1 0], [] % y = w(1) + w(2)*x + w(3)*x.^2;
  See more examples in "demo.reg.txt". The file "function_.m" must be placed in the folder "mvr\func\" with the content
  function y=function_(w,x)
  y = w(1)*x;
  See more examples in this folder, and note the main rules:
  - No matter what shape x has (scalar, vector, or matrix), y must have the same shape.
  - Use the parameter vector w as a set of scalars, say, w(1), ..., w(k). See the example above.
  - Function names are "function[number of arguments][a|l]_.m", where a stands for the affine transformation of the argument, say y = sqrt(w(2)* x + w(1)); and l stands for the linear transformation, say y = sqrt(w(1)* x);
  - The sign "_" is used to avoid possible collisions with other Matlab functions.
- Make the initial model file "filename.mdl.txt" with the following content:
  foo2_(foo_(foo2_(x1, foo_(x2))),...)
  All functions foo_, foo2_ must be in the registry file. See more examples in "demo.mdl.txt".
- Make the project file "filename.prj.txt" with the content
  DataFile = 'filename.dat.txt';
  ModelsFile = 'filename.mdl.txt';
  RegistryFile = 'filename.reg.txt';
  ...
  etc.; see, for example, "demo.prj.txt".
- Place these files in the folder "mvr\data\" and run main('filename.prj.txt').
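The data-file layout and the rules for primitive functions can be checked before running main. The sketch below writes a synthetic sample in the y, x1, x2 layout and tests a function body against the shape rule. The file name myproject.dat.txt, the synthetic data, the comma-plus-space separator, and the use of the temporary folder are assumptions for illustration. The parabola_ body follows the comment in the registry example, and the toolbox's own file may differ.

```matlab
% Synthetic sample with two independent variables (made up for illustration).
n  = 50;
x1 = linspace(-3, 3, n)';
x2 = linspace(0, 1, n)';
y  = 1 + 2*x1 - 0.5*x2.^2;

% Rows follow the layout "y, x1, x2".
data = [y, x1, x2];

% Hypothetical file name; written to the temporary folder here and meant to be
% copied to mvr\data\ afterwards.
dataFile = fullfile(tempdir, 'myproject.dat.txt');
fid = fopen(dataFile, 'w');
fprintf(fid, '%.10g, %.10g, %.10g\n', data');   % transpose: fprintf walks columns
fclose(fid);

% Read the file back to confirm that the layout survived the round trip.
fid = fopen(dataFile, 'r');
check = fscanf(fid, '%g, %g, %g', [3, Inf])';
fclose(fid);
assert(isequal(size(check), size(data)));

% Body of a primitive function as it could appear in mvr\func\parabola_.m:
%   function y = parabola_(w, x)
%   y = w(1) + w(2)*x + w(3)*x.^2;
% The same body as a handle, so that the rules can be tested in this script.
parabola_ = @(w, x) w(1) + w(2)*x + w(3)*x.^2;

xTest = rand(3, 4);                              % matrix input
yTest = parabola_([0 1 0], xTest);               % initial parameters from the registry example
assert(isequal(size(yTest), size(xTest)));       % rule: y has the same shape as x
assert(max(abs(yTest(:) - xTest(:))) < 1e-12);   % [0 1 0] gives the identity
```

Element-wise operators such as .^ are what keep the output the same shape as the input; a plain ^ would fail or change meaning for matrix arguments.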
Download MVR Composer
The MVR Composer is open-source software for generating linear and non-linear models. It is hosted at sourceforge.net and is available for download as mvr61.zip.