This demo shows how to use the least squares method to fit data. We generate a one-dimensional regression sample set, construct three linear regression models, and tune their parameters to the sample set.
Generate a sample set
Assume a sample set reflects some ‘unknown’ law we must discover.
x = linspace(0,10,200)';                  % independent variable
y = x + 1.5*sin(x) + randn(size(x,1),1);  % dependent variable
% y = 0.5 + 0.7*x + randn(size(x,1),1);   % alternative set
% dlmwrite('data4lsf.csv', [y,x]);        % we can save the set in a .csv file
D = [y,x]; % usually we store the dependent variable in the 1st column, and the independent variable(s) after it
Load the file
If an external sample data file is given, load it.
% D = dlmread('data4lsf.csv');
y = D(:,1);      % the 1st column is the dependent variable
x = D(:,2:end);  % the rest are independent variables
x = x(:,1);      % below we consider one-dimensional regression
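As a minimal sketch of the save and load round trip, assuming the file is written with the dependent variable in the first column. The temporary file name and the '%.17g' precision setting are choices made for this illustration, not part of the original demo.

```matlab
x = linspace(0,10,200)';                      % independent variable
y = x + 1.5*sin(x);                           % noise-free target, for illustration only
fname = [tempname, '.csv'];                   % hypothetical temporary file name
dlmwrite(fname, [y,x], 'precision', '%.17g'); % dependent variable goes first
D  = dlmread(fname);                          % read the comma-separated file back
yr = D(:,1);                                  % dependent variable
xr = D(:,2:end);                              % independent variable(s)
delete(fname);                                % remove the temporary file
maxdiff = max(abs([yr - y; xr - x]));         % largest round-trip difference
```

Without the 'precision' option, dlmwrite keeps only about five significant digits, so the reloaded data would differ slightly from the original.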
The first model, linear function
The first model is just a straight line, y = a x + b. In the code, w(1) and w(2) play the roles of a and b.
A = [x, ones(size(x))];  % linear function
w = (A'*A)\(A'*y);       % solve the normal equation (backslash avoids an explicit inverse)
y1 = w(1) .* x + w(2);   % the regression
r = y-y1;                % calculate residuals
y1txt = sprintf('y_1=w_1x+w_2, SSE=%0.2f', r'*r); % SSE means sum of squared errors
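For reference, the normal equation used in this code follows from minimizing the sum of squared errors. A short derivation, where S(w) is a name introduced here for that sum:

```latex
\begin{aligned}
S(\mathbf{w}) &= \|\mathbf{y} - A\mathbf{w}\|^2
              = (\mathbf{y} - A\mathbf{w})^{\mathsf{T}}(\mathbf{y} - A\mathbf{w}), \\
\nabla_{\mathbf{w}} S &= -2A^{\mathsf{T}}(\mathbf{y} - A\mathbf{w}) = 0
  \;\Longrightarrow\; A^{\mathsf{T}}A\,\mathbf{w} = A^{\mathsf{T}}\mathbf{y}, \\
\mathbf{w} &= (A^{\mathsf{T}}A)^{-1}A^{\mathsf{T}}\mathbf{y}.
\end{aligned}
```

The inverse exists when the columns of A are linearly independent. The value r'*r printed in the legend text is S(w) evaluated at this solution.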
The second model, quadratic function
The regression model is y = a x^2 + b x + c
A = [x.^2, x, ones(size(x))];          % quadratic function
w = (A'*A)\(A'*y);                     % solve the normal equation
y2 = w(1) .* x.^2 + w(2) .* x + w(3);  % the regression
r = y-y2;                              % calculate residuals
y2txt = sprintf('y_2=w_1x^2+w_2x+w_3, SSE=%0.2f', r'*r);
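As a sanity check, the quadratic fit can be compared with two built-in routes to the same least squares solution. This is a sketch, not part of the original demo. The fixed random seed and the names w_normal, w_qr and p are assumptions made for illustration.

```matlab
rng(0);                                   % fixed seed so the run is repeatable (assumption)
x = linspace(0,10,200)';                  % independent variable
y = x + 1.5*sin(x) + randn(size(x,1),1);  % dependent variable, as in the demo
A = [x.^2, x, ones(size(x))];             % quadratic design matrix
w_normal = (A'*A)\(A'*y);                 % solution of the normal equation
w_qr     = A\y;                           % least squares via backslash on A itself
p        = polyfit(x,y,2)';               % built-in polynomial fit, highest power first
d_qr     = max(abs(w_normal - w_qr));     % difference between the two solvers
d_poly   = max(abs(w_normal - p));        % difference from polyfit
```

Both differences should be at the level of rounding error. Working with A\y directly is usually the better-conditioned choice, because forming A'*A squares the condition number of the problem.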
The third model, linear combination of linear and sine
The regression model is y = a x + b sin(x) + c
A = [x, sin(x), ones(size(x))];           % linear and sine function
w = (A'*A)\(A'*y);                        % solve the normal equation
y3 = w(1) .* x + w(2) .* sin(x) + w(3);   % the regression
r = y-y3;                                 % calculate residuals
y3txt = sprintf('y_3=w_1x+w_2sin(x)+w_3, SSE=%0.2f', r'*r);
Plot the results
h = figure;
plot(x,y,'k.');
hold on
plot(x,[y1 y2 y3]); %,'LineWidth',2);
hold off
title('Least Squares Linear Fit');
legend('sample data',y1txt,y2txt,y3txt);
xlabel('x');
ylabel('y');
% saveas(h,'img/demo_least_squares_fit.png','png'); close(h);
Note that ‘linear model’ does not mean ‘straight line’. It means the model is a linear combination (weighted sum) of some functions, so it is linear in its parameters w even when those functions are nonlinear in x.
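To illustrate this point, here is a minimal sketch in which the model is described only by a list of basis functions. The cell array basis and the noise-free sample set are assumptions made for this example.

```matlab
x = linspace(0,10,200)';                   % independent variable
y = x + 1.5*sin(x);                        % noise-free law, so the fit is exact up to rounding
basis = {@(t) t, @(t) sin(t), @(t) ones(size(t))};  % hypothetical list of basis functions
A = zeros(numel(x), numel(basis));         % design matrix, one column per basis function
for k = 1:numel(basis)
    A(:,k) = basis{k}(x);                  % evaluate the k-th basis function on the sample
end
w = A\y;                                   % least squares weights
sse = sum((y - A*w).^2);                   % sum of squared errors of the fit
```

Here w comes out close to [1; 1.5; 0], since the data were built from exactly these basis functions. Swapping in other function handles changes the shape of the fitted curve but not the fitting step.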
% this file: demo_least_squares_fit