Least angle regression (LARS)

LARS is one of the most effective algorithms for selecting linear regression models. It is less greedy than stepwise regression and requires fewer steps than the Lasso.

Contents

Create synthetic data
Run LARS
Plot parameters at each step of LARS
Plot the results
Warning

Create synthetic data

We fix a model (shown in the code below) and then try to recover it with LARS.

% Assign a model
f = @(w,x) w(1)*x.^0 + w(2)*x.^(1/2) + w(3)*x.^1; % synthetic model
b = [1 -2 2]'; % parameters
x = [0.1:0.1:1]'; % independent variable
y = f(b,x); % dependent variable

% We use the following candidate features to recover the model
g = {'x.^0', 'x.^(1/2)', 'x.^1', 'x.^(3/2)', 'x.*log(x)'};
% build the design matrix; each column is one feature evaluated at x
X = eval([ '[' g{1} ' ' g{2} ' ' g{3} ' ' g{4} ' ' g{5} ']']);
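
The classical LARS derivation assumes that every column of X is centered and normalized and that y is centered; the demo below runs on the raw features instead. Here is a minimal standardization sketch, as an optional preprocessing step that is not used in the rest of the demo (note that the constant column x.^0 has to be excluded, because centering turns it into zeros):

% Optional: standardize the features as the LARS derivation assumes
% (requires MATLAB R2016b+ for implicit expansion; not used below)
Xs = X(:,2:end);               % drop the constant column x.^0
Xs = Xs - mean(Xs,1);          % center each column
Xs = Xs ./ sqrt(sum(Xs.^2,1)); % scale each column to unit length
ys = y - mean(y);              % center the response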

Run LARS
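
For reference, the loop below computes the standard LARS quantities of Efron, Hastie, Johnstone and Tibshirani (2004); in LaTeX notation, with $X_A$ the matrix of sign-adjusted active columns and $1_A$ a vector of ones,

$$ c = X^\top (y - \mu_A), \qquad C = \max_j |c_j|, \qquad G_A = X_A^\top X_A, $$
$$ A_A = (1_A^\top G_A^{-1} 1_A)^{-1/2}, \qquad w_A = A_A G_A^{-1} 1_A, \qquad u_A = X_A w_A, \qquad a = X^\top u_A, $$
$$ \gamma = \min_j{}^{+} \left\{ \frac{C - c_j}{A_A - a_j},\; \frac{C + c_j}{A_A + a_j} \right\}, $$

where $\min^{+}$ runs over positive components only; at the last step $\gamma = C / A_A$, which lands exactly on the least-squares solution.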

[n,p] = size(X);
muA = zeros(n,1);  % current estimate of the dependent variable
beta = zeros(p,1); % current estimate of the parameters
betaLtst = beta';  % history of the parameter vectors, one row per step

for i = 1:p
    % correlations between each feature (column of X) and the residual vector
    c = X'*(y-muA); % NB: the LARS derivation assumes standardized columns of X (see the note above)
    [C, A] = max(abs(c)); % largest absolute correlation and the index of the corresponding column
    Sj = sign(c(A)); % sign of the correlation of the selected column
    XA = X(:,A).*(ones(n,1)*Sj'); % sign-adjusted active column(s)
    G = XA'*XA; % Gram matrix of the active columns
    oA = ones(1,length(A)); % vector of ones
    AA = (oA*G^(-1)*oA')^(-0.5); % normalizing constant of the equiangular direction

    wA = AA*G^(-1)*oA'; % weights that define the unit bisector
    uA = XA*wA; % the equiangular direction (unit bisector)
    a = X'*uA; % inner products used to compute the step length gamma
    if i<p % for all steps but the last
        M = [(C-c)./(AA-a); (C+c)./(AA+a)];
        M(M<=0) = +Inf; % the minimum is taken over positive components only
        gamma = min(M);
    else
        gamma = C/AA; % last step: go all the way to the least-squares solution
    end

    muA = muA + gamma*uA; % update the approximation of the dependent variable
    beta(A) = beta(A) + gamma*wA.*Sj; % update the parameters of the active columns
    betaLtst = [betaLtst; beta']; % store the parameters after the i-th step
end
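
If the Statistics and Machine Learning Toolbox is available, the built-in lasso function offers a rough cross-check: it traces a related regularization path by coordinate descent rather than by LARS, so its path should look similar for this example. A minimal sketch, not part of the original demo:

% Optional cross-check (assumes the Statistics and Machine Learning Toolbox)
[B, FitInfo] = lasso(X, y, 'Standardize', false);
figure; plot(sum(abs(B),1), B'); % Lasso coefficient paths versus their l1-norm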

Plot parameters at each step of LARS

s1 = sum(abs(betaLtst),2); % l1-norm of the parameter vector after each step
plot(s1, betaLtst);
xlabel('sum of absolute model parameters');
ylabel('parameters');
legend('\beta_1', '\beta_2', '\beta_3', '\beta_4', '\beta_5')


Plot the results

idx = find(beta ~= 0);
disp('Recovered model contains')
g(idx)
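
% As an illustrative addition (not in the original demo), print the
% recovered model as a formula, pairing each feature with its coefficient:
for k = 1:length(idx)
    fprintf('%+g * %s\n', beta(idx(k)), g{idx(k)});
end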

% Recover LARS results using the last value of beta
y1 = X(:,idx)*beta(idx);

% Recover Least Squares result
Xa = X(:,idx);
y3 = Xa * pinv(Xa'*Xa)*Xa'*y;

plot(x,y,'*');
hold on
plot(x,y1,'r-');
plot(x,y3,'b-');
legend('Source data', 'LARS fit', 'Least squares fit')
Recovered model contains

ans = 

    'x.^0'    'x.^(1/2)'    'x.^1'
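
To quantify how close the two recovered fits are, one can compare their residual norms; a small illustrative addition, not part of the original demo:

% Compare the residual norms of the LARS fit and the Least Squares fit
fprintf('LARS fit residual norm: %g\n', norm(y - y1));
fprintf('Least squares fit residual norm: %g\n', norm(y - y3));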


Warning

% Note that ordinary Least Squares on the full design matrix
w = pinv(X'*X)*X'*y
% gives the same result (for this noiseless example, of course)
w =

    1.0000
   -2.0000
    2.0000
    0.0000
   -0.0000
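
The agreement holds only because the example is noiseless and the true model lies exactly in the span of the first three features. A quick hypothetical experiment, not part of the original demo, shows how the full Least Squares solution loses its sparsity once noise is added:

% Hypothetical experiment: perturb the response and refit by Least Squares
rng(1); % fix the random seed for reproducibility (any seed will do)
yn = y + 0.01*randn(size(y)); % noisy response
wn = pinv(X'*X)*X'*yn % weight now typically spreads over all five features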
