Least Angle Regression, demo

LARS (Least Angle Regression) is one of the most effective algorithms for selecting linear regression models. It is less greedy than stepwise regression and reaches its solution in fewer steps than the Lasso.

Contents

Create synthetic data
Run LARS
Plot parameters at each step of LARS
Plot the results
Warning
Create synthetic data
We define a model in the code below, then try to recover it with LARS from a larger set of candidate features.
% Assign a model
f = inline('w(1)*x.^0 + w(2)*x.^(1/2) + w(3)*x.^1', 'w','x'); % synthetic model
b = [1 -2 2]';     % parameters
x = [0.1:0.1:1]';  % independent variable
y = f(b,x);        % dependent variable
% We use the following features to find the model we thought of
g = {'x.^0', 'x.^(1/2)', 'x.^1', 'x.^(3/2)', 'x.*log(x)'};
% make the sample set; columns are independent variables
X = eval([ '[' g{1} ' ' g{2} ' ' g{3} ' ' g{4} ' ' g{5} ']']);
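The LARS derivation assumes that each column of X is centered and scaled to unit length; this demo uses the raw features as-is. Below is a minimal sketch of the scaling step, an addition of mine rather than part of the original demo; note that centering would annihilate the constant feature 'x.^0', so only scaling is shown.

% Sketch (not applied in this demo): scale each column of X to unit Euclidean length
Xn = X ./ (ones(size(X,1),1) * sqrt(sum(X.^2,1)));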
Run LARS
[n,p] = size(X);
muA  = zeros(n,1);  % estimate of the dependent variable
beta = zeros(p,1);  % estimate of the parameters
betaLtst = beta';   % storage for the parameters at each step
for i = 1:p
    % correlation coefficients between each feature (column of X) and the
    % vector of residuals; note that LARS assumes the columns of X are
    % centered and normalized
    c = X'*(y-muA);
    [C, A] = max(abs(c)); % maximal absolute correlation and the corresponding column index
    % A = find(C == abs(c));
    % Aplus = find(C==c); % never used
    Sj = sign(c(A));      % sign of the selected correlation coefficient
    XA = X(:,A).*(ones(n,1)*Sj');
    % XA = X(:,A)*Sj;
    G  = XA'*XA;            % Gram matrix of the active columns
    oA = ones(1,length(A)); % vector of ones
    AA = (oA*G^(-1)*oA')^(-0.5); % normalizing constant from the normal equation
    wA = AA*G^(-1)*oA';     % weights that define the unit bisector
    uA = XA*wA;             % the unit bisector (equiangular direction)
    a  = X'*uA;             % products needed to compute the new gamma
    if i < p                % for all steps but the last
        M = [(C-c)./(AA-a); (C+c)./(AA+a)];
        M(M <= 0) = +Inf;
        gamma = min(M);
    else
        gamma = C/AA;
    end
    muA = muA + gamma*uA;             % update the estimate of the dependent variable
    beta(A) = beta(A) + gamma*wA.*Sj; % update the parameters
    betaLtst = [betaLtst; beta'];     % store the parameters at the i-th step
end
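A quick sanity check one can append here (my addition, using the variables from the loop above): since the data are noise-free and the true model lies in the span of the candidate features, the final LARS estimate should reproduce y closely, as the plot below also suggests.

% Sketch: the final residual should be small for this noise-free example
fprintf('Final residual norm: %g\n', norm(y - muA));
disp(beta'); % final parameter estimates, one per feature in g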
Plot parameters at each step of LARS
s1 = sum(abs(betaLtst),2); % L1-norm of the parameter vector after each step
plot(s1, betaLtst);
xlabel('sum of absolute values of the parameters');
ylabel('parameters');
legend('\beta_3', '\beta_1', '\beta_2')
Plot the results
idx = find(beta ~= 0);
disp('Recovered model contains')
g(idx)
% Recover the LARS fit using the final value of beta
y1 = X(:,idx)*beta(idx);
% Recover the Least Squares fit on the selected features
Xa = X(:,idx);
y3 = Xa*pinv(Xa'*Xa)*Xa'*y;
plot(x,y,'*');
hold on
plot(x,y1,'r-');
plot(x,y3,'b-');
legend('Source data', 'LARS fit', 'Least Squares on selected features');
hold off
Recovered model contains

ans =

    'x.^0'    'x.^(1/2)'    'x.^1'
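To see how quickly the fit improves along the path, one can also tabulate the residual norm after each LARS step. This is a small sketch of mine, reusing the betaLtst storage filled in the loop above.

% Sketch: residual norm after each step of the LARS path
nSteps = size(betaLtst,1);
res = zeros(nSteps,1);
for k = 1:nSteps
    res(k) = norm(y - X*betaLtst(k,:)'); % betaLtst(k,:) holds beta after k-1 steps
end
disp([(0:nSteps-1)' res]); % step number and residual norm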
Warning
% Note that ordinary Least Squares on the full feature set
w = pinv(X'*X)*X'*y
% gives the same result (for this very example, of course)
w =

    1.0000
   -2.0000
    2.0000
    0.0000
   -0.0000
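The agreement holds because y was constructed without noise and lies exactly in the span of three candidate features, so both methods recover the exact coefficients. A quick way to make them diverge (my sketch, reusing f, b, x and X from above) is to perturb the observations and refit Least Squares:

% Sketch: with noisy observations the full Least Squares solution
% no longer vanishes on the redundant features
randn('state', 0);                 % fix the seed (old-style syntax, consistent with inline above)
yn = f(b,x) + 0.05*randn(size(x)); % noisy dependent variable
wLS = pinv(X'*X)*X'*yn             % compare with w above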