% Brock University - Economics 4F10
% Practical Worksheet #2

% The following worksheet demonstrates how to perform diagnostic tests for
% a linear regression model of the form y = X*beta + u, u~N(0,sigma2), in
% MATLAB, where for simplicity% X = nX3, and b = 3x1 (i.e. three regressors,
% including the constant), and how to calculate the coefficient of
% determination (the R2).

% Papers cited in comments:

% White, H. (1980). A heteroskedasticity-consistent covariance matrix
% estimator and a direct test for heteroskedasticity. Econometrica: Journal
% of the Econometric Society, 48(4), 817–838.

% Engle, R. "Autoregressive Conditional Heteroscedasticity with Estimates
% of the Variance of United Kingdom Inflation. Econometrica. Vol. 96, 1988,
% pp. 893-920.

% Durbin, J., and Watson, G., "Testing for Serial Correlation in
%  Least-Squares Regression," Biometrika, vol. 38, 1951, pp. 159–171.

Load data and set parameters

load SampleRegressionData; % load data (same as in Practical Worksheet #1)
n = size(X,1); % set number of observations equal to the # of rows in X
k = size(X,2) + 1; % set number of coefficients to # of columns in X plus constant

Estimate coefficients vector using OLS and its covariance matrix

% Steps are the same as in Practical Worksheet #1

xmat = [ones(n,1) X];
betahat = inv(xmat'*xmat)*(xmat'*y);
yhat = xmat*betahat;
uhat = y - yhat;
sigma2hat = (uhat'*uhat)/(n-k);
covbetahat = sigma2hat*inv(xmat'*xmat);

Estimate the coefficient of determination (the R-squared)

% Recall that R^2 is defined as the ratio of explained sum of squares (ESS)
% over total sum of squares (TSS).

TSS = sum((y-mean(y)).^2); % find TSS
ESS = sum((yhat-mean(y)).^2); % find ESS
R2 = ESS/TSS; % find coefficient of determination
disp(R2); % display the R^2

Do diagnostic test #1: White's test for conditional heteroskedasticity

% Recall that the test of White (1980) for conditional heteroskedasticity
% involves regressing the squared residuals on all regressors included in
% X, their squares and cross-products, using an auxiliary regression.

% First, define the new data matrix containing regressors for auxiliary
% model, and the new dependent variable:

xmat_whites_test = [ones(n,1) X X.^2 X(:,1).*X(:,2)];
ymat_whites_test = uhat.^2;

% Next, estimate the auxiliary model using OLS:

betahat_whites_test = inv(xmat_whites_test'*xmat_whites_test)*...

% Recall that the test statistic for the test of White (1980) is n*R^2,
% where R^2 is the coefficient of determination from the auxiliary
% regression, and n is the sample size as before.

R2_auxiliary_regression = sum((xmat_whites_test*betahat_whites_test - mean(ymat_whites_test)).^2)/...
    sum((ymat_whites_test - mean(ymat_whites_test)).^2); % R^2 from auxiliary regression

whites_test_statistic = n*R2_auxiliary_regression; % White's test statistic

% Recall that under H0 of homoskedasticity, this statistic asymptotically
% follows a Chi-Squared distribution with (p-1) degrees of freedom, where
% p is the number of columnts in the auxiliary regression data matrix.

% To carry out the test, look up corresponding cricical value, and reject
% if whites_test_statistic > critical value.

% If you have MATLAB's Statistics Toolbox installed, you can obtain the
% critical value using the chi2inv() function (see help for more info.):

level_of_significance = 0.01;
whites_critical_value = chi2inv(1-level_of_significance,...
    size(xmat_whites_test,2) - 1);

disp('Whites test statistic:');
disp('Whites critical value:');

% Note: if you have Econometrics Toolbox, you can also test for residual
% heteroskedasticity using the ARCH test of Engle (1988). See MATLAB help
% for archtest() function.
Whites test statistic:

Whites critical value:

Do diagnostic test #2: the Durbin-Watson d-test for serial correlaiton

% Recall that the d-test of Durbin and Watson (1951) can be used to test
% for the presense of serial correltion among regression residuals. If
% present, this can lead to bias and inefficiency of OLS estimates.

% Recall that the DW-statistic was simply the ratio of the sum of squared
% differences between successive residuals, and the residual sum of
% squares, the SSR.

dw_statistic = sum((uhat(2:end)-uhat(1:end-1)).^2)/sum(uhat.^2);

disp('Durbin-Watson Statistic:');

% Recall that the limiting distribution of DW statistic is a bit
% complicated, and involves rejection, non-rejection and indecision
% regions. Here, the statist falls in nonrejection region, and hence we do
% not reject the null of no auto-correlation.

% Note: if you have Statistics Toolbox, you can also run the DW test using
% the dwtest() function. See help for more info.
Durbin-Watson Statistic: