- Load data and set parameters
- Estimate coefficients vector using OLS and its covariance matrix
- Estimate the coefficient of determination (the R-squared)
- Do diagnostic test #1: White's test for conditional heteroskedasticity
- Do diagnostic test #2: the Durbin-Watson d-test for serial correlation
% Brock University - Economics 4F10
% Practical Worksheet #2
%
% The following worksheet demonstrates how to perform diagnostic tests for
% a linear regression model of the form y = X*beta + u, u ~ N(0, sigma2),
% in MATLAB, where for simplicity the design matrix is n-by-3 and beta is
% 3-by-1 (i.e. three regressors, including the constant), and how to
% calculate the coefficient of determination (the R^2).
%
% Papers cited in comments:
% White, H. (1980). A heteroskedasticity-consistent covariance matrix
% estimator and a direct test for heteroskedasticity. Econometrica, 48(4),
% 817-838.
% Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with
% estimates of the variance of United Kingdom inflation. Econometrica,
% 50(4), 987-1007.
% Durbin, J., and Watson, G. S. (1951). Testing for serial correlation in
% least-squares regression. II. Biometrika, 38, 159-171.
clear;
load SampleRegressionData;   % load data (same as in Practical Worksheet #1)
n = size(X,1);               % set number of observations equal to the # of rows in X
k = size(X,2) + 1;           % set number of coefficients to # of columns in X plus the constant
% Steps are the same as in Practical Worksheet #1
xmat = [ones(n,1) X];                      % design matrix including the constant
betahat = inv(xmat'*xmat)*(xmat'*y);       % OLS estimate of beta
yhat = xmat*betahat;                       % fitted values
uhat = y - yhat;                           % residuals
sigma2hat = (uhat'*uhat)/(n-k);            % estimate of the error variance
covbetahat = sigma2hat*inv(xmat'*xmat);    % covariance matrix of betahat
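% (Optional, not part of the original worksheet.) As a quick sanity check,
% the same OLS estimates can be obtained with MATLAB's backslash operator,
% which solves the least-squares problem more stably than inv(); the name
% betahat_check below is introduced only for this illustration.
betahat_check = xmat\y;                    % numerically stable OLS solve
disp(max(abs(betahat - betahat_check)));   % should be (close to) zero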
% Recall that R^2 is defined as the ratio of the explained sum of squares
% (ESS) over the total sum of squares (TSS).
TSS = sum((y-mean(y)).^2);       % find TSS
ESS = sum((yhat-mean(y)).^2);    % find ESS
R2 = ESS/TSS;                    % find coefficient of determination
disp(R2);                        % display the R^2
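% (Optional check, not part of the original worksheet.) Because the
% regression includes a constant, the same R^2 can be computed from the
% residual sum of squares as 1 - SSR/TSS; the names SSR and R2_alt are
% introduced only for this check.
SSR = sum(uhat.^2);      % residual sum of squares
R2_alt = 1 - SSR/TSS;    % equals ESS/TSS when a constant is included
disp(R2_alt);            % should match R2 above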
% Recall that the test of White (1980) for conditional heteroskedasticity
% involves regressing the squared residuals on all regressors included in
% X, their squares and cross-products, using an auxiliary regression.
% First, define the new data matrix containing the regressors for the
% auxiliary model, and the new dependent variable:
xmat_whites_test = [ones(n,1) X X.^2 X(:,1).*X(:,2)];
ymat_whites_test = uhat.^2;
% Next, estimate the auxiliary model using OLS:
betahat_whites_test = inv(xmat_whites_test'*xmat_whites_test)*...
    (xmat_whites_test'*ymat_whites_test);
% Recall that the test statistic for the test of White (1980) is n*R^2,
% where R^2 is the coefficient of determination from the auxiliary
% regression, and n is the sample size as before.
R2_auxiliary_regression = sum((xmat_whites_test*betahat_whites_test - mean(ymat_whites_test)).^2)/...
    sum((ymat_whites_test - mean(ymat_whites_test)).^2);   % R^2 from auxiliary regression
whites_test_statistic = n*R2_auxiliary_regression;          % White's test statistic
% Recall that under H0 of homoskedasticity, this statistic asymptotically
% follows a Chi-Squared distribution with (p-1) degrees of freedom, where
% p is the number of columns in the auxiliary regression data matrix.
% To carry out the test, look up the corresponding critical value, and
% reject H0 if whites_test_statistic > critical value.
% If you have MATLAB's Statistics Toolbox installed, you can obtain the
% critical value using the chi2inv() function (see help for more info.):
level_of_significance = 0.01;
whites_critical_value = chi2inv(1-level_of_significance,...
    size(xmat_whites_test,2) - 1);
disp('Whites test statistic:');
disp(whites_test_statistic);
disp('Whites critical value:');
disp(whites_critical_value);
% Note: if you have the Econometrics Toolbox, you can also test for residual
% heteroskedasticity using the ARCH test of Engle (1982). See the MATLAB
% help for the archtest() function.
Whites test statistic:
   12.6228
Whites critical value:
   15.0863
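% (Optional, assuming the relevant toolboxes are installed; not part of the
% original worksheet.) A p-value for White's test can be computed from the
% chi-squared CDF via chi2cdf() (Statistics Toolbox), and the residuals can
% also be checked with the ARCH test of Engle (1982) via archtest()
% (Econometrics Toolbox). The variable names below are just for illustration.
whites_p_value = 1 - chi2cdf(whites_test_statistic,...
    size(xmat_whites_test,2) - 1);   % p-value for White's test
disp(whites_p_value);
[h_arch, p_arch] = archtest(uhat);   % h_arch = 1 rejects homoskedasticity at the 5% level
disp(p_arch);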
% Recall that the d-test of Durbin and Watson (1951) can be used to test
% for the presence of serial correlation among the regression residuals.
% If present, serial correlation makes the OLS estimator inefficient and
% the usual standard errors and tests unreliable (and, if lagged dependent
% variables are among the regressors, the estimates can also be biased).
% Recall that the DW statistic is simply the ratio of the sum of squared
% differences between successive residuals to the residual sum of squares,
% the SSR.
dw_statistic = sum((uhat(2:end)-uhat(1:end-1)).^2)/sum(uhat.^2);
disp('Durbin-Watson Statistic:');
disp(dw_statistic);
% Recall that the limiting distribution of the DW statistic is a bit
% complicated, and involves rejection, non-rejection and indecision
% regions. Here, the statistic falls in the non-rejection region, and hence
% we do not reject the null of no autocorrelation.
% Note: if you have the Statistics Toolbox, you can also run the DW test
% using the dwtest() function. See help for more info.
Durbin-Watson Statistic:
    2.0510
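% (Optional, assuming the Statistics Toolbox is installed; not part of the
% original worksheet.) The dwtest() function returns both a p-value and the
% DW statistic, which avoids the table lookup for the rejection and
% indecision regions; the names p_dw and d_dw are just for illustration.
[p_dw, d_dw] = dwtest(uhat, xmat);   % p-value and DW statistic
disp('Durbin-Watson p-value:');
disp(p_dw);
disp('Durbin-Watson statistic:');
disp(d_dw);                          % should match dw_statistic above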