INTRODUCTION TO ECONOMETRICS Assessment Period: January 2019
L1090
THE UNIVERSITY OF SUSSEX
BA AND BSc
Jan 2019 (A1)
DO NOT TURN OVER UNTIL INSTRUCTED TO BY THE LEAD INVIGILATOR
SECTION A – Answer all questions
 A researcher is studying the effect of parents’ level of education on educational achievement of individuals. Using data on highest grade of education completed (educ), mother’s level of education (motheduc), father’s level of education (fatheduc), a measure of cognitive ability (abil), and the number of siblings per individual, the researcher estimates a number of models. The table below shows the OLS estimates, with standard errors in brackets.
                        Model 1        Model 2        Model 3
Dependent variable      educ           educ           educ
Independent variables
motheduc                0.30 (0.03)    –              0.17 (0.03)
fatheduc                0.19 (0.02)    –              0.11 (0.02)
abil                    –              0.52 (0.03)    0.39 (0.03)
abil^{2}                –              0.05 (0.01)    0.05 (0.01)
Number of siblings      –              0.15 (0.03)    0.10 (0.03)
Intercept               6.94 (0.32)    12.14 (0.12)   8.74 (0.31)
N                       1230           1230           1230
SSR                     5114.31        4193.89        3741.82
R^{2}                   0.25           0.38           0.45
a) First, explain what a ceteris paribus effect is. Second, explain the advantage of multiple regression analysis over simple regression analysis for ceteris paribus analysis. [5 marks]
The ceteris paribus effect is the effect of one variable on the dependent variable when all other factors are held constant [2 marks].
In simple linear regression, every factor other than the single explanatory variable is left in the error term. We have to assume that these factors are zero on average and unrelated to the explanatory variable.
In multiple linear regression we explicitly control for the factors that we believe might affect the dependent variable. By controlling for those factors in the model, we can be more confident that we capture the ceteris paribus effect of the variable of interest on the dependent variable [3 marks].
b) Interpret each of the OLS estimates in model (1). [6 marks]
Model 1 has 2 slope coefficients and one intercept.
The coefficient of motheduc predicts that an extra grade of the mother's education increases the respondent's education by 0.30 of a grade, on average and ceteris paribus.
The corresponding effect is predicted to be 0.19 of a grade for an extra grade of the father's education, on average and ceteris paribus.
The intercept predicts that for individuals whose parents have zero education, the highest grade of education completed is 6.94 on average. [2 marks per coefficient].
c) Interpret the coefficient of abil^{2} in model (3). Can you reject the claim that the returns to ability are linear? Be explicit about the hypothesis that you make in order to answer the question. [7 marks]
The coefficient of abil^{2} measures how the marginal effect of ability on educ changes with the level of ability. In model (3) the marginal effect is 0.39 + 2(0.05)abil, so it is not fixed and it increases with ability. (Students can explain this in terms of a nonlinear effect, or how the effect of ability varies at different levels and is not fixed.) [2 marks]
To test for linear returns we test whether the coefficient of abil^{2} is statistically different from zero. The null hypothesis of this test is:
H_{0}: β_{abil^{2}} = 0 [1 mark]
against the alternative H_{1}: β_{abil^{2}} ≠ 0. [1 mark] No marks for a one-tailed alternative.
t = 0.05/0.01 = 5 [1 mark]
To test this null we compare the test statistic with the critical value from the z distribution. The critical value at the 5% significance level for a two-tailed test is 1.96.
[1 mark]
If a student lost a mark above for setting up a one-tailed test but has obtained the correct critical value for a one-tailed test, they should be given the 1 mark for this section.
Since the test statistic is greater than the critical value, we reject the null that the returns to ability are linear. [1 mark]
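The t-test above can be checked numerically. The following is a minimal sketch, assuming the Model 3 values from the table (estimate 0.05, standard error 0.01) and the 1.96 critical value quoted in the answer. The helper name `t_statistic` is hypothetical and not part of the marking scheme.

```python
def t_statistic(estimate, std_error, null_value=0.0):
    """t statistic for H0: beta = null_value (hypothetical helper)."""
    return (estimate - null_value) / std_error

# abil^2 in Model 3: estimate 0.05, standard error 0.01 (from the table).
t_abil2 = t_statistic(0.05, 0.01)

# Two-tailed 5% critical value from the z table, as quoted in the answer.
critical_value = 1.96

# Reject H0 when the absolute t statistic exceeds the critical value.
reject_null = abs(t_abil2) > critical_value
print(round(t_abil2, 2), reject_null)
```

With n = 1230 the normal approximation to the t distribution is used, which is why the z critical value appears here.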
d) Comparing model (2) with model (3), at the 5% level of significance, test the hypothesis that parents' (mother and father) education has no effect on the educational achievement of an individual. Be explicit about the hypothesis that you make. [7 marks]
Here we are testing that father's and mother's education have no effect on the dependent variable, hence the null hypothesis is β_{motheduc} = β_{fatheduc} = 0. This hypothesis excludes two of the variables from the model.
To test this hypothesis, we compare the R-squared of the restricted model with that of the unrestricted model, through an F-test.
The unrestricted model is model (3) and the restricted model is model (2), where we have excluded motheduc and fatheduc from the model.
The test statistic is
F = ((R^{2}_{ur} − R^{2}_{r})/q) / ((1 − R^{2}_{ur})/(n − k − 1))
where R^{2}_{ur} is the R-squared of the unrestricted model, model 3 in this case; R^{2}_{r} is the R-squared of the restricted model, model 2; and q is the number of restrictions, 2 restrictions in this case.
k is the number of independent variables in the unrestricted model, k = 5 in this case.
F = ((0.45 − 0.38)/2) / ((1 − 0.45)/(1230 − 5 − 1)) ≈ 77.9. The critical value for F_{2, 1230−5−1} at the 1% significance level is 4.605. Since the F-statistic is greater than the critical value, we reject the null that father's and mother's education have no effect on an individual's educational attainment.
Students can choose a different significance level to form their answer.
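The joint test of the two exclusion restrictions can be sketched in code. This is an illustrative calculation only, assuming the R-squared, SSR, N and k = 5 values reported in the table and text. The function names are hypothetical. The SSR form is included because the table also reports SSR; the two forms differ slightly here because the reported R-squared values are rounded to two decimals.

```python
def f_stat_r2(r2_unrestricted, r2_restricted, q, n, k):
    """F statistic for q exclusion restrictions, R-squared form."""
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1.0 - r2_unrestricted) / (n - k - 1)
    return numerator / denominator

def f_stat_ssr(ssr_restricted, ssr_unrestricted, q, n, k):
    """F statistic for q exclusion restrictions, SSR form."""
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / (n - k - 1)
    return numerator / denominator

# Model 2 is restricted, Model 3 is unrestricted (k = 5 regressors).
f_from_r2 = f_stat_r2(0.45, 0.38, q=2, n=1230, k=5)
f_from_ssr = f_stat_ssr(4193.89, 3741.82, q=2, n=1230, k=5)

critical_value = 4.605  # 1% critical value quoted in the answer
print(round(f_from_r2, 2), round(f_from_ssr, 2))
print(f_from_r2 > critical_value and f_from_ssr > critical_value)
```

Either version is far above the critical value, so the conclusion does not depend on which form is used.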
 e) Comparing model 1 and model 3, why did the coefficient of parents’ (mother and father) education change after introducing ability to the model? [10 marks]
Comparing model 1 and model 3, we can see that after controlling for ability, the coefficients of both mother's and father's education have fallen [2 marks for commenting on the effect].
This suggests that ability is correlated with the dependent variable and with father's and mother's education [3 marks for mentioning the correlation between ability and the other regressors].
Without further information we cannot verify the size of the bias. Model 3 also controls for ability squared and the number of siblings, so we cannot isolate the omitted variable bias that is due to excluding ability from the model.
One possible scenario is that ability has a positive, though nonlinear, effect on educational attainment (as seen from its coefficients). It might be the case that father's and mother's education are positively correlated with ability as well. In that case, not controlling for ability creates an omitted variable bias. If people whose parents are better educated have a higher level of ability, then not controlling for ability results in an upward bias in the coefficients of father's and mother's education.
[5 marks for a similar discussion on omitted variable bias].
 f) Test the overall significance of model (3). [5 marks]
We can do an F-test for the statistical significance of the overall model:
The null hypothesis of the test is that all the slope coefficients are zero, against the alternative that at least one coefficient is different from zero.
F = (R^{2}/k) / ((1 − R^{2})/(n − k − 1)) ~ F_{k, n−k−1}
F = (0.45/5) / ((1 − 0.45)/(1230 − 5 − 1)) ≈ 200.3
The F-statistic is approximately 200.3 and the critical value for F_{5,1224} at the 1% significance level is 3.017. Hence we reject the null and conclude that the model is overall significant.
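As a cross-check, the overall significance statistic can be computed directly. This is a minimal sketch, assuming R-squared = 0.45, n = 1230 and k = 5 for model (3); the function name `overall_f` is hypothetical. It also illustrates that the overall F-test is the special case of an exclusion-restriction test in which the restricted model has R-squared equal to zero and all k slopes are restricted.

```python
def overall_f(r2, n, k):
    """F statistic for H0: all k slope coefficients are zero."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

# Model 3: R-squared 0.45, n = 1230, k = 5 regressors.
f_overall = overall_f(0.45, n=1230, k=5)

# Same statistic written as a restricted-vs-unrestricted comparison
# with restricted R-squared = 0 and q = k restrictions.
f_as_restriction = ((0.45 - 0.0) / 5) / ((1.0 - 0.45) / (1230 - 5 - 1))

critical_value = 3.017  # 1% critical value quoted in the answer
print(round(f_overall, 1), f_overall > critical_value)
```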
 The graph below is the plot of a simple linear regression of the dependent variable (Y) against the independent variable (X).
 Copy the diagram on your answer sheet and on it, label the mean value of y (𝑌̅), the fitted regression line, and for the point indicated by the arrow label the actual observation (𝑦_{𝑖}), the residual (𝑢_{𝑖}) and the predicted value (𝑦̂_{𝑖}).
1 mark for correctly indicating each value.
 Explain what the 𝑅^{2} of the regression is, how it is measured and how it is interpreted.
The 𝑅^{2} is a measure of goodness of fit and it indicates the proportion of the variation in y that is explained by our fitted line.
The 𝑅^{2} can be measured by dividing the explained sum of squares (SSE) by the total sum of squares (SST) of y:
R^{2} = SSE/SST
Students can use another version of the R-squared formula.
[5 marks]
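The definitions of the fitted value, the residual and the R-squared can be illustrated with a small calculation. The data below are hypothetical and chosen only for illustration; they are not taken from the exam. The sketch fits a simple regression by hand and checks that the explained and residual sums of squares add up to the total.

```python
# Hypothetical illustrative data (not from the exam).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept for the simple regression of y on x.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Predicted values and residuals (actual minus predicted).
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

sst = sum((yi - y_bar) ** 2 for yi in y)        # total sum of squares
sse = sum((fi - y_bar) ** 2 for fi in fitted)   # explained sum of squares
ssr = sum(r ** 2 for r in residuals)            # residual sum of squares

r_squared = sse / sst
print(round(r_squared, 3))
```

The same value is obtained from 1 − SSR/SST, which is one of the alternative versions of the formula mentioned above.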
 Briefly explain each of the following concepts:
 Homoscedasticity [5 marks]
Homoscedasticity is one of the Gauss-Markov assumptions. It says that in a simple or multiple linear regression model, the errors of the regression have the same variance given any values of the explanatory variable(s).
Or an answer like this: Constant variance of error terms conditional on values of x.
 Stationary process [5 marks]
Stationarity has to do with the joint distribution of a process as it moves through time. A time series is stationary if its stochastic properties and its temporal dependence structure do not change over time.
 Regression residual [5 marks]
The residual for an observation is the difference between the actual observation and its predicted value.
 Multicollinearity [5 marks]
A problem that arises when collinearity between two or more independent variables leads to a lack of statistically significant coefficients even when the model has satisfactory overall explanatory power. (If students state in their own words that multicollinearity is correlation between two or more independent variables that can lead to an increase in the standard errors of the estimators, they should be given full marks.)
SECTION B – Answer ONE question
 A model of homicide rates in the USA is estimated, using state level data, as follows:
HR_{i} = β_{0} + β_{1}UE_{i} + β_{2}INCOME_{i} + β_{3}SOUTH_{i} + β_{4}ETHNIC_{i} + u_{i}, i = 1, …, 51
Where HR is the number of homicides (murders) per 100,000 population in state i, UE is the male unemployment rate in percentages, INCOME is mean per capita income in dollars, SOUTH is a binary variable which takes a value of 1 if the state is southern, 0 otherwise, and ETHNIC is the percentage of the state population that is not white.
The model is estimated with OLS and the following results are found (standard errors are shown in brackets).
𝐻𝑅̂ = 8 + 0.65UE + 0.0005INCOME + 2.4SOUTH + 0.21ETHNIC (1)
(1.3) (0.26) (0.0002) (1.0) (0.04)
R^{2}=0.58
 Interpret the coefficient of SOUTH and test for its statistical significance at the 5% level. [5 marks]
Interpretation: On average and ceteris paribus, the homicide rate in southern states is predicted to be 2.4 per 100,000 population higher than in other states.
 marks]
To test for the significance of the coefficient, we use a two-tailed t-test as follows:
H_{0}: β_{south} = 0
H_{1}: β_{south} ≠ 0
The t-stat = 2.4/1.0 = 2.4 (under the null, the t-stat follows a t-distribution with n−k−1 degrees of freedom).
n = 51 and k = 4. The critical value at the 5% significance level with 46 degrees of freedom is approximately 2.01 (statistical tables give 2.021 for the nearest tabulated row, 40 degrees of freedom).
At the 5% level of significance, the t-stat is larger than the critical value, hence we reject the null and conclude that the coefficient of SOUTH is statistically significant.
 marks of significance test]
 Sketch a simple graph of HR against INCOME which illustrates how homicide rates are predicted to differ between southern and non-southern states. [5 marks]
[Sketch: two parallel lines of HR against INCOME, with the line for southern states lying 2.4 above the line for non-southern states.]
 The model was re-estimated with an interaction term between SOUTH and each of the continuous variables, and the following results were obtained.
𝐻𝑅̂ = 15.7 + 0.99UE + 0.0009INCOME + 18.7SOUTH + 0.19ETHNIC
(3.2) (0.31) (0.0003) (7.8) (0.04)
− 0.88SOUTH*UE − 0.0008SOUTH*INCOME − 0.12SOUTH*ETHNIC (2)
(0.54) (0.0004) (0.04)
R^{2}=0.62
 Interpret the effect of the male unemployment rate on homicide rates and how this differs between southern and non-southern states. [5 marks]
In non-southern states the effect of UE on HR is just the coefficient of UE: a 1 percentage point increase in UE raises HR by 0.99 homicides per 100,000 population, on average and ceteris paribus.
In southern states the effect of UE on HR is 0.99 − 0.88 = 0.11. A 1 percentage point rise in UE in southern states raises HR by 0.11 homicides per 100,000 population, on average and ceteris paribus.
 Write down an expression that shows the predicted difference in homicide rates between southern and non-southern states. [5 marks]
Non-southern states:
HR = 15.7 + 0.99UE + 0.0009INCOME + 0.19ETHNIC
Southern states:
HR = 15.7 + 0.99UE + 0.0009INCOME + 18.7 + 0.19ETHNIC − 0.88*UE − 0.0008*INCOME − 0.12*ETHNIC
The expression for the predicted difference is:
∆HR = 18.7 − 0.88*UE − 0.0008*INCOME − 0.12*ETHNIC
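The interaction model can be explored numerically. The following is a minimal sketch, assuming the coefficients of equation (2) with negative signs on the three interaction terms, as implied by the answers above. The function names and the input values (UE = 6, INCOME = 20000, ETHNIC = 15) are hypothetical and chosen only for illustration.

```python
def ue_effect(south):
    """Marginal effect of UE on HR; south is 1 for southern states, else 0."""
    return 0.99 - 0.88 * south

def south_gap(ue, income, ethnic):
    """Predicted HR in a southern state minus a non-southern state
    with the same UE, INCOME and ETHNIC values."""
    return 18.7 - 0.88 * ue - 0.0008 * income - 0.12 * ethnic

print(round(ue_effect(0), 2))  # effect of UE in non-southern states
print(round(ue_effect(1), 2))  # effect of UE in southern states

# Hypothetical state characteristics, for illustration only.
gap = south_gap(ue=6.0, income=20000.0, ethnic=15.0)
print(round(gap, 2))
```

Because the gap depends on UE, INCOME and ETHNIC, its sign and size vary with the characteristics at which it is evaluated.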
 Can you use the provided information to carry out an F-test of whether the determinants of homicide rates differ significantly between southern and non-southern states? Explain what you would have to do to compute the F-statistic. [10 marks]
Yes we can. The model in part (a) restricts the determinants of the homicide rate to be the same in southern and non-southern states, while the model in part (c) allows these determinants to differ. [2 marks] If the answer says that we cannot because we do not have the RSS of the two models, students should still be given the 2 marks, as they only practised this F-test with the RSS and not with the R-squared.
To test whether these coefficients are statistically different between the two regions we use an F-test of whether the interacted coefficients are jointly significant:
H_{0}: the three interaction terms have coefficients equal to 0
H_{A}: at least one of these three coefficients is different from zero
Then we use the R^{2} of the restricted and unrestricted models to construct the F-statistic and carry out an F-test.
The R^{2}_{ur} = 0.62 and the R^{2}_{r} = 0.58; the number of restrictions is 3.
[8 marks]
(Up to this point suffices for full marks for this question, as students were not asked to carry out the F-test.)
F = ((0.62 − 0.58)/3) / ((1 − 0.62)/(51 − 7 − 1)) = 0.01333/0.00884 = 1.51
The critical value of F at 5% with 3 and 43 degrees of freedom is approximately 2.84 (the tabulated value for 3 and 40 degrees of freedom).
So we do not reject the null, and we conclude that the determinants of HR are not statistically different between southern and non-southern states.
 A researcher estimates a model of the following form
𝑌_{𝑡}= a + 𝑏_{1}𝑋_{1𝑡} + 𝑏_{2}𝑋_{2𝑡} + 𝑢_{𝑡} t=1 to 16
 Explain what autocorrelation is, how it might arise and discuss the consequences for the OLS estimates. [10 marks]
Autocorrelation refers to the correlation between error terms over time. More specifically autocorrelation is the violation of the following assumption:
Conditional on the explanatory variables, the unobserved factors must not be correlated over time.
[4 marks]
Autocorrelation can occur for various reasons. For example, if omitted factors are correlated over time, conditional on the values of the independent variables, the errors will be autocorrelated.
Another situation that might result in autocorrelation is when past values of the dependent variable feed forward to future values of the explanatory variables. [2 marks – one reason is enough]
The consequences of autocorrelation are that the OLS estimates are no longer BLUE and that tests based on the t and F statistics are no longer valid. [4 marks]
 The correlation coefficient between the residuals 𝑢̂_{𝑡} and the lagged residuals 𝑢̂_{𝑡−1} from the model is calculated to be 0.456. Use this to implement a test for autocorrelation, specifying clearly the null and alternative hypotheses. Interpret your results. [10 marks]
This provides the basis for the DW test: [2 marks for correct recognition]
DW = 2(1 − p), where p is the correlation between the errors. So DW = 2(1 − 0.456) = 1.088.
The null being tested is that p = 0 and the alternative is that p ≠ 0. DW is bounded by 0 and 4, since the highest correlation coefficient is 1.
The critical values, with 2 independent variables and n = 16 in our original model, are approximately 0.946 and 1.543, and we use these to mark the bounds of the inconclusive region. Since 0.946 < 1.088 < 1.543, the DW statistic falls inside the inconclusive region, so the test is inconclusive.
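The steps of the test can be sketched as follows. This is an illustrative implementation of the usual Durbin-Watson bounds rule, assuming the residual correlation 0.456 and the approximate bounds 0.946 and 1.543 quoted above. The function names and the wording of the returned labels are hypothetical.

```python
def durbin_watson_from_rho(rho):
    """Approximate DW statistic from the first-order residual correlation."""
    return 2.0 * (1.0 - rho)

def dw_region(dw, d_lower, d_upper):
    """Classify a DW statistic using the lower and upper bounds."""
    if dw < d_lower:
        return "reject: positive autocorrelation"
    if dw <= d_upper:
        return "inconclusive"
    if dw < 4.0 - d_upper:
        return "do not reject"
    if dw <= 4.0 - d_lower:
        return "inconclusive"
    return "reject: negative autocorrelation"

dw = durbin_watson_from_rho(0.456)
region = dw_region(dw, d_lower=0.946, d_upper=1.543)
print(round(dw, 3), region)
```

The same helper applies to any DW statistic once the bounds for the relevant sample size and number of regressors have been read from the table.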
 The researcher estimates a second model of the form:
𝑌_{𝑡}= a + 𝑏_{1}𝑋_{1𝑡} + 𝑏_{2}𝑋_{2𝑡} + 𝑏_{3}𝑌_{𝑡−1} + 𝑢_{𝑡} t=1 to 16
and obtains a Durbin-Watson statistic of 1.75. The coefficient of 𝑌_{𝑡−1} is estimated to be 0.65 with a standard error of 0.06. Use this information to test for autocorrelation in this model. [10 marks]
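Since we have a lagged dependent variable we need to use Durbin's h test, which is defined as
h = p̂ × √(T / (1 − T × var(b̂_{3})))
where p̂ is the estimated correlation between the errors and var(b̂_{3}) is the estimated variance of the coefficient of 𝑌_{𝑡−1}.
[4 marks for recognising the test statistic]
We need to use the estimated DW to get p:
DW = 2(1 − p), 1.75 = 2(1 − p), so p = 0.125. Using the standard error of the coefficient of 𝑌_{𝑡−1}, var(b̂_{3}) = 0.06^{2} = 0.0036, we calculate [2 marks]
h = 0.125 × √(15 / (1 − 15 × 0.0036)) ≈ 0.50 [1 mark]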
T = 16 in the original model, but because we now have a lagged dependent variable one observation is lost, so T = 15. [1 mark; students should not be penalised twice if they insert the wrong T in the formula above. They should lose only the 1 mark dedicated to the correct value of T.]
The null and alternative hypotheses are the same as in part (b) of this question. Under the null the h statistic follows a standard normal distribution [1 mark].
This h is less than any critical value at conventional levels from the z table, so we cannot reject the null that there is no serial correlation [1 mark].
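The h statistic can be reproduced with a short calculation. This is a minimal sketch of the standard Durbin h formula, assuming DW = 1.75, a standard error of 0.06 on the lagged dependent variable and T = 15 as stated above. The function name `durbin_h` is hypothetical.

```python
import math

def durbin_h(dw, n_obs, var_lag_coef):
    """Durbin's h statistic from the DW statistic, the number of
    observations and the variance of the lagged-dependent-variable
    coefficient."""
    rho_hat = 1.0 - dw / 2.0
    denominator = 1.0 - n_obs * var_lag_coef
    if denominator <= 0:
        # The statistic is not defined when T * var >= 1.
        raise ValueError("h is undefined: T * var(coefficient) >= 1")
    return rho_hat * math.sqrt(n_obs / denominator)

# DW = 1.75, T = 15, standard error 0.06 so variance 0.06 ** 2.
h = durbin_h(1.75, n_obs=15, var_lag_coef=0.06 ** 2)
print(round(h, 2))

# Compare with the two-tailed 5% value from the z table.
print(abs(h) > 1.96)
```

The guard on the denominator reflects a known limitation of the test: the square root is not defined when T times the variance reaches 1.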
 A tobacco company is investigating the determinants of tobacco consumption. It estimates the following model using OLS (N=807, R^{2}=0.053)
𝑡𝑜𝑏𝑎𝑐𝑐𝑜̂ = 3.64 + 0.88income – 0.501educ + 0.571age – 0.0057age^{2}
(24.08) (0.728) (0.167) (0.160) (0.0017)
where tobacco is the number of cigarettes consumed per week, income is annual income measured in £1000s, educ is years of schooling and age is the age of each customer, also measured in years. (Standard errors are in parentheses).
 What is the purpose of the age squared term? Explain whether or not you think it should be included in the model. [4 marks]
The age squared term allows for a nonlinear relationship between age and tobacco consumption. This means that the effect of an increase in age on tobacco consumption depends on the level of age. [2 marks]
If the coefficient of age squared is statistically significant then it should be included in the model. In this case, the t-statistic is 0.0057/0.0017 = 3.35 in absolute value. With a sample size of 807 we can compare this statistic with the critical value from the z distribution. At the 1% level of significance the critical value is approximately 2.576, hence age squared should be included in the model. [2 marks]
 Write down an expression that shows the effect of a change in age on tobacco consumption, and use this to show how the effect of being a year older on tobacco consumption is different for a person aged 20 and a person aged 60. [6 marks]
∆𝑡𝑜𝑏𝑎𝑐𝑐𝑜̂ = 0.571∆𝑎𝑔𝑒 − 0.0114 𝑎𝑔𝑒 ∗ ∆𝑎𝑔𝑒
[2 marks for either of these expressions]
For ∆𝑎𝑔𝑒 = 1 at different age levels we have:
At 20 years old, the effect of getting one year older on tobacco consumption is an increase in consumption by 0.343 of a cigarette per week:
∆𝑡𝑜𝑏𝑎𝑐𝑐𝑜̂ = 0.571 ∗ 1 − 0.0114 ∗ 20 ∗ 1 = 0.343
At 60 years old, on the other hand, the effect of getting one year older on tobacco consumption is a decrease of 0.113 of a cigarette per week:
∆𝑡𝑜𝑏𝑎𝑐𝑐𝑜̂ = 0.571 ∗ 1 − 0.0114 ∗ 60 ∗ 1 = −0.113
[4 marks for similar answer]
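The two calculations above follow from a single expression for the marginal effect of age. This is a minimal sketch, assuming the estimated coefficients 0.571 on age and −0.0057 on age squared; the function name `age_effect` is hypothetical.

```python
AGE_COEF = 0.571        # coefficient of age
AGE_SQ_COEF = -0.0057   # coefficient of age squared

def age_effect(age, delta_age=1.0):
    """Approximate change in weekly cigarettes for a change in age,
    evaluated at the given age: (b_age + 2 * b_age2 * age) * delta_age."""
    return (AGE_COEF + 2.0 * AGE_SQ_COEF * age) * delta_age

effect_at_20 = age_effect(20)
effect_at_60 = age_effect(60)
print(round(effect_at_20, 3), round(effect_at_60, 3))
```

The effect is positive at age 20 and negative at age 60, which is the sign change that the quadratic term is meant to capture.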
 Sketch a graph illustrating the relationship between age and tobacco consumption, and calculate at what age tobacco consumption is predicted to decline. [7 marks]
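[Sketch: inverted-U curve of predicted tobacco consumption (vertical axis) against age (horizontal axis).]
[4 marks]
To find the turning point we need to use the first order condition and set the first derivative equal to zero:
∂tobacco/∂age = 0.571 − 0.0114 age = 0
Solving this gives age = 0.571/0.0114 ≈ 50. Therefore at about age 50, cigarette consumption starts to decline. [3 marks]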
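The turning point can be verified numerically. This is an illustrative check, assuming the estimated coefficients 0.571 on age and −0.0057 on age squared. The helper `age_component` is hypothetical and returns only the part of predicted consumption that depends on age.

```python
AGE_COEF = 0.571        # coefficient of age
AGE_SQ_COEF = -0.0057   # coefficient of age squared

def age_component(age):
    """Part of predicted tobacco consumption that depends on age."""
    return AGE_COEF * age + AGE_SQ_COEF * age ** 2

# First order condition: AGE_COEF + 2 * AGE_SQ_COEF * age = 0.
turning_point = -AGE_COEF / (2.0 * AGE_SQ_COEF)
print(round(turning_point, 1))

# The age component is higher at the turning point than one year
# earlier or one year later, so it is a maximum.
is_maximum = (age_component(turning_point) > age_component(turning_point - 1)
              and age_component(turning_point) > age_component(turning_point + 1))
print(is_maximum)
```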
 Test the overall significance of the fitted line in this question, at the 5% level of significance. [4 marks]
We can do an F-test for the statistical significance of the overall model:
The null hypothesis of the test is that all the slope coefficients are zero, against the alternative that at least one coefficient is different from zero.
F = (R^{2}/k) / ((1 − R^{2})/(n − k − 1)) ~ F_{k, n−k−1}
F = (0.053/4) / ((1 − 0.053)/(807 − 4 − 1)) ≈ 11.22
The F-statistic is 11.22 and the critical value for F_{4,802} at the 5% significance level is approximately 2.37. Hence we reject the null and conclude that the model is overall significant.
 You are told that the errors are heteroskedastic. What is the consequence on each of the following:
 i) The standard error of the OLS estimators [3 marks]
Under heteroskedasticity the usual OLS standard errors are no longer valid, and the OLS estimators no longer have the smallest variance among linear unbiased estimators.
ii) The F-tests [3 marks]
With heteroskedastic errors, the F-statistic does not follow an F distribution, making the test invalid.
iii) The bias in OLS estimators [3 marks]
Heteroskedasticity has no effect on the bias: the OLS estimators remain unbiased.
END OF PAPER