INTRODUCTION TO ECONOMETRICS Assessment Period: January 2019






Assessment Period: January 2019 (A1)




SECTION A – Answer all questions


  1. A researcher is studying the effect of parents’ level of education on educational achievement of individuals. Using data on highest grade of education completed (educ), mother’s level of education (motheduc), father’s level of education (fatheduc), a measure of cognitive ability (abil), and the number of siblings per individual, the researcher estimates a number of models. The table below shows the OLS estimates, with standard errors in brackets.


  Model 1 Model 2 Model 3
Dependent variable educ educ educ
Independent variables
motheduc 0.30
fatheduc 0.19
abil 0.52
abil² 0.05
Number of siblings -0.15
Intercept 6.94 12.14 (0.12) 8.74
N 1230 1230 1230
SSR 5114.31 4193.89 3741.82
R² 0.25 0.38 0.45


  a) First, explain what a ceteris paribus effect is. Second, regarding ceteris paribus analysis, explain the advantage of using multiple regression analysis over simple regression analysis. [5 marks]


The ceteris paribus effect is the effect of one variable on the dependent variable when all other factors are held constant [2 marks].

In simple linear regression, all factors that affect the dependent variable other than the single explanatory variable are left in the error term of the model. Their effect is assumed to be zero on average and unrelated to the explanatory variable.

In multiple linear regression we explicitly control for the factors that we believe might have an effect on the dependent variable. Hence, by controlling for those factors in the model, we are more confident that we can capture the ceteris paribus effect of the variable of interest on the dependent variable [3 marks].


  b) Interpret each of the OLS estimates in model (1). [6 marks]


Model 1 has two slope coefficients and one intercept.

The coefficient of motheduc predicts that an extra grade of the mother's education increases the respondent's education by 0.30 of a grade, on average and ceteris paribus.


The coefficient of fatheduc predicts an increase of 0.19 of a grade for an extra grade of the father's education, on average and ceteris paribus.


The intercept predicts that for individuals whose parents have zero education, the highest grade completed is on average 6.94. [2 marks per coefficient]


  c) Interpret the coefficient of abil² in model (3). Can you reject the claim that the returns to ability are linear? Be explicit about the hypothesis that you make in order to answer the question. [7 marks]

The coefficient of abil² measures how the marginal effect of ability on educ changes with the level of ability (students can explain this in terms of a nonlinear effect, or how the effect of ability varies at different levels and is not fixed). [2 marks]


To test for linear returns we test whether the coefficient of abil² is statistically different from zero. The null hypothesis of this test is:

$H_0: \beta_{abil^2} = 0$ [1 mark]


against the alternative $H_1: \beta_{abil^2} \neq 0$. [1 mark] No marks for a one-tailed alternative.


The t-statistic is the estimated coefficient divided by its standard error: $t = \hat{\beta}_{abil^2} / se(\hat{\beta}_{abil^2})$. [1 mark]


To test this null we compare the test statistic with the critical value from the standard normal (z) distribution, which is appropriate given the large sample (N = 1230). The critical value at the 5% significance level for a two-tailed test is 1.96.

[1 mark]

If a student lost a mark above for setting up a one-tailed test but obtained the correct critical value for a one-tailed test, they should be given the 1 mark for this section.


Since the test statistic is greater than the critical value, we reject the null that the returns to ability are linear. [1 mark]


  d) Comparing model (2) with model (3), at the 5% level of significance, test the hypothesis that parents' (mother and father) education has no effect on the educational achievement of an individual. Be explicit about the hypothesis that you make. [7 marks]


Here we are testing that father's and mother's education have no effect on the dependent variable, hence the null hypothesis is $H_0: \beta_{motheduc} = \beta_{fatheduc} = 0$. This hypothesis excludes two of the variables from the model.


To test this hypothesis, we compare the R-squared of the restricted model with that of the unrestricted model, through an F-test.


The unrestricted model is model (3) and the restricted model is model (2), where we have excluded motheduc and fatheduc from the model.


The test statistic is

$$F = \frac{(R^2_{ur} - R^2_{r})/q}{(1 - R^2_{ur})/(n - k - 1)}$$

where $R^2_{ur}$ is the R-squared of the unrestricted model (model 3 in this case), $R^2_{r}$ is the R-squared of the restricted model (model 2), and $q$ is the number of restrictions (2 in this case).

$k$ is the number of independent variables in the unrestricted model ($k = 5$ in this case) and $n$ is the sample size.


$F = \frac{(0.45 - 0.38)/2}{(1 - 0.45)/(1230 - 5 - 1)} \approx 77.9$. The critical value for $F_{2,\,1230-5-1}$ at the 1% significance level is 4.605. Since the F-statistic is greater than the critical value, we reject the null that father's and mother's education have no effect on an individual's educational attainment.


Students can choose a different significance level to form their answer.
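The arithmetic behind this F-test can be checked with a short sketch. The function names below are illustrative only (they are not part of the exam); the inputs are the R-squared, SSR and N values from the table, and the critical value is the one quoted in the answer. The SSR form is included because the mark scheme accepts other versions of the formula; it gives a slightly different number only because the R-squared values in the table are rounded to two decimals.

```python
def f_stat_r2(r2_ur, r2_r, q, n, k):
    """F statistic for q exclusion restrictions, R-squared form."""
    numerator = (r2_ur - r2_r) / q
    denominator = (1.0 - r2_ur) / (n - k - 1)
    return numerator / denominator


def f_stat_ssr(ssr_r, ssr_ur, q, n, k):
    """Same test written with residual sums of squares."""
    numerator = (ssr_r - ssr_ur) / q
    denominator = ssr_ur / (n - k - 1)
    return numerator / denominator


# Model (3) is unrestricted (k = 5), model (2) is restricted (q = 2).
f_from_r2 = f_stat_r2(r2_ur=0.45, r2_r=0.38, q=2, n=1230, k=5)
f_from_ssr = f_stat_ssr(ssr_r=4193.89, ssr_ur=3741.82, q=2, n=1230, k=5)

CRITICAL_1PCT = 4.605  # critical value quoted in the answer above

print(round(f_from_r2, 2), round(f_from_ssr, 2))
print("reject H0:", f_from_r2 > CRITICAL_1PCT and f_from_ssr > CRITICAL_1PCT)
```

Either version lies far above the critical value, so the conclusion does not depend on which form is used.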


  e) Comparing model 1 and model 3, why did the coefficients of parents' (mother and father) education change after introducing ability to the model? [10 marks]


Comparing model 1 and model 3, we can see that after controlling for ability, the coefficients of both mother's and father's education have fallen [2 marks for commenting on the effect].


This suggests that ability is correlated with the dependent variable and with father's and mother's education [3 marks for mentioning the correlation between ability and the other regressors].


Without further information we cannot verify the size of the bias. Model 3 also controls for ability squared and number of siblings, therefore we cannot isolate the omitted variable bias that is due to excluding ability from the model.


One possible scenario is that ability has a positive, though non-linear effect on educational attainment (as seen by its coefficient). It might be the case that father’s and mother’s education are positively correlated with ability as well. Therefore, not controlling for ability will create an omitted variable bias. If people whose parents are better educated have higher level of ability, then not controlling for ability results in an upward bias in the coefficient of father’s and mother’s education.


[5 marks for a similar discussion on omitted variable bias].
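The direction of the bias described above can be illustrated with a small simulation. This is a sketch on made-up data, not the exam data: the data-generating process, the parameter values and all names are assumptions chosen so that ability is positively correlated with parental education and also raises the child's education.

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


# Hypothetical parameters (assumptions, not estimates from the table).
N_OBS = 5000
TRUE_EFFECT = 0.5      # true effect of parental education
ABILITY_EFFECT = 1.0   # effect of ability on education
DELTA = 0.6            # slope of ability on parental education

parent_educ = [random.gauss(0, 1) for _ in range(N_OBS)]
ability = [DELTA * p + random.gauss(0, 1) for p in parent_educ]
educ = [1.0 + TRUE_EFFECT * p + ABILITY_EFFECT * a + random.gauss(0, 1)
        for p, a in zip(parent_educ, ability)]

# Regress educ on parental education only, omitting ability.
slope_omitting_ability = ols_slope(parent_educ, educ)

# Omitted variable bias formula: true effect + ability effect * delta.
expected_with_bias = TRUE_EFFECT + ABILITY_EFFECT * DELTA

print(round(slope_omitting_ability, 2), expected_with_bias)
```

In this setup the slope from the short regression settles near the true effect plus the bias term rather than near the true effect itself, which is the upward bias discussed in the answer.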


  f) Test the overall significance of model (3). [5 marks]

We can do an F-test for the statistical significance of the overall model:


The null hypothesis of the test is that all the slope coefficients are zero, against the alternative that at least one coefficient is different from zero.

$$F = \frac{R^2/k}{(1 - R^2)/(n - k - 1)} \sim F_{k,\,n-k-1}$$


$$F = \frac{0.45/5}{(1 - 0.45)/(1230 - 5 - 1)} \approx 200.3$$


The F-statistic is approximately 200.3 and the critical value for $F_{5,\,1224}$ at the 1% significance level is 3.017. Hence we reject the null and conclude that the model is overall significant.
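As a quick check of the calculation, the overall-significance statistic can be computed directly from the R-squared and sample size in the table. The function name is illustrative, and the critical value is the one quoted in the answer rather than one computed here.

```python
def overall_f(r2, n, k):
    """F statistic for H0: all k slope coefficients are zero."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Model (3): R-squared 0.45, N = 1230, k = 5 regressors.
f_model3 = overall_f(r2=0.45, n=1230, k=5)

CRITICAL_1PCT_5_1224 = 3.017  # critical value quoted in the answer

print(round(f_model3, 1))
print("reject H0:", f_model3 > CRITICAL_1PCT_5_1224)
```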

  2. The graph below is the plot of a simple linear regression of the dependent variable (Y) against the independent variable (X).
  a) Copy the diagram on your answer sheet and on it label the mean value of y ($\bar{y}$), the fitted regression line, and, for the point indicated by the arrow, the actual observation ($y_i$), the residual ($e_i$) and the predicted value ($\hat{y}_i$).


1 mark for correctly indicating each value.


  b) Explain what the $R^2$ of the regression is, how it is measured and how it is interpreted.


The $R^2$ is a measure of goodness of fit: it indicates the proportion of the variation in y that is explained by the fitted line.

The $R^2$ can be measured by dividing the explained sum of squares (SSE) by the total sum of squares (SST):


$R^2 = SSE / SST$


Students can use other version of the r-squared formula.


[5 marks]
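The quantities in this question (mean of y, fitted values, residuals and the R-squared) can be made concrete with a minimal sketch. The five data points are hypothetical and chosen only for illustration. SSE denotes the explained sum of squares and SSR the residual sum of squares, matching the SSR row of the table in question 1.

```python
# Hypothetical data, used only to illustrate the definitions.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n  # mean value of y

# OLS slope and intercept for a simple regression.
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

fitted = [intercept + slope * xi for xi in x]        # predicted values
residuals = [yi - fi for yi, fi in zip(y, fitted)]   # actual minus predicted

sst = sum((yi - y_bar) ** 2 for yi in y)       # total sum of squares
sse = sum((fi - y_bar) ** 2 for fi in fitted)  # explained sum of squares
ssr = sum(r ** 2 for r in residuals)           # residual sum of squares

r2_explained = sse / sst        # version used in the answer above
r2_residual = 1.0 - ssr / sst   # equivalent alternative version

print(round(r2_explained, 4), round(r2_residual, 4))
```

The two versions agree because SST = SSE + SSR when the regression includes an intercept.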


  3. Briefly explain each of the following concepts:


  a) Homoscedasticity [5 marks]

Homoscedasticity is one of the Gauss-Markov assumptions. It says that in a simple or multiple linear regression model, the errors of the regression have the same variance given any values of the explanatory variable(s).


Or an answer like this: constant variance of the error terms conditional on the values of x.


  b) Stationary process [5 marks]

Stationarity concerns the joint distribution of a process as it moves through time. A time series is stationary if its stochastic properties and its temporal dependence structure do not change over time.


  c) Regression residual [5 marks]

The residual for an observation is the difference between the actual observation and its predicted value.


  d) Multicollinearity [5 marks]

A problem that arises when collinearity between two or more independent variables leads to a lack of statistically significant coefficients even when the model has satisfactory overall explanatory power. (If students state in their own words that multicollinearity is correlation between two or more independent variables that can lead to an increase in the standard errors of the estimators, they should be given full marks.)

SECTION B – Answer ONE question


  1. A model of homicide rates in the USA is estimated, using state level data, as follows:


HRi = β0 + β1UEi + β2INCOMEi + β3SOUTHi + β4ETHNICi + ui,    i = 1, …, 51


Where HR is the number of homicides (murders) per 100,000 population in state i, UE is the male unemployment rate in percentages, INCOME is mean per capita income in dollars, SOUTH is a binary variable which takes a value of 1 if the state is southern, 0 otherwise, and ETHNIC is the percentage of the state population that is not white.


The model is estimated with OLS and the following results are found (standard errors are shown in brackets).


$\widehat{HR}$ = -8 + 0.65UE + 0.0005INCOME + 2.4SOUTH + 0.21ETHNIC   (1)

(1.3)   (0.26)   (0.0002)   (1.0)   (0.04)




  a) Interpret the coefficient of SOUTH and test for its statistical significance at the 5% level. [5 marks]


Interpretation: On average and ceteris paribus, the homicide rate in the south is predicted to be 2.4 per 100,000 population higher than other states.

  • marks]


To test for the significance of the coefficient, we use a two-tailed t-test as follows:


$H_0: \beta_{south} = 0$

$H_1: \beta_{south} \neq 0$


The t-statistic is $t = 2.4 / 1.0 = 2.4$ (under the null, the t-statistic follows a t-distribution with n-k-1 degrees of freedom).


n = 51 and k = 4, so there are 46 degrees of freedom. The critical value at the 5% significance level is 2.021 (the tabulated value for 40 degrees of freedom, the nearest lower entry in standard tables).


At the 5% level of significance, the t-statistic is larger than the critical value, hence we reject the null and conclude that the coefficient of SOUTH is statistically significant.


  • marks of significance test]
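The significance test for SOUTH reduces to one division and one comparison, sketched below. The coefficient and standard error come from equation (1), and the critical value is the one quoted in the answer; the variable names are illustrative.

```python
coef_south = 2.4   # estimated coefficient of SOUTH in equation (1)
se_south = 1.0     # its standard error
n_obs, k_regressors = 51, 4

t_stat = coef_south / se_south
degrees_of_freedom = n_obs - k_regressors - 1

CRITICAL_5PCT = 2.021  # two-tailed 5% critical value quoted in the answer

reject_null = abs(t_stat) > CRITICAL_5PCT
print(t_stat, degrees_of_freedom, reject_null)
```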


  b) Sketch a simple graph of HR against INCOME which illustrates how homicide rates are predicted to differ between southern and non-southern states. [5 marks]


  c) The model was re-estimated with an interaction term between SOUTH and each of the continuous variables, and the following results were obtained.


$\widehat{HR}$ = -15.7 + 0.99UE + 0.0009INCOME + 18.7SOUTH + 0.19ETHNIC

(3.2)   (0.31)   (0.0003)   (7.8)   (0.04)

- 0.88SOUTH×UE - 0.0008SOUTH×INCOME - 0.12SOUTH×ETHNIC

(0.54)   (0.0004)   (0.04)



  i) Interpret the effect of the male unemployment rate on homicide rates and how this differs between southern and non-southern states. [5 marks]


In non-southern states the effect of UE on HR is just the coefficient of UE: a 1 percentage point increase in UE raises HR by 0.99 homicides per 100,000 population, on average and ceteris paribus.


In southern states the effect of UE on HR is 0.99 - 0.88 = 0.11. A 1 percentage point rise in UE in southern states raises HR by 0.11 homicides per 100,000 population, on average and ceteris paribus.


  ii) Write down an expression that shows the predicted difference in homicide rates between southern and non-southern states. [5 marks]


Non-southern states:

HR = -15.7 + 0.99UE + 0.0009INCOME + 0.19ETHNIC


Southern states:

HR = -15.7 + 0.99UE + 0.0009INCOME + 18.7 + 0.19ETHNIC - 0.88*UE - 0.0008*INCOME - 0.12*ETHNIC


The expression for the predicted difference is:

∆HR = 18.7 - 0.88*UE - 0.0008*INCOME - 0.12*ETHNIC
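The difference expression can be verified by predicting HR for the same state characteristics with SOUTH set to 1 and to 0. The sketch below uses the interaction coefficients shown in the southern-state equation above; the function names and the example state values (UE = 6, INCOME = 15000, ETHNIC = 20) are hypothetical.

```python
def predict_hr(ue, income, ethnic, south):
    """Predicted homicide rate from the model with SOUTH interactions."""
    base = -15.7 + 0.99 * ue + 0.0009 * income + 0.19 * ethnic
    southern_shift = 18.7 - 0.88 * ue - 0.0008 * income - 0.12 * ethnic
    return base + south * southern_shift


def predicted_difference(ue, income, ethnic):
    """Southern minus non-southern prediction for identical characteristics."""
    return 18.7 - 0.88 * ue - 0.0008 * income - 0.12 * ethnic


# Hypothetical state characteristics, for illustration only.
ue, income, ethnic = 6.0, 15000.0, 20.0

gap_from_formula = predicted_difference(ue, income, ethnic)
gap_from_predictions = (predict_hr(ue, income, ethnic, south=1)
                        - predict_hr(ue, income, ethnic, south=0))

print(round(gap_from_formula, 2), round(gap_from_predictions, 2))
```

The gap depends on the values of UE, INCOME and ETHNIC, which is why the answer is an expression rather than a single number.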

  iii) Can you use the provided information to carry out an F-test of whether the determinants of homicide rates differ significantly between southern and non-southern states? Explain what you would have to do to compute the F-statistic. [10 marks]


Yes we can. The model in part (a) restricts the determinants of the homicide rate to be the same in southern and non-southern states, while the model in part (c) allows these determinants to differ. [2 marks] If the answer says no, we cannot, because we do not have the RSS of the two models, students should be given the 2 marks, as they only practised this F-test with the RSS and not with the R-squared.


To test whether these coefficients are statistically different between the two regions we use an F-test of whether the interaction coefficients are jointly significant:


$H_0$: the three interaction terms have coefficients equal to zero

$H_1$: at least one of these three coefficients is different from zero


Then we use the $R^2$ of the restricted and unrestricted models to construct the F-statistic and carry out an F-test.


This requires $R^2_{ur}$ from the model in part (c) and $R^2_{r}$ from the model in part (a); the number of restrictions is 3.

[8 marks]

(up to this point suffices for full mark for this question as students were not asked to carry out the F-test).




Critical value of F at 5% with 3 and 43 degrees of freedom =2.84


So we do not reject the null and we conclude that determinants of HR are not statistically different between southern and non-southern states.


  1. A researcher estimates a model of the following form


$Y_t = a + \beta_1 X_{1t} + \beta_2 X_{2t} + u_t$,    t = 1 to 16


  a) Explain what autocorrelation is, how it might arise and discuss the consequences for the OLS estimates. [10 marks]

Autocorrelation refers to the correlation between error terms over time. More specifically, autocorrelation is the violation of the following assumption:

Conditional on the explanatory variables, the unobserved factors must not be correlated over time.

[4 marks]


Autocorrelation can occur for various reasons. For example, when omitted factors are correlated over time, conditional on the values of the independent variables, we will have an autocorrelation issue.

Another situation that might result in autocorrelation is when past values of the dependent variable feed forward to future values of the explanatory variables. [2 marks – one reason is enough]

The consequences of autocorrelation for OLS are that the OLS estimators are no longer BLUE and that tests based on the t and F statistics are no longer valid. [4 marks]


  b) The correlation coefficient between the residuals $\hat{u}_t$ and the lagged residuals $\hat{u}_{t-1}$ from the model is calculated to be 0.456. Use this to implement a test for autocorrelation, specifying clearly the null and alternative hypotheses. Interpret your results. [10 marks]


This provides the basis for the DW test: [2 marks for correct recognition]


DW ≈ 2(1 - p), where p is the correlation between the errors. So DW = 2(1 - 0.456) = 1.088.


The null being tested is that p = 0 and the alternative is that p > 0. DW is bounded by 0 and 4, since the correlation coefficient lies between -1 and 1.


The critical values, with 2 independent variables and n = 16 in our original model, are approximately 0.946 and 1.543, and we use these to mark the bounds of the inconclusive region. Since 0.946 < 1.088 < 1.543, the DW statistic falls in the inconclusive region, so the test does not allow us to conclude whether or not the errors are autocorrelated.
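The steps of the bounds test can be sketched as follows. The function names are illustrative, and the decision rule shown is the usual one for a test against positive autocorrelation, which is an assumption about the intended alternative; the bounds are the approximate values quoted in the answer.

```python
def dw_from_correlation(rho):
    """Approximate Durbin-Watson statistic from the residual autocorrelation."""
    return 2.0 * (1.0 - rho)


def dw_decision(dw, d_lower, d_upper):
    """Bounds test against positive autocorrelation (assumed alternative)."""
    if dw < d_lower:
        return "reject H0: evidence of positive autocorrelation"
    if dw <= d_upper:
        return "inconclusive"
    return "do not reject H0"


rho_hat = 0.456              # correlation between residuals and lagged residuals
D_LOWER, D_UPPER = 0.946, 1.543  # approximate bounds quoted in the answer

dw = dw_from_correlation(rho_hat)
decision = dw_decision(dw, D_LOWER, D_UPPER)

print(round(dw, 3), decision)
```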

  c) The researcher estimates a second model of the form:


$Y_t = a + \beta_1 X_{1t} + \beta_2 X_{2t} + \beta_3 Y_{t-1} + u_t$,    t = 1 to 16


and obtains a Durbin-Watson statistic of 1.75. The coefficient of $Y_{t-1}$ is estimated to be 0.65 with a standard error of 0.06. Use this information to test for autocorrelation in this model.  [10 marks]


Since we have a lagged dependent variable we need to use Durbin's h test, which is defined as


$$h = \hat{p} \sqrt{\frac{T}{1 - T \, var(\hat{\beta}_3)}}$$


[ 4 marks for recognising the test statistic]


We need to use the estimated DW to get p:

DW = 2(1 - p), 1.75 = 2(1 - p), so p = 0.125, and using the information about $\hat{\beta}_3$ we calculate $var(\hat{\beta}_3) = 0.06^2 = 0.0036$. [2 marks]


With T = 15: $h = 0.125 \sqrt{15 / (1 - 15 \times 0.0036)} \approx 0.50$ [1 mark]


T = 16 in the original model, but now that we have a lagged dependent variable, T = 15. [1 mark; students should not be penalised twice if they insert the wrong T in the formula above. They should lose only the 1 mark dedicated to the correct value of T.]

The null and alternative are the same as in part (b) of this question. Under the null the h statistic follows a standard normal distribution. [1 mark]

This h is less than any critical value at conventional levels from the z table, so we cannot reject the null that there is no serial correlation. [1 mark]
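A short sketch of the Durbin's h calculation, using the standard form of the statistic and the numbers given in the question (DW = 1.75, standard error 0.06, T = 15). The function name and the guard against a non-positive denominator are additions for illustration.

```python
import math


def durbin_h(dw, se_lagged_y, t_obs):
    """Durbin's h statistic for a model with a lagged dependent variable."""
    rho_hat = 1.0 - dw / 2.0                  # from DW = 2(1 - p)
    denominator = 1.0 - t_obs * se_lagged_y ** 2
    if denominator <= 0:
        # h cannot be computed when T * var(coefficient) is 1 or more.
        raise ValueError("h is undefined when T * var >= 1")
    return rho_hat * math.sqrt(t_obs / denominator)


h_stat = durbin_h(dw=1.75, se_lagged_y=0.06, t_obs=15)

Z_CRITICAL_5PCT = 1.96  # two-tailed 5% critical value from the z table
reject_null = abs(h_stat) > Z_CRITICAL_5PCT

print(round(h_stat, 3), reject_null)
```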

  1. A tobacco company is investigating the determinants of tobacco consumption. It estimates the following model using OLS (N=807, R2=0.053)


$\widehat{tobacco}$ = -3.64 + 0.88income – 0.501educ + 0.571age – 0.0057age²

(24.08)  (0.728)   (0.167)   (0.160)   (0.0017)


where tobacco is the number of cigarettes consumed per week, income is annual income measured in £1000s, educ is years of schooling and age is the age of each customer, also measured in years. (Standard errors are in parentheses).


  a) What is the purpose of the age squared term? Explain whether or not you think it should be included in the model. [4 marks]


The age squared term allows for a nonlinear relationship between age and tobacco consumption. It means that the effect of an increase in age on tobacco consumption depends on the level of age. [2 marks]


If the coefficient of age squared is statistically significant then it should be included in the model. In this case, the t-statistic is -0.0057/0.0017 = -3.35, or 3.35 in absolute value. With a sample size of 807 we can compare this statistic with the critical value from the z distribution. At the 1% level of significance the critical value is approximately 2.57, hence age squared should be included in the model. [2 marks]


  b) Write down an expression that shows the effect of a change in age on tobacco consumption, and use this to show how the effect of being a year older on tobacco consumption is different for a person aged 20 and a person aged 60. [6 marks]


$\partial \widehat{tobacco} / \partial age = 0.571 - 0.0114 \, age$, or equivalently $\Delta \widehat{tobacco} = 0.571 \, \Delta age - 0.0114 \, age \cdot \Delta age$


[2 marks for either of these expressions]


For $\Delta age = 1$ at different age levels we have:


At 20 years old, the effect of getting one year older on tobacco consumption is an increase in consumption of 0.343 of a cigarette per week:

$\Delta \widehat{tobacco} = 0.571 \times 1 - 0.0114 \times 20 \times 1 = 0.343$


At 60 years old, on the other hand, the effect of getting one year older on tobacco consumption is a decrease of 0.113 of a cigarette per week:

$\Delta \widehat{tobacco} = 0.571 \times 1 - 0.0114 \times 60 \times 1 = -0.113$


[4 marks for similar answer]
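The two marginal effects can be reproduced with a one-line function. The coefficients are taken from the estimated tobacco equation; the function name is illustrative.

```python
B_AGE = 0.571       # coefficient of age
B_AGE_SQ = -0.0057  # coefficient of age squared


def marginal_effect_of_age(age):
    """Approximate change in weekly cigarettes for one more year of age."""
    return B_AGE + 2.0 * B_AGE_SQ * age


effect_at_20 = marginal_effect_of_age(20)
effect_at_60 = marginal_effect_of_age(60)

print(round(effect_at_20, 3), round(effect_at_60, 3))
```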


  c) Sketch a graph illustrating the relationship between age and tobacco consumption, and calculate at what age tobacco consumption is predicted to decline. [7 marks]



[4 marks]


To find the turning point we use the first order condition and set the first derivative with respect to age equal to zero:

$\partial \widehat{tobacco} / \partial age = 0.571 - 0.0114 \, age = 0$


Solving this gives age = 0.571/0.0114 ≈ 50. Therefore cigarette consumption is predicted to start declining at around age 50.   [3 marks]
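The turning point follows from the first order condition and can be checked numerically. The sketch below also confirms that the age-related part of predicted consumption is higher at the turning point than one year either side of 50, which is what the inverted-U sketch should show. Names are illustrative.

```python
B_AGE = 0.571       # coefficient of age
B_AGE_SQ = -0.0057  # coefficient of age squared


def age_component(age):
    """Part of predicted tobacco consumption that depends on age."""
    return B_AGE * age + B_AGE_SQ * age ** 2


# First order condition: B_AGE + 2 * B_AGE_SQ * age = 0.
turning_age = -B_AGE / (2.0 * B_AGE_SQ)

print(round(turning_age, 2))
```

Because the coefficient of age squared is negative, the turning point is a maximum.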



  d) Test the overall significance of the fitted line in this question, at the 5% level. [4 marks]


We can do an F-test for the statistical significance of the overall model:


The null hypothesis of the test is that all the slope coefficients are zero, against the alternative that at least one coefficient is different from zero.

$$F = \frac{R^2/k}{(1 - R^2)/(n - k - 1)} \sim F_{k,\,n-k-1}$$

$$F = \frac{0.053/4}{(1 - 0.053)/(807 - 4 - 1)} \approx 11.22$$


The F-statistic is 11.22 and the critical value for $F_{4,\,802}$ at the 5% significance level is approximately 2.371. Hence we reject the null and conclude that the model is overall significant.
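The degrees of freedom and the statistic for the tobacco model can be checked in a few lines, using N = 807, R² = 0.053 and the four regressors in the estimated equation. The critical value is the one quoted in the answer; variable names are illustrative.

```python
r_squared = 0.053
n_obs = 807
k_regressors = 4  # income, educ, age, age squared

df_numerator = k_regressors
df_denominator = n_obs - k_regressors - 1

f_tobacco = (r_squared / df_numerator) / ((1.0 - r_squared) / df_denominator)

CRITICAL_5PCT = 2.371  # critical value quoted in the answer

print(df_numerator, df_denominator, round(f_tobacco, 2))
print("reject H0:", f_tobacco > CRITICAL_5PCT)
```

Even with an R² as low as 0.053, the large sample makes the regressors jointly significant.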


  e) You are told that the errors are heteroskedastic. What is the consequence on each of the following:


  i) The standard error of the OLS estimators [3 marks]

Under heteroskedasticity the OLS estimators no longer have the smallest variance, and the usual OLS standard error formulas are no longer valid.

  ii) The F-tests [3 marks]

With heteroskedastic errors, the F-statistic does not follow an F distribution, making the test invalid.

  iii) The bias in OLS estimators [3 marks]

Heteroskedasticity has no effect on the bias.


