761N1 Essential Quantitative Finance
THE UNIVERSITY OF SUSSEX
MSc EXAMINATION 2020/21
Assessment Period: January 2021 A1
Duration: 24 hour Take-Away Paper
PLEASE WRITE YOUR CANDIDATE NUMBER ON THE QUESTION PAPER
The total marks for this paper are 100.
Section A is worth 60 marks and consists of 4 questions.
You should attempt only 3 of these questions.
Each question in Section A has 4 parts, and each part carries 5 marks.
Section B is worth 40 marks and consists of 1 computer-based question.
The question in Section B has 5 parts, and each part carries 8 marks.
You should attempt all parts of section B.
You will need the spreadsheet ‘Data.xlsx’ for this section.
Section A
Section A is worth 60 marks and consists of 4 questions. You should attempt only 3 of these questions. Each question in Section A has 4 parts, and each part carries 5 marks.
Question 1 (20 points)
Answer the following questions. Show each step in your calculation.
Suppose you want to form a portfolio for a hedge fund, and the candidates are n insurance companies, n technology companies, and n estate companies. In every month of the next year, you want to randomly choose exactly one company to invest in. You cannot invest in the same company twice, and all remaining companies are equally likely to be selected.
- What is the probability that you invest in a technology company in the second month?
- What is the probability that in the first two months you invest in companies from different categories?
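As a rough numerical check of the sampling-without-replacement setup above, the following sketch enumerates every ordered pair of distinct companies for months one and two and counts the favourable cases exactly. The value n = 4, the function name and the category labels are illustrative assumptions, not part of the question.

```python
from fractions import Fraction
from itertools import permutations

def first_two_month_probabilities(n):
    """Exact probabilities for months 1 and 2 when picking without replacement."""
    # 3n companies in total, each tagged with its category.
    companies = [(category, i)
                 for category in ("insurance", "technology", "estate")
                 for i in range(n)]
    # Every ordered pair of distinct companies is equally likely.
    pairs = list(permutations(companies, 2))
    tech_second = sum(1 for _, second in pairs if second[0] == "technology")
    different = sum(1 for first, second in pairs if first[0] != second[0])
    return Fraction(tech_second, len(pairs)), Fraction(different, len(pairs))

p_tech_second, p_different = first_two_month_probabilities(4)
print(p_tech_second)  # 1/3
print(p_different)    # 8/11
```

For this illustrative n the second value agrees with 2n/(3n − 1), which is what a direct counting argument would suggest.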
A test is being used to determine whether or not an individual patient has a certain type of disease. The test is not 100% accurate: if the individual has the disease, the probability that the test is positive is 0.95; and if the individual does not have the disease, the probability that the test is positive is 0.02. In the population as a whole, it is estimated that 1 in 5 people have the disease.
- Suppose an individual is selected at random from the population and tested, and the result is positive. What is the probability that this individual has the disease?
- Suppose an individual is selected at random from the population and tested, and the result is negative. What is the probability that this individual has the disease?
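The two conditional probabilities above can be checked with Bayes' theorem. The sketch below is one possible way to do the arithmetic exactly; the function and variable names are hypothetical, and the inputs are the figures stated in the question (prevalence 1/5, positive rate 0.95 with the disease, 0.02 without).

```python
from fractions import Fraction

def posterior_disease(prior, p_pos_given_disease, p_pos_given_healthy, positive):
    """P(disease | test result) by Bayes' theorem."""
    if positive:
        like_disease, like_healthy = p_pos_given_disease, p_pos_given_healthy
    else:
        # Probabilities of a negative result are the complements.
        like_disease = 1 - p_pos_given_disease
        like_healthy = 1 - p_pos_given_healthy
    numerator = like_disease * prior
    return numerator / (numerator + like_healthy * (1 - prior))

prior = Fraction(1, 5)
p_pos_disease = Fraction(95, 100)
p_pos_healthy = Fraction(2, 100)

p_given_positive = posterior_disease(prior, p_pos_disease, p_pos_healthy, positive=True)
p_given_negative = posterior_disease(prior, p_pos_disease, p_pos_healthy, positive=False)
print(p_given_positive)  # 95/103
print(p_given_negative)  # 5/397
```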
Question 2 (20 points)
Let X ∼ N(µ, σ²) be a random variable with a normal distribution with mean µ and variance σ², and let φ(x) and Φ(x) = P(X ≤ x) be the density function and the cumulative distribution function of X, respectively. Let Y be the corresponding lognormal random variable.
- Compute the probability P(Y ≤ y). By differentiating the function P(Y ≤ y), or otherwise, derive the probability density function of Y. (5)
- Using the conclusion in part (a), or otherwise, derive the mean of Y . (5)
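A numerical sanity check for the two parts above might look like the following sketch. It assumes the usual reading Y = exp(X), which the question does not state explicitly, and uses illustrative parameter values µ = 0.1 and σ = 0.5. The closed form exp(µ + σ²/2) is the standard lognormal mean; the midpoint-rule integration is only a cross-check, not a derivation.

```python
import math
from statistics import NormalDist

def lognormal_cdf(y, mu, sigma):
    # P(Y <= y) = P(X <= ln y) for y > 0, assuming Y = exp(X).
    return NormalDist(mu, sigma).cdf(math.log(y))

def lognormal_pdf(y, mu, sigma):
    # Chain rule: d/dy P(X <= ln y) = f_X(ln y) / y.
    return NormalDist(mu, sigma).pdf(math.log(y)) / y

def numerical_mean(mu, sigma, upper=50.0, steps=50_000):
    # Midpoint rule for the integral of y * f_Y(y) over (0, upper).
    h = upper / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        total += y * lognormal_pdf(y, mu, sigma)
    return total * h

mu, sigma = 0.1, 0.5  # illustrative values, not from the question
closed_form = math.exp(mu + sigma ** 2 / 2)
approx = numerical_mean(mu, sigma)
print(closed_form, approx)
```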
Let X1 ∼ N(0,4) and X2 ∼ N(1,9) be two normal random variables. The correlation between X1 and X2 is 0.5.
- Compute the mean and variance for X1 + 2X2 and X1 − 3X2. (5)
- Compute the correlation between X1 + 2X2 and X1 − 3X2. (5)
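The moments of the two linear combinations above follow from the bilinearity of covariance. The sketch below carries out that arithmetic with the stated inputs (Var(X1) = 4, Var(X2) = 9, correlation 0.5); the helper names are hypothetical.

```python
import math

mean1, mean2 = 0.0, 1.0
var1, var2, rho = 4.0, 9.0, 0.5
cov12 = rho * math.sqrt(var1 * var2)  # Cov(X1, X2) = rho * sd1 * sd2

def moments(a, b):
    """Mean and variance of a*X1 + b*X2."""
    mean = a * mean1 + b * mean2
    var = a * a * var1 + b * b * var2 + 2 * a * b * cov12
    return mean, var

def covariance(a, b, c, d):
    """Cov(a*X1 + b*X2, c*X1 + d*X2) by bilinearity."""
    return a * c * var1 + b * d * var2 + (a * d + b * c) * cov12

mean_u, var_u = moments(1, 2)    # X1 + 2*X2
mean_v, var_v = moments(1, -3)   # X1 - 3*X2
corr_uv = covariance(1, 2, 1, -3) / math.sqrt(var_u * var_v)
print(mean_u, var_u)  # 2.0 52.0
print(mean_v, var_v)  # -3.0 67.0
print(corr_uv)
```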
Question 3 (20 points)
(a) (i) Suppose the population mean and variance are unknown for a random variable X, and you have a random sample of size n < 30. Outline the general steps for a two-sided hypothesis test for comparing two population means.
(ii) Now suppose you have two random samples (sample sizes n1, n2 < 30) drawn from two populations with unknown population means and variances. Outline the general steps for a standard two-sided hypothesis test for comparing two population means. (5)
- A study was conducted in London to determine the number of hours spent on Facebook by university and high school students. For this reason, a questionnaire was administered to a random sample of 40 university students and a random sample of 50 high school students in London. The hours per day spent on Facebook were recorded. Summaries of the data are shown in the table below. (5)
| | Sample size | Sample mean | Sample variance |
| University students | nA = 40 | µA = 3 | s²A = 11 |
| High school students | nB = 50 | µB = 2 | s²B = 11 |
Use an appropriate hypothesis test to determine whether the mean hours spent per day on Facebook by university students is higher than that of high school students at the 5% significance level. Assume that the two samples are independent.
- Now an independent study was conducted in Manchester to determine the number of hours spent on Facebook by university and high school students. For this reason, a questionnaire was administered to a random sample of 60 university students and a random sample of 70 high school students in Manchester. The hours per day spent on Facebook were recorded. Summaries of the data are shown in the table below. (5)
| | Sample size | Sample mean | Sample variance |
| University students | nC = 60 | µC = 4 | s²C = 14 |
| High school students | nD = 70 | µD = 3 | s²D = 10 |
What are the approximate distributions of X̄A − X̄B and X̄C − X̄D, where
A A A B B B C C C , andD D D ?
- Using part (d), or otherwise, construct a two-sided symmetric 95% confidence interval for . (5)
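For the London comparison in this question, one possible approach is a one-sided large-sample test based on the normal approximation, since both samples have at least 40 observations. The sketch below is an illustration under that assumption rather than the required method; the function name is hypothetical and the inputs are the summary figures from the London table.

```python
import math
from statistics import NormalDist

def one_sided_z_test(mean_a, var_a, n_a, mean_b, var_b, n_b, alpha=0.05):
    """Test H0: mu_A = mu_B against H1: mu_A > mu_B (normal approximation)."""
    # Standard error of the difference of two independent sample means.
    se = math.sqrt(var_a / n_a + var_b / n_b)
    z = (mean_a - mean_b) / se
    critical = NormalDist().inv_cdf(1 - alpha)   # upper-tail critical value
    p_value = 1 - NormalDist().cdf(z)            # upper-tail p-value
    return z, critical, p_value, z > critical

z, critical, p_value, reject = one_sided_z_test(3, 11, 40, 2, 11, 50)
print(round(z, 4), round(critical, 4), round(p_value, 4), reject)
```

Because the two sample variances are equal here, a pooled-variance t statistic would take the same numerical value; only the reference distribution would differ.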
Question 4 (20 points)
A simple linear regression model
is estimated using the observations {xt,yt} for t = 1,…,300. The OLS estimate of β is 1.2 with an estimated standard error of 0.1. The sample standard deviation of Y is 0.4 and the sample correlation between X and Y is 0.6.
- Test the hypothesis that β = 1 against the alternative that β < 1, using a significance level of 2.5%. State the highest significance level at which the null hypothesis could be rejected (you may want to use a calculator or statistical software). (5)
- Calculate the sample standard deviation of X. Perform an analysis of variance (ANOVA) giving the TSS, ESS and RSS (i.e. the total, explained and residual sum of squares) and test the goodness-of-fit of the model. (5)
- State and interpret the Gauss-Markov theorem. Describe, briefly, three typical causes for violation of the Gauss-Markov assumptions. (5)
- The OLS residuals have a sample skewness of −0.1 and a kurtosis of 0.3. Is it reasonable to suppose that they are observations on an i.i.d. normally distributed error process? What are the consequences of non-normality in the residuals? (5)
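The numerical parts of this question can be sketched as follows. Several choices here are assumptions rather than anything the paper prescribes: the t(298) distribution is approximated by the standard normal, the standard identities β̂ = r·sY/sX, R² = r² and TSS = (n − 1)·sY² are used, the stated kurtosis of 0.3 is read as excess kurtosis, and the Jarque-Bera statistic is used as one common normality check.

```python
import math
from statistics import NormalDist

n, beta_hat, se_beta = 300, 1.2, 0.1
s_y, r_xy = 0.4, 0.6
skewness, excess_kurtosis = -0.1, 0.3  # assumption: 0.3 is excess kurtosis

# t-test of H0: beta = 1 against H1: beta < 1 (lower-tail test).
t_stat = (beta_hat - 1.0) / se_beta
p_lower = NormalDist().cdf(t_stat)                # normal approximation to t(298)
reject_h0 = t_stat < NormalDist().inv_cdf(0.025)  # 2.5% lower-tail critical value

# Sample standard deviation of X from beta_hat = r * s_y / s_x.
s_x = r_xy * s_y / beta_hat

# ANOVA decomposition using R^2 = r^2 for simple linear regression.
tss = (n - 1) * s_y ** 2
ess = r_xy ** 2 * tss
rss = tss - ess
f_stat = (ess / 1) / (rss / (n - 2))

# Jarque-Bera statistic; under normality it is approximately chi-square(2),
# whose upper-tail probability is exp(-x / 2).
jb = n / 6 * (skewness ** 2 + excess_kurtosis ** 2 / 4)
jb_p_value = math.exp(-jb / 2)

print(t_stat, p_lower, reject_h0)
print(s_x, tss, ess, rss, f_stat)
print(jb, jb_p_value)
```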
Section B
Section B is worth 40 marks and consists of 1 computer-based question. The question in Section B has 5 parts, and each part carries 8 marks. You should attempt all parts of section B. You will need the spreadsheet ‘Data.xlsx’ for this section.
Question 5 (40 points)
You are given a 10-year monthly data file with the title ‘Sales.xlsx’. The first column is the monthly sales in the UK of a ski equipment company, the second column is the monthly return (percentage change) in GDP in the UK, and the third column is the price for the material required to produce the ski equipment.
(a) Plot the time series and histogram for each column. Calculate the correlation between sales and GDP return (the first two columns), and the correlation between sales and material price (the first and third columns). (8)
- Build a multiple regression model for the sales, using the GDP return and material price as regressors. Obtain the ordinary least squares (OLS) estimates for the parameters, and discuss whether the GDP return or material price has a significant effect on the sales at the 5% significance level. (8)
- Plot the fitted lines and provide a qualitative interpretation of the coefficients. What can you say about the goodness of fit of the model? (8)
- You realise that skiing is a seasonal activity, so you want to improve the regression model. Incorporate the seasonality into the regression model by considering seasons as categorical variables. Obtain the ordinary least squares (OLS) estimates for the parameters. (8)
- Compare the goodness of fit between this model and the one in (b). Does multicollinearity exist in your categorical variables? If so, how can you deal with this problem? (8)
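The multicollinearity issue raised in the last part can be illustrated without the data file. The sketch below builds seasonal dummy columns for 120 months and checks whether they sum to the intercept column in every row, which is the exact linear dependence often called the dummy variable trap. The month-to-season mapping, the assumption that the sample starts in January, and all names are hypothetical choices for illustration only.

```python
# Hypothetical mapping from calendar month to season.
SEASON_OF_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}
SEASONS = ("winter", "spring", "summer", "autumn")

def design_rows(months, drop=None):
    """Rows of [intercept, season dummies...], optionally dropping one season."""
    kept = [s for s in SEASONS if s != drop]
    return [[1] + [int(SEASON_OF_MONTH[m] == s) for s in kept] for m in months]

def dummies_sum_to_intercept(rows):
    # True when the dummy columns add up to the intercept column in every row.
    return all(sum(row[1:]) == row[0] for row in rows)

months = [(t % 12) + 1 for t in range(120)]  # 10 years, assumed to start in January

full = design_rows(months)                     # intercept + 4 dummies
reduced = design_rows(months, drop="winter")   # intercept + 3 dummies

collinear_full = dummies_sum_to_intercept(full)
collinear_reduced = dummies_sum_to_intercept(reduced)
print(collinear_full, collinear_reduced)  # True False
```

Dropping one season makes it the baseline category, so each remaining dummy coefficient is read as a difference from that baseline. Omitting the intercept instead is another common remedy.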