Question 1:

In these problems, we will examine how different characteristics of homes relate to home prices, identify the main drivers of pricing, and build linear models to predict prices from these numerical characteristics.

You need to download the Mount Pleasant Real Estate Data set (available from http://www.hawkeslearning.com/Statistics/dis/datasets.html). This data set includes information about 245 properties for sale in three communities in the suburban town of Mount Pleasant, South Carolina, in 2017. Use spreadsheet software or R to solve the problems.

(1)

Eliminate duplexes and properties with prices over $850,000 from the data. Eliminate non-numeric variables and redundant variables from the data.

(2)

Which variable correlates most strongly with price?

(3) Find the regression line Y = β0 + β1x with the variable chosen in the previous problem. [The lm function in R or the Analysis ToolPak add-in for Excel will do this.]
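As a rough illustration of problems (2) and (3), here is a minimal Python sketch that computes the Pearson correlation of each candidate variable with price and then fits the least squares line for the strongest one. The column names and all numbers are made-up toy values, not the Mount Pleasant data, and the helper names pearson_r and fit_line are hypothetical. R's lm function or the Analysis ToolPak should give the same intercept and slope on the same input.

```python
import math
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fit_line(xs, ys):
    """Least squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

# Hypothetical toy values, NOT taken from the real data set.
columns = {
    "sqft": [1500, 2000, 2500, 3000, 2200, 1800],
    "bedrooms": [3, 4, 3, 5, 4, 3],
    "bathrooms": [2, 2, 3, 4, 3, 2],
}
price = [310000, 395000, 480000, 610000, 430000, 350000]

# Problem (2): pick the variable with the largest absolute correlation.
best = max(columns, key=lambda name: abs(pearson_r(columns[name], price)))

# Problem (3): fit the line for that variable.
b0, b1 = fit_line(columns[best], price)
print(best, b0, b1)
```

Ranking by absolute value matters because a strong negative correlation is as informative as a strong positive one.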

For the remaining problems, consider the following variables associated with each property.

x1 = number of bedrooms

x2 = number of bathrooms

x3 = number of stories

x4 = square footage

x5 = whether the house has a pool

(4) Construct the multivariable least squares model with predictors x1, x2, x3, x4, x5. [First, convert x5 to a binary (0/1) variable.]

(5) Use a hypothesis test to determine whether the model is useful for predicting home values at significance level α. State the p-value and interpret it.

(6) Are any variables not useful predictors of home price at significance level α = 0.05? State the p-values of any rejected variables. What does this mean practically?
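For problems (4) and (5), the following Python sketch shows one way the pool column could be coded as 0/1, the five-predictor model fitted by solving the normal equations, and the overall F statistic computed. Everything here is illustrative: the eight homes and their prices are invented, the pool labels "Yes"/"No" are an assumption about how the column is coded, and the helper names are hypothetical. Turning the F statistic into a p-value, and getting the per-variable p-values for problem (6), needs the F and t distributions, which R's summary of an lm fit or Excel's regression output report directly.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(rows, y):
    """Return [b0, b1, ..., bp] minimizing squared error, with an intercept."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)

def f_statistic(rows, y, beta):
    """Overall F statistic: (SSR / p) / (SSE / (n - p - 1))."""
    n, p = len(y), len(beta) - 1
    fitted = [beta[0] + sum(b * x for b, x in zip(beta[1:], r)) for r in rows]
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)
    return (ssr / p) / (sse / (n - p - 1))

def pool_to_binary(value):
    """Map an assumed yes/no style label to 1/0."""
    return 1 if str(value).strip().lower() in {"yes", "y", "true", "1"} else 0

# Hypothetical homes: (bedrooms, bathrooms, stories, sqft, pool). Not real data.
homes = [
    (3, 2, 1, 1800, "No"), (4, 3, 2, 2600, "Yes"), (3, 2.5, 2, 2100, "No"),
    (5, 4, 2, 3400, "Yes"), (4, 2, 1, 2200, "No"), (3, 3, 2, 2400, "Yes"),
    (4, 3.5, 2, 3000, "No"), (2, 2, 1, 1400, "No"),
]
prices = [352000, 515000, 401000, 668000, 419000, 489000, 571000, 287000]

rows = [(bd, ba, st, sq, pool_to_binary(pl)) for bd, ba, st, sq, pl in homes]
beta = fit_ols(rows, prices)
print("coefficients:", beta)
print("F statistic:", f_statistic(rows, prices, beta))
```

With only eight toy rows the fit has just two residual degrees of freedom, so the printed numbers carry no meaning beyond showing the mechanics.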

Question 2:

Use the data in the attached Excel file to solve the following:

1: Calculate the log returns for each day on the list (you could use opening prices or closing prices).

2: Calculate the volatility for the whole data set. Assume volatility is (approximately) constant through time.

3: Calculate the mean of the interest rate over the first year.

4: Assume the prior distribution of the interest rate is normal, with mean equal to the first-year mean and standard deviation equal to the volatility calculated above. Use the second year of data to update the prior to a posterior distribution for the interest rate.

5: Repeat the previous problem based on additional data from the third year of prices.

6: If you were to continue updating the distribution based on more and more data, what would happen?
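The steps above can be sketched in Python as follows. This is only one possible reading of the problem: it treats the daily log return as the "interest rate", takes volatility to be the sample standard deviation of the daily log returns, and uses the standard normal prior with known data variance (conjugate) update. The price list is invented, the three "years" are just three short slices of it, and the helper names are hypothetical.

```python
import math
from statistics import mean, stdev

def log_returns(prices):
    """Daily log returns ln(P_t / P_{t-1})."""
    return [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]

def update_normal(prior_mean, prior_var, data, data_var):
    """Posterior (mean, variance) for a normal mean with known data variance."""
    n = len(data)
    post_precision = 1.0 / prior_var + n / data_var
    post_mean = (prior_mean / prior_var + sum(data) / data_var) / post_precision
    return post_mean, 1.0 / post_precision

# Hypothetical closing prices, NOT the attached data.
prices = [100.0, 100.8, 100.1, 101.5, 101.0, 102.2, 101.7, 103.0, 102.4, 103.9]

r = log_returns(prices)            # step 1
sigma = stdev(r)                   # step 2: volatility, assumed constant
year1, year2, year3 = r[:3], r[3:6], r[6:]   # stand-ins for the three years

m, v = mean(year1), sigma ** 2     # steps 3 and 4: prior N(m, sigma^2)
m, v = update_normal(m, v, year2, sigma ** 2)   # step 4: posterior after year 2
print("after year 2:", m, math.sqrt(v))
m, v = update_normal(m, v, year3, sigma ** 2)   # step 5: posterior after year 3
print("after year 3:", m, math.sqrt(v))
```

Under this model the posterior precision grows by n / sigma^2 with every batch of n observations, so the posterior variance shrinks toward zero and the posterior mean moves toward the sample mean of all the data, which is the behavior step 6 asks about.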