
Regression and Time Series Models

Problem Set 1

1. Generate a random sample of size 1000 from the following distributions:

(a) exponential (use mean = 0.1)
(b) normal (use mean = 2, standard deviation = 1)
(c) continuous uniform (use range (3, 5))
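
As a starting point, the three samples could be drawn as in the sketch below. It assumes Python with only the standard library; the fixed seed and the variable names are choices made here for illustration, not part of the problem statement. Note that `random.expovariate` takes the rate, which is the reciprocal of the mean.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed (an assumption) so the run is reproducible
n = 1000

# expovariate expects the rate lambda = 1 / mean, so mean 0.1 gives rate 10.
exp_sample = [rng.expovariate(1 / 0.1) for _ in range(n)]
norm_sample = [rng.gauss(2, 1) for _ in range(n)]    # mean 2, standard deviation 1
unif_sample = [rng.uniform(3, 5) for _ in range(n)]  # continuous uniform on (3, 5)

# Print each sample mean as a quick sanity check against the target mean.
for name, sample in [("exponential", exp_sample),
                     ("normal", norm_sample),
                     ("uniform", unif_sample)]:
    print(name, round(statistics.fmean(sample), 3))
```

The printed sample means should be close to 0.1, 2 and 4 respectively, although the exact values depend on the seed.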

2. Plot histograms for the samples generated in Question 1 and compare their shapes with the
densities of the distributions from which the samples were drawn.

3. Generate a random sample of size n = 10 from the exponential distribution with mean 3.
Calculate the mean of the generated sample. Repeat this process 100 times, recording the
sample mean each time. Plot the histogram of these sample means. Does this plot resemble
a normal distribution? If the sample size is changed from n = 10 to n = 150, what
do you observe? Can you think of a theoretical result that supports this observation?
Further, perform the same exercise with the exponential distribution replaced by a Poisson
distribution with mean equal to 31.
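
The repeated-sampling step for the exponential case might be sketched as follows. This is a minimal illustration using only the Python standard library; the helper `sample_means`, the seed, and the `results` dictionary are names introduced here. It prints the spread of the 100 sample means next to the value 3 / sqrt(n), which is the standard deviation of a sample mean for an exponential distribution with mean 3. The histogram and the Poisson variant are left out.

```python
import random
import statistics

def sample_means(n, reps, mean, rng):
    """Return `reps` sample means, each computed from n exponential draws."""
    means = []
    for _ in range(reps):
        draws = [rng.expovariate(1 / mean) for _ in range(n)]  # rate = 1 / mean
        means.append(statistics.fmean(draws))
    return means

rng = random.Random(42)  # arbitrary seed, chosen only for reproducibility
results = {n: sample_means(n, reps=100, mean=3, rng=rng) for n in (10, 150)}

for n, means in results.items():
    print(
        f"n={n}: mean of sample means={statistics.fmean(means):.3f}, "
        f"spread={statistics.stdev(means):.3f}, "
        f"3/sqrt(n)={3 / n ** 0.5:.3f}"
    )
```

The spread of the sample means should shrink as n grows, which is the behaviour the question asks you to explain.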

4. Generate a random sample of size n = 50 from a standard normal distribution. Calculate
the 95% confidence interval for the mean. Does the mean of the sample lie in this confidence
interval? Repeat the previous steps 100 times and count how many times the
true mean lies in the confidence interval. Does your experiment agree with the concept of
a confidence interval?
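
One way to run the coverage experiment is sketched below. It makes a simplifying assumption: the interval uses the normal critical value (about 1.96) with the sample standard deviation, whereas a t-based interval with 49 degrees of freedom would use a slightly larger critical value (about 2.01). The function name `ci_covers_true_mean` and the seed are illustrative.

```python
import random
import statistics
from statistics import NormalDist

rng = random.Random(1)           # arbitrary seed for reproducibility
z = NormalDist().inv_cdf(0.975)  # two-sided 95% normal critical value

def ci_covers_true_mean(n, true_mean, rng):
    """Draw n values from N(true_mean, 1) and report whether the CI covers true_mean."""
    sample = [rng.gauss(true_mean, 1) for _ in range(n)]
    centre = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / n ** 0.5
    return centre - z * std_err <= true_mean <= centre + z * std_err

# Repeat the experiment 100 times and count the intervals that cover the true mean 0.
hits = sum(ci_covers_true_mean(50, 0.0, rng) for _ in range(100))
print(f"{hits} of 100 intervals contain the true mean")
```

If the procedure behaves as intended, the count should be in the neighbourhood of 95, with some run-to-run variation.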

5. Data on the carat size of diamonds and their corresponding prices are given in the file
diamond.csv. Fit a linear regression model to predict the price of a diamond given its
carat size.

6. Global warming is an important environmental issue in the contemporary world. Data on
the ice cover on Earth and the corresponding year are provided in the file ice data.csv.
Fit a linear regression model to predict the ice cover in the year 2017.
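
For the two regression questions above, the fitting and prediction steps could look like the sketch below. The data files are not reproduced here, so the (year, ice cover) pairs are made-up numbers used only to show the mechanics; `fit_line` is a hypothetical helper implementing ordinary least squares with the closed-form formulas for slope and intercept.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1

# Hypothetical (year, ice cover) pairs, invented for illustration only.
years = [2010, 2011, 2012, 2013, 2014]
cover = [5.0, 4.8, 4.7, 4.4, 4.3]

b0, b1 = fit_line(years, cover)
prediction_2017 = b0 + b1 * 2017  # extrapolate the fitted line to 2017
print(f"slope={b1:.3f}, intercept={b0:.3f}, predicted 2017 cover={prediction_2017:.3f}")
```

With the real files, the two lists would be replaced by the columns read from the CSV (for example with the `csv` module); the column names depend on the files and are not assumed here.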

7. Load the data given in the file data 1.csv. Fit a linear regression model for this data and
compute the residuals. Can you say the residuals are normally distributed?

8. Load the data given in the file data 2.csv. Fit a linear regression model for this data and
compute the residuals. Can you say the residuals are normally distributed?
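
A rough way to examine the residuals is sketched below. The data files are not available here, so the sketch uses two synthetic data sets, one with normal noise and one with skewed (exponential) noise. The check itself is only a heuristic chosen for this illustration: sample skewness and excess kurtosis near zero are consistent with normality. A Q-Q plot or a formal test such as Shapiro-Wilk would be the more usual tools. All function names are introduced here.

```python
import random
import statistics

def residuals(x, y):
    """Fit y = b0 + b1 * x by least squares and return the residuals."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

def skew_and_excess_kurtosis(r):
    """Moment-based sample skewness and excess kurtosis (both near 0 for normal data)."""
    centre = statistics.fmean(r)
    scale = statistics.pstdev(r)
    z = [(ri - centre) / scale for ri in r]
    skew = statistics.fmean([zi ** 3 for zi in z])
    kurt = statistics.fmean([zi ** 4 for zi in z]) - 3
    return skew, kurt

rng = random.Random(5)  # arbitrary seed for reproducibility
x = [i / 100 for i in range(1000)]
y_normal = [1 + 2 * xi + rng.gauss(0, 1) for xi in x]        # synthetic, normal noise
y_skewed = [1 + 2 * xi + rng.expovariate(1.0) for xi in x]   # synthetic, skewed noise

res_normal = residuals(x, y_normal)
res_skewed = residuals(x, y_skewed)
print("normal noise:", skew_and_excess_kurtosis(res_normal))
print("skewed noise:", skew_and_excess_kurtosis(res_skewed))
```

The residuals from the skewed-noise data should show a clearly positive skewness, while those from the normal-noise data should have both statistics close to zero.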

9. Consider the following simple linear regression model

$y = \beta_0 + \beta_1 x + \epsilon$

where $\epsilon \sim N(0, \sigma^2)$. It is known that the explanatory variable x affects the
response variable y via the linear relationship y = 3 + x. The measurements of the response
variable y are collected using four different instruments, each with a different level
of accuracy. Note that there is no measurement error in the explanatory variable. The data
for each instrument are stored in the files
instrument 1.csv, instrument 2.csv, instrument 3.csv and instrument 4.csv.
Fit a linear regression model to each of the data sets. Which instrument do you think is the
most reliable, and why?
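
One possible criterion for comparing the instruments is the residual standard error of each fit, which estimates the noise standard deviation in the model; smaller values suggest a more accurate instrument. The sketch below illustrates this on synthetic data generated around y = 3 + x. The instrument labels and noise levels are hypothetical stand-ins, not values from the actual files.

```python
import random

def fit_with_residual_se(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, residual standard error)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return b0, b1, (sse / (n - 2)) ** 0.5  # n - 2 degrees of freedom

rng = random.Random(7)  # arbitrary seed for reproducibility
x = [rng.uniform(0, 10) for _ in range(200)]

# Hypothetical noise standard deviations, one per stand-in instrument.
noise_sd = {"instrument_a": 0.1, "instrument_b": 0.5,
            "instrument_c": 1.0, "instrument_d": 2.0}

fits = {}
for name, sd in noise_sd.items():
    y = [3 + xi + rng.gauss(0, sd) for xi in x]  # true relationship y = 3 + x plus noise
    fits[name] = fit_with_residual_se(x, y)
    b0, b1, se = fits[name]
    print(f"{name}: b0={b0:.3f}, b1={b1:.3f}, residual SE={se:.3f}")

most_reliable = min(fits, key=lambda k: fits[k][2])
print("smallest residual SE:", most_reliable)
```

Comparing how close the fitted intercept and slope are to the known values 3 and 1 gives a complementary check.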
