UNIVERSITY OF TORONTO
Faculty of Arts and Science
ECO220Y1Y
PART 1 OF 2
Duration - 3 hours
This exam consists of two parts and a SCANTRON form. You may detach the formula
sheets and statistical tables (Standard Normal, Student t and F) stapled to this part.
You are responsible for turning in the completed SCANTRON form and all 10 pages of
Part 2 of this exam. For both parts combined there are a total of 120 possible points.
ENTER YOUR NAME AND STUDENT # ON BOTH THE PINK SCANTRON FORM
AND PART 2 BEFORE THE END OF THE EXAM IS ANNOUNCED.
Part 1: 20 multiple choice questions worth 3 points each for a total of 60 points
A correct answer is worth 3 points and an incorrect answer is worth 0 points
Answers must be properly recorded on the pink SCANTRON form to earn marks
Print your LAST NAME and INITIALS in the boxes AND darken each letter in the
corresponding bracket below each box; Sign your name in the SIGNATURE box
Print your 9 digit STUDENT NUMBER in the boxes AND darken each number in
the corresponding bracket below each box
Use only a pencil or blue or black ball point pen
o Pencil strongly recommended because it can be erased
Make dark solid marks that fill the bubble completely
Erase completely any marks you want to change
Crossing out a marked box is not acceptable and is incorrect
If more than one answer is marked then that question earns 0 points
► Questions (1) – (2): This relative frequency histogram describes the GPA of a
sample of students. GPA is recorded to the nearest hundredth.
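[Figure: relative frequency histogram of GPA. Horizontal axis: GPA, with ticks at 1, 1.5, 2, 2.5, 3, 3.5 and 4. Vertical axis: relative frequency, with ticks at 0, .1 and .2. Bar labels recovered: .1477, .1136, .0455, .0341.]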
(A) 19
(B) 31
(C) between 0 and 19
(D) between 0 and 31
(E) between 19 and 31
(2) What is the most specific statement you can make about the sample median?
(A) It is 2.50
(B) It is 3.24
(C) It is greater than 2.50
(D) It is between 2.50 and 3.00
(E) It is between 3.00 and 3.50
(3) You are asked to estimate average customer satisfaction measured on a scale from
0 to 10 with an interval width no bigger than 1 (i.e. with a margin of error no bigger than
0.5). You believe the standard deviation across customers is 2. How many customers
should you randomly sample?
(A) 54
(B) 56
(C) 58
(D) 60
(E) 62
Part 1 of 2 Page 3 of 8
(4) Two variables have the same mean and the same range. Which would cause the
standard deviations to differ substantially?
Statement 3: The two variables are highly negatively correlated with each other
(A) Only 1
(B) Only 2
(C) Only 3
(D) Only 2 and 3
(E) All three: 1, 2 and 3
(5) An instructor says the marks on a quiz are normally distributed with a mean of 64
and standard deviation of 12. Which is the best approximation of the 60th percentile?
(A) 60
(B) 62
(C) 65
(D) 66
(E) 67
► Questions (6) – (7): To be profitable a hotel gift shop must have revenue of at least
$1,240 per day. On average each visitor spends $10 with a standard deviation of $20.
(6) For a random sample of 100 visitors which phrase should you expect to describe a
histogram of the money spent?
(7) In a day with 100 visitors, what is the chance that the gift shop will be profitable?
(A) 0.0014
(B) 0.0043
(C) 0.1151
(D) 0.3849
(E) 0.4957
(8) A population is Bell shaped with a mean of 10. The standard deviation is unknown.
Which sample size would result in the smallest probability that the sample mean is
between 10 and 12?
(A) n = 2
(B) n = 4
(C) n = 8
(D) n = 16
(E) n = 32
(9) An employer claims that special exemptions regarding reimbursement for client
dinners are rare: it happens only 1 percent of the time. An accountant audits a random
selection of 100 transactions and finds three cases with special exemptions. What is the
probability of finding three or more such cases if the employer’s claim is true?
(A) 0.0776
(B) 0.0782
(C) 0.0788
(D) 0.0794
(E) 0.0800
(10) When conducting a survey of people, which would cause non-sampling error?
(11) Consider H0: μ = 10 and H1: μ < 10 where μ is the mean of a Normal population.
For a sample size of 16 randomly selected from that population you obtain a sample
standard deviation of 1.37. Which values of the sample mean would allow you to reject
the null hypothesis and infer that the research (alternative) hypothesis is true?
► Questions (12) – (13): A 2008 article in the Chronicle of Higher Education entitled
“Colleges Mine Data to Predict Dropouts” discusses how some universities have
analyzed data to try to identify students at risk of failure or drop-out. “[At a major
university], researchers monitored how frequently students viewed discussion posts and
content pages on course web sites for three different courses to find connections
between online engagement and academic success. [In the table below] students who
were ‘successful’ received an A, B, or C in the class, and students who were
‘unsuccessful’ received a D, F, or an incomplete.”
(12) Which approach would directly address the question: “How much more often do
successful students typically view the content pages compared to unsuccessful
students?”
(13) After reviewing the table above, the university considers upgrading to a new course
management system to increase students’ course-specific online activity. While it is
substantially more expensive than the current system, the Dean says “This additional
expense is clearly worth it. Better engaging our students online will improve their course
performance and substantially reduce the number of students getting a D or F or
withdrawing.” The university buys the new system. While students are positive about it
and the number of views increases, the results are disappointing and the Dean’s
predictions do not come true. Which best explains why?
► Questions (14) – (17): 209 randomly selected voters are shown a video of a political
candidate and asked to grade the candidate on friendliness from 0 to 100 where 100 is
the best. The video is identical in all cases except that the color of backdrop (the curtain
behind the candidate) is randomly varied among the colors of green, orange, and grey.
Also the room temperature where the voter watches the video is randomly assigned a
value between 18 and 30 degrees Celsius. The regression model includes the
temperature (tmp) and the temperature squared (tmp2). Here are the estimated
regression coefficients with standard errors in parentheses. The R² is 0.1019.
(14) If green were the omitted category then the results would be y-hat = __.
Criticism 1: These are observational data and we cannot infer that the temperature and
color have a causal effect on voters’ perceptions of friendliness
Criticism 3: Because the researcher did not control for other factors that would affect
perceived friendliness such as the personality of the voter watching, the topic discussed
in the video, or the body language used, the estimates of the effect of temperature and
color on the rating are biased
(A) Only 1
(B) Only 2
(C) Only 3
(D) Only 1 and 2
(E) Only 1 and 3
(16) What if we thought the effect of temperature depends on the backdrop color?
(17) “Is there sufficient evidence to infer a non-linear relationship between temperature
and perceived friendliness?” To answer, what is the value of the test statistic?
(A) -2.28
(B) -2.05
(C) -1.96
(D) 2.05
(E) 2.28
► Questions (18) – (20): For a new public holiday three different dates are being
considered: Date 1, Date 2 and Date 3. A researcher randomly samples 120 people and
asks which date is most preferred. P̂1, P̂2, and P̂3 are the fractions preferring Dates 1, 2,
and 3 respectively. Suppose that in the entire population 50 percent prefer Date 1, 30
percent prefer Date 2, and 20 percent prefer Date 3.
(18) Which sample proportion would have the most sampling error (i.e. variability)?
(A) P̂1
(B) P̂2
(C) P̂3
(D) All three have the same because the sample size is 120 in all three cases
(E) All three are subject to zero sampling error because the sample has been
randomly selected
(19) If H0: p1 = 0.5 and H1: p1 ≠ 0.5, which sample proportion selecting Date 1 would
result in a Type I error?
(A) 0.40
(B) 0.45
(C) 0.50
(D) 0.55
(E) 0.58
(20) If H0: p1 = 0.6 and H1: p1 ≠ 0.6, which sample proportion selecting Date 1 would
result in a Type II error?
(A) 0.3
(B) 0.4
(C) 0.6
(D) 0.7
(E) 0.8
Make sure to properly mark your name, student number, signature, and answers
for Part 1 on the pink SCANTRON form before the end of the exam is announced.
No extra time is permitted.
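Population mean: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
Sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$
Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n-1} = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$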
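Conditional probability: $P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$
Complement rules: $P(A^C) = 1 - P(A)$ and $P(A^C \mid B) = 1 - P(A \mid B)$
Multiplication rule: $P(A \text{ and } B) = P(A \mid B)\,P(B)$
Addition rule: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$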
Expected value: $E[X] = \mu = \sum_{\text{all } x} x\,p(x)$
Variance: $V[X] = E[(X - \mu)^2] = \sigma^2 = \sum_{\text{all } x} (x - \mu)^2\,p(x)$
Combinatorial formula: $C_x^n = \frac{n!}{x!\,(n-x)!}$
Binomial probability: $p(x) = \frac{n!}{x!\,(n-x)!}\,p^x (1-p)^{n-x}$ for $x = 0, 1, 2, \ldots, n$
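The binomial probability formula can be checked numerically. The following is a minimal Python sketch using illustrative numbers that do not come from any question on this exam; the function names `binomial_pmf` and `binomial_upper_tail` are hypothetical helpers, not part of the formula sheet.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_upper_tail(x, n, p):
    """P(X >= x), computed as 1 minus the probabilities of 0, ..., x - 1."""
    return 1 - sum(binomial_pmf(k, n, p) for k in range(x))

# Illustrative numbers only: 10 trials, success probability 0.5
print(binomial_pmf(5, 10, 0.5))         # equals 252 / 1024
print(binomial_upper_tail(8, 10, 0.5))  # equals (45 + 10 + 1) / 1024
```

Computing an "at least x" probability through the complement keeps the sum short when x is small relative to n.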
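z statistic: $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
Confidence interval estimator of $\mu$ when $\sigma^2$ is known: $\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Sample size to estimate $\mu \pm \tau$ when $\sigma^2$ is known: $n = \left(\frac{z_{\alpha/2}\,\sigma}{\tau}\right)^2$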
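As a rough illustration of the sample-size formula for estimating a mean when the variance is known, the Python sketch below squares the ratio of (critical value times standard deviation) to the desired margin of error and rounds up. The function name `sample_size_for_mean` and the input values are assumptions chosen for illustration; they are not taken from any question on this exam.

```python
from math import ceil

def sample_size_for_mean(z_crit, sigma, margin):
    """Smallest whole n with z_crit * sigma / sqrt(n) no larger than margin."""
    return ceil((z_crit * sigma / margin) ** 2)

# Illustrative numbers only: 95% confidence (z = 1.96), sigma = 5, margin = 1
n = sample_size_for_mean(1.96, 5.0, 1.0)
print(n)  # (1.96 * 5 / 1)^2 = 96.04, rounded up to the next whole number
```

Rounding up rather than to the nearest integer is the usual choice, since rounding down would leave the margin of error slightly larger than requested.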
t statistic: $t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$
Confidence interval estimator of $\mu$ when $\sigma^2$ is unknown: $\bar{X} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$
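Inference about p (population proportion):
Test statistic: $z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$
CI estimator: $\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n}$
Sample size to estimate $p \pm \tau$: $n = \left(\frac{z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})}}{\tau}\right)^2$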
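Inference about $(\mu_1 - \mu_2)$ when $\sigma_1^2 \neq \sigma_2^2$:
Test statistic: $t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
CI estimator: $(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
Degrees of freedom: $\nu = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$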
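The degrees-of-freedom expression for comparing two means is tedious to evaluate by hand. A minimal Python sketch, assuming the unequal-variances form of the formula and using made-up sample variances and sizes (the function name `two_sample_df` is hypothetical):

```python
def two_sample_df(s1_sq, n1, s2_sq, n2):
    """Degrees of freedom for the two-sample t statistic with unequal variances."""
    a = s1_sq / n1  # estimated variance of the first sample mean
    b = s2_sq / n2  # estimated variance of the second sample mean
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Illustrative numbers only: s1^2 = 4 with n1 = 10, s2^2 = 9 with n2 = 15
df = two_sample_df(4.0, 10, 9.0, 15)
print(df)
```

The result is generally not a whole number; when reading a printed t table it is common to use the nearest tabulated row or, more conservatively, the next lower one.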
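Inference about $p_1 - p_2$:
Test statistic [H0: $(p_1 - p_2) = 0$]: $z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
CI estimator: $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$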
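SIMPLE REGRESSION: $y_i = \alpha + \beta x_i + \varepsilon_i$
Least squares line / linear regression line: $\hat{y} = a + bx$, with $b = \frac{\mathrm{cov}(x, y)}{s_x^2}$ and $a = \bar{y} - b\bar{x}$
Coefficient of Determination: $R^2 = \frac{[\mathrm{cov}(x, y)]^2}{s_x^2 s_y^2}$
Sum of squared errors: $SSE = (n-1)\left(s_y^2 - \frac{(s_{xy})^2}{s_x^2}\right)$
Standard error of estimate: $s_\varepsilon = \sqrt{\frac{SSE}{n-2}}$, $s_\varepsilon^2 = \frac{SSE}{n-2}$
Standard error of least squares slope estimate: $s_b = \frac{s_\varepsilon}{\sqrt{(n-1) s_x^2}}$
Test statistic & confidence interval estimator for β: $t = \frac{b - \beta}{s_b}$ and $b \pm t_{\alpha/2}\, s_b$, with $\nu = n - 2$
Prediction Interval for y for a given value of x ($x_g$): $\hat{y} \pm t_{\alpha/2}\, s_\varepsilon \sqrt{1 + \frac{1}{n} + \frac{(x_g - \bar{X})^2}{(n-1) s_x^2}}$, with $\nu = n - 2$
Confidence Interval for expected value of y given x ($x_g$): $\hat{y} \pm t_{\alpha/2}\, s_\varepsilon \sqrt{\frac{1}{n} + \frac{(x_g - \bar{X})^2}{(n-1) s_x^2}}$, with $\nu = n - 2$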
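The simple regression formulas can be traced through on a small data set. The Python sketch below computes the slope, intercept, coefficient of determination and SSE from the sample covariance and sample variances; the data and the function name `least_squares_line` are made up for illustration and are not part of the exam.

```python
def least_squares_line(xs, ys):
    """Return (a, b, r_squared, sse) for the least squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample covariance and sample variances, all with n - 1 in the denominator
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - y_bar) ** 2 for y in ys) / (n - 1)
    b = cov_xy / var_x                 # slope
    a = y_bar - b * x_bar              # intercept
    r_squared = cov_xy ** 2 / (var_x * var_y)
    sse = (n - 1) * (var_y - cov_xy ** 2 / var_x)
    return a, b, r_squared, sse

# Illustrative data only
a, b, r_squared, sse = least_squares_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(a, b, r_squared, sse)
```

For these five points the slope works out to 0.6 and the intercept to 2.2, and the shortcut SSE matches the sum of squared residuals computed directly from the fitted line.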
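SIMPLE & MULTIPLE REGRESSION:
$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$
$SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$
$SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (\hat{y}_i - y_i)^2$
$SST = SSR + SSE$
Standard error of estimate: $s_\varepsilon = \sqrt{s_\varepsilon^2} = \sqrt{\frac{\sum_{i=1}^{n} (e_i - 0)^2}{n-k-1}} = \sqrt{\frac{SSE}{n-k-1}}$
Adj. $R^2 = 1 - \frac{SSE / (n-k-1)}{\sum_{i=1}^{n} (y_i - \bar{y})^2 / (n-1)}$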
Critical Values of t (t_A cuts off an upper-tail area of A):
ν t0.10 t0.05 t0.025 t0.01 t0.005 ν t0.10 t0.05 t0.025 t0.01 t0.005
1 3.078 6.314 12.706 31.821 63.657 24 1.318 1.711 2.064 2.492 2.797
2 1.886 2.920 4.303 6.965 9.925 25 1.316 1.708 2.060 2.485 2.787
3 1.638 2.353 3.182 4.541 5.841 26 1.315 1.706 2.056 2.479 2.779
4 1.533 2.132 2.776 3.747 4.604 27 1.314 1.703 2.052 2.473 2.771
5 1.476 2.015 2.571 3.365 4.032 28 1.313 1.701 2.048 2.467 2.763
6 1.440 1.943 2.447 3.143 3.707 29 1.311 1.699 2.045 2.462 2.756
7 1.415 1.895 2.365 2.998 3.499 30 1.310 1.697 2.042 2.457 2.750
8 1.397 1.860 2.306 2.896 3.355 35 1.306 1.690 2.030 2.438 2.724
9 1.383 1.833 2.262 2.821 3.250 40 1.303 1.684 2.021 2.423 2.704
10 1.372 1.812 2.228 2.764 3.169 45 1.301 1.679 2.014 2.412 2.690
11 1.363 1.796 2.201 2.718 3.106 50 1.299 1.676 2.009 2.403 2.678
12 1.356 1.782 2.179 2.681 3.055 55 1.297 1.673 2.004 2.396 2.668
13 1.350 1.771 2.160 2.650 3.012 60 1.296 1.671 2.000 2.390 2.660
14 1.345 1.761 2.145 2.624 2.977 70 1.294 1.667 1.994 2.381 2.648
15 1.341 1.753 2.131 2.602 2.947 80 1.292 1.664 1.990 2.374 2.639
16 1.337 1.746 2.120 2.583 2.921 90 1.291 1.662 1.987 2.368 2.632
17 1.333 1.740 2.110 2.567 2.898 100 1.290 1.660 1.984 2.364 2.626
18 1.330 1.734 2.101 2.552 2.878 120 1.289 1.658 1.980 2.358 2.617
19 1.328 1.729 2.093 2.539 2.861 140 1.288 1.656 1.977 2.353 2.611
20 1.325 1.725 2.086 2.528 2.845 160 1.287 1.654 1.975 2.350 2.607
21 1.323 1.721 2.080 2.518 2.831 180 1.286 1.653 1.973 2.347 2.603
22 1.321 1.717 2.074 2.508 2.819 200 1.286 1.653 1.972 2.345 2.601
23 1.319 1.714 2.069 2.500 2.807 ∞ 1.282 1.645 1.960 2.326 2.576
Degrees of freedom: ν