Methods
Response Variables and Explanatory Variables
Response Variable: As a measure of team success, number of regular season wins for the
2014-2015 season was selected. This variable excludes tournament play both in conference
tournaments and the NCAA tournament, since each tournament results in one loss for all but the
winning team.
Explanatory Variables: Our rationale for the selection of explanatory variables was to choose
variables that reflect items coaches have control over. This meant that statistics
directly related to the score (i.e. scoring more points in a game leads to more wins) were
excluded. Each variable was selected because it is tied to a direct, actionable goal for the
team's coach. Our explanatory variables were as follows:
All-American: A binary variable that has a value of 1 when a team has a player who was previously
a High School All-American. Each year 24 high school students are selected as the top players
of their class. Since our sample includes only the top 25 teams, it is assumed that the vast
majority of these teams have the prestige to attract top players. The actionable goal for a team's
coach in this case is the recruitment of top players. Is it worth expending team and school
resources in order to recruit a star player, or is time better spent recruiting a more varied team?
Free Throws: A continuous variable giving the percent of free throws a team makes over the
course of a season. Practicing free throws is not always easy for college coaches, as it is a very
individual activity. Is it worth the time for a coach to pull below-average free throw shooters out of team
practice, or is time better spent on team activities?
Coefficients:
                      Estimate   Std. Error  t value  Pr(>|t|)
(Intercept)           1.180133   0.463863     2.544   0.01888
Assist                0.013882   0.008996     1.543   0.13776
Free Throw            0.008717   0.004525     1.926   0.06769
All-American          0.065512   0.035786     1.831   0.08138
Possession           -0.01791    0.004859    -3.686   0.00137
Personal Fouls/Game  -0.006603   0.011659    -0.566   0.5781
Blocks                0.001169   0.010149     0.115   0.9096
Off.Rebounds          0.006456   0.013287     0.486   0.6329
Normality:
From the Q-Q plot we see that the residuals are
approximately normally distributed. While the number of
data points is not large, we do not find any departure from
normality serious enough to warrant transforming Y.
Global F-test:
The global F-test comparing the full model for win percent to the naive (intercept-only) model is significant, with a p-value of .038. The ANOVA table is given below:
Analysis of Variance Table
Model 1: Win.Pct ~ 1
Model 2: Win.Pct ~ (All.American + Free.Throw + Off.Rebounds + Pers.Fouls.p.game + Poss +
Blocks + Assist)
         Res.Df  RSS      Df  SS       F       Pr(>F)
Model 1  25      0.23154
Model 2  18      0.11127  7   0.12027  2.7795  0.03805
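As a rough check on the table above, the F statistic can be recomputed from the two residual sums of squares and their degrees of freedom. The sketch below uses only the values printed in the ANOVA table; the function name `global_f_statistic` is a hypothetical helper, not part of the original analysis.

```python
def global_f_statistic(rss_null, rss_full, df_null, df_full):
    """F statistic for comparing two nested least-squares models.

    rss_null / df_null: residual sum of squares and residual df of the smaller model
    rss_full / df_full: the same quantities for the larger model
    """
    df_diff = df_null - df_full          # number of extra terms in the larger model
    ss_explained = rss_null - rss_full   # reduction in RSS from adding those terms
    return (ss_explained / df_diff) / (rss_full / df_full)


# Values copied from the ANOVA table (Model 1 vs. Model 2).
f_stat = global_f_statistic(rss_null=0.23154, rss_full=0.11127, df_null=25, df_full=18)
print(round(f_stat, 2))  # about 2.78, in line with the reported F of 2.7795
```

The small difference from the reported value comes from the RSS figures being rounded to five decimals in the table.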
Variable Selection:
Since our degrees of freedom did not allow us to check all interactions, we used the pairs plot
(see above) to select interactions to test. The interaction between personal fouls per game and
possessions, as well as that between fouls per game and offensive rebounds, were chosen because they were
the only interactions with R^2 values above .60, and because both possessions (fast play) and
offensive rebounds are aggressive and could lead to more fouls. In order to select variables for
our model, we used both forward and backward selection. Both yielded the same model. The
first and last steps of backward selection are shown below:
Start: AIC=-122.73
Win.Pct ~ (All.American + Free.Throw + Off.Rebounds + Assist.Turnover +
Off.Efficiency + Pers.Fouls.p.game + Poss + Blocks + Assist) -
Off.Efficiency - Assist.Turnover + Poss:Pers.Fouls.p.game +
Off.Rebounds:Pers.Fouls.p.game
                                  Df  Sum of Sq  RSS      AIC
- Blocks                          1   0.0000648  0.10745  -124.71
- Pers.Fouls.p.game:Poss          1   0.0017769  0.10916  -124.30
- Off.Rebounds:Pers.Fouls.p.game  1   0.0036634  0.11105  -123.85
<none>                                           0.10738  -122.72
- All.American                    1   0.0110104  0.11840  -122.19
- Free.Throw                      1   0.0121651  0.11955  -121.94
- Assist                          1   0.0123097  0.11969  -121.90
Step: AIC=-131.14
Win.Pct ~ All.American + Free.Throw + Poss + Assist
                Df  Sum of Sq  RSS      AIC
<none>                         0.11412  -131.14
- Assist        1   0.012939   0.12706  -130.35
- All.American  1   0.018212   0.13234  -129.29
- Free.Throw    1   0.020168   0.13429  -128.91
- Poss          1   0.073832   0.18796  -120.17
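The AIC values in the final step can be reproduced from the RSS column. The sketch below assumes the criterion R's step() reports for linear models, n*log(RSS/n) + 2*(number of estimated coefficients), and assumes n = 26 teams, which is inferred from the 25 residual degrees of freedom of the intercept-only model rather than stated in the text. The function name `step_aic` is hypothetical.

```python
import math

# Assumption: n = 26, inferred from Res.Df = 25 for the intercept-only model.
N_TEAMS = 26


def step_aic(rss, n_coef, n=N_TEAMS):
    """AIC up to an additive constant: n*log(RSS/n) + 2*n_coef.

    n_coef counts every estimated coefficient, including the intercept.
    """
    return n * math.log(rss / n) + 2 * n_coef


# Final model: intercept + All.American + Free.Throw + Poss + Assist (5 coefficients).
aic_final = step_aic(rss=0.11412, n_coef=5)

# Same model with Poss dropped (4 coefficients); RSS taken from the "- Poss" row.
aic_drop_poss = step_aic(rss=0.18796, n_coef=4)

print(round(aic_final, 2), round(aic_drop_poss, 2))
```

Dropping Poss raises the AIC by roughly 11 units, far more than dropping any other term, which is why backward selection stops at this model.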
Results
Our study yielded the following model:
Win Percent = 1.180133 + 0.013882(Assists) + 0.008717(Free Throws) + 0.065512(All-American) - 0.01791(Possession)
Coefficients:
              Estimate   Std. Error  t value  Pr(>|t|)
(Intercept)   1.180133   0.463863     2.544   0.01888
Assist        0.013882   0.008996     1.543   0.13776
Free Throw    0.008717   0.004525     1.926   0.06769
All-American  0.065512   0.035786     1.831   0.08138
Possession   -0.01791    0.004859    -3.686   0.00137
Intercept: The intercept has no interpretable meaning and is of no real value. It corresponds to
a team with zero assists, a free throw percentage of zero, no All-American, and zero possessions
per game, whereas all teams in our data have at least 55 possessions per game.
Assists: When all other variables are held constant, we expect an increase of one assist per game
to correspond to an increase in season winning percentage of about 1.4 percentage points.
Free Throw: When all other variables are held constant, we expect an increase of one percentage
point in season free throw percentage to correspond to an increase in winning percentage of about
0.9 percentage points.
All-American: When all other variables are held constant, we expect a team with an All-American to have a win percentage about 6.6 percentage points higher than a team without an All-American.
Possession: When all other variables are held constant, we expect an increase of one possession
per game to correspond to a win percentage about 1.8 percentage points lower.
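To make these interpretations concrete, the fitted equation can be written as a small function. The coefficients are copied from the coefficient table; the function name, the argument names, and the example team (15 assists per game, 70% free throws, 65 possessions) are hypothetical and chosen only for illustration. The sketch assumes win percent is expressed as a proportion and free throws as a percentage such as 70.

```python
# Coefficients copied from the fitted model's coefficient table.
COEF = {
    "intercept": 1.180133,
    "assist": 0.013882,
    "free_throw": 0.008717,
    "all_american": 0.065512,
    "poss": -0.01791,
}


def predict_win_pct(assists_per_game, free_throw_pct, has_all_american, poss_per_game):
    """Predicted win percent (as a proportion) from the fitted linear model."""
    return (
        COEF["intercept"]
        + COEF["assist"] * assists_per_game
        + COEF["free_throw"] * free_throw_pct
        + COEF["all_american"] * (1 if has_all_american else 0)
        + COEF["poss"] * poss_per_game
    )


# Hypothetical team, used only to illustrate the "all else held constant" readings.
base = predict_win_pct(15, 70, False, 65)
with_star = predict_win_pct(15, 70, True, 65)
faster = predict_win_pct(15, 70, False, 66)

print(round(with_star - base, 4))  # effect of having an All-American
print(round(faster - base, 4))     # effect of one extra possession per game
```

Because the model is linear with no interaction terms, these differences equal the coefficients regardless of which baseline team is chosen.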
When evaluating these variables using a t-test, we see that possessions is the only variable that is
statistically significant at the 0.05 significance level (95% confidence). Both free throws and
All-American are significant at the 0.10 significance level (90% confidence). While included in the
model, assists is not statistically significant at either level.
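The t values and p-values behind these statements can be recomputed from the estimates and standard errors. The sketch below is a standard-library approximation, not the original analysis: it assumes 21 residual degrees of freedom (26 teams minus 5 coefficients, with 26 inferred from the ANOVA table) and integrates the t density numerically with Simpson's rule. All function names are hypothetical.

```python
import math

DF_RESID = 21  # assumption: 26 teams minus 5 estimated coefficients


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * integral of the density over [0, |t|] (Simpson's rule)."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


# (estimate, standard error) pairs copied from the coefficient table.
fits = {
    "(Intercept)": (1.180133, 0.463863),
    "Assist": (0.013882, 0.008996),
    "Free Throw": (0.008717, 0.004525),
    "All-American": (0.065512, 0.035786),
    "Possession": (-0.01791, 0.004859),
}

results = {}
for name, (estimate, std_error) in fits.items():
    t_value = estimate / std_error  # t statistic for H0: coefficient = 0
    results[name] = (t_value, two_sided_p(t_value, DF_RESID))

for name, (t_value, p_value) in results.items():
    print(f"{name:13s} t = {t_value:7.3f}  p = {p_value:.5f}")
```

Under these assumptions only Possession falls below 0.05, Free Throw and All-American fall between 0.05 and 0.10, and Assist is above 0.10, consistent with the conclusions above.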