Kathryn Respicio
STAT 101
Variable Introduction
The variables were taken from the General Social Survey (GSS) Data Explorer site. I
have decided to use the following variables for my Statistics Data Analysis Project. For my
nominal variable, I chose marital status. For my ordinal variable, I chose an opinion question
that asked respondents to rate their general happiness. For my first interval variable, I
chose the respondent's income. For my second interval variable, I chose the number of
people the respondent keeps in touch with during a typical weekday. I have also identified my
ordinal variable (rate your happiness) as the main dependent variable for my analysis.
Ordinal: Rate Your General Happiness
I have decided to use my ordinal variable as the main dependent variable for my project.
An ordinal variable contains categories that are named and ranked. The categories are put in a
sequence. However, the distance between each categorical answer is unknown. Because of the
specific ranking of each of the variable's categories, opinion questions are considered to be
ordinal variables.
The researcher asked each respondent how she or he would rate their general
happiness. The researcher used the following categories for this variable: 1) very happy; 2)
pretty happy; 3) not too happy; and 4) don't know. It is also interesting to note that the
researcher added a "no answer" category.
I chose to use this variable as my main variable because I feel that many factors can
affect an individual's appraisal of happiness in their life. Each individual's subjective definition
of what happiness is and what causes it varies according to unique factors. General
happiness can be linked to all of the other variables I have chosen. Relationship
status, a respondent's annual income, and the number of people an individual keeps in contact with
can all affect how an individual rates their general happiness.
Interval: Respondent's Income
Interval variables are named as such because the categories are named, ranked, and have
numerical meaning. The distance between each category has a numerical value. Because there is
an exact distance between dollar amounts, this question is considered to be interval.
Respondents were asked to report their earnings for the prior year. The
researcher specifies that the amount be reported before taxes or other deductions. The
categories are numerical ranges, starting at under $1,000 and ending at $25,000 or
more. The researcher also added categories for respondents who chose not to reveal their
earnings and for those who were unsure of their exact earnings.
Because there is an exact distance between the numbers of people the respondent can keep in
contact with, this question is considered to be an interval question.
The respondent was asked to choose a category based on how many
individuals she or he keeps in contact with on a regular weekday. The smallest category
is 0-4 persons. The largest category is 50 or more persons. It is
interesting to note that the researcher added two further categories: one for individuals who
could not choose an answer ("can't choose") and another for respondents who did not provide
an answer ("no answer").
Variable Interpretation
ORDINAL: Rate Your General Happiness
General Happiness
Opinion             f     cf        %       c%
Very happy        786    786   31.06%   31.06%
Pretty happy     1403   2189   55.45%   86.51%
Not too happy     341   2530   13.49%     100%
N=               2530            100%
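The cumulative frequency (cf) and cumulative percent (c%) columns can be rebuilt from the raw counts alone. The following is a minimal sketch of that calculation, not the procedure actually used for this project; the function name `frequency_table` and the dictionary `happiness_counts` are illustrative assumptions. Percentages recomputed this way can differ from the table by about 0.01 because of rounding.

```python
from itertools import accumulate

# Frequencies copied from the General Happiness table, in ranked order.
happiness_counts = {
    "Very happy": 786,
    "Pretty happy": 1403,
    "Not too happy": 341,
}


def frequency_table(counts):
    """Return rows of (category, f, cf, percent, cumulative percent)."""
    total = sum(counts.values())
    cumulative = list(accumulate(counts.values()))  # running totals = cf column
    rows = []
    for (category, f), cf in zip(counts.items(), cumulative):
        rows.append((category, f, cf, 100 * f / total, 100 * cf / total))
    return rows


for category, f, cf, pct, cum_pct in frequency_table(happiness_counts):
    print(f"{category:<14} {f:>5} {cf:>5} {pct:6.2f}% {cum_pct:6.2f}%")
```

Because the categories are ranked, the cumulative columns are meaningful here; for a nominal variable the same code runs, but the cumulative values depend on an arbitrary category order.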
NOMINAL: Marital Status
Marital Status      f     cf        %       c%
Married          1158   1158   45.69%   45.69%
Widowed           209   1367    8.25%   53.94%
Divorced          411   1778   16.22%   70.16%
Separated          81   1859    3.20%   73.36%
Never Married     675   2534   26.64%     100%
N=               2534            100%
The modal category is married: 45.69% of respondents
reported that they were married at the time they answered this survey. The second most frequent
category is never married, which accounts for 26.64% of respondents. The reason this category
has the second highest frequency could be reflected in
current trends. Marriage is no longer considered to be a high priority among the millennial
generation. More emphasis is put on independence and career success. Research also shows that
people cohabit for a longer period of time prior to getting married.
One could also factor in the use of technology. With the advent of social media dating
sites and apps, individuals are given more options as to whom they would like to casually date.
Personality profiles and pictures are used to decide whether a candidate is worthy of a casual
date or a romantic relationship. This factor could also contribute to why individuals wait longer
to get married.
The third most frequent category is divorced: 16.22% of respondents were divorced at the
time this survey question was answered. Combining current attitudes about marriage with
the flexibility of divorce, current trends reflect an increase in divorce rates.
It is important to note that I did not include respondents who chose the "no answer" category. I
wanted the data to reflect answers from respondents who chose actual marital status categories.
INTERVAL: Respondent's Income
Yearly Income       f   Mid-Point              Xf    cf              X^2                 X^2f
Under $1000        32        $500         $16,000    32         $250,000           $8,000,000
$1000 - $2999      35   $1,499.50      $52,482.50    67    $2,248,500.25          $88,946,009
$3000 - $4999      63   $3,999.50     $251,968.50   130   $15,996,000.20    $1,007,748,012.60
$5000 - $6999      56   $5,999.50        $335,972   186   $35,994,000.20    $2,015,664,011.20
$7000 - $9999      44   $8,499.50        $373,978   230   $72,241,500.20    $3,178,626,008.80
$10000 - $14999   111  $12,499.50   $1,387,444.50   341     $156,237,500      $17,342,362,500
$15000 - $19999    87  $17,499.50   $1,522,456.50   428     $306,232,500      $26,642,227,500
$20000 - $24999   143  $22,499.50   $3,217,428.50   571     $506,227,500      $72,390,532,500
$25000+           951  $25,000.50  $23,775,475.50  1522  $625,025,000.30     $594,398,775,285
N=               1522                 $30,612,755                          $717,072,881,826.60
When looking at the annual income of each respondent, the mean is $20,113.50. The
median is in the 761.5th position. In other words, the median is in the $25,000+ category. Lastly,
the mode is also the $25,000+ category. There appears to be a negative skew since the mean is
smaller than the median. The mean is the appropriate measure to use because of the high
variability in the data.
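The median position and the category it falls in can be read off the cumulative frequencies. The sketch below illustrates that lookup, assuming the usual (N + 1) / 2 rule for the median position; the function names `median_class` and `modal_class` are hypothetical and are not part of the GSS tools.

```python
# Frequencies copied from the income table, in category order.
income_counts = [
    ("Under $1000", 32),
    ("$1000 - $2999", 35),
    ("$3000 - $4999", 63),
    ("$5000 - $6999", 56),
    ("$7000 - $9999", 44),
    ("$10000 - $14999", 111),
    ("$15000 - $19999", 87),
    ("$20000 - $24999", 143),
    ("$25000+", 951),
]


def median_class(counts):
    """Return (median position, category containing that position)."""
    n = sum(f for _, f in counts)
    position = (n + 1) / 2  # assumed rule for the median position
    running = 0  # cumulative frequency so far
    for category, f in counts:
        running += f
        if running >= position:
            return position, category
    raise ValueError("counts must not be empty")


def modal_class(counts):
    """Return the category with the largest frequency."""
    return max(counts, key=lambda pair: pair[1])[0]


position, category = median_class(income_counts)
print(f"median position = {position}, median category = {category}")
print(f"modal category = {modal_class(income_counts)}")
```

With these counts the cumulative frequency reaches only 571 before the last category, so the 761.5th position falls in the $25000+ category, which is also the category with the largest frequency.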
When using the variability measures, the standard deviation is $8,160.02. Because
the standard deviation is about 41% of the mean, the SD reveals that there is
typical variation within this data. In other words, there is not that much variation within the data.
It is important to note that I did not include respondents who chose the "refused" and
"don't know" categories. I wanted the data to reflect answers from respondents who chose actual
income categories.
INTERVAL: How
Number of Contacts     f   Mid-Point (persons)         Xf     cf       X^2        X^2f
0-4 persons          278                     2        556    278         4        1112
5-9 persons          358                     7       2506    636        49       17542
10-19 persons        286                  14.5       4147    922    210.25     60131.5
20-49 persons        216                    34       7344   1138      1156      249696
50 or more           115                  50.5     5807.5   1253   2550.25   293278.75
N=                  1253                         20,360.5                    621760.25
When looking at the number of persons each respondent keeps in contact with on a typical
weekday, the mean is 16.25 persons. The median is in the 627th position. In other words, the
median is in the 5-9 persons category. Lastly, the mode is 5-9 persons. There appears to be a
positive skew since the mean is much larger than the median. The mean is the appropriate
measure to use since there is a high amount of variability in the data.
When using the variability measures, the standard deviation is 15.24 persons. Since
the SD is very close to the mean, there is a lot of variation indicated in this data.
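The mean and standard deviation for grouped data follow from the Xf and X^2f columns of the table. The sketch below assumes the population form of the formulas (dividing by N), which reproduces the reported values. The midpoints 2, 7, and 14.5 are inferred from the X^2 column, and the function name `grouped_mean_sd` is illustrative.

```python
import math

# (midpoint, frequency) pairs from the Number of Contacts table.
# Midpoints 2, 7 and 14.5 are inferred from the X^2 column (4, 49, 210.25).
contacts = [(2, 278), (7, 358), (14.5, 286), (34, 216), (50.5, 115)]


def grouped_mean_sd(pairs):
    """Return (mean, SD) for grouped data, dividing by N (assumed)."""
    n = sum(f for _, f in pairs)
    sum_xf = sum(x * f for x, f in pairs)       # total of the Xf column
    sum_x2f = sum(x * x * f for x, f in pairs)  # total of the X^2f column
    mean = sum_xf / n
    variance = sum_x2f / n - mean ** 2
    return mean, math.sqrt(variance)


mean, sd = grouped_mean_sd(contacts)
print(f"mean = {mean:.2f}, SD = {sd:.2f}, SD/mean = {sd / mean:.2f}")
# prints: mean = 16.25, SD = 15.24, SD/mean = 0.94
```

The SD/mean ratio of about 0.94 puts a number on how close the SD is to the mean for this variable.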
It is important to note that I did not include respondents who chose the "can't choose"
and "no answer" categories. I wanted the data to reflect answers from respondents who chose an
actual category that reflects the number of people she or he keeps in touch with during the weekday.