
Kathryn Respicio

STAT 101

Variable Introduction

The variables were accessed from the General Social Survey (GSS) Data Explorer site. I
have decided to use the following variables for my Statistics Data Analysis Project. For my
nominal variable, I chose marital status. For my ordinal variable, I chose a question that
asked respondents to rate their general happiness. For my first interval variable, I
chose the respondent's income. For my second interval variable, I chose how many
people the respondent keeps in touch with during a typical weekday. I have also identified my
ordinal variable (rate your happiness) as the main dependent variable for my analysis.
Ordinal: Rate Your General Happiness
I have decided to use my ordinal variable as the main dependent variable for my project.
An ordinal variable contains categories that are named and ranked, so the categories form a
sequence. However, the distance between adjacent categories is unknown. Because the
response categories have a specific ranking, opinion questions are considered to be
ordinal variables.
The researcher asked each respondent how she or he would rate their general
happiness. The researcher used the following categories for this variable: 1) very happy; 2)
pretty happy; 3) not too happy; and 4) don't know. It is also interesting to note that the
researcher added a "no answer" category.
I chose this as my main variable because I feel that many factors can
affect an individual's appraisal of happiness in their life. Each individual's subjective definition
of what happiness is and what causes it varies according to unique factors. General
happiness can be linked to all of the other variables I have chosen. Relationship
status, a respondent's annual income, and how many people an individual keeps in contact with
can all affect how an individual rates their general happiness.

Nominal: Marital Status


My nominal variable focuses on marital status. A nominal variable has named
categories. However, these categories have no identifiable ranking or logical order. The research
question asks a respondent to identify with a specific category.
The researcher asked each respondent to identify their current marital status. The
categories are: 1) married; 2) widowed; 3) divorced; 4) separated; and 5) never
married. It is interesting to note that different stages of separation are included as categories.

Interval: Respondent's Income


My first interval variable centers on each respondent's income. Interval variables
are named as such because the categories are named, ranked, and have numerical meaning. The
distance between each category has a numerical value. Because there is an exact distance
between dollar amounts, this question is considered to be interval.
Respondents were asked to report their income for the prior year. The
researcher specified that the amount should be reported before taxes or other deductions. The
categories are listed as specific dollar ranges, starting at under $1,000 and ending at $25,000 or
more. The researcher also added categories for respondents who chose not to reveal their income
and for those who were unsure of their exact earnings.

Interval: How many people in contact with in a typical weekday?


The last interval variable I focused on is the number of people the respondent keeps in
contact with during a typical weekday. Because the number of people the respondent keeps in
contact with is an exact count, this question is considered to be an interval
question.
The respondent was asked to choose a category based on how many
individuals she or he keeps in contact with on a typical weekday. The smallest
category is 0-4 persons. The largest category is 50 or more individuals. It is
interesting to note that the researcher added two further categories: one for individuals who
could not choose an answer ("can't choose") and another for respondents who did not provide
an answer ("no answer").

Variable Interpretation
ORDINAL: Rate Your General Happiness
General Happiness

Opinion          f      cf     %        c%
Very happy       786    786    31.06%   31.06%
Pretty happy     1403   2189   55.45%   86.51%
Not too happy    341    2530   13.49%   100%
N=               2530          100%


The mode is a measure of central tendency. It is interpreted as the most common or
prevalent value in the distribution. The category with the highest frequency is "pretty happy."
Of all respondents, 55.27% reported their general happiness as "pretty happy" (55.45% of those
who chose a happiness rating, as shown in the table). The category with the
second highest frequency is "very happy." It is interesting to note that fewer respondents reported
being very happy than reported being pretty happy. More
information is needed to discern which factors contribute to what makes an individual happy.
The category with the lowest frequency is "don't know," with 0.24%. Lastly, 0.1% of respondents
did not provide an answer for this question.
It is important to note that I did not include respondents who chose the "don't know" or
"no answer" categories in the table. I wanted the data to reflect answers from respondents who
chose a happiness rating.
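The General Happiness frequency table can be rebuilt from the three category counts alone. The following is a minimal Python sketch, assuming the valid-response frequencies reported in the table; the variable names are illustrative and not part of the GSS data. Percentages computed this way are rounded, so the last digit may differ slightly from the table.

```python
# Frequencies copied from the General Happiness table (valid responses only).
freqs = {"Very happy": 786, "Pretty happy": 1403, "Not too happy": 341}

n = sum(freqs.values())  # total number of valid responses (N)

rows = []
running = 0
for category, f in freqs.items():
    running += f  # cumulative frequency (cf)
    # Store category, f, cf, % and cumulative % for each row.
    rows.append((category, f, running, 100 * f / n, 100 * running / n))

for category, f, cf, pct, cpct in rows:
    print(f"{category:<14}{f:>6}{cf:>6}{pct:>8.2f}%{cpct:>8.2f}%")

# The mode of a categorical variable is the category with the highest frequency.
mode_category = max(freqs, key=freqs.get)
print("N =", n, "| mode:", mode_category)
```

The same loop works for the marital status counts, since only frequencies and their running totals are needed.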
NOMINAL: Marital Status
Marital Status    f      cf     %        c%
Married           1158   1158   45.69%   45.69%
Widowed           209    1367   8.25%    53.94%
Divorced          411    1778   16.22%   70.16%
Separated         81     1859   3.20%    73.36%
Never Married     675    2534   26.64%   100%
N=                2534          100%

The modal category is "married." In other words, 45.69% of respondents
reported that they were married at the time they answered this survey. The second most frequent
category is "never married": 26.64% of respondents had never married at the time they answered
this survey question. The reason this category has the second highest frequency could be reflected in
current trends. Marriage is no longer considered to be a high priority among the millennial
generation. More emphasis is put on independence and career success. Research also shows that
people cohabit for a longer period of time prior to getting married.
One could also factor in the use of technology. With the advent of social media dating
sites and apps, individuals are given more options as to whom they would like to date casually.
Personality profiles and pictures are used to decide whether a candidate is worthy of a casual
date or romantic relationship. This factor could also contribute to why individuals wait longer
to get married.
The third most frequent category is "divorced": 16.22% of respondents were divorced at the
time this research question was answered. Combining the current attitudes about marriage with
the flexibility of divorce, current trends may reflect an increase in divorce rates.
It is important to note that I did not include respondents who chose the "no answer" category. I
wanted the data to reflect answers from respondents who chose actual marital status categories.
INTERVAL: Respondent's Income
Yearly Income      f      Mid-Point     Xf               cf     X^2               X^2f
Under $1000        32     $500          $16,000          32     $250,000          $8,000,000
$1000 - $2999      35     $1,499.50     $52,482.50       67     $2,248,500.25     $88,946,009
$3000 - $4999      63     $3,999.50     $251,968.50      130    $15,996,000.20    $1,007,748,012.60
$5000 - $6999      56     $5,999.50     $335,972         186    $35,994,000.20    $2,015,664,011.20
$7000 - $9999      44     $8,499.50     $373,978         230    $72,241,500.20    $3,178,626,008.80
$10000 - $14999    111    $12,499.50    $1,387,444.50    341    $156,237,500      $17,342,362,500
$15000 - $19999    87     $17,499.50    $1,522,456.50    428    $306,232,500      $26,642,227,500
$20000 - $24999    143    $22,499.50    $3,217,428.50    571    $506,227,500      $72,390,532,500
$25000+            951    $25,000.50    $23,775,475.50   1522   $625,025,000.30   $594,398,775,285
N=                 1522                 $30,612,755                               $717,072,881,826.60

When looking at the annual income for each respondent, the mean is $20,113.50. The
median is in the 761.5th position. In other words, the median is in the $25,000 or more category.
Lastly, the mode is also in the $25,000 or more category. There appears to be a negative skew
since the mean is smaller than the median. The mean is the appropriate measure to use because
of the high variability in the data.
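As a worked check of the mean and the median position reported here, assuming the usual grouped-data formulas and taking the totals (sum of Xf = $30,612,755, N = 1522) from the income table:

```latex
\bar{X} = \frac{\sum Xf}{N} = \frac{30{,}612{,}755}{1522} \approx 20{,}113.5
\qquad
\text{median position} = \frac{N + 1}{2} = \frac{1522 + 1}{2} = 761.5
```

The cumulative frequency reaches only 571 by the $20000 - $24999 class and 1522 in the $25000+ class, so the 761.5th case falls in the $25000+ class.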
When using the variability measures, the standard deviation is $8,160.02. Because
the standard deviation is about 40% of the mean (somewhat more than one-third), the SD suggests
that there is typical variation within this data. In other words, the data are not highly variable.
It is important to note that I did not include respondents who chose the "refused" and
"don't know" categories. I wanted the data to reflect answers from respondents who chose actual
income categories.
INTERVAL: How many people do you keep in contact with?

Number of Contacts    f      Mid-Point       Xf         cf     X^2       X^2f
0-4 persons           278    2 persons       556        278    4         1112
5-9 persons           358    7 persons       2506       636    49        17542
10-19 persons         286    14.5 persons    4147       922    210.25    60131.5
20-49 persons         216    34 persons      7344       1138   1156      249696
50 or more            115    50.5 persons    5807.5     1253   2550.25   293278.75
N=                    1253                   20,360.5                    621760.25

When looking at the number of persons each respondent keeps in contact with on a typical
weekday, the mean is 16.25 persons. The median is in the 627th position. In other words, the
median is in the 5-9 persons category. Lastly, the mode is 5-9 persons. There appears to be a
positive skew since the mean is much larger than the median. The mean is the appropriate
measure to use since there is a high amount of variability in the data.
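The central tendency figures for the number of contacts can be checked from the grouped table. The following is a minimal Python sketch, assuming the midpoints and frequencies listed in the table (including the paper's choice of 50.5 for the open-ended top class); the variable names are illustrative.

```python
# (label, midpoint X, frequency f) copied from the Number of Contacts table.
classes = [
    ("0-4 persons", 2.0, 278),
    ("5-9 persons", 7.0, 358),
    ("10-19 persons", 14.5, 286),
    ("20-49 persons", 34.0, 216),
    ("50 or more", 50.5, 115),
]

n = sum(f for _, _, f in classes)

# Grouped mean: sum of (midpoint * frequency) divided by N.
mean = sum(x * f for _, x, f in classes) / n

# Position of the median case in the ordered data.
median_position = (n + 1) / 2

# The median class is the first class whose cumulative frequency
# reaches the median position.
cf = 0
median_class = None
for label, _, f in classes:
    cf += f
    if cf >= median_position:
        median_class = label
        break

# The modal class is the class with the highest frequency.
modal_class = max(classes, key=lambda c: c[2])[0]

print(f"N = {n}, mean = {mean:.2f}")
print(f"median position = {median_position}, median class = {median_class}")
print(f"modal class = {modal_class}")
```

Because the mean (about 16 persons) lies above the median class (5-9 persons), the sketch agrees with the positive skew described above.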
When using the variability measures, the standard deviation is 15.24 persons. Since
the SD is very close to the mean, there is a lot of variation indicated in this data.
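The standard deviation can be recomputed from the Xf and X^2f columns. The following is a minimal Python sketch, assuming the computational formula that divides by N (the population form), which reproduces the reported value; a formula that divides by N - 1 would give a very slightly larger result.

```python
import math

# (midpoint X, frequency f) copied from the Number of Contacts table.
classes = [(2.0, 278), (7.0, 358), (14.5, 286), (34.0, 216), (50.5, 115)]

n = sum(f for _, f in classes)
sum_xf = sum(x * f for x, f in classes)       # total of the Xf column
sum_x2f = sum(x * x * f for x, f in classes)  # total of the X^2f column

mean = sum_xf / n

# Computational formula (assumed): variance = (sum of X^2 f) / N - mean^2
variance = sum_x2f / n - mean ** 2
sd = math.sqrt(variance)

# Ratio of SD to mean, used to judge how much variation there is.
ratio = sd / mean

print(f"sum Xf = {sum_xf}, sum X^2f = {sum_x2f}")
print(f"SD = {sd:.2f} persons, SD / mean = {ratio:.2f}")
```

A ratio close to 1 means the SD is nearly as large as the mean, which is the basis for describing the variation as high.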
It is important to note that I did not include respondents who chose the "can't choose"
and "no answer" categories. I wanted the data to reflect answers from respondents who chose an
actual category that reflects the number of people she or he keeps in touch with during the weekday.
