
Data Analysis

Descriptive statistics describe a large body of data with summary charts and tables, but do not attempt to draw conclusions about the population from which the sample was taken.
You are simply summarizing the data you have with charts and graphs, much like giving someone the key points of a book (an executive summary) instead of handing them the whole thick book (the raw data).

DESCRIPTIVE

Text to explain what the charts and tables are showing

Descriptive statistics usually involve measures of central tendency (mean, median, mode) and measures of dispersion (variance, standard deviation, etc.).

Nominal
Ordinal
Interval
Ratio

Types of Data & Measurement Scales:

Nominal scales are used for labeling.
The nominal level provides the least amount of information of all the scales.
Only the mode can be used as a measure of central tendency.

Examples of Nominal Scales

NOMINAL

ORDINAL
With ordinal scales, it is the order of the values (rank) that is important and significant.
The difference between each value is not known.
Ex.: in each case, we know that a #4 is better than a #3 or #2, but we don't know, and cannot quantify, how much better it is.
Ex.: is the difference between "OK" and "Unhappy" the same as the difference between "Very Happy" and "Happy"? We can't say.

ORDINAL
Ordinal scales are typically measures of non-numeric
concepts like satisfaction, happiness, discomfort, etc.
The best way to determine central tendency on a set
of ordinal data is to use the mode or median; the
mean cannot be defined from an ordinal set.
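
As a rough illustration of using the mode and median on ordinal data, the sketch below ranks labels on a five-point satisfaction scale. The scale labels (in particular "Very Unhappy") and the responses are made up for the example and are not taken from the slides.

```python
from collections import Counter

# Hypothetical ordered satisfaction scale, lowest to highest (an assumption).
SCALE = ["Very Unhappy", "Unhappy", "OK", "Happy", "Very Happy"]
RANK = {label: position for position, label in enumerate(SCALE)}


def ordinal_mode(responses):
    """Return the most frequent label."""
    return Counter(responses).most_common(1)[0][0]


def ordinal_median(responses):
    """Return the middle label after sorting by rank.

    For an even number of responses this picks the lower of the two
    middle labels, because labels cannot be averaged.
    """
    ordered = sorted(responses, key=RANK.__getitem__)
    return ordered[(len(ordered) - 1) // 2]


# Hypothetical survey responses.
responses = ["Happy", "OK", "Very Happy", "Happy", "Unhappy", "OK", "Happy"]

print(ordinal_mode(responses))    # Happy
print(ordinal_median(responses))  # Happy
```

Note that no arithmetic is done on the labels themselves; only their order is used.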

Example of Ordinal Scales

Interval scales are numeric scales in which we know not only the order, but also the exact differences between the values.
Ex.: the difference between 60 and 50 degrees is a measurable 10 degrees, as is the difference between 80 and 70 degrees.
The increments of an interval scale are known, consistent, and measurable.

INTERVAL

Interval scales not only tell us about order, but also about the value between each item.
Mean, median, and mode can be calculated from this data.
If you rank interval data, it becomes ordinal data.

PROBLEM WITH INTERVAL SCALE


Interval scales don't have a true zero.
With interval data, we can add and subtract, but we cannot meaningfully multiply or divide.
For example, there is no such thing as "no temperature" on the Celsius scale: 0 degrees is just another point on the scale. Without a true zero, it is impossible to compute meaningful ratios.
Consider this: 10 degrees + 10 degrees = 20 degrees. No problem there. But 20 degrees is not twice as hot as 10 degrees, because zero on the Celsius scale does not mean the absence of temperature.
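
The point about ratios can be checked numerically. The sketch below is only an illustration: it brings in the Kelvin scale, which is not mentioned in the slides, as an example of a temperature scale that does have a true zero.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius to Kelvin (Kelvin has a true zero, Celsius does not)."""
    return celsius + 273.15


low_c, high_c = 10.0, 20.0

# Differences are meaningful on an interval scale.
difference = high_c - low_c

# Dividing the Celsius numbers gives 2.0, but that ratio has no physical meaning.
naive_ratio = high_c / low_c

# On a scale with a true zero, the ratio is meaningful and far from 2.
kelvin_ratio = celsius_to_kelvin(high_c) / celsius_to_kelvin(low_c)

print(difference)              # 10.0
print(naive_ratio)             # 2.0
print(round(kelvin_ratio, 4))  # 1.0353
```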

Bottom line: interval scales are great, but we cannot calculate ratios, which brings us to our last measurement scale.

Device Provides Two Examples of Ratio Scales (height and weight)

RATIO

Ratio scales tell us about the order and about the exact value between units.
They also have an absolute zero, which allows a wide range of both descriptive and inferential statistics to be applied.
These variables can be meaningfully added, subtracted, multiplied, and divided (ratios).
Central tendency can be measured by mode, median, or mean; measures of dispersion, such as standard deviation and coefficient of variation, can also be calculated from ratio scales.
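
A minimal sketch of the coefficient of variation (sample standard deviation divided by the mean) on ratio data. The heights are hypothetical values chosen for the example. Because height has a true zero, the result does not depend on the unit used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a fraction of the mean."""
    return stdev(values) / mean(values)


# Hypothetical heights in centimetres.
heights_cm = [160, 165, 170, 175, 180]
heights_m = [h / 100 for h in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

print(round(cv_cm, 4))  # 0.0465
print(round(cv_m, 4))   # 0.0465
```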

Nominal variables are used to name, or label, a series of values.
Ordinal scales provide good information about the order of choices, such as in a customer satisfaction survey.
Interval scales give us the order of values plus the ability to quantify the difference between each one.
Ratio scales give us the most information: order, interval values, plus the ability to calculate ratios, since a true zero can be defined.

Summary

Mean: the sum of the values divided by the number of values.

Measures of Central Tendency

The number in the middle of a given set of numbers arranged in order of increasing magnitude (with an even count, the mean of the two middle numbers).

Median

The element that appears most frequently in a given set of elements.

Mode
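
The three measures of central tendency can be computed with Python's standard `statistics` module. This is a small sketch using example test scores, not a prescribed workflow.

```python
from statistics import mean, median, mode

# Example test scores.
scores = [93, 96, 98, 99, 99, 99, 100]

average = mean(scores)    # sum of the values divided by the number of values
middle = median(scores)   # middle value once the scores are sorted
most_common = mode(scores)  # value that appears most often

print(round(average, 2))  # 97.71
print(middle)             # 99
print(most_common)        # 99
```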

Range: the difference between the largest and the smallest data values.
Set A: 93, 96, 98, 99, 99, 99, 100
Set B: 10, 29, 52, 69, 87, 92, 100
The range in Set A is 100 - 93 = 7, and the range in Set B is 100 - 10 = 90.
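
The range calculation for the two sets can be reproduced in a few lines. The function name is an illustrative choice.

```python
def value_range(values):
    """Difference between the largest and the smallest value."""
    return max(values) - min(values)


set_a = [93, 96, 98, 99, 99, 99, 100]
set_b = [10, 29, 52, 69, 87, 92, 100]

print(value_range(set_a))  # 7
print(value_range(set_b))  # 90
```

Both sets share the same maximum, yet Set B is far more spread out, which is what the range captures.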

Measures of Dispersion
How spread out a set of data is.

A measure that summarises the amount by which every value within a dataset varies from the mean.

Standard Deviation

Skew is usually caused by outliers in your data, for example:
One participant is much older than the average.
One participant is much better (or worse) at the task (DV).
One participant has an extremely high (or low) IQ.

Skewed Data

Skewness is a lack of symmetry in the data.

Rule One. If the mean is less than the median, the data are skewed to the left.
Rule Two. If the mean is greater than the median, the data are skewed to the right.
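
The two rules can be expressed as a small helper. This is a rule-of-thumb sketch only; the function name is an assumption, and the ages are hypothetical data with one participant much older than the rest.

```python
from statistics import mean, median


def skew_direction(values):
    """Apply the mean-versus-median rule of thumb."""
    m, med = mean(values), median(values)
    if m < med:
        return "left"
    if m > med:
        return "right"
    return "none"


# Scores bunched near the top, with a tail of lower values.
scores = [93, 96, 98, 99, 99, 99, 100]

# Hypothetical ages: one participant is much older than the average.
ages = [20, 21, 22, 23, 60]

print(skew_direction(scores))  # left  (mean about 97.7 < median 99)
print(skew_direction(ages))    # right (mean 29.2 > median 22)
```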

Find the mean of the distribution.
Subtract the mean from each score.
Square each result (deviation).
Add the squared deviations together.
Divide by the number of scores minus one (n - 1). Subtracting 1 corrects for the tendency of a sample to underestimate the spread of the population; it does not remove outliers.
This result is called the variance.
Find the square root of the variance. This is the SD.
Now you can compare the mean to the SD.
Ex.: for 3 class scores out of 100 (78, 80, 92), the mean is about 83.3 and the SD is about 7.6, so 2 of the 3 scores are within 1 SD of the mean.

To calculate the SD:
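
The steps can be followed one at a time in code. This is a sketch using the three class scores from the example and assuming the sample form of the calculation, which divides by n - 1. The result is compared against `statistics.stdev` from the standard library as a sanity check.

```python
import math
import statistics

scores = [78, 80, 92]  # the three class scores from the example

mean = sum(scores) / len(scores)          # step 1: find the mean
deviations = [x - mean for x in scores]   # step 2: score minus mean
squared = [d ** 2 for d in deviations]    # step 3: square each deviation
total = sum(squared)                      # step 4: add the squared deviations
variance = total / (len(scores) - 1)      # step 5: divide by n - 1
sd = math.sqrt(variance)                  # step 6: square root gives the SD

# Scores that lie within one SD of the mean.
within_one_sd = [x for x in scores if abs(x - mean) <= sd]

print(round(mean, 2), round(sd, 2), within_one_sd)  # 83.33 7.57 [78, 80]
print(math.isclose(sd, statistics.stdev(scores)))   # True
```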

Level of Measurement: Nominal Data
Central Tendency: Percentages; mode
Possible Tables/Charts: Frequency table, pie chart, bar chart
Dispersion: None

Level of Measurement: Ordinal Data
Central Tendency: Percentages; mode, median
Possible Tables/Charts: Frequency table, frequency polygon, bar chart
Dispersion: Range

Level of Measurement: Interval & Ratio Data
Descriptive Statistics: Mean, median, mode
Possible Tables/Charts: Frequency table; box and whisker plot; bar chart; histogram
Dispersion: Quartiles, range, standard deviation

In inferential statistics, you are testing a hypothesis and drawing conclusions about a population, based on your sample.
To remember the difference between descriptive and inferential statistics: descriptive statistics summarize your current dataset, while inferential statistics aim to draw conclusions about the wider population beyond your dataset.

INFERENTIAL

Inferential statistics allow researchers to make well-reasoned inferences about the population in question.
They generally take two forms: estimation statistics and hypothesis testing.

Estimation statistics is a fancy way of saying that you are estimating population values based on your sample data.

Estimation Statistics
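
As a rough sketch of estimation, the code below estimates a population mean from a sample and attaches an approximate 95% interval. The sample values are hypothetical, and the interval uses a normal approximation (z of about 1.96) as a simplifying assumption; for a sample this small, a t-based interval would normally be wider.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical sample of exam scores drawn from a larger population.
sample = [72, 75, 78, 80, 81, 83, 85, 88, 90, 68]

point_estimate = mean(sample)                            # estimate of the population mean
standard_error = stdev(sample) / math.sqrt(len(sample))  # uncertainty of that estimate

# Assumption: normal approximation for a two-sided 95% interval.
z = NormalDist().inv_cdf(0.975)
margin = z * standard_error
interval = (point_estimate - margin, point_estimate + margin)

print(point_estimate)
print(tuple(round(bound, 2) for bound in interval))
```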

Hypothesis testing is simply another way of drawing conclusions about a population parameter (a parameter is simply a number, such as a mean, that describes the full population and not just a sample).

Hypothesis Testing

Chi-Squared Test
Mann-Whitney U Test
Wilcoxon Signed-Ranks Test

Suitable Statistical Test

Level of Measurement: Nominal Data
Test for Independent Samples Design: Chi-squared (χ²) test
Test for Repeated Measures Design: None

Level of Measurement: Ordinal, Interval Data
Test for Independent Samples Design: Mann-Whitney U test
Test for Repeated Measures Design: Wilcoxon Signed-Ranks Test
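
To make the chi-squared test for nominal data more concrete, here is a minimal sketch that computes the test statistic by hand for a 2 x 2 table of counts. The counts are hypothetical, no continuity correction is applied, and the decision compares the statistic against 3.841, the standard critical value for 1 degree of freedom at the 0.05 level. In practice a statistics library would also report a p-value.

```python
def chi_squared_statistic(observed):
    """Chi-squared statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic


# Hypothetical counts: two independent groups (rows) by two categories (columns).
observed = [[30, 10],
            [20, 40]]

chi2 = chi_squared_statistic(observed)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

CRITICAL_05_DF1 = 3.841  # critical value for df = 1 at the 0.05 level
reject_independence = chi2 > CRITICAL_05_DF1

print(round(chi2, 2), dof, reject_independence)  # 16.67 1 True
```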
