
Marikina Polytechnic College

Bachelor of Technical Teacher Education Major in Mathematics


Statistics in a Word
It can be fun, and sometimes useful, to summarize a
discipline in only a few words. So,
Economics is about . . . Money (and why it is good).
Psychology: Why we think what we think (we think).
Biology: Life.
Anthropology: Who?
History: What, where, and when?
Philosophy: Why?
Engineering: How?
Accounting: How much?
In such a caricature, Statistics is about . . . Variation.

Statistics in a Word

Statistics is about variation.

Data vary because we don’t see everything and
because even what we do see and measure, we
measure imperfectly.

So, in a very basic way, Statistics is about the real,
imperfect world in which we live.

Introduction to Statistics

1. What is statistics?
2. Why should I study statistics?
3. How can studying statistics help me in
my profession?

What is Statistics?
STATISTICS is the science of
• planning studies and experiments,
• collecting,
• organizing,
• presenting,
• analyzing or summarizing,
• interpreting, and
• drawing conclusions based on the data.

It is also a way of reasoning, along with a collection of tools
and methods, designed to help us understand the world.

Statistics
• Collection of data refers to the process of obtaining
information.
• Organization of data refers to determining the manner of
presenting the data in tables, graphs, or charts
so that logical and statistical conclusions can be drawn from the
collected measurements.
• Analysis of data refers to the process of extracting from the
given data relevant information from which numerical
description can be formed.
• Interpretation of data refers to the task of drawing
conclusions from the analyzed data.
Why Should I Study Statistics?

• Essential for both understanding and conducting
research

• Used to analyze data

• Can help to discriminate between fact and fiction

• Helpful in knowing when, and for what purpose, a
statistician should be consulted

Historical Note

Sir John Sinclair


• A Scottish landowner and president of the Board
of Agriculture who introduced the word statistics
into the English language in the 1798 publication
of his book on a statistical account of Scotland.

• The word statistics is derived from the Latin word
status, which is loosely defined as a statesman.

Historical Note

• The origin of descriptive statistics can be traced to
data collection methods used in censuses taken by the
Babylonians and Egyptians between 4500 and
3000 B.C.

Roman Emperor Augustus (27 B.C. – A.D. 14)
• He conducted surveys on births and deaths of the
citizens of the empire, as well as the
number of livestock each owned and the crops each
citizen harvested yearly.

Historical Note

• Inferential statistics originated in the 1600s, when
John Graunt published his book on population growth,
Natural and Political Observations Made upon the Bills
of Mortality.

Edmund Halley
• Mathematician/astronomer, published the first complete
mortality tables. (Insurance companies use mortality
tables to determine life insurance rates.)

Branches of Statistics

Statistics
• Descriptive statistics is the branch that involves the
collection, organization, presentation, and analysis or
summarization of data.
• Inferential statistics is the branch that involves using a
sample to interpret and draw conclusions based on the data or
about a population.

Branches of Statistics

Inferential statistics

• A basic tool in the study of inferential statistics is
probability.

• An area of inferential statistics called hypothesis testing
is a decision-making process for evaluating claims about
a population, based on information obtained from
samples.


Classification of Variables and Data

VARIABLES
• Qualitative (Categorical): Dichotomous, Trichotomous, Multinomous
• Quantitative (Numerical): Discrete, Continuous
• By function: Dependent, Independent

DATA
• Sources: Primary data, Secondary data
• Collection Methods: Surveys, Observation, Experimentation
• Scales of Measurement: Nominal, Ordinal, Interval, Ratio
• Presentation: Textual method, Tabular method, Graphical method

Variable

Variable (or Response Variable)


• A characteristic or attribute of interest about each
individual element of a population or sample that can
assume different values.

example
• A student’s age at entrance into college, the color of the
student’s hair, the student’s height, and the student’s
weight are four variables.

Data
• It is the collection of observations.
• It consists of information coming from observations
(realized value of a variable), counts, measurements, or
responses.
• It is the set of values collected from the variable from
each of the elements that belong to the sample. Once all
the data are collected, it is common practice to refer to
the set of data as the sample.
example
• The set of 30 heights gathered from 30 students is an
example of a set of data.
Data Value
• The value of the variable associated with one element of
a population or sample. This value may be a number, a
word, or a symbol.

example
• Angelo entered college at age “23,” his hair is “brown,”
he is “71 inches” tall, and he weighs “183 pounds.” These
four data values are the values for the four variables as
applied to Angelo.

Data Sets
There are two types of data sets: populations and samples.
Population
• Collection of all outcomes, responses,
measurements, or counts that are of interest.
• Consists of all subjects (human or otherwise) that
are being studied.
Sample
• A sample is a subset, or part, of a population.
• A sample is a group of subjects selected from a
population.
Elementary unit or Element
• It is a member of the population whose measurement on the
variable of interest is what we wish to examine.
Experiment

• A planned activity whose results yield a set of data.

• An experiment includes the activities for both
selecting the elements and obtaining the data
values.

Experimental Classification
A researcher may classify variables according to the
function they serve in the experiment.
• Independent variables are variables controlled by
the experimenter/researcher, and expected to have an
effect on the behavior of the subjects. The
independent variable is also called explanatory
variable.
• Dependent variable is some measure of the behavior
of subjects and expected to be influenced by the
independent variable. The dependent variable is also
called outcome variable.
Experimental Classification

For example, in the sit-up study, the researchers gave
the groups two different types of instructions, general
and specific.

• Hence, the independent variable is the type of
instruction.

• The dependent variable, then, is the resultant
variable, that is, the number of sit-ups each group was
able to perform after four days of exercise.

Parameter and Statistic
Parameter
• A numerical description of a population
characteristic.
• A numerical value summarizing all the data of an
entire population.

Statistic
• A numerical description of a sample characteristic.
• A numerical value summarizing the sample data.

Parameter and Statistic
Symbolic Notation for Sample and Population Measures

Statistical Measure    Sample Statistic    Population Parameter
Mean                   x̄                   μ
Standard deviation     s                   σ
Variance               s²                  σ²
Size                   n                   N
Proportion             p                   π
Correlation            r                   ρ
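
The distinction between a parameter and a statistic can be illustrated with a small computation. The following is a minimal sketch that assumes a made-up population of eight student heights (the values, the seed, and the variable names are illustrative, not from the slides). It uses Python's standard `statistics` module, where `pstdev` divides by N (population) and `stdev` divides by n - 1 (sample).

```python
import random
import statistics

# Hypothetical population: heights (in inches) of 8 students (made-up values).
population = [62, 64, 65, 66, 68, 70, 71, 74]

# Parameters summarize the entire population.
mu = statistics.mean(population)        # population mean
sigma = statistics.pstdev(population)   # population standard deviation (divides by N)

# Statistics summarize a sample drawn from that population.
rng = random.Random(1)                  # fixed seed so the sketch is repeatable
sample = rng.sample(population, 4)      # sample of size n = 4, without replacement
x_bar = statistics.mean(sample)         # sample mean
s = statistics.stdev(sample)            # sample standard deviation (divides by n - 1)

print(f"Population: N = {len(population)}, mean = {mu}, std dev = {sigma:.3f}")
print(f"Sample:     n = {len(sample)}, mean = {x_bar}, std dev = {s:.3f}")
```

A different sample would generally give a different sample mean and standard deviation, while the parameters stay fixed for a given population.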
Relationships among Probability, Statistics, Population, and Sample

• Population (Parameter) → Sample (Statistic): Probability or Sampling Theory
• Sample (Statistic) → Population (Parameter): Statistical Inference or Inferential Statistics
example
Population = Number of humans in the whole world (approx. 6 billion)
Parameter of interest = Average amount of food a human being consumes in a day

Representative sample: 100,000 people from all over the world

Parameter: The average amount of food a human being in the world consumes is 8 pounds.
Statistic: The average weight of food consumed by the sampled humans = 8 pounds.

http://www.professorpatel.com/population--sample.html
Sources of data
• Primary data are data documented by the primary
source. The data collectors themselves documented these
data.
Example: census, sample survey, experiment

• Secondary data are data documented by a secondary
source. An individual/agency, other than the data
collectors, documented these data.
Example: books, journals, magazines, theses

Sources of data
Big data
• It refers to data sets so large and so complex that their
analysis is beyond the capabilities of traditional software
tools.
• Analysis of big data may require software simultaneously
running in parallel on many different computers.

Data science
• It involves applications of statistics, computer science,
and software engineering, along with some other relevant
fields (such as sociology or finance).
Some Agencies Where a Researcher Can Avail of Primary Data
• Central Bank (CB) is a primary source of data on banking and
finance.
• National Statistics Office (NSO) is a primary source of data on
population, housing, and establishments.
• Pulse Asia is a primary source of data on opinions or
sentiments of the people on current issues.
• Bureau of Agricultural Statistics (BAS) is a primary source of
data on agriculture and livestock.

Some Agencies Where a Researcher Can Avail of Secondary Data
• The United Nations compiled data for its yearbook, which were
originally gathered by government statistical agencies of
different countries;
• A medical researcher’s documented data for his research
paper, which were originally collected by the Department of
Health;
• The documented data of a congressman’s research team
for its report, which were originally collected by the Department
of Education and the Commission on Higher Education; and
• The documented data of a student for his thesis, which were
originally collected by the Department of Labor and
Employment.
Data Collection Methods
1. SURVEYS
• It is a method of collecting data on the variable of interest
by asking people questions. When the data come from
asking all the people in the population, the study is
called a census*. On the other hand, when the data come
from asking a sample of people selected from a
well-defined population, the study is called a sample
survey.

(*Census or Registration requires the enactment of a law to take
effect, for it needs the participation of a large part of, if not the entire,
population.)

Data Collection Methods

DIFFERENT METHODS OF COMMUNICATION

a) Personal Interview
• It is referred to as the direct method of gathering data since
it requires a face-to-face inquiry with the respondent.

b) Self-Administered Questionnaire
• It is a list of questions that the respondent answers
without an interviewer. There is no face-to-face
contact.

Data Collection Methods
2. OBSERVATION
• It is a method of collecting data on the phenomenon of
interest by recording the observations made about the
phenomenon as it actually happens.

• It makes use of the different human senses in gathering
information.

• It is useful in studying the reactions and behavior of
individuals or groups of persons/objects in a given
situation or environment as it happens.

Data Collection Methods
3. EXPERIMENTATION

• It is a method of collecting data where there is direct
human intervention on the conditions that may affect the
values of the variables of interest.

• It is conducted in laboratories where specimens are
subjected to some aspects of control to find out cause
and effect relationships.


Comparison of Survey, Experiment, and Observation Methods

                                 Data Collection Method
Aspect                           Survey              Experiment           Observation
Assessing the reliability of     Generally possible  Sometimes difficult  Oftentimes difficult
generalizations about a
well-defined population
Ability to establish             Poor                Superior             Poor
cause-and-effect
Realism of data                  Realistic           Least realistic      More realistic
Types of Data
Qualitative, or Attribute, or Categorical Variable

• It consists of attributes, labels, or nonnumerical entries.

• A variable that describes or categorizes an element of a
population.
○ Dichotomous - divided into 2 branches, classes,
or categories
○ Trichotomous - divided into 3 branches, classes,
or categories
○ Multinomous - divided into 4 or more branches,
classes, or categories

Types of Data
Quantitative, or Numerical Variable

• A variable that quantifies an element of a
population.

• It consists of numerical measurements or counts
and can be ordered or ranked.

Quantitative, or Numerical Variable
DISCRETE VARIABLES
• Assume values that can be counted.
• Can be assigned values such as 0, 1, 2, 3 and are said
to be countable.
example
Examples of discrete variables are the number of children
in a family, the number of students in a classroom, and the
number of calls received by a switchboard operator each
day for a month.

Quantitative, or Numerical Variable
CONTINUOUS VARIABLES
• Can assume an infinite number of values in an interval
between any two specific values. They are obtained by
measuring. They often include fractions and decimals.

Types of Data

CAUTION
• Continuous data can be measured, but not counted. If we
select a particular data value from continuous data, there
is no “next” data value.
Scales of Measurement



Measurement scales

Measurement
• It is the process of determining the value or label of
the variable based on what has been observed.



Measurement scales

Nominal Level of Measurement
• Data are qualitative only.
• Data at this level are categorized using names,
labels, or qualities. No mathematical computations
can be made at this level.

Measurement scales

Ordinal Level of Measurement
• Data are qualitative or quantitative.
• Data at this level can be arranged in order, or
ranked, but differences between data entries are
not meaningful.
Measurement scales
Interval Level of Measurement
• Data can be ordered, and meaningful differences
between data entries can be calculated.
• At the interval level, a zero entry simply represents a
position on a scale; the entry is not an inherent zero.
Note: An inherent zero is a zero that implies “none.”
Measurement scales
Ratio Level of Measurement
• Data are similar to data at the interval level, with the added
property that a zero entry is an inherent zero.
• A ratio of two data values can be formed so that one data
value can be meaningfully expressed as a multiple of
another.


Measurement scales
A system of measurement may have the following properties:
a) The numbers in the system are used to classify a
person/object into distinct, nonoverlapping, and
complete/exhaustive categories.
b) The system arranges categories according to
magnitude/degree.
c) The system has a fixed unit of measurement representing a
set size throughout the scale; and
d) The system has an absolute zero.
• Ratio level of measurement satisfies a, b, c, and d
• Interval level of measurement satisfies only a, b, and c
• Ordinal level of measurement satisfies only a, and b
• Nominal level of measurement satisfies only a
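
The cumulative pattern in the list above (each level keeps the properties of the level below it and adds one more) can be written down as a small lookup. This is only an illustrative sketch: the property names and the helper functions `properties_of` and `supports_ratios` are made up for this example.

```python
# Properties a) to d) from the list above, in order (names are illustrative).
PROPERTIES = ("classifies", "orders", "fixed_unit", "absolute_zero")

# Each level satisfies the first k properties of the list.
LEVELS = {"nominal": 1, "ordinal": 2, "interval": 3, "ratio": 4}


def properties_of(level):
    """Return the properties satisfied by the given level of measurement."""
    return PROPERTIES[:LEVELS[level]]


def supports_ratios(level):
    """A ratio of two values is meaningful only when the scale has an absolute zero."""
    return "absolute_zero" in properties_of(level)


for name in LEVELS:
    print(f"{name:<9} {', '.join(properties_of(name))}")
```

For instance, `supports_ratios("interval")` is false, which matches the point that an interval-level zero is only a position on the scale.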
Types of Data & Measurement Scales
(Figure: chart relating the types of data, qualitative and quantitative (discrete or continuous), to the nominal, ordinal, interval, and ratio scales.)
Presentation of Data

Methods of Presenting Data

1. Textual method
• This method presents the collected data in narrative or
paragraph form.
2. Tabular method
• This method presents the collected data in tables
arranged in orderly rows and columns for an easier
and more comprehensive comparison of figures.
3. Graphical method
• This method presents the collected data in visual or
pictorial form to give a clear view of the data (e.g., histogram,
pie chart, Pareto chart, pictograph, etc.).
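
As an illustration of the tabular method, the sketch below turns raw categorical observations into a frequency table. The hair-color data and the helper name `frequency_table` are made up for this example; `collections.Counter` is from the Python standard library.

```python
from collections import Counter

# Hypothetical raw data: hair color recorded for 10 students (made-up values).
hair_colors = ["brown", "black", "black", "brown", "blond",
               "black", "brown", "black", "red", "black"]


def frequency_table(values):
    """Return rows of (category, frequency, relative frequency), most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [(category, freq, freq / total)
            for category, freq in counts.most_common()]


rows = frequency_table(hair_colors)

# Print the table in rows and columns.
print(f"{'Category':<10}{'Frequency':>10}{'Rel. freq.':>12}")
for category, freq, rel in rows:
    print(f"{category:<10}{freq:>10}{rel:>12.2f}")
```

The same rows could then feed a graphical presentation such as a pie chart or a Pareto chart.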
Sampling and Sampling Techniques


Sampling techniques
Census
• It is a count or measure of an entire population.
Taking a census provides complete information, but it
is often costly and difficult to perform.

Sampling
• It refers to the process of selecting individuals from
the target population.

Sampling frame
• A list of all elements or other units containing the
elements or members in a population.
Sampling techniques
• Probability Sampling
• Nonprobability Sampling
Probability Sampling
• Probability sampling or random sampling is a process
in which each member of the population has a known chance
(in simple random sampling, an equal chance) of being selected.
• Types of Probability sampling
○ Simple Random Sampling
○ Systematic Sampling
○ Stratified Sampling
○ Cluster Sampling
○ Multistage Sampling
Types of Probability Sampling
Simple Random Sampling
• It is a process of selecting a sample of size n from the
population using random numbers or a lottery.
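
A minimal sketch of simple random sampling, assuming the population is a list of student ID numbers from 1 to 100 (an invented example). `random.Random.sample` from the Python standard library draws without replacement, like a lottery; the function name `simple_random_sample` is illustrative.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements so that every element is equally likely to be chosen."""
    rng = random.Random(seed)          # a seed makes the draw repeatable
    return rng.sample(list(population), n)


# Hypothetical sampling frame: student ID numbers 1 to 100.
student_ids = range(1, 101)
srs = simple_random_sample(student_ids, 10, seed=42)
print(sorted(srs))
```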

Types of Probability Sampling
Systematic Sampling
• A systematic sample is a sample in which each member
of the population is assigned a number. We select some
starting point and then select every kth (such as every
3rd) element in the population.
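
The procedure can be sketched in a few lines. This example assumes the starting point is chosen at random among the first k elements, which is a common convention but not something the slide specifies; the function name `systematic_sample` is illustrative.

```python
import random


def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k elements, then take every kth element."""
    items = list(population)
    rng = random.Random(seed)
    start = rng.randrange(k)           # index 0 .. k-1
    return items[start::k]


# Hypothetical population: 30 members numbered 1 to 30, taking every 3rd member.
systematic = systematic_sample(range(1, 31), k=3, seed=0)
print(systematic)
```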

Types of Probability Sampling
Stratified Sampling
• A stratified sample is a sample obtained by dividing the
population into subgroups, called strata, according to
various homogeneous characteristics and then selecting
members from each stratum for the sample.
(Figure: a population divided into two strata, a group of men and a group of women.)
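
A minimal sketch of stratified sampling with two strata (men and women). It assumes an equal number of members is drawn from each stratum by simple random sampling; proportional allocation is another common choice. The data and the function name `stratified_sample` are made up for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, per_stratum, seed=None):
    """Group units into strata, then draw per_stratum units at random from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):        # fixed order keeps the result repeatable
        sample.extend(rng.sample(strata[name], per_stratum))
    return sample


# Hypothetical population: 20 men ("M") and 30 women ("F"), each with an ID.
people = [("M", i) for i in range(20)] + [("F", i) for i in range(30)]
chosen = stratified_sample(people, stratum_of=lambda p: p[0], per_stratum=5, seed=7)
print(chosen)
```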
Types of Probability Sampling
Cluster Sampling
• A method of sampling in which the members of a population are arranged in
groups (the ‘clusters’). The clusters generally consist of natural
groupings, for example, families, hospitals, schools, etc. A number of clusters
are selected at random. The researcher then either uses all members of the
selected clusters as the subjects of the sample, or draws a subsample from
each selected cluster.

Types of Probability Sampling
Simple one-stage cluster sample:
• List all the clusters in the population, and
from the list, select the clusters – usually
with simple random sampling (SRS)
strategy. All units (elements) in the
sampled clusters are selected for the
survey.
Simple two-stage cluster sample:
• List all the clusters in the population.
First, select the clusters, usually by
simple random sampling (SRS). The
units (elements) in the selected clusters
of the first-stage are then sampled in the
second-stage, usually by simple random
sampling (or often by systematic
sampling).
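
The difference between the one-stage and two-stage designs can be sketched as follows. The example assumes simple random sampling at every stage and uses five invented schools of eight students each; the function names are illustrative.

```python
import random


def one_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Select clusters at random and keep every unit in the selected clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Select clusters at random, then draw a random subsample within each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], n_per_cluster))
    return sample


# Hypothetical clusters: 5 schools (A to E) with 8 students each.
schools = {f"School {c}": [f"{c}-{i}" for i in range(1, 9)] for c in "ABCDE"}

one_stage = one_stage_cluster_sample(schools, 2, seed=1)
two_stage = two_stage_cluster_sample(schools, 2, 3, seed=1)
print(len(one_stage), "units in the one-stage sample")
print(len(two_stage), "units in the two-stage sample")
```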
Types of Probability Sampling
Multistage Sampling
• A sample design in which the elements of the
sampling frame are subdivided and the sample is
obtained by using a combination of methods. This is
usually used for national, regional, provincial, or
country-level studies.
example
1st level : 4 provinces per region = 4 provinces
2nd level : 3 municipalities per province = 12 municipalities
3rd level : 2 barangays per municipality = 24 barangays
4th level : 10 respondents per barangay = 240 respondents
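
The running totals in this example are just cumulative products of the number of units selected at each level. A small check of that arithmetic (the stage labels follow the example; the variable names are illustrative):

```python
# Units selected within each unit of the previous level.
stages = [
    ("provinces per region", 4),
    ("municipalities per province", 3),
    ("barangays per municipality", 2),
    ("respondents per barangay", 10),
]

running_total = 1
cumulative = []
for label, count in stages:
    running_total *= count             # multiply down the levels
    cumulative.append(running_total)
    print(f"{label:<30} x{count:<3} -> {running_total}")
```

The last running total is the overall sample size for one region.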
Nonprobability Sampling

• Nonprobability sampling or nonrandom sampling is a
sampling procedure in which samples are selected in a
deliberate manner with little or no attention to
randomization.
• Some segments of the population have no chance
of being selected or included in the sample, or their chance
cannot be specified.
• Types of Nonprobability sampling
○ Convenience Sampling
○ Purposive Sampling
○ Quota Sampling
○ Snowball Sampling
○ Networking Sampling
Types of Nonprobability Sampling
Convenience Sampling
• A convenience sample consists only of available
members of the population.

Types of Nonprobability Sampling
Purposive Sampling
• It is also called judgment sampling.
• The sampling units are selected personally or
subjectively by the researcher, who attempts to obtain a
sample that appears to be representative of the
population.
example
• A human resource director interviews the qualified
applicants for a supervisory position. (Note: The qualified
applicants are selected by the director based on
his own judgment.)
Types of Nonprobability Sampling
Quota Sampling
• In this method, the researcher determines the sample
size (quota) that should be filled.
• The basic idea is to set a target number of completed
interviews with specified subgroups of the population of
interest.
example
• For example, a researcher might ask for a sample of 100
females, or 100 individuals between the ages of 20-30.
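
A minimal sketch of filling quotas, assuming respondents are taken in the order they become available until every subgroup target is met. That ordering is what makes the method nonrandom. The respondent data and the function name `quota_sample` are made up for illustration.

```python
def quota_sample(respondents, group_of, quotas):
    """Accept respondents in arrival order until each subgroup's quota is filled."""
    remaining = dict(quotas)           # copy so the caller's quotas are not modified
    sample = []
    for person in respondents:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break                      # every quota has been filled
    return sample


# Hypothetical stream of available respondents: (sex, ID).
arrivals = [("F", 1), ("M", 2), ("F", 3), ("M", 4), ("F", 5), ("M", 6)]
quota = quota_sample(arrivals, group_of=lambda p: p[0], quotas={"F": 2, "M": 1})
print(quota)
```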

Types of Nonprobability Sampling

Snowball Sampling
• It involves starting a process with one individual or group
and using their contacts to develop the sample, hence
“snowball”.
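
The referral process can be sketched as a breadth-first walk over a contact list. This is only an illustration: the names, the `contacts` mapping, and the `snowball_sample` function are invented, and real studies decide differently how many referrals to follow.

```python
from collections import deque


def snowball_sample(contacts, seeds, max_size):
    """Start from seed individuals and grow the sample through their referrals."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for referral in contacts.get(person, []):
            if referral not in seen:   # do not recruit the same person twice
                seen.add(referral)
                queue.append(referral)
    return sample


# Hypothetical referral lists: who each participant can introduce.
contacts = {
    "Ana": ["Ben", "Carla"],
    "Ben": ["Dino"],
    "Carla": ["Dino", "Ella"],
    "Dino": [],
    "Ella": ["Ana"],
}
snowball = snowball_sample(contacts, seeds=["Ana"], max_size=4)
print(snowball)
```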

Types of Nonprobability Sampling

Networking Sampling
• This is used to find socially devalued urban
populations such as addicts, alcoholics, child
abusers and criminals, because they are usually
“hidden from outsiders.”

Determining the Sample Size


Slovin’s formula
It is used to calculate an appropriate sample
size from a population.

$$ n = \frac{N}{1 + N e^{2}} $$

where
n = Sample size
N = Total population
e = margin of error or tolerance (e.g., 5%)
Margin of error is a value which quantifies possible
sampling error.
Example: Slovin’s formula

1. The student population of Concepcion
College is 2,436. Compute the sample size
using Slovin’s formula. Use e = 5%.
Solution:
Given N = 2,436 and e = 0.05, find n.

$$ n = \frac{N}{1 + N e^{2}} = \frac{2{,}436}{1 + 2{,}436\,(0.05)^{2}} = \frac{2{,}436}{1 + 6.09} = \frac{2{,}436}{7.09} \approx 343.58 $$

$$ \therefore\ n \approx 344 $$
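
The computation in this example can be checked with a short function. This is a minimal sketch: the name `slovin_sample_size` is illustrative, and rounding up to the next whole respondent is an assumption that matches the rounding used in the worked solution.

```python
import math


def slovin_sample_size(population_size, margin_of_error=0.05):
    """Sample size from the formula n = N / (1 + N * e**2), rounded up."""
    raw = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(raw)


raw_n = 2436 / (1 + 2436 * 0.05 ** 2)
print(round(raw_n, 2))                   # unrounded value of n
print(slovin_sample_size(2436, 0.05))    # rounded up to a whole respondent
```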
Example: Slovin’s formula
2. At a 5% margin of error, the number of student
respondents of a certain study is 265. Assuming
that the researcher used Slovin’s formula, what
is the student population?
Solution:
Solve Slovin’s formula for N:

$$
\begin{aligned}
n &= \frac{N}{1 + N e^{2}} \\
n\left(1 + N e^{2}\right) &= N \\
n + n N e^{2} &= N \\
n &= N - n N e^{2} \\
n &= N\left(1 - n e^{2}\right) \\
N &= \frac{n}{1 - n e^{2}}
\end{aligned}
$$

Given n = 265 and e = 0.05, find N.

$$ N = \frac{n}{1 - n e^{2}} = \frac{265}{1 - 265\,(0.05)^{2}} = \frac{265}{1 - 0.6625} = \frac{265}{0.3375} \approx 785.185 $$

$$ \therefore\ N \approx 786 $$
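
The rearranged formula N = n / (1 - n e^2) can be checked numerically. This is a minimal sketch with an illustrative function name; the guard reflects the fact that the formula only gives a positive N when n e^2 is less than 1.

```python
import math


def population_from_sample_size(n, e=0.05):
    """Invert n = N / (1 + N * e**2) to recover N = n / (1 - n * e**2)."""
    denominator = 1 - n * e ** 2
    if denominator <= 0:
        # The formula keeps n below 1 / e**2, so no population gives this n.
        raise ValueError("n * e**2 must be less than 1")
    return n / denominator


raw_N = population_from_sample_size(265, 0.05)
print(round(raw_N, 3))       # unrounded value of N
print(math.ceil(raw_N))      # rounded up, as in the worked solution
```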

Improper Use of Slovin’s Formula

TRUTH: It is not proper to call it Slovin’s Formula since he did
NOT derive it! The term Slovin’s Formula originated in the
Philippines. Based on the formula, by setting the population
standard deviation at P = 0.5, we get the largest
possible sample, which may be good or bad. It is valid only
under simple random sampling and any other design that is
theoretically more efficient than simple random sampling
(e.g., one-stage stratified sampling). Lastly, the setting of
the value of e varies from one purpose to another. Of
course, the smaller, the better.
Dr. Arturo Y. Pacificador, Jr., Chair of the Mathematics Department of De La Salle University in Manila

Statistics and Computers

Today, with the availability of user-friendly statistical
software such as Microsoft Excel, statistical capabilities
are within reach of all managers. In addition, there are many
other ‘off-the-shelf’ software packages for use on computers.
These include SPSS, SPlus, Minitab, NCSS, Statgraphics,
SYSTAT, EViews, UNISTAT and Stata, to name a few. Some
work as Excel add-ins. A search of the internet will identify
many other statistical packages and list their capabilities.