Gathering Techniques
S R Kulkarni
TechNet IT Enabled Services, Pune
Design of Sampling
The process of obtaining information from a subset (sample) of
a larger group (population)
Population
Sample
Terminology
Population
The entire group of people of interest from whom the
researcher needs to obtain information (what information?).
Element (sampling unit)
one unit from a population
Sampling
The selection of a subset of the population
Sampling Frame
Listing of population from which a sample is chosen
Census
Complete enumeration of the entire population
Terminology
Parameter
The variable of interest. It is unknown. If it were known,
then what?
Statistic
The information obtained from the sample about the
parameter
Goal
Assumption
The sample chosen is representative of the population
Non-Probability Sampling
Probability Sampling
Determine Appropriate Sample Size
Execute Sampling Design
Sampling unit
Scope
Timing
individuals
families
component of a machine
individuals over 20
families with 2 kids
Replaced components of a machine
individuals who have bought one
families who eat fast food
Components of a machine failed once
bought over the last seven days
Individuals
Household
Streets
Telephone numbers
Companies
The big question is:
How do we select the sampling unit and
a sample from the target
population?
Ex. To study the socio-economic
status of loyal customers, which
customers should be selected in a
sample from the population of all loyal
customers in the available database?
Nonrandom Sampling
Not every unit of the population has the same
probability of being included in the sample.
Open to selection bias
Not an appropriate data collection method for most
statistical methods
Also known as nonprobability sampling
Stratified Sampling
Steps:
Population is divided into mutually exclusive and
exhaustive strata based on an appropriate population
characteristic. (e.g. race, age, gender etc.)
Simple random samples are then drawn from each
stratum.
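The two steps above (split the population into strata, then draw a simple random sample from each stratum) can be sketched in Python. This is a minimal illustration only: the function name stratified_sample, the gender field, and the equal sample size per stratum are assumptions made for the example, not part of the slides.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, n_per_stratum, seed=0):
    """Draw a simple random sample of n_per_stratum units from each stratum."""
    rng = random.Random(seed)
    # Step 1: divide the population into mutually exclusive, exhaustive strata.
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    # Step 2: draw a simple random sample (without replacement) from each stratum.
    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = min(n_per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population of 100 people, stratified by a made-up gender field.
population = [{"id": i, "gender": "F" if i % 2 == 0 else "M"} for i in range(100)]
sample = stratified_sample(population, lambda u: u["gender"], n_per_stratum=5)
```

Because every unit falls into exactly one stratum, each stratum is guaranteed to appear in the sample, which a single simple random sample of the whole population cannot promise.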
Cluster Sample
The population is divided into subgroups (clusters)
Cluster Sampling
Clusters of population units are selected at random
and then all or some randomly chosen units in the
selected clusters are studied.
Steps:
Population is divided into mutually exclusive and
exhaustive subgroups, or clusters. Ideally, each
cluster adequately represents the population.
A simple random sample of a few clusters is
selected.
All or some randomly chosen units in the selected
clusters are studied.
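The cluster sampling steps above can be sketched as follows. This is an illustrative sketch under assumptions: the function name cluster_sample, the street and house labels, and the cluster sizes are all hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, units_per_cluster=None, seed=0):
    """Select clusters at random, then study all or some units in each one."""
    rng = random.Random(seed)
    # Step 2: simple random sample of a few clusters.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        units = clusters[name]
        if units_per_cluster is None or units_per_cluster >= len(units):
            # Study every unit in the selected cluster.
            sample.extend(units)
        else:
            # Study only some randomly chosen units in the selected cluster.
            sample.extend(rng.sample(units, units_per_cluster))
    return chosen, sample

# Step 1 (assumed data): 10 streets act as clusters, each with 8 households.
clusters = {
    f"street_{s}": [f"street_{s}_house_{h}" for h in range(8)]
    for s in range(10)
}
chosen, sample = cluster_sample(clusters, n_clusters=3, units_per_cluster=4)
```

Unlike stratified sampling, only the selected clusters contribute units, so the result depends on how well each cluster represents the population.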
Systematic Sample
Every kth member ( for example: every 10th
Errors in Sampling
Non-Observation Errors
Sampling error: naturally occurs
Coverage error: people sampled do not match the
population of interest
Underrepresentation
Non-response: won't or can't participate
Errors of Observation
Interview error- interaction between interviewer
Non-Probability Sampling
Subjective procedure in which the probability of
selection for some population units is zero or
unknown before drawing the sample.
Information is obtained from a non-representative
sample of the population
Sampling error cannot be computed
Survey results cannot be projected to the
population
Convenience Sample
Selection of whichever individuals are easiest to
reach
It is done at the convenience of the researcher
Types of Non-Probability Sampling (I)
Judgement Sampling
A researcher exerts some effort in selecting a
sample that seems to be most appropriate for
the study.
Quota Sampling
The population is divided into cells on the basis of relevant
control characteristics.
A quota of sample units is established for each cell.
50 women, 50 men
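A minimal sketch of quota sampling with the 50 women, 50 men quotas above. The function name quota_sample and the stream of arriving respondents are hypothetical. Units are accepted in the order they happen to arrive, which is what makes this a non-probability method.

```python
def quota_sample(arrivals, cell_of, quotas):
    """Fill each cell's quota with the first matching units that come along."""
    filled = {cell: [] for cell in quotas}
    for unit in arrivals:
        cell = cell_of(unit)
        # Accept the unit only if its cell still has room.
        if cell in filled and len(filled[cell]) < quotas[cell]:
            filled[cell].append(unit)
        # Stop as soon as every quota is met.
        if all(len(filled[c]) == quotas[c] for c in quotas):
            break
    return filled

# Hypothetical stream of respondents in arrival order (not a random draw).
arrivals = [{"id": i, "gender": "woman" if i % 3 else "man"} for i in range(400)]
result = quota_sample(arrivals, lambda u: u["gender"], {"woman": 50, "man": 50})
```

The cell counts match the quotas exactly, but which individuals fill them is left to arrival order, so selection probabilities are unknown.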
Practice
To conduct a survey of long-distance calling patterns, a researcher
opens a telephone book to a random page, closes his eyes, puts
his finger down on the page, and then reads off the next 50
names. Which of the following are true statements?
I. The survey design incorporates chance
II. The procedure results in a simple random sample
III. The procedure could easily result in selection bias
a) I and II
b) I and III
c) II and III
d) I, II and III
e) None of the above gives the complete set of true responses
Practice
A large elementary school has 15
classrooms, with 24 children in each
classroom. A sample of 30 children is
chosen by the following procedure:
What next?
Descriptive statistics:
Organize and summarize data from
samples
Inferential statistics:
Infer information about the population
based on what we know from sample
data
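The difference between the two can be illustrated with a short sketch. The population here is simulated and entirely hypothetical (heights in cm), and the interval uses the common large-sample normal approximation with 1.96, which is an assumption of this example and not something stated in the slides.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population; in practice its mean (the parameter) is unknown.
population = [rng.gauss(170, 10) for _ in range(10_000)]
sample = rng.sample(population, 100)

# Descriptive statistics: organize and summarize the sample itself.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

# Inferential statistics: use the sample to say something about the population.
# Approximate 95% confidence interval for the population mean.
standard_error = sample_sd / len(sample) ** 0.5
ci_low = sample_mean - 1.96 * standard_error
ci_high = sample_mean + 1.96 * standard_error

print(f"sample mean = {sample_mean:.1f}, sd = {sample_sd:.1f}")
print(f"approx. 95% CI for population mean: ({ci_low:.1f}, {ci_high:.1f})")
```

The first two numbers describe only the 100 sampled values. The interval is a statement about the whole population, and it is only trustworthy if the sample is representative.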
Types of Variables
Quantitative
Measured in amounts
Ht, Wt, Test score
Qualitative
Measured in categories
Gender, race, diagnosis
Discrete:
separate categories
Letter grade
Continuous:
infinite values in between
GPA
Scales of Measurement
Nominal Scale: Categories, labels, data carry no numerical
value
Frequency Distributions
After collecting data, the first task for a
researcher is to organize and simplify the
data so that it is possible to get a general
overview of the results.
One method for simplifying and organizing
data is to construct a frequency
distribution.
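As a small illustration, a frequency distribution can be built by counting how often each value occurs. The quiz scores below are made-up data for the example.

```python
from collections import Counter

# Hypothetical quiz scores for 20 students.
scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8, 7, 8, 10, 9, 8, 6, 9, 7, 8, 8]

# Count how many times each score occurs.
freq = Counter(scores)

# Print score, frequency, and proportion, from highest score to lowest.
print(" X   f   proportion")
for value in sorted(freq, reverse=True):
    print(f"{value:>2}  {freq[value]:>2}   {freq[value] / len(scores):.2f}")
```

The table replaces 20 raw numbers with one row per distinct score, which makes the most common score and the overall spread easy to see.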