
Research Methods

Consuelo L. Orense
Sampling

Sampling - the process of selecting samples from a given population
- predicting the behavior of the population based on the information taken from a representative sample of the population

Census – collection of information from the total population
Advantages of sampling

• Can get adequate information from a smaller number of respondents in the population
• Less time for data collection
• Can give immediate results
• Not affected by temporal changes
• Less costly
  • Reduced personnel and project cost
Definitions
• Population – the totality of the group being studied
  • Parameter – any numerical value that is used to describe a population; e.g., there are 1,000 employees in this institution
• Sample – a part of the population from which information is solicited; a representative of the population being studied
  • Statistic – a numerical value that describes a sample; e.g., among the employees sampled from this institution's 1,000 employees, 50% are female
Definitions, example
Subject – the person from whom data are collected; also called a participant (especially in qualitative research)
Example: The participants in the study were 200 school children from Bonifacio Elementary School (BES). Of these, 120 (60%) were girls and 80 (40%) were boys.
Sample: 200 children
Population: all school children of BES
Sampling Designs

Probability sampling: simple random, systematic, stratified, cluster, multi-stage
Non-probability sampling: convenience, quota, purposive
Sampling designs: A. Probability sampling

• A. Probability sampling – a method of sampling in which the probability of being selected is known
  • Sampling frame – a listing of the subjects who are to be sampled; this is one requirement for using a probability type of sampling
  • Target population – the group to whom the data will be generalized. Example: data taken from randomly selected call center facilities in different cities to represent the call center population in the country
Sampling designs: A. Probability sampling

• 1. Simple random sampling – selects a number of subjects (n) such that every member of the population has an equal chance of being selected
  • Selection is done by assigning numbers to the participants in a sampling frame, then selecting them using a table of random numbers
  • This approach is not convenient if the population is large and not numbered
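The selection step described above can be sketched in a few lines of Python. This is only an illustration: the frame of 1,000 numbered subjects, the sample size of 50, and the function name `simple_random_sample` are assumptions, and a pseudo-random generator stands in for the table of random numbers.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct subjects from a sampling frame, each equally likely."""
    rng = random.Random(seed)      # seeded generator so a draw can be repeated
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical sampling frame: subjects numbered 1 to 1000
frame = list(range(1, 1001))
sample = simple_random_sample(frame, 50, seed=1)
print(sorted(sample)[:5])
```

Because the draw is without replacement, no subject can be selected twice.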
Sampling designs

• 2. Systematic sampling – every kth element (such as every fifth) is selected from a list of all elements in the population (N), beginning with a randomly selected element
  • For example, if we want to select 100 subjects from a population of 5,000, then k = 5,000 / 100 = 50, so every kth element corresponds to every 50th subject. Suppose the first element randomly selected was the 24th; the next is the 74th subject, then the 124th, and so on
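The worked example above (100 subjects from 5,000, starting at subject 24) can be reproduced with a short sketch. The function name and the rule that the random start falls within the first interval are assumptions made for illustration.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the positions of every kth subject, beginning at a random start."""
    k = population_size // sample_size           # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)
    if not 1 <= start <= k:
        raise ValueError("start must lie within the first interval (1..k)")
    return [start + i * k for i in range(sample_size)]

picks = systematic_sample(5000, 100, start=24)
print(picks[:3], picks[-1])
```

With a start of 24 and k = 50, the list begins 24, 74, 124 and ends at 4974, so all 100 positions stay inside the population of 5,000.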
Sampling designs

• 3. Stratified sampling – a modification of either simple random or systematic sampling in which the population is first divided into homogeneous groups, or strata, and the elements are then selected at random within each stratum
  • Examples of strata: Grades 1, 2, and 3; rural/urban barangays
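A minimal sketch of stratified selection, assuming a simple random draw of the same size within each stratum (proportional allocation is another common choice). The pupil list and the function name are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(subjects, stratum_of, n_per_stratum, seed=None):
    """Group subjects into strata, then draw at random within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)
    return {
        name: rng.sample(members, min(n_per_stratum, len(members)))
        for name, members in strata.items()
    }

# Hypothetical pupils: 10 each in Grades 1, 2 and 3
pupils = [(f"pupil{i}", f"Grade {1 + i % 3}") for i in range(30)]
chosen = stratified_sample(pupils, lambda p: p[1], 4, seed=7)
for grade in sorted(chosen):
    print(grade, [name for name, _ in chosen[grade]])
```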
Sampling designs

• 4. Cluster sampling – involves randomly choosing naturally occurring groups or areas and then selecting the individual elements from the chosen groups
  • Examples of clusters: schools, universities, classes; a block of 20 households
• 5. Multi-stage sampling – sampling done at different levels, i.e., region, province, city, barangay; done for big surveys or national surveys
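Cluster and multi-stage sampling both select groups first and elements second. The sketch below shows two stages only (clusters, then members); the school names, cluster sizes, and function name are hypothetical, and a real multi-stage survey would repeat the same idea at each level.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: pick members within each."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen_names
    }

# Hypothetical clusters: five schools with 20 pupils each
schools = {f"School {c}": [f"{c}-{i}" for i in range(1, 21)] for c in "ABCDE"}
picked = two_stage_cluster_sample(schools, n_clusters=2, n_per_cluster=5, seed=3)
print(picked)
```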
Sampling designs: B. Non-probability design

• B. Non-probability sampling – the probability of selection is not known
• 1. Convenience sampling – also known as accidental sampling; takes people or other units that are readily available; for example, churchgoers at a Sunday mass, shoppers in a supermarket, people who arrive on a scene
Sampling designs: B. Non-probability design

• 2. Quota sampling – selects subjects in the same proportions as are found in the general population, but not in a random manner
  • Example: a study that chose 20 boys and 20 girls
• 3. Purposive sampling – people are chosen for a purpose or using a set of criteria; for example, residents of Manila for 5 years, new cases of diabetes, pregnant women with singleton delivery, not taking medication for hypertension
Factors Affecting Sample Size [a]

• Size of the population [b]
• Amount of error to be tolerated [b]
• Variability of the data – standard deviation [b]
• Proportion or rate of the condition [b]
• Homogeneity of the population
[a] There are other factors aside from those mentioned
[b] Questions a statistician would ask when consulting on sample size


Options in determining sample size

• Manual calculation using statistical formulas
  • Consider data type, study design, hypothesis to be tested
• Table for determining sample size
  • From books and journal articles
• Software-based calculations
  • EpiCalc (Epi Info), Right Size, G*Power, PASS, Minitab, etc.
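As an illustration of a manual calculation, the sketch below uses one widely taught formula, Cochran's formula for estimating a proportion with a finite population correction. The slides do not name a formula, so this choice, the function name, and the defaults (95% confidence, z = 1.96; 5% margin of error; proportion 0.5) are all assumptions. The appropriate formula depends on the data type, study design, and hypothesis.

```python
import math

def sample_size_for_proportion(population_size, margin_of_error=0.05,
                               proportion=0.5, z=1.96):
    """Cochran's formula with finite population correction (an assumed choice)."""
    # Sample size for a very large population
    n0 = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    # Adjust downward for a finite population of known size
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)

print(sample_size_for_proportion(5000))   # 357
print(sample_size_for_proportion(100))    # 80
```

The inputs map onto the factors listed earlier: population size, tolerated error, and the proportion of the condition.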
Guidelines for selecting a sample size

• For a small population (fewer than 100 people), survey the entire population
• If the population size is around 500, 50% of the population should be sampled
• If the population size is around 1,500, 20% should be sampled
• Beyond a certain point (5,000 or more), a sample size of 400 is adequate
• Generally speaking, the larger the population, the smaller the percentage one needs to get a representative sample (Leedy, Paul, 2005. Practical Research)
What will I do with my data?
• A. Identify the type of data and its purpose in the study
  • Scale of measurement of data: nominal, ordinal, interval, ratio
  • Nominal variables – a mean cannot be computed, but the data can be summarized by category to compute percentages, rates, and ratios
    • Example: percent overweight/underweight; prevalence rate of diabetes; risk ratio of stroke
  • Interval/ratio data – classified as higher-order data; means and medians can be computed; can be reclassified into nominal data, for example, hemoglobin values transformed into anemic/non-anemic categories
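Reclassifying a ratio variable into nominal categories can be sketched as follows. The 12.0 g/dL cutoff is an assumption for illustration only (actual anemia cutoffs depend on age, sex, and the reference used), and the hemoglobin values are hypothetical.

```python
ANEMIA_CUTOFF_G_DL = 12.0   # assumed cutoff, for illustration only

def classify_anemia(hemoglobin_g_dl, cutoff=ANEMIA_CUTOFF_G_DL):
    """Turn a hemoglobin value (ratio data) into a nominal category."""
    return "anemic" if hemoglobin_g_dl < cutoff else "non-anemic"

values = [11.0, 12.5, 10.0, 12.0]             # hypothetical readings in g/dL
categories = [classify_anemia(v) for v in values]

# Once the data are nominal, they are summarized as a percentage, not a mean
percent_anemic = 100 * categories.count("anemic") / len(categories)
print(categories, percent_anemic)
```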
What will I do with my data?
• B. Review study objectives to determine the appropriate method of analyzing the data
  • If the objective is to describe
    • Examples from objectives: to describe the participants' socio-demographic profile; to determine grades of students; to measure product acceptability; percent of favorable answers
  • If the objective is to correlate, relate, or determine association
    • Examples from objectives: to correlate age and body weight; to relate socio-economic factors with nutritional status
  • If the objective is to compare
    • To compare test scores in Math and in English
    • To determine the difference in weight after a diet high in fiber
    • To determine differences in height of students by year level
    • To compare the percent of children with color blindness
Data processing

• Steps undertaken to put the data into a form that is suitable for statistical analysis
• Stage of research where data are checked for completeness, consistency and accuracy
• The process generates statistical outputs such as descriptive statistics, cross-tabulation, regression, etc., which the researcher can analyze
Steps in data processing

• 1. Editing
• 2. Coding
• 3. Creating a datafile
• 4. Summarizing data
1. Editing

• Editing – done by examining the completed form or questionnaire to detect errors or omissions
  • A vital step to ensure completeness, consistency and accuracy of data
  • Editing should be done immediately after the interview, before the interviewer leaves the place of interview, especially if contact with the subject will be difficult after the interview
  • Forms should be checked by the supervisor after editing by the interviewer
2. Coding

• Coding – the process of converting data into numbers or symbols that can be more easily counted or tabulated
  • Coding instructions should be ready before coding to avoid confusion that might arise from changes in the coding scheme of the study variables
• Preparing dummy tables (the frame or structure for data tabulation) ahead of time is a key step in the planning and pre-analysis phase of the study; a dummy table is essentially a table with row and column headings but without data inside the cells
Example of coding instruction

Variable        Code   Description
Sex             1      Male
                2      Female
Agegroup        1      20-29 yrs
                2      30-39 yrs
Education       1      Finished Elementary school
                2      High School
Civil status    1      Single
                2      Married
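The coding instruction above can be kept as a single lookup structure so that every encoder applies the same scheme. This is a sketch only; the names `CODEBOOK`, `encode`, and `decode` are hypothetical, and the entries are copied from the example table.

```python
# Codebook transcribed from the example coding instruction
CODEBOOK = {
    "Sex": {1: "Male", 2: "Female"},
    "Agegroup": {1: "20-29 yrs", 2: "30-39 yrs"},
    "Education": {1: "Finished Elementary school", 2: "High School"},
    "Civil status": {1: "Single", 2: "Married"},
}

def encode(variable, label):
    """Return the numeric code for a response, or fail loudly if it is unknown."""
    for code, description in CODEBOOK[variable].items():
        if description == label:
            return code
    raise ValueError(f"No code defined for {variable!r} = {label!r}")

def decode(variable, code):
    """Return the description that a numeric code stands for."""
    return CODEBOOK[variable][code]

print(encode("Sex", "Female"), decode("Civil status", 1))
```

Raising an error for an unlisted response, rather than inventing a new code on the spot, reflects the advice to fix the coding scheme before coding starts.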
Dummy table

Parts of a table: table number, table title, row heading, column heading, source note

Table 1. Distribution of students of ABS High School according to year level

Year level (row heading)    Number of students (column heading)
Freshmen                    350
Sophomore                   300
Junior                      250
Senior                      200
Total                       N = 1,100

Source: ABS High School Registrar (source note)
Dummy table

Table 1. Distribution of respondents by sex and educational level

Educational level    Male            Female          Total
                     Number   %      Number   %      Number   %
Elementary
High School
Vocational
College
Post graduate
Total
3. Create datafile

• Create datafile – involves storing data in a medium that can be readily processed for analysis or filed for future use; the process is called data encoding*
  • Spreadsheets (Excel) or statistical software can be used to encode data
  • All variables to be analyzed should be placed in a single spreadsheet or datafile

* In computer technology, encoding is the process of putting a sequence of characters into a special format for transmission or storage purposes (Webopedia.com)
Types of variables on a spreadsheet

Grade average of students and their hemoglobin levels

Initial   Age (yrs)   Sex   Grade   Rank   Hemoglobin (g/dL)   Anemic?
AM        9           M     90      2      11.0                Yes
BC        10          M     95      3      11.0                Yes
KL        11          F     88      5      12.5                No
MR        9           M     91      1      10.0                Yes
CG        8           F     86      6      12.0                No
NY        10          F     85      7      10.5                Yes
JF        8           M     89      4      11.1                Yes

Note: if a numeric variable (in a column) is to be used for computation, do not include a character, letter, or symbol with the number
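The note above can be checked mechanically before analysis. The sketch below flags cells in a column that cannot be read as plain numbers; the function name and the sample entries (with a stray unit, a comma, and a symbol) are hypothetical.

```python
def non_numeric_cells(column):
    """Return (row number, cell) pairs for entries that are not plain numbers."""
    problems = []
    for row, cell in enumerate(column, start=1):
        try:
            float(cell)
        except ValueError:
            problems.append((row, cell))
    return problems

# Hypothetical hemoglobin column as typed by an encoder
hemoglobin_column = ["11.0", "11.0", "12.5 g", "10.0", "12,0", ">10.5", "11.1"]
print(non_numeric_cells(hemoglobin_column))
```

Rows 3, 5 and 6 are reported, which are exactly the entries that would break a mean or standard deviation computed on that column.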
4. Data summary

• Summary of data – involves computation of the necessary parameters prior to statistical analysis; for example, cross-tabulations of age and sex, frequency tables
  • Tip: to check for correct data encoding or validity of collected data, run a frequency table in statistical software for ALL variables
• Inspect the minimum and maximum values of each variable
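Both checks can be sketched with the standard library. The data below are hypothetical and deliberately contain one impossible sex code and one implausible age, to show how a frequency table and a minimum/maximum check expose encoding errors.

```python
from collections import Counter

def frequency_table(values):
    """Count how often each distinct value occurs, in sorted order."""
    return dict(sorted(Counter(values).items()))

def value_range(values):
    """Return the minimum and maximum of a numeric variable."""
    return min(values), max(values)

sex_codes = [1, 2, 2, 1, 3, 2, 1]   # hypothetical; code 3 is not in the codebook
ages = [9, 10, 11, 9, 8, 10, 80]    # hypothetical; 80 is suspicious for pupils

print(frequency_table(sex_codes))   # {1: 3, 2: 3, 3: 1}
print(value_range(ages))            # (8, 80)
```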


Calculations on spreadsheet

• Excel computations from the formula bar
  • In the cell where you want to place the answer, type the "equal" sign (=), then click the particular cell or cells that are included in the computation
  • Example: to find the price of rice per sack (50 kg)

Rice variety    Price per kilogram    Price per sack
Sinandomeng     33                    =
Intan           31
Dinorado        35
Milagrosa       36
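The same computation, price per kilogram times 50 kg, is shown here in Python as a worked check of the spreadsheet example. The variable names are hypothetical; in Excel the equivalent would be a formula that multiplies the price cell by 50.

```python
SACK_WEIGHT_KG = 50

price_per_kg = {"Sinandomeng": 33, "Intan": 31, "Dinorado": 35, "Milagrosa": 36}

# Price per sack = price per kilogram x 50 kg
price_per_sack = {variety: price * SACK_WEIGHT_KG
                  for variety, price in price_per_kg.items()}

for variety, price in price_per_sack.items():
    print(f"{variety:12} {price}")
```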
Calculations on spreadsheet

Sorting – helps view data in ascending or descending order; enables one to see extreme values or check erroneous data
• Simple sort
  • Select cells
  • Click Sort, Ascending / Descending
• To sort by more than one criterion
  • Click Data / Sort to display the Sort dialog box
  • Fill in the boxes accordingly
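For readers working outside Excel, a simple sort and a two-criteria sort can be sketched in Python. The records reuse the student table from the spreadsheet example; the field names are hypothetical.

```python
students = [
    {"initial": "AM", "sex": "M", "grade": 90, "hemoglobin": 11.0},
    {"initial": "BC", "sex": "M", "grade": 95, "hemoglobin": 11.0},
    {"initial": "KL", "sex": "F", "grade": 88, "hemoglobin": 12.5},
    {"initial": "MR", "sex": "M", "grade": 91, "hemoglobin": 10.0},
    {"initial": "CG", "sex": "F", "grade": 86, "hemoglobin": 12.0},
    {"initial": "NY", "sex": "F", "grade": 85, "hemoglobin": 10.5},
    {"initial": "JF", "sex": "M", "grade": 89, "hemoglobin": 11.1},
]

# Simple sort: ascending hemoglobin puts the extreme values at the two ends
by_hemoglobin = sorted(students, key=lambda s: s["hemoglobin"])

# Two criteria: sex ascending, then grade descending within each sex
by_sex_then_grade = sorted(students, key=lambda s: (s["sex"], -s["grade"]))

print(by_hemoglobin[0]["initial"], by_hemoglobin[-1]["initial"])
print([s["initial"] for s in by_sex_then_grade])
```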
Calculations on spreadsheet

Filter – allows viewing specific data; can help recode numeric data
• Data
  • Select data
  • Filter (or click the funnel-shaped icon); arrow buttons will appear beside each variable to view data arranged in ascending or descending order
• If using Office 2010 or higher, click Custom AutoFilter
  • Define criteria (if equal, greater than, less than…)
• Always close the filter dialog to restore display of all data
Calculations on spreadsheet

• Excel functions
  • Click the function icon just above the spreadsheet heading
  • When the selection box appears, choose the desired function, highlight the data, and fill in the information box before you click OK
Installing the Analysis ToolPak

• Click the MS Office icon at the upper-left corner of the spreadsheet. Two buttons will appear: Excel Options and Exit Excel. Click Excel Options
• When the dialog box opens, click Add-ins; click Analysis ToolPak and Analysis ToolPak - VBA to install
• Go to the Formulas menu to check if the Fx function is installed
Data Analysis
• A procedure that involves computation of the desired indicators of the study
  • Indicator – computed or collated collective characteristics of the individuals that comprise the study population; e.g., weight-for-age as an indicator of the nutritional status of children
  • Parameter – indicators derived from the entire population
• The output of data analysis enables the researcher to interpret the results of the study to answer the sub-problems
Data Analysis
• Consider the research questions/objectives of the study
  • To describe – use appropriate descriptive statistics
  • To compare – use comparison statistics
  • To correlate / determine relationship – use statistics for correlation or association


Statistical techniques

• Methods and tools used in analyzing and interpreting data

Types of statistical techniques

• 1. Descriptive statistics – describe the nature and characteristics of an event
  • Examples: mean, median, mode, standard deviation, variance, range
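Each of the descriptive statistics named above is available in Python's standard `statistics` module. The sketch below applies them to the seven hemoglobin values from the spreadsheet example; treating the values as a sample (so `stdev` and `variance` divide by n - 1) is an assumption.

```python
import statistics

hemoglobin = [11.0, 11.0, 12.5, 10.0, 12.0, 10.5, 11.1]

mean = statistics.mean(hemoglobin)
median = statistics.median(hemoglobin)
mode = statistics.mode(hemoglobin)
sd = statistics.stdev(hemoglobin)            # sample standard deviation
variance = statistics.variance(hemoglobin)   # square of the sample sd
value_range = max(hemoglobin) - min(hemoglobin)

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"sd={sd:.2f} variance={variance:.2f} range={value_range}")
```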
Statistical techniques

• 2. Inferential statistics – involves tests of significance based on hypotheses
  • The results can be generalized to the population if the sample is representative
  • Examples: tests of comparison, correlation, regression
Descriptive statistics

1. Measures of central tendency

• Mean – average value of the variable
  • μ – symbol for the population mean
  • x̄ – symbol for the sample mean
  • In tables, mean ± sd should always be given
• Median – middlemost observation
  • The measure of dispersion reported with the median is usually Q1–Q3
• Mode – most frequently occurring value
Descriptive statistics

2. Measures of variability or dispersion

• Standard deviation (sd) – a measure of the typical distance of the observations from the mean
  • The higher the sd, the greater the spread of the distribution
• Variance – square of the standard deviation
• Range – numerical difference between the highest and lowest scores
• Percentile rank – indicates the percentage of scores at or below a particular score
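Percentile rank follows directly from its definition: count the scores at or below the given score and express the count as a percentage. The function name is hypothetical; the grades are those of the seven students in the spreadsheet example.

```python
def percentile_rank(scores, score):
    """Percentage of scores that are at or below the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

grades = [90, 95, 88, 91, 86, 85, 89]

print(round(percentile_rank(grades, 89), 1))   # 4 of 7 scores are <= 89
print(percentile_rank(grades, 95))             # the top score
```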
Shapes of distributions

• Negatively skewed – most values lie in the top half of the range, with values "tailing" at the left side of the distribution
• Positively skewed – values tail to the right of the distribution
• Uniform – values spread evenly across the whole range
• Mound or humped – symmetrical shape (normal distribution)
Examples of distributions
Implications concerning the shape of the distribution of data

• Normal distribution
  • Use parametric tests
  • Use mean, sd or SE (standard error) for descriptive statistics
• Non-normal distributions
  • Use non-parametric statistics or distribution-free tests, or transform the data
  • Use median, Q1-Q3 for descriptive statistics
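The reporting rule above can be sketched as a small helper that returns either mean ± sd or median (Q1-Q3). The function name and the `normal` flag are hypothetical; deciding whether data are normally distributed is left to the analyst, and quartiles are computed with the default method of `statistics.quantiles`, which may differ slightly from other software.

```python
import statistics

def summarize(values, normal):
    """Report mean ± sd for normal data, otherwise median (Q1-Q3)."""
    if normal:
        return f"{statistics.mean(values):.2f} ± {statistics.stdev(values):.2f}"
    q1, q2, q3 = statistics.quantiles(values, n=4)   # quartile cut points
    return f"{q2:.2f} ({q1:.2f}-{q3:.2f})"

hemoglobin = [11.0, 11.0, 12.5, 10.0, 12.0, 10.5, 11.1]
print(summarize(hemoglobin, normal=True))
print(summarize(hemoglobin, normal=False))
```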
Two categories of statistics in inferential studies
• Use of inferential statistics allows researchers to draw inferences about the population
• Parametric statistics – can be used when:
  • Data are normally distributed
  • Sample size is adequate
    • Rule of thumb (Drew, Hardman & Hosp, 2008): samples of fewer than 15 per group should not be analyzed with parametric tests
  • Data are of interval or ratio type
  • Variances are equal (homoscedasticity)
  • Advantage of parametric tests – more powerful than non-parametric tests (more likely to detect a significant difference)
• Non-parametric tests – also called distribution-free tests because they do not assume normality of distribution; also used when the sample is small
Statistical tests

Type of problem                 Interval/ratio data                  Ordinal data                         Nominal data
A. Comparison of parameters or indicators
1. Single population            Z-test; t-test                       Kolmogorov-Smirnov one-sample test   Chi-square test
2. Two populations
   a. Related samples           Paired t-test                        Wilcoxon matched-pairs               McNemar test
                                                                     signed-ranks test
   b. Independent samples       Independent t-test                   Mann-Whitney U test                  Fisher's exact probability test;
                                (Student's t-test)                                                        Chi-square test
3. Three or more groups
   a. Related samples           F-test: two-way ANOVA                Friedman's two-way ANOVA             Cochran's Q test
                                or repeated measures
Statistical tests

Type of problem                 Interval/ratio data                  Ordinal data                         Nominal data
3. Three or more groups (cont.)
   b. Independent samples       F-test: one-way ANOVA                Kruskal-Wallis one-way ANOVA         Chi-square test
B. Study of relationship        Regression; correlation              Spearman rank correlation            Kappa test; contingency
   between variables                                                 coefficient                          coefficient test
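The two-slide table of statistical tests can be held as a lookup keyed by type of problem and scale of measurement. This is only a transcription sketch: the key names and the function `suggest_tests` are hypothetical, the relationship row follows one reading of the flattened table, and the table is a starting point for choosing a test, not a substitute for checking each test's assumptions.

```python
# Transcribed from the statistical tests table; key names are hypothetical
TESTS = {
    "single population": {
        "interval/ratio": ["Z-test", "t-test"],
        "ordinal": ["Kolmogorov-Smirnov one-sample test"],
        "nominal": ["Chi-square test"],
    },
    "two related samples": {
        "interval/ratio": ["Paired t-test"],
        "ordinal": ["Wilcoxon matched-pairs signed-ranks test"],
        "nominal": ["McNemar test"],
    },
    "two independent samples": {
        "interval/ratio": ["Independent t-test (Student's t-test)"],
        "ordinal": ["Mann-Whitney U test"],
        "nominal": ["Fisher's exact probability test", "Chi-square test"],
    },
    "three or more related samples": {
        "interval/ratio": ["Two-way ANOVA or repeated measures (F-test)"],
        "ordinal": ["Friedman's two-way ANOVA"],
        "nominal": ["Cochran's Q test"],
    },
    "three or more independent samples": {
        "interval/ratio": ["One-way ANOVA (F-test)"],
        "ordinal": ["Kruskal-Wallis one-way ANOVA"],
        "nominal": ["Chi-square test"],
    },
    "relationship between variables": {
        "interval/ratio": ["Regression", "Correlation"],
        "ordinal": ["Spearman rank correlation coefficient"],
        "nominal": ["Kappa test", "Contingency coefficient test"],
    },
}

def suggest_tests(problem, scale):
    """Look up candidate tests for a type of problem and a scale of measurement."""
    return TESTS[problem][scale]

print(suggest_tests("two independent samples", "ordinal"))
```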
End of presentation
Thank you!
