
Research Methodology

Research is defined as the systematic investigation of an idea to find the truth through the scientific method, in the interest of society. Research is characterized by a research question.

Qualities of a Researcher
✔ Finding/searching/developing/compiling something new
✔ Verifying theories or concepts, or advancing old concepts
✔ Using the scientific method

Why Yoga research


The yogic texts contain a great deal of knowledge experienced by the great sages. But it is qualitative and needs to be reconfirmed; the phenomena need to be explained with scientific validity. We cannot demonstrate this merely by reading books: we have a responsibility to show the scientific evidence for yoga.

Qualities of a good yoga researcher


✔ Must relate to the core area of research
✔ Must practice the higher practices and experience their effects
✔ Application of yoga practices, with module development including limitations and benefits
✔ Compare science and scripture, e.g. scriptural research
✔ Keep up with technological updates and incorporate those tools
✔ Experimental verification through the scientific method
✔ Eye for detail and genuine interest in people. Be compassionate and curious. Faith in the
project. Honesty and sincerity. Turn adversity into opportunity. Managerial skills, i.e. good
with people and having lots of patience.
✔ Must know the yogic texts, anatomy and physiology

Types of research

Classification is done for the sake of convenience. Various permutations and combinations are possible, and some categories may overlap. The main point is good, concrete evidence which is clean. It is better to focus on quantitative research.

On the basis of Purpose


1) Basic- The why and how of a phenomenon. We look at fundamental research questions or just the mechanism. Novelty is topmost. E.g. what happens with right nostril breathing.
2) Applied- Here we see if we can use basic research for the societal good. E.g. right nostril breathing for weight loss.

On the basis of Technique


1) Experimental- Choose a sample, collect data and analyse. Here we experiment and prove with evidence.
2) Theoretical- Conceptual theory framework.

On the basis of Methodology


1) Qualitative- Experiential and non-numerical
2) Quantitative- Numerical, involving statistical analysis
3) Mixed- Mixture of both qualitative and quantitative
Research Process
The research process is a 7-step process spread over 3 phases:

1) Methodology (Planning)
a) Research question- Doable, unique, socially relevant
b) Literature review- To see what has been done and what needs to be done
c) Design- Experimental plan/blueprint

2) Techniques of data collection (Execution/Implementation/Action)


d) Data collection- The actual process

3) Statistical analysis (Analysis/Post-mortem)


e) Analyse & interpret
f) Infer and draw conclusions- This is where you become aware of the information you derive.
g) Dissemination- Publication in journals and presentations

The time taken for each phase may differ. After data collection the experiment is dead and must be revived during the post-mortem. Analysis is also part of planning, as it is not necessary to have data in hand to plan the analysis. Phases 1) Methodology and 3) Statistical analysis are like two wings and must both be strong.

Phase 1- a) Research Question


a) Research Question- A question intended to be solved scientifically.

Qualities of a research question


1) Doable (Time bound, Tangible)
2) Novel/Unique
3) Socially relevant

Scientific Method v/s Non Scientific method

Scientific Method (fact based) | Non-Scientific Method (fallacies/inaccuracies)
Empirical and observational, i.e. Pratyaksha Pramana | Intuitions and beliefs
Objective- true for all | Consensus, i.e. the majority decides
Repeatable- true for all conditions and times | Authority, e.g. some established person says that sheersasana improves memory
Transparent | Casual observation/pattern
Falsifiable- a claim that cannot be falsified (e.g. beliefs about god) cannot be tested scientifically | Informal logic (two students bunked class, so both must be friends)
Logically consistent | Subjective
Types of Research question

1) Descriptive- Here we describe things as they are; the results are observational. E.g. the number of males and females in MSc, or the prevalence of diabetes in India.
2) Correlational- Finding the relationship between two variables, e.g. creative rangoli-making and being female.
3) Causal- Identifies the cause-and-effect relationship between two variables. They are one-on-one, e.g. a punch and pain.

Correlation is not necessarily causal, but causal is always correlational. E.g. one person introduces new flavours of ice cream in the market, and the demand for ice cream goes up. He comes to the conclusion that the demand has gone up only because of him, but it could also be due to the summer heat. Therefore many factors affect a correlation, whereas a causal relation is definite. E.g. only trataka can improve eyesight; or there are two groups, one does trataka and one doesn't.

Phase 1 b) Literature Review

It takes the longest time in research. It is the systematic review of existing knowledge, with critical appraisal, and summarizing it to see what has been done and what has not. We can save time by learning from others' experience and also ensure no duplication. We can also get information from experts in the field.

Why Literature review


✔ To know the novelty (what has been done and what needs to be done)
✔ To know the field and the people
✔ To get methodological insights, e.g. tools and methods used
✔ To avoid plagiarism (this comes in reporting- giving credit for information avoids plagiarism, or just paraphrase in your own words)
✔ To improve the timeline and technique

When Literature review


It is done in all phases; of course the magnitude will be different.
Phase 1- 80%- To ensure novelty.
Phase 2- 5%- To handle practical challenges and contemporary problems; midway correction.
Phase 3- 15%- To compare our research with previous ones.
Since new developments come at a moment's notice, we should do literature review throughout all 3 phases.

How Literature review

We need to search the relevant information sources, i.e. discipline-specific databases. They can be categorized under two headings:

Offline (physical)
Books, journals, libraries, experts, manuscripts/palm leaves

Online
Category | Sources
Medical | PubMed
Social network of researchers | ResearchGate
Psychology | PsycINFO, APA PsycNet
Social science, nature, engineering | ScienceDirect (paid), Sci-hub.cc (can search by PMID, DOI or URL)
Education | ERIC
Others | Google Scholar, Shodhganga (UGC), NDL (open source), ProQuest, Libraryofyoga.com (SVYASA digital repository), Coursera (free online courses with tests and quizzes)

Steps of literature review

The process of literature review is as follows:

1) Define what you want to search- The objective of the literature review, e.g. find the diabetic population among the adult population.
2) Look for secondary sources of information- A source is the origin from which you get the information. For secondary sources, authenticity cannot be guaranteed; they are not first-hand, e.g. books, newspapers, Wikipedia, Google. But they are good to start with.
3) Primary sources- They are more original and authentic than secondary sources, but also more technical; they can be understood by experts only. We can use primary sources to cite the information and secondary sources to find it. Peer-reviewed scientific journals are 99% authentic, as submissions go through peer review where they are either accepted or returned with objections. Peer-reviewed journals are published only after validating correctness; it is very rigorous.
4) Choose the appropriate database- E.g. online or offline.
5) Keyword selection- People should know how I got the information; an important thing in research is reproducibility.
6) Collect- Read all the information; if it is too large, refine the search or use more specific keywords. Look at current developments and updated information, and focus on recent literature. For a historical review we can go further back; for applied research, e.g. a new disease, go for recent literature. We should at most go for 3 years or 5 years, and not more than 10 years, of literature. The exception is classical papers, which can always be quoted. Outdated information is not useful.
7) Organize- Organize your data as per various classifications or levels, sorting it under appropriate headings. There is a software called Mendeley which helps in organizing data; it is used as a reference manager.

Organizing involves personal creativity. We can use the APA style of referencing. Keep the article list in Excel:

Author | Title | Design | Measurement tool | Result | Conclusion
Sharma | Yoga & Children | Group | Memory & attention scale | Increase in memory by 20% |

8) Critically read, appraise and summarize- We read and summarize 10 pages of information into 2 to 3 pages.

Types of Literature

1) Original articles
2) Single case studies
3) Review articles- of 3 types:
a) Narrative review- Written by experts in the field. Bias is possible even though we respect the experience of the teacher.
b) Systematic review- You choose the filters for articles; there is a QC of the data.
c) Meta-analysis- Here we do a systematic review and also statistically combine/pool the effect sizes. It shows the strength of evidence (Pratyaksha pramana). The reliability of evidence is called the strength of evidence; it is highest in meta-analysis as compared to systematic and narrative reviews. E.g. review articles (experts come together and write), using statistical tools to derive an effect size.

Effect size- It is the effect brought about by the intervention; the size of the effect is the effect size, expressed numerically. A meta-analysis summates the effect sizes of all studies and brings out a common effect size. E.g. out of 100 values, we would get one number showing the average of all the effect sizes.

In a meta-analysis, articles are systematically analyzed and the reviews are pooled, involving rigorous statistics. Even if the domain is the same (e.g. diabetes), the large body of data still needs to be organized to extract the effect size.

Through meta-analysis (a process) we get an effect size (a number). It is for practical purposes only, not for theory. It is like a numerical survey, and statistically rigorous. It is done on review papers with an experimental basis.
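A minimal Python sketch of how an effect size (Cohen's d, a standardized mean difference) can be computed, and how effect sizes from several studies might be crudely pooled. All scores and study values here are hypothetical, and real meta-analyses use inverse-variance weighting rather than this simple sample-size weighting.

import math

def cohens_d(group1, group2):
    """Standardized mean difference (Cohen's d) between two groups."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    # Sample variances (N - 1 in the denominator)
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Hypothetical memory scores for a yoga group and a control group
yoga    = [72, 75, 70, 78, 74]
control = [65, 68, 66, 70, 64]
print(round(cohens_d(yoga, control), 2))  # positive d => yoga scored higher

# Crude pooling of effect sizes from several hypothetical studies,
# weighted by sample size (real meta-analysis uses inverse-variance weights)
studies = [(0.45, 30), (0.60, 50), (0.20, 40)]   # (effect size, N) per study
pooled = sum(d * n for d, n in studies) / sum(n for _, n in studies)
print(round(pooled, 2))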

Phase 1 c) Experimental Plan/Study Design


Here we focus on design. The heart of design is to reduce noise and increase signal. Since the signal is god-given, we must put more focus on reducing the noise. The main objective of research design is to reduce the noise (CFs) and improve the signal. Confounding factors (CFs) are the unwanted variables which affect the dependent variable even though the researcher is not interested in studying them.
In research design, we have full freedom to enhance rigour and reduce CFs.
The main question is: how can we measure the effect correctly? The purpose of research is to capture the signal and reduce the noise. E.g. there will be a difference between singing in a beautiful garden and at a bus stand.
For instance, with 1 week, 1 month or 1 hour of yoga there will be a different effect each time. The effect is a gift of God and is different each time.
We can only secure the signal after the DV is fixed; we have more freedom in reducing the noise. We must focus on the eye of the fish, like Arjuna.
         CF 1
           |
IV ------------------------------------- DV
           |
         CF 2
Therefore, we must reduce the CFs and enhance the rigour of our design. This can be done by increasing all types of validity.
Population & Sample
Research is divided into two areas:
1) Methodology in research (concepts, terminology)
2) Statistics (deriving meaningful information)

All people eligible to participate in a research study are called the population. The population is the group of people defined by our research question; e.g. for MSc students and the effect of Bhajan, all MSc students who attend Bhajan become the population.

A small subset of the big population is a sample. A sample is necessary because it is not possible to take the whole population; the resources required would be too large and unaffordable. So we take a small sample from the population.
H(a) = Effect of yoga on type II diabetes in Bangalore
Population = Number of people with type II diabetes in Bangalore = 12,10,200
Sample = A group of 300 people from the population.
How should we select the sample to ensure that it is a true representative of the whole population? If the sample is a true representative, the probability of valid generalization increases.
The process of selecting subjects randomly from the population is called random sampling. If we select the people randomly, the sample is called a true representative of the population; it is done before selecting people. The process of dividing the sample into groups, on the other hand, is called randomization; it is done after selecting people. If done randomly, we ensure that the noise/disturbing factors are spread equally across all groups.

The process of selecting a sample from the population is called sampling.

Types of Sampling
1) Probability sampling (random sampling)- Each person has an equal chance of participating in the study. This is done to have representativeness of the population in the sample. It is of four types (a minimal sketch of all four follows the list).
a) Simple random sampling- First we list the whole population (say 2,20,100 people) and then, using a tool such as Randomizer.org, we randomly pick 300 people. Each and every individual in the population has an equal chance of participating in the study. The electronic method is quicker and simpler than the chit method/lottery system.
b) Systematic random sampling- Here we select people systematically. The choice of the 1st person is random; after that it becomes systematic, but uniformity should be there in the selection. See below, e.g. a gap of 5 between selections:

17 22 27 32 37 42 47 52

c) Stratified random sampling- When you know the nature of the population in advance, select layers/strata of the sample based on the same pre-known proportion. E.g. you know colourblind men : colourblind women = 3:1. If we know this, we keep that same proportion in our study, as gender is affecting the study. So e.g. we take 75 males and 25 females in a sample of 100. E.g. smartphone addiction in rural and urban areas will be different, so we take a higher proportion of subjects from urban areas. We fix this proportion from previous literature.
Here we are selecting subjects, whereas in matching we allocate them. Matching (implementation level) is done after stratified sampling (planning level).

d) Cluster sampling- Here we create blocks or clusters over a huge geographic region. When we select clusters randomly, it is called cluster random sampling. E.g. make many small clusters and select a small number of people from each.
In case the density of population is higher in some clusters, we can combine this with stratified random sampling. It is then a combination of two probability sampling methods: the primary one is cluster and the secondary one is stratified.
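Below is a minimal Python sketch of the four probability sampling schemes. The population sizes, subject IDs and cluster names are made up for illustration.

import random

population = list(range(1, 2201))   # toy population of 2,200 people (IDs)

# a) Simple random sampling: every individual has an equal chance
simple = random.sample(population, 300)

# b) Systematic random sampling: random start, then a fixed gap
k = len(population) // 300                      # sampling interval
start = random.randrange(k)                     # random first pick
systematic = population[start::k][:300]

# c) Stratified random sampling: sample each stratum in a known proportion,
#    e.g. 3:1 males to females in a sample of 100
males   = [f"M{i}" for i in range(1000)]
females = [f"F{i}" for i in range(1000)]
stratified = random.sample(males, 75) + random.sample(females, 25)

# d) Cluster random sampling: randomly pick whole clusters, then people in them
clusters = {c: [f"{c}-{i}" for i in range(100)] for c in "ABCDEFGH"}
chosen_clusters = random.sample(list(clusters), 3)
cluster_sample = [p for c in chosen_clusters
                  for p in random.sample(clusters[c], 20)]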

2) Non-probability sampling (non-random)- This is done when the researcher is not interested in generalization of the study, or when true probability sampling is not possible. The results cannot be generalized as in probability sampling, since the sample is not a true representative; that is, each individual doesn't have an equal chance of participating in the study. Also, the bias is more than in probability sampling.

It is of five types.

1) Convenience sampling- We put a notice on the notice board and whoever is interested can come and join the experiment. But if a person does not come to see the notice board, he will not know. The way out is to take more people. 80% of researchers do convenience sampling: you conveniently recruit people into the study, and you choose the method.

Note: In the case of multi-centric studies, e.g. where you study subjects in two yoga universities, the external validity will be increased.

2) Cluster sampling- When you have large areas, you divide and select, e.g. a survey.
3) Snowball sampling- Here we give a presentation and then ask participants to answer the survey. It spreads like word of mouth: if people like it, they tell others. A snowball becomes big as it rolls in the snow.
4) Quota sampling- I need 300 males and 300 females. As soon as I get that number, I stop recruiting more sample.
5) Purposive sampling- E.g. effect of meditation on the mind: great yogis with more than 10 years of meditation experience will be taken. The purpose is well defined, and we select the target population as per that purpose. The purpose needs to be very unique/special.
Note: The bias at recruitment level may be because of following non probability sampling.

Concept of Validity
Validity is the concept of a thing serving the purpose it is supposed to serve. Before we study the types of validity, it is important to know about the types of variables.
A variable is something which is not constant; its scores or values change, e.g. age.

There are three types of Variables.

Independent Variable (IV) | Confounding Variable (CV) | Dependent Variable (DV)
Cause |  | Effect
Effect of Yoga | Brahmi tablet | on Memory

(IV)- It is cause-related. It is decided and manipulated by the researcher. It indicates the cause of the effect; it tells the story about the cause.

(DV)- It is effect-related. After the IV is fixed, what changes occur in the dependent variable? It shows the effect of the intervention.

(CV)- It is unwanted noise. The researcher is not interested in studying the noise, but it affects the dependent variable.

E.g.
IV:
1) Frequency of yoga practice- daily / once a week / twice a week
2) Duration of yoga- 1 hour

The effect (DV) will be more if the duration and frequency increase. A CV here would be e.g. a memory tablet taken alongside.
Now let us discuss the types of validity.
1) Internal validity- A research design has high internal validity if the CV influence is low, i.e. only the IV causes/influences the DV, and the CV does not influence the DV.
2) External validity- It refers to the generalizability of results across all locations, meaning the result is not restricted to a small group or place. We might also say that our research holds for all conditions. The more generalizable a research study is, the higher its external validity. E.g. yoga in Prashanti and yoga in Bengaluru are different.

We can assign values to validity:

(Low) 0 ---------------------- 1 (Best)

with any value in between.
3) Construct validity- There are two kinds of things. One is tangible, which can be perceived through the senses. The other is intangible, e.g. unperceivable emotions like happiness or depression; they are invisible.

A construct is something imaginary/unseen/intangible which you can't measure directly but which the researcher is interested in measuring.
First we define that imaginary concept. We are free to choose the definition we want. The operational tool itself is not important, but how we measure is important.

E.g. Anxiety

Construct Definition | Operational Definition
A state of mind where a person is: | It translates the construct in terms of your measuring variables:
1) Restless & disturbed | 1) The slowness of responding to the questions
2) Galvanic skin response (sweat) | 2) Higher galvanic skin response, greater anxiety
3) Stammering | 3) The more times a person stammers, the higher the anxiety

So the construct is measured at the physical level. We need to check whether the tool is really measuring the effect. We have to use secondary/imperfect methods, but we have no choice, as we need to measure.
E.g. instead of measuring the reaction time of a person with a stopwatch, we can use a measuring scale: one person drops the scale and the other person catches it.
Construct validity means: is the tool which we use to measure the construct really measuring the intended construct?

Important- All the above 3 types of validity are addressed before data collection.

4) Statistical validity- We need to use the right statistical test for the given design and situation. Did we use the right tool, and did we infer correctly from it? E.g. it is wrong to use MRI for a fever. This is assessed after data collection.

Types of Design
There are three types of research design based on randomization.

1) Experimental- Also called true experimental design- We allocate subjects randomly. There is a cause-and-effect relationship between variables. A randomized controlled trial is an example of an experimental design where we have control groups. Control groups don't participate in the intervention but all other conditions are the same; they help us quantify the confounding factors. Experimental design focuses more on CFs and how to control them.
2) Quasi-experimental design- Here multiple groups are there for comparison, but the second group is not a control group. It appears like experimental, but there is no randomization.
3) Non-experimental design- There are no multiple groups for comparison and no randomization at all. There is no cause and effect between variables.

Experimental design- The process of randomly distributing candidates in an experiment to control CFs. CFs produce bias, so we distribute the CFs equally between the two groups and leave it to God's will. It is best in the case where the researcher doesn't know the CFs and so cannot control them. E.g. in a memory experiment we wish to study growth hormone, but male and female hormone cycles are different. If I take only male subjects, then the study is not generalizable.

E.g. I study memory; IQ influences memory. But some people are intelligent and some are below average, and it is not practically and experimentally possible to control this. So we go for randomization.

Note: We must still study/measure IQ as it is a CF; we can use statistical tools to control it.

2) Quasi-experimental design- Quasi means 'like'. Here, roughly half the selection is random and half is not random.
3) Non-experimental design- It is of five types:
3.1) Correlational design
3.2) Cross-sectional design
3.3) Cohort
3.4) Case study
3.5) Survey

3.1) Correlational design- Find the relationship between two variables, e.g. participating in Bhajan and higher marks. Correlation can be positive, negative or zero.

E.g. of positive correlation: height and age of a baby
E.g. of negative correlation: age and interest in study
E.g. of zero correlation: shoe length and anxiety

-1 ---------------------- 0 ---------------------- +1

The measure of correlation lies between -1 and +1. The higher the magnitude, the higher the strength of the correlation.

There are two aspects of a correlation: sign and magnitude. When we talk about magnitude we ignore the sign, and vice versa. E.g. between -0.95 and 0.75, we conclude that -0.95 is stronger; and between 0.85 and 0.75, we conclude that 0.85 is stronger.

-1 (below average) ---------- 0 (ordinary) ---------- +1 (super)

Given two variables, anxiety and depression, we measure both and establish the correlation. This is a one-time assessment (a small computational sketch follows).
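A minimal Python sketch of how such a correlation could be computed (Pearson's r); the anxiety and depression scores are hypothetical.

import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical one-time assessment of anxiety and depression scores
anxiety    = [12, 18, 25, 30, 34]
depression = [10, 15, 22, 29, 35]
print(round(pearson_r(anxiety, depression), 2))  # close to +1 => strong positive
# When comparing strength, compare magnitudes, not signs: |-0.95| > |0.75|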

3.2) Cross-sectional design- It means taking a section of something. The conditions/capabilities or environment of the groups vary from each other. There are two distinct populations defined by a set of characteristics. Here we do the assessment only once, so it is not possible to ascertain the exact cause and effect. When the difference is apparent, you don't do a pre-test and you go for a cross-sectional design.

E.g.
City boys ---------------------------------- practice yoga
College boys -------------------------------- practice yoga

College boys chant -------------------------- 20-25 years age group
Gurukula boys chant ------------------------- 20-25 years age group

3.3) Case study- It contains detailed information about a person with special abilities. We document and present it. The information is extraordinary, and it is compiled and represented as a case study. E.g. yogis who stop their heartbeat for a long time or remain buried. It is difficult to measure such things, so we do a case study.
3.4) Survey- In a survey we find the nature of a population, e.g. how many smokers and non-smokers in Bengaluru, or how many teens have smartphone addiction and how many do not.
3.5) Cohort- The cross-sectional design is also known as a cohort design in the medical field ('cross-sectional' is more popular in psychology). In a cohort we describe the nature of the problem, e.g. people with radiation exposure and people with no radiation exposure both do yoga: what is the change in effect? Here randomization is not possible and there are more than 2 assessments.

People with radiation:    Pre (10 people) ------- Yoga ------- Post (10 people)

People without radiation: Pre (15 people) ------- Yoga ------- Post (15 people)

Important Note: A cohort study becomes quasi-experimental if there is no randomization. E.g. in radiation studies we can't randomize, as people with radiation exposure must be in one group and people without radiation exposure must be in another group.

Concept of Reliability

Reliability is a concept which expresses the consistency of a measurement. E.g. an object should show the same weight if measured twice. It is of four types.

1) Test-retest reliability- Here we see the correlation between the 1st test and the 2nd test. We conduct a retest and assume that nothing changes between the 1st test and the 2nd test. This type of reliability is a must for a new test. Test-retest is a measure of temporal stability, i.e. how stable the construct is. If the correlation is high (between 0 and 1), temporal stability is high; if the correlation is 0, temporal stability is not good.

Test (pre memory) -------------------------- Retest (post memory)

2) Split-half reliability- We use this for questionnaires, e.g. the General Health Questionnaire in psychology. We split the items of the questionnaire into two halves, e.g. odd- and even-numbered questions, or the first half and the second half of the questions, and then find the correlation between the two halves. The important point to note is that the domain of the questions must be the same.
3) Internal consistency (Cronbach's alpha)- This is like split-half, but we consider all the possible split combinations in the questionnaire, e.g. odd/even, first half/second half, etc. Suppose we get 150 kinds of splits: we find the correlation for all of them, then aggregate and state it. The result is called Cronbach's alpha, which is a common value, not a simple average; it is calculated by a formula (a sketch follows).
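A minimal Python sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the questionnaire scores are hypothetical.

def cronbach_alpha(items):
    """items: list of item-score lists, one inner list per question."""
    k = len(items)
    n = len(items[0])                       # number of respondents

    def var(xs):                            # sample variance (N-1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Hypothetical 4-item questionnaire answered by 5 people (scores 1-5)
q1 = [3, 4, 5, 2, 4]
q2 = [3, 5, 5, 1, 4]
q3 = [2, 4, 4, 2, 5]
q4 = [3, 4, 5, 2, 3]
print(round(cronbach_alpha([q1, q2, q3, q4]), 2))  # closer to 1 => more consistent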

4) Interrater reliability- Here several judges (say three) independently evaluate an event; they are called raters. We see whether the raters agree with each other's opinions, e.g. the judges in a Miss India event. We find the consistency of the marks among the three raters, e.g. 9, 9, 3: the correlation between the marks awarded by the judges. We look for general opinions which are common and high.

The choice of experimental design depends on the field.

Confounding Variable (CV)

It is a type of noise. All sources of bias introduce noise. If the confounding factors are more, the variability will also be more.

Types of CV.

1) The researcher does not know about it (e.g. diabetes patients eating sugar)- We use randomization to control this. The researcher assumes that the confounding factors (CFs) are distributed equally.
2) The researcher knows about it- It has two subtypes:

2.1) Knows and controls it experimentally, e.g. sorts people into dull and intelligent, e.g. by IQ.
2.2) Knows but cannot control it directly/experimentally, only by statistical methods, e.g. age, gender etc.

It is important to note that something will always remain unknown, which shows up as error.
Before we study examples of designs, it is important to know the subject-based classification:

1) Within (same subjects, different conditions)- 2 points of measurement only, and the same subjects participate.
2) Between (different subjects, different conditions)- 2 different groups, and different subjects participate.
3) Mixed (both 1 and 2 together)- Both 2 different groups and 2 points of measurement exist together.

Design Examples
One group pre-test post-test design- Only 'within' is possible. Example of a one-group pre-test/post-test:

30 people:
Measure Point 1 (before starting) -------- Yoga -------- Measure Point 2 (after the intervention)

Measure Point 2 minus Measure Point 1 = effect/change

But it lacks internal validity, as there is no control over confounding variables (e.g. eating junk food; or measuring weight at Prashanti vs. at home, which can be influenced by having no work at home and by luxury). We can't tell how much the CFs influenced the effect size.

There is another concept here called ANOVA (analysis of variance). When there are more than 2 groups, or more than 2 points of measurement, the design calls for ANOVA; the latter case is called RM ANOVA (repeated measures ANOVA). A one-group pre-test post-test design can also be called ANOVA if we have more than 2 measure points.

Measure Point 1 (before starting) ----- Yoga ----- Measure Point 2 (after the intervention) ----- Yoga ----- Measure Point 3 (after the intervention)
Important Note: The measurement at the pre-intervention point is called the baseline.

Two group pre-post design (one group can be a control group)- To enhance control over CFs we establish a control group. The control group helps distribute the CFs over the two groups and also helps quantify the noise. The control group and the main group (experimental group) are the same in all aspects/conditions except that the control group does not actively take part in the intervention. It can be of three types.

Effect size of experimental group ----- minus ----- effect size of control group = real effect size.

2.1) Two-group pre-post design- Mixed group- Refer to the example below.

30 people split into:
Experimental group: Pre (15 people) ------ 1 month yoga ------ Post (15 people)
Control group:      Pre (15 people) ---- no intervention ----- Post (15 people)

The control group helps us quantify the CFs and also spreads them equally among all groups. E.g. if the effect size of the control group is 10 and the effect size of the yoga group is 40, then 40 minus 10 = 30 is the real effect size (a small sketch of this calculation follows).
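A tiny Python sketch of this subtraction, with hypothetical pre/post memory scores chosen so the change scores come out as 40 and 10:

# Change score per group = mean(post) - mean(pre)
def change(pre, post):
    return sum(post) / len(post) - sum(pre) / len(pre)

# Hypothetical memory scores
yoga_pre,    yoga_post    = [50, 52, 48, 50], [90, 89, 91, 90]
control_pre, control_post = [49, 51, 50, 50], [60, 59, 61, 60]

yoga_change    = change(yoga_pre, yoga_post)        # signal + noise = 40
control_change = change(control_pre, control_post)  # noise (CFs) alone = 10
print(yoga_change - control_change)                 # real effect size = 30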

This is a mixed design, as four points of measurement are there:

1) Pre & post yoga- within group
2) Pre & post control group- within group
3) Pre yoga and pre control group- between group
4) Post yoga and post control group- between group

We need to analyse the pros and cons of each design.

2.2) Two-group pre-post design- Within group- 'Within group' refers to the same subjects under different conditions. It is very difficult to get a control group among patients in hospitals; we can't ask them to do nothing. So we do a week division: the 1st week patients do no yoga and the 2nd week they do yoga.

Yoga (30 people):
Measurement Pt 1 --- 1 week no yoga --- Measurement Pt 2 --- 1 week yoga --- Measurement Pt 3

The first-week no-yoga group is also called a waitlist control group.

2.3) Without control group, but a different comparison group- Mixed group- It is used for comparative research.

Yoga group:     Pre (15 people) ------ Yoga ------ Post (15 people)
Ayurveda group: Pre (15 people) ---- Ayurveda ---- Post (15 people)

Here ayurveda serves as an active control group, as it is not idle like a normal control group. We can compare what benefit we get in yoga relative to ayurveda.

In a two-group design we can also have ANOVA, if more than two measurement points are there.

Yoga group:     Pre (15 people) --- Yoga --- Post 1 (15 people) --- Yoga --- Post 2 (15 people)
Ayurveda group: Pre (15 people) --- Ayurveda --- Post 1 (15 people) --- Ayurveda --- Post 2 (15 people)

So we have studied three types of control group above:

1) Normal control group- everything the same as the experimental group except the intervention
2) Active control group- everything the same as the experimental group including an (alternative) intervention
3) Waitlist control group- the group does not do the intervention for a specific period but does it later.

3) Three-group pre-post test design- There are more than two groups in the research design. It can also be of two types.

3.1) With control group: Yoga | Ayurveda | Control

3.2) Without control group: Yoga | Ayurveda | Naturopathy

Designs with more than two groups are always ANOVA.

To study short-lived effects which are visible immediately after practice, we have a concept called the washout period, e.g. the study of Kevala Kumbhaka after Kapalbhati.

The longer the effect of the intervention lasts, the longer the washout period; the intervention's effect duration and the washout period are directly proportional.

Washout period- The gap between two successive practices, to ensure that the effect of the previous intervention has vanished. Here, the less the subject variability in the design, the better. Subject variability is less in a within-group design, as the same subjects participate in the different activities.

However, a washout design takes extra days for the gap period, so it is advised only for smaller sample sizes. Refer to the example below of left nostril (LN) breathing and right nostril (RN) breathing.

Pre (30 people) --- LN breathing --- Post (30 people) --- gap --- Pre 1 (30 people) --- RN breathing --- Post 1 (30 people)

This is to ensure that the right nostril effect is not influenced by the left nostril effect. We should be able to know the effect of each independently, so a gap is given to independently measure the effects of the two interventions.

Now some people will say: take right nostril first and not left nostril- it would have a different effect that way. This phenomenon is called the order effect. So we put in a control.

We divide the 30 people of the yoga group into two groups of 15 members each.

Group 1: Pre (15 people) --- Left Nostril --- Post (15 people) --- gap --- Pre 1 (15 people) --- Right Nostril --- Post 1 (15 people)
Group 2: Pre (15 people) --- Right Nostril --- Post (15 people) --- gap --- Pre 1 (15 people) --- Left Nostril --- Post 1 (15 people)

After 15 people do left nostril breathing, there is a gap and then they do right nostril breathing; in the same way, the other 15 people do right nostril breathing, there is a gap, and then they do left nostril breathing.

Important Note: The above design is categorized as a within-group design only, as the same set of people participates in all the activities. It is also RM ANOVA, as there are 4 assessment points.

Washout period = duration & strength of intervention + adequate gap
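A small Python sketch of how such a counterbalanced (cross-over) allocation could be generated; the subject IDs are hypothetical.

import random

subjects = [f"S{i:02d}" for i in range(1, 31)]  # 30 hypothetical subject IDs
random.shuffle(subjects)                         # random allocation to orders

# Counterbalancing to control the order effect: half the subjects do
# Left Nostril (LN) breathing first, the other half do Right Nostril (RN)
# first, with a washout gap between the two interventions.
ln_first = subjects[:15]   # pre - LN - post - washout - pre1 - RN - post1
rn_first = subjects[15:]   # pre - RN - post - washout - pre1 - LN - post1

print(ln_first)
print(rn_first)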

Types of design- Subject-based classification

Design based on randomization | Pure within-group design | Mixed group design | Mixed group design
Quasi | One group pre-test post-test | Two group pre-post test with active control group | Three group pre-test post-test with active control group
Experimental | One group pre-test post-test with waitlist control group | Two group pre-post test with normal control group | Three group pre-test post-test with normal control group
Non-experimental | NA | NA | NA

Ways of controlling confounding factors in research

Experimental ways to control CFs- the researcher knows the CFs and can control them experimentally.

Restriction- We restrict the people who can participate in the study: we define inclusion and exclusion criteria, e.g. we exclude old people from participating in the study. One of the strategies to avoid confounding is to restrict admission into the study to subjects who have the same levels of the confounding factors. E.g. H(a) tries to find the relation between physical activity and heart disease; suppose age and gender are the two CFs of concern. The CFs can then be avoided by making sure that all subjects are male and between 40-50 years of age.

30 people split into:
Experimental group: Male (40-50 yrs) ------ 1 month yoga ------ Male (40-50 yrs)
Control group:      Male (40-50 yrs) ---- no intervention ----- Male (40-50 yrs)

Limitations:

1) It reduces the number of eligible subjects (an issue for sample size).
2) Restriction limits generalizability, i.e. you can't apply the study to women and other age groups.

The restriction factor is a source of variability that is not of primary interest to the experimenter. This removes the effect of nuisance factors that can be controlled. Here we exclude people from the study itself; it means total removal. E.g. we say surya namaskar improves memory, but IQ is also a confounding factor which affects memory. So we say: if IQ > 140 or IQ < 30, those people will be excluded from the experiment itself. This is basically barring people from taking part in the study so that the condition doesn't exist.

30 people (30 < IQ < 140) split into:
Experimental group: Pre (15 people) ------ 1 month yoga ------ Post (15 people)
Control group:      Pre (15 people) ---- no intervention ----- Post (15 people)

Matching- Here we make sure that the influence of the CFs is equally distributed. There is no random allocation in this; it is done only for quasi-experiments, as the researcher intentionally chooses samples according to the CFs, e.g. gender. Gender will have an effect, so we have equal males and females in both groups; the source of the CFs is thereby equally distributed across both groups. There must be at least two groups; one group can be a control group or another comparison group. It appears like blocking, but randomization is not there. It is done at the implementation stage.

Experimental group: 15 male + 15 female ------ 1 month yoga ------ 15 male + 15 female
Control group:      15 male + 15 female ---- no intervention ----- 15 male + 15 female

E.g. an age-matched study

Here we divide the number of subjects as per the source of the CF, i.e. age:

Age group | Experimental group | Control group
30-40 | 20 | 20
40-50 | 15 | 15
50-60 | 10 | 10

Blinding- To be discussed later.

Cross-over design- When there is an order effect, we do a cross-over design. This has already been discussed.

If a quasi-experiment is there, go for matching: there is no randomization, but control is there. If a true experiment is there, go for blocking: randomization is there. Sometimes we need to go for a quasi-experiment as we are working with a small sample size. This also depends on the nature of the research.
of research.
Statistical methods of controlling variance
Blocking- Blocking is something which you do at the level of analysis. You create homogeneous blocks for the various CFs and handle them statistically at the analysis level; it is more statistical. After conducting the experiment, we analyse these homogeneous blocks separately. Blocking goes hand in hand with randomization.

E.g. male and female, or low IQ and high IQ. During the experiment stage we do the experiment normally, but we make two blocks at the analysis/statistics level, e.g. by IQ (low IQ and high IQ). These blocks are homogeneous in that particular CF. We give a test to figure out the IQ and then segregate at the analysis level. Strictly, it is not an experimental but a statistical way of control; however, we put it under experimental methods as we have to plan it in advance.

'Block what you can and randomize what you cannot' is the thumb rule.

ANCOVA- It is analysis of covariance. It controls for a covariate. A covariate is a special variable which you measure in an experiment in order to control for that variable later (statistically) in the experiment. E.g. IQ: I must know the information about people's IQ, so I measure IQ and keep it as a covariate in the experiment. I cannot exclude a person experimentally, so I measure the variable in order to control it statistically.

IQ = covariate; we control IQ using ANCOVA.

We can have any number of covariates. ANCOVA can be done on 2 groups; a minimal sketch follows.
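A minimal ANCOVA sketch, assuming the pandas and statsmodels libraries are available; the data frame below is entirely hypothetical.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: post-test memory score, group label, and IQ as covariate
df = pd.DataFrame({
    "post":  [78, 82, 75, 90, 66, 70, 64, 72],
    "group": ["yoga", "yoga", "yoga", "yoga",
              "control", "control", "control", "control"],
    "iq":    [110, 120, 105, 130, 112, 118, 104, 125],
})

# post ~ group + iq: the group effect is estimated after
# statistically controlling for IQ (the covariate)
model = smf.ols("post ~ C(group) + iq", data=df).fit()
print(model.summary())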

Note: Control of variability is not possible in a non-experimental design. A true experiment has maximum internal validity, whereas a quasi-experiment has middling internal validity.

Note: Two concepts which relate closely to single-blind studies are placebo and nocebo. Placebo means a psychological effect which brings healing without actually taking the medicine, e.g. a patient getting cured by taking a sugar pill which looks like the tablet, on the doctor's advice. Nocebo means negative expectations resulting in a negative outcome.

Types of Biases

Bias and CFs go together: biases give rise to CFs, and both are part of the same phenomenon. There are seven types of biases:

1) Researcher bias- The researcher is biased against one group and treats the other group well, e.g. gives good food to one group while ignoring the other.

2) Subject bias- This is where the subject modifies their responses to influence the outcome, especially during interviews.

3) Selection bias- It refers to gathering the sample from a special group; bias at the time of selection. E.g. I gather the sample from a gurukula to show the effect of yoga in a better light. Non-probability sampling techniques contribute to this.
4) Recall bias- Sometimes it is very difficult to recall, e.g. a doctor asking a patient how many mood swings occurred in the last 6 months, or how many times you smoke in a day. If the current mood is good, it is very difficult to recall past bad moods; the recall is biased by the present psychological state. It refers to over-estimation or under-estimation of previous experience.
5) Observer/measurement bias- Making errors while taking or measuring data or doing observation, e.g. taking a wrong BP reading while on the phone.
6) Instrument bias- It refers to error in the tools; the instrument is not calibrated well. Calibration means the instrument must show the 0 point when nothing is being measured.
7) Publication bias- It refers to wrongly reporting results or not reporting negative results. If we only report positive and hide negative results, bias will come. E.g. in a study, 3 measures showed a significant change while one didn't, so we reported only the 3. It also results in the file-drawer problem (the piling up of unreported studies).

A good researcher uses randomization to take care of the confounding variables.

Ways to control Bias

Blinding- It is of three types.

Single-blind study- When a participant doesn't know which group he is in, it is a single-blind study. It is to control participant bias, e.g. a patient lying to the doctor that he is fine when he has a fever. Refer to the note on placebo and nocebo above.
Double-blind study- When both the experimenter and the participant don't know which group the persons belong to, it is called a double-blind study. A third person codes the participants into groups secretly. It is to control the experimenter's bias, e.g. the experimenter gives extra badam to one group, or one extra yoga session to one group.
Triple-blind study- The person doing the analysis also does not know which group the people belong to; they are given codes. The statistician will also not know; the third person is objective and neutral, so bias is minimal. This is to control statistical bias, e.g. statistically making one group look superior through the scoring.

Types of measures:
1) Prospective- Forward/future-oriented, i.e. upcoming measurements of patients, e.g. after the intervention. Here also we can measure repeatedly, say every year.
2) Retrospective- E.g. 2 years of case history of a patient in a hospital; also called archival. E.g. the last 10 years' trends or events. We go back.
3) Longitudinal- 70-80 years; we follow subjects over a long span.
3.1) Twin studies concept- We trace persons through their life span. E.g. 2 twins: everything is the same but the environment is different. We study lifespan development, psychology and human brain development from embryo till death. In short it is nature vs. nurture, or genetic vs. environmental factors.

Phase 3- Analysis
We analyse in order to conclude and generalize. First it is important to know the levels of measurement.

Levels of measurement- classification of the values of variables

Based on the nature of the values of the variables, we classify them into four groups. This classification is called the levels of measurement; it tells us the nature of the scores.
1) Categorical variables- of two subtypes:
a) Nominal
b) Ordinal
Nominal is the lowest level of measurement. Here we describe names or categories, e.g. gender, colour, religion. They don't imply any ordering among responses: e.g. when classifying people according to their favourite colour, there is no sense in which green is placed ahead of blue. E.g. jersey numbers of cricketers are at the nominal level; a player with number 30 is not more of anything than a player with number 15.

Ordinal- Here we order in ascending or descending order, e.g. ranks of scores. However, the distance between the attributes doesn't have any meaning and is not interpretable. E.g. first class is better than economy, and business class is in between; but the business-to-economy difference varies from flight to flight and from airline to airline.

2) Continuous variables- of two subtypes:
a) Interval
b) Ratio

Interval- Here there is a constant distance between two adjacent categories, because the numbers can be expressed as quantities. E.g. the amount you pay for a plane ticket, the degrees Fahrenheit at your destination, the number of flying miles. E.g. a temperature scale, as the gap between adjacent points is uniform. But it doesn't have a true zero point: zero degrees Celsius does not mean the absence of temperature, and therefore it can't be used for ratios.

Ratio- The ratio scale is the most informative scale. It is an interval scale with the additional property that its zero position indicates the absence of the quantity being measured; there is always a meaningful absolute zero. This means we can construct a meaningful ratio with a ratio variable. The criterion is that at zero there should be a total absence of the property. E.g. height and weight have true zeros, so we can say that person A weighing 50 kg is twice as heavy as person B weighing 25 kg. In contrast, there is no absolute zero on the Celsius scale, so no ratio is possible for temperature.

It is important to know that there is a hierarchy in the levels-of-measurement idea. At each level up the hierarchy, the current level includes all the qualities of the one below and adds something new- like General, Sleeper, 3rd AC and 2nd AC. It is better to have a higher level of measurement (interval or ratio) than a lower one (nominal or ordinal). We must classify each variable as per its nature. A higher level of measurement can be converted into a lower one.

Note: Refer to socialresearchmethods.net for more details.

Two types of statistics


The result or information we get from the sample is called a sample statistic; e.g. the mean and standard deviation are sample statistics. We want to infer/generalize the result from the sample to the whole population. This process is called the inferential process (Anumana): we try to infer population parameters from the sample statistics derived from the sample.

1) Descriptive- It describes something about the sample. We have information in hand only about the sample.
2) Inferential- Statistics which help us do the process of inference (from sample to population). Here we assume certain conditions about the sampling distribution. Ideally we would repeat the experiment many times to increase validity, but that is not possible because of the resource crunch.

Under descriptive statistics there are two subtypes:

1.1) Measures of central tendency- Something which tells about the central theme; it tells about the signal. E.g. I conduct an experiment- "Did you like breakfast?"- and all the values are different. But we are interested in getting just one representative value through which I get the maximum information/story about the sample.

The representative values so calculated are called the triple M (mean, median and mode).

Mean
We can't take all the values, but we are interested in one value which tells the maximum story about the sample. So we add all the values and divide by the total number of values. E.g. in the table below the mean is 3, as it is the most neutral value; 1 is the low extreme and 5 is the high extreme. Mean = sum of all values (1+2+3+4+5) / total number of values (5) = 3. Mean has a nickname, i.e. average.

Data
1
2
3
4
5

However, the mean has a limitation: it is affected by big/eccentric values (large or small). E.g. the mean of the data below would be 102 (510/5), which is distorted by the big value, 500.

Data
1
2
3
4
500

In the case of extreme values, the mean is not a true representative.

Median

In the case of the median, we first arrange all the scores in ascending or descending order. After arranging the data, we pick the middlemost value. In case the scores are even in number, we take the average of the 2 middlemost values. The median is used where there are extreme values in the data.
The median in the table below is 3, as it is the middlemost value in the list arranged in ascending order.
Data Arranged in
ascending order
1
2
3
4
500

The median of the table below, which has an even number of values, is the average of the two middle values, i.e. (2+3)/2 = 2.5.

Data Arranged in
ascending order
1
2
3
4

Mode- This refers to the score occurring most frequently. We make a frequency table. E.g. we have the scores 1,4,8,6,2,7,2,8,9,4,2,6.

Scores | Frequency
2 | III
4 | II
6 | II
8 | II
1 | I
7 | I
9 | I

Since 2 occurs 3 times (the most among all), it is the mode.
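The three measures can be checked with Python's standard statistics module; the data sets are the ones used above.

import statistics

scores = [1, 4, 8, 6, 2, 7, 2, 8, 9, 4, 2, 6]

print(statistics.mean(scores))    # sum of values / number of values
print(statistics.median(scores))  # middle value(s) after sorting
print(statistics.mode(scores))    # most frequent value -> 2

# The mean is pulled by extreme values; the median is not
data = [1, 2, 3, 4, 500]
print(statistics.mean(data))      # 102 -- distorted by 500
print(statistics.median(data))    # 3   -- still representative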

Note: Normally the proportion of extreme values is small; otherwise, the sample itself is like that.

1.2) Measures of dispersion

Dispersion means scattered, i.e. the scattering of the scores. We study how much the scores deviate from the mean. It tells about spread/variability/noise.
E.g. in the data below there is high dispersion because of the extreme value.

Scores: 1, 2, 3, 4, 500

Ways of finding dispersion

1) Range- Calculated by subtracting the minimum value from the maximum value, e.g. 5 - 1 = 4.
2) Standard deviation- It gives the deviation from the mean in a standardized form.
3) Inter-quartile range (box plot)- to be discussed later.

Scores | X - X̄ | Deviation | Squared deviation
1 | 1 - 3 | -2 | 4
2 | 2 - 3 | -1 | 1
3 | 3 - 3 | 0 | 0
4 | 4 - 3 | 1 | 1
5 | 5 - 3 | 2 | 4
Sum | | 0 | 10

Since the sum of (X - X̄) is zero, we square the deviations. The variance is then calculated as:

Variance = sum of squared deviations / (N - 1)

The standard deviation is calculated as the square root of the variance:

SD = square root of (sum of squared deviations / (N - 1))

So the variance in our case is 10/(5-1) = 10/4 = 2.5, and the standard deviation is the square root of 2.5 (about 1.58).

Important Note: We take the square root because we squared the values of (X - X̄) earlier; we need to get back to the same scale.
We use N-1 as the denominator when describing a sample; when calculating the standard deviation for a whole population, we use N as the denominator. Statistics is done only on those scores which can vary. E.g. there are four corners in a classroom: the 1st person has 4 choices and the 2nd person has 3 choices, but the last person has no choice. When the mean is fixed, only N-1 values can vary. This is the concept of degrees of freedom, i.e. the actual number of scores which can vary.
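The same calculation in Python, using the N-1 (sample) denominator:

import math
import statistics

scores = [1, 2, 3, 4, 5]
n = len(scores)
mean = sum(scores) / n                       # 3
sq_dev = [(x - mean) ** 2 for x in scores]   # [4, 1, 0, 1, 4]

variance = sum(sq_dev) / (n - 1)             # 10 / 4 = 2.5 (sample variance)
sd = math.sqrt(variance)                     # ~1.58, back on the original scale
print(variance, sd)

# The standard library agrees:
print(statistics.variance(scores), statistics.stdev(scores))    # N-1 denominator
print(statistics.pvariance(scores), statistics.pstdev(scores))  # N (population)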

In summary, measures of central tendency tell about the signal and measures of dispersion tell about the noise. And in research, we need to increase the signal and reduce the noise.

When we carry out the process of going from the sample to the population, it is called inference: we predict population parameters from sample statistics.

There will be variation in the result if we select multiple samples, but (for a true representative sample) it will be small.

Inferential Process
Sampling error- The variation in the sample statistic which you derive after sampling each time is called sampling error. Roughly, it is the variation between two sample statistics. This sampling error tends to be smaller if the sample size is adequate and the sampling technique is random (i.e. the sample is a true representative of the population). Refer to the example below of two experiments with multiple sample statistics.

Sample means of group 1 (6 means of 6 samples) = 36, 35, 37, 34, 33, 34.5 (sampling error is small)

Sample means of group 2 (4 means of 4 samples) = 89, 35, 61, 21 (sampling error is large)

The reproducibility and generalizability of the research would be greater if we did the experiment many times. Further, there would be less bias if we took information from multiple samples. Ideally we would do the experiment many times to get more sample means (more sample means = more representative), but it is not practically possible. Therefore, we use statistics to infer the following values about the population:

μ, i.e. the mean of the population
SE, i.e. the standard error

We derive the above two values from the sample statistics, i.e. the sample mean and sample standard deviation, through inference.

To understand this inferential process, we need to study a theorem on the nature of distributions: the Central Limit Theorem.

Central Limit Theorem

When we take a sample, we need to generalize. We need to know how reliable and accurate the sample is, and how much of a true representative it is. We can't afford to replicate the experiment again and again, so the way out is the Central Limit Theorem.

Using it, we infer μ and SE from the statistics of one sample only. It has three statements.

Statement 1- When the size of the sample is 30 or greater than (>) 30, the sampling distribution will be normally distributed.
The sampling distribution is the distribution of the means of all the multiple samples taken. However, in practice we don't take multiple samples; rather, we use statistics to generate multiple sample statistics from one sample.

E.g. Sample A (N = 10, mean = 35, SD = 20), Sample B (N = 10, mean = 40, SD = 15), Sample C (N = 10, mean = 38, SD = 18). The distribution of all these means is called the sampling distribution.

Before we go further, it will be important to understand the concept of a frequency distribution, which underlies the concept of a sampling distribution.

E.g. we want to create the frequency distribution of the values 2,4,2,3,4,6,5,4,7,7,9.

Step 1- Write down all the unique values; they are the scores in the table below.
Step 2- Write the frequency next to each unique value, e.g. 2 appears 2 times. Similarly write for the others.

Scores Frequency
2 2
3 1
4 3
5 1
6 1
7 2
9 1

Below is the chart of frequency distribution, The X axis is the unique scores and the Y axis is the
frequency.
3.5

2.5

2
Frequency
1.5

0.5

0
2 3 4 5 6 7 9

Note: The frequency distribution of a continuous variable is called a histogram. There are no gaps between the bars for continuous variables, since a continuous variable can take any value in an interval, whereas a discrete variable takes unique values. The figure above shows bars for discrete values.
A normal distribution can be precisely defined using the mean and the standard deviation.
The main characteristics of a non-normal distribution are skewness (left tail or right tail) and kurtosis (steep or broad).

Now let us return to Statement 1 of the Central Limit Theorem. As we said, we conduct the experiment on one sample and then derive multiple sample statistics through statistical tools on the basis of that one sample's statistics. One such statistical tool is bootstrap sampling (a minimal sketch follows).

The frequency distribution of the means of the multiple samples is called the sampling distribution.
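A minimal Python sketch of bootstrap resampling, with made-up data, showing that the spread of the bootstrapped means approximates SD/√N (the standard error from Statement 3 below):

import random
import statistics

# One observed sample (e.g. 30 scores); we cannot rerun the whole experiment,
# so we resample from it with replacement ("bootstrap") to approximate
# the sampling distribution of the mean.
sample = [random.gauss(35, 20) for _ in range(30)]   # stand-in data

boot_means = [
    statistics.mean(random.choices(sample, k=len(sample)))
    for _ in range(1000)
]

print(statistics.mean(boot_means))   # centre of the sampling distribution
print(statistics.stdev(boot_means))  # close to SD/sqrt(N), the standard error
print(statistics.stdev(sample) / len(sample) ** 0.5)  # the SE formula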

Statement 1 states that "when the size of the sample is 30 or greater than (>) 30, the sampling distribution will be normally distributed." We have already understood what a sampling distribution is. Now, what is a normal distribution?
If the graph of the distribution we derive from the sampling distribution has a bell shape, it is assumed to be normally distributed. It is also called the Gaussian curve in mathematics.

The reason behind this assumption is that if the sampling distribution is not normal (i.e. bell-shaped), our analysis becomes complicated. The bell-shaped distribution can be perfectly defined in mathematics using only the mean and standard deviation. It is easy to run parametric statistical tests and it is easy to comprehend. There is no bias to the left or right in a normal distribution: 50% of values are less than the mean and 50% of values are greater than the mean.

If there is no normal distribution, we might need 10 more parameters apart from the mean and standard deviation.

Normal distributions are unimodal around the centre, symmetric in the middle and asymptotic at the tails. A normal distribution's right side is a mirror image of its left side; it is perfectly symmetrical around its centre. Normal distributions are continuous, i.e. they have tails that are asymptotic, which means they approach the X axis but never touch it. The implication of this is that no matter how far one travels along the number line, in either the positive or negative direction, there will still be some area under any normal curve. The further a data point is from the mean, the less likely it is to occur. E.g. intelligence, height and blood pressure: most of the data tend to cluster around the mean.

Features of normal distribution

1) Normal distributions are symmetric around their mean.
2) The mean, median and mode of a normal distribution are equal.
3) The total area under the normal curve is equal to 1.0.
4) Normal distributions are denser in the centre and less dense in the tails.
5) Normal distributions are defined by two parameters, the mean and the standard deviation.

Empirical rule (68-95-99.7 rule) for standard deviations:

6) 68% of the area of a normal distribution is within one standard deviation of the mean.
7) Approximately 95% of the area of a normal distribution is within two standard deviations of the mean.
8) About 99.7% of the area under the curve falls within 3 standard deviations of the mean.

The standard deviation determines the height and width of the graph, while the mean is at the centre. When the standard deviation is large, the curve is short and wide; when the standard deviation is small, the curve is tall and narrow.

Example: 95% of students at a school are between 1.1m and 1.7m tall. Assuming this data is normally distributed, can you calculate the mean and standard deviation?

The mean is halfway between 1.1m and 1.7m:

Mean = (1.1m + 1.7m)/2 = 1.4m

95% is 2 standard deviations either side of the mean (a total of 4 standard deviations), so:

1 standard deviation = (1.7m - 1.1m)/4 = 0.6m/4 = 0.15m

It is good to know the standard deviation, because we can say that any value is:

✔ Likely to be within 1 standard deviation (68 out of 100 should be)
✔ Very likely to be within 2 standard deviations (95 out of 100 should be)
✔ Almost certainly within 3 standard deviations (997 out of 1000 should be)

Note: The number of standard deviations from the mean is also called the standard score, sigma or
Z-score.

Example 2: One of your friends at the school is 1.85m tall.

You can see on the bell curve that 1.85m is 3 standard deviations from the mean of 1.4m, so:
your friend's height has a Z-score of 3.0.

It is also possible to calculate how many standard deviations 1.85 is from the mean.

How far is 1.85 from the mean? It is 1.85-1.4= 0.45m from the mean
How many standard deviations is that? The standard deviation is 0.15m, so:
0.45m/0.15m = 3 standard deviations.

So to convert a value to a standard score/Z-score:

1) First subtract the mean
2) Then divide by the standard deviation.

Example 3: A survey of daily travel time had these results (in minutes):

26,33,65,28,34,55,25,44,50,36,26,37,43,62,35,38,45,32,28,34

The mean is 38.8 minutes and the standard deviation is 11.4 minutes.

Now convert the values to Z-scores (Standard scores)

To convert 26:

First subtract the mean: 26 - 38.8 = -12.8,

then divide by the standard deviation: -12.8/11.4 = -1.12

So 26 is -1.12 standard deviations from the mean. The first three values are given below.

Original Value   Calculation        Standard/Z-Score
26               (26-38.8)/11.4     -1.12
33               (33-38.8)/11.4     -0.51
65               (65-38.8)/11.4     +2.30
Formula for Z-score:

Z = (X - μ) / σ

where Z is the Z-score or standard score, X is the value to be standardized, μ is the mean and σ is the standard deviation.

We standardize because it helps us make decisions about our data.

Example: A professor is marking a test. Here are the students' results (out of 60 points):

20, 15, 26, 32, 18, 28, 35, 14, 26, 22, 17

Most students didn't even get 30 out of 60, so most would fail.

The professor decides to standardize the scores and fail only those students who are more than 1
standard deviation below the mean.

The mean is 23 and the standard deviation is 6.6, and these are the standard scores:

-0.45, -1.21, 0.45, 1.36, -0.76, 0.76, 1.82, -1.36, 0.45, -0.15, -0.91

The cut-off is 1 standard deviation below the mean, i.e. below 23 - 6.6 = 16.4 marks.

Standardizing makes life easier, as we only need one table rather than doing separate calculations
for every combination of mean and standard deviation. The Z-score also lets us compare two variables
measured on different scales. A Z-score is computed for each value; it expresses how many standard
deviations that value lies from the mean.
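As an illustration, here is a minimal sketch (assuming Python with NumPy) that reproduces the Z-scores of Example 3 from the raw travel times:

```python
# Minimal sketch: standardize the travel times from Example 3.
import numpy as np

times = np.array([26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
                  26, 37, 43, 62, 35, 38, 45, 32, 28, 34])

mean = times.mean()   # 38.8 minutes
sd = times.std()      # population SD, 11.4 minutes

z_scores = (times - mean) / sd   # subtract the mean, divide by the SD
print(z_scores[:3].round(2))     # [-1.12 -0.51  2.3 ]
```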

Statement 2- The mean of the sampling distribution is equal to the true mean of the population.

Assuming that Statement 1 is true, the average of all means of sampling distribution is assumed to
be equal to the mean of the population.

Statement 3- The SE (standard error) of the sampling distribution = SD/√N,

where N is the sample size and SD is the standard deviation of the sample.

Note: The central limit theorem works when the sample is a true representative of the population,
i.e. random techniques were adopted in selecting participants; the generalizability of such research is also higher.

If we take more samples: 1) the distribution becomes more and more normal, and 2) the spread of the
distribution decreases, as per the central limit theorem.
The distribution of an average tends to be normal even when the distribution from which the
average is computed is decidedly non-normal. For example, an investor looking to analyse the overall
return of a stock index made up of 1000 stocks can take random samples of stocks from the
index to estimate the return of the total index. The samples must be random, and at least
30 stocks must be evaluated in each sample for the central limit theorem to hold. Random samples
ensure that a broad range of stocks across industries and sectors is represented. Stocks
previously selected must also be replaced before drawing other samples, to avoid bias. The average
returns from these samples approximate the return of the whole index and are approximately
normally distributed. The approximation holds even if the actual returns of the whole index are not
normally distributed. The central limit theorem is thus useful in inferring population parameters.
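All three statements can be seen in a small simulation. Below is a minimal sketch (assuming Python with NumPy; the exponential population is an illustrative assumption, chosen because it is decidedly non-normal):

```python
# Minimal sketch: sample means from a skewed population behave
# as the central limit theorem predicts.
import numpy as np

rng = np.random.default_rng(42)
population = rng.exponential(scale=1.0, size=100_000)   # right-skewed

# Draw 1000 random samples of size n = 30 and record each sample mean.
sample_means = [rng.choice(population, size=30).mean()
                for _ in range(1_000)]

print(round(population.mean(), 3))        # true mean, ~1.0
print(round(np.mean(sample_means), 3))    # ~1.0 (Statement 2)
print(round(np.std(sample_means), 3))     # ~SD/sqrt(30) = 0.18 (Statement 3)
```

Plotting sample_means as a histogram would show the roughly bell-shaped sampling distribution of Statement 1.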

The normal distribution applies only to continuous variables. The curve never touches the X-axis,
otherwise the outcome would be certain; certainty means no probability. Both sides are mirror images,
i.e. there is symmetry. The normal curve is also called a density plot. For the binomial distribution
of discrete variables there will be gaps in the chart.

Box Plot

A box plot tells us about the nature of the distribution (its level of skewness) and serves as an
outlier analysis. It also helps us understand the spread of the data and identify its extreme values.
When to use the box plot:

1) To find the nature of the distribution (skewness)
2) To find outliers

First we arrange the whole data in ascending or descending order to locate the median. Then we divide
the data into four parts/segments; each part is called a quartile.
[Box plot diagram: values beyond the upper and lower fences are outliers; whiskers run from the box
out to the fences; the box spans the interquartile range from the lower quartile to the upper
quartile with the median inside; each of the four quartiles holds 25% of the data.]
Interquartile Range
The interquartile range is a measure of spread or dispersion.
Range of the box/interquartile range (IQR) = Upper Quartile - Lower Quartile
We don't use the extreme values to calculate this range, as they can be outliers.

Upper Fence = Upper Quartile + 1.5 × IQR

Lower Fence = Lower Quartile - 1.5 × IQR

The median lies inside the box, at the boundary between the 2nd and 3rd quartiles.

The lines are called whiskers, like a cat's hanging moustache. Any value lying beyond the fences is
called an outlier; outliers are extreme values beyond these fences.

Numerical example- First we divide the data into 4 parts:

1, 2, 3 | 4, 5, 6 | 7, 8, 9 | 10, 11, 12
[Diagram: the values 1 to 12 stacked vertically, with the lower quartile at 4, the median at
(6+7)/2 = 6.5, the upper quartile at 9, a lower fence at -3.5 and an upper fence at 16.5.]
Here the interquartile range IQR = Upper Quartile - Lower Quartile = 9 - 4 = 5

1.5 × IQR = 1.5 × 5 = 7.5

Upper Fence = Upper Quartile + 1.5 × IQR = 9 + 7.5 = 16.5
Lower Fence = Lower Quartile - 1.5 × IQR = 4 - 7.5 = -3.5

Hence in this example there are no outliers, as all values lie within the upper and lower fences.
However, if the value were 20 instead of 12, then 20 would be an outlier, as it falls outside the
upper fence of 16.5.
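The same check can be scripted. Here is a minimal sketch (assuming Python with NumPy; note that NumPy interpolates quartiles, so its fences differ slightly from the by-hand values above) using 20 in place of 12:

```python
# Minimal sketch: flag outliers using the 1.5 x IQR fence rule.
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 20])  # 20 replaces 12

q1, q3 = np.percentile(data, [25, 75])   # lower and upper quartiles
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = data[(data < lower_fence) | (data > upper_fence)]
print(lower_fence, upper_fence, outliers)   # 20 falls beyond the upper fence
```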

[Figure: a box plot that is skewed upward.]


Confidence Intervals
A confidence interval indicates precision: it gives the probability that the truth lies within a
range. The narrower the confidence interval, the more certain we are that the results are close to
the point estimate. E.g. at a 95% confidence level, if you repeated the experiment 100 times, the
true mean would fall inside the computed interval about 95 times.

If X̄ is the mean of the sample, then we define the confidence interval as the interval in which the
true mean of the population may lie. Here the sample mean is treated as the point estimate.
E.g. the mean weight of a sample of apples from the orchard is 149 g, so the sample mean = 149 g.
We then create a confidence interval of 147 g to 151 g within which the population mean would lie.

147 g ---- 149 g (point estimate) ---- 151 g

The width of a confidence interval depends on the factors below.

% confidence- The higher the confidence %, the wider the interval. E.g. in a 99% confidence interval
there is only a 1% chance that the truth lies outside.

Standard deviation/variation- If the population variation is small, the sample variation is also
small. The greater the variation, the wider the confidence interval.

Sample size- The larger the sample size, the more information you have about the population and the
more similar the sample is to it. There will be less sampling error, and therefore a narrower
confidence interval.
Calculating confidence interval

Informal method
Traditional normal-based (90%, 95%, 99%)
Bootstrapping

Step 1: find the sample statistics.

Sample size: n = 40
Mean: X̄ = 175
Standard deviation: s = 20

The confidence interval formula is: X̄ ± Z × s/√n

Step 2: decide what Confidence Interval we want. 90%, 95% and 99% are common choices. Then
find the "Z" value for that Confidence Interval here:

For 95% the Z value is 1.960

And we have:

175 ± 1.960 × 20/√40

which is:

175 cm ± 6.20 cm

In other words: from 168.8 cm to 181.2 cm.
The value after the ± is called the margin of error. The margin of error in this example is 6.20 cm.
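Here is a minimal sketch (assuming Python with SciPy) reproducing this calculation:

```python
# Minimal sketch: 95% confidence interval for n=40, mean=175, s=20.
import math
from scipy.stats import norm

n, mean, s = 40, 175, 20
z = norm.ppf(0.975)              # 1.960 for a 95% interval (two-sided)
margin = z * s / math.sqrt(n)    # margin of error

print(round(margin, 2))                                   # ~6.2
print(round(mean - margin, 1), round(mean + margin, 1))   # 168.8 181.2
```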

NHST
After the literature review, we observe or collect data. Based on the evidence we accept or reject
our original statement. This systematic process of drawing inferences is called NHST.

The whole process of evaluating research evidence is called Null Hypothesis Significance Testing
(NHST). It is a technical term for the process of understanding and evaluating a hypothesis.
Hypothesis is a technical term for a statement that anticipates your experimental result, like an
intelligent guess or prediction. E.g. can yoga reduce blood sugar in diabetes?

Null hypothesis- It negates the statement and asserts that no such situation exists. It is a
negation of the statement: there is no relationship or effect.
The whole process of evaluating a research idea runs through the null hypothesis: you state and then
evaluate the null hypothesis. We can't prove things in yoga, as it is a social science; however, we
can gather evidence for or against and then come to a conclusion. NHST uses inductive logic and
reasoning: based on the evidence, we build a case about the hypothesis.

Generally, in Western logic and research, it is easier to negate absence than to prove presence. We
create H(0) and then reject it, or fail to reject it, based on the weight of evidence for or against.

Types of hypothesis.

1) One-tail hypothesis- The direction is certain. E.g. Brahmari can increase memory, or Brahmari
can decrease memory.
2) Two-tail hypothesis- The direction is not clear. E.g. Brahmari changes memory.

There can be any number of null hypotheses for one alternate hypothesis.

E.g. H(a): The table is in the store room.

H(01): The table is in the dining room. H(02): The table is in the kitchen. H(03): The table is in the garden.

Steps of NHST

1) State the alternate hypothesis H(a)
2) State the null hypothesis H(0)
3) Fix Alpha
4) Fix Power (1-Beta)
5) Estimate the effect size
6) Estimate the required sample size
7) Collect data
8) Analyse
9) Conclude based on the P value

If P > 0.05 we fail to reject H(0) (villain); if P < 0.05 we reject H(0) (hero).

                              Nature of H(0)
                              True             False
Decision  Reject H(0)         Type I (Alpha)   Power (1-Beta)
          Fail to reject H(0) (1-Alpha)        Type II (Beta)
Sum                           1                1

Conditional probability- The probability of an event occurring given that another event has
happened. If two outcomes are complementary, their probabilities always sum to 1, e.g. Power = 1 - Beta.

The null hypothesis is at the centre of our research. Whenever you make a decision, there is always
a possibility of making an error. The first type of error is called the Alpha error (Type I) and the
second type the Beta error (Type II).

E.g. H(0)- Brahmari doesn't improve memory

True H(0)- Brahmari doesn't improve memory ____
False H(0)- Brahmari improves memory ____
Reject H(0)- Brahmari improves memory ____
Fail to reject H(0)- Brahmari doesn't improve memory ____

So we got two rights and two wrongs.

Type I (Alpha)- The intervention doesn't work, yet you claim it works. It is a notorious/arrogant/serious
error; an overestimation.
Type II (Beta)- The intervention works, yet you claim it doesn't. You had potential but didn't say so.
It is an innocent error; an underestimation.
Power- Correctly detecting the effect: the effect is there and you claim it. The effect of the
intervention is captured correctly.
(1-Alpha)- It doesn't work and you also say it doesn't work. You proved nothing by keeping your
mouth shut; this cell does not capture the effect.

Note: We shall use the phrase "fail to reject" rather than the term "accept".

E.g. H(0)- The enemy is not there.

Type I (Alpha) error- The enemy is not there and you shoot.
(1-Alpha)- The enemy is not there and you don't shoot.
(1-Beta) Power- The enemy is there and you shoot.
Type II (Beta) error- The enemy is there and you don't shoot.

3) Fix Alpha- 0.05 (5%) is ok, 0.01 (1%) is good, 0.001 is super.
This means the researcher will not tolerate more than a 5% chance of an Alpha error, so we fix it in
advance. There is nothing negative about it; it is just a percentage. 95% of the time it is fixed at 5%.

4) Fix Power (1-Beta)- 0.80/0.95/0.99. We typically take Beta as 0.20 (20%), so power is 1 - Beta,
i.e. 0.80. The maximum power is 1.
5) Effect size- The amount of effect caused by the intervention: the stronger the intervention, the
larger the effect size. It is the magnitude of the intervention as measured by the measuring tool,
so it also depends on how sensitive your tool is, e.g. using a digital weighing machine instead of a
traditional weighing instrument. The effect itself is intangible and can't be seen, but we can
measure it. The effect size is estimated from previous literature through statistically rigorous
meta-analysis.

6) Estimate sample size- The thumb rule is that the more rigorous the conditions, the larger the
sample size required. It depends on four factors:

Effect size: a smaller effect is harder to find, so a larger sample size is required.
Power: higher power is harder to secure, so a larger sample size is needed.
Alpha: a lower alpha is harder to meet, so a larger sample size is needed.
Two-tail hypothesis: rejecting a two-tail hypothesis is harder, so the sample size is bigger.

We estimate the effect size so that we may calculate the correct sample size. The correct sample
size is needed:

1) To save resources. The effect size (ES) helps us find the correct sample size (SS); a larger SS
means more resources.
2) It is ethically wrong to recruit more subjects than required; we are not dealing with guinea pigs.
3) To ensure we have sufficient power. Insufficient power causes indecisiveness/inconclusiveness;
that is, in case of failure, we would not know whether our hypothesis really didn't work or the
sample size was too small.

E.g. if P = 0.04, which is less (<) than 0.05, we would reject H(0), i.e. we conclude that our
alternate hypothesis H(a) is supported: Brahmari improves memory. But if the P value is 0.51, which
is greater than 0.05, we would fail to reject H(0), and we face the problem mentioned in point 3 above.

Ways to estimate effect size:

1) Literature review- We can choose past literature that resembles our study in two aspects: a) the
intervention and b) the measuring tool.
2) Pilot study- If no past paper exists, we do a dry run with a small sample, to learn how a
variable behaves and to get a taste of the future experiment.
3) Consult experts- E.g. when purchasing a colour TV, we check brand, price, budget and technology,
and then choose as per our convenience. As a general norm, if power > 80% the sample size is
considered sufficient. Cohen gives guidance in the form of Cohen's guidelines:

Cohen's Guidelines

Code  Description              Mean difference  Correlation
HI    Effect size is high      0.80             0.5
MO    Effect size is moderate  0.50             0.3
LO    Effect size is low       0.30             0.1

P < 0.05: we reject H(0) and accept H(a) - hero.

P > 0.05: we fail to reject H(0) and reject H(a) - villain.

Power Analysis

It is the process of using the effect size to estimate the required sample size at the apriori
stage, and of finding the achieved power at the post-hoc stage. Basically, we find the unknown from
the known factors.

        Apriori                           Post Hoc
When    Before                            After
Why     To find the required sample       To get the achieved power (real)
        size (estimate)
Input   ES, Alpha, Tail, Power (1-Beta)   SS, ES, Alpha, Tail
Output  SS (unknown- estimated)           Power (1-Beta) (known)
Finding the appropriate sample size is important. For example, suppose you ride a bike to Mysore.
If the bike stops midway, you must know whether the petrol (power) ran out or the bike genuinely
didn't work (the intervention didn't work). Putting in enough petrol is like ensuring you have
enough power: less petrol >> less power >> insufficient sample size.
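Here is a minimal sketch (assuming Python with the statsmodels library; the effect size of 0.5 and the post-hoc group size of 30 are illustrative assumptions) of both stages for a two-group design:

```python
# Minimal sketch: apriori and post-hoc power analysis for an
# independent-samples t-test design.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Apriori: ES, alpha, tail and power known -> required sample size.
n = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80,
                         alternative='two-sided')
print(round(n))          # ~64 participants per group

# Post hoc: ES, alpha, tail and sample size known -> achieved power.
power = analysis.solve_power(effect_size=0.5, alpha=0.05, nobs1=30,
                             alternative='two-sided')
print(round(power, 2))   # ~0.48, i.e. underpowered
```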

An Alpha error is like a tyre puncture on the journey, or attrition of people in a project.


In all cases we must derive an unequivocal conclusion, one that is clear and without any ambiguity.

P-Value

It is a close cousin of power. It is also a probability or chance, expressed as a conditional
probability:

P value = P(observing this change in the data | H(0) is true)

That is, it is the probability of finding the observed change in the data given that the null
hypothesis H(0) is true. Obviously, the lower the P value the better, because the chance of the
result being accidental is minimised: it would be unlikely that the result occurred by chance or
randomness alone, and likely that it happened because of the intervention. The stronger the
intervention effect, the lower the P value.

Here it is important to note:

If ES is small + SS is large = we shall be fine (but it is ethically wrong to take a larger sample
than required).

But if ES is large + SS is small = because of the low sample size there can be low power and more
error (especially Type II error).

Refer to power analysis in Ipscyologist.com

Statistical tests
Both kinds of tests below are done after the data is secured.
Assumption tests
Before running any test, we need to check whether we are eligible to run it. For example, we would
not take the same approach for dengue and for an ordinary fever; we first check the diagnosis
through eligibility tests. Before the confirmatory test we do a preliminary test; the latter ensures
we are eligible for the former. These are called assumption tests: tests performed to check an
assumption, namely whether we are eligible to perform parametric tests. If the conditions in the
assumption tests are satisfied, we can run parametric tests.

Types of assumption tests

1) Check for normality- First we state the null hypothesis before performing the test: H(0) is that
the data are normally distributed. We have already discussed the features of normal distributions.
The data should not be negatively skewed (tail towards the left) or positively skewed (tail towards
the right), and the distribution should not show kurtosis, i.e. be too flat (SD large) or too steep
(SD small). The test for normality is the Shapiro-Wilk test.

If this condition is satisfied, we go for a parametric test; else a non-parametric test.

Suppose P = 0.012; then we reject H(0). Since H(0) is that the data are normally distributed, when
we reject H(0) we cannot run parametric tests.
The steps are-
1) Write H(0)
2) Look at the P value
3) See whether H(0) is rejected or we fail to reject it
4) Make an inference as to which test to run

2) Equivalence of variance- Again we state the null hypothesis before performing the test: H(0) is
that the variance is equal across the two independent groups. Here we check whether the variance
(and standard deviation) of subjects is equal across the two groups. It applies only to
between-group designs. The test used is Levene's test.

If the variances are not equal, it is a violation of a parametric condition. However, if the first
condition (normality) is satisfied, we can still run a parametric test with a correction even
though this second condition is not satisfied.
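Both assumption tests are available in standard software. Here is a minimal sketch (assuming Python with SciPy; the two data sets are hypothetical):

```python
# Minimal sketch: Shapiro-Wilk test for normality and Levene's test
# for equality of variance across two independent groups.
from scipy.stats import shapiro, levene

group_a = [20, 15, 26, 32, 18, 28, 35, 14, 26, 22, 17]   # hypothetical
group_b = [22, 19, 30, 27, 16, 25, 33, 18, 24, 21, 20]   # hypothetical

# H(0): the data are normally distributed.
stat, p = shapiro(group_a)
print("Shapiro-Wilk p =", round(p, 3))   # p > 0.05 -> fail to reject H(0)

# H(0): the variances are equal across the two groups.
stat, p = levene(group_a, group_b)
print("Levene p =", round(p, 3))         # p > 0.05 -> variances look equal
```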

3) The data must be measured at least at the interval or ratio level; in other words, the variable
must be continuous. Only then do we run parametric tests.

E.g. of discrete: gender (male and female)
E.g. of continuous: marks (1, 2, 3, 4, 5)

4) Test for independence- The measurement made on one subject must be independent of the
measurements on other subjects. E.g. the measurement of memory on person A should not affect that of
another person. Say you call person A and ask questions, and then call person B, but A has already
told B the questions; the measurement on A has then influenced the measurement on B.

E.g. P = 0.12: here we fail to reject H(0), where H(0) = the variances of the two groups are equal.
So we go for a parametric test.
In assumption tests, we are happy if the P value is > 0.05.
If all 4 conditions are met, go for parametric tests; else go for non-parametric tests.

Note: Whenever we run statistical tests, we get a P value.

Parametric & Non Parametric Tests

Based on the assumption tests, we decide whether to go for parametric or non-parametric tests.

Parametric tests- They assume certain conditions (checked by the assumption tests) about the
distributions. We prefer to run parametric tests when the conditions are met, as they then have
higher statistical power. Normal distributions are also easy to analyse through statistical tests.

Non-parametric tests- They are distribution-free tests. If the conditions in the assumption tests
are not met, non-parametric tests have the higher statistical power.

As discussed already, H(0) must be known before the test, else it is a sin to perform the test.

Continuous data

Design                       Parametric                           Non-Parametric             H(0)
Correlation                  Pearson correlation coefficient (r)  Spearman's rho (ρ)         No correlation, r = 0
Pre-test post-test           Paired-samples t-test                Wilcoxon signed-rank test  Mean of differences = 0
Two-group design (between)   Independent-samples t-test           Wilcoxon rank-sum test     Means of the two groups are the same
Between-group ANOVA          ANOVA                                Kruskal-Wallis             Means of the various groups are the same
RM ANOVA (one group)         RM ANOVA                             Friedman                   Means of the various assessment points are the same

Discrete data

Design            Parametric  Non-Parametric   H(0)
Categorical test  None        Chi-square test  Multiple (see below)

Note: When the design is fixed, the statistics are fixed; design and statistical tools go together.
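For a pre-test/post-test design, the choice between the two columns can be scripted. Here is a minimal sketch (assuming Python with SciPy; the pre/post scores are hypothetical):

```python
# Minimal sketch: pick the paired-samples t-test when the differences
# look normal, otherwise fall back to Wilcoxon's signed-rank test.
from scipy.stats import shapiro, ttest_rel, wilcoxon

pre  = [20, 15, 26, 32, 18, 28, 35, 14, 26, 22]   # hypothetical
post = [24, 18, 25, 36, 22, 31, 38, 17, 29, 25]   # hypothetical

diffs = [b - a for a, b in zip(pre, post)]
_, p_norm = shapiro(diffs)          # assumption test on the differences

if p_norm > 0.05:
    stat, p = ttest_rel(pre, post)  # parametric; H(0): mean difference = 0
else:
    stat, p = wilcoxon(pre, post)   # non-parametric equivalent
print(round(p, 3))
```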

Chi-square tests- They check proportions or ratios: whether the proportions differ significantly.
For instance, is there a statistically significant difference between the proportions of males and
females in the two groups below?

Group 1: M = 35, F = 25
Group 2: M = 28, F = 22
It is of two types.

1) Test for goodness of fit

a) It is done when we have 1 categorical variable, e.g. gender.
b) H(0) = the proportions at the various levels of the variable are the same. E.g. a test result
has 2 levels, i.e. pass or fail; gender has 2 levels, i.e. male and female. There can also be 3
levels, e.g. social status: low, moderate and high.

M, F, M, F, M, M, F, F

The expected proportion is 50/50 for 2 levels, i.e. 0.5 and 0.5.
The expected proportion is 33/33/33 for 3 levels, i.e. 0.33 each.
The expected proportion is 25/25/25/25 for 4 levels, i.e. 0.25 each.

c) The expected frequency should be at least 5.

The observed frequencies are given below:

M = 35, F = 25

Expected frequency = Total sample × expected proportion = 0.5 × 60 (35 M + 25 F) = 30

Chi-square statistic: χ² = Σ (O - E)² / E, where O is the observed frequency and E the expected frequency.


Suppose the P value we get is P = 0.012; then we reject H(0), i.e. the proportions are not the same.
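Here is a minimal sketch (assuming Python with SciPy) of the goodness-of-fit test on these observed counts; note that with these particular numbers the computed P value comes out around 0.20, so the P = 0.012 above should be read as hypothetical:

```python
# Minimal sketch: chi-square goodness of fit for 35 males / 25 females
# against an expected 50/50 split.
from scipy.stats import chisquare

observed = [35, 25]
expected = [30, 30]          # 0.5 x 60 at each level

stat, p = chisquare(observed, f_exp=expected)
# stat = (35-30)**2/30 + (25-30)**2/30 = 1.667
print(round(stat, 3), round(p, 3))   # ~1.667, p ~0.197
```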

2) Test for Independence

a) It is used to find the relationship between 2 categorical variables. H(0) = the two categorical
variables are not related, i.e. they are independent.
E.g. to find whether smoking (2 levels) and gender (2 levels) are related or not.

Contingency table (observed)
Gender   Yes   No   Total
Male     10    20   30
Female   30    40   70
Total    40    60   100

b) Expected frequency >= 5

To find the expected frequency we total the rows and columns. The formula is:

Expected frequency = (Row total × Column total) / Grand total

Contingency table (expected)
Gender   Yes         No          Total
Male     30×40/100   30×60/100   30
Female   70×40/100   70×60/100   70
Total    40          60          100

We then get a P value, say P = 0.01. Note that if any expected frequency is < 5, the test becomes invalid.

Note- In correlation the relationship is between two continuous variables, whereas here it is between discrete ones.
Note: After collecting the observed data we calculate the expected frequencies, so this is done at the analysis stage.
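Here is a minimal sketch (assuming Python with SciPy) of the test for independence on the contingency table above; chi2_contingency computes the expected frequencies using the same row total × column total / grand total formula:

```python
# Minimal sketch: chi-square test of independence for smoking x gender.
from scipy.stats import chi2_contingency

observed = [[10, 20],    # Male:   Yes, No
            [30, 40]]    # Female: Yes, No

stat, p, dof, expected = chi2_contingency(observed)
print(expected)        # [[12. 18.] [28. 42.]]
print(round(p, 3))     # p > 0.05 here -> fail to reject independence
```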
