Research is defined as the systematic evaluation of a general idea to find the truth through the scientific method, in the interest of society. Research is characterized by a research question.
Qualities of Researcher
✔ Finding/searching/developing/compiling something new
✔ Verifying theories or concepts or advancing old concepts
✔ Using scientific method
Types of research
Classification is done for the sake of convenience. Various permutations and combinations are possible, and some categories may overlap. The main point is to have good, concrete, clean evidence. It is better to focus on quantitative research.
1) Methodology (Planning)
a) Research question- Doable, unique, socially relevant
b) Literature review- To see what has been done and what needs to be done
c) Design- Experimental plan/blueprint
The time needed for each phase may differ. After data collection the experiment is "dead" and must be revived during the post mortem (analysis). Analysis is also part of planning, as it is not necessary to have data in hand to plan the analysis. Methodology (1) and statistical analysis (3) are like two wings, and both must be strong.
1) Descriptive- Here we describe things as they are, without testing relationships. E.g. the number of males and females in the MSc programme, or the prevalence of diabetes in India.
2) Correlational- Finding the relationship between two variables, e.g. rangoli creativity and gender.
3) Causal- These identify the cause and effect relationship between two variables; the relationship is one-to-one. E.g. a punch and the pain it causes.
Correlation is not causal, but a causal relationship is always correlational. E.g. a person introduces new flavours of ice cream in the market and the demand for ice cream goes up. He concludes that the demand has gone up only because of him, but it could also be due to the summer heat. Many factors affect a correlation, whereas a causal relationship is definite. E.g. the claim that only trataka can improve eyesight, tested with two groups, one doing trataka and one not.
Literature Review
The literature review takes the longest time in research. It is the systematic review of existing knowledge, with critical appraisal and summarizing, to see what has been done and what has not. We can save time by learning from others' experiences and also ensure no duplication. We can also get information from experts in the field.
We need to search the relevant information sources for our information, i.e. the specific databases for our discipline. They can be categorized under two headings:
Offline (Physical)
Books, Journals, Library, Experts, Manuscripts/ Palm leaves
Online
Categories: Sources
Medical: Pubmed
Social network of researchers: Research Gate
Psychology: PsycINFO, APA PsycNet
Social science, nature, engineering: Science Direct (paid), Sci-hub.cc (can search by PMID, DOI or URL)
Education: ERIC
Others: Google Scholar, Shodhganga (UGC), NDL (open source), ProQuest, Libraryofyoga.com (SVYASA digital repository), Coursera (free online courses with tests and quizzes)
1) Define what you want to search- The objective of the literature review. E.g. find the diabetic population among the adult population.
2) Look for secondary sources of information- A source is the origin from which you get the information. For secondary sources, authenticity cannot be guaranteed; they are not first-hand, e.g. books, newspapers, Wikipedia, Google. But they are good to start with.
3) Primary sources- They are more original and authentic than secondary sources, but also more technical, and may be understood by experts only. We use primary sources to cite the information and secondary sources to find it. Scientific peer-reviewed journals are 99% authentic, as they go through peer review, where a paper is either accepted or returned with objections. Peer-reviewed journals publish only after validating correctness; it is a very rigorous process.
4) Choose appropriate database- E.g. Online or offline
5) Key words selection- People should know how I got the information. The important thing in research is reproducibility.
6) Collect- Read all the information, but if the data is large, then refine the search or use more specific keywords. See the current developments and updated information, and focus on recent literature. For a historical review we can go further back; for applied research, e.g. a new disease, go for recent literature. We should generally go for 3 to 5 years, and not more than 10 years, of literature. The exception is classical papers, which can be quoted. Outdated information is not useful.
7) Organize- Organize your data as per various classifications or levels; sort the data under appropriate headings. There is a software called Mendeley which helps in organizing the data; it is used as a reference manager. Organizing involves personal creativity. We can use the APA style of referencing and keep the article list in Excel.
Author | Title | Design | Measurement tool | Result | Conclusion
Sharma | Yoga & Children | Group | Memory & attention scale | Memory increased by 20% |
8) Critically read, appraise and summarize- We read and summarize 10 pages of information into 2 to 3 pages.
Types of Literature
1) Original Articles
2) Single case study
3) Review Articles- It is of 3 types
a) Narrative review- It is written by experts in the field. Bias is possible, even though we respect the experience of the expert.
b) Systematic review- You choose the filters for including articles. There is quality control of the data.
c) Meta analysis- Here we do a systematic review and also statistically combine/pool the effect sizes. It shows the strength of evidence (pratyaksha pramana). The reliability of evidence is called the strength of evidence; it is highest in meta analysis, compared to systematic and narrative reviews. E.g. review articles (experts come together and write) versus using statistical tools to pool the effect size.
Effect size- It is the effect brought about by the intervention; the size of the effect is the effect size, expressed numerically. A meta analysis pools the effect sizes of all the studies into one common effect size. E.g. out of 100 values, we get one number representing the combined effect size.
Here articles are systematically analysed and the reviews are pooled; rigorous statistics is involved. Even when the domain is the same (e.g. diabetes), the large data still needs to be organized for effect size. Through meta analysis (a process) we get the effect size (a number). It is for practical purposes rather than theory; it is like a numerical survey and is statistically rigorous. It is done on review papers with an experimental basis.
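The pooling step can be sketched numerically. Below is a minimal illustration of fixed-effect pooling, assuming hypothetical effect sizes and variances from three studies; inverse-variance weighting is one common way to combine them, and real meta-analyses add heterogeneity checks on top.

```python
# A minimal sketch of fixed-effect meta-analytic pooling: each study's
# effect size is weighted by the inverse of its variance, so more
# precise studies contribute more to the common effect size.
# The study values below are hypothetical illustrations.

def pooled_effect_size(effects, variances):
    """Inverse-variance weighted mean of per-study effect sizes."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    return sum(w * e for w, e in zip(weights, effects)) / total

effects = [0.40, 0.55, 0.30]      # e.g. Cohen's d from three studies
variances = [0.04, 0.09, 0.01]    # squared standard errors
print(round(pooled_effect_size(effects, variances), 3))
```

Note how the third study, with the smallest variance, pulls the common effect size towards its own value.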
IV -------------------------------------- DV
(the CF also acts on the DV)
Therefore, we must reduce the CV and enhance the rigour of our design. This can be done by increasing all types of validity.
Population & Sample
Research is divided into two areas.
1) Methodology in research (Concepts, terminology)
2) Statistics (Derive meaningful information)
All people eligible to participate in a research study are called the population. The population is the group of people defined by our research question; e.g. for the effect of bhajans on MSc students, all MSc students who attend bhajans become the population.
A small subset of the big population is the sample. A sample is necessary because it is not possible to take the whole population; the resources required would be large and unaffordable. So we take a small sample from the population.
H(a)= Effect of yoga on type II diabetes in Bangalore
Population= Number of people with type II diabetes in Bangalore = 12,10,200
Sample= A group of 300 people from the population.
How should we select the sample to ensure that it is a true representative of the whole population? If the sample is a true representative, the probability of valid generalization increases.
The process of selecting subjects/a sample randomly from the population is called random sampling. If we select the people randomly, the sample is called a true representative of the population. It is done before selecting people. Whereas the process of dividing the sample into groups is called randomization; it is done after selecting people. If done randomly, we ensure that the noise/disturbing factors are spread equally across all groups.
Types of Sampling
1) Probability sampling (Random sampling)- Each person has an equal chance to participate in the study. This is done so that the sample is representative of the population. It is of four types.
a) Simple random sampling- First we list the whole population (say 2,20,100 people) and then, using a tool such as Randomizer.org, randomly pick 300 people. Each and every individual in the population has an equal chance to participate in the study. The electronic method is quicker and simpler than the chit/lottery method.
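The listing-then-picking step can be sketched as follows; the population of 1000 IDs and the sample of 30 are illustrative numbers, not the figures from the notes.

```python
# A minimal sketch of simple random sampling: every individual in the
# listed population has an equal chance of selection, and no one is
# picked twice (sampling without replacement).
import random

population = list(range(1, 1001))        # IDs of 1000 listed people
random.seed(42)                          # fixed seed for reproducibility
sample = random.sample(population, 30)   # 30 people, without replacement

print(len(sample), len(set(sample)))     # 30 unique selections
```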
b) Systematic random sampling- Here we select people systematically. The choice of the 1st person is random; after that the selection becomes systematic, with a uniform gap. See the example below, with a gap of 5 people:
17 22 27 32 37 42 47 52
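The "random start, fixed gap" rule can be sketched as below. The helper name `systematic_sample` is ours, and the call reproduces the 17, 22, 27, ... sequence shown above.

```python
# A minimal sketch of systematic random sampling: only the first pick
# is random, then every gap-th person is taken.
import random

def systematic_sample(population_size, gap, start=None):
    """Random starting point, then every `gap`-th individual."""
    if start is None:
        start = random.randint(1, gap)  # only the first pick is random
    return list(range(start, population_size + 1, gap))

# Reproducing the worked example from the notes: start at 17, gap of 5.
print(systematic_sample(52, 5, start=17))
```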
c) Stratified random sampling- When you know the nature of the population in advance, select layers/strata of the sample based on the same pre-known proportion. E.g. you know colourblind men : colourblind women = 3:1. Knowing this, we keep the same proportion in our study, since gender affects the study.
So, e.g., we take 75 males and 25 females in a sample of 100. E.g. smartphone addiction in rural and urban areas will differ, so we take a higher proportion of subjects from urban areas. We fix this proportion from previous literature.
Here we are selecting subjects, whereas in matching we allocate them. Matching (implementation level) is done after stratified sampling (planning level).
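Fixing the stratum sizes from a known ratio can be sketched like this; the helper `stratum_sizes` is a hypothetical name, and it uses simple integer division.

```python
# A minimal sketch of fixing stratum sizes from a known proportion,
# as in the 3:1 colourblind men:women example (75 males and 25
# females in a sample of 100).

def stratum_sizes(total, ratios):
    """Split `total` across strata in the given proportions."""
    whole = sum(ratios)
    return [total * r // whole for r in ratios]

print(stratum_sizes(100, [3, 1]))  # [75, 25]
```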
d) Cluster random sampling- Here we create blocks or clusters over a huge geographic region. When we select the clusters randomly, it is called cluster random sampling. E.g. make many small clusters and select a small number of people from each.
In case the density of population is higher in some clusters, we can combine this with stratified random sampling. It is then a combination of two probability sampling methods: the primary one is cluster and the secondary one is stratified.
2) Non-probability sampling- Here every person does not have an equal chance to participate. It has the following types.
1) Convenient sampling- We put a notice on the notice board and whoever is interested can come and join the experiment. But a person who does not come to see the notice board will not know; the way out is to take more people. About 80% of researchers do convenient sampling: you conveniently recruit people into the study by a method you choose.
Note: In multi-centric studies, e.g. studying subjects in two yoga universities, the external validity will be increased.
2) Cluster sampling- When you have large areas, you divide and select. E.g survey.
3) Snowball sampling- Here we give a presentation and then ask participants to answer the survey and pass it on. It spreads like word of mouth; if people like it, they tell others. A snowball becomes big as it rolls in the snow.
4) Quota sampling- I need 300 males and 300 females. As soon as I get that number, I stop recruiting more subjects.
5) Purposive- E.g. for the effect of meditation on the mind, only great yogis with more than 10 years of meditation experience will be taken. The purpose is well defined, and the target population is selected as per that purpose. The purpose needs to be very unique/special.
Note: Bias at the recruitment level may arise from following non-probability sampling.
Concept of Validity
A thing is valid when it serves the purpose it is supposed to serve. Before we study the types of validity, it is important to know about the types of variables.
A variable is something which is not constant; its score or value changes. E.g. age.
Cause (IV) --------> Effect (DV)
IV (Independent variable)- It is cause related. It is decided and manipulated by the researcher. It indicates the cause of the effect; it tells the story of the cause.
DV (Dependent variable)- It is effect related. After the IV is fixed, the changes in the dependent variable show the effect of the intervention.
CV (Confounding variable)- It is unwanted noise. The researcher is not interested in studying the noise, but it affects the dependent variable.
E.g.
IV:
1) Frequency of yoga practice - daily / once a week / twice a week
2) Duration of yoga practice - 1 hour
The effect (DV) will be greater if the duration and frequency increase. A CV here would be something like a memory tablet taken on the side.
Now let's discuss the types of validity.
1) Internal Validity- A research design has high internal validity if the influence of CVs is low, i.e. only the IV causes/influences the DV, and the CV does not.
2) External Validity- It refers to the generalizability of results across locations and conditions, meaning the result is not restricted to a small group or place. The more generalizable the research, the higher the external validity. E.g. yoga at Prashanti and yoga in Bengaluru are different settings.
3) Construct Validity- A construct is something imaginary/unseen/intangible which you can't measure directly but which the researcher is interested in measuring.
First we define that imaginary concept; we are free to choose the definition we want. The operational tool itself is not as important as how we measure. E.g. anxiety.
So the construct has to be measured at the physical level, and we need to check whether the tool measures the intended effect. We often have to use secondary/imperfect methods, but we have no choice, as we need to measure. E.g. instead of measuring a person's reaction time with a stopwatch, we can use a measuring scale: one person drops the scale and the other catches it.
Construct validity means whether the tool which we use to measure the construct is really measuring the intended construct.
Important- All the above 3 types of validity are handled before data collection.
4) Statistical Validity- We need to use the right statistical test for the given design and situation: did we use the right tool, and did we infer correctly from it? E.g. it is wrong to use an MRI for fever. This is assessed after data collection.
Types of Design
There are three types of research design based on randomization.
1) Experimental- Also called true experimental design. We allocate subjects randomly. There is a cause and effect relationship between the variables. A randomized control trial is an example of experimental design, where we have control groups. Control groups don't participate in the intervention, but all other conditions are the same; they help us quantify the confounding factors. Experimental design focuses more on CFs and how to control them.
2) Quasi Experimental Design- Here multiple groups are there for comparison, but the second group is not a control group. It appears experimental, but there is no randomization.
3) Non Experimental design- There are no multiple groups for comparison and no randomization at all. There is no cause and effect between the variables.
E.g. I study memory, and IQ influences memory. Some people are intelligent and some are below average, but it is not practically or experimentally possible to control this, so we randomize.
Note: We must still study/measure IQ, as it is a CF, and use statistical tools to control it.
2) Quasi Experimental design- "Quasi" means "like". It is experiment-like: the selection is partly random and partly not.
3) Non Experimental design- It is of five types.
3.1) Correlational design
3.2) Cross sectional design
3.3) Case study
3.4) Survey
3.5) Cohort
3.1) Correlational design- Find the relationship between two variables, e.g. participating in bhajans and higher marks. Correlation can be positive, negative or zero.
-1 ------------------ 0 ------------------ +1
The measure of correlation lies between -1 and +1; the higher the absolute value, the higher the strength of the correlation. There are two aspects of correlation: sign and magnitude. When we talk about magnitude, we ignore the sign, and vice versa. E.g. between -0.95 and 0.75, we conclude that -0.95 is stronger; between 0.85 and 0.075, we conclude that 0.85 is stronger.
-1 (below average) ------ 0 (ordinary) ------ +1 (super)
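Comparing correlations by magnitude while ignoring the sign can be sketched as follows; the helper `stronger` is a hypothetical name for illustration.

```python
# A minimal sketch of comparing correlation strength by magnitude:
# the sign gives the direction, the absolute value gives the
# strength, so r = -0.95 is a stronger correlation than r = 0.75.

def stronger(r1, r2):
    """Return the correlation with the larger absolute value."""
    return r1 if abs(r1) > abs(r2) else r2

print(stronger(-0.95, 0.75))   # -0.95 wins on magnitude
print(stronger(0.85, 0.075))   # 0.85 wins
```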
3.2) Cross sectional design- Given two variables, e.g. anxiety and depression, we measure both and establish the correlation. This is a one-time assessment. E.g.
City boys -------------------------------- practice yoga
College boys ----------------------------- practice yoga
3.3) Case study- It contains detailed information about a person with special abilities, which we document and present. The information is extraordinary, and it is compiled and presented as a case study. E.g. yogis who stop their heartbeat for a long time or remain buried. Such things are difficult to measure, so we do a case study.
3.4) Survey- In a survey we find the nature of a population. E.g. how many smokers and non-smokers in Bengaluru; how many teens have smartphone addiction and how many do not.
3.5) Cohort- A cross sectional design is also known as a cohort design in the medical field; the term cross sectional is more popular in psychology. In a cohort we describe the nature of the problem. E.g. people exposed to radiation and people without radiation exposure both do yoga: what is the change in effect? Here randomization is not possible and there are more than 2 assessments.
Pre                                      Post
10 people with radiation ----Yoga---- assessment
Important Note: A cohort study becomes quasi if there is no randomization. E.g. in radiation studies we can't randomize, as people with radiation must be in one group and people without radiation must be in another group.
Concept of Reliability
Reliability is a concept which expresses the consistency of measurement; e.g. an object should show the same weight if weighed twice. It is of four types.
1) Test Retest reliability- Here we see the correlation between the 1st test and the 2nd test. We conduct the retest and assume that nothing changes between the two tests. This type of reliability is a must for a new test. Test-retest reliability is a measure of temporal stability, i.e. how stable the construct is. If the correlation is high (between 0 and 1), temporal stability is high; if the correlation is 0, temporal stability is not good.
2) Split half reliability- We use this for questionnaires, e.g. the General Health Questionnaire in psychology. We split the items of the questionnaire into two halves, e.g. odd- vs even-numbered questions, or the first half vs the second half of the questions, and then find the correlation between the two halves. The important point to note is that the domain and the questions must be comparable.
3) Internal Consistency (Cronbach's Alpha)- This is like split-half, but we consider all possible split combinations of the questionnaire: odd/even, first half/second half, and so on. Suppose we get 150 types of splits; we find the correlation for each, then aggregate and state a single value. This is called Cronbach's alpha, which is a common value, not a simple average. It is calculated by a formula.
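The formula can be sketched directly: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The 5-respondent, 4-item data below are hypothetical.

```python
# A minimal sketch of Cronbach's alpha from the standard formula.
# Rows are respondents, columns are questionnaire items; the data
# are hypothetical illustrations.
from statistics import pvariance

scores = [
    [3, 4, 3, 4],
    [5, 5, 4, 5],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [3, 3, 2, 3],
]

k = len(scores[0])                            # number of items
items = list(zip(*scores))                    # per-item score lists
item_var = sum(pvariance(col) for col in items)
total_var = pvariance([sum(row) for row in scores])
alpha = k / (k - 1) * (1 - item_var / total_var)
print(round(alpha, 2))
```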
4) Interrater reliability- Here several judges (say three) independently evaluate the same event; they are called raters. We check whether the raters agree with each other's opinions, e.g. the judges in a Miss India event. We find the consistency of the marks among the three raters (e.g. 9, 9, 3) through the correlation between the marks awarded by the judges, and see which general opinions are common and high.
Confounding Variables (CV)
A CV is a type of noise. All sources of bias introduce noise. If the confounding factors are more, the variability will also be more.
Types of CV:
1) The researcher does not know about the CF (e.g. diabetes patients secretly eating sugar)- We use randomization to control this; the researcher assumes that the confounding factors (CFs) get distributed equally across groups.
2) The researcher knows about the CF- It has two sub-types.
2.1) Known and controlled experimentally, e.g. sorting people into dull and intelligent groups by IQ.
2.2) Known but not controllable directly/experimentally, only by statistical methods, e.g. age, gender.
It is important to note that something will always remain unknown, which shows up in the form of error.
Before we study examples of designs, it is important to know the subject-based classification.
1) Within (same subjects, different conditions)- The same subjects participate at 2 (or more) points of measurement.
2) Between (different subjects, different conditions)- There are 2 different groups, and different subjects participate.
3) Mixed (both of the above together)- 2 different groups and 2 points of measurement exist together.
Design Examples
One group Pretest Post test design- Only "within" comparison is possible. E.g. a one group pretest-posttest study:
30 people
Pre (Measure point 1) --------Yoga-------- Post (Measure point 2)
(before starting)                          (after the intervention)
But it lacks internal validity, as there is no control over confounding variables, e.g. eating junk food. E.g. the measurement of weight at Prashanti vs at home can be influenced by having no work at home and more luxury. We can't tell how much the CFs influenced the effect size.
Here there is another concept called ANOVA (analysis of variance). When there are more than 2 groups, or more than 2 points of measurement, the design calls for ANOVA; the latter case is called RM ANOVA (repeated measures ANOVA). A one group pretest-posttest design also becomes an RM ANOVA design if we have more than 2 measurement points.
Two Group Pre post design (one group can be a control group)- To enhance control over CFs, we establish a control group. The control group helps distribute the CFs over the two groups and also helps quantify the noise. The control group and the main (experimental) group are the same in all aspects/conditions, except that the control group does not actively take part in the intervention. It can be of three types.
2.1) Two group pre post design- Mixed group- Refer below example.
The control group helps us quantify the CF and also spreads it equally among all groups. E.g. if the effect size of the control group is 10 and the effect size of the yoga group is 40, then 40 minus 10 = 30 is the real effect size.
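The 40 minus 10 = 30 adjustment can be written as a one-line helper; a minimal sketch with the notes' numbers, where `adjusted_effect` is a hypothetical name.

```python
# A minimal sketch of using the control group to quantify noise:
# the control group's change estimates the confounding contribution,
# which is subtracted from the experimental group's change.

def adjusted_effect(exp_change, control_change):
    """Real effect = experimental change minus control change."""
    return exp_change - control_change

print(adjusted_effect(40, 10))  # 30, the real effect size
```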
2.2) Two group pre post design- Within group- Within group refers to the same subjects under different conditions. It is very difficult to get a control group among patients in hospitals; we can't ask them to do nothing. So we divide by weeks: in the 1st week patients do no yoga and in the 2nd week they do yoga.
Yoga (30 people):
Measurement pt 1 --1 week no yoga-- Measurement pt 2 --1 week yoga-- Measurement pt 3
The first no-yoga week acts as a waitlist control.
2.3) Two group pre post design- Without control group but with a different comparison group- Mixed group- It is used for comparative research.
Pre                       Post
Yoga (15 people)     ----Yoga----
Ayurveda (15 people) ----Ayurveda----
Here ayurveda is called an active control group, as it is not idle like a normal control group. We can compare what additional benefit yoga gives over ayurveda.
In a two group design also we can have ANOVA, if more than two measurement points are there.
1) Normal control group- everything the same as the experimental group, except the intervention.
2) Active control group- everything the same as the experimental group, including an alternative intervention.
3) Waitlist control group- the group does not do the intervention for a specific period, but does it later.
3) Three group pre post test design- There are more than two groups in the research design. It can be of two types.
3.1) With control group
Yoga
Ayurveda
Control
3.2) Without control group
Yoga
Ayurveda
Naturopathy
To study short-lived effects which are visible immediately after practice, we have a concept called the washout period. E.g. the study of kevala kumbhaka after kapalabhati.
The longer the effect of the intervention lasts, the longer the washout period; the intervention's effect duration and the washout period are directly proportional.
Washout period- The gap between two successive practices, kept to ensure that the effect of the previous intervention has vanished. The less the subject variability in the design, the better; subject variability is less in a within group design, as the same subjects participate in the different activities. However, a washout design takes an extra day for the gap period, so it is advised mainly for smaller sample sizes. Refer to the example below of left nostril breathing and right nostril breathing.
The gap ensures that the right nostril effect is not influenced by the left nostril effect; we should be able to measure the effects of the two interventions independently. Now some will say that doing the right nostril first instead of the left shall have a different effect. This phenomenon is called the order effect, so we put a control: we divide the 30 people of the yoga group into two groups of 15 members each, each doing the practices in the opposite order.
Important Note: The above design is still categorized as a within group design, as the same set of people participates in all the activities. It is also RM ANOVA, as there are 4 assessment points.
Restriction- We restrict who can participate in the study by defining inclusion and exclusion criteria; e.g. we exclude old people from participating. One of the strategies to avoid confounding is to restrict admission into the study to subjects who have the same levels of the confounding factors. E.g. H(a) tries to find the relation between physical activity and heart disease; suppose age and gender are the two CFs of concern. The CFs can then be avoided by making sure that all subjects are males between ages 40-50.
Limitations:
1) It reduces the number of eligible subjects (an issue for sample size).
2) Restriction limits generalizability, i.e. you can't apply the study to women and other age groups.
The restriction factor is a source of variability that is not of primary interest to the experimenter. The aim is to remove the effect of nuisance factors that can be controlled. Here we exclude people from the study itself; it means total removal. E.g. we say surya namaskar improves memory, but IQ is also a confounding factor which affects memory; so we say that people with IQ > 140 or IQ < 30 are excluded from the experiment itself. This is basically barring people from taking part in the study so that the condition doesn't exist.
Matching- Here we make sure that the influence of the CFs is equally distributed. There is no random allocation; it is done only for quasi experiments, as the researcher intentionally chooses samples according to the CFs, e.g. gender. Gender will have an effect, so we take equal numbers of males and females in both groups, so that the source of the CF is equally distributed. There must be at least two groups; one group can be a control group or another comparison group. It looks like blocking, but there is no randomization, and it is done at the implementation stage.
Here we divide the number of subjects as per the source of the CF, e.g. age.
Cross over design- When there is an order effect, we do a cross over design; this was already discussed above. If it is a quasi experiment, go for matching: there is no randomization, but there is control. If it is a true experiment, go for blocking: randomization is there. Sometimes we need to go for a quasi experiment because we are working with a small sample; it also depends on the nature of the research.
Statistical methods of controlling variance
Blocking- Blocking is something you do at the level of analysis. You create homogeneous blocks for the various CFs and handle them statistically at the analysis level. After conducting the experiment, we analyse these homogeneous blocks separately. Blocking is used along with randomization.
E.g. male and female, or low IQ and high IQ. During the experiment stage we run the experiment normally, but we make two blocks at the analysis/statistics level, e.g. IQ (low IQ and high IQ). These blocks are homogeneous in that particular CF. We give a test to figure out the IQ and then segregate at the analysis level. Strictly it is not an experimental but a statistical way of control; however, we list it under experimental control as we have to plan it in advance.
"Block what you can and randomize what you cannot" is the thumb rule.
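Analysing homogeneous blocks separately can be sketched as below; the IQ block labels, outcome values and the helper name `block_means` are all hypothetical.

```python
# A minimal sketch of blocking at the analysis level: subjects are
# measured normally, then split into homogeneous IQ blocks and each
# block is summarised separately.

def block_means(records):
    """Mean outcome per block; records are (block_label, outcome) pairs."""
    sums, counts = {}, {}
    for block, value in records:
        sums[block] = sums.get(block, 0) + value
        counts[block] = counts.get(block, 0) + 1
    return {b: sums[b] / counts[b] for b in sums}

data = [("low_iq", 12), ("high_iq", 18), ("low_iq", 14),
        ("high_iq", 20), ("low_iq", 13), ("high_iq", 19)]
print(block_means(data))  # separate effect estimate per IQ block
```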
ANCOVA- Analysis of covariance. It controls for a covariate: a special variable which you measure in an experiment in order to control for it later, statistically. E.g. IQ: I must know people's IQ, so I measure it and keep it as a covariate in the experiment. I cannot exclude a person experimentally, so I measure the variable in order to control for it statistically.
Note: Control of variability is not possible in a non-experimental design. A true experiment has maximum internal validity, whereas a quasi experiment has intermediate internal validity.
Note: Two concepts which relate closely to the single blind study are placebo and nocebo. Placebo is the psychological effect which brings healing without actually taking the medicine, e.g. a patient getting cured by a sugar pill which looks like the tablet, on the doctor's advice. Nocebo means negative expectations resulting in a negative outcome.
Types of Biases
Bias and CFs go together: biases give rise to CFs, and both are part of the same phenomenon. There are seven types of biases:
1) Researcher Bias- The researcher is biased against one group and treats the other group well, e.g. gives good food to one group while ignoring the other.
2) Subject Bias- Here the subject modifies his responses to influence the outcome, especially during interviews.
3) Selection Bias- It refers to gathering the sample from a special group; the bias occurs at the time of selection. E.g. I gather the sample from a gurukula to show the effect of yoga in a better light. Non-probability sampling techniques contribute to this.
4) Recall Bias- Sometimes it is very difficult to recall, e.g. a doctor asking a patient how many mood swings he had in the last 6 months, or how many times he smokes in a day. If the current mood is good, it is very difficult to recall past bad moods; the present psychological state biases the recall. It refers to over-estimation or under-estimation of previous experience.
5) Observer/measurement bias- While taking or measuring data, or making an observation, you make an error, e.g. taking a wrong BP reading while on the phone.
6) Instrument Bias- It refers to error in the tools: the instrument is not calibrated well. Calibration means that the instrument must show the zero point when nothing is being measured.
7) Publication Bias- It refers to wrongly reporting results or not reporting negative results. If we report only positive results and hide negative ones, bias comes in. E.g. in a study, 3 datasets showed a significant change while one did not, so only the 3 were reported. It also results in the file drawer problem (the piling up of unreported studies).
Single blind study- When a participant doesn't know which group he is in, it is a single blind study. It controls participant bias, e.g. a patient lying to the doctor that he is fine when he has fever. Refer to the note on placebo and nocebo above.
Double blind study- When neither the experimenter nor the participant knows which group a person belongs to, it is called a double blind study. A third person secretly codes the participants into groups. It controls the experimenter's bias, e.g. the experimenter giving extra badam or an extra yoga session to one group.
Triple blind study- The person doing the analysis also does not know which group people belong to; the statistician too only sees codes. The third person remains objective and neutral, so bias is minimal. This controls statistical bias, e.g. statistically making one group look superior.
Types of measures:
1) Prospective- It is forward/future oriented, i.e. upcoming measurements of subjects, e.g. after an intervention. Here too we can measure repeatedly, say every year.
2) Retrospective- E.g. the 2 year case history of a patient in hospital; also called archival. E.g. the trend or events of the last 10 years. We go back.
3) Longitudinal- We follow subjects over a very long span, e.g. 70-80 years.
3.1) Twin studies concept- We trace persons through their life span. E.g. 2 twins: everything is the same but the environment is different. We study lifespan development, psychology and human brain development from embryo till death. In short, it is nature vs nurture, or genetic vs environmental factors.
Phase 3- Analysis
We analyse the data to conclude and generalize. First it is important to know the levels of measurement.
1) Categorical variables
a) Nominal
b) Ordinal
Ordinal- Here we order in ascending or descending order, e.g. ranks of scores. However, the distance between the attributes has no meaning and is not interpretable. E.g. first class is better than economy, and business class is in between; but the business-to-economy difference varies from flight to flight and from airline to airline.
2) Continuous variables
a) Interval
b) Ratio
Interval- Here there is a constant distance between two adjacent categories, because the numbers express quantities. E.g. the amount you pay for a plane ticket, the degrees Fahrenheit at your destination, the number of flying miles. E.g. a temperature scale, as the gap between points is uniform. But it doesn't have a true zero point: zero degrees Celsius does not mean the absence of temperature, and therefore it can't be used for ratios.
Ratio- The ratio scale is the most informative scale. It is an interval scale with the additional
property that its zero position indicates the absence of the quantity being measured. Here there is
always an absolute zero that is meaningful. This means we can construct a meaningful ratio with a
ratio variable. The criterion is that at zero there should be total absence of the property. E.g. zero
height or zero weight means total absence, so we can say that person A weighing 50 kg is two times as heavy as
person B weighing 25 kg. E.g. on the Celsius scale there is no absolute zero temperature, so no ratio is possible.
It is important to know that there is a hierarchy in the levels of measurement. At each level up the
hierarchy, the current level includes all the qualities of the one below and adds something new, like
General, Sleeper, 3rd AC and 2nd AC. It is better to have a higher level of measurement (interval or
ratio) than a lower one (nominal or ordinal). We must classify each variable as per its nature. A higher
measurement can be converted into a lower one, but not vice versa.
We try to infer a population parameter from a sample statistic derived from the sample. This process is
called the inferential process.
As discussed earlier, from the sample statistic we infer information about the population parameter; we
try to generalize the result.
1) Descriptive- It describes something about the sample. We have information in hand only about the
sample.
2) Inferential- Statistics which help us do the process of inference (from sample to population). Here
we assume certain conditions about the sampling distribution. Ideally we need to repeat the experiment
many times to increase validity, but it is not possible because of resource crunch.
1.1) Measures of central tendency- It is something which tells about the central theme. It tells about
the signal. E.g. I conduct an experiment- "Did you like breakfast?"; all values are different. But we are
interested in getting only one representative value through which I get the maximum
information/story of the sample.
The representative values so calculated are called triple M (Mean, Median and Mode).
Mean
We can't take all values, but we are interested in one value which tells the maximum story about the
sample. So we add all values and divide by the total number of values. E.g. for the below data the
mean is 3, as it is the most neutral; 1 is the low extreme and 5 is the high extreme. Mean = sum of all values
(1+2+3+4+5) / total number of values (5) = 3. Mean has a nickname, i.e. average.
Data: 1, 2, 3, 4, 5
Data with an extreme value: 1, 2, 3, 4, 500 (here the mean jumps to 102, so the mean is sensitive to extreme values)
Median
In the case of the median, we first arrange all the scores in ascending or descending order. Then, after
arranging the data, we pick the middlemost value. In case the scores are even in number, we
take the average of the 2 middlemost values. Medians are used where there are extreme values in
the data.
The median of the below data is 3, as it is the middlemost value in the list arranged in ascending
order.
Data arranged in ascending order: 1, 2, 3, 4, 500
The median of the below data, having an even number of values, is the average of the two middle values, i.e.
(2+3)/2 = 2.5.
Data arranged in ascending order: 1, 2, 3, 4
Mode- This refers to the score occurring most frequently in the data. We make a frequency table.
E.g. we have the scores 1, 4, 8, 6, 2, 7, 2, 8, 9, 4, 2, 6.
Scores Frequency
1 I
2 III
4 II
6 II
7 I
8 II
9 I
Since 2 occurs 3 times (most among all), it is the mode.
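The three measures above can be checked with Python's standard `statistics` module; a small sketch using the example data from these notes:

```python
# Mean, median and mode using Python's standard library,
# with the example data from these notes.
import statistics

print(statistics.mean([1, 2, 3, 4, 5]))        # 3
print(statistics.median([1, 2, 3, 4, 500]))    # 3 (robust to the extreme 500)
print(statistics.median([1, 2, 3, 4]))         # 2.5 (average of 2 and 3)
print(statistics.mode([1, 4, 8, 6, 2, 7, 2, 8, 9, 4, 2, 6]))  # 2
```

Note how the median of 1, 2, 3, 4, 500 stays at 3 while the mean would be pulled to 102 by the extreme value.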
Note: Normally the proportion of extreme values in data is small; otherwise the extremes are a genuine feature of the sample itself.
Dispersion means scattered, that is, scattering of scores. We study how much the scores deviate
from the mean. It tells about spread/variability/noise.
E.g. for the scores 1, 2, 3, 4, 500 there is high dispersion because of the extreme value 500.
For the scores 1, 2, 3, 4, 5 (mean X̄ = 3), the deviations work out as:
Score (X)  X − X̄   Deviation  Squared
1          1 − 3    −2         4
2          2 − 3    −1         1
3          3 − 3     0         0
4          4 − 3     1         1
5          5 − 3     2         4
Sum                  0         10
Since the sum of (X − X̄) is zero, we square the deviations. The variance is the sum of squared
deviations divided by N − 1, i.e. 10/(5 − 1) = 2.5.
So the standard deviation in our case will be the square root of 2.5, i.e. about 1.58.
Important note: We take the square root because we squared the values of (X − X̄) earlier. We need to be
on the same scale.
We use N − 1 as the denominator when we describe the sample. In the case of calculating the standard
deviation of the whole population, we use N as the denominator. Statistics is done only for those scores which can vary. E.g. there
are four corners in a classroom: the 1st person will have 4 choices, the 2nd person will have 3 choices,
but the last person will have no choice. When the mean is fixed, only N − 1 values can vary. This is the concept
of degrees of freedom, i.e. the actual number of scores which can vary.
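The N − 1 calculation above can be verified with the standard library, which provides both the sample (N − 1) and population (N) versions; a small sketch with the worked scores:

```python
# Sample variance and SD (N-1 denominator) vs. population SD (N denominator),
# using the worked example scores 1..5 with mean 3.
import statistics

scores = [1, 2, 3, 4, 5]
mean = statistics.mean(scores)                 # 3
sum_sq = sum((x - mean) ** 2 for x in scores)  # 10 (sum of squared deviations)
variance = sum_sq / (len(scores) - 1)          # 10/4 = 2.5
sd = variance ** 0.5                           # sqrt(2.5), about 1.58
print(variance, round(sd, 2))
print(statistics.stdev(scores))   # same N-1 result
print(statistics.pstdev(scores))  # N denominator (population version)
```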
The summary is that measures of central tendency tell about the signal and measures of dispersion tell
about the noise. And in research, we need to increase the signal and reduce the noise.
When we do the process of inference from sample to population, it is called inference. That is, we
predict population parameters from sample statistics.
There will be variation in the results if we select multiple samples, but it will be small if sampling is done well.
Inferential Process
Sampling error- The variation in the sample statistics which you derive after sampling
each time is called sampling error. Roughly, it is the variation among two sample statistics. This
sampling error tends to be smaller if the sample size is large and the sampling technique is random (i.e. the sample is a true
representative of the population). Refer to the below example of two experiments with multiple sample
statistics.
Sample means of group 1 (6 means of 6 samples) = 36, 35, 37, 34, 33, 34.5 (sampling error is less)
Sample means of group 2 (4 means of 4 samples) = 89, 35, 61, 21 (sampling error is more)
The reproducibility and generalizability of the research will be greater if we do the experiment many
times. Further, there might be less bias if we take information from multiple samples. Ideally we
need to do the experiment many times so that we get more sample means (more sample means = more
representative). But it is not practically possible. Therefore, we use statistics to infer two
values about the population: the population mean (μ) and the standard error (SE).
We derive these two values from the sample statistics, i.e. the sample mean and the sample standard
deviation, through inference.
To understand this inferential process, we need to study a theorem on the nature of distributions, i.e. the Central
Limit Theorem.
When we take a sample, we need to generalize. We need to find how reliable and accurate the sample is,
and how much the sample is a true representative. We can't afford to replicate the experiment again.
So the way out is the Central Limit Theorem.
Using this we infer μ and SE from one sample's statistics only. It has three statements.
Statement 1- When the size of the sample is 30 or greater (≥ 30), the sampling
distribution will be normally distributed.
The sampling distribution is the distribution of the means of multiple samples taken. However, in
practice we don't take multiple samples; rather, we use statistics to generate multiple sample
statistics from one sample.
E.g. Sample A (SS = 10, Mean = 35, SD = 20), Sample B (SS = 10, Mean = 40, SD = 15), Sample C
(SS = 10, Mean = 38, SD = 18). The distribution of all these means is called the sampling distribution.
Step 1- Write all the unique values; they are the scores in the below table.
Step 2- Write the frequency next to the unique values, e.g. 2 appears 2 times. Similarly write for the
others.
Scores Frequency
2 2
3 1
4 3
5 1
6 1
7 2
9 1
[Bar chart of the frequency distribution: the X axis shows the unique scores (2, 3, 4, 5, 6, 7, 9) and the Y axis shows the frequency.]
Note: The frequency distribution of a continuous variable is called a histogram. There are no gaps
among the bars for continuous variables. Continuous variables can take any value in an interval, i.e.
they may overlap, whereas discrete variables take unique values. The above figure shows bars for
discrete variables.
A normal distribution can be precisely defined using the mean and the standard deviation.
The main characteristics of a non-normal distribution are skewness (left tail or right tail) and kurtosis
(steep or broad).
Now let us return to Statement 1 of the Central Limit Theorem. As we know, on the basis of one
sample's statistics from our experiment, we derive multiple sample statistics through
statistical tools. One such statistical tool is bootstrap resampling.
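Assuming "Boot sampling" refers to bootstrap resampling, here is a minimal sketch of generating many sample means from a single sample by resampling with replacement (the sample data are made up for illustration):

```python
# Bootstrap resampling: generate many sample means from one observed sample
# by resampling with replacement. Sample values here are made up.
import random
import statistics

random.seed(0)
sample = [35, 40, 38, 36, 41, 33, 39, 37, 42, 34]  # one observed sample

boot_means = [
    statistics.mean(random.choices(sample, k=len(sample)))  # resample with replacement
    for _ in range(1000)
]
# The distribution of boot_means approximates the sampling distribution of the mean.
print(round(statistics.mean(boot_means), 1))
```

The frequency distribution of these 1000 bootstrap means is an estimate of the sampling distribution without ever collecting a second sample.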
The frequency distribution of the means of different samples is called the sampling distribution.
Now Statement 1 states that "When the size of the sample is 30 or greater, the
sampling distribution will be normally distributed." We have already understood what the sampling
distribution is. Now what is a normal distribution?
If the graph we derive from the sampling distribution has a bell shape, it shall be assumed to
be normally distributed. It is also called the Gaussian curve in mathematics.
The reason behind this assumption is that if the sampling distribution is not normal (i.e. bell
shaped), then our analysis becomes complicated. The bell-shaped distribution can be perfectly
defined in mathematics using only the mean and standard deviation. It is easy to run statistical
parametric tests and it is easy to comprehend. There is no bias to the left or right in a normal
distribution: 50% of values are less than the mean and 50% of values are greater than the mean.
If the distribution is not normal, we might need 10 more parameters apart from the mean and standard
deviation.
Normal distributions are unimodal around the centre, symmetric in the middle and asymptotic at the tails.
A normal distribution's right side is a mirror image of the left side; it is perfectly symmetrical
around its centre. Normal distributions are continuous and have tails that are asymptotic, which
means they approach the X axis but never touch it. The implication of this is that no matter how far one
travels along the number line, in either the positive or negative direction, there will still be some
area under any normal curve. The further a data point is from the mean, the less likely it is to occur.
E.g. intelligence, height and blood pressure. Most of the data tend to cluster around the mean.
✔ 68% of the area of a normal distribution is within one standard deviation of the mean.
✔ Approximately 95% of the area of a normal distribution is within two standard deviations of the mean.
✔ About 99.7% of the area under the curve falls within 3 standard deviations of the mean.
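These 68-95-99.7 areas can be checked against the standard normal CDF; a sketch using Python's `statistics.NormalDist`:

```python
# Checking the 68-95-99.7 rule from the standard normal CDF.
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1
for k in (1, 2, 3):
    area = z.cdf(k) - z.cdf(-k)      # area within k SDs of the mean
    print(k, round(area * 100, 1))   # -> 68.3, 95.4, 99.7
```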
The standard deviation determines the height and width of the graph, while the mean is at the centre. When the
standard deviation is large, the curve is short and wide, whereas when the standard deviation is small
the curve is tall and narrow.
Example: 95% of students at a school are between 1.1 m and 1.7 m tall. Assuming this data
is normally distributed, can you calculate the mean and standard deviation?
95% is 2 standard deviations either side of the mean (a total of 4 standard deviations), so:
Mean = (1.1 + 1.7)/2 = 1.4 m, and standard deviation = (1.7 − 1.1)/4 = 0.15 m.
It is good to know the standard deviation, because we can say how far any value is from the mean.
Note: The number of standard deviations from the mean is also called the standard score, sigma or
Z-score.
You can see on the bell curve that 1.85 m is 3 standard deviations from the mean of 1.4, so
your friend's height has a Z-score of 3.0.
It is also possible to calculate how many standard deviations 1.85 is from the mean.
How far is 1.85 from the mean? It is 1.85 − 1.4 = 0.45 m from the mean.
How many standard deviations is that? The standard deviation is 0.15 m, so:
0.45 m / 0.15 m = 3 standard deviations.
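The worked Z-score above can be reproduced in a couple of lines, using the values from the example (mean 1.4 m, SD 0.15 m):

```python
# Z-score of the friend's height from the worked example:
# mean 1.4 m, SD 0.15 m, height 1.85 m.
mean, sd = 1.4, 0.15
height = 1.85
z = (height - mean) / sd  # how many SDs above the mean
print(round(z, 1))        # -> 3.0
```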
Example 3: A survey of daily travel time had these results (in minutes):
26, 33, 65, 28, 34, 55, 25, 44, 50, 36, 26, 37, 43, 62, 35, 38, 45, 32, 28, 34
The mean is 38.8 minutes and the standard deviation is 11.4 minutes.
To convert 26: z = (26 − 38.8)/11.4 = −1.12.
So 26 is −1.12 standard deviations from the mean; the remaining values are converted in the same way.
Another example: a professor gives a test out of 60 and gets these scores:
20, 15, 26, 32, 18, 28, 35, 14, 26, 22, 17
Most students didn't even get 30 out of 60, and most will fail.
The professor decides to standardize the scores and fail only those students who are 1 standard
deviation below the mean.
The mean is 23 and the standard deviation is 6.6, and the standard scores are computed from these.
The cut-off is 1 standard deviation below the mean, i.e. below 16.4 marks.
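The professor's standardization can be sketched in Python; the notes' figures (mean 23, SD 6.6, cutoff 16.4) come out when the SD uses the N denominator:

```python
# Standardizing the exam scores: mean 23, SD ~6.6 (N denominator, matching the
# notes), and the fail cutoff one SD below the mean (~16.4 marks).
import statistics

marks = [20, 15, 26, 32, 18, 28, 35, 14, 26, 22, 17]
mean = statistics.mean(marks)   # 23
sd = statistics.pstdev(marks)   # ~6.6 (population SD, N denominator)
cutoff = mean - sd              # ~16.4, one SD below the mean
z_scores = [round((m - mean) / sd, 2) for m in marks]
print(mean, round(sd, 1), round(cutoff, 1))
```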
Standardizing makes life easier, as we only need one table rather than doing calculations individually for each
combination of mean and standard deviation. The Z-score helps to compare two variables on different scales
through standardization. A Z-score is computed for each value: it tells how many standard deviations that
value lies from the mean.
Statement 2- The mean of the sampling distribution is equal to the true mean of the population.
Assuming that Statement 1 is true, the average of all the means in the sampling distribution is assumed to
be equal to the mean of the population.
Statement 3- The standard error of the mean is SE = SD/√N, where N = sample size and SD = the standard deviation of the sample.
Note: The Central Limit Theorem works when the sample is a true representative, i.e. random techniques are
adopted in the selection of participants; the generalizability of such research is also greater.
If we take more samples: 1) the distribution becomes more and more normal, 2) the spread of the
distribution decreases, as per the Central Limit Theorem.
The distribution of an average tends to be normal, even when the distribution from which the
average is computed is decidedly non-normal. Example: an investor looking to analyse the overall
return for a stock index made up of 1,000 stocks can take random samples of stocks from the
index to get an estimate for the return of the total index. The samples must be random, and at least
30 stocks must be evaluated in each sample for the Central Limit Theorem to hold. Random samples
ensure a broad range of stocks across industries and sectors is represented in the sample. Stocks
previously selected must also be replaced for selection in other samples to avoid bias. The average
returns from these samples approximate the return for the whole index and are approximately
normally distributed. The approximation holds even if the actual returns for the whole index are not
normally distributed.
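The stock-index illustration can be mimicked with simulated data; a sketch drawing samples of size 30 from a deliberately non-normal (exponential) distribution with made-up parameters:

```python
# CLT sketch: means of samples (size 30) drawn from a decidedly non-normal
# (exponential) distribution still cluster around the true mean.
import random
import statistics

random.seed(1)
# Exponential distribution with rate 0.1 has true mean 1/0.1 = 10 (made-up values).
sample_means = [
    statistics.mean(random.expovariate(0.1) for _ in range(30))
    for _ in range(500)
]
# The 500 sample means form an approximately normal sampling distribution
# centred near the true mean of 10.
print(round(statistics.mean(sample_means), 1))
```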
The Central Limit Theorem is useful in finding out about population parameters.
The normal distribution is only for continuous variables. The curve will never touch the X axis; otherwise the
outcome would be certain. Both sides are mirror images, i.e. symmetric. Certainty means no probability.
The normal curve is also called a density plot. For the binomial distribution of discrete variables there
will be gaps in the charts.
Box Plot
A box plot tells us about the nature of the distribution (level of skewness) and is also an outlier
analysis. It helps us to know the distribution of the data and understand the extreme values of
the plot.
When to use the box plot:
First we arrange the whole data in ascending or descending order for the median. Then we divide the data
into four parts/segments. Each part is called a quartile.
[Box plot diagram: the box spans the interquartile range from the 1st quartile to the 3rd quartile, with each quartile holding 25% of the data; whiskers extend to the lower and upper fences, and values beyond the fences are outliers.]
The interquartile range is a measure of spread or dispersion.
Interquartile range (IQR) = Upper Quartile − Lower Quartile
We don't take the last values for calculating this range as they can be outliers.
The lines are called whiskers, like a cat's hanging moustache. The fences are drawn 1.5 × IQR beyond the
quartiles; any value lying beyond the fences is called an outlier. Outliers are extreme values beyond these fences.
E.g. the data 1 to 12, divided into four equal parts:
1, 2, 3 | 4, 5, 6 | 7, 8, 9 | 10, 11, 12
Median = (6+7)/2 = 6.5
Lower Quartile = 4, Upper Quartile = 9
Upper Fence = 9 + 1.5 × IQR = 16.5
Lower Fence = 4 − 1.5 × IQR = −3.5
Here the interquartile range = Upper Quartile − Lower Quartile, i.e. 9 − 4 = 5.
Hence in this example there are no outliers, as all values lie within the upper and lower fences.
However, if in this example there were the value 20 instead of 12, then 20 would be an outlier, as it falls outside
the upper fence of 16.5.
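The fence and outlier logic of this example can be sketched directly (using the quartile values 4 and 9 as in these notes; quartile conventions vary across software, so other tools may place the quartiles slightly differently):

```python
# Fences and outlier check for the worked box-plot example.
# Quartiles 4 and 9 are taken from the notes; conventions vary across tools.
q1, q3 = 4, 9
iqr = q3 - q1                  # 5
upper_fence = q3 + 1.5 * iqr   # 16.5
lower_fence = q1 - 1.5 * iqr   # -3.5

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 20]  # 20 replaces 12
outliers = [x for x in data if x < lower_fence or x > upper_fence]
print(upper_fence, lower_fence, outliers)  # -> 16.5 -3.5 [20]
```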
If X̄ is the mean of the sample, then we define the confidence interval as the interval in which the
true mean of the population may lie. Here the sample mean is treated as a point estimate.
E.g. the mean weight of the sampled apples in the orchard is 149 g, so sample mean = 149 g.
Therefore, we create a confidence interval of 147 g to 151 g within which the population mean
would lie.
% confidence- The higher the confidence %, the wider the interval. E.g. with a 99% confidence interval,
there is a 1% chance that the truth is outside it.
Standard deviation/variation- If the population variation is less, then the sample variation is also less.
The greater the variation, the wider the confidence interval.
Sample size- The larger the sample size, the more information you have about the population and the
more similar the sample is to the population. There will be less sampling error, and therefore a narrower
confidence interval.
Calculating confidence interval
Methods: informal method, traditional normal-based (90%, 95%, 99%), bootstrapping.
Step 1: note the sample statistics.
Sample size: n = 40
Mean: X̄ = 175
Standard deviation: s = 20
The confidence interval is X̄ ± Z × s/√n.
Step 2: decide what confidence interval we want. 90%, 95% and 99% are common choices. Then
find the "Z" value for that confidence interval (for 95%, Z = 1.960).
And we have:
175 ± 1.960 × 20/√40
Which is:
175 cm ± 6.20 cm
In other words: from 168.8 cm to 181.2 cm.
The value after the ± is called the margin of error.
The margin of error in this example is 6.20 cm.
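The interval above can be reproduced with a few lines:

```python
# 95% confidence interval from the worked example:
# n = 40, mean = 175 cm, s = 20 cm, Z = 1.960.
import math

n, mean, s, z = 40, 175, 20, 1.960
margin = z * s / math.sqrt(n)   # margin of error
print(round(margin, 2))         # -> 6.2
print(round(mean - margin, 1), round(mean + margin, 1))  # -> 168.8 181.2
```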
NHST
After the literature review, we observe or collect data. Based on the evidence we accept or reject our
original statement. This systematic process of doing inference is called NHST.
The whole process of evaluating research evidence is called
Null Hypothesis Significance Testing (NHST). It is a process to understand and evaluate a
hypothesis. It is a technical term.
A hypothesis is a technical term which gives an idea of your experimental result. It gives an idea of the
anticipated outcome, like an intelligent guess or prediction. E.g. "Can yoga reduce blood sugar in
diabetes?"
Null hypothesis- It negates the statement and means that there is no such situation existing. It is a
negation of the statement: there is no relationship or effect.
The whole process of evaluating a research idea is through the null hypothesis. You state and then
evaluate the null hypothesis. We can't prove things in yoga as it is a social science. However, we can
gain evidence for or against and then come to a conclusion. NHST uses inductive logic/reasoning. It
is a top-down approach: based on evidence we build on the hypothesis.
Types of hypothesis
1) One-tail hypothesis- The direction is certain. E.g. Bhramari can increase memory, or Bhramari
can decrease memory.
2) Two-tail hypothesis- The direction is not clear. E.g. Bhramari changes memory.
H(01)- The table is in the dining room. H(02)- The table is in the kitchen. H(03)- The table is in the garden.
Steps of NHST
If P > 0.05 then we fail to reject H(0) (villain); if P < 0.05 then we reject H(0) (hero).

                              Nature of H(0)
                              True                  False
Decision  Reject H(0)         Type I error (Alpha)  Power (1 − Beta)
          Fail to reject H(0) (1 − Alpha)           Type II error (Beta)
          Sum                 1                     1
Conditional probability- The probability of an event occurring provided another event has
happened. If two outcomes are complementary then their probabilities always sum to 1, e.g. power is 1 − Beta.
The hero of our research is the null hypothesis. When you make a decision, there is always a possibility of
making an error. The 1st type of error is called alpha error (Type I) and the second type is called beta error
(Type II).
Type I (Alpha)- When it doesn't work and you claim it works. It is a notorious/arrogant/serious
error. It is an overestimation.
Type II (Beta)- When it works and you claim it doesn't. You had potential but you didn't say so. It is
an innocent error. It is an underestimation.
Power- Deciding the effect rightly: the effect is there and so you claim it. The effect of the intervention is
captured correctly.
(1 − Alpha)- It doesn't work and you also said it doesn't work. You didn't prove anything by keeping
your mouth shut. It is not that much related to the effect; it doesn't capture the effect.
3) Fix Alpha- 0.05 (5%) is OK, 0.01 (1%) is good, 0.001 is super.
This means that the researcher will not tolerate more than 5% alpha error, so we fix it in advance.
There is no negative; it is just a percentage. 95% of the time it is fixed at 5% only.
4) Fix power (1 − Beta)- 0.80/0.95/0.99. We take Beta as 0.20 (20%), so power is 1 − Beta, i.e. 0.80.
Maximum power is 1.
5) Effect size- The amount of effect caused by the intervention. The larger the effect of the intervention, the
larger the effect size. It is the magnitude of the intervention as measured by the measuring tool. It also
depends upon how sensitive your tool is, e.g. using a weighing machine instead of a traditional weighing
instrument. The effect can't be seen as it is intangible, but we can measure it. The effect size is estimated
from previous literature through statistically rigorous meta-analysis.
6) Estimate sample size- The thumb rule is that the more rigorous the conditions, the larger the sample size
required. It depends on four factors:
Effect size: a smaller effect is harder to find, so a larger sample size is required.
Power: higher power is harder to secure, so a larger sample size.
Alpha: a lower alpha is harder, so a larger sample size.
Two-tail hypothesis: rejecting a two-tail hypothesis is harder, so the sample size is bigger.
We estimate the effect size so that we may calculate the correct sample size. The correct sample size is needed:
1) To save resources. Effect size (ES) helps to find the correct sample size (SS). More SS means more
resources.
2) It is ethically not right to have more subjects than required. We are not dealing with guinea pigs.
3) To ensure that we have sufficient power. Insufficient power causes indecisiveness/dilemma/
inconclusiveness. That is, in case of a null result, we would not know whether our hypothesis really
didn't work or the sample size was too small.
E.g. if P = 0.04, which is less than 0.05, we would reject H(0), i.e. we conclude that our alternative
hypothesis H(a) is correct, i.e. Bhramari improves memory. But if the P value is 0.51, which is greater
than 0.05, then we would fail to reject H(0), and there is the problem mentioned in point 3 above.
1) Literature review- We can choose past literature which resembles our study in two aspects: a) intervention, b)
measuring tool.
2) Pilot study- If no past paper is there, we do a dry run with a small sample size. This is to know
how a variable behaves, or to get a taste of the future experiment.
3) Consult experts- E.g. for purchasing a colour TV, we check brand, price, budget, technology and
then choose as per our convenience. Fisher's norm is that if power > 80% then the sample size is sufficient; it is a
general norm. Cohen gives benchmark effect sizes as Cohen's guidelines:
Cohen's guidelines (for Cohen's d): small ≈ 0.2, medium ≈ 0.5, large ≈ 0.8.
Power analysis
It is the process of using the effect size to estimate the sample size at the a priori stage, and of computing
the achieved power at the post hoc stage. Basically, find the unknown from the known factors.
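An a priori sample-size estimate can be sketched with the normal-approximation formula for comparing two group means; dedicated tools such as G*Power or statsmodels give slightly larger, t-based answers, and the formula and numbers here are standard textbook values, not from these notes:

```python
# A priori sample-size sketch for a two-group comparison using the normal
# approximation: n per group = 2 * ((z_alpha/2 + z_power) / d) ** 2.
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    z = NormalDist()
    z_a = z.inv_cdf(1 - alpha / 2)  # ~1.96 for two-tailed alpha = 0.05
    z_b = z.inv_cdf(power)          # ~0.84 for power = 0.80
    return math.ceil(2 * ((z_a + z_b) / d) ** 2)

# d = 0.5 (Cohen's medium effect), alpha = 0.05 two-tailed, power = 0.80:
print(n_per_group(0.5))  # ~63 per group (exact t-based tools give ~64)
```

Notice how the four factors from the notes appear directly: a smaller d, a lower alpha, or a higher power all push n upward.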
P-value
It is the probability of finding the observed change in the data given that the null hypothesis H(0) is true. Obviously the
lower the P value the better, because the chance that the event happened accidentally is minimised. It would be
unlikely that the result occurred by chance/randomness; it happened only because of the intervention.
The larger the effect of the intervention, the lower the P value.
If ES is small + SS is large = we shall be fine (but it is ethically wrong to take a larger sample than required).
But if:
ES is large + SS is small = because of the low sample size there can be low power and more error
(especially Type II error).
Statistical tests
Both kinds of tests below are done after the data are secured.
Assumption tests
Before running any test, we need to check if we are eligible to run the test. E.g. we won't have the
same approach for dengue and ordinary fever; we check the diagnosis through eligibility tests. Before the
confirmatory test we do a preliminary test; the latter is to ensure that we are eligible for the former.
These are called assumption tests: tests you perform to check an assumption. The assumption is whether
you are eligible to perform parametric tests. If the conditions in the assumption tests are satisfied, then you
can do parametric tests.
1) Check for normality- First we check the null hypothesis before performing the test. H(0) is that
the data are normally distributed. We have already discussed the features of normal
distributions. The data should not be negatively skewed (tail towards the left) or positively
skewed (tail towards the right). Nor should the distribution have kurtosis, i.e. be flat (SD more)
or steep (SD less). The name of the test for normality is the Shapiro-Wilk test.
If this condition is satisfied then we go for a parametric test, else a non-parametric test.
Suppose P = 0.012; then we reject H(0). H(0) is that the data are normally distributed. When we reject
H(0) we cannot run parametric tests.
The steps are-
1) Write H(0)
2) Look at the P value
3) See whether H(0) is rejected or we fail to reject it
4) Make an inference on the hypothesis as to which test to run
2) Equivalence of variance- First we check the null hypothesis before performing the test. H(0) is
that the variance is equal across the two independent groups. Here we check the variance (and hence the
standard deviation) of the groups. It is only for between-group designs. We check whether the variance is
equal across the two groups. The name of the test is Levene's test.
If the variance is not equal, then it is a violation of a parametric condition. Here, if the 1st condition is
satisfied, then we can run a parametric test with a correction even though the 2nd condition is not satisfied.
3) The data must be measured at least at the interval or ratio level. In other words, the variable must be
continuous. Only then do we do parametric tests.
E.g. P = 0.12: here we fail to reject H(0), where H(0) = the variances of the two groups are equal.
So we go for a parametric test.
In assumption tests, we are happy if the P value is > 0.05.
If all the conditions are met then go for parametric tests, else go for non-parametric tests.
Based on the assumption tests, we decide whether to go for parametric or non-parametric tests.
Parametric tests- They assume some conditions (assumption tests) about the distributions. We
prefer to run a parametric test if the conditions are met, as they have higher statistical power. Also,
normal distributions can be explained easily through statistical tests.
Non-parametric tests- They are distribution-free tests. If the conditions in the assumption tests are not met,
then non-parametric tests have higher statistical power.
As discussed already, H(0) must be known before the test, else it is a sin to perform the test.
Continuous data
Discrete data
Note: When the design is fixed, the statistics are fixed. Design and statistical tools go together.
Chi-square tests- They check proportions or ratios: whether the proportions are significantly the same. Is
there any statistically significant difference between the two groups of males and females below?
Group 1: M = 35, F = 25
Group 2: M = 28, F = 22
It is of two types: the goodness-of-fit test and the test of independence (contingency table).
E.g. 8 observations: M, F, M, F, M, M, F, F.
The expected proportion will be 50/50 for 2 levels, i.e. 0.5 and 0.5.
The expected proportion will be 33/33/33 for 3 levels, i.e. 0.33 each.
The expected proportion will be 25/25/25/25 for 4 levels, i.e. 0.25 each.
E.g. M = 35, F = 25.
Contingency table (expected frequency = row total × column total / grand total):
Gender   Yes              No               Total
Male     30×40/100 = 12   30×60/100 = 18   30
Female   70×40/100 = 28   70×60/100 = 42   70
Total    40               60               100
Then we get a P value, say P = 0.01. If any expected frequency is < 5, the test becomes invalid.
Note- In correlation the relationship is between two continuous variables, but here it is between discrete ones.
Note: After collecting the observed data, we calculate the expected frequencies, so this is done at the analysis level.
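The expected-frequency rule (row total × column total / grand total) for the table above can be sketched as:

```python
# Expected frequencies for the contingency table:
# expected = row total * column total / grand total.
row_totals = {"Male": 30, "Female": 70}
col_totals = {"Yes": 40, "No": 60}
grand_total = 100

expected = {
    (gender, answer): rt * ct / grand_total
    for gender, rt in row_totals.items()
    for answer, ct in col_totals.items()
}
print(expected)  # Male/Yes 12, Male/No 18, Female/Yes 28, Female/No 42
# Chi-square is valid only if every expected frequency is at least 5:
print(all(e >= 5 for e in expected.values()))
```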