
Correlation USP Semester III 2019

Correlation and Regression


Research questions often probe the relationships between variables by looking at simple
associations. For instance, you might wonder about the relationship between the extent
of public speaking experience you have and the amount of communication anxiety you
feel. Researchers might ask if the speed at which a report of an international crisis is filed
is linked to its likelihood of containing inaccuracies. Others might wonder how strong the
relationship is between communicator competence and satisfaction in marriage. These
sorts of matters invite using correlations.
In simplest terms, a correlation is “the extent to which two or more things are related
(‘co-related’) to one another”. Correlations simply identify the degree to which scores on
different variables coexist. Whether the associations are causally related or not depends
on the design of the study and is not directly measured by this statistic. A correlation
coefficient can have a value ranging from -1 to +1. Relationships at these extremes
(sometimes called “unit relationships”) are rare. Most of the time, researchers
find correlations that take the form of decimal numbers, or fractions. The correlation
coefficients themselves may indicate several types of relationships.
A direct relationship is indicated by a correlation coefficient with a positive sign
(although, of course, a positive sign rarely is included; it is implied by the lack of a
negative sign). This type of correlation shows that as one variable increases, so does the
other. A scatterplot of such data reveals a direct correlation (.79 in the original example). One might imagine that the variable on the horizontal axis is a set of ratings of
below). One might imagine that the variable on the horizontal axis is a set of ratings of
the amount of eye contact a speaker has with an audience, and the vertical axis is the
perceived persuasiveness of the speaker. The positive sign (assumed, at least) of the
coefficient indicates that the relationship is a direct one. The greater the correlation, the
closer the data points are to an imaginary line that might be drawn through the center of
the distribution.
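To make the coefficient concrete, here is a minimal sketch in Python that computes Pearson's r by hand; the eye-contact and persuasiveness ratings are invented for illustration, not drawn from any study mentioned here:

```python
from math import sqrt

def pearson_r(x, y):
    # Pearson product-moment correlation: covariance of x and y
    # divided by the product of their standard deviations
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# hypothetical ratings: eye contact (x) vs. perceived persuasiveness (y)
eye_contact = [1, 2, 3, 4, 5, 6]
persuasiveness = [2, 3, 3, 5, 6, 6]
r = pearson_r(eye_contact, persuasiveness)  # close to +1: a strong direct relationship
```

Because the two lists rise together, r comes out strongly positive; scattering the y-values would pull r toward zero.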

Semester III CU 2019 USP



Correlation coefficients preceded by negative signs indicate that the relationship is
inverse, which means that as one variable increases, the other one decreases. The
negative sign indicates the direction of the slope, not subtraction. Hence, a negative sign
does not indicate that the magnitude of a correlation coefficient is “less than” that of
another correlation coefficient.
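This point can be seen directly in a short sketch: with invented practice and anxiety data, the coefficient lands near -1, a strong relationship despite the minus sign:

```python
from statistics import mean, stdev

# hypothetical data: anxiety falls as speaking practice rises
hours_practiced = [1, 2, 3, 4, 5]
anxiety = [9, 7, 6, 4, 2]

n = len(hours_practiced)
mx, my = mean(hours_practiced), mean(anxiety)
# sample covariance divided by the two sample standard deviations
r = sum((x - mx) * (y - my) for x, y in zip(hours_practiced, anxiety)) / (
    (n - 1) * stdev(hours_practiced) * stdev(anxiety)
)
# r is near -1: the sign marks direction only; |r| measures strength
```

A correlation of -.95 therefore describes a stronger association than one of +.30; comparing coefficients means comparing their absolute values.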
Completing a correlational analysis takes some doing. First, a correlational study begins
with a hypothesis (or at least a problem question) that probes the association between
variables. Rather than asking if there is a simple difference between the means of two or
more groups, correlational hypotheses probe whether there is a variation in one variable
as another variable changes its values. For instance, researchers might suggest the
hypothesis that “there is a direct relationship between students' perceptions of a
teacher's source credibility and the amount of nonverbal immediacy behaviors,” or they
might advance the notion that “there is an inverse relationship between the amount of
jealousy communicated in messages between romantic partners and the length of their
relationships.” In short, a correlational hypothesis asks if there is an association between
one set of scores and another. Second, to examine correlations, researchers must gather
data on two measures (and sometimes more than two measures) to permit examination
of the “co-” relationship under study.
Third, not only do researchers look at the relationships between two variables, but they
also take steps to consider the potential influences of other variables on the relationship.
Either by reviewing literature or theory, or through personal observation, researchers
prepare “short lists” of other intervening variables that might influence relationships.
Then, these individual variables can be controlled or, at least, measured in the research
study so that they may be studied and statistically explained.


Fourth, once researchers decide to use correlations, they must check the assumptions
that underlie using the specific correlational tool. Naturally, the assumptions for different
types of correlations are different. Before researchers can complete correlation studies,
they must examine and report the evidence that assumptions underlying the use of the
statistics have been met.
A nonzero correlation does not mean that a causal relationship has been established. A
correlation coefficient is only a measure of how much the variables coincide with each
other. If there were a causal relationship between variables, we would expect to observe
a strong correlation, but it is not the case that a strong correlation alone means that a
cause-and-effect relationship exists. Other possibilities may explain high correlations.
Identifying a causal relationship often is viewed as a matter of logic, rather than
statistics, because claims about causal relationships between variables traditionally have
been based on the fulfillment of the Humean criteria, as elaborated for the social
sciences by Paul Lazarsfeld and his coworkers.
• A third factor may explain high correlations. For instance, until the virtual elimination of
polio, there was a strong correlation between the per capita amount of ice cream
consumed during a month by North Americans and the number of polio cases. Casual
observers wondered if something in ice cream might have contributed to susceptibility
to polio. Of course, the reason for the high correlation was that polio was a disease with
its highest incidence during the warm summer months. Naturally, ice cream sales tended
to increase during the summer months as well.
• Sometimes the causal relationship exists, but it is in the opposite direction presumed by
individuals. For instance, it was observed that the greater the number of small appliances
one owned, the fewer children one tended to have. This information seemed to suggest a
new breakthrough in family planning: sending small appliances such as toasters and hair
dryers to places where the population was exploding. Of course, the causal relationship
was not in the direction implied by the statement; it was in the opposite direction. If you
had few children, you could afford to buy small appliances for yourself, but if you had
many children, you might not have spare money to buy many small appliances.
• Sometimes correlations seem to advance causal relationships when the research
methods used did not permit drawing causal claims. For instance, survey research
showed a high correlation between one's amount of self-disclosure, trust, and
interpersonal solidarity (“feelings of closeness between people that develop as a result of


shared sentiments, similarities, and intimate behaviors” [Rubin, 1994, p. 223]). But the
survey measured all these matters at the same time. Identifying a clear starting point
may not really be possible. The survey method may show an association among variables,
but not which variable may trigger any effects.
In the case of a multivariate population, correlation can be studied through (a) the
coefficient of multiple correlation and (b) the coefficient of partial correlation, whereas
cause-and-effect relationships can be studied through multiple regression equations.
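As a sketch of the partial-correlation idea, the standard first-order formula, r_XY·Z = (r_XY − r_XZ·r_YZ) / √((1 − r_XZ²)(1 − r_YZ²)), can be computed directly; the three input coefficients below are invented for illustration:

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    # first-order partial correlation of X and Y, controlling for Z:
    # removes the part of the X-Y association attributable to Z
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# hypothetical zero-order correlations among X, Y, and a third variable Z
r = partial_r(r_xy=0.8, r_xz=0.5, r_yz=0.5)
```

If X and Y correlated only because both track Z (as in the ice-cream and polio example), the partial coefficient would fall near zero once Z is controlled.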
The cross-tabulation approach is especially useful when the data are in nominal form. Under it,
we classify each variable into two or more categories and then cross-classify the variables
in these subcategories.
Then we look for interactions between them which may be symmetrical, reciprocal or
asymmetrical. A symmetrical relationship is one in which the two variables vary together,
but we assume that neither variable is due to the other. A reciprocal relationship exists
when the two variables mutually influence or reinforce each other. Asymmetrical
relationship is said to exist if one variable (the independent variable) is responsible for
another variable (the dependent variable). The cross-classification procedure begins
with a two-way table which indicates whether or not there is an interrelationship
between the variables. This sort of analysis can be elaborated further by introducing a third
factor into the association and cross-classifying the three variables. By
doing so we may find a conditional relationship, in which factor X appears to affect factor Y only
when factor Z is held constant. The correlation, if any, found through this approach is not
considered a very powerful form of statistical correlation, and accordingly we use
other methods when the data happen to be ordinal, interval, or ratio data.
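The cross-classification step can be sketched with a simple two-way count table; the respondents and categories below are invented for illustration:

```python
from collections import Counter

# hypothetical nominal data: each respondent's (sex, preferred news medium)
responses = [("F", "print"), ("F", "online"), ("M", "online"),
             ("M", "online"), ("F", "print"), ("M", "print")]

# cross-classify: count every (row-category, column-category) pair
table = Counter(responses)

rows = sorted({r for r, _ in responses})
cols = sorted({c for _, c in responses})
for row in rows:
    # one line of the two-way table: counts for each column category
    print(row, [table[(row, col)] for col in cols])
```

Inspecting the cell counts (here, 2 women preferring print versus 2 men preferring online) is what suggests whether an interrelationship between the two nominal variables exists.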
Charles Spearman’s coefficient of correlation (or rank correlation) is the technique for
determining the degree of correlation between two variables in the case of ordinal data,
where ranks are given to the different values of the variables. The main objective of this
coefficient is to determine the extent to which the two sets of rankings are similar or
dissimilar.
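A minimal sketch of Spearman's formula, ρ = 1 − 6Σd² / (n(n² − 1)), where d is the difference between paired ranks, applied to two hypothetical sets of ranks:

```python
def spearman_rho(rank_x, rank_y):
    # rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), d = difference in paired ranks
    n = len(rank_x)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# hypothetical rankings of five speeches by two judges
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
rho = spearman_rho(judge_a, judge_b)  # -> 0.8: the rankings largely agree
```

Identical rankings give ρ = +1, exactly reversed rankings give ρ = −1; the value of .8 here reflects two orderings that disagree only on adjacent swaps.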
SIMPLE REGRESSION ANALYSIS
Regression is the determination of a statistical relationship between two or more
variables. In simple regression, we have only two variables: one variable (defined as
independent) is treated as the cause of the behaviour of the other (defined as the dependent
variable). Regression can only interpret what exists


physically, i.e., there must be a physical way in which the independent variable X can affect the
dependent variable Y. The basic relationship between X and Y is given by
Y’ = a + bX

where the symbol Y’ denotes the estimated value of Y for a given value of X.
This equation is known as the regression equation of Y on X (it also represents the
regression line of Y on X when drawn on a graph), which means that each unit change in
X produces a change of b in Y’, with b positive for direct and negative for inverse
relationships.
Thus, regression analysis is a statistical method for formulating a
mathematical model depicting the relationship among variables, which can then be used for the
purpose of predicting the values of the dependent variable, given the values of the
independent variable.
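The constants a and b are usually estimated by the method of least squares. A minimal sketch, using invented data that happen to lie exactly on the line Y = 1 + 2X:

```python
def fit_line(x, y):
    # least-squares estimates for Y' = a + bX:
    # b = cov(x, y) / var(x), a = mean(y) - b * mean(x)
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    a = my - b * mx
    return a, b

x = [1, 2, 3, 4]
y = [3, 5, 7, 9]          # invented data: exactly y = 1 + 2x
a, b = fit_line(x, y)     # recovers a = 1, b = 2
y_hat = a + b * 5         # predicted Y' for a new X of 5
```

With real data the points scatter around the line, and a and b are the intercept and slope that minimize the sum of squared prediction errors.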
