Ordination Methods

6/1/10
Correspondence Analysis (CA)
Detrended Correspondence Analysis (DCA)
Factor Analysis (FA)
Principles of Canonical Analysis
Redundancy Analysis (RDA)
Canonical Correspondence Analysis (CCA)
Canonical Correlation Analysis (CCorA)
Discriminant Analysis (DA)
Dichotomous Key for Ordination Methods
Not 100% accurate, but a good place to start (see Palmer 1998).

1. Direct Gradient Analysis ..... 2
   2. Few species ..... 4
      4. Monotonic responses to gradients (low beta) ..... Linear regression
      4. Nonmonotonic responses to gradients (high beta) ..... Generalized linear models
   2. Many species ..... 5
      5. Monotonic responses ..... RDA
      5. Nonmonotonic responses ..... 6
         6. Concerned about arch effect ..... DCCA
         6. Not concerned about arch effect ..... CCA
1. Indirect Gradient Analysis ..... 3
   3. Only distance values are available ..... 7
      7. Monotonic responses ..... PCoA
      7. Nonmonotonic responses ..... NMDS
   3. Raw data available ..... 8
      8. Monotonic responses ..... 9
         9. Variables noncommensurate ..... PCA - corr. matrix
         9. Variables commensurate ..... PCA - cov. matrix
      8. Nonmonotonic responses ..... 10
         10. Feel OK about prespecifying number of dimensions, not worried about local optima, not interested in species scores ..... NMDS
         10. Not as above, but willing to accept either arch effect or detrending/rescaling ..... 11
            11. Don't like arch, detrending OK ..... DCA
            11. Arch OK, or only interested in axis 1 ..... CA
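The key can also be read as a small decision function. The sketch below is only one way to encode it; the function name and the argument names are made up for illustration, and the returned strings are the end points of the key.

```python
def choose_ordination(direct, monotonic, many_species=True,
                      distances_only=False, commensurate=True,
                      arch_concern=True, nmds_ok=False):
    """Walk the dichotomous key and return the suggested method.

    All argument names are hypothetical; each one answers a couplet of the key.
    """
    if direct:                                        # couplet 1 -> 2
        if not many_species:                          # couplet 2 -> 4
            return "Linear regression" if monotonic else "Generalized linear models"
        if monotonic:                                 # couplet 5
            return "RDA"
        return "DCCA" if arch_concern else "CCA"      # couplet 6
    if distances_only:                                # couplet 3 -> 7
        return "PCoA" if monotonic else "NMDS"
    if monotonic:                                     # couplet 8 -> 9
        return "PCA - cov. matrix" if commensurate else "PCA - corr. matrix"
    if nmds_ok:                                       # couplet 10
        return "NMDS"
    return "DCA" if arch_concern else "CA"            # couplet 11


# Many species, monotonic responses, environmental data available:
print(choose_ordination(direct=True, monotonic=True))
# Raw data only, unimodal responses, arch not wanted:
print(choose_ordination(direct=False, monotonic=False))
```

As with the key itself, this is a starting point and not a substitute for judgment.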
Correspondence Analysis (CA)

Correspondence analysis was developed independently by several authors over a period of ca. 30 years and has been given many different names in the literature:
Contingency table analysis
RQ-technique
Reciprocal averaging
Correspondence analysis
Reciprocal ordering
Dual scaling
Homogeneity analysis

Correspondence analysis was first proposed for analyzing two-way contingency tables. In such tables, the states of the first descriptor (rows) are compared to the states of the second descriptor (columns). Data in each cell of the table are frequencies. These frequencies are positive integers or zeroes. In EEB, the most common application of CA is the analysis of species data (0/1, or abundance) at different sampling sites. A species-site table essentially contains frequencies.

In general, CA may be applied to any data table that is dimensionally homogeneous (i.e., the physical dimensions of all variables are the same) and only contains positive integers or zeroes. The χ² distance (D16), which is a coefficient that excludes double-zeroes, is used to quantify the relationship among rows and columns. NB: Some authors have questioned the efficacy of the χ² distance for certain types of data.
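To make the χ² distance concrete, here is a minimal sketch in plain Python. It assumes the usual form of the coefficient: squared differences between row profiles, each weighted by the inverse of the relative column total. The function name and the small table are hypothetical.

```python
import math


def chi_square_distance(table, i, k):
    """Chi-square distance between rows i and k of a frequency table.

    Assumed form: sqrt( sum_j (1 / p_+j) * (y_ij / y_i+ - y_kj / y_k+)^2 ),
    where p_+j is the relative total of column j.
    """
    total = sum(sum(row) for row in table)
    col_sums = [sum(col) for col in zip(*table)]
    row_i, row_k = sum(table[i]), sum(table[k])
    acc = 0.0
    for j, col_total in enumerate(col_sums):
        if col_total == 0:
            continue  # an empty column carries no information
        diff = table[i][j] / row_i - table[k][j] / row_k
        acc += diff * diff / (col_total / total)
    return math.sqrt(acc)


# Hypothetical sites x species table; rows 0 and 1 have the same profile.
sites = [[10, 20, 0],
         [5, 10, 0],
         [0, 5, 15]]
print(chi_square_distance(sites, 0, 1))  # identical profiles
print(chi_square_distance(sites, 0, 2))  # different profiles
```

Note that a species absent from both rows being compared contributes a zero term to the sum, which is the sense in which double-zeroes are excluded.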

CA is primarily a method of ordination. As such, it is similar to PCA; it preserves in the space of the principal axes (i.e., after rotation), the Euclidean distance between profiles of weighted conditional probabilities. In other words, CA preserves the χ² distance between the rows and the columns of the contingency table. Correspondence analysis proceeds in three steps:
(1) The contingency table is transformed into a table of contributions to the Pearson chi-square statistic after fitting a null model to the table.
(2) Singular value decomposition (SVD) is applied to that table, and the eigenvalues and eigenvectors are computed (as in PCA).
(3) Further matrix manipulations lead to the tables required for plotting in ordination space.
CA
- Example -
Let's use an example other than the stand × species situation we have been looking at (although we could do this here too) and consider the relative abundance (0, +, ++) of a particular species observed at 100 sites. The temperature at each site was recorded and coded (1, 2, 3):

                     Species is (Descr.-2):
Temp. (Descr.-1)     Rare (0)   Abund. (+)   Very Abund. (++)   Row Sums
Cold (1)                10         10             20               40
Med. (2)                10         15             10               35
Warm (3)                15          5              5               25
Col. Sums               35         30             35              100
Matrix Q contains the proportions p_ij and the marginal totals p_i+ and p_+j of the rows and columns, respectively. Identifiers of the rows and columns are given outside the matrix brackets in parentheses:
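The proportions and marginal totals follow directly from the counts. The sketch below recomputes them in plain Python, taking the temperature states as rows and the abundance classes as columns (an assumption consistent with the row profile 0.25, 0.25, 0.50 quoted for the first row later in the example). Variable names are illustrative.

```python
# Counts: rows = temperature (cold, medium, warm); columns = species is 0, +, ++
counts = [[10, 10, 20],
          [10, 15, 10],
          [15, 5, 5]]
row_labels = ("Cold (1)", "Med. (2)", "Warm (3)")

n = sum(sum(row) for row in counts)               # grand total (100 sites)
P = [[f / n for f in row] for row in counts]      # proportions p_ij
row_tot = [sum(row) for row in P]                 # p_i+
col_tot = [sum(col) for col in zip(*P)]           # p_+j

for label, row, p_row in zip(row_labels, P, row_tot):
    cells = "  ".join(f"{p:.2f}" for p in row)
    print(f"{label:9s} {cells}   | {p_row:.2f}")
print(" " * 9, "  ".join(f"{p:.2f}" for p in col_tot))
```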
The eigenvalues of Q'Q are: λ1 = 0.09613 (70.1%), λ2 = 0.04094 (29.9%), and λ3 = 0 (because of centering).
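These eigenvalues can be checked numerically. The NumPy sketch below builds the table of contributions to the chi-square statistic from the proportions and takes the eigenvalues of its cross-product. The name Qbar is an assumption used here to keep that table distinct from the table of proportions; temperature states are taken as rows.

```python
import numpy as np

# Rows = temperature (cold, medium, warm); columns = species is 0, +, ++
counts = np.array([[10, 10, 20],
                   [10, 15, 10],
                   [15, 5, 5]], dtype=float)

P = counts / counts.sum()          # proportions p_ij
r = P.sum(axis=1)                  # row totals p_i+
c = P.sum(axis=0)                  # column totals p_+j
expected = np.outer(r, c)          # null model: independence of rows and columns

# Contributions to chi-square: (p_ij - p_i+ p_+j) / sqrt(p_i+ p_+j)
Qbar = (P - expected) / np.sqrt(expected)

# Eigenvalues of the symmetric cross-product matrix, largest first
eigvals = np.sort(np.linalg.eigvalsh(Qbar.T @ Qbar))[::-1]
share = eigvals[:2] / eigvals[:2].sum()

print("eigenvalues:", np.round(eigvals[:2], 5))
print("percent of total:", np.round(100 * share, 1))
```

The sum of the eigenvalues equals the Pearson chi-square statistic of the table divided by the number of observations.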
The normalized eigenvectors of Q'Q are then:
And the normalized eigenvectors of QQ' are then:
In Scaling Type-1, F and V are determined to produce a CA joint plot:
Now, to put the rows (matrix F) at the centroids of the columns, the position of each row along an ordination axis is computed as the mean of the column positions, weighted by the relative frequencies of the observations in the various columns of that row...
Consider the first row of the original data. The relative frequencies of that row are 0.25, 0.25, 0.50. Multiplying matrix V by that vector provides the coordinates of the first row of the ordination diagram:
continuing:
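The centroid property can be verified for all three rows at once. This NumPy sketch assumes the standard scaling type-1 construction (column scores V from the right singular vectors divided by sqrt(p_+j), row scores F from the left singular vectors times the singular values, divided by sqrt(p_i+)); the names U_hat, V and F are chosen to match the text, and the signs of the axes are arbitrary.

```python
import numpy as np

# Rows = temperature (cold, medium, warm); columns = species is 0, +, ++
counts = np.array([[10, 10, 20],
                   [10, 15, 10],
                   [15, 5, 5]], dtype=float)
P = counts / counts.sum()
r, c = P.sum(axis=1), P.sum(axis=0)
Qbar = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

# SVD of the chi-square contributions; keep the 2 non-trivial axes
U_hat, w, Ut = np.linalg.svd(Qbar)
k = 2
V = Ut.T[:, :k] / np.sqrt(c)[:, None]              # column scores
F = (U_hat[:, :k] * w[:k]) / np.sqrt(r)[:, None]   # row scores

# Row profiles: relative frequencies within each row
profiles = P / r[:, None]
F_from_centroids = profiles @ V                    # weighted means of column positions

print("profile of row 1:", profiles[0])
print("F:\n", np.round(F, 4))
print("profiles @ V:\n", np.round(F_from_centroids, 4))
```

The two printed matrices agree, which is exactly the statement that each row sits at the centroid of the columns.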
Now, using F and V, we can construct the ordination plot:
CA using R (data file: CA.csv)
Data Tables
Correspondence analysis has been applied to many types of data tables other than contingency tables. However, as a caveat, recognize again that in order for CA to work correctly, the data table must be dimensionally homogeneous (i.e., in the same physical units) and nonnegative (≥ 0). If the data do not meet these assumptions, they may be transformed or recoded. This is a critical step in CA.
Arch Effect
Let's return to the notion of coenocline distortion that we first considered in PCA. Recall that most of these procedures require linear (or at least monotonic) responses. Species data, in particular, are usually unimodally distributed across a gradient. Recall that this problem usually manifests itself in the form of an arch or horseshoe in the data projection. Some ecologists are willing to tolerate this distortion, while others feel that an attempt should be made to recover the original gradient via detrending.
The most extreme form of the arch effect usually occurs when a Euclidean distance (ED) measure is applied to species abundance data. A horseshoe is formed because the ends of Axis-1 actually contract and fold inwards, and bend along Axis-3. This is because ED considers the extreme sites to be very near each other. In most instances, CA does not exhibit such dramatic folding towards the terminal portions, but rather just bends along Axis-1 to form an arch.
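A small simulation shows why ED treats the extreme sites as near neighbors. The sketch below uses three hypothetical species with Gaussian (unimodal) responses along a gradient; the optima, peak abundance and tolerance are invented for illustration.

```python
import math


def abundance(site, optimum, peak=10.0, tolerance=1.5):
    """Gaussian (unimodal) response of one species at a gradient position."""
    return peak * math.exp(-((site - optimum) ** 2) / (2 * tolerance ** 2))


def euclidean(a, b):
    """Euclidean distance between two site vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


optima = (5.0, 10.0, 15.0)        # hypothetical species optima
sites = list(range(0, 21))        # gradient positions 0..20
table = [[abundance(s, o) for o in optima] for s in sites]

d_ends = euclidean(table[0], table[20])   # the two ends of the gradient
d_near = euclidean(table[0], table[5])    # one end vs. a site 5 units away

print(f"ED between positions 0 and 20: {d_ends:.3f}")
print(f"ED between positions 0 and 5:  {d_near:.3f}")
```

The two ends of the gradient share no species, yet because both are nearly empty they come out far closer than two sites only 5 units apart. That is the contraction that folds the ends inwards.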
Detrended Correspondence Analysis (DCA)

When a single axis is enough to order the sites and species correctly, a second axis, which is independent of the first, can be obtained by folding the first axis in the middle and bringing the ends together. Subsequent independent axes can be obtained by folding the first axis in three parts, four parts, etc. These folded axes are the source of the arch, and the procedure that removes them is referred to as Detrended Correspondence Analysis (DCA).

Recall this data set used to evaluate PCA & PCO where 3 species were unimodally distributed across a coenocline:
[Figure: the coenocline data set]
[Figure: PCA w/Euclidean Distance vs. DCA via quadratic polynomial (Axis 1 vs. Axis 2)]

Two main approaches have been proposed to remove the arch effect: detrending by polynomials (previous example), and detrending by segments. Both methods lead to detrended correspondence analysis.
Detrending by Segments
When detrending by segments (Hill and Gauch 1980), axis-I is divided into a number of "segments" and, within each one, the mean of the scores along axis-II is made equal to zero; in other words, data points in each segment are moved along axis-II to make their mean coincide with the abscissa. Proximities among points should in no case be interpreted as meaningful! Segments can generate large differences in scores for points that are near each other in the original ordination but happen to be on either side of a segment division. The number of segments is arbitrary. Different numbers of segments lead to different ordinations.
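The core operation is easy to sketch. The plain-Python function below applies segment-wise centering once to a finished pair of axes; this is a simplification for illustration (published implementations differ in their details), and the function name and the toy arch are hypothetical.

```python
def detrend_by_segments(axis1, axis2, n_segments):
    """Center axis-II scores to zero mean within equal-width segments of axis-I."""
    lo, hi = min(axis1), max(axis1)
    width = (hi - lo) / n_segments or 1.0
    # Segment index of each point; the maximum falls in the last segment
    seg = [min(int((x - lo) / width), n_segments - 1) for x in axis1]
    means = {}
    for s in set(seg):
        values = [y for y, t in zip(axis2, seg) if t == s]
        means[s] = sum(values) / len(values)
    return [y - means[s] for y, s in zip(axis2, seg)]


# Toy arch: axis-II is a quadratic function of axis-I
x = [i * 0.5 for i in range(21)]            # 0.0 .. 10.0
y = [-(v - 5.0) ** 2 for v in x]

d5 = detrend_by_segments(x, y, 5)
d4 = detrend_by_segments(x, y, 4)
print([round(v, 2) for v in d5])
print([round(v, 2) for v in d4])
```

Comparing the two outputs shows both caveats from the text: the result depends on the number of segments, and neighboring points on either side of a segment boundary receive abruptly different scores.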
Various software packages use 10 as a minimum number of segments and 46 as a maximum; 26 being a recommended starting or default value. This of course necessitates data sets with considerably more observations than 26. There are no empirical rules for the "correct" number of detrending segments. After detrending by segments, the DCA ordination has the interesting property that the axes are scaled in units of the average standard deviation (SD) of species turnover (Gauch 1982). Along a regular gradient, a species typically appears, rises to a modal value, and disappears in 4 SD; similarly, a complete turnover in species composition often occurs over 4 SD. Thus, the length of axis-I is often used as a measure of the length of the ecological gradient.
Detrending by Polynomials
Detrending by polynomials (Hill and Gauch 1980) directly follows from the fact that an arch is produced when a gradient of sufficient length is present in the data. When a sufficient number of species are present and replace each other along the gradient, the second axis of the CA approaches a quadratic function of the first one (i.e., a second-degree polynomial). When detrending is sought, detrending by polynomials is an attractive method because it results in a continuous function of the previous axes, without the discontinuities generated by detrending by segments. However, the downside of this method is that it imposes a very specific polynomial model to which the data must correspond. It also does not solve the compression of the gradient at the ends of the ordination axes.
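In its simplest form, polynomial detrending replaces the second axis by the residuals of its regression on a polynomial of the first axis. The NumPy sketch below does this once, after the fact, on a synthetic arch; it is an illustration of the idea and not the published algorithm, and the function name and data are hypothetical.

```python
import numpy as np


def detrend_by_polynomial(axis1, axis2, degree=2):
    """Return axis-II scores with a polynomial function of axis-I removed."""
    coeffs = np.polyfit(axis1, axis2, degree)      # least-squares fit
    return axis2 - np.polyval(coeffs, axis1)       # residuals


# Synthetic arch: a quadratic in axis-I plus a small deterministic wiggle
axis1 = np.linspace(-1.0, 1.0, 19)
axis2 = 0.5 - axis1 ** 2 + 0.01 * np.sin(7 * axis1)

residual = detrend_by_polynomial(axis1, axis2)
print("spread before:", round(float(axis2.std()), 4))
print("spread after: ", round(float(residual.std()), 4))
```

Unlike segment-wise centering, the correction is a smooth function of axis-I, so nearby points are shifted by nearly the same amount.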
DCA of WI Forest Data

[Figure: DCA Axis 1 vs. DCA Axis 2, 10 segments; R² Axis-I = 0.866, R² Axis-II = 0.076. NB: wedge effect.]
To Detrend or Not To Detrend That is the Question...

The controversy over detrending has raged in the literature for well over 20 years now. Wartenberg et al. (1987) argue that the arch is an important and inherent attribute of the distances among sites, not simply a mathematical artifact. The only effect of DCA is to flatten the distribution of points onto axis-I. They also argue that detrending by segments is completely arbitrary and has no theoretical justification. Peet et al. (1988) still support DCA on the grounds that detrending and rescaling may facilitate ecological interpretation, and called for improved algorithms.

Minchin (1987) produced a nice comparison of several ordination techniques and found DCA to perform poorly on most counts. He found that DCA actually removed real pattern from the data and produced significant distortion, which he referred to as a "tongue" (subsequently a "wedge") in the data; this was simply an artifact of the algorithm. However, Palmer (2010) argues for the viability of DCA, especially in certain applications. He highlights the characteristics of both DCA and NMDS at his Ordination website.
Factor Analysis
In the social sciences, analysis of the relationships among the descriptors of a multidimensional data matrix is frequently carried out via Factor Analysis (FA). Recall that the goal of PCA is to account for a maximum amount of the variance in the data, whereas the goal of factor analysis is to account for the covariance among descriptors. Put another way, PCA is directed towards reducing the diagonal elements of R. Factor analysis is directed more towards reducing the off-diagonal elements of R. Since reducing the diagonal elements reduces the off-diagonal elements and vice versa, both methods achieve much the same thing.
To do this, FA assumes that the observed descriptors are linear combinations of hypothetical underlying variables (i.e., the factors). Originally FA was used to evaluate such things as intelligence. Many variables could be measured such as age, parental education, family income, etc. Multiple variables might play out to show that Factor-1 was determined by all variables related to education and Factor-2 to socio-economic conditions (for example). There are few applications of FA in EEB, so I will not cover it in depth. An excellent treatment can be found in Tabachnick and Fidell (1996).
Canonical Analysis
Canonical analysis is the simultaneous analysis of two, or possibly several, data tables. It permits biologists to do a direct comparison of two data matrices. Hence, canonical analysis and its derivatives are known as direct ordination methods. Often, in ecology, one is interested in the relationship between a first table describing species composition and a second table of environmental descriptors, observed at the same locations (i.e., objects or samples).
Previous to this, we have considered indirect ordination methods (PCA, PCO, NMDS, CA, DCA), in that we would ordinate a species × stand matrix and then conduct some form of correlation or regression analysis on the ordination vectors to relate objects or descriptors to externally obtained environmental information. This procedure is performed a posteriori. In canonical analysis, with two matrices (X and Y), one is constrained by the other, and both are examined simultaneously. This permits one to directly test a priori hypotheses by bringing out all of the variance of Y that is directly related to X and allowing formal tests of the hypotheses.
Canonical Form
In mathematics, a canonical form is the simplest and most comprehensive form to which certain functions, relations, or expressions can be reduced without loss of generality. For example, the canonical form of a covariance matrix is its matrix of eigenvalues. In general, most methods of canonical analysis employ eigenanalysis (some extensions have been described using NMDS). Canonical analysis combines the concepts of ordination and regression. It involves a response matrix Y and an explanatory matrix X. (See next slide.) Like previous ordination methods, canonical analysis produces orthogonal axes from which scatter diagrams may be plotted.
[Figure: three data layouts, each with objects 1 to n as rows]
(a) Y (variables y1...yp): simple ordination of matrix Y: PCA, CA, etc.
(b) y (a single variable) and X (variables x1...xm): ordination of y (single axis) under the constraint of X, aka multiple regression
(c) Y (variables y1...yp) and X (variables x1...xm): ordination of Y under the constraint of X: Redundancy Analysis (RDA) or Canonical Correspondence Analysis (CCA)
Problems of canonical analysis can be represented via a partitioned covariance matrix resulting from the fusion of the Y and X data sets, which produces a joint dispersion matrix S_Y+X:

S_Y+X = | S_YY  S_YX |
        | S_XY  S_XX |

Submatrices S_YY (order p × p) and S_XX (m × m) concern each of the two sets of descriptors, respectively, whereas S_YX (p × m) and its transpose S'_YX = S_XY (m × p) account for the covariances among the descriptors of the two groups.
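The partitioning is easy to see numerically. The NumPy sketch below fuses two random tables (purely hypothetical data, with p = 3 response variables and m = 2 explanatory variables) and slices the four submatrices out of the joint dispersion matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 10, 3, 2                       # objects, Y variables, X variables
Y = rng.normal(size=(n, p))              # hypothetical response table
X = rng.normal(size=(n, m))              # hypothetical explanatory table

# Joint dispersion matrix of the fused table [Y | X], order (p + m)
S = np.cov(np.hstack([Y, X]), rowvar=False)

S_YY = S[:p, :p]     # p x p: covariances among the Y descriptors
S_YX = S[:p, p:]     # p x m: covariances between Y and X
S_XY = S[p:, :p]     # m x p: transpose of S_YX
S_XX = S[p:, p:]     # m x m: covariances among the X descriptors

print(S_YY.shape, S_YX.shape, S_XY.shape, S_XX.shape)
```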
Redundancy Analysis
In redundancy analysis (RDA), each canonical ordination axis corresponds to a direction, in the multivariate scatter of objects (Y), which is maximally related to a linear combination of the explanatory variables X. A canonical axis is thus similar to a principal component. Two ordinations of the objects are obtained: (1) linear combinations of the Y variables (matrix F in PCA), (2) linear combinations of the fitted Y-hat variables (matrix Z), which are thus also linear combinations of the X variables. RDA preserves the Euclidean distance among objects in matrix Y-hat containing values of Y fitted by regression to the explanatory variables X.
Canonical Correspondence Analysis

Canonical correspondence analysis (CCA) is similar to RDA. The difference is that it preserves the χ² distance (as in CA), instead of the Euclidean distance among objects. Calculations are a bit more complex since Y-hat contains fitted values obtained by weighted linear regression of matrix Q of correspondence analysis on the explanatory variables X. As in RDA, two ordinations of the objects are obtained.
Canonical Correlation Analysis

In canonical correlation analysis (CCorA), the canonical axes maximize the correlation between linear combinations of the two sets of variables Y and X. This is obtained by maximizing the among-variable-group covariance (or correlation) with respect to the within-variable-group covariance. Two ordinations of the objects are again obtained.
Canonical Discriminant Analysis

In canonical discriminant analysis, the objects are divided into k groups, described by a qualitative descriptor. The method maximizes the dispersion of the centroids of the k groups. This is obtained by maximizing the ratio of the among-object-group dispersion over the pooled within-object-group dispersion.
Canonical Analysis
Unfortunately, we do not have the time to develop the details of the algebra of each of the 4 methods of canonical analysis previously described. But you have now gained all of the skills necessary to interpret the details on your own should you need to pursue one of these analyses. Excellent sources of information on these methods are Legendre and Legendre (1998), ter Braak and Šmilauer (1998), and Lepš and Šmilauer (2003).
As an alternative to a detailed treatment of the mathematics behind each method, I would like to develop a worked example. Let's develop a data set using the number of fish observed at 10 sites along a transect running from the beach of a Caribbean island, with water depths going from 1 to 10 m. The first three sites are on sand and the others alternate between coral and "other substrate" (coded as 0/1).
Tropical Fish Data Set

Site No.  Sp1  Sp2  Sp3  Sp4  Sp5  Sp6  Sp7  Sp8  Sp9  Depth (m)  Coral  Sand  Other
1           1    0    0    0    0    0    2    4    4      1        0      1      0
2           0    0    0    0    0    0    5    6    1      2        0      1      0
3           0    1    0    0    0    0    0    2    3      3        0      1      0
4           1    4    0    0    8    1    6    2    0      4        0      0      1
5           1    5   17    7    0    0    6    6    2      5        1      0      0
6           9    6    0    0    6    2   10    1    4      6        0      0      1
7           9    7   13   10    0    0    4    5    4      7        1      0      0
8           7    8    0    0    4    3    6    6    4      8        0      0      1
9           7    9   10   13    0    0    6    2    0      9        1      0      0
10          5   10    0    0    2    4    0    1    3     10        0      0      1
%          60   50   40   30   20   10   45   35   25

Because we wish to conduct a direct gradient analysis (i.e., we have both species data and environmental data from the same samples), and we have numerous species (9) with roughly monotonic responses (although one may be unimodal; e.g., Sp7), we select RDA as the method of choice. RDA is particularly appropriate when the gradients are short and species distributions are linear (or generally monotonic). The software of choice for this type of analysis has for the last decade been CANOCO. Mathematically, this software is excellent, but its ease of use is not the best and its graphics are poor. R now has applications to handle most of these ordination procedures (i.e., RDA and CCA).
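For readers who want to see the mechanics, RDA can be sketched as a multivariate regression of Y on X followed by a PCA of the fitted values. The NumPy code below does this for the fish data, with the species counts transcribed as they appear in the table; it is a bare-bones illustration under those assumptions, not a replacement for CANOCO or an R package. The "other" substrate column is left out because the three substrate dummies always sum to 1.

```python
import numpy as np

# Species counts (one row per species, sites 1..10), as transcribed from the table
species = np.array([
    [1, 0, 0, 1, 1, 9, 9, 7, 7, 5],
    [0, 0, 1, 4, 5, 6, 7, 8, 9, 10],
    [0, 0, 0, 0, 17, 0, 13, 0, 10, 0],
    [0, 0, 0, 0, 7, 0, 10, 0, 13, 0],
    [0, 0, 0, 8, 0, 6, 0, 4, 0, 2],
    [0, 0, 0, 1, 0, 2, 0, 3, 0, 4],
    [2, 5, 0, 6, 6, 10, 4, 6, 6, 0],
    [4, 6, 2, 2, 6, 1, 5, 6, 2, 1],
    [4, 1, 3, 0, 2, 4, 4, 4, 0, 3],
], dtype=float).T                                   # -> 10 sites x 9 species

depth = np.arange(1, 11, dtype=float)
coral = np.array([0, 0, 0, 0, 1, 0, 1, 0, 1, 0], dtype=float)
sand = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0], dtype=float)
X = np.column_stack([depth, coral, sand])           # "other" dropped (collinear)

# Step 1: center both tables and regress every species on X
Yc = species - species.mean(axis=0)
Xc = X - X.mean(axis=0)
B, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
Y_hat = Xc @ B                                      # fitted values

# Step 2: PCA of the fitted values gives the canonical axes
n = Yc.shape[0]
eigvals, U = np.linalg.eigh(Y_hat.T @ Y_hat / (n - 1))
order = np.argsort(eigvals)[::-1]
eigvals, U = eigvals[order], U[:, order]
k = X.shape[1]                                      # at most 3 canonical axes here

F = Yc @ U[:, :k]       # site scores: linear combinations of the Y variables
Z = Y_hat @ U[:, :k]    # fitted site scores: linear combinations of the X variables

total_var = np.trace(Yc.T @ Yc) / (n - 1)
explained = eigvals[:k].sum() / total_var
print("canonical eigenvalues:", np.round(eigvals[:k], 3))
print("proportion of species variance explained by X:", round(float(explained), 3))
```

Matrices F and Z correspond to the two ordinations of the objects described under Redundancy Analysis.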
RDA using R
- Tropical Fish Data Set -
Discriminant Analysis
A common situation arises in EEB applications where one starts with an already known grouping of objects, and one wishes to assess how well a group of quantitative descriptors can explain the object groups. Thus, the problem is no longer how to define or delineate groups, but rather how to interpret them. This is the realm of discriminant analysis. Discriminant analysis is a method of linear modeling, like analysis of variance, multiple regression, and canonical correlation analysis. DA is frequently used in systematics.
DA proceeds in two steps: (1) It first tests for the differences in the explanatory variables (X), among the predefined groups. This part of the analysis is identical to the overall test performed in the MANOVA. (2) If the test supports the alternative hypothesis of significant differences among groups in the X variables, the analysis proceeds to find the linear combinations (called discriminant functions) of the X variables that best discriminate the groups.
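The two steps can be sketched with NumPy on synthetic data. Everything about the data below is hypothetical (three groups of 20 objects, three variables, invented group centers). Step 1 is summarized here by Wilks' lambda only, without the significance test, and step 2 extracts the discriminant functions as eigenvectors of W⁻¹B, where W is the pooled within-group dispersion and B the among-group dispersion.

```python
import numpy as np

rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0, 0.0],       # hypothetical group centers
                    [3.0, 0.0, 1.0],
                    [0.0, 3.0, 1.0]])
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(20, 3)) for c in centers])
groups = np.repeat(np.arange(3), 20)

grand_mean = X.mean(axis=0)
p = X.shape[1]
W = np.zeros((p, p))                        # pooled within-group dispersion (SSCP)
B = np.zeros((p, p))                        # among-group dispersion (SSCP)
for g in np.unique(groups):
    Xg = X[groups == g]
    dev = Xg - Xg.mean(axis=0)
    W += dev.T @ dev
    d = (Xg.mean(axis=0) - grand_mean)[:, None]
    B += len(Xg) * (d @ d.T)

# Step 1: overall difference among groups, summarized by Wilks' lambda
wilks = np.linalg.det(W) / np.linalg.det(W + B)

# Step 2: discriminant functions = eigenvectors of W^-1 B
eigvals, vecs = np.linalg.eig(np.linalg.solve(W, B))
order = np.argsort(eigvals.real)[::-1]
eigvals, vecs = eigvals.real[order], vecs.real[:, order]
scores = (X - grand_mean) @ vecs[:, :2]     # at most k - 1 = 2 functions

print("Wilks' lambda:", round(float(wilks), 4))
print("eigenvalues:", np.round(eigvals[:2], 3))
```

A Wilks' lambda near 0 indicates well-separated groups; a value near 1 indicates that the variables do not discriminate among them.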
Like one-way ANOVA, discriminant analysis considers a single classification criterion (i.e., division of the objects into groups) and allows one to test whether the explanatory variables can discriminate among the groups. Testing for differences among group means in DA is identical to ANOVA for a single explanatory variable and to MANOVA for multiple explanatory variables. When it comes to modeling, i.e., finding the linear combinations of the variables (X) that best discriminate among the groups, DA is a form of "inverse analysis" where the classification criterion is considered to be the response variable (y) whereas the quantitative variables are explanatory (matrix X).
Note that discriminant analysis (DA) is also called canonical variates analysis (CVA). This method was first proposed by Fisher (1936), who published the now-famous data set describing the morphology of 150 specimens of irises (Iridaceae), belonging to three species, using 4 measured flower characters (lengths and widths of sepals and petals). Again, in the interest of time, we will bypass the mathematical treatment of DA and work through the iris data set using a software application (NCSS).
Shown here are Fisher's data for the 150 plants (first 31 shown), four variables, and three species (coded 1, 2, 3: 1 = Iris setosa, 2 = Iris versicolor, and 3 = Iris virginica). Note that I. versicolor is actually a polyploid hybrid of the other two species. Data were originally collected by the botanist Edgar Anderson of the Missouri Botanical Garden and used with permission.
This chart plots the values of the first and second discriminant function scores. By looking at this plot you can see what the classification rule would be. The first function appears to be the most important in separating the three species.