
Psychological Review
1995, Vol. 102, No. 2, 396-408
Copyright 1995 by the American Psychological Association, Inc.
0033-295X/95/$3.00

A Measurement-Theoretic Analysis of the Fuzzy Logic Model of Perception

Court S. Crowther
University of California, Los Angeles

William H. Batchelder
University of California, Irvine

Xiangen Hu
University of Memphis

The fuzzy logic model of perception (FLMP) is analyzed from a measurement-theoretic perspective.
FLMP has an impressive history of fitting factorial data, suggesting that its probabilistic form is
valid. The authors raise questions about the underlying processing assumptions of FLMP. Although
FLMP parameters are interpreted as fuzzy logic truth values, the authors demonstrate that for several factorial designs widely used in choice experiments, most desirable fuzzy truth value properties
fail to hold under permissible rescalings, suggesting that the fuzzy logic interpretation may be unwarranted. The authors show that FLMP's choice rule is equivalent to a version of G. Rasch's (1960)
item response theory model, and the nature of FLMP measurement scales is transparent when stated
in this form. Statistical inference theory exists for the Rasch model and its equivalent forms. In fact,
FLMP can be reparameterized as a simple 2-category logit model, thereby facilitating interpretation
of its measurement scales and allowing access to commercially available software for performing
statistical inference.

The fuzzy logic model of perception (FLMP) is an approach
to multicomponent, factorial pattern recognition experiments.
It has been applied in many areas of human information processing, including speech perception (e.g., Massaro, 1987; Massaro & Oden, 1980; Oden & Massaro, 1978) and letter perception (Massaro & Hary, 1986; Oden, 1979). Furthermore, the
model has been much discussed, debated, and compared to
other models (e.g., Cohen & Massaro, 1992; Massaro, 1989;
Massaro & Friedman, 1990; Oden, 1988). The model makes
strong assumptions about the underlying processing events that
occur when an individual must classify a factorially defined
stimulus into one of several response categories. At the heart of
the model is the assumption that a stimulus is compared, feature by feature, with prototypes representing each relevant response category. The results of these comparisons are said to be
"fuzzy logic truth values," indicating the degree of match of
each stimulus feature to a corresponding prototype feature. The
fuzzy truth values then are represented by parameters in a probabilistic model that predicts the classification probabilities for
each stimulus.
In this article we examine the probabilistic classification process of FLMP from a measurement-theoretic perspective. Our
analysis has substantial consequences for the model, some positive and some negative. In particular, we show that the fuzzy
logic parameter values cannot be recovered uniquely from the
classification probabilities. Although it is possible to set up
scales of measurement on the basis of the classification probabilities, these scales fail to satisfy the properties needed to justify
their interpretation as fuzzy logic truth value scales. We also
show that the basic probability formula of FLMP is identical
with that of the well-known model of item response theory developed by Rasch (1960) and studied extensively by psychometricians, and in various equivalent forms, in the foundations of
measurement literature. Neither the Rasch formulation nor the
others that we cover have been analyzed previously in terms of
FLMP.
Although our analysis directly challenges the interpretation
of FLMP parameters as fuzzy truth values, it helps to explain
why the model frequently does a good job of fitting data in factorial pattern classification experiments. Indeed, some of the
equivalent formulations have been quite successful in analogous
applications, and they have the added benefit of having been
studied extensively in the psychometrics literature from a statistical standpoint. Thus, rather than questioning the ability of
FLMP to fit data, our article calls attention to the need for more
work justifying the processing interpretation of the model. To
facilitate this and to aid others in using the model, we demonstrate how to reformulate FLMP as a simple logit model. In
its reformulated version, statistical inference for FLMP can be
conducted by standard log-linear statistical software packages
that can perform parameter estimation, goodness-of-fit testing, and hypothesis testing.

Court S. Crowther, Department of Linguistics, University of California, Los Angeles; William H. Batchelder, Department of Cognitive Sciences, University of California, Irvine; Xiangen Hu, Department of Psychology, University of Memphis.
Portions of this article were presented at the 25th annual meeting of
the Society for Mathematical Psychology, Stanford University, Palo
Alto, California, August 22, 1992 (Crowther & Hu, 1992). We gratefully acknowledge comments from Jean-Claude Falmagne, Christoph
Klauer, Ece Kumbasar, R. Duncan Luce, and David M. Riefer on earlier drafts. The research presented in this article was supported by a
National Science Foundation (NSF) training grant to the Institute for
Mathematical Behavioral Sciences at University of California, Irvine;
by a National Institutes of Health training grant to the Phonetics Laboratory at University of California, Los Angeles; and by NSF Grant SBR9309667.
Correspondence concerning this article should be addressed to William H. Batchelder, Department of Cognitive Sciences, Social Sciences
Tower, University of California, Irvine, California 92717. E-mail may
be sent via Internet to whbatche@aris.ss.uci.edu.

Description of the Model


According to FLMP, conjoined features comprise prototypes
stored in long-term memory. The recognition process involves
matching features that are perceived to be present in a stimulus
to the prototypes in long-term memory. For letter perception,
the prototypes are letters (Massaro & Hary, 1986), and the features are visual features of the letters. For speech perception, the
features are acoustic and perhaps visual features of the speech
signal, and the prototypes are syllables (Oden & Massaro,
1978). FLMP includes three processing stages: feature evaluation, feature integration, and pattern classification. In the feature evaluation stage, it is assumed that sources of information
corresponding to each feature are evaluated independently for
the degree to which they match features of the prototypes. During the feature integration stage, feature values are combined,
and the degree to which the resultant feature combination
matches each relevant prototype is determined. During the pattern classification stage, the relative goodness of match between
each feature conjunction and each relevant prototype is determined using a formula described later in Equation 1.
FLMP assumes that the outputs of the feature evaluation and
feature integration stages are continuous, but the final stage
(pattern classification) is discrete in the sense that the individual will classify the pattern as a token of an available category
according to a probability distribution over the categories. The
model postulates that during feature evaluation, each stimulus
feature is assigned a "fuzzy truth value" (Goguen, 1969; Zadeh,
1965) in the [0, 1] interval reflecting "the degree to which each
relevant feature is present" (Massaro & Hary, 1986, p. 124).
Fuzzy truth values are used in the model because they
. . . provide a natural representation of the degree of match. Fuzzy
truth values lie between 0 and 1, corresponding to a proposition
being completely false and completely true. The value 0.5 corresponds to a completely ambiguous situation, whereas 0.7 would be
more true than false and so on. (Massaro & Friedman, 1990, pp.
231-232)
During feature integration, the fuzzy truth values assigned during the feature evaluation stage are combined, and the degree to
which they match prototypes is assessed.
Typically, the model is applied to data from straightforward
factorial categorization experiments, where each stimulus is
constructed by conjoining one level from each factor. For example, consider a two-factor experiment in which individuals are
asked to classify each stimulus $(C_i, O_j)$ into one of two response
categories, $T_1$ or $T_2$, where $C_i \in C = \{C_1, \ldots, C_I\}$ and $O_j \in O = \{O_1, \ldots, O_J\}$. It is assumed that $C$ and $O$ represent two
different factors, with $I$ and $J$ levels, respectively, and the model
postulates $I + J$ parameters, $c_i$ and $o_j$, with $0 \le c_i, o_j \le 1$, that
are the fuzzy truth values representing the degree to which factor levels $C_i$ and $O_j$, respectively, are supportive of a $T_1$ response. The "additive complements" of $c_i$ and $o_j$, $(1 - c_i)$ and
$(1 - o_j)$, respectively, represent the degree to which $C_i$ and $O_j$
are supportive of a $T_2$ response. After feature evaluation, the
fuzzy truth values, $c_i$ and $o_j$, are passed on to the feature integration stage.
During feature integration, the feature values are combined,
and the degree to which the resultant combination matches
each relevant prototype is determined. In most applications
multiplication is used as the operation for combining feature
values (see Massaro, 1987, chapter 7). In this case, the products, $c_i o_j$ and $(1 - c_i)(1 - o_j)$, are taken to represent the degree
that the resultant feature combination matches the prototypes
corresponding to categories $T_1$ and $T_2$, respectively, and they
are both interpreted as fuzzy truth values in the [0, 1] interval.
Finally, in the pattern classification stage, the relative goodness of match between each feature combination and each relevant prototype is determined using the following "relative goodness rule" (Massaro & Friedman, 1990, p. 230), hereafter
RGR, which is motivated as a variant of Luce's (1959) choice
rule:

$$p_{ij}(c_i, o_j) = \frac{c_i o_j}{c_i o_j + (1 - c_i)(1 - o_j)}, \tag{1}$$

where $p_{ij}(c_i, o_j)$ is the probability that stimulus $(C_i, O_j)$ is categorized as $T_1$, $1 \le i \le I$, $1 \le j \le J$. The RGR, then, is the ratio
of the support (or evidence) for one alternative (numerator) to
the total support for all relevant alternatives (denominator).
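As a minimal sketch (not part of the original article), the relative goodness rule can be written as a short Python function. The name `flmp_prob` and the example truth values are illustrative assumptions only.

```python
def flmp_prob(c, o):
    """Relative goodness rule (Equation 1): probability of a T1 response.

    c, o: truth values for the two factor levels (hypothetical
    argument names). The ratio is undefined when both supports are
    zero, for example c = 0 and o = 1.
    """
    support_t1 = c * o                    # match to the T1 prototype
    support_t2 = (1.0 - c) * (1.0 - o)    # match to the T2 prototype
    return support_t1 / (support_t1 + support_t2)


# Illustrative values only: both features moderately favor T1.
print(flmp_prob(0.7, 0.6))
# A feature value of 0.5 contributes equal support to both
# categories, so the result equals the other feature's value.
print(flmp_prob(0.5, 0.6))
```

Under this rule a level with truth value 0.5 drops out of the ratio, which is one way to see why 0.5 is read as "completely ambiguous" in the fuzzy logic interpretation.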
A concrete example from the literature illustrates the operation of the model and facilitates subsequent discussion. In an
audiovisual speech perception experiment reported in Massaro
and Cohen (1983), one visual factor (or cue) and one acoustic
factor were combined factorially to create stimuli that ranged
perceptually from /ba/ to /da/. There were two levels of the
visual factor, C. At one level, the visual component of the stimulus was a speaker saying /ba/, and the corresponding visual
feature was (closed lips). The other level was a speaker saying
/da/, and the corresponding visual feature was (open lips), or,
in the notation of FLMP, (1 - closed lips), which is the additive
complement of (closed lips). A seven-level continuum comprised the acoustic factor, O. One end of the continuum contained rising second and third formant transitions that were
more appropriate for /ba/, and the corresponding acoustic feature was (rising F2-F3). The other end of the continuum contained stimuli with falling second and third formant transitions
that were more appropriate for /da/, and the corresponding
acoustic feature was (falling F2-F3), or (1 - rising F2-F3).
Thus, there were 14 stimuli (two levels of the visual factor coupled with seven levels of the acoustic factor), and the participants' task was to classify each stimulus as either "ba" or "da."
The prototypes can be defined as:
/ba/: (closed lips) ∧ (rising F2-F3);
/da/: (1 - closed lips) ∧ (1 - rising F2-F3).
Let $c_i$ be the fuzzy truth value that represents the extent to
which the visual component of the stimulus, $C_i$, is perceived as
containing "closed lips," and $o_j$ the extent to which the acoustic
component of the stimulus, $O_j$, is perceived as containing "rising F2-F3." In most applications, conjunction is implemented
in FLMP by the multiplication operator. Thus, the above prototype definitions can be expressed as:
/ba/: $c_i o_j$ and /da/: $(1 - c_i)(1 - o_j)$.
For the pattern classification stage, in terms of the above formulation, the probability that the participant classifies stimulus
$(C_i, O_j)$ as "ba" is given by specializing Equation 1 as:

$$p(\text{"ba"} \mid C_i, O_j) = \frac{c_i o_j}{c_i o_j + (1 - c_i)(1 - o_j)}.$$

In this article, we focus on Equation 1 and related equations
as mathematical models for factorial categorization experiments. As such, the observable quantities for the model are the $I \times J$
category proportions, $P_{ij}$, representing the observed relative
frequency with which stimulus $(C_i, O_j)$ is categorized as $T_1$.
Typically, the $I + J$ fuzzy truth value parameters, the $c_i$ and $o_j$,
are estimated from the $P_{ij}$. Because the estimates of the $c_i$ and
$o_j$, which represent the influence or effect of the experimental
factors, are determined from the category choice proportions,
FLMP can be seen as implementing probabilistic conjoint measurement (see, e.g., Falmagne, 1985), and it leads to a scale of
measurement for each factor. Typically, Massaro and his coworkers estimate the parameters by using the STEPIT minimization algorithm (Chandler, 1969), and the estimates of the $c_i$
and $o_j$ are a set of values, $\hat{c}_i$ and $\hat{o}_j$, that minimize the root mean
squared deviation (RMSD) between the predicted probabilities
in Equation 1 and the observed proportions:

$$\mathrm{RMSD} = \sqrt{\frac{\sum_{i=1}^{I} \sum_{j=1}^{J} \left[ p_{ij}(\hat{c}_i, \hat{o}_j) - P_{ij} \right]^2}{I \times J}}. \tag{2}$$

Model Identifiability and Measurement

In this section we raise questions about the interpretability
of the model's parameters as fuzzy logic truth values. Such an
interpretation requires that the estimates have certain properties such as those cited earlier from Massaro and Friedman
(1990). Furthermore, Massaro and Cohen suggested that "the
parameter values can be used to determine the relative contribution of each source and to ascertain the psychophysical relationship between the stimulus source and the perceptual consequence" (1983, p. 759). If this were so, then uniqueness may be
a necessary property for the estimates. By "uniqueness" we refer to the condition that there exists only one set of parameters
(i.e., one solution) that best fits the data in the sense of minimizing the RMSD in Equation 2. If there were more than one
set of parameter values that provided identical, good fits to the
data, then it would be impossible, without making additional
assumptions or conducting further experiments, to determine
which set of parameter values reflected the true "perceptual
consequence" of the stimuli under study.

Are the Estimates Unique?

Let us consider the uniqueness of the estimates obtained in a
two-factor, two-category FLMP. Although there are $I \times J$ observable response proportions, and only $I + J$ parameters to be
estimated, uniqueness is not necessarily guaranteed (see
Bamber & van Santen, 1985). In the following, we demonstrate
that the estimates that minimize RMSD are not unique. The
uniqueness question is clarified by Theorem 1, which shows
that if Equation 1 is satisfied by any set of parameter values,
then there are infinitely many other sets of parameter values
that satisfy the same factorial probabilities.

Theorem 1

Let $p_{ij}(c_i, o_j)$ be given by Equation 1, for some $0 < c_i, o_j < 1$,
$1 \le i \le I$, $1 \le j \le J$. Then, if $B > -1$, the parameter values $c_i^*$
and $o_j^*$ computed by

$$c_i^* = \frac{c_i}{1 + B(1 - c_i)} \tag{3}$$

and

$$o_j^* = \frac{(1 + B)\, o_j}{1 + B o_j}, \tag{4}$$

when inserted into Equation 1, satisfy

$$p_{ij}(c_i^*, o_j^*) = p_{ij}(c_i, o_j), \tag{5}$$

for all $1 \le i \le I$, $1 \le j \le J$. Furthermore, any set of parameter
values $(c_i^*, o_j^*)$ that satisfies Equations 1 and 5 is defined by
Equations 3 and 4, for some $B > -1$.
Proof. The two-category, two-factor FLMP can be written
in what Cohen and Massaro (1992) call "likelihood product"
form by writing

$$p_{ij}(c_i, o_j) = \frac{1}{1 + \dfrac{(1 - c_i)}{c_i} \dfrac{(1 - o_j)}{o_j}}. \tag{6a}$$

Clearly, for $D > 0$, the $p_{ij}(c_i, o_j)$ are left invariant if $(1 - c_i^*)/c_i^*$
and $(1 - o_j^*)/o_j^*$ are given by

$$\frac{(1 - c_i^*)}{c_i^*} = D\, \frac{(1 - c_i)}{c_i} \tag{6b}$$

and

$$\frac{(1 - o_j^*)}{o_j^*} = \frac{1}{D}\, \frac{(1 - o_j)}{o_j}. \tag{6c}$$

Equations 3 and 4 for $c_i^*$ and $o_j^*$ can then be derived explicitly
from Equations 6b and 6c. The derivation involves transforming $(1 - c_i)/c_i$ and $(1 - o_j)/o_j$ in Equation 6a in the manner of
Equations 6b and 6c, and letting $D = 1 + B$. Solving for $c_i^*$ and
$o_j^*$, Equations 3 and 4 are obtained through simple algebraic
manipulation. The proof of the second part of Theorem 1 is
tedious, and because it adds little to the argument we omit the
details.¹

¹ To prove the second part of Theorem 1, one equates expressions as
in Equation 5 and rewrites the parameters as in Equation 6a, and then
it can be shown that a unique $D$ in Equations 6b and 6c can be obtained.

From a technical standpoint, Theorem 1 shows that FLMP is
not identifiable² for a two-factor, two-category experiment (cf.
Riefer & Batchelder, 1988). The practical implication of this
lack of identifiability is that any set of proportions, $P_{ij}$, obtained
in a factorial design can be fit equally well by infinitely many
sets of parameters in the sense of minimizing the RMSD. The
same can be said for any other classical goodness-of-fit criterion
such as maximum likelihood estimation, minimum chi-squared estimation, and others discussed by Read and Cressie
(1988) and Batchelder (1991). In other words, for each set of
parameters obtained using the RGR (Equation 1), for any $B > -1$
the transformations in Equations 3 and 4 will generate a
new set of parameters, within the unit interval, that yields an
identical fit to the data. Therefore, because the estimates are not
unique in the above sense, it is not reasonable to report and
interpret a particular set of parameters returned by STEPIT (or
any other optimization algorithm) as estimates of the fuzzy
truth values that correspond to the "true" perceptual consequence of the stimulus. Such estimates are arbitrary and depend
on arbitrary features of the algorithm, such as its starting values
used in the estimation procedure.
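The nonuniqueness described above can be checked numerically. The following is a sketch under stated assumptions: the function names are hypothetical, the parameter values and "observed" proportions are made up for illustration, and the rescaling formulas are Equations 3 and 4 written as $c^* = c / [1 + B(1 - c)]$ and $o^* = (1 + B)o / (1 + Bo)$.

```python
import math


def flmp_prob(c, o):
    """Equation 1: probability of a T1 response."""
    return c * o / (c * o + (1 - c) * (1 - o))


def rescale(cs, os_, B):
    """Equations 3 and 4: one permissible rescaling for a given B > -1."""
    if B <= -1:
        raise ValueError("B must be greater than -1")
    new_c = [c / (1 + B * (1 - c)) for c in cs]
    new_o = [(1 + B) * o / (1 + B * o) for o in os_]
    return new_c, new_o


def rmsd(cs, os_, observed):
    """Root mean squared deviation between predictions and proportions."""
    squared = [(flmp_prob(c, o) - observed[i][j]) ** 2
               for i, c in enumerate(cs)
               for j, o in enumerate(os_)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical parameters and observed proportions (2 x 3 design).
cs = [0.2, 0.8]
os_ = [0.1, 0.4, 0.9]
observed = [[0.05, 0.15, 0.70],
            [0.30, 0.75, 0.95]]

baseline = rmsd(cs, os_, observed)
for B in (-0.96, 1.57, 11):
    c_star, o_star = rescale(cs, os_, B)
    same_fit = abs(rmsd(c_star, o_star, observed) - baseline) < 1e-12
    print(B, [round(v, 3) for v in c_star + o_star], same_fit)
```

Each value of `B` produces visibly different parameter values, yet the fit criterion is unchanged up to floating-point error, so no minimization routine can prefer one set over another on the basis of the data alone.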
Oden (1979) applies FLMP embodied in Equation 1 to data
from a letter recognition experiment. When fitting the data, he
took steps to avoid the nonuniqueness of parameter estimates;
for example, "... the scale unit was set by choosing a value for
$\beta_1$ such that the range of parameters is approximately equal for
both factors..." (Oden, 1979, p. 347). In our terms, $\beta_j = (1 - o_j)/o_j$; thus, it is clear that the lack of uniqueness of the parameter estimates was acknowledged in the FLMP literature. However, many more recent fits of factorial data make no mention
of setting the scale unit; nevertheless, tables of scale values are
presented and described as fuzzy truth values and sometimes
are plotted against physical measures as in psychophysical functions (e.g., Massaro, 1987; Massaro & Cohen, 1983, 1993; Massaro & Friedman, 1990; Massaro & Oden, 1980). We also were
unable to find any explicit characterization of the scale type
of fuzzy truth values for FLMP in the literature. Massaro and
Friedman (1990) do discuss scales in general; for example,
The outcome of the first stage, evaluation, can be described by a
scale value, which in general we denote as x for a given information
source X, .... We assume that x is a real number on an interval
scale that is measured in some sort of "currency," such as truth
value, probability, activation, energy, or strength. (Massaro &
Friedman, 1990, p. 227)

The scale defined in Theorem 1 is clearly not an interval scale;


furthermore, quantities like probability and truth value could
not be on interval scales because rescalings could violate the
constraint that they are confined to the [0, 1] unit interval. As
we will see, the arbitrariness in the particular scale for FLMP in
Theorem 1 can have several serious consequences for the fuzzy
logic interpretation of the estimated scale values.
Equations 1, 3, and 4 imply that there are two scales, one for
each factor, and next, Corollary 1 shows that they are unique up
to the setting of a single level of one of the factors to an arbitrary
value in (0, 1). In other words, in the terms of the Massaro and
Friedman (1990, pp. 231-232) quote cited earlier, a given level
of one of the factors can be arbitrarily assigned a fuzzy truth
value such as 0.3, 0.5, or 0.7, and this assignment determines
all of the other values.


Corollary 1. Suppose that Equation 1 holds for some parameters $\langle c_i \rangle$ and $\langle o_j \rangle$. Let $x \in (0, 1)$ be arbitrary and arbitrarily pick any particular $c_k$ (or $o_l$). Then there is exactly one
set of parameters $\langle c_i^* \rangle$, $\langle o_j^* \rangle$ that satisfies Equations 1 and 5
and $c_k^* = x$ (or $o_l^* = x$).
Proof. Without loss of generality, pick $C_k \in C$. First, note
that for all $0 < x < 1$,

$$B(x) = \frac{c_k - x}{x(1 - c_k)} > -1. \tag{7}$$

Note from Equation 3 that

$$c_k^* = \frac{c_k}{1 + B(x)(1 - c_k)} = x,$$

and further note that no other $B > -1$ will yield $c_k^* = x$. From
Theorem 1, all parameter values consistent with Equation 1 are
obtained from Equations 3 and 4; thus, the $B(x)$ in Equation 7
yields the unique scales $\langle c_i^* \rangle$ and $\langle o_j^* \rangle$ with $c_k^* = x$.
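The construction in the proof of Corollary 1 can be sketched in a few lines of Python. The function names and the starting value `c_k = 0.3` are assumptions chosen for illustration; the formulas are Equation 7, $B(x) = (c_k - x) / [x(1 - c_k)]$, and Equation 3.

```python
def scale_constant(c_k, x):
    """Equation 7: the B that moves the truth value c_k to the target x."""
    return (c_k - x) / (x * (1 - c_k))


def rescale_c(c, B):
    """Equation 3: rescaled value for a level of factor C."""
    return c / (1 + B * (1 - c))


c_k = 0.3  # hypothetical estimate for one level of factor C
for x in (0.3, 0.5, 0.7):
    B = scale_constant(c_k, x)
    # B always exceeds -1, and applying it to c_k recovers the target x.
    print(x, B > -1, rescale_c(c_k, B))
```

Because `B` is fixed once a single level has been assigned a target value, the same `B` must then be applied to every other level of both factors.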
To see the consequences of Theorem 1 and Corollary 1, let us
return to the audiovisual integration example from Massaro
and Cohen (1983) that was discussed earlier. To determine the
fuzzy truth values assigned to the stimulus features, Massaro
and Cohen (1983) estimated a set of parameter values that minimized RMSDs between predicted and observed data. Averaged
over all 14 conditions and all six participants, the RMSD was
0.015, suggesting to the authors that the model fit the data well.
For illustrative purposes, Figure 1 (open circles) shows the estimated parameter values returned from STEPIT for participant
number six reported in Massaro and Cohen (1983). Figure 1
compares the original estimated fuzzy truth values (open
circles) for each level of the acoustic (top panel) and visual
(bottom panel) factors, together with three new sets of fuzzy
truth values that result when Equations 4 and 3, respectively,
are applied to the reported STEPIT output using three different
values of the scale parameter, B. Each of these four curves is
equally supported by the pattern classification data in that
STEPIT could have produced any one of them as minimizing
the RMSD in Equation 2. Furthermore, they suggest quite
different, contradictory stories about the psychophysical relationship between the acoustic and visual factor levels and the
corresponding fuzzy truth values. For example, consider Level
3 of the acoustic factor in Figure 1. The reported estimation run
yielded the value 0.28. Interpreting this parameter value as a
fuzzy truth value in the spirit of the quote cited in our introduction from Massaro and Friedman (1990), we would attribute a relatively low degree of truth to the proposition, "the
feature 'falling F2-F3' is present." However, by transforming
this value using Equation 4 and letting $B = 11$, we obtain the
value 0.82, which would suggest a high degree of truth to that
same proposition. Letting $B = 1.57$ would produce the value
0.5, indicating that the truth of the proposition is "completely
ambiguous." As a result, unless constraints are introduced to
restrict FLMP parameters, the estimate of any particular parameter in a two-category, two-factor experiment cannot be interpreted as a fuzzy truth value in the sense described in the
quote by Massaro and Friedman (1990, pp. 231-232) cited
earlier.

² A probabilistic model for categorical data assigns to each value of
its parameters, $\theta \in \Omega$, a unique probability distribution, $p(\theta)$, over the
categories, where $\Omega$ is the parameter space. Statisticians refer to such a
model as (globally) identifiable if corresponding to every distinct pair
of parameters are distinct distributions, that is, for all $\theta_1, \theta_2 \in \Omega$, $\theta_1 \ne \theta_2$
implies $p(\theta_1) \ne p(\theta_2)$. Two models are said to be equivalent if their
sets of probability distributions are identical. Nonequivalent models are
potentially distinguishable by data. Model identifiability should not be
confused with model equivalence (see Bamber & van Santen, 1985;
Bishop, Fienberg, & Holland, 1975; Riefer & Batchelder, 1988).

Figure 1. Relative evidence for the syllable "da" for the acoustic (top panel) and visual (bottom panel)
factors. The original scale values from Massaro and Cohen (1983) are plotted with open circles (B = orig.),
and the other markers represent scale values computed with three different choices for B using Equations 4
and 3, respectively. Open triangle, B = -0.96; open square, B = 1.57; filled triangle, B = 11. Orig. = original.

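The rescaled values quoted for Level 3 of the acoustic factor (0.28 becoming 0.82 or 0.5) can be reproduced with a short check. This is a sketch only; the function name `rescale_o` is hypothetical, and the formula is Equation 4 written as $o^* = (1 + B)o / (1 + Bo)$.

```python
def rescale_o(o, B):
    """Equation 4: rescaled value for a level of factor O, for B > -1."""
    return (1 + B) * o / (1 + B * o)


reported = 0.28  # Level 3 of the acoustic factor, as cited in the text
print(round(rescale_o(reported, 11), 2))    # 0.82
print(round(rescale_o(reported, 1.57), 2))  # 0.5
```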
The statements in the Massaro and Friedman (1990, pp.
231-232) quote pertain to the parameter values of a single level
of one of the factors. From a measurement-theoretic perspective
the statements in the quote are not "meaningful." For example,
in Roberts (1979) and Suppes and Zinnes (1963), a statement
involving scale values is said to be meaningful³ in case its truth
value is preserved under permissible rescalings. In the case we
have been discussing, statements interpreting the scale value of
a given level, as in the quote, can be rendered either true or false
by scale transformations, and hence they are not meaningful in
the intended sense.

³ The definitions of meaningfulness given in Roberts (1979) and Suppes and Zinnes (1963) are not accepted by all measurement theorists
as covering all applications of the concept of meaningfulness. In fact,
considerable foundational work has since occurred to provide a more
adequate sense of meaningfulness (e.g., Luce, Krantz, Suppes, & Tversky, 1990). However, the situation in the current article is sufficiently
simple that the Suppes and Zinnes (1963) and Roberts (1979) definition can be regarded as adequate.

Rather than interpreting any particular value in isolation,
one might hope to make meaningful statements about the relative values of some of the parameters. For example, consider a
stimulus $(C_k, O_l)$ in the "ba"-"da" example discussed earlier.
One might hope to make weaker statements, such as "$C_k$ gives
stronger support for 'ba' than does $O_l$." One way to quantify
such interfactor comparisons is to assert the proposition $c_k > o_l$.
Unfortunately, as Corollary 2 shows, such statements are not
meaningful in the sense that rescalings can always reverse inequality relationships comparing specific levels of the two
factors.
Corollary 2. Suppose that a set of parameter values $\langle c_i, o_j \rangle$
determines the probabilities $p_{ij}(c_i, o_j)$ through Equation 1.
For a particular stimulus $(C_k, O_l)$, suppose $c_k > o_l$. Then another set
of parameter values $\langle c_i^*, o_j^* \rangle$ satisfies Equations 1 and 5,
where $o_l^* > c_k^*$.
Proof. On the basis of Equations 3 and 4, we note that

$$\frac{\partial c_k^*}{\partial B} = \frac{-c_k(1 - c_k)}{[1 + B(1 - c_k)]^2} < 0 \tag{8}$$

and

$$\frac{\partial o_l^*}{\partial B} = \frac{o_l(1 - o_l)}{(1 + B o_l)^2} > 0 \tag{9}$$

for all $0 < c_k, o_l < 1$, and $B > -1$. Thus, increases in $B$ decrease
$c_k^*$ and increase $o_l^*$.
Next, solve the equation $c_k^* = o_l^*$ for $B$ to yield the quadratic
equation

$$B^2 + 2B - \frac{c_k - o_l}{o_l(1 - c_k)} = 0, \tag{10}$$

which always has a solution $B_0 > -1$, for $0 < c_k, o_l < 1$. From
Equations 8 and 9, it is clear that $c_k^* > o_l^*$ for $B < B_0$ and $o_l^* >
c_k^*$ for $B > B_0$.
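The crossover point in the proof of Corollary 2 can be computed directly. Completing the square in Equation 10 gives $(1 + B_0)^2 = c_k(1 - o_l) / [o_l(1 - c_k)]$; this closed form is a step added here rather than one stated in the article. The sketch below uses hypothetical function names and made-up values with $c_k > o_l$.

```python
import math


def rescale_c(c, B):
    """Equation 3."""
    return c / (1 + B * (1 - c))


def rescale_o(o, B):
    """Equation 4."""
    return (1 + B) * o / (1 + B * o)


def crossover(c_k, o_l):
    """Root of Equation 10 that exceeds -1 (by completing the square)."""
    return math.sqrt(c_k * (1 - o_l) / (o_l * (1 - c_k))) - 1


c_k, o_l = 0.7, 0.4  # hypothetical values with c_k > o_l
B0 = crossover(c_k, o_l)

for B in (B0 - 0.5, B0, B0 + 0.5):
    # Below B0 the C value is larger; above B0 the order is reversed.
    print(round(B, 3), round(rescale_c(c_k, B), 3), round(rescale_o(o_l, B), 3))
```

The middle row shows the two rescaled values coinciding at `B0`, and the rows on either side show the inequality holding in opposite directions.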
Corollary 2 shows that it is meaningless to compare even ordinally the magnitudes of fuzzy truth values between respective
levels of the two factors. In fact, it follows by a minor extension
of the preceding argument that for any set of proportions $\langle P_{ij} \rangle$,
there are two sets of parameter values $\langle c_{1i}, o_{1j} \rangle$ and $\langle c_{2i}, o_{2j} \rangle$,
both of which minimize RMSD in Equation 2 and satisfy $c_{1i} < o_{1j}$
and $c_{2i} > o_{2j}$, for all $1 \le i \le I$ and $1 \le j \le J$. In other words,
rescalings can completely reverse all interfactor ordering on the
parameters. This is illustrated in Figure 1: For $B = 11$ (filled
triangles), all levels of the acoustic factor are above .5, whereas
all levels of the visual factor are below .5; however, for $B = -0.96$
(open triangles), this ordering is completely reversed.

Quantifying Overall Factor Impact

So far, we have discovered that an FLMP parameter corresponding to any particular level of either factor can be rescaled to any particular value in (0, 1); however, this step
determines the value of $B$ and therefore rescales all the other
values. We have also shown that ordinal interfactor comparisons of factor levels are meaningless, in that rescalings can
always reverse their order. Nevertheless, it is possible that one
might hope to be able to quantify meaningfully the overall,
relative impact or importance that each factor has on the decision. Of course, the factor impact will depend on the particular levels chosen in the experiment. For example, FLMP
may allow us to compare the "impact" of the visual and
acoustic factors as "ba"-"da" cues. Large variation in FLMP
parameter values might be taken to represent extreme psychological impact of the factor in question, and one could
hope to assess the overall impact of one factor as compared to
another; that is, factor impact could be operationalized by
some measure of the dispersion of the parameters corresponding to the levels of a given factor. If we were to find that
the parameter dispersion corresponding to the various levels
of one factor was greater than the dispersion for a second factor, then we might have evidence that, overall, the former factor has more impact than the latter.
Massaro⁴ (1987, p. 233) has suggested using the range
(maximum minus minimum) of the parameter value estimates that correspond to each factor as a measure of dispersion for the purpose of comparing factor impact in an audiovisual speech perception experiment. However, in light of the
measurement scales defined by Equations 3 and 4, we need to
determine whether parameter value range comparisons are
scale invariant. The ranges are computed by $\Delta c = c_{\max} - c_{\min}$
and $\Delta o = o_{\max} - o_{\min}$. Clearly, statements such as $\Delta c > \Delta o$ are
meaningful only in case they are preserved under arbitrary
rescalings. It is easy to see that comparisons of this type are
not always scale invariant. For example, assume two levels of
each factor, with one set of parameters as follows: $c_1 = 0.19$,
$c_2 = 0.3$, $o_1 = 0.1$, and $o_2 = 0.2$. Then $\Delta c = 0.11 > \Delta o = 0.1$.
However, let $B = 10$. Then it is easy to compute $\Delta o =
0.183 > 0.017 = \Delta c$ from Equations 3 and 4, so the ordering of
the factor ranges has been reversed by rescaling. Far from being contrived, range reversals can occur through rescalings in
many other cases as well; thus, the range cannot be a meaningful indicator of factor impact. Similar problems also occur if other measures of factor impact are used, such as variance or average deviation.

⁴ In more recent applications, Massaro has suggested using variation
in the response proportions themselves rather than the parameters. This
operational basis for the definition of factor impact is independent of
FLMP and is not subject to our concerns.

Fortunately, it is possible to define a measure of "factor impact" using estimated parameter values that permits meaningful comparisons between factors. Define the logit function

$$L(x) = \log \frac{x}{1 - x}, \tag{11}$$

for $0 < x < 1$. Then $L(c_i)$ and $L(o_j)$ can be interpreted as the
log-odds ratios (e.g., Luce, 1959) for factor levels $C_i$ and $O_j$, respectively. It is easy to see from Equations 3 and 4 that

$$L(c_i^*) = L(c_i) - \log(1 + B) \tag{12}$$

and

$$L(o_j^*) = L(o_j) + \log(1 + B) \tag{13}$$

for $B > -1$.
From Equations 12 and 13, it is clear that the ranges of the
log-odds ratios for both factors can be ordinally compared. Specifically, define

$$R_C = L(c_{\max}) - L(c_{\min})$$

and

$$R_O = L(o_{\max}) - L(o_{\min}),$$

where $c_{\max}$, $c_{\min}$, $o_{\max}$, and $o_{\min}$ are defined as before. It is meaningful to assert that, say, $R_C > R_O$, because from Equations 12
and 13, the values of $R_C$ and $R_O$ do not depend on the value of
$B$, and therefore are invariant under rescaling.
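Both halves of this argument can be checked with the article's own numbers ($c_1 = 0.19$, $c_2 = 0.3$, $o_1 = 0.1$, $o_2 = 0.2$, $B = 10$). The sketch below is illustrative only; the function names are hypothetical, and the rescaling uses Equations 3 and 4.

```python
import math


def rescale(cs, os_, B):
    """Equations 3 and 4 for a given B > -1."""
    new_c = [c / (1 + B * (1 - c)) for c in cs]
    new_o = [(1 + B) * o / (1 + B * o) for o in os_]
    return new_c, new_o


def logit(x):
    """Equation 11: log-odds of x, for 0 < x < 1."""
    return math.log(x / (1 - x))


def value_range(values):
    """Raw range of the truth values (maximum minus minimum)."""
    return max(values) - min(values)


def logit_range(values):
    """Range of the log-odds, as in the definitions of R_C and R_O."""
    return logit(max(values)) - logit(min(values))


cs, os_ = [0.19, 0.3], [0.1, 0.2]
c_star, o_star = rescale(cs, os_, 10)

# Raw ranges: the ordering of the two factors flips after rescaling.
print(round(value_range(cs), 3), round(value_range(os_), 3))        # 0.11 0.1
print(round(value_range(c_star), 3), round(value_range(o_star), 3))  # 0.017 0.183

# Logit ranges: identical before and after rescaling.
print(round(logit_range(cs), 6), round(logit_range(c_star), 6))
print(round(logit_range(os_), 6), round(logit_range(o_star), 6))
```

The raw ranges reverse their order while the logit ranges are unchanged, because the $\log(1 + B)$ shift in Equations 12 and 13 cancels when two levels of the same factor are subtracted.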

What Measurement Scale Is Implied?


The lack of uniqueness of the parameter values in a statistical
model does not, in itself, render the model useless. For example,
Luce's (1959) choice rule has been quite successful in describing paired-comparison judgments from a set of N objects by the
rule
$$p_{ij} = \frac{v_i}{v_i + v_j}, \tag{14}$$
where $p_{ij}$ is the probability that object i is chosen over object
j in a paired comparison, and the scale values $v_i, v_j > 0$, where
$1 \leq i \neq j \leq N$. Clearly, rescaling the parameters in Equation
14 by $v_k^* = A v_k$, for A > 0, shows that Luce's choice model
is not identifiable. In fact, Suppes and Zinnes (1963) have
referred to this type of scale as an indirect, ratio-scale measurement. This sort of nonuniqueness is common in statistical models, particularly the logit models such as the Bradley-Terry-Luce model that is based on Equation 14, and nonuniqueness does not affect their value or usefulness in analyzing and interpreting data.
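As a small illustration of this ratio-scale nonuniqueness, the sketch below evaluates the choice rule of Equation 14 before and after multiplying every scale value by a constant A. The scale values and the constant are hypothetical.

```python
def luce_choice(v_i, v_j):
    """Equation 14: probability that object i is chosen over object j."""
    return v_i / (v_i + v_j)

# Hypothetical scale values for N = 3 objects, and an arbitrary A > 0.
scale_values = [0.5, 2.0, 3.5]
A = 7.0
rescaled = [A * v for v in scale_values]

pairs = [(i, j) for i in range(3) for j in range(3) if i != j]
original = [luce_choice(scale_values[i], scale_values[j]) for i, j in pairs]
after = [luce_choice(rescaled[i], rescaled[j]) for i, j in pairs]

for (i, j), p, q in zip(pairs, original, after):
    print(i, j, round(p, 6), round(q, 6))
```

The constant A cancels from the numerator and denominator, so only ratios of scale values are determined by the data.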
So far, for the two-factor, two-category FLMP, it is clear
from Theorem 1 and its consequences that the scale of measurement is determined by one fixed quantity, B. However,
the situation is different from the choice rule in Equation 14
in two essential respects. First, two separate sets of scale values are set up indirectly from Equation 1, one for factor C
and one for factor O. As Corollary 2 establishes, if these two
sets of scale values are regarded as being in the same "currency" (as suggested by Massaro & Friedman, 1990, p. 227),
and are therefore merged onto a single scale of fuzzy logic
truth values, not even ordinal properties among the scale values are preserved by rescaling. A second difference is that
Equations 3 and 4 show that the formulas for rescaling each
factor do not assume familiar, conventional forms such as for
ratio or interval scales.
Most of the properties we have considered concerning interpretation of the parameters as fuzzy truth values have proven
not meaningful in the sense that scale transformations can destroy them. A weaker interpretation of fuzzy truth, as discussed
in an article by Goguen (1969, p. 332) that is cited often in the
FLMP literature (see, e.g., Massaro & Oden, 1980), assumes
that fuzzy truth values are scaled ordinally. In Goguen's framework, a scale of "degree of membership" preserves 0 and 1 but
otherwise is only ordinal. Corollary 3 shows that FLMP satisfies
this property as long as the two factor scales are considered
separately.
Corollary 3: intrascale ordinality. Suppose that a set of parameter values $\langle c_i, o_j \rangle$ determines the probabilities $p_{ij}(c_i, o_j)$
through Equation 1. Suppose further that $\langle c_i^*, o_j^* \rangle$ is another
set of parameter values that satisfy Equations 1 and 5. Then $c_k < c_l$
iff $c_k^* < c_l^*$ and $o_p < o_q$ iff $o_p^* < o_q^*$, for all $1 \leq k, l \leq I$ and
$1 \leq p, q \leq J$.
Proof. By Theorem 1, there is a unique B > -1 that yields
the set $\langle c_i^*, o_j^* \rangle$ through Equations 3 and 4. Next, note that
$$c_k^* = \frac{c_k}{1 + B(1 - c_k)} < \frac{c_l}{1 + B(1 - c_l)} = c_l^* \tag{15}$$
iff $c_k(1 + B) < c_l(1 + B)$. Because B > -1, we can conclude
that Equation 15 holds iff $c_k < c_l$. A similar argument shows
that $o_p^* < o_q^*$ iff $o_p < o_q$.
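Corollary 3 can be illustrated with a short numerical check. The sketch below is an assumption-laden illustration rather than part of the proof: it implements the rescaling through the logit shifts of Equations 12 and 13, uses hypothetical parameter values, and also shows one comparison between factors changing direction, in line with Corollary 2.

```python
import math

def shift_logit(x, delta):
    """Return the value whose logit equals logit(x) + delta."""
    z = math.log(x / (1.0 - x)) + delta
    return 1.0 / (1.0 + math.exp(-z))

def rescale(cs, os_, B):
    """Rescale both factors with a single B > -1 (Equations 12 and 13)."""
    d = math.log(1.0 + B)
    return [shift_logit(c, -d) for c in cs], [shift_logit(o, d) for o in os_]

def rank_order(values):
    """Indices of the values sorted from smallest to largest."""
    return sorted(range(len(values)), key=lambda k: values[k])

# Hypothetical parameter values for three levels of each factor.
cs = [0.5, 0.1, 0.8]
os_ = [0.6, 0.9, 0.3]

within_preserved = []
for B in (-0.9, -0.5, 0.5, 10.0):
    cs_star, os_star = rescale(cs, os_, B)
    within_preserved.append(
        rank_order(cs) == rank_order(cs_star)
        and rank_order(os_) == rank_order(os_star)
    )

# One interfactor comparison before and after rescaling with B = -0.5.
cs_star, os_star = rescale(cs, os_, -0.5)
print(all(within_preserved))
print(cs[0] < os_[0], cs_star[0] < os_star[0])
```

Within each factor the rescaling is a strictly increasing function, so rank order is kept; across factors the two shifts have opposite signs, which is why a comparison such as $c_1 < o_1$ need not survive.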
Corollary 3 shows that the separate scales for each factor established by Equation 1 satisfy Goguen's (1969) ordinal properties; however, because both scales are yoked and set by a single
constant B, they are much stronger than ordinal scales. More
importantly, because of Corollary 2, they cannot be merged
onto a single scale and retain interscale ordinal properties. Obviously, interscale ordinality is a desirable property if the estimates are to be considered on a single dimension or in a common currency, such as fuzzy truth values intended to indicate
degree of support for a certain prototype. In the next section we
discuss some of the steps that one might consider taking to avoid
the model nonidentifiability problem.

Avoiding the Nonidentifiability Problem


Presenting Experimental Factors in Isolation
So far, all of the problems in interpreting the parameters as
fuzzy truth values have occurred because of parameter nonuniqueness in Equation 1 for a two-factor experiment. In
some cases, the nonuniqueness problem can be overcome by
presenting single experimental factors in isolation. In fact,
this is an experimental procedure often used in applications
of FLMP, and it is called the expanded factorial design (e.g.,
Massaro & Friedman, 1990, p. 226). For example, in the
above audiovisual speech perception experiment, one could
obtain unique parameter values by adding "visual only" and
"acoustic only" conditions. The parameter values so obtained would be unique, because they turn out to be equivalent to the observed relative frequencies, $P_i$ and $P_j$. For example, when given only the visual factor, $C_i$, instead of Equation
1 the RGR now becomes
$$p_i(c_i) = \frac{c_i}{c_i + (1 - c_i)} = c_i, \tag{16}$$
and this condition coupled with $p_j(o_j) = o_j$ uniquely determines the parameter values.
However, there are three considerations involved with including single-factor conditions that may limit their applicability.
First, the logic of the approach requires the assumption that the
scale values in the single-factor experiment given by Equation
16 apply to the two-factor situation. Incorporating Equation 16
into Equation 1 yields the relationship
$$p_{ij}(c_i, o_j) = \frac{p_i(c_i)\, p_j(o_j)}{p_i(c_i)\, p_j(o_j) + [1 - p_i(c_i)][1 - p_j(o_j)]}. \tag{17}$$
Equation 17 is a strong condition and is easily tested with a chi
square using the one-factor relative frequencies as estimates of
the $c_i$ and $o_j$. If it does not hold, Equation 1 of FLMP may still
be valid, but the single-factor data will not be consistent with
the two-factor scales.
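One way such a check might be set up is sketched below. This is only an illustration under stated assumptions: the single-factor proportions and two-factor counts are invented, the two-factor prediction follows the product form described for Equation 17, and the statistic is a plain Pearson chi-square that ignores sampling error in the single-factor estimates and leaves the choice of degrees of freedom to the analyst.

```python
def predicted_two_factor(p_i, p_j):
    """Two-factor probability implied by single-factor proportions."""
    return p_i * p_j / (p_i * p_j + (1.0 - p_i) * (1.0 - p_j))

def pearson_chi_square(observed_counts, n_trials, predicted):
    """Pearson X^2 summed over cells, with two response categories per cell."""
    total = 0.0
    for obs, n, p in zip(observed_counts, n_trials, predicted):
        exp_yes, exp_no = n * p, n * (1.0 - p)
        total += (obs - exp_yes) ** 2 / exp_yes
        total += ((n - obs) - exp_no) ** 2 / exp_no
    return total

# Hypothetical single-factor relative frequencies (two levels per factor).
p_visual = [0.2, 0.8]
p_acoustic = [0.3, 0.7]

# Hypothetical two-factor data: counts of the target response out of 100 trials.
cells = [(i, j) for i in range(2) for j in range(2)]
observed = [10, 37, 63, 90]
trials = [100, 100, 100, 100]

predicted = [predicted_two_factor(p_visual[i], p_acoustic[j]) for i, j in cells]
print([round(p, 4) for p in predicted])
print(pearson_chi_square(observed, trials, predicted))
```

A large value of the statistic would suggest that the single-factor proportions do not carry over to the two-factor scales, without by itself contradicting Equation 1.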
A second issue concerns the phenomenological character of
the single-factor experiment itself. To perceive the stimuli as intended, both factors may be required in some cases. For example, among the large number of factorial experiments in the
speech perception literature, many become problematic when
single factors are presented in isolation. For example, Crowther
(1993) tested the influence of two acoustic factors, vowel duration and first formant offset frequency, on stop consonant voicing perception. If the first formant offset factor had been presented in the absence of the vowel duration factor, the resultant
stimuli would not even sound like speech. Furthermore, if the
vowel duration factor had been presented in the absence of the
first formant offset frequency factor, then the stimuli would not
likely have been perceived as containing a voiced stop consonant. Sometimes it is not even feasible to produce stimuli for
single-factor trials. For example, one may be interested in studying stimulus duration and stimulus intensity, but clearly these
factors cannot be isolated physically. Therefore, depending on
the nature of the stimuli, it may or may not be possible to overcome the nonuniqueness problem for the two-factor, two-category FLMP by including single-factor trials.
A third issue involves cases where the single-factor experiment is both possible and satisfies Equation 17 to an acceptable
degree. In this case, it is possible to postulate scale-invariant
fuzzy logic truth values; however, that this scale is directly
linked to observable proportions questions the need or usefulness of the fuzzy logic interpretation. At a minimum, considerable evidence from other sources would be required to assume
that individuals are processing fuzzy truth values.

Effect of Adding More Experimental Factors


Another possible way to avoid parameter nonidentifiability is
to include more than two experimental factors in the design
while holding constant the number of response categories. This
would increase the number of parameters to be estimated, but
(because the experiment is factorial) it would also increase to a
greater extent the number of observable entities. For example,
consider a factorial experiment with N factors and $I_n \geq 2$ levels
per factor n. Then the number of parameters to be estimated is
$\sum_{n=1}^{N} I_n$, but the number of conditions for observed data is $\prod_{n=1}^{N} I_n$.
As N increases, the latter term increases faster than the former,
and, in fact, the ratio of the number of parameters to the number of observable entities decreases to zero with increasing N.
Therefore, one might hope that adding experimental factors
might avoid the problem of nonuniqueness of estimates. Unfortunately, as Corollary 4 shows, such an approach not only fails
to resolve the nonidentifiability problem, but instead worsens
the situation by increasing the number of arbitrary scale values.
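The counting argument can be made concrete with a few lines of code. The sketch below compares the number of parameters (a sum over factors) with the number of factorial conditions (a product over factors); the choice of five levels per factor is hypothetical.

```python
import math

def parameter_count(levels):
    """One parameter per level of each factor: the sum of the level counts."""
    return sum(levels)

def condition_count(levels):
    """One observed condition per factorial cell: the product of the level counts."""
    return math.prod(levels)

# Hypothetical designs with five levels per factor and two to six factors.
levels_per_factor = 5
ratios = []
for n_factors in range(2, 7):
    levels = [levels_per_factor] * n_factors
    ratio = parameter_count(levels) / condition_count(levels)
    ratios.append(ratio)
    print(n_factors, parameter_count(levels), condition_count(levels), ratio)
```

The shrinking ratio is what makes adding factors look attractive, even though the count alone says nothing about whether the parameters are identifiable.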
Corollary 4. Add a third experimental factor, U, with K levels, to the two-category, two-factor hypothetical experiment.
The resultant three-factor model is not identifiable.
Proof. Expressing the three-factor experiment in the form
of Equation 1,
$$p_{ijk}(c_i, o_j, u_k) = \frac{c_i o_j u_k}{c_i o_j u_k + (1 - c_i)(1 - o_j)(1 - u_k)} = \frac{1}{1 + \frac{(1 - c_i)}{c_i} \frac{(1 - o_j)}{o_j} \frac{(1 - u_k)}{u_k}}, \tag{18a}$$
where $p_{ijk}(c_i, o_j, u_k)$ is the probability of identifying stimulus
$(C_i, O_j, U_k)$ as an instance of $T_1$. Let $D_1, D_2, D_3 > 0$ be such that
$D_1 D_2 D_3 = 1$. If the following transformations,
$$\frac{1 - c_i^*}{c_i^*} = \frac{1 - c_i}{c_i} D_1, \tag{18b}$$
$$\frac{1 - o_j^*}{o_j^*} = \frac{1 - o_j}{o_j} D_2, \tag{18c}$$
and
$$\frac{1 - u_k^*}{u_k^*} = \frac{1 - u_k}{u_k} D_3, \tag{18d}$$
are substituted for their corresponding terms, the probabilities
in Equation 18a remain invariant. This result follows immediately by inserting Equations 18b, 18c, and 18d into Equation
18a.
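The invariance claimed in the proof can be verified numerically. The following sketch assumes the three-factor product form of Equation 18a and implements the odds rescalings of Equations 18b through 18d; the parameter values and the constants $D_1$, $D_2$, $D_3$ are hypothetical, subject only to their product being 1.

```python
def flmp3(c, o, u):
    """Three-factor, two-category form of the model (Equation 18a)."""
    num = c * o * u
    return num / (num + (1.0 - c) * (1.0 - o) * (1.0 - u))

def rescale_by_odds(x, D):
    """Return x* such that (1 - x*) / x* = D * (1 - x) / x."""
    return x / (x + D * (1.0 - x))

# Hypothetical constants with D1 * D2 * D3 = 1; two of them are free.
D1, D2 = 2.0, 5.0
D3 = 1.0 / (D1 * D2)

# Hypothetical parameter values for two levels of each factor.
cs, os_, us = [0.2, 0.7], [0.4, 0.9], [0.3, 0.6]

max_diff = 0.0
for c in cs:
    for o in os_:
        for u in us:
            before = flmp3(c, o, u)
            after = flmp3(rescale_by_odds(c, D1),
                          rescale_by_odds(o, D2),
                          rescale_by_odds(u, D3))
            max_diff = max(max_diff, abs(before - after))

print(max_diff)
```

Only the product of the three odds terms enters Equation 18a, so any constants whose product is 1 leave every predicted probability unchanged.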
Corollary 4 shows that for a three-factor experiment, there
are three scales unique up to two arbitrary constants. Of course,
these scales are not fuzzy truth value scales; however, simple
algebraic manipulations such as in Theorem 1 can be performed to examine the implied fuzzy truth value parameter
scales for the factor levels. It is easy to extend this result to an
N factor experiment, the consequence being that the N scales
underlying the generalization of Equation 18a have N - 1 arbitrary constants in them.

Effect of Adding More Response Categories


Another possible approach that avoids nonidentifiability is to
glean more information from the experiment by increasing the
number of response categories while holding constant the number of experimental factors. To see how this might be done, let
us expand the hypothetical two-factor experiment that involved
the use of two-response categories, $T_1$ and $T_2$, to include four
response categories, $T_1$, $T_2$, $T_3$, and $T_4$. Cohen and Massaro
(1992) provide an extensive discussion of the four-category paradigm. In this case, the prototype definitions may be expressed
as follows:


The probability of identifying stimulus $(C_i, O_j)$ as an instance
of category $T_1$, $T_2$, $T_3$, or $T_4$ in this model is given by:
$$p_{ij}(T_1 \mid c_i, o_j) = \frac{c_i o_j}{c_i o_j + c_i(1 - o_j) + (1 - c_i) o_j + (1 - c_i)(1 - o_j)} = c_i o_j,$$
and similarly
$$p_{ij}(T_2 \mid c_i, o_j) = c_i(1 - o_j),$$
$$p_{ij}(T_3 \mid c_i, o_j) = (1 - c_i) o_j,$$
and
$$p_{ij}(T_4 \mid c_i, o_j) = (1 - c_i)(1 - o_j).$$
It is easy to see that only the identity transformation on the
parameters leaves invariant the four equations $p_{ij}(T_k \mid c_i, o_j)$, k
= 1, 2, 3, 4. Thus, FLMP for any two-factor experiment that
supports four categories is identifiable. More generally, when
all possible prototypes are defined for the experimental factors,
FLMP will be identifiable; that is, if there are n factors and all
$2^n$ prototypes are used, unique parameter estimates can be obtained. In fact, it is worth noting that the above equations for
the complete four (and $2^n$) category experiments satisfy the
properties of general processing tree models described in Hu
and Batchelder (1994), which provides algorithms for statistical inference.
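Identifiability in the four-category case can be seen by solving for the parameters directly. The sketch below assumes that the four category probabilities are the products of $c_i$ or $(1 - c_i)$ with $o_j$ or $(1 - o_j)$, ordered as $T_1$ through $T_4$; the function names and the parameter values are hypothetical.

```python
def four_category_probs(c, o):
    """Probabilities of T1..T4, assuming product-form category support."""
    return (c * o, c * (1.0 - o), (1.0 - c) * o, (1.0 - c) * (1.0 - o))

def recover_parameters(p1, p2, p3, p4):
    """Solve for (c, o): c = p1 + p2 and o = p1 + p3 under the same assumption."""
    return p1 + p2, p1 + p3

# Hypothetical parameter values for one stimulus.
c_true, o_true = 0.3, 0.8
probs = four_category_probs(c_true, o_true)
c_hat, o_hat = recover_parameters(*probs)

print(probs, sum(probs))
print(c_hat, o_hat)
```

Because each parameter equals a marginal sum of observable probabilities, no rescaling other than the identity can reproduce the same four values.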
Many applications of FLMP involve four categories and two
experimental factors. However, the nature of stimuli in some
categorization experiments that lend themselves quite naturally
to a binary choice paradigm may be such that it is not possible
to expand the design naturally to a four-choice paradigm; that
is, for some experiments it is possible that there are stimulus
feature combinations that do not correspond to natural prototypes that could be the basis of a response category. Furthermore, even when all four prototypes do correspond to natural
categories, the simplicity of the four equations for the model
does not argue, at least for us, for the desirability or usefulness
of postulating that individuals process fuzzy truth values.
Fixing the Value of One Parameter
A fourth, potential remedy does not involve changes at the
experimental design level, but rather changes in the parameter
estimation procedure. Considering our hypothetical two-category, two-factor experiment, if one were to set the value of, say,
$c_1$ before passing the data through the parameter estimation
procedure, then Corollary 1 shows that the parameter value
scales would be determined in the sense that all of the parameter
estimates are constrained to be unique by the setting of $c_1$. In
fact, this was mentioned earlier as the course taken by Oden⁵
(1979) in a 6 × 6 factorial letter perception experiment. Oden
claims that the "general shapes" of the curves relating the levels
of a factor to the scale values of the factor (as in our Figure 1)
". . . would not be greatly affected by changes in the parameter
scale unit" (1979, p. 347). However, it is not clear what is meant
by general curve "shape." All three curves in both panels of
Figure 1 are monotonic and defined by a single parameter, although one might think their shapes differ. In any case, many
aspects of the relationship between any such curves can be understood by analyzing Equations 3 and 4. We think it is difficult
to maintain that the "shape" of functions of the $c_i$ and $o_j$ does not
change with the scale parameter, B.
Fixing a parameter value to resolve the nonidentifiability
problem may entail negative consequences depending on just
how one wants to use the parameter values. In particular, we see
no way to set a particular scale value a priori in a manner that
guarantees the properties of the fuzzy logic interpretation of the
parameters given in the quote cited from Massaro and Friedman (1990, pp. 231-232). Criteria such as equating parameter
value ranges used by Oden (1979, Footnote 4) could always be
used; however, they have only arbitrary and unsystematic implications for the fuzzy logic interpretation of the parameters.
On the other hand, it may be reasonable to fix a parameter at
a certain value and maintain a fuzzy logic interpretation in certain circumstances. For example, if one had sufficient reason
for believing that the psychological value of a particular level of
one of the factors should be "completely ambiguous," then it
seems reasonable to set the corresponding fuzzy truth value to
0.5 before starting the estimation procedure. Our survey of the
applications of FLMP did not yield many situations with a factor level that was obviously "completely ambiguous," and even
in cases where there was such a level, one worries that response
bias for one of the two categories in the presence of ambiguity
might enter and thus thwart this approach to the nonidentifiability problem. The next section provides some insight into the
type of indirect measurement that is entailed by Equation 1.

FLMP and the Rasch (1960) Model


Model Equivalence
It turns out that Equation 1 of FLMP is equivalent to a version of Rasch's (1960) item response theory model that is well
known to psychometricians. Rasch's two-parameter model concerns the case where I participants take a test with J items. The
Rasch model is perhaps the most popular among many item
response theory models. It is discussed and analyzed extensively
in the psychometric literature (e.g., Hambleton, Swaminathan,
& Rogers, 1991; Lord, 1974, 1980), and a great deal is known
about its statistical theory. Let
$$X_{ij} = \begin{cases} 1 & \text{if subject } i \text{ is correct on item } j \\ 0 & \text{otherwise,} \end{cases} \qquad 1 \leq i \leq I,\ 1 \leq j \leq J.$$
The Rasch model postulates that
$$p(X_{ij} = 1) = \left[1 + \exp(-(\theta_i - \beta_j))\right]^{-1}, \tag{19}$$
where $-\infty < \theta_i, \beta_j < \infty$, $\theta_i$ is the "ability" of the ith participant,
and $\beta_j$ is the "difficulty" of item j.
5
Actually, as noted earlier, Oden fixed the value of one parameter in
a different form of the model, and he did not discuss the effect of this
step on interpreting the parameters as fuzzy truth values. Our Equations
3 and 4 provide this interpretation.

Batchelder and Romney (1989) desired a form of Equation
19 that constrained the parameters to the unit interval. In essence, they showed that the continuous transformations
$$a_i = \left[1 + \exp(-\theta_i)\right]^{-1} \tag{20}$$
and
$$b_j = \left[1 + \exp(\beta_j)\right]^{-1} \tag{21}$$
yield $0 < a_i, b_j < 1$ and, from Equation 19,
$$p(X_{ij} = 1) = \frac{a_i b_j}{a_i b_j + (1 - a_i)(1 - b_j)}, \tag{22}$$
which is equivalent to Equation 1. In this reformulation, $b_j$
should be thought of as an "item easiness" rather than as an
item difficulty parameter because $b_j$ decreases with increasing
$\beta_j$.
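The equivalence between the Rasch form and the unit-interval form can be checked numerically. The sketch below assumes logistic transformations of the ability and difficulty parameters, with the item parameter mapped so that it decreases as difficulty increases (an "easiness"); the function names and the grid of values are hypothetical.

```python
import math

def rasch(theta, beta):
    """Equation 19: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

def ability_to_unit(theta):
    """Map ability onto (0, 1) with a logistic transformation."""
    return 1.0 / (1.0 + math.exp(-theta))

def difficulty_to_unit(beta):
    """Map difficulty onto (0, 1) as an easiness (decreasing in beta)."""
    return 1.0 / (1.0 + math.exp(beta))

def product_form(a, b):
    """Unit-interval form with the same structure as Equation 1."""
    return a * b / (a * b + (1.0 - a) * (1.0 - b))

# Hypothetical grid of ability and difficulty values.
grid = [-2.0, -0.5, 0.0, 1.0, 3.0]
max_diff = max(
    abs(rasch(t, b) - product_form(ability_to_unit(t), difficulty_to_unit(b)))
    for t in grid for b in grid
)

# Difference-scale property: adding one constant to both parameters changes nothing.
k = 4.2
shift_diff = max(abs(rasch(t, b) - rasch(t + k, b + k)) for t in grid for b in grid)
print(max_diff, shift_diff)
```

The second check shows the additive constant that is free in Equation 19; in the unit-interval form the same freedom appears as the nonstandard rescaling by B.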

The connection between Equation 1 and Equation 19 is interesting because the indirect measurement scale defined by
Equation 19 is obvious, namely, it is called a difference scale
(Suppes & Zinnes, 1963), where an arbitrary constant can be
added to both participant ability and item difficulty parameters
without changing the $p(X_{ij} = 1)$. It is not surprising that the
equivalent model in Equations 1 and 22 also defines a scale up
to a single constant, albeit a nonstandard one as seen in Theorem 1. In fact, most of the points made in Theorem 1 and Corollaries 1, 2, and 3 are transparent for the equivalent Rasch
(1960) model in Equation 19.
Equation 1 for the two-factor experiment not only arises from
the Rasch (1960) model, but it is well studied in the foundations
of measurement literature. For example, Falmagne (1985) describes a probabilistic conjoint measurement model for two responses satisfying the condition
$$p_{ij} = \frac{f(i)}{f(i) + g(j)}, \tag{23}$$
where f and g are nonnegative functions defined on the factors I
and J, respectively, and Equation 23 represents the probability
of a fixed response (see Equation 1) given the factorial stimulus.
Falmagne (1985, p. 148) shows that under certain technical
conditions, Equation 23 is equivalent to a quadruple condition,
namely, for all levels $i, i' \in I$ and $j, j' \in J$,
$$\frac{p_{ij}\, p_{i'j'}}{p_{ij'}\, p_{i'j}} = \frac{(1 - p_{ij})(1 - p_{i'j'})}{(1 - p_{ij'})(1 - p_{i'j})}. \tag{24}$$

The condition in Equation 23 is also shown to be equivalent,
again under technical conditions, to a random variable representation where
$$p_{ij} = \Pr(U_i > V_j), \tag{25}$$
where $U_i$ and $V_j$ are independent, negative exponential random
variables. All three of the conditions in Falmagne (1985) apply
to Equation 1 of FLMP; for example, if $f(i) = c_i/(1 - c_i)$ and
$g(j) = (1 - o_j)/o_j$, Equation 23 becomes Equation 1.
Other equivalent formulations of FLMP are provided in
Oden (1988) and in Massaro (Cohen & Massaro, 1992; Massaro & Friedman, 1990), where various logistic versions of the
two-factor, two-category FLMP are discussed. In particular,
these authors have drawn formal parallels between the two-category, two-factor FLMP and several other mathematical models
of information integration and classification currently in use
in psychology, such as Bayesian and connectionist approaches.
They have shown that, among the models compared to FLMP,
those that could be written in "likelihood product" form make
predictions identical to those of FLMP. Thus, Equation 1 of
FLMP admits to many different interpretations, including ones
discussed by Cohen and Massaro (1992). Indeed, the basic
equation of FLMP for two factors is one of the most studied
equations in psychology, not only in the area of classification,
but in test theory and foundations of measurement.
Thus, for the models examined by Cohen and Massaro
(1992) and Massaro and Friedman (1990), those that can be
put into likelihood product form do not have identifiable parameters for two-category, two-factor experiments. With choice
models or information integration models, the nonidentifiability usually can be captured by scale constants. Consequently,
although the likelihood product models may share the property
of having nonidentifiable parameters, the arguments given
above regarding nonidentifiability are damaging in principle
only to models like FLMP, where parameters are intended to be
interpreted in very specific ways that are not invariant under
rescaling. If a model's parameters are not to be interpreted as
such, they are not subject to the same criticism. The consequences of parameter nonidentifiability depend on how one
wishes to interpret the parameters in question.
Perhaps the most important point to be made in this section
is that a formula such as Equation 1 admits to numerous interpretations. To justify a particular psychological interpretation
such as that offered by FLMP, it is necessary to provide strong
evidence that the parameters are parameterizing the psychological process claimed by the model. In the case of FLMP, in light
of its many equivalent forms, we argue that, even with the problem of parameter nonuniqueness covered in Theorem 1 and its
corollaries aside, the interpretation of the parameters of Equation 1 as functioning as fuzzy logic truth values in individuals'
mental processing is not convincing.

Statistical Inference
That the model represented in Equation 1 is equivalent to
the two-parameter Rasch (1960) model in Equation 19 (with
parameters compressed to the unit interval) has numerous
practical advantages. For more than two decades, Psychometrika and other quantitative journals have published many articles concerning estimation of the Rasch model family, including
generalizations that allow more than one participant parameter
and more than one test-item parameter (e.g., Holland, 1990;
Lewis, 1986). In fact, in the psychometric literature, the Rasch
model in Equation 19 is extremely well understood statistically
from just about every inferential perspective. It is therefore desirable that users of FLMP and other equivalent models are able
to use this extensive statistical theory.
Although FLMP and the Rasch (1960) model are formally
equivalent, there are some important differences between the
test theory situation and the information integration situation.
First, in test theory, participants generally are drawn from a
population at random, and sometimes it facilitates the statistical


inference of the Rasch model to make this population assumption explicit. On the other hand, unlike the subject ability factor
in test theory, in an information integration experiment, the
levels of both factors are selected quite intentionally by the
experimenter.
A second difference is that, in many situations the levels
within a factor can be ordered on some physical scale (e.g.,
length of a crossbar, pitch of a vowel sound, and so on). From a
psychophysical perspective, these physical orderings may suggest intrascale orderings that can be imposed on the estimation
problem. In general, estimation of categorical models under ordinal constraints is a standard topic in inference (e.g., Agresti,
1984; Robertson, Wright, & Dykstra, 1988). In fact, there have
been several efforts (e.g., Fischer & Formann, 1982; Fischer &
Tanzer, 1994) to incorporate ordinal constraints into the estimation of the Rasch model.
Finally, a third difference between the test theory situation
and FLMP is that much of the estimation theory for the Rasch
model concerns the case where only one observation (correct or
error) is obtained for each combination of participant and item.
In FLMP applications, one usually obtains repeated observations at each combination of factor levels. Thus, the data in
FLMP are usually response proportions rather than single occurrences. However, it is of great value, we think, to note that it
is perfectly feasible to work with FLMP when only one observation is made at each factor-level combination, and in fact some
of the most sophisticated applied statisticians in psychology
have devoted considerable energy to estimation theory in just
this case.
Fortunately, there is a way to reformulate the Rasch model,
and hence FLMP, as a simple logit model. In this way, data from
a factorial experiment with repeated observations can be analyzed with FLMP and other equivalent models using standard,
readily available software. Agresti (1990, Section 4.3.2) describes a logit model for an I × J × 2 data structure. The data
structure exactly matches the two-factor, two-response situation
for FLMP with experimentally preset numbers of observations
falling into each cell of the design. In terms of the logit function
in Equation 11, the model in Agresti (1990) can be written for
a particular response as
$$L(p_{ij}) = \alpha + \beta_i^C + \beta_j^O, \tag{26}$$
where $\sum_i \beta_i^C = \sum_j \beta_j^O = 0$, and the effects of factors C and O are
represented by the $\{\beta_i^C\}$ and $\{\beta_j^O\}$, respectively. The model in
Equation 26 assumes an absence of interaction between the two
factors. Agresti (1990, chapters 6 and 7) provides ways of conducting statistical inference for such models. The next theorem
shows that FLMP is equivalent to the logit model in Equation
26.

Theorem 2
Consider the FLMP for a two-factor, two-response paradigm
for some set of parameters $\langle c_i \rangle$, $\langle o_j \rangle$, leading to Equation 1.
Then it is equivalent to the logit model in Equation 26.
Proof
Deriving Equation 26 from Equation 1: Let $p_{ij}(c_i, o_j) = p_{ij}$ in
Equation 1, and use the logit function in Equation 11 to write
$$L(p_{ij}) = \log \frac{c_i o_j}{(1 - c_i)(1 - o_j)} = L(c_i) + L(o_j).$$
Next, define the parameter means $\bar{L}(c) = \frac{1}{I} \sum_{i=1}^{I} L(c_i)$ and $\bar{L}(o) = \frac{1}{J} \sum_{j=1}^{J} L(o_j)$. It is then easy to define $\alpha$ as the grand mean of
the logits given by
$$\alpha = \frac{1}{IJ} \sum_{i=1}^{I} \sum_{j=1}^{J} L(p_{ij}) = \bar{L}(c) + \bar{L}(o).$$
Finally, define $\beta_i^C = L(c_i) - \bar{L}(c)$ and $\beta_j^O = L(o_j) - \bar{L}(o)$ as
deviation scores from the means. It follows immediately that
$$L(p_{ij}) = \alpha + \beta_i^C + \beta_j^O,$$
and $\sum_i \beta_i^C = \sum_j \beta_j^O = 0$, which is the form of Equation 26.
Deriving Equation 1 from Equation 26: Assume that Equation 26 holds. Then
$$\log \frac{p_{ij}}{1 - p_{ij}} = \alpha + \beta_i^C + \beta_j^O. \tag{27}$$
Next, pick any real number r arbitrarily and define $a_i = \beta_i^C + (\alpha + r)$ and $b_j = r - \beta_j^O$. Rewriting Equation 27, we get
$$\log \frac{p_{ij}}{1 - p_{ij}} = a_i - b_j,$$
and solving for $p_{ij}$ yields
$$p_{ij} = \left[1 + \exp(-(a_i - b_j))\right]^{-1}. \tag{28}$$
Equation 28 is in the form of the Rasch (1960) model in Equation 19, and this was shown to be equivalent to Equation 1 of
FLMP.
Theorem 2 is the key to relating FLMP in Equation 1 to standard statistical theory. Converting to logits lays bare the essence
of FLMP as an additive model that assumes no interaction.
Logit models are highly studied, including various models for
ordered factor levels, and standard methods exist to transform
logit models into useful log-linear forms (e.g., Agresti, 1984,
1990). A variety of standard, readily available software exists
to analyze such models (e.g., Agresti, 1990, Appendix A). This
software can perform all aspects of statistical inference, including parameter estimation, goodness of fit, and hypothesis
testing.
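The additive structure behind Theorem 2 can be illustrated without any statistical package. The sketch below assumes the two-factor product form of Equation 1, computes the logits of the predicted probabilities for hypothetical parameter values, and decomposes them into a grand mean and two sets of deviation scores; with real data one would instead fit the logit model by maximum likelihood in standard software.

```python
import math

def logit(x):
    """Equation 11: log-odds of x."""
    return math.log(x / (1.0 - x))

def flmp(c, o):
    """Assumed form of Equation 1 for two factors and two categories."""
    return c * o / (c * o + (1.0 - c) * (1.0 - o))

def additive_decomposition(cs, os_):
    """Grand mean, factor effects, and residual interaction of the logits."""
    I, J = len(cs), len(os_)
    logits = [[logit(flmp(c, o)) for o in os_] for c in cs]
    alpha = sum(sum(row) for row in logits) / (I * J)
    beta_c = [sum(row) / J - alpha for row in logits]
    beta_o = [sum(logits[i][j] for i in range(I)) / I - alpha for j in range(J)]
    residuals = [[logits[i][j] - (alpha + beta_c[i] + beta_o[j])
                  for j in range(J)] for i in range(I)]
    return alpha, beta_c, beta_o, residuals

# Hypothetical parameter values: three levels of C, four levels of O.
cs = [0.1, 0.5, 0.8]
os_ = [0.2, 0.4, 0.6, 0.9]
alpha, beta_c, beta_o, residuals = additive_decomposition(cs, os_)

print(alpha, beta_c, beta_o)
print(max(abs(r) for row in residuals for r in row))
```

The residuals are zero up to rounding because the logit of the model is exactly the sum of one term per factor, which is the no-interaction property the theorem relies on.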

Conclusion
Our analysis of FLMP has focused on the indirect scales of
measurement implied by Equation 1 for a two-factor, two-category experiment as well as related experimental designs.
In the two-factor case, we have shown that the equation
defines two indirect scales of measurement in the sense of
Suppes and Zinnes (1963) that are interrelated and unique
up to a single quantity B > -1. The model represents a natural version of probabilistic conjoint measurement that has
proven very useful in fitting data from a variety of two-factor
categorization experiments. However, our analysis has raised
serious concerns about the interpretation of the parameter
values as fuzzy truth values. The only attribute of the two
scales that accords with fuzzy logic is intrascale ordinality,
as shown in Corollary 3. In particular, if the parameters are
interpreted as being in the "currency" of fuzzy truth, then
none of the fuzzy truth properties in the quote discussed earlier from Massaro and Friedman (1990, pp. 231-232) hold
under parameter rescaling. More generally, we showed that
the fuzzy logic interpretation is problematic for multifactor,
two-category designs. It is important to note that the expanded factorial design, in which single experimental factors
are presented in isolation, is identifiable and therefore not
subject to the same criticisms. Furthermore, the two-factor,
four-category design, and, more generally, any n-factor, $2^n$-category
design is also identifiable. It is important to repeat
that we are not challenging the ability of FLMP to fit data
from categorization experiments, and in fact it has an impressive history in this regard. Furthermore, the literature
showing that FLMP is equivalent to several other information
integration approaches (e.g., Cohen & Massaro, 1992; Massaro & Friedman, 1990; Oden, 1988) is important theoretically. In fact, we have expanded the set of equivalent forms
to include the Rasch (1960) model in several of its guises,
including an especially tractable logit reformulation that is
very well understood statistically. We hope our analysis of the
measurement scale properties of Equation 1 of FLMP and its
relation to the well-understood Rasch model will facilitate
its use and, in particular, encourage more work designed to
uncover the underlying processing events that give rise to the
success of the equation in fitting data from factorial classification tasks.
References
Agresti, A. (1984). Analysis of ordinal categorical data. New York:
Wiley.
Agresti, A. (1990). Categorical data analysis. New York: Wiley.
Bamber, D., & van Santen, J. P. H. (1985). How many parameters can
a model have and still be testable? Journal of Mathematical Psychology, 29, 443-473.
Batchelder, W. H. (1991). Getting wise about minimum distance measures [Review of Goodness-of-fit statistics for discrete multivariate
data]. Journal of Mathematical Psychology, 35, 267-273.
Batchelder, W. H., & Romney, A. K. (1989). New results in test theory
without an answer key. In E. E. Roskam (Ed.), Mathematical psychology in progress (pp. 229-248). Heidelberg: Springer-Verlag.
Bishop, Y. M. M., Fienberg, S. E., & Holland, P. W. (1975). Discrete
multivariate analysis: Theory and practice. Cambridge, MA: MIT
Press.
Chandler, J. P. (1969). Subroutine STEPIT: Finds local minima of a
smooth function of several parameters. Behavioral Science, 14, 81-82.
Cohen, M. M., & Massaro, D. W. (1992). On the similarity of categorization models. In F. G. Ashby (Ed.), Multidimensional models of
perception and cognition (pp. 395-447). Hillsdale, NJ: Erlbaum.
Crowther, C. S. (1993). The operation of vocalic duration and F1 offset
frequency as voicing and vowel cues. Unpublished doctoral dissertation, University of California, Irvine.
Crowther, C. S., & Hu, X. (1992, August 22). An identifiable multinomial model for integrating acoustic voicing cues. Paper presented at
the 25th annual meeting of the Society for Mathematical Psychology,
Stanford University, Palo Alto, California.
Falmagne, J. C. (1985). Elements of psychophysical theory. New York:
Oxford University Press.
Fischer, G. H., & Formann, A. K. (1982). Some applications of logistic
latent trait models with linear constraints on the parameters. Applied
Psychological Measurement, 4, 397-416.
Fischer, G. H., & Tanzer, N. (1994). Some LBTL and LLTM relationships. In G. H. Fischer & D. Laming (Eds.), Contributions to mathematical psychology, psychometrics, and methodology (pp. 277-303).
New York: Springer-Verlag.
Goguen, J. A. (1969). The logic of inexact concepts. Synthese, 19, 325-373.
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, CA: Sage
Publications.
Holland, P. W. (1990). On the sampling theory foundations of item
response theory models. Psychometrika, 55, 577-601.
Hu, X., & Batchelder, W. H. (1994). The statistical analysis of general
processing tree models with the EM algorithm. Psychometrika, 59,
21-47.
Lewis, C. (1986). Test theory and Psychometrika: The past twenty-five
years. Psychometrika, 51, 11-22.
Lord, F. M. (1974). Individualized testing and item characteristic curve
theory. In D. H. Krantz, R. C. Atkinson, R. D. Luce, & P. Suppes
(Eds.), Contemporary developments in mathematical psychology
(Vol. 2, pp. 106-126). San Francisco: Freeman.
Lord, F. M. (1980). Applications of item response theory to practical
testing problems. Hillsdale, NJ: Erlbaum.
Luce, R. D. (1959). Individual choice behavior. New York: Wiley.
Luce, R. D., Krantz, D., Suppes, P., & Tversky, A. (1990). Foundations
of measurement: Vol. 3. Representation, axiomatization, and invariance. New York: Academic Press.
Massaro, D. W. (1987). Speech perception by ear and eye: A paradigm
for psychological inquiry. Hillsdale, NJ: Erlbaum.
Massaro, D. W. (1989). [Multiple review of A precis of speech perception by ear and eye: A paradigm for psychological inquiry]. Behavioral and Brain Sciences, 12, 741-794.
Massaro, D. W., & Cohen, M. M. (1983). Evaluation and integration
of visual and auditory information in speech perception. Journal of
Experimental Psychology: Human Perception and Performance, 9,
753-771.
Massaro, D. W., & Cohen, M. M. (1993). The paradigm and the fuzzy
logical model of perception are alive and well. Journal of Experimental Psychology: General, 122, 115-124.
Massaro, D. W., & Friedman, D. (1990). Models of integration given
multiple sources of information. Psychological Review, 97, 225-252.
Massaro, D. W., & Hary, J. M. (1986). Addressing issues in letter recognition. Psychological Research, 48, 123-132.
Massaro, D. W., & Oden, G. C. (1980). Evaluation and integration of
acoustic features in speech perception. Journal of the Acoustical Society of America, 67, 996-1013.
Oden, G. C. (1979). A fuzzy logical model of letter identification.
Journal of Experimental Psychology: Human Perception and Performance, 5, 336-352.
Oden, G. C. (1988). FuzzyProp: A symbolic superstrate for connectionist models. Proceedings of the IEEE International Conference on
Neural Networks, 1, 293-300.
Oden, G. C., & Massaro, D. W. (1978). Integration of featural information in speech perception. Psychological Review, 85, 172-191.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Denmarks Paedagogiske Institute.
Read, T. R. C., & Cressie, N. A. C. (1988). Goodness-of-fit statistics for
discrete multivariate data. New York: Springer-Verlag.


Riefer, D. M., & Batchelder, W. H. (1988). Multinomial modeling and


the measurement of cognitive processes. Psychological Review, 95,
318-339.
Roberts, F. S. (1979). Measurement theory: With applications to decision-making, utility, and the social sciences. Reading, MA: Addison-Wesley.
Robertson, T., Wright, F. T., & Dykstra, R. L. (1988). Order restricted
statistical inference. New York: Wiley.

Suppes, P., & Zinnes, J. L. (1963). Basic measurement theory. In R. D.


Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical
psychology (Vol. 1, pp. 1-102). New York: Wiley.
Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8, 338-353.

Received July 6, 1994


Revision received November 1, 1994
Accepted November 2, 1994

