GUI Shi-chun
Plato's problem (how do people know as much as they do with as little information as they
get?), also known as the poverty of the stimulus, negative evidence, or the logical problem of
language acquisition, has aroused the interest of many philosophers, psychologists, linguists, and
computational scientists. Nativism is the answer provided by Chomsky, but psychologists like MacWhinney
and computational linguists like Sampson offer different explanations. Quine calls the problem the scandal
of induction, whereas Shepard maintains that a general theory of generalization and similarity is as
necessary to psychology as Newton's laws are to physics. However, accepting that the propensity for
language is hereditary does not solve the general problem of generalization and similarity, that is, the
problem of categorization. Many models have been proposed to find a mechanism by which a set of stimuli,
words, or concepts come to be treated as similar. They attempt to postulate constraints that narrow
the solution space of the problem to be solved by induction. Latent semantic analysis (LSA), put forth
by Landauer et al., is a high-dimensional linear associative model that embodies no human knowledge
beyond its general learning mechanism; it analyzes a large corpus of natural text and generates a
representation that captures the similarity of words and text passages. The model employs a statistical
technique of linear algebra known as singular value decomposition (SVD). The input to LSA is a matrix {A}
whose rows represent unitary event types and whose columns represent the contexts in which instances of
the event types appear. SVD then decomposes the matrix into three matrices, {A} = {U}{w}{V}^T, and
dimensionality is reduced when the original matrix is reconstructed. To illustrate the power
of dimensionality reduction, two examples are given. In the example given by Landauer, the text input is
the titles of nine technical articles, five about human-computer interaction and four about mathematical
graph theory. LSA shows how, in the two-dimensional reconstruction of the matrix, two words that were
totally uncorrelated in the original become quite strongly correlated (r = .9). The
other example is the use of SVD in a preliminary study of the relationships among the errors made by
Chinese learners of English. Dimensionality reduction offers a better explanation of developmental trends
in spelling errors, misuse of words, and syntactic constructions among five different types of learners.
LSA has a wide range of applications in text processing.
Key words: Plato's problem, similarity, induction, latent semantic analysis, singular value
decomposition
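The input matrix described in the abstract (rows for word types, columns for the contexts in which they occur) can be sketched as follows. This is only an illustration under stated assumptions: the function name build_count_matrix, the whitespace tokenization, and the three short contexts are hypothetical and are not taken from Landauer's example or from the learner-error data.

```python
from collections import Counter


def build_count_matrix(contexts):
    """Return (vocabulary, matrix); matrix[i][j] counts word i in context j."""
    # Assumption: lower-casing and whitespace splitting stand in for real tokenization.
    tokenized = [context.lower().split() for context in contexts]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    counts = [Counter(tokens) for tokens in tokenized]
    # One row per word type, one column per context.
    matrix = [[count[word] for count in counts] for word in vocabulary]
    return vocabulary, matrix


# Hypothetical toy contexts, for illustration only.
contexts = [
    "graph of trees",
    "trees and graph minors",
    "user interface system",
]
vocabulary, matrix = build_count_matrix(contexts)

for word, row in zip(vocabulary, matrix):
    print(word, row)
```

Each row can then be read as the distribution of one word over contexts, which is the form of matrix that SVD takes as input.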
(Latent Semantic Analysis, LSA)
.90
LSA
[CLC number] H195 [Document code] A [Article ID] 1003-6105(2003)01-0076-9
(Plato)
1
Chomsky 2
3 4
MacWhinney Sampson Quine
Gavagai
gavagaigavagai
scandal of induction
Shepard1987
Shepard
2000
Landauer Dumais(1997)
1
Meno
Phaedo
Cratylus (physis)(nomos)
2
Chomsky (1965, 1986, 2000)
Pinker(1994)
3
Brian MacWhinney
4
Geoffrey Sampson1997 Educating Eve Empirical Linguistics
2001
of of the of
3 1
2
3
Landauer, Foltz,
Dumais, Deerwester, Furnas (Deerwester et al. 1990), Kintsch
Latent Semantic Analysis, LSA
LSA
A BC 4.5
B C 9
Feigenbaun Aihara
100
5
Osgood (1971) 70
Kintsch (1988, 1998) construction-integration
model
prepositions
the red rose the rose is red
5
394-395
argumentsreferents
Graesser 1981
LSA
LSA/
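Singular Value Decomposition (SVD) and LSA
SVD factors an m × n matrix {A} (m > n) into the product of
an m × n matrix {U}, an n × n diagonal matrix {w},
and the transpose of an n × n matrix {V}:
{A} = {U}{w}{V}^T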
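The decomposition and the reduction of dimensionality can be sketched numerically as follows. This is a minimal sketch, assuming NumPy is available; the function name rank_k_approximation and the 5 × 4 toy matrix are hypothetical and do not reproduce the matrices analyzed in the paper. The variable names U, w, and Vt mirror {U}, {w}, and {V}^T.

```python
import numpy as np


def rank_k_approximation(A, k):
    """Reconstruct A from its k largest singular values and vectors."""
    # full_matrices=False gives U (m x n), w (n,), Vt (n x n) when m > n.
    U, w, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(w[:k]) @ Vt[:k, :]


# Hypothetical word-by-context counts (rows: words, columns: contexts).
A = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
])

# Keep only two dimensions, as in the two-dimensional reconstruction.
A2 = rank_k_approximation(A, 2)

# Compare the correlation between two word rows before and after reduction.
before = np.corrcoef(A[0], A[3])[0, 1]
after = np.corrcoef(A2[0], A2[3])[0, 1]
print("correlation in original matrix:     ", before)
print("correlation in rank-2 approximation:", after)
```

Keeping all singular values reproduces {A} up to rounding error; it is the discarding of the smaller ones that changes the correlations between rows.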
9 5 4
9 12
1
SVD 3 2
2
SVD 2 6
Fm1
4 6
CET
6
Excel Greg Hood Excel Poptools2.4
pollution
SVD
wd3 SVD
LSA
7
Cosine
XY
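The cosine between two vectors X and Y, the similarity measure named above, can be computed as in the sketch below. The function name cosine and the decision to reject zero vectors are assumptions made for illustration; in LSA the vectors would be rows or columns of the reduced representation.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        # Assumption: the cosine is treated as undefined for a zero vector.
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_x * norm_y)


# Vectors pointing the same way score near 1; orthogonal vectors score 0.
print(cosine([1, 2, 3], [2, 4, 6]))
print(cosine([1, 0], [0, 1]))
```

Because the cosine depends only on direction, two passages of very different length can still be judged similar.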
.90 http://LSA.colorado.edu
These findings indicate a considerable degree of functional equivalence of perception and imagery. However, it is
possible that subjects in the imagery condition merely made plausible guesses about the fields of resolution, and did
not actually rely on imagery at all.
While it is very straightforward to see that previous learning can facilitate problem solving by supplying
well-practiced skills and strategies, it is perhaps less obvious that knowledge acquired in the past can sometimes
disrupt, and interfere with, subsequent attempts to solve problems.
.82
LSALandauerLahamFoltz19985LSA
5
8
LSA
LSA
LSALandauerLSATill
1988
Kintsch1999LSA
Long-term Working Memory (LTWM)
LTWMLSA
LSA
LSA: mountain, mountains
.81
mountain: peaks, rugged, ridges, climber; mountains: peaks, rugged, plateaus,
foothills. LTWM: The band
played a waltz. Mary loved to dance.
.45
LSA: Kintsch (2000), Steinhart (2001)
Summary StreetSteinhart
Summary Street
BoulderSummary Street
LSA
LSA
Berry, M., S. Dumais & G. O'Brien. 1994. Using linear algebra for intelligent information retrieval [M]. Boston: Houghton Mifflin Company.
Carroll, J., et al. 1971. Word Frequency Book. Houghton Mifflin Company & American Heritage Publishing Co., Inc.
Chomsky, N. 1965. Aspects of the Theory of Syntax [M]. Cambridge, MA: MIT Press.
Chomsky, N. 1986. Knowledge of language: Its nature, origin, and use [M]. Westport: Greenwood Publishing Group.
Chomsky, N. 2000. New horizons in the study of language and mind [M]. Cambridge: Cambridge University Press.
Deerwester, S., S. Dumais, G. Furnas, T. Landauer & R. Harshman. 1990. Indexing by latent semantic analysis [J]. Journal of the American Society
for Information Science 41: 391-407.
Dumais, S. et al. 1982. Using semantic analysis to improve access to textual information [J]. Machine Studies 17: 87-107.
Foltz, P. W., W. Kintsch & T. K. Landauer. 1993 (Jan). An analysis of textual coherence using Latent Semantic Indexing [A]. Paper presented at the
meeting of the Society for Text and Discourse, Jackson, WY.
Sampson, G. 2001. Empirical Linguistics [M]. London: Continuum.
Graesser, A. 1981. Prose Comprehension beyond the word [M]. New York: Springer.
Kintsch, W., D. Steinhart, G. Stahl & LSA Research Group. 2000. Developing summarization skills through the use of LSA-Based Feedback [J].
Interactive learning environments 8 (2): 87-109.
Kintsch, W. 1988. The role of knowledge in discourse comprehension: A construction-integration model [J]. Psychological Review 95: 163-182.
Kintsch, W. 1998. Comprehension [M]. Cambridge University Press. 86-91.
Kintsch, W., V. L. Patel & K. A. Ericsson. 1999. The role of long-term working memory in text comprehension [J]. Psychologia 42: 186-198.
Landauer, T. & S. Dumais. 1997. A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation
of knowledge [J]. Psychological Review 104: 211-240.
Landauer, T. K., D. Laham & P. W. Foltz. 1998. Computer-based grading of the conceptual content of essays. Unpublished manuscript.
Landauer, T., P. W. Foltz & D. Laham. 1998. An introduction to latent semantic analysis [J]. Discourse Processes 25: 259-284.
Maletic, J. et al. 1999. 14th IEEE ASE99 [A]. Cocoa Beach, FL. 12-15th [C]. pp. 251-254.
Osgood, C. 1971. Exploration in semantic space: A personal diary [J]. Journal of Social Issues 27: 5-64.
Pinker, S. 1994. The Language Instinct [M]. New York: William Morrow and Company, Inc.
Rosario, B. 2000. Latent Semantic Indexing: An overview [A]. INFOSYS 240 Spring 2000.
Shepard, R. 1987. Toward a universal law of generalization for psychological science [J]. Science 237: 1317-1323.
Steinhart, D. 2001. Summary Street: an intelligent tutoring system for improving student writing through the use of latent semantic analysis [D].
Unpublished doctoral dissertation, Institute of Cognitive Science, University of Colorado, Boulder.
Till, R., E. Mross & W. Kintsch. 1988. Time course of priming for associate and inference words in discourse context [J]. Memory and Cognition 16:
283-299.
van Dijk, T., & W. Kintsch. 1983. Strategies of discourse comprehension [M]. New York: Academic Press.
2000[M]308-329