Automatic Extraction of Topic Maps based Argumentation Trails
Marco Büchler, Lutz Maicher*, Frederik Baumgardt, Benjamin Bock*
Natural Language Processing Group, University of Leipzig, Germany
[mbuechler | maicher | fbaumgardt | bock]@informatik.uni-leipzig.de
Motivation and Introduction
Research on small worlds in natural language corpora, hypertext structures on the web, and social networks such as co-authorship networks has shown that the average path length between two arbitrary nodes is generally not larger than seven. The general problem is the discovery of the shortest path between these nodes, especially if the edges of the graph are only partially known and the distance is greater than two.
Academic and political debates face a similar problem. In many cases a relationship is supposed to exist between two terms of a specific domain, which form the origin and the endpoint of an argument. The closer connection between the two, however, is unidentified and becomes the essence of a discourse. Our approach discloses the relevant connections between the origin and the endpoint of an argument. We model this relationship as a connected path of terms, based on co-occurrences. This path is called an argumentation trail.
A co-occurrence is a directed edge between two terms c(ti, tj) that is extracted automatically using (different) statistical methods. An argumentation trail a(t1, tn) between two arbitrary terms t1 and tn is an ordered list of co-occurrences providing a connected path from t1 to tn. The distance d of an argumentation trail is the number of co-occurrences in this list. The distance between the terms t1 and tn is the length of the shortest argumentation trail between them.

* Topic Maps Lab: http://www.topicmapslab.de/
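Since the trail and distance definitions above amount to shortest paths in a co-occurrence graph, they can be sketched with a plain breadth-first search. This is a minimal illustration in Python, not the authors' implementation; the adjacency-dict graph and the example terms are invented for demonstration:

```python
from collections import deque

def shortest_trail(graph, source, target):
    """Breadth-first search for the shortest argumentation trail.

    graph: dict mapping each term to the set of terms it co-occurs with.
    Returns the trail as a list of terms, or None if no path exists.
    """
    if source == target:
        return [source]
    queue = deque([[source]])
    visited = {source}
    while queue:
        trail = queue.popleft()
        for neighbour in graph.get(trail[-1], ()):
            if neighbour == target:
                return trail + [neighbour]
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(trail + [neighbour])
    return None

# Invented toy graph; the distance d is the number of edges in the trail.
g = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(shortest_trail(g, "a", "d"))           # ['a', 'b', 'c', 'd']
print(len(shortest_trail(g, "a", "d")) - 1)  # distance 3
```

BFS suffices here because all co-occurrence edges count equally towards the distance; significance-weighted trails would need Dijkstra instead.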
The general problem is the calculation of the shortest and most significant argumentation trails between two terms. Similar to the small-world example above, argumentation trails of any distance greater than one are not obvious. The extraction of relevant argumentation trails becomes more complex as the density of the co-occurrence graph and the distance between origin and endpoint increase. To support sensemaking and other discourse-supporting techniques in academic and political debates, we introduce a method for the automatic extraction of argumentation trails.
Topic Maps are used for representing highly networked and interlinked domains. Furthermore, Topic Maps is a semantic integration technology, because each topic is a hub for all available information about a specific subject. Among other applications, topic maps are used for sensemaking and knowledge federation. For these applications, the integration of argumentation trails with further background information is necessary.

Therefore, the approach presented in this paper combines the idea of the automatic generation of argumentation trails with the formal representation of these trails as topic maps. In the evaluation we assess the quality of this first proof of concept.
State of the Art
A graph is an intuitive representation of relations between words. More formally, a graph can be expressed as G=(V,E), where V is a set of vertices (nodes, words) and E a subset of edges of V×V. In Natural Language Processing (NLP) the set V of nodes can be understood as the set of a corpus' word types. The set of edges E can be computed by co-occurrence analysis [Bue08, Bue09]. Typically, tens or hundreds of millions of co-occurrences can be extracted, which is why measures are necessary for computing an edge's significance. In the early 1990s some basic measures like mutual information [CH89] were introduced. However, this measure displays numerical problems for very rare co-occurrences. As a result, in 1993 an adaptation of the log-likelihood measure was introduced by [Dun93] which can better handle infrequent events. The sets of most significant edges, however, differ considerably between the two measures: whilst the log-likelihood ratio prefers more frequent co-occurrences, mutual information rates less frequent edges as more significant [Bue08, Eve05]. The log-likelihood ratio is therefore better suited for exploring and understanding a new domain, since it computes more general word associations, whereas if the domain is well-known, less frequent information becomes more relevant for users [BB04].
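Both significance measures can be computed from a 2×2 contingency table of sentence (or window) counts. The following Python sketch of Dunning's log-likelihood ratio and pointwise mutual information is illustrative, not the authors' implementation; note that 6.63, the cut-off used in the evaluation below, is the 1% critical value of the chi-square distribution with one degree of freedom:

```python
import math

def llr(k11, k12, k21, k22):
    """Dunning's log-likelihood ratio (G^2) from a 2x2 contingency table:
    k11 = windows containing both words, k12/k21 = only one of the two,
    k22 = neither. Larger values indicate stronger association."""
    def entropy_term(*ks):
        n = sum(ks)
        return sum(k * math.log(k / n) for k in ks if k > 0)
    return 2 * (entropy_term(k11, k12, k21, k22)
                - entropy_term(k11 + k12, k21 + k22)
                - entropy_term(k11 + k21, k12 + k22))

def mutual_information(k11, k12, k21, k22):
    """Pointwise mutual information of the word pair (log2 scale)."""
    n = k11 + k12 + k21 + k22
    return math.log2(k11 * n / ((k11 + k12) * (k11 + k21)))

# A strongly associated pair clears the chi-square 1% threshold of 6.63 ...
print(llr(100, 50, 50, 10000) > 6.63)     # True
# ... while statistically independent counts score (near) zero:
print(abs(llr(10, 10, 10, 10)) < 1e-9)    # True
```

The entropy formulation avoids computing expected counts explicitly and is numerically stable for the sparse tables typical of rare co-occurrences.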
One elementary feature of a graph is the small-world property, which describes the average path length between two different nodes [WS98, Bar00]. Research on small worlds is based on the work of Milgram [Mil67]. Several evaluations and applications on natural language corpora, hypertext structures on the web, and co-authorships on publications [CS01] have shown that the average path length is very small and generally not larger than seven.
Similar to lexical chains, argumentation trails form a minimum spanning tree of words that have the same or similar contexts [MWW07, MWH08]. However, there are differences in use cases and texts. Lexical chaining is often used in text summarisation [BCP01] or word sense disambiguation [GK03], where an approach that slides from paragraph to paragraph [MWH08] is useful. Since ancient text corpora are only fragmentarily preserved (caused by, e.g., natural decomposition and the deliberate deletion of person or city names), an approach working directly on a co-occurrence graph is chosen here.
Automatic Topic Maps Generation
Topic Maps (ISO 13250), the international industry standard for semantic information representation and integration, is an implementation of the subject-centric modelling approach1 [MB08]. A topic map is a subject-centric domain model consisting of topics, as subject proxies, and associations between them. Each topic can carry a set of typed names for the subject. Furthermore, occurrences allow representing typed properties of the subject. The associations between the topics are typed, role-based and n-ary. In summary, Topic Maps provides a subject-centric modelling approach and a full set of basic modelling constructs, like names, occurrences and full-featured associations, for convenient domain modelling. For a more comprehensive introduction to Topic Maps we refer to [AM05, Ma07a]. A topic map can be seen as a set or a stream of statements about subjects [LH08].
Besides the expressive and flexible modelling constructs, Topic Maps provides a powerful integration model [Ma07b, TMDM]. This integration model assures that two topics representing the same subject will always be merged. Technically, the subject of a topic is identified by a set of URIs called subject identifiers. Whenever two topics in a topic map have one subject identifier in common, they are automatically merged. Hence it is guaranteed that in a topic map there is always only one information hub for each subject. This powerful integration model is the foundation for the usage of Topic Maps as an integration technology.
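The merging rule described above — topics sharing a subject identifier collapse into one information hub — can be illustrated with a deliberately simplified topic model. Plain dicts stand in for a TMDM implementation, and the PSI URLs are invented for the example:

```python
def merge_topics(topics):
    """Merge all topics that share at least one subject identifier, so
    that each subject keeps exactly one information hub.

    topics: list of dicts with 'identifiers' (set of URIs) and 'names'.
    This is a simplified sketch, not a TMDM-conformant merge."""
    merged = []
    for topic in topics:
        ids, names = set(topic["identifiers"]), set(topic["names"])
        # collect every already-merged topic sharing an identifier
        overlapping = [m for m in merged if m["identifiers"] & ids]
        for m in overlapping:
            ids |= m["identifiers"]
            names |= m["names"]
            merged.remove(m)
        merged.append({"identifiers": ids, "names": names})
    return merged

topics = [
    {"identifiers": {"http://psi.example.org/herodot"},
     "names": {"Herodot"}},
    {"identifiers": {"http://psi.example.org/herodot",
                     "http://psi.example.org/herodotus"},
     "names": {"Herodotus"}},
    {"identifiers": {"http://psi.example.org/krates"},
     "names": {"Krates"}},
]
print(len(merge_topics(topics)))   # 2
```

Because merged identifier sets grow transitively, two topics with no identifier in common still merge if a third topic bridges them — exactly the hub behaviour the integration model guarantees.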
The subject-centric modelling approach supports the (semi-automatic) generation of subject-centric web portals and other interfaces to the highly interlinked data [MB08]. Combined with interchange protocols like TMRAP [Ga06] or TMIP [Ba05b], these applications simultaneously feed the web of linked data [Be06].
1 According to the Topic Maps standards a subject is "anything whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever" [TMDM]. In short, a subject is anything that can be a topic of conversation. Simplified, subject-centric modelling enforces that for each relevant subject exactly one proxy is created within the domain model. Consequently, proxies become the unique information access points for all information about their subjects.
In the context of the work presented in this paper, the generation of Topic Maps data is an important issue. The following table summarises the general categories of approaches to creating Topic Maps data.

For the global interoperability and usability of generated Topic Maps data, two issues are important: (1) the domain ontology and (2) the subject identifiers used at type and instance level [Ma07a].
The domain ontology formalises the domain knowledge behind the data and can be used for the optimisation and generation of the data-consuming applications [Bo08]. The ontology of the Topic Maps data created by the work presented in this paper is shown in Figure 2.
For the integration of the generated topic maps with other information about the represented subjects, adequate subject identifiers must be used at the type and, most importantly, at the instance level [Ma07b]. The methodology section sketches the approach for choosing the subject identifiers in the argumentation trails.
2 http://www.isotopicmaps.org/ctm/
3 http://www.topicmapslab.de/glossary/XTM
Methodology
Exploring argumentation trails in a semantic network is closely related to searching the k shortest paths from a source to a target node in an undirected graph, where the number of paths k is substituted by a maximum path length. The k-shortest-paths problem is applicable in many fields and has been extensively studied, with the number of publications approaching 100. The four most widely recognised methods are those of Yen [Yen71], Lawler [Law72], Katoh [KIM82] and Hoffman [HP59]. Yen's algorithm is a naive application of Dijkstra's shortest-path algorithm, with complexity O(kn³), where k is the number of paths and n the number of nodes in the graph. Lawler and Katoh improve upon Yen by compartmentalising the paths, Lawler by a constant factor and Katoh to complexity O(kn²). Even before Yen, Hoffman introduced a different idea based on the precalculation of shortest paths for all nodes, also resulting in complexity O(kn²).

In highly data-rich environments, as described in this paper, memory constraints become an issue as well. Thus, of the above algorithms only Yen's was feasible, but much too slow. Early trials demonstrated the need for a custom-made method tailored to the specific problem.
A drastic reduction of the search space was necessary. In the following approach to explore argumentation trails with a maximum length of 3, we utilise topological properties that help us reduce the actual amount of data.

Instead of searching for paths between the source s0 and target t0 nodes, we search for connections and overlaps between the neighborhoods (Ns = {s1,..,sn}, Nt = {t1,..,tm}) of both endpoints. For a neighbor of the source or target node to be included in an argumentation trail, it must be incident to, or part of, the neighborhood surrounding the endpoint on the opposite side of the trail. Thus, in a graph G=(V,E) we search for nodes v that are either members of both Ns and Nt (v∈Ns ∩ Nt) or incident to a node in the opposite neighborhood ((v, ti)∈E for v∈Ns, or (v, si)∈E for v∈Nt).
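Under these conditions, the neighborhood-based search for trails up to length 3 might look as follows. This is an illustrative Python sketch over an adjacency-dict graph, not the authors' implementation (which, per the footnotes, builds on Java graph libraries):

```python
def trails_up_to_3(graph, s, t):
    """Enumerate argumentation trails of length <= 3 between s and t by
    intersecting and connecting the neighborhoods of the two endpoints.
    graph: dict mapping each term to the set of co-occurring terms."""
    ns = graph.get(s, set()) - {t}   # neighborhood of the source
    nt = graph.get(t, set()) - {s}   # neighborhood of the target
    trails = []
    if t in graph.get(s, set()):
        trails.append([s, t])        # direct edge: length 1
    for v in ns & nt:
        trails.append([s, v, t])     # shared neighbor: length 2
    for v in ns:                     # edge bridging the neighborhoods: length 3
        for w in graph.get(v, set()) & nt:
            if w not in (s, t, v):
                trails.append([s, v, w, t])
    return trails

# Invented toy graph for demonstration.
g = {"s": {"a", "b", "t"}, "t": {"s", "b", "c"},
     "a": {"s", "c"}, "b": {"s", "t"}, "c": {"a", "t"}}
for trail in sorted(trails_up_to_3(g, "s", "t"), key=len):
    print(" -> ".join(trail))
```

The point of the reduction is that only the two neighborhoods and the edges between them are ever inspected, instead of the full graph that a generic k-shortest-paths algorithm would have to hold in memory.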
Fig. 1: Path selection on the topology of the endpoints' neighborhoods
Fig. 2: Schema (TMCL)8 of the Topic Maps export for argumentation trails
4 http://jung.sourceforge.net/
5 http://prefuse.org/
6 http://www.topicmapslab.de/projects/tiny_TiM
7 http://www.isotopicmaps.org/tmcl/
Results – Graph and argumentation trail properties
The underlying co-occurrence graph is based on a corpus of about 5.5 million sentences and 87 million word tokens. A co-occurrence in the graphs shown in Table 1 is significant if it occurs at least three times and has a minimum log-likelihood ratio of 6.63. All columns of Table 1 labelled 2) to 7) are different subgraphs of 1). In columns 2) to 4) the minimum word frequency is 2; additionally, the 100, 300 and 500 most frequent words were excluded, respectively. Column 5) of Table 1 shows a smaller subgraph based only on named entities. Whilst column 6) expands all named entities of column 5) by normalised9 equal words, column 7) works on both a normalised corpus and a normalised named entity list.
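The normalisation referred to here (see footnote 9: lowercasing and removal of diacritics) can be sketched with Unicode decomposition. Whether the original pipeline handled all scripts, e.g. polytonic Greek, exactly this way is an assumption:

```python
import unicodedata

def normalise(word):
    """Lowercase a word and strip diacritics: decompose to NFD, then
    drop all combining marks (the accents and breathings themselves)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed
                   if not unicodedata.combining(ch))

print(normalise("Büchler"))       # buchler
print(normalise("ARISTOTELÊS"))   # aristoteles
```

Under this normalisation, differently accented spellings of the same name collapse to one node, which is what lets columns 6) and 7) expand the named-entity subgraph.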
Comparing the average degree of the underlying co-occurrence graph in row e) with the average degree within the edge-reduced argumentation trails in row g), it is obvious that the path-finding algorithm reduces the degree dramatically. However, the average degree of a node in an argumentation trail in row g) is significantly smaller than the degree of the inner nodes of an argumentation trail, given in rows h) and i) for trails of length two and three, respectively. This is caused by the more central role of hubs within an argumentation trail.
Table 1: Some properties of argumentation trails, including the characteristic features of the underlying co-occurrence graph10 11
8 The schema is created with the TMCL editor Onotoa (http://onotoa.topicmapslab.de/). The graphical notation used is a non-normative GTM level 1 syntax (http://www.isotopicmaps.org/gtm/) proposed by the Topic Maps Lab and implemented by Onotoa. The namespace "eaqua" must be resolved to "http://psi.eaqua.net/ontology/" and the namespace "concept" to "http://psi.eaqua.net/corpora/[corpusname]/".
9 Normalised: all letters are lowercased and diacritics are removed.
10 Column labels: 1) complete graph, 2) top 100 stop words and words with a frequency of 1 removed, 3) SW=300, min. freq.=2, 4) SW=500, min. freq.=2, 5) only named entities (NE), 6) normalised named entities, 7) named entities on a normalised corpus.
11 Row labels: a) number of nodes, b) number of co-occurrences, c) number of significant co-occurrences, d) percentage, e) average degree, f) number of trails, g) average degree, h) average degree of internal node (trail length 2), i) average degree of internal node (trail length 3).
A further result of Table 1 is shown in column 5). Row d) describes the ratio of significant co-occurrences to found co-occurrences. In table cell 5d) this ratio is significantly larger than all other ratios; the next section is therefore restricted to this data set.
Use Case – argumentation trails for Classical Studies
In Classics there are many use-case scenarios for argumentation trails. On the one hand, such trails can be used for exploring new domains (e.g. new centuries) by looking at the way in which different terms are related. On the other hand, ancient texts are strongly fragmented. In those cases one can, for example, observe a person A and know the context B of the fragmentary document. Using argumentation trails one can then observe how both concepts belong together, based on other texts of the same time frame. Furthermore, the found trails can be filtered more rigorously than in Figure 3b) by using other words from the fragmentary text. As a result, one obtains a virtual expansion of the document's story.
Figure 3: a) Connection between two words with a low number of trails; b) large trail cloud between two words
Typically, one observes trails like those in Figure 3 in such graphs. From a common start and end point, trails are found that differ in only one node (third column in Figure 3a). The differing nodes of the black and red trails of Figure 3a are Krates and Herodot. Searching for both words in the corpus, one finds 46 sentences that contain both. The counterexample is shown in Figure 3b: the black and red trails have only the start and end point in common. Hence two completely different argumentation trail threads exist.
Further Work and Conclusion
As mentioned in the introduction, this paper is a proof of concept. We examined the feasibility of the automatic extraction of argumentation trails and their usage as a discourse-enriching technique in academic or political debates.

The automatic generation has been identified as difficult. However, some very interesting results have been achieved and should be the basis of further research.

As shown in Table 1, the number of trails needs to be reduced dramatically. This might be achieved, e.g., by semantic preclustering or by author restrictions. With semantic preclustering, a trail is rejected if every node belongs to a different and completely unrelated semantic cluster. In contrast to lexical chaining [MWH08], this step is necessary because it is difficult to build a reliable "document-based summary" from text fragments. Author restrictions can be used to reject trails whose edges are computed from completely different sets of authors or works.
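One possible reading of this author restriction — reject a trail unless at least one author (or work) contributed to every one of its edges — can be sketched as follows. Both the function and the example data are hypothetical; the paper does not specify the exact criterion:

```python
def passes_author_restriction(trail, edge_authors):
    """Keep a trail only if some author contributed to every edge.

    trail: list of terms forming the trail.
    edge_authors: dict mapping an unordered term pair (frozenset) to the
    set of authors whose texts produced that co-occurrence."""
    author_sets = [edge_authors[frozenset(edge)]
                   for edge in zip(trail, trail[1:])]
    # reject if the edges come from completely disjoint author sets
    return bool(set.intersection(*author_sets))

# Hypothetical example data.
edge_authors = {
    frozenset({"krates", "kyniker"}): {"Diogenes Laertios"},
    frozenset({"kyniker", "diogenes"}): {"Diogenes Laertios", "Lukian"},
}
print(passes_author_restriction(
    ["krates", "kyniker", "diogenes"], edge_authors))   # True
```

A weaker variant would only require adjacent edges to share an author; requiring a global intersection, as above, is the strictest interpretation of the restriction.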
Furthermore, trails containing network hubs should be weighted lower to avoid distorting the results; this effect is visible in Table 1 as well. All of these complexity-reduction approaches are necessary to compute trails on more complex graphs.
In the field of visualisation, a stronger clustering of trails is necessary. As depicted in Figure 3a), there exist two almost equal trails differing in only two nodes. By clustering such trails into more globally relevant argumentation trail threads, the understanding of more complex trail clouds, as shown in Figure 3b, becomes easier and faster.

Additionally, the typing of nodes will be done by typed significant terms (e.g. literary, geographic or dating classifications) [BHG08]. The same holds for typing or naming the edges in the argumentation trails. Such enrichment will additionally support a stronger integration with Topic Maps. Generally, the work in this paper does not cover the problem of integrating the generated argumentation trails (as topic maps) with further background information in sufficient detail.
Argumentation Trails and Topic Maps
Owing to the historical roots of Topic Maps, the technology focuses on the aggregation of information about subjects (especially at the instance level, like persons, projects, etc.). The idea is to collect and document information about a subject from different "perspectives", whereby contradictions are expected. The integration of "facts" and "discourses" about subjects is a long-standing tradition in Topic Maps, which today is coined sensemaking [Pa08] and knowledge federation. Argumentation trails support discourses, ease the creation of new hypotheses and open new views on the data. Combined with further background information, they are a sensemaking and discourse-supporting tool for academic and political debates. By using Topic Maps and adequate subject identifiers, the concepts in the argumentation trails can be (instantly) integrated with other data or applications dealing with the same subjects.
References
[AM05] Ahmed, K.; Moore, G.: An introduction to Topic Maps. In: The Architecture Journal
5, 2005.
[Ba04b] Barta, R.: Virtual and Federated Topic Maps. In: Proceedings of XML Europe,
Amsterdam (2004).
[Ba05b] Barta, R.: TMIP, A RESTful Topic Maps Interaction Protocol. In: Proceedings of Extreme Markup Languages 2005, Montréal. Online available at: http://www.mulberrytech.com/Extreme/Proceedings/xslfo-pdf/2005/Barta01/EML2005Barta01.pdf
[Bar00] Barabasi, A.L. et al .: Scale-free characteristics of random networks: the topology of
the World-wide web, Physica A (281)70-77, 2000
[BB04] Baroni, M.; Bisi, S.: Using cooccurrence statistics and the web to discover synonyms
in a technical language. Proceedings of LREC 2004.
[BCP01] Brunn, M., Chali Y., Pinchak C. J.: Text Summarization Using Lexical Chains. 2001
[Be06] Berners-Lee, T.: Linked Data. Online available at:
http://www.w3.org/DesignIssues/LinkedData.html (2009-02-20)
[BM06] Böhm, K.; Maicher, L.: Real-time Generation of Topic Maps from Speech Streams.
In: Proceedings of First International Workshop on Topic Maps Research and Applications
(TMRA'05), Leipzig; Springer LNAI 3873, (2006).
[Bo08] Bock, B.: Topic-Maps-Middleware. Modellgetriebene Entwicklung kombinierbarer domänenspezifischer Topic-Maps-Komponenten. Diploma thesis at University of Leipzig (2008).
[Ga06] Garshol, L. M.: TMRAP – Topic Maps Remote Access Protocol. In: Maicher, L.;
Park, J. (Hrsg.): Charting the Topic Maps Research and Applications Landscape. LNAI 3873,
Springer:Berlin (2006).
[GK03] Galley, M., McKeown, K.: Improving Word Sense Disambiguation in Lexical
Chaining. 2003.
[He08] Heuer, L.: Streaming Topic Maps API. In: Maicher, L.; Garshol, L.M. (eds.): Subject-centric computing. Proceedings of TMRA 2008. Leipzig (2008).
[HP59] Hoffman, W.; Pavley, R.: A method for the solution of the nth best path problem. Journal of the Association for Computing Machinery (ACM) 1959; 6:506-514.
[KIM82] Katoh, N.; Ibaraki, T.; Mine, H.: An efficient algorithm for k shortest simple paths. Networks 1982; 12:411-427.
[Kle00] Kleinberg, J.: The small-world phenomenon: An algorithmic perspective. Proc. 32nd
ACM Symposium on Theory of Computing, 2000.
[Law72] Lawler, E. L.: A procedure for computing the k best solutions to discrete optimisation
problems and its application to the shortest path problem. In: Management Science, Theory
Series 1972; 18:401-405.
[LK08] Lachica, R.; Karabeg, D.: Metadata Creation in Socio-semantic Tagging Systems:
Towards Holistic Knowledge Creation and Interchange. In: Maicher, L.; Garshol, L.M.:
Scaling Topic Maps. LNAI 4999, Springer:Berlin (2008).
[Ma07a] Maicher, L.: Autonome Topic Maps. Zur dezentralen Erstellung von implizit und
explizit vernetzten Topic Maps in semantisch heterogenen Umgebungen. Doctoral thesis at
University of Leipzig (2007).
[Ma07b] Maicher, L.: The Impact of Semantic Handshakes. In: Maicher, L.; Sigel, A.; Garshol,
L. M.: Leveraging the Semantics of Topic Maps. LNAI 4438, Springer, Berlin (2007).
[Ma08] Maicher, L.: Musica migrans - Mapping the Movement of Migrant Musicians.
Presentation held at the Topic Maps User Conference 2008, Oslo. Slides available at (April 10,
2008): http://www.topicmaps.com/tm2008/maicher.pdf
[Mai08] Maicher, L.: Mapping between the Dublin Core Abstract Model DCAM and the
TMDM. In: Maicher, L.; Garshol, L.M.: Scaling Topic Maps. LNAI 4999, Springer, Berlin.
[MB08] Maicher, L.; Bock, B.: ActiveTM - The Factory for Domain-customised Portal
Engines. In: Proceedings of I-Media’08, Graz (2008).
[Mil67] Milgram, S.: The small world problem. In: Psychology Today 2, pp. 60-67, 1967.
[MWH08] Mehler, A.; Waltinger, U.; Heyer, G.: Towards Automatic Content Tagging: Enhanced Web Services in Digital Libraries Using Lexical Chaining. In: 4th International Conference on Web Information Systems and Technologies (WEBIST '08), Funchal, Portugal, 2008.
[MWW07] Mehler, A.; Waltinger, U.; Wegner, A.: A Formal Text Representation Model Based on Lexical Chaining. In: Proceedings of the KI 2007 Workshop on Learning from Non-Vectorial Data (LNVD 2007), September 10, Osnabrück, pp. 17-26. University of Osnabrück, 2007.
[Pa08] Park, J.: Topic Maps, Dashboards and Sensemaking. In: Maicher, L.; Garshol, L.M.:
Subject-centric computing. Proceedings of TMRA 2008. Leipzig, (2008).
[Va05] Vatant, B.: Tools for semantic interoperability: hubjects. Working Paper. Online
available at: http://www.mondeca.com/lab/bernard/hubjects.pdf
[WS98] Watts, D.J., S.H. Strogatz: Collective dynamics of ‘small-world’ networks. In: Nature
393:440-442, 1998.
[Yen71] Yen, J.Y.: Finding the k shortest loopless paths in a network. In: Management Science
1971; 17:712-716.