<i>XVIII Congreso de la Asociación Española para el Procesamiento del Lenguaje Natural</i>, edited by Llavorí, Rafael Berlanga, et al., Universitat Jaume I. Servei de
Comunicació i Publicacions, 2012. ProQuest Ebook Central, http://ebookcentral.proquest.com/lib/bibliotecauptsp/detail.action?docID=4184256.
Created from bibliotecauptsp on 2019-09-28 09:35:08.
…such as PMI (Church and Hanks, 1990; Hearst, 1992; Pantel and Pennacchiotti, 2006), measuring the entropy between word pairs (Ryu and Choi, 2005), and vector computations that measure the conceptual distance between words (Ritter, Soderland and Etzioni, 2009).
As a supporting mechanism to corroborate whether candidate hyponyms and hypernyms hold a canonical relation, authors such as Hearst (1992) and Ritter, Soderland and Etzioni (2009) use the lexical database WordNet (Fellbaum, 1998) as a reference source.
Once the information content that a pair of words shares as hyponym and hypernym has been corroborated, an evaluation follows to determine the degree of precision and recall (Van Rijsbergen, 1979) achieved by the method, with adjustments based on an F-measure where required (Ortega, Villaseñor and Montes, 2007; Ortega et al., 2011).

3 Problems in selecting hyponyms and hypernyms relevant to a knowledge domain

Taking into account the extraction methods … the role that specific phenomena play when choosing hyponyms and hypernyms that are relevant to a knowledge domain. On the level of linguistic analysis, most of these methods have focused on finding new instances of hyponyms and hypernyms from a set of seed instances that are recognizable in sentence contexts (Hearst, 1992; Pantel and Pennacchiotti, 2006; Ritter, Soderland and Etzioni, 2009; Ortega, Villaseñor and Montes, 2007; Ortega et al., 2011). However, the potential of the hyponymy relations that a hypernym can generate in its function as head of a noun phrase has not yet been considered. In line with Croft and Cruse (2004), we believe that a single-word hypernym plus a semantic feature can generate relevant hyponyms that account for the structure of a knowledge domain, and likewise reflect a hypernym's perspectives of classification.
Following this idea, in this work we focus on noun + adjective phrases, bearing in mind the semantic function that adjectives fulfil as units that express and prioritize conceptual features, whose selection can be conditioned by the knowledge domain in which they occur, as is the case with medical terminology. We consider, then, that if the observation made by Croft and Cruse is not taken into account, …
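As a minimal sketch of the first association measure cited above, PMI can be estimated from co-occurrence counts; the counts below are invented toy values for illustration, not figures from any corpus in the paper:

```python
import math

def pmi(cooc, count_x, count_y, total):
    """Pointwise mutual information: log2(P(x, y) / (P(x) * P(y)))."""
    p_xy = cooc / total
    p_x = count_x / total
    p_y = count_y / total
    return math.log2(p_xy / (p_x * p_y))

# Toy counts: two words co-occur in 8 of 1,000 context windows,
# appearing 20 and 50 times overall.
score = pmi(cooc=8, count_x=20, count_y=50, total=1000)  # -> 3.0
```

A large positive PMI indicates the pair co-occurs far more often than chance would predict, which is the signal such methods use to score candidate hyponym-hypernym pairs.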
Copyright © 2012. Universitat Jaume I. Servei de Comunicació i Publicacions. All rights reserved.
The main goal of this workshop is to bring together researchers who are working on the creation of new LKRs in any domain, or on their exploitation for specific information processing tasks such as data analysis, text mining, natural language processing and visualization, as well as for knowledge engineering issues such as knowledge acquisition, validation and personalization.
Research, demo and position papers showing the benefits that exploiting LKRs can bring to the information processing area are especially welcome at this workshop.
XVIII CONGRESO DE LA SOCIEDAD ESPAÑOLA PARA EL PROCESAMIENTO DEL LENGUAJE NATURAL 90
High-quality documentation such as technical/scientific articles and patents has not received all the attention the field deserves. Given the explosion of technical documentation available on the Web and in intranets, scientists and research-and-development facilities face a true scientific information deluge: summarization should be a key instrument not only for reducing the information content but also for measuring information relevance in context, providing users with adequate answers in context.
The workshop "Automatic Text Summarization of the Future" aims to bring together researchers and practitioners of natural language processing to address the aforementioned and related issues.
The full papers of this workshop have been published at: http://ceur-ws.org/Vol-882/
PROGRAM
A proposal for a European large knowledge repository in advanced food composition tables
for assessing dietary intake
Oscar Coltell, Francisco Madueño, Zoe Falomir, Dolores Corella
Short Papers
Statements of interest
TASS - Workshop on Sentiment Analysis at SEPLN

Organizers:
Currently, market research is typically performed using user surveys. However, the rise of social media such as blogs and social networks, and the increasing amount of user-generated content in the form of reviews, recommendations, ratings and other expressions of opinion, has led to an emerging trend towards online reputation analysis. So-called sentiment analysis, i.e., the application of natural language processing and text analytics to identify and extract subjective information from texts, is the first step towards online reputation analysis and is becoming a promising topic in marketing and customer relationship management, as social media and its associated word-of-mouth effect are turning out to be the most important source of information for companies about their customers' sentiments towards their brands and products.
Sentiment analysis is a major technological challenge. The task is so hard that even humans often disagree on the sentiment of a given text. The fact that issues one individual finds acceptable or relevant may not seem so to others, along with multilingual aspects, cultural factors and differing contexts, makes it very hard to classify a text written in a natural language as expressing positive or negative sentiment. And the shorter the text, for example when analyzing Twitter messages or short Facebook comments, the harder the task becomes.
Within this context, TASS is an experimental evaluation workshop, organized as a satellite event of the SEPLN 2012 Conference held on September 7th, 2012 at Universitat Jaume I in Castellón de la Plana, Comunidad Valenciana, Spain, to foster research in the field of sentiment analysis in social media, specifically focused on the Spanish language. The main objective is to promote the application of existing state-of-the-art algorithms and techniques, and the design of new ones, for the implementation of complex systems able to perform a
sentiment analysis based on short text opinions extracted from social media messages
(specifically Twitter) published by a series of representative personalities.
The challenge task is intended to provide a benchmark forum for comparing the latest
approaches in this field. In addition, with the creation and release of the fully tagged corpus,
we aim to provide a benchmark dataset that enables researchers to compare their algorithms
and systems.
PROGRAM
Techniques for Sentiment Analysis and Topic Detection of Spanish Tweets: Preliminary Report
Antonio Fernández Anta, Philippe Morere, Luis Núñez Chiroque, Agustín Santos

UNED @ TASS: Using IR techniques for topic-based sentiment analysis through divergence models
Angel Castellano González, Juan Cigarrán Recuero, Ana García Serrano
TASS - Workshop on Sentiment Analysis at SEPLN
Julio Villena-Román, Janine García-Morera, Cristina Moreno-García, Linda Ferrer-Ureña
DAEDALUS
{jvillena, jgarcia, cmoreno}@daedalus.es

Sara Lana-Serrano
DIATEL - Universidad Politécnica de Madrid
slana@diatel.upm.es
Abstract: This paper describes TASS, an experimental evaluation workshop within SEPLN to
foster the research in the field of sentiment analysis in social media, specifically focused on
Spanish language. The main objective is to promote the application of existing state-of-the-art
algorithms and techniques and the design of new ones for the implementation of complex
systems able to perform a sentiment analysis based on short text opinions extracted from social
media messages (specifically Twitter) published by representative personalities. The paper
presents the proposed tasks, the contents, format and main statistics of the generated corpus, the
participant groups and their different approaches, and, finally, the overall results achieved.
Keywords: TASS, reputation analysis, sentiment analysis, social media.
…about those actions. It covers many factors to calculate the market value of reputation. Reputation analysis has come into wide use as a major factor of competitiveness in the increasingly complex marketplace of personal and business relationships among people and companies.
Currently market research using user surveys is typically performed. However, the rise of social media such as blogs and social networks and the increasing amount of user-generated contents in the form of reviews, recommendations, ratings and any other form of opinion has led to an emerging trend towards online reputation analysis. The so-called sentiment analysis, i.e., the application of natural language processing and text analytics to identify and extract subjective information from texts, which is the first step towards the online reputation analysis, is becoming a promising topic in the field of marketing and customer relationship management, as social media and its associated word-of-mouth effect is turning out to be the most important source of information for companies about their customers' sentiments towards their brands and products.
Sentiment analysis is a major technological challenge. The task is so hard that even humans often disagree on the sentiment of a given text. The fact that issues that one individual finds acceptable or relevant may not be the same to others, along with multilingual aspects, cultural factors and different contexts, makes it very hard to classify a text written in a natural language into a positive or negative sentiment. And the shorter the text, for example when analyzing Twitter messages or short comments in Facebook, the harder the task becomes.
Within this context, TASS, which stands for Taller de Análisis de Sentimientos en la SEPLN, …
… experimental evaluation workshop, organized as a satellite event of the SEPLN 2012 Conference, held on September 7th, 2012 in Jaume I University at Castellón de la Plana, Comunidad Valenciana, Spain, to promote the research in the field of sentiment analysis in social media, initially focused on Spanish, although it could be extended to any language. The main objective is to improve the existing techniques and algorithms and design new ones in order to perform a sentiment analysis in short text opinions extracted from social media messages (specifically Twitter) published by a series of important personalities. The challenge task is intended to provide a benchmark forum for comparing the latest approaches in this field. In addition, with the creation and release of the fully tagged corpus, we aim to provide a benchmark dataset that enables researchers to compare their algorithms and systems.

2 Description of tasks

Two tasks are proposed for the participants in this first edition: sentiment analysis and trending topic coverage. Groups may participate in both tasks or just in one of them. Along with the submission of experiments, participants are encouraged to submit a paper to the workshop in order to describe their systems to the audience in a regular workshop session together with special invited speakers. Submitted papers are reviewed by the program committee.

2.1 Task 1: Sentiment Analysis

This task consists of performing an automatic sentiment analysis to determine the polarity of each message in the test corpus. The evaluation metrics to evaluate and compare the different systems are the usual measurements of precision (1), recall (2) and F-measure (3), calculated over the full test set, as shown in Figure 1.

[Figure 1: Evaluation metrics — equations (1) precision, (2) recall and (3) F-measure]

2.2 Task 2: Trending topic coverage

In this case, the technological challenge is to build a classifier to identify the topic of the text, …

2 http://www.daedalus.es/TASS
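For concreteness, the usual per-class definitions behind precision (1), recall (2) and F-measure (3) can be sketched as follows; the label lists are illustrative toy data, not actual TASS corpus annotations:

```python
def precision_recall_f1(gold, pred, label):
    """Per-class precision, recall and F-measure from gold/predicted labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy 6-class labels (P+, P, NEU, N, N+, NONE), evaluated for class "P".
gold = ["P", "N", "NEU", "P", "NONE", "N"]
pred = ["P", "N", "P",   "P", "NONE", "NEU"]
p, r, f = precision_recall_f1(gold, pred, "P")  # p ~ 0.667, r = 1.0, f = 0.8
```

Macro-averaging these per-class scores over the six polarity levels gives one common way to compare systems on the full test set.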
… TASS. For polarity classification, they propose an emotional concept-based method. The original method makes use of an affective lexicon to represent the text as the set of emotional meanings it expresses, along … a ranking of the discriminative words. Moreover, a set of events is retrieved based on a probabilistic approach that was adapted to the characteristics of Twitter. To determine … the ranking of words of each topic and the ranking of words most likely …
4.8 Universidad de Málaga (UMA)
TASS: Detecting Sentiments in Spanish Tweets
Xabier Saralegi Urizar
Elhuyar Fundazioa
Zelai Haundi 3, 20170 Usurbil
x.saralegi@elhuyar.com

Iñaki San Vicente Roncal
Elhuyar Fundazioa
Zelai Haundi 3, 20170 Usurbil
i.sanvicente@elhuyar.com
Resumen: Este artículo describe el sistema presentado por nuestro grupo para la tarea de análisis de sentimiento enmarcada en la campaña de evaluación TASS 2012. Adoptamos una aproximación supervisada que hace uso de conocimiento lingüístico. Este conocimiento lingüístico comprende lematización, etiquetado POS, etiquetado de palabras de polaridad, tratamiento de emoticonos, tratamiento de negación, y ponderación de polaridad según el nivel de anidamiento sintáctico. También se lleva a cabo un preprocesado para el tratamiento de errores ortográficos. La detección de las palabras de polaridad se hace de acuerdo a un léxico de polaridad para el castellano creado en base a dos estrategias: proyección o traducción de un léxico de polaridad de inglés al castellano, y extracción de palabras divergentes entre los tuits positivos y negativos correspondientes al corpus de entrenamiento. Los resultados de la evaluación final muestran un buen rendimiento del sistema así como una notable robustez tanto para la detección de polaridad a alta granularidad (65% de exactitud) como a baja granularidad (71% de exactitud).
Palabras clave: TASS, Análisis de sentimiento, Minería de opiniones, Detección de polaridad
Abstract: This article describes the system presented for the task of sentiment
analysis in the TASS 2012 evaluation campaign. We adopted a supervised approach
that includes some linguistic knowledge-based processing for preparing the features.
The processing comprises lemmatisation, POS tagging, tagging of polarity words,
treatment of emoticons, treatment of negation, and weighting of polarity words depending on their syntactic nesting level. A pre-processing step for the treatment of spelling errors is also performed. Detection of polarity words is done according to a polarity lexicon built in two ways: projection of an English lexicon into Spanish, and extraction of divergent words from the positive and negative tweets of the training corpus. Evaluation results show good performance and good robustness of the system, both for fine-granularity (65% accuracy) and coarse-granularity (71% accuracy) polarity detection.
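As an illustration of the negation treatment mentioned in the abstract, here is a toy sketch; the cue list, scope window and lexicon are invented for illustration and are not the system's actual rules:

```python
# Toy negation treatment: invert the polarity of lexicon words that
# appear within a fixed window after a Spanish negation cue.
NEGATORS = {"no", "nunca", "ni"}  # illustrative cue list, not the paper's
WINDOW = 3                        # assumed scope: the next 3 tokens

def polarity_with_negation(tokens, lexicon):
    """Sum word polarities, flipping the sign inside a negator's scope."""
    score, negated_until = 0, -1
    for i, tok in enumerate(tokens):
        if tok in NEGATORS:
            negated_until = i + WINDOW
            continue
        pol = lexicon.get(tok, 0)
        score += -pol if i <= negated_until else pol
    return score

lexicon = {"bueno": 1, "malo": -1}          # toy polarity lexicon
polarity_with_negation("no es bueno".split(), lexicon)   # -> -1
polarity_with_negation("es bueno".split(), lexicon)      # -> 1
```

The actual system weights polarity by syntactic nesting rather than a flat token window; this sketch only shows the sign-flipping idea.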
…the tweets. The TASS evaluation workshop aims "to provide a benchmark forum for comparing the latest approaches in this field". Our team only took part in the first task, which involved predicting the polarity of a number of tweets, with respect to 6-category classification, indicating whether the text expresses a positive, negative or neutral sentiment, or no sentiment at all. It must be noted that most works in the literature only classify sentiments as positive or negative, and only in a few papers are neutral and/or objective categories included. We developed a supervised system based on a polarity lexicon and a series of additional linguistic features.
The rest of the paper is organized as follows. Section 2 reviews the state of the art in the polarity detection field, placing special interest on sentence level detection, and on Twitter messages in particular. The third section describes the system we developed, the features we included in our supervised system and the experiments we carried out over the training data. The next section presents the results we obtained with our system, first on the training set and later on the test data set. The last section draws some conclusions and future directions.

2 State of the Art

Much work has been done in the last decade in the field of sentiment labelling. Most of these works are limited to polarity detection. Determining the polarity of a text unit (e.g., a sentence or a document) usually includes using a lexicon composed of words and expressions annotated with prior polarities (Turney, 2002; Kim and Hovy, 2004; Riloff, Wiebe, and Phillips, 2005; Godbole, Srinivasaiah, and Skiena, 2007). Much research has been done on the automatic or semi-automatic construction of such polarity lexicons (Riloff and Wiebe, 2003; Esuli and Sebastiani, 2006; Rao and Ravichandran, 2009; Velikovich et al., 2010).
Regarding the algorithms used in sentiment classification, although there are approaches based on averaging the polarity of the words appearing in the text (Turney, 2002; Kim and Hovy, 2004; Hu and Liu, 2004; Choi and Cardie, 2009), machine learning methods have become the more widely used approach. Pang et al. (2002) proposed a unigram model using Support Vector machines which does not need any prior lexicon to classify movie reviews. Read (2005) confirmed the necessity to adapt the models to the application domain, and (Choi and Cardie, 2009) address the same problem for polarity lexicons.
In the last few years many researchers have turned their efforts to microblogging sites such as Twitter. As an example, (Bollen, Mao, and Zeng, 2010) have studied the possibility of predicting stock market results by measuring the sentiments expressed in Twitter about it. The special characteristics of the language of Twitter require a special treatment when analyzing the messages. A special syntax (RT, @user, #tag, ...), emoticons, ungrammatical sentences, vocabulary variations and other phenomena lead to a drop in the performance of traditional NLP tools (Foster et al., 2011; Liu et al., 2011). In order to solve this problem, many authors have proposed a normalization of the text as a pre-process of any analysis, reporting an improvement in the results. Brody (2011) deals with the word lengthening phenomenon, which is especially important for sentiment analysis because it usually expresses emphasis of the message. (Han and Baldwin, 2011) use morphophonemic similarity to match variations with their standard vocabulary words, although only 1:1 equivalences are treated, e.g., 'imo = in my opinion' would not be identified. Instead, they use an Internet slang dictionary to translate some of those expressions and acronyms. Liu et al. (2012) propose combining three strategies, including letter transformation, "priming" effect, and misspelling corrections.
Once the normalization has been performed, traditional NLP tools may be used to analyse the tweets and extract features such as lemmas or POS tags (Barbosa and Feng, 2010). Emoticons are also good indicators of polarity (O'Connor et al., 2010). Other features analyzed in sentiment analysis, such as discourse information (Somasundaran et al., 2009), can also be helpful. (Speriosu et al., 2011) explore the possibility of exploiting the Twitter follower graph to improve polarity classification, under the assumption that people influence one another or have shared affinities about topics. (Barbosa and Feng, 2010; Kouloumpis, Wilson, and Moore, 2011) combined polarity lexicons with machine learning for labelling the sentiment of tweets. Sindhwani and Melville (2008) adopt a semi-
supervised approach using a polarity lexicon English-Spanish bilingual dictionary Den−es
combined with label propagation. (see Table 2). Despite Pen including neutral
A common problem of the supervised ap- words, only positive and negative ones were
proaches is to gather labelled data for train- selected and translated. Ambiguous trans-
ing. In the case of the TASS challenge, we lations were solved manually by two annot-
would tackle this problem should we want to ators. Altogether, 7,751 translations were
collect additional training data. In order to automatically build annotated corpora, Go, Bhayani, and Huang (2009) collect tweets containing the ":)" emoticon and regard them as positive, and likewise for the ":(" emoticon. Kouloumpis (2011) uses a similar approach based on the most common positive and negative hashtags. Barbosa and Feng (2010) rely on existing web services such as Twendz or Tweetfeel to collect annotated tweets. One major problem of the aforementioned strategies is that only positive and negative tweets can be collected.

3 Experiments

3.1 Training Data

The training data Ct provided by the organization consists of 7,219 Twitter messages (see Table 1). Each tweet is tagged with its global polarity, indicating whether the text expresses a positive, negative or neutral sentiment, or no sentiment at all. Six levels have been defined: strong positive (P+), positive (P), neutral (NEU), negative (N), strong negative (N+) and no sentiment (NONE). The numbers of tweets corresponding to P+ and NONE are higher than the rest; NEU is the class including the fewest tweets. In addition, each message includes its Twitter ID, the creation date and the Twitter user ID.

Polarity   #tweets   % of #tweets
P+         1,764     24.44%
P          1,019     14.12%
NEU        610       8.45%
N          1,221     16.91%
N+         903       12.51%
NONE       1,702     23.58%
Total      7,219     100%

Table 1: Polarity classes distribution in corpus Ct.

[...] checked. Polarity was also checked and corrected during this manual annotation. It must be noted that, as all translation candidates were checked, many variants of the same source word were selected in many cases. Finally, 2,164 negative words and 1,180 positive words were included in the polarity lexicon (see fifth column of Table 3). We detected a significant number of OOV words (35%) in this translation process (see second and third columns of Table 3). Most of these words were inflected forms: pasts (e.g., "terrified"), plurals (e.g., "winners"), adverbs (e.g., "vibrantly"), etc., so they were not dealt with.

         #headwords   #pairs   avg. #translations
Den−es   15,134       31,884   2.11

Table 2: Characteristics of the Den−es bilingual dictionary.

b) As a second source for our polarity lexicon, words were automatically extracted from the training corpus Ct. In order to extract the words most associated with a certain polarity, say positive, we divided the corpus into two parts: positive tweets and the rest of the corpus. Using the log-likelihood ratio (LLR), we obtained the ranking of the most salient words in the positive part with respect to the rest of the corpus. The same process was conducted to obtain negative candidates. The top 1,000 negative and top 1,000 positive words were manually checked. Among them, 338 negative and 271 positive words were selected for the polarity lexicon (see sixth column in Table 3). We found a higher concentration of good candidates among the best-ranked candidates (see Figure 1).

Copyright © 2012. Universitat Jaume I. Servei de Comunicació i Publicacions. All rights reserved.
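The LLR ranking step described above can be sketched in a few lines. This is an illustrative implementation using Dunning's 2x2 log-likelihood formulation; the function names (`llr`, `rank_salient`) and the toy tweets are ours, not from the paper:

```python
import math
from collections import Counter

def llr(k11, k12, k21, k22):
    """Dunning's log-likelihood ratio for the 2x2 contingency table:
    k11 = occurrences of w in positive tweets, k12 = w in the rest,
    k21 = other tokens in positive tweets, k22 = other tokens in the rest."""
    def h(*counts):  # unnormalized entropy term: sum k*log(k/N)
        total = sum(counts)
        return sum(k * math.log(k / total) for k in counts if k > 0)
    return 2 * (h(k11, k12, k21, k22)
                - h(k11 + k12, k21 + k22)
                - h(k11 + k21, k12 + k22))

def rank_salient(pos_tweets, rest_tweets, top_n=1000):
    """Rank words of the positive part by salience w.r.t. the rest."""
    pos = Counter(w for t in pos_tweets for w in t.split())
    rest = Counter(w for t in rest_tweets for w in t.split())
    n_pos, n_rest = sum(pos.values()), sum(rest.values())
    score = {w: llr(pos[w], rest[w], n_pos - pos[w], n_rest - rest[w])
             for w in pos}
    return sorted(score, key=score.get, reverse=True)[:top_n]
```

The candidates at the top of the ranking would then be checked manually, as was done for the top 1,000 words of each polarity.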
[...] distinguishing between subjective and objective texts. Our hypothesis is that certain POS tags are more frequent in opinion messages, e.g., adjectives. In our experiments, the POS tags provided by FreeLing were used. We used as a feature the frequency of the POS tags in a message.

Results in Table 4 show that this feature provides a notable improvement, and that it is especially helpful for detecting objective messages (see the difference in F-score between SP and SP+PO for the NONE class).

3.3.5 Frequency of Polarity Words (FP)

The SP classifier does not interpret the polarity information included in the lexicon. We explicitly provide that information as a feature to the classifier. Furthermore, without the polarity information, the classifier will be built taking into account only those polarity words appearing in the training data. Including the polarity frequency information explicitly [...]

The importance of a word in the tweet determines the influence it can have on the polarity of the whole tweet. We measured the importance of each word w by calculating the relative syntactic nesting level ln(w). The deeper a word is nested, the less important it is. The relative syntactic nesting level is computed as the inverse of the syntactic nesting level (1/ln(w)).

Features/Metric   Acc. (6 cat.)   P+      P       NEU     N       N+      NONE
Baseline          0.45            0.574   0.267   0.137   0.368   0.385   0.578
SP                0.484           0.594   0.254   0.098   0.397   0.422   0.598
SP+PO             0.496           0.596   0.245   0.093   0.414   0.438   0.634
SP+EM             0.49            0.612   0.253   0.097   0.402   0.428   0.6
SP+FP             0.514           0.633   0.261   0.115   0.455   0.438   0.613
All               0.523           0.648   0.246   0.111   0.463   0.452   0.657
ALL+AC1           0.523           0.647   0.248   0.116   0.46    0.451   0.655

Table 4: Accuracy results obtained on the evaluation of the training data. Columns 3 to 8 show F-scores for each of the class values.
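The 1/ln(w) weighting can be sketched as follows. The toy lexicon and the pre-parsed input are illustrative; in the actual system the nesting levels would come from a syntactic parser:

```python
# Illustrative polarity lexicon; in the paper this role is played
# by the semi-automatically built lexicon Pes.
POLARITY = {"feliz": 1.0, "genial": 1.0, "triste": -1.0, "horrible": -1.0}

def weighted_polarity(parsed_tweet):
    """parsed_tweet: list of (word, nesting_level) pairs, level >= 1.
    Each polarity word contributes its polarity scaled by 1/l(w),
    so deeply nested words influence the global score less."""
    score = 0.0
    for word, level in parsed_tweet:
        if word in POLARITY:
            score += POLARITY[word] / level
    return score
```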
3.3.6 Using Additional Corpora (AC)

Additional training data were retrieved using the Perl Net::Twitter API. Different searches were conducted during June 2012 using the attitude feature of the Twitter search. Using this feature, users can search for tweets expressing either positive or negative opinion. The search is based on emoticons, as in (Go et al., 2009). Retrieved tweets were classified according to their attitude.

Corpora/Tweets   P        N       Total
Ctw              11,363   9,865   21,228

Table 5: Characteristics of the tweet corpus collected from Twitter.

The corpus Ctw including the retrieved tweets (see Table 5) was used in two ways. On the one hand, we used it to find new words for our polarity lexicon Pes, by using the automatic method described in Section 3.2. The first 500 positive candidates and 500 negative candidates were manually checked. Altogether, 110 positive words and 95 negative ones (AC1) were included in the polarity lexicon Pes. According to the results (see ALL+AC1 in Table 4), these new polarity words do not provide any improvement. The reason is that the most relevant polarity words included in the training corpus Ct are already included in Pes, as explained in Section 3.2. In order to [obtain useful new examples, we included in] the training data Ct−train only those tweets of Ctw containing at least one word w from Pes but not appearing in the training corpus (w ∈ Pes ∧ freq(w, Ct−train) = 0). Only 7.9% of the retrieved tweets were added.

Results were still unsatisfactory, and so the additional training data were left out of the final model.

It must be noted that the tweet retrieval effort was very simple, due to the limited time we had to develop the system. We conclude that these additional training data were unhelpful due to the differences with the original data provided: Ctw contained many more ungrammatical structures and nonstandard tokens than the original data; the dates of the tweets were different, which could even lead to topic and vocabulary differences; and, especially, the additional data collected did not include neutral or objective tweets, nor did it include different degrees of polarity in the case of positive and negative tweets.

Features/Metric   #training examples   Accuracy
ALL               6,137                0.573
ALL+AC2           27,365               0.507
ALL+AC2-OOV       7,807                0.569

Table 6: Results obtained by including additional examples in the training data.
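The selection rule for the additional tweets (keep a retrieved tweet only if it contains some lexicon word absent from the training corpus) can be sketched as below; all identifiers are illustrative:

```python
def select_additional(tweets_ctw, pes, ct_train_vocab):
    """Keep retrieved tweets containing at least one polarity word
    w in pes with freq(w, Ct-train) = 0, i.e. a word absent from
    the original training vocabulary."""
    selected = []
    for tweet in tweets_ctw:
        words = set(tweet.split())
        if any(w in pes and w not in ct_train_vocab for w in words):
            selected.append(tweet)
    return selected
```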
[...] NONE classes increases significantly in the test data with respect to the training data. By contrast, the NEU and P classes decreased dramatically. The distribution difference, together with the performance of the system on specific classes, could explain the difference in accuracy between the test and training evaluations. It remains unclear to us why the F-scores for all the classes improved with respect to the training phase. We should analyse the characteristics of the training and test corpora, looking for differences in the samples and annotation.

As for the results of the individual classes, it is worth mentioning that neutral tweets are very difficult to classify because they do contain polarity words. We looked at the confusion matrix (both for training and test evaluations), and it shows that wrongly classified NEU tweets are evenly distributed between the other classes, except for the NONE class, with almost no NEU tweets classified as NONE. Most of the NEU [...]

[...] system effectively combines several features based on linguistic knowledge. In our case, using a semi-automatically built polarity lexicon improves the system performance significantly over a unigram model. Other features, such as POS tags and especially word polarity statistics, were also found to be helpful. In our experiments, including external training data was unsuccessful. However, our approach was very simple, and so a more exhaustive experimentation should be carried out in order to obtain conclusive results. In any case, the system shows robust performance when it is evaluated against test data different from the training data.

There is still much room for improvement. Tweet normalization was naïvely implemented. Some authors (Pang and Lee, 2004; Barbosa and Feng, 2010) have obtained positive results by including a subjectivity analysis phase before the polarity detection step. We would like to explore that line of work. Lastly, it would be worthwhile conducting
in-depth research into the creation of polarity lexicons, including domain adaptation and the treatment of word senses.

Acknowledgments

This work has been partially funded by the Industry Department of the Basque Government under grant IE11-305 (knowTOUR project).

References

Barbosa, Luciano and Junlan Feng. 2010. Robust sentiment detection on Twitter from biased and noisy data. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters, COLING '10, pages 36–44, Stroudsburg, PA, USA. Association for Computational Linguistics.

Bollen, Johan, Huina Mao, and Xiao-Jun Zeng. 2010. Twitter mood predicts the stock market. arXiv:1010.3003, October.

Brody, Samuel and Nicholas Diakopoulos. 2011. Cooooooooooooooollllllllllllll!!!!!!!!!!!: using word lengthening to detect sentiment in microblogs. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP '11, pages 562–570. Association for Computational Linguistics.

[...] Conference on Artificial Intelligence, August.

Go, A., R. Bhayani, and L. Huang. 2009. Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford, pages 1–12.

Godbole, N., M. Srinivasaiah, and S. Skiena. 2007. Large-scale sentiment analysis for news and blogs. In Proceedings of the International Conference on Weblogs and Social Media (ICWSM), pages 219–222.

Hall, Mark, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H. Witten. 2009. The WEKA data mining software: an update. SIGKDD Explor. Newsl., 11(1):10–18, November.

Han, Bo and Timothy Baldwin. 2011. Lexical normalisation of short text messages: Makn sens a #twitter. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 368–378, Portland, Oregon, USA, June. Association for Computational Linguistics.

Hu, M. and B. Liu. 2004. Mining and summarizing customer reviews. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining.
Antonio Fernández Anta (Institute IMDEA Networks, Madrid, Spain)
Philippe Morere (ENSEIRB-MATMECA, Bordeaux, France)
XVIII CONGRESO DE LA SOCIEDAD ESPAÑOLA PARA EL PROCESAMIENTO DEL LENGUAJE NATURAL 113
[num]ber of methods and techniques have been proposed in the literature to solve them. Most of these techniques focus on English texts and study large documents. In our work, we are interested in languages other than English and in micro-texts. In particular, we are interested in sentiment and topic classification applied to Spanish Twitter micro-blogs. Spanish is increasingly present on the Internet, and Twitter has become a popular medium to publish thoughts and information, with its own characteristics. For instance, publications in Twitter take the form of tweets (i.e., Twitter messages), which are micro-texts with a maximum of 140 characters. In Spanish tweets it is common to find specific Spanish elements (SMS abbreviations, hashtags, slang). The combination of these two aspects makes this a distinctive research topic, with potentially deep industrial applications.

The motivation of our research is twofold. On the one hand, we would like to know whether the usual approaches that have proved to be effective with English text are also effective with Spanish tweets. On the other, we would like to identify the best (or at least a good) technique for Spanish tweets. For this second question, we would like to evaluate the techniques proposed in the literature, and possibly propose new ad hoc techniques for our specific context. In our study, we try to sketch out a comparative [...]

Pang and Lee (2008) present a comprehensive survey of sentiment analysis and opinion mining research. Liu (2010), for his part, reviews and discusses a wide collection of related works. Although most of the research conducted focuses on English texts, the number of papers on the treatment of other languages is increasing every day. Examples of research papers on Spanish texts are (Brooke, Tofiloski, and Taboada, 2009; Martínez-Cámara, Martín-Valdivia, and Ureña-López, 2011; Martínez-Cámara et al., 2011).

Most of the algorithms for sentiment analysis and topic detection use a collection of data to train a classifier that is later used to process the real data. The (training and real) data is processed before being used for (building or applying) the classifier, in order to correct errors and extract the main features (to reduce the required processing time or memory). Many different techniques have been proposed for these phases. For instance, different classification methods have been proposed, such as Naive Bayes, Maximum Entropy, Support Vector Machines (SVM), BBR, KNN, or C4.5. In fact, there is no final agreement on which of these classifiers is the best. For instance, Go et al. (2009) report similar accuracy with classifiers based on Naive Bayes, Maximum Entropy, and SVM.
[...] might be relevant for topic detection. Additionally, hashtags are a great indicator of the topic of a tweet, whereas retrieving keywords from the web page linked within a tweet allows one to go beyond the limit of 140 characters and thus improves the quality of the estimation. Another way to go beyond this limit is to look up the keywords of a tweet in a search engine, to retrieve other words from the same context.

Classification methods. In addition to these variants, we have explored the full spectrum of classification methods provided by WEKA.

We can construct a large set (more than 100 thousand) of different methods by combining features from all the described families. As this number of combinations is too high, we had to reduce it by manually choosing a subset of all the methods that is manageable and that we think is the most relevant. We hope the reader finds the subset we present satisfactory.

[...] "cierro la cerca" ("I close the fence"): using 2-grams allows detecting the two different meanings of the word "cerca". As the words stay in their context, an n-gram carries more information than the sum of the information of its n words: it also carries the context information. (Using unigrams, every single word is a term, and any context information is lost.)

When using n-grams, n is a parameter that highly influences performance. A high value of n allows catching more context information, since the combinations of words are less probable. On the other hand, rare combinations mean fewer occurrences in the data set, which means that a bigger data set is needed to obtain good results. Also, the larger n is, the longer the attribute list is. In addition, since tweets are short, choosing a large n would result in n-grams of almost the size of a tweet, which would make little sense. We found that, in practice, having n larger than 3 did not improve the results, so
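The "cerca" example can be reproduced with a toy n-gram extractor (a sketch, not the system's actual term-extraction code):

```python
def ngrams(tokens, n):
    """Return the list of n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# "cerca" as a noun (fence) vs. adverb (near): the unigram "cerca"
# is identical in both tweets, but the 2-grams differ and keep
# enough context to separate the two senses.
fence = ngrams("cierro la cerca".split(), 2)
near = ngrams("vivo cerca de madrid".split(), 2)
```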
Parameter/flag         Description                                                    Process
n-gram                 Number of words that form a term                               Both
Only n-gram            Whether words are also terms                                   Both
Use input data         Whether the input data is used to define attributes            Both
Lemma/Stem             Which technique is used to extract the root of words           Both
Correct words          Whether a dictionary is used to correct misspellings           Both
SMS                    Whether an emoticon and SMS dictionary is used                 Both
Word types             Types of words to be processed                                 Both
Affective dictionary   Whether an affective dictionary is used to define attributes   Sentiment
Negation               Whether negations are considered                               Sentiment
Weight                 Whether valence shifters are considered                        Sentiment
Hashtags               Whether hashtags are considered as attributes                  Topic
Author tags            Whether author tags are considered as attributes               Topic
Links                  Whether data from linked web pages is used                     Topic
Search engine          Whether a search engine is used                                Topic

Table 1: Configuration parameters of the algorithm.
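A possible container for the parameters of the table above, with field names mirroring the table and arbitrary illustrative defaults (the system's actual representation is not described in the paper):

```python
from dataclasses import dataclass

@dataclass
class Config:
    # parameters used by both processes
    ngram: int = 1                 # number of words that form a term
    only_ngram: bool = False       # if False, single words are also terms
    use_input_data: bool = True
    lemma: bool = True             # True: lemmatization, False: stemming
    correct_words: bool = False
    sms: bool = False
    word_types: tuple = ("noun",)  # types of words to be processed
    # sentiment-only parameters
    affective_dictionary: bool = False
    negation: bool = False
    weight: bool = False           # valence shifters
    # topic-only parameters
    hashtags: bool = True
    author_tags: bool = True
    links: bool = False
    search_engine: bool = False
```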
we limit n to be no larger than 3.

Of course, it is possible to combine n-grams with several values of n. We only consider the possibility of combining two such values, and one of them has to be n = 1. This is controlled with the flag Only n-gram (see Table 1), which says whether only n-grams (with n > 1) are considered as terms, or whether individual words (unigrams) are also considered. In the latter case, the lists of attributes of both cases are merged. The drawback of merging is the high number of entries in the final attribute list. Hence, when doing this, a [...]

[...] dictionary (see below), we may not use the input data. This is controlled with a parameter that we denote Use input data (see Table 1). Moreover, even if the input data is processed, we may filter it and keep only some of it; for instance, we may decide to use only nouns. This can be controlled with the parameter Word types (see Table 1), which is described below. In summary, the list of attributes is built from the input data (if so decided), preprocessed as determined by the rest of the parameters (e.g., filtered Word types), and potentially from additional data (like the affective dictionary).
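The effect of the Only n-gram flag can be sketched as follows (an illustrative helper, not the actual implementation):

```python
def build_terms(tokens, n, only_ngram):
    """With only_ngram=True, terms are the n-grams alone; otherwise
    unigrams are merged in as well, at the cost of a longer
    attribute list."""
    grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if only_ngram or n == 1:
        return grams
    return tokens + grams  # merge unigrams with the n-grams
```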
3 Experimental Results

3.1 Data Sets

We have used a corpus of tweets provided for the TASS workshop at the SEPLN 2012 conference (TAS, 2012) as input data set. This set contains about 70,000 tweets provided as tuples (ID, date, userID). Additionally, over 7,000 of the tweets were given as a small training set with both topic (chosen among politics, economy, technology, literature, music, cinema, entertainment, sports, soccer, or others) and sentiment (or polarity, chosen among strong positive, positive, neutral, negative, strong negative, or none) classification. The data set was shuffled for the topics and sentiments to be randomly distributed. Due to the long time taken by the experiments with the large data set, most of the experiments presented here have used the small data set, using 5,000 tweets for training and 2,000 for evaluation.
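The shuffle-then-split protocol can be sketched as below; this is a generic sketch, since the paper does not describe the exact procedure or random seed:

```python
import random

def split_corpus(tweets, n_train, n_eval, seed=0):
    """Shuffle the labelled tweets, then take the first n_train for
    training and the next n_eval for evaluation (5,000 and 2,000
    in the experiments)."""
    shuffled = list(tweets)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:n_train + n_eval]
```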
3.2 Configurations for the Submitted Results

We tested multiple configurations with all the WEKA classifiers to choose the one with the highest accuracy to be submitted to the TASS challenge. Different configurations gave the best results for sentiment analysis and topic detection. For instance, for topic detection the submitted results were obtained with a Complement Naive Bayes classifier on attributes and vectors obtained from the input data by applying neither lemmatization nor stemming, filtering the words and keeping only nouns, and using hashtags and author tags. The accuracy reported by the challenge organizers on the large data set is 45.24%.

Regarding sentiment (polarity), the submitted results were obtained by first classifying the tweets into 5 subsets by using the topic detection algorithm, and then running the sentiment analysis algorithm within each
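As a rough stand-in for WEKA's Complement Naive Bayes, a minimal version of the classifier (the complement formulation of Rennie et al., over bag-of-words dictionaries) looks like this; it is a sketch of the technique, not the configuration actually submitted:

```python
import math
from collections import Counter

class TinyComplementNB:
    """Minimal Complement Naive Bayes over bag-of-words dicts
    (word -> count); an illustrative stand-in for WEKA's
    ComplementNaiveBayes, not the authors' implementation."""

    def fit(self, docs, labels, alpha=1.0):
        vocab = {w for d in docs for w in d}
        self.weights = {}
        for c in set(labels):
            # count word occurrences in every class EXCEPT c
            comp = Counter()
            for d, y in zip(docs, labels):
                if y != c:
                    comp.update(d)
            total = sum(comp.values()) + alpha * len(vocab)
            self.weights[c] = {w: math.log((comp[w] + alpha) / total)
                               for w in vocab}
        return self

    def predict(self, doc):
        # choose the class whose COMPLEMENT fits the document worst
        def score(c):
            return sum(n * self.weights[c].get(w, 0.0)
                       for w, n in doc.items())
        return min(self.weights, key=score)
```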
subset. The latter used Naive Bayes Multinomial on data preprocessed by using the affective dictionary, filtering words and keeping only adjectives and verbs (adjectives were stemmed, and verbs were lemmatized), using the SMS dictionary, and processing negations at the sentence level. The accuracy reported on the large data set was 36.04%.

Since the mentioned results were submitted, we have worked on making the algorithm more flexible, so that it is simpler to activate and deactivate certain processes. This has led to a slightly different behaviour from the submitted version, but we believe it has resulted in an improvement in accuracy.

3.3 Process to Obtain the New Experimental Results

As mentioned, the algorithm used for obtaining the new experimental results is more flexible and can be configured with the parameters defined in Table 1. In addition, all classification methods of WEKA can be used. Unfortunately, it is unfeasible to execute all possible configurations with all possible classification methods. Hence, we have made some decisions to limit the number of experiments.

First, we have chosen only five classification algorithms from those provided by WEKA. In particular, we have chosen the methods IBk, Complement Naive Bayes, Naive Bayes Multinomial, Random Committee, and SMO. This set tries to cover the most popular classification techniques. Several configurations of the parameters from Table 1 are evaluated with these 5 methods.

Second, we have chosen for each of the two problems (topic and sentiment) a basic configuration. In each case, the basic configuration is as close as possible to the configuration used to obtain the submitted results. (Since the algorithm has been modified to add flexibility, the exact submitted configuration could not be used.) The reason for choosing these as basic configurations is that they were found to be the most accurate among those explored before submission. Then, starting from this basic configuration, a sequence of derived configurations is tested. In each derived configuration, one of the parameters of the basic configuration was changed, in order to explore the effect of that parameter on the performance. Finally, for each classification method a new configuration is created and tested with the parameter settings that maximized the accuracy.

The accuracy values computed in each of the configurations with the five methods on the small data set are presented in Figures 1 and 2. In both figures, Configuration 1 is the basic configuration. The derived configurations are numbered 2 to 9. (Observe that each accuracy value that improves over the accuracy of the basic configuration is shown in boldface.) Finally, the last 5 configurations of each figure correspond to the parameter settings that gave the highest accuracy in the prior configurations for each method (in the order IBk, Complement Naive Bayes, Naive Bayes Multinomial, Random Committee, and SMO).

3.4 Topic Estimation Results

As mentioned, Figure 1 presents the accuracy results for topic detection on the small data set, under the basic configuration (Configuration 1), configurations derived from it by toggling every parameter one by one (Configurations 2 to 9), and the seemingly best parameter settings for each classification method (Configurations 10 to 14). Observe that there is no derived configuration with the search engine flag set. This is because the ARFF file generated in that configuration after searching the web as described above (even for the small data set) was extremely large, and the experiment could not be completed.

The first fact to be observed in Figure 1 is that Configuration 1, which is supposed to be similar to the one used for the submitted results, seems to have a better accuracy with some methods (more than 56% versus 45.24%). However, it must be noted that this accuracy has been computed with the small data set (while the value of 45.24% was obtained with the large one). A second observation is that in the derived configurations there is no parameter that, by changing its setting, drastically improves the accuracy. This also applies to the rightmost configurations, which combine the best collection of parameter settings.

Finally, it can be observed that the largest accuracy is obtained by Configuration 2 with Complement Naive Bayes. This configuration is obtained from the basic one by simply removing the word filter that allows only