Starfish
A Thesis Proposal to
the Faculty of the College of Computer Studies
De La Salle University
In Partial Fulfillment
of the Requirements for the Degree of
Bachelor of Science in Computer Science
by
Guevarra, Czerald G.
Jaymalin, Ramon Roberto C.
Ruaro, Amity R.
Talon, Juan Antonio Gabriel A.
Abstract
Wikipedia and other wiki sites are infamous for containing unreliable information because
they allow users to freely contribute and edit content with ease. Content-driven reputation
systems were developed to determine the reliability of Wikipedia articles by analyzing the
stability of their text and the amount of time the text remains unedited. However,
content-driven reputation systems assume that text high in trust lasts longer, which could
falsely label inaccurate text as credible. Another kind of reputation system is user-driven.
In contrast to content-driven systems, a user-driven reputation system relies on its users'
input to determine the reputation of those users. Nevertheless, it has its own share of
flaws, such as biased user feedback and uncooperative authors. An existing reputation system
implemented in mobile ad-hoc networks (MANETs) provides incentives for nodes to cooperate in
the network. This concept can be applied to user-driven reputation systems to address their
flaws by encouraging users to actually give feedback.
Table of Contents
1. Introduction
1.1 Overview
1.2 Problem Statement
1.3 Research Objectives
1.3.1 General Objective
1.3.2 Specific Objectives
1.4 Project Scope and Limitations
1.5 Test and Verification Method
1.5.1 Computational Model Test
1.5.2 Local wiki-server Test
1.5.3 Contributor Reputation Test
1.5.4 Rater Reputation Test
1.5.5 Article Rating Test
1.5.6 Comparison with other Reputation Systems
1.6 Significance of the Study
1.7 Research Methodology
1.8 Resources
1.8.1 Hardware
1.8.2 Software
2. Review of Related References
2.1 Review of Related Work
2.1.1 WikiTrust
2.1.2 Wikipedia Recommender System
2.2 Review of Related Literature
2.2.1 Reputation Systems
2.2.2 Mobile Ad-hoc Networks (MANETs)
2.2.3 eBay Feedback Forum
2.2.4 Sybil Attacks
2.2.5 CONFIDANT Protocol
3. Theoretical Framework
3.1 Mobile Ad-Hoc Networks (MANETs)
3.2 Wiki
3.3 Content-driven Reputation System
3.4 User-driven Reputation System
3.5 Nodes
1. Introduction
1.1 Overview
Issues with conventional user-driven reputation systems may be solved by adopting reputation
systems used in other fields, such as mobile ad-hoc networks (MANETs). A MANET is a
collection of mobile devices, or nodes, that rely on each other to forward messages [15].
Due to resource constraints such as battery life, some nodes tend to be selfish and
misbehave: they refuse to cooperate in order to conserve resources. MANETs address the
problem of misbehaving nodes by providing incentives for nodes to cooperate [6]. A
reputation system is used to identify these malicious nodes and exclude them from the
network. Nodes observe each other for every function executed, and every node has a
reputation rating that the network uses to decide whether to respond to that node's
requests or to trust its observations. The reputation rating is updated according to the
node's behavior [5]. The reputation system concepts adopted from MANETs can be applied to
current user-driven reputation systems, which are susceptible to bias and dishonesty.
1.2 Problem Statement
Inaccuracies in Wikipedia articles are sometimes neglected or ignored. Because the current
content-driven reputation system for Wikipedia assumes that text high in trust lasts longer,
it could falsely label inaccurate text as credible. Conventional user-driven reputation
systems, on the other hand, are susceptible to problems such as users not providing any
feedback and users giving biased feedback.
1.3 Research Objectives
1.3.1 General Objective
The study aims to create a user-driven reputation system for Wikipedia that encourages
authors to give reliable feedback regarding Wikipedia's entries.
1.4 Project Scope and Limitations
To define the bounds of the study, the following scope and limitations are set:
• The study assumes that there are good authors in the system.
o This assumption is needed to determine the ratio of good authors to bad authors
required for the system to be effective. The system also requires the existence of
good authors to function correctly; otherwise, it fails to determine the credibility
of each entry.
• The study assumes that an author plays two roles: one as a contributor and one as a
rater.
o The author acts as a contributor when he creates new articles or edits existing ones.
On the other hand, the author acts as a rater when he reviews an article other than
his own.
• The study assumes that there are three types of unreliable raters: non-performing
raters, lazy raters, and malicious raters.
o Non-performing raters do not review or rate articles at all. Lazy raters do not review
articles and rate them at random. Malicious raters give ratings that are intentionally
different from the norm. There is, however, a fourth kind of rater, the deviant rater,
whose behavior is similar to that of malicious raters but whose intention differs.
Deviant raters give their honest opinion, yet their ratings still differ from the norm.
• The study does not cover the algorithms involved in differentiating deviant raters
from malicious raters.
o The system cannot distinguish malicious raters from deviant raters because they
exhibit similar behavior even though their intentions are entirely different.
Differentiating the two groups is a broad problem that may involve the study of human
psychology, because the intentions of these users cannot easily be determined from
their inputs. Therefore, the system treats both as malicious.
• The study assumes that an author may be reliable or unreliable depending on his
article contributions and ratings.
• The study includes the use of a data source for testing the software in real-world
situations.
o The server allows contributions only from a certain group of people. The study also
includes measuring the performance of the algorithm and conducting surveys to measure
the quality of the edited entries.
o The study takes note of the differences between the results of conventional
reputation systems and those of Starfish.
1.5 Test and Verification Method
1.5.1 Computational Model Test
To test the efficiency of the algorithms and formulas researched, a computer simulation
is created to imitate the interaction of authors in the reputation system. The algorithms
are used to determine the rating of each article, and from those ratings the reputation of
each user, both as a contributor and as a rater, is computed. The results of the simulation
will become the basis for testing the system. Figures such as the ratio of reliable authors
to unreliable authors will also be computed in the simulation.
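A minimal sketch of such a simulation is given below for illustration only. The 1-to-5
rating scale, the way each rater type behaves, the use of the median as the article rating,
and names such as `simulate_ratings` are assumptions made for this sketch and are not the
study's actual computational model. Deviant raters are not modeled separately because their
behavior is indistinguishable from that of malicious raters.

```python
import random
from statistics import median

# Hypothetical rating scale; the proposal does not fix one.
MIN_RATING, MAX_RATING = 1, 5

def rate(rater_type, true_quality, rng):
    """Return one rating, or None when the rater gives no feedback."""
    if rater_type == "reliable":
        return true_quality                         # reviews the article and rates it honestly
    if rater_type == "non_performing":
        return None                                 # never rates
    if rater_type == "lazy":
        return rng.randint(MIN_RATING, MAX_RATING)  # rates at random without reviewing
    if rater_type == "malicious":
        # deliberately rates far from the norm
        return MIN_RATING if true_quality > 3 else MAX_RATING
    raise ValueError(f"unknown rater type: {rater_type}")

def simulate_ratings(rater_types, true_quality, seed=0):
    """Collect the ratings one article receives and aggregate them with the median."""
    rng = random.Random(seed)
    ratings = [rate(t, true_quality, rng) for t in rater_types]
    ratings = [r for r in ratings if r is not None]
    return median(ratings) if ratings else None

mostly_reliable = ["reliable"] * 7 + ["lazy", "malicious", "non_performing"]
mostly_malicious = ["reliable"] * 2 + ["malicious"] * 7
print(simulate_ratings(mostly_reliable, true_quality=5))   # 5
print(simulate_ratings(mostly_malicious, true_quality=5))  # 1
```

Varying the mix of rater types in such a sketch is one way to explore the ratio of reliable
to unreliable authors at which the computed article rating stops reflecting the true quality.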
1.5.2 Local wiki-server Test
The objective of this test is to verify that the computation of the reputation of authors
works correctly and that this module of the program works completely. The test begins by
filling the local wiki server with sample articles, some accurate and some inaccurate. If
an author's rating of an accurate article is poorly done or deviates from the other ratings,
the system should rate that author as unreliable. If, on the other hand, the author's rating
is close to the other ratings and does not deviate, the author should be marked as reliable.
1.5.3 Contributor Reputation Test
The objective of this test is to verify that the computation of the reputation of authors as
contributors works correctly and that this module of the program works completely. The test
begins by creating two articles: one accurate and one containing false information. If most
raters rate an article low, its author should be marked as unreliable. Conversely, if the
article is rated high, the reputation of its author should go up as well, according to the
system's computation. The same applies when an author edits or revises an existing article.
If an article has high credibility but its credibility goes down after the author edits it,
the reputation of that contributor should also go down. Likewise, if an article has low
credibility but its rating goes up after an author revises it, the reputation of that
contributor should go up as well.
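The expected behavior in this test can be illustrated with a small sketch. The update rule,
the `weight` parameter, and the 0-to-1 reputation range are hypothetical choices made only
to show the direction of the change; they are not the formulas adopted by the study.

```python
def update_contributor_reputation(reputation, rating_before, rating_after, weight=0.1):
    """Move a contributor's reputation in the direction the article rating moved.

    `weight` and the 0..1 reputation range are hypothetical choices for this sketch.
    """
    change = weight * (rating_after - rating_before)
    return min(1.0, max(0.0, reputation + change))

# An edit that lowers the rating of a well-rated article costs reputation ...
print(update_contributor_reputation(0.5, rating_before=5, rating_after=2) < 0.5)  # True
# ... and an edit that raises the rating of a poorly rated article earns it.
print(update_contributor_reputation(0.5, rating_before=2, rating_after=5) > 0.5)  # True
```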
1.5.4 Rater Reputation Test
The objective of this test is to verify that the computation of the reputation of authors as
raters works correctly. The test begins by filling the local wiki server with sample
articles, some accurate and some fake. If an author's rating of an accurate article is
poorly done or deviates from the other ratings, the system should rate that author as
unreliable. If, on the other hand, the author's rating is close to the other ratings and
does not deviate, the author should be marked as reliable.
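The pass criterion of this test can be sketched as a comparison between one rater's rating
and the norm set by the other raters. Using the median as the norm and a fixed `tolerance`
are assumptions of this sketch, not values or formulas taken from the study.

```python
from statistics import median

def classify_rater(own_rating, other_ratings, tolerance=1.0):
    """Label a rater by how far their rating is from the median of the others.

    `tolerance` is a hypothetical threshold, not a value from the study.
    """
    if own_rating is None:
        return "unreliable"            # no rating was given at all
    norm = median(other_ratings)
    deviation = abs(own_rating - norm)
    return "reliable" if deviation <= tolerance else "unreliable"

others = [5, 4, 5, 5, 4]               # ratings given to an accurate article
print(classify_rater(5, others))       # reliable
print(classify_rater(1, others))       # unreliable
```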
1.5.5 Article Rating Test
The objective of this test is to verify the results of each author's feedback regarding an
article's credibility. The results of this test bear on the reliability of the algorithms
adapted from MANETs. To test an article's rating, the local wiki server must have a
published article. The system then computes the article's rating, using the consolidated
algorithms, from the feedback of each author in the wiki server. The computed rating must
be consistent with the user feedback the article received to ensure that the system is
implemented properly.
1.6 Significance of the Study
Collaborative websites such as Wikipedia, Google Maps, and Flickr are increasing in
importance on the web [2]. User-contributed data from these collaborative websites now
plays a bigger role in society because people rely on it for information, such as finding
nearby businesses on Google Maps or finding pictures of a specific place on Flickr. The
most successful collaborative website is arguably Wikipedia [1][2]; however, Wikipedia
contains unreliable content because anyone, regardless of background and knowledge of a
subject, can edit and contribute to Wikipedia articles [16]. A reputation system developed
for Wikipedia could increase its reliability and give its readers a way to tell reliable
articles from the rest. Access control can also be implemented in a reputation system to
prevent malicious users from tampering with the reputation values and to limit access to
controversial content. Investing in the existing content of Wikipedia is essential [11]
and, because of its open nature, Wikipedia has the ability to outdo traditional
encyclopedias in terms of price, number of articles, speed, and supported languages.
Encouraging authors to rate other articles seriously could increase the reliability of an
article. The study could enable the collection of vast amounts of information from people
around the globe without sacrificing credibility. The reputation system does not prevent
vandalism; rather, it tells readers which articles are of high quality and which are not.
In effect, Wikipedia will become a more trusted platform and a better source of information
for a wide range of topics.
1.7 Research Methodology
The implementation of the study involves four phases. The first phase involves research on
how computation algorithms are implemented in a user-driven reputation system. The second
phase, the testing phase, tests the efficiency of the algorithms consolidated in the
research phase. The third phase includes installing a local wiki server and testing whether
the server works correctly. The final phase involves the actual implementation of the
reputation system in a real-world setting. It is in this phase that the system will be
integrated, debugged, and tested with the wiki server.
The first task is to research the algorithms and formulas currently implemented in existing
mobile ad-hoc networks (MANETs). This includes the formulas and the variables needed to
compute a user's reputation. One criterion for choosing the appropriate variables is
whether the increases in rating obtained through different methods are balanced. Once the
algorithms have been adapted to the context of wiki-like domains, their efficiency is
tested by performing a computer simulation. The simulation imitates the interaction of the
aforementioned types of authors in Wikipedia. Through the simulation, the expected
limitations of the system, such as the percentage of reliable and unreliable authors, can
be identified and taken into consideration when developing the actual server system.
After the simulation and studies, the next step is to deploy a local version of Wikipedia.
The local version then needs to be filled with sample articles to rate. It is important
that the server runs smoothly because it is where the software will be tested.
The actual reputation system is developed and integrated into existing wiki software after
the computer simulation, to test whether the findings hold in a real-world setting. The
software is hosted on a server for testing and verification. The server will initially be
accessible only within the network laboratory, which has at least 20 computers, so that
system testing is manageable at first. After the system has been tested for errors and
improvements, the next task is to make the server accessible to a larger group of people.
Data gathering follows, and the data is used to compute the efficiency of the system as
compared to the computer simulation.
Throughout the course of the system’s development, the documentation will be updated
as required.
1.8 Resources
1.8.1 Hardware
• computer with an Internet connection
o x86-compatible PC
o minimum of 768MB of RAM (512MB for the virtual machine and 256MB for the server)
• LAN Card
o Used by the computer to access the Internet
1.8.2 Software
• PHP Server
o Executes PHP code - a scripting language for web development.
• HTTP Server
o Delivers web pages over the Internet using the Hypertext Transfer Protocol (HTTP).
• MediaWiki
o MediaWiki is used to manage wiki entries, track revisions, and keep track of its users.
• Database Management System
o A database is used to store the information for the system which includes
the articles, users, and their reputations.
• Virtual machine
o The virtual machine is used to run the operating system and its services.
• Ubuntu Server
o Used to run the services
2. Review of Related References
2.1 Review of Related Work
2.1.1 WikiTrust
Newly inserted text (or a newly published article) is given an initial trust rating. As the
number of revisions increases, the trust rating of the text or article increases as well.
WikiTrust also uses the author's reputation to compute the text's predictive value, that
is, whether the text is likely to be long-lasting.
WikiTrust is capable of access control: it can limit the access of low-reputation
contributors to controversial articles in Wikipedia. WikiTrust assigns newly registered
contributors the same reputation as bad contributors because, if new users had a higher
reputation, bad contributors would simply register again. An author with low reputation is
likely to edit many consecutive times to raise his reputation. As a countermeasure, such an
author cannot edit an article again until a certain number of other authors have edited it.
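The consecutive-edit restriction described above can be illustrated as follows. This is a
sketch of the rule as stated in this review, not WikiTrust's actual code; the function name
`may_edit`, the data structures, and the threshold `min_other_authors` are hypothetical.

```python
def may_edit(author, edit_history, low_reputation_authors, min_other_authors=3):
    """Return True if `author` may edit the article now.

    `edit_history` lists the authors of past edits, oldest first.
    `min_other_authors` is a hypothetical threshold.
    """
    if author not in low_reputation_authors or author not in edit_history:
        return True
    # Position of the author's most recent edit.
    last = len(edit_history) - 1 - edit_history[::-1].index(author)
    others_since = set(edit_history[last + 1:]) - {author}
    return len(others_since) >= min_other_authors

history = ["alice", "mallory", "bob"]
print(may_edit("mallory", history, {"mallory"}))                      # False
print(may_edit("mallory", history + ["carol", "dave"], {"mallory"}))  # True
```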
2.1.2 Wikipedia Recommender System
The Wikipedia Recommender System is an external recommendation system for Wikipedia. It
does not modify the existing MediaWiki software but instead serves as a web proxy to
Wikipedia. The web proxy resides on the reader's computer; it retrieves articles from
Wikipedia and rewrites them to include the rating of the article, and the reader has the
option to provide feedback. The rating provided by the reader is used by the web proxy to
build trust with other users who have recommended the article as well.
The recommender system stores the feedback of all users on Wikipedia's servers, in the
discussion pages of the articles. To prevent anyone from tampering with these comments, a
digital signature is used to verify the integrity of each comment. This requires a
repository to store the users' public keys.
The reputation of an article is computed from the ratings given by the users, each user's
pseudonym in Wikipedia, and the version number of the article the rating applies to. The
article rating is the weighted average of the users' ratings. The weights are derived from
the "Ring of Reviewers", the list of people whose recommendations the reader has used in
the past.
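The weighted average described above can be written out as a short sketch. How the weights
are actually derived from the Ring of Reviewers is not specified here, so the `weights`
mapping below is simply assumed to be given, and all names are hypothetical.

```python
def weighted_article_rating(ratings, weights):
    """Weighted average of ratings; reviewers without a weight are ignored.

    `ratings` maps reviewer -> rating, `weights` maps reviewer -> trust weight.
    Both structures are assumptions for this sketch.
    """
    total_weight = sum(weights.get(r, 0.0) for r in ratings)
    if total_weight == 0:
        return None  # no trusted reviewer has rated the article
    return sum(rating * weights.get(r, 0.0) for r, rating in ratings.items()) / total_weight

ratings = {"ann": 8, "ben": 4, "stranger": 1}
weights = {"ann": 3.0, "ben": 1.0}   # assumed to come from the reader's Ring of Reviewers
print(weighted_article_rating(ratings, weights))  # 7.0
```

In this example the rating from "stranger" carries no weight, so the result is
(8 x 3 + 4 x 1) / 4 = 7.0.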
2.2 Review of Related Literature
2.2.1 Reputation Systems
The Internet supports anonymous interactions between users and has introduced new ways of
communication between people, but with it come risks such as false information and false
claims. A reputation system's job is to collect user data and combine it with feedback
about each user's past behavior. It also helps users decide whom to trust. Reputation
systems should imitate long-term relationships between human beings, which include learning
from past interactions and having an incentive to behave well for fear of retaliation.
It is much harder to trust a stranger: the stranger's history is unknown, there is no
incentive to behave well, and the stranger's reputation is not at stake, all of which
discourages cooperation.
Reputation systems should ensure that users are long-lived, acquire feedback about each
user from other users, and display that feedback in a readable manner so that the users you
interact with know your past history.
Reputation systems also face challenges, such as users not providing feedback, difficulty
in eliciting negative feedback due to the fear of retaliatory negative feedback, and
difficulty in assuring honest reports.
2.2.2 Mobile Ad-hoc Networks (MANETs)
In MANETs, there is no central server or device that performs routing and other networking
functions within the network. The hosts connect wirelessly and perform the necessary
functions themselves. Some hosts, however, do not cooperate in the functions of the network
but still use other nodes to forward their own packets. These hosts are considered selfish,
or misbehaving, nodes.
To solve this problem, MANETs use a reputation system to enforce cooperation and fairness
between nodes and so maintain the performance of the network's functions. Two metrics are
used: the reputation rating, which reflects the behavior of a node as observed by others,
and the trust rating, which reflects the trustworthiness of a node as observed by others.
Hosts observe each other, and each observation is used as a basis for the reputation and
trust ratings.
Hosts are classified in two ways: as either normal or misbehaving (selfish nodes),
depending on whether they participate in the processes of the network, and as either
trustworthy or untrustworthy (malicious nodes), depending on the observations, negative or
positive, that they make about other nodes. The system uses reputation fading to prevent
misbehaving hosts from using network resources. The ratings given by untrustworthy nodes
can affect the reputation of other nodes. Untrustworthy nodes could be punished, but
punishment is not implemented so that nodes are not discouraged from participating in the
processes of the network; instead, their ratings simply have less impact on the reputation
of others. The system uses periodic reevaluation so that misbehaving or untrustworthy nodes
can come to be treated as useful again in the network. Nodes suspected of having redeemed
themselves only in order to give false ratings again and affect the reputation ratings of
others are evaluated using the old observations from other nodes.
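Two of the mechanisms described above, reputation fading and the reduced impact of
untrustworthy nodes, can be sketched as follows. The formulas and constants are hypothetical
and chosen only to illustrate the idea; they are not those of the cited MANET systems.

```python
def fade(reputation, neutral=0.5, factor=0.9):
    """Pull a rating back toward a neutral value as time passes (reputation fading)."""
    return neutral + factor * (reputation - neutral)

def merge_observation(reputation, observed, observer_trust, rate=0.5):
    """Blend a reported observation into a rating, scaled by the observer's trust.

    An untrustworthy observer (trust near 0) is not punished; the report
    simply counts for less.
    """
    step = rate * observer_trust
    return (1 - step) * reputation + step * observed

# A good rating decays unless it is renewed by fresh observations.
print(fade(0.9) < 0.9)  # True
# The same negative report moves the rating less when the observer is less trusted.
print(merge_observation(0.8, 0.0, 0.1) > merge_observation(0.8, 0.0, 1.0))  # True
```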
MANET reputation systems use a reputation table which contains reputation data for
every node, also known as a network entity. MANETs also use a watchdog mechanism,
which is used to detect misbehaving entities. A node triggers the watchdog when it
needs to monitor the correct execution of a function implemented in a neighboring en-
tity. The protocols in this system automatically decrease the reputation of a node if it
refuses to cooperate.
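A reputation table fed by watchdog reports might look like the sketch below. The class name,
the initial rating, the penalty and reward sizes, and the misbehavior threshold are all
assumptions made for illustration.

```python
class ReputationTable:
    """Per-node reputation store; all constants here are hypothetical."""

    def __init__(self, initial=0.5, penalty=0.2, reward=0.05, threshold=0.3):
        self.initial, self.penalty = initial, penalty
        self.reward, self.threshold = reward, threshold
        self.ratings = {}

    def report(self, node, cooperated):
        """Called by a watchdog after observing whether `node` executed a function."""
        current = self.ratings.get(node, self.initial)
        current += self.reward if cooperated else -self.penalty
        self.ratings[node] = min(1.0, max(0.0, current))

    def is_misbehaving(self, node):
        return self.ratings.get(node, self.initial) < self.threshold

table = ReputationTable()
table.report("n1", cooperated=False)   # n1 refused to forward a packet
table.report("n1", cooperated=False)
print(table.is_misbehaving("n1"))      # True
print(table.is_misbehaving("n2"))      # False
```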
2.2.3 eBay Feedback Forum
The Feedback Forum is the reputation system used by eBay. It allows the buyer and the
seller to give feedback every time they finish a transaction. The feedback includes a
rating of -1, 0, or 1, which represents negative, neutral, and positive feedback
respectively. The two parties can optionally leave a comment for each other.
The reputation of a seller is computed as the sum of his ratings, and this computed
reputation is displayed beside his pseudonym so that other users are aware of his
performance in past transactions.
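The summation described above amounts to the following, where the function name
`feedback_score` is chosen for this sketch only.

```python
def feedback_score(ratings):
    """Sum of -1/0/+1 feedback ratings, as described for the Feedback Forum."""
    if any(r not in (-1, 0, 1) for r in ratings):
        raise ValueError("each rating must be -1, 0, or 1")
    return sum(ratings)

print(feedback_score([1, 1, 0, -1, 1]))  # 2
```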
However, users can artificially boost their reputation using Buy-It-Now listings (items
sold at a fixed price) that cost one penny. These small transactions allow a user to
collect positive ratings. With such a reputation system, it is also hard for the buyer or
seller to give negative feedback: the two parties usually communicate before a transaction,
so negative feedback is given only for unsatisfactory transactions. Users are also afraid
to give negative feedback because the rated user might give negative feedback in return.
2.2.4 Sybil Attacks
Some nodes (or users) can present numerous identities, thereby throwing away their past
reputations. This attack can compromise peer-to-peer systems by misleading them into
thinking that there are multiple nodes when there is only one. Several measures are used to
resist this attack, such as redundancy, certification by a third party, hashing the IP
address of a node, naming remote paths, and using once-in-a-lifetime identifiers, which are
anonymous certificates that prevent users from having another identity.
2.2.5 CONFIDANT Protocol
Prevention mechanisms work well only if they are perfect. Since most systems are imperfect,
and most attacks exploit vulnerabilities and flaws, prevention alone is not that helpful.
Because prevention systems are not 100 percent secure, the best way to thwart an attack is
to detect it. The CONFIDANT protocol thwarts attacks by first detecting malicious behavior
and then reacting to it: the system stops forwarding the packets of malicious nodes. If a
malicious node, or a node that was wrongly accused, repents by behaving normally again for
a given amount of time, it may be included in the network again.
The motivation for cooperating is the reward that nodes might get: nodes forward each
other's packets on behalf of one another. An analogy is a population of birds that groom
other birds because they cannot groom themselves. The first kind of bird is the sucker,
which always helps other birds regardless of the case. Cheats, on the other hand, do not
return the favor even when another bird helps clean their heads. The last kind is the
grudger, which is helpful to all other birds at first, but when a particular bird does not
return the favor, the grudger will not groom that bird anymore. In a simulation, the
suckers are the first to become extinct. The grudgers also suffer some losses, and the
cheats' population increases. This is because the suckers have no one to groom them except
each other and the grudgers, while the cheats have everyone to groom them. Over time,
however, the cheats become extinct because no one helps them once the grudgers stop
grooming them. The grudgers are the eventual winners.
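The grudger strategy in this analogy can be expressed in a few lines. The sketch below
models only the decision rule of a single grudger, not the population simulation, and the
class and method names are chosen for illustration.

```python
class Grudger:
    """Helps anyone at first, but never again helps a bird that refused to return the favor."""

    def __init__(self):
        self.grudges = set()

    def will_groom(self, other):
        return other not in self.grudges

    def record(self, other, returned_favor):
        if not returned_favor:
            self.grudges.add(other)

g = Grudger()
print(g.will_groom("cheat"))        # True: helpful at first
g.record("cheat", returned_favor=False)
print(g.will_groom("cheat"))        # False: the grudge is kept
print(g.will_groom("sucker"))       # True
```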
These results are then integrated into the CONFIDANT protocol to give better results.
Nodes should learn from observation: instead of waiting until a particular node fails to
forward its packets, a node should be able to ask other nodes whether that node is
misbehaving. If a node learns this before experiencing it, the malicious node can be
penalized immediately.
Each node in the network observes the behavior of its neighboring nodes. It detects whether
a neighbor is misbehaving by observing the neighbor's transmissions and its routing
protocol behavior. When a node is detected to be deviating from normal behavior, a trust
manager is executed and sends alerts to all other nodes that the node has not behaved
according to the norm. The trust manager is the basis of how reliable each node in the
network is because, aside from sending alert messages, it also calculates the trust level
of each node. Each node stores a table containing the rating of every other node in the
network. When a node finds that a path has a very low rating, it discards the current path
and devises a new one using the path manager.
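The role of the path manager can be illustrated with a small sketch. Rating a path by its
lowest-rated node, the `threshold`, and the `default` rating for unknown nodes are
assumptions of this sketch rather than details of the CONFIDANT protocol.

```python
def choose_path(paths, ratings, threshold=0.3, default=0.5):
    """Return the best-rated path whose weakest node is at or above `threshold`, else None.

    A path's rating is taken to be the lowest rating among its nodes; this rule,
    `threshold`, and `default` are assumptions for this sketch.
    """
    def path_rating(path):
        return min(ratings.get(node, default) for node in path)

    usable = [p for p in paths if path_rating(p) >= threshold]
    return max(usable, key=path_rating) if usable else None

ratings = {"a": 0.9, "b": 0.1, "c": 0.7, "d": 0.8}
paths = [["a", "b"], ["c", "d"]]
print(choose_path(paths, ratings))   # ['c', 'd']
```

Here the path through node "b" is discarded because "b" has a very low rating, and the
remaining path is used instead.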
3. Theoretical Framework
3.1 Mobile Ad-Hoc Networks (MANETs)
A MANET is a network design that consists of wireless host computers that perform network
functions such as packet forwarding, routing, and name-to-address mapping. This design
replaces wired connections and allows hosts to be mobile. Every host acts as a router,
which eliminates the need for devices other than the wireless hosts.
3.2 Wiki
A wiki is server software that allows users to create, edit, and delete content using a web
browser. It provides information about anything that users want to share and allows dynamic
updates by anyone within the community of the website. It is also collaborative software
because its open nature allows anyone to contribute content.
3.3 Content-driven Reputation System
The content-driven reputation system is a system that does not require explicit user
feedback but instead derives reputation from the content that users contribute. In this
system, the metric used for measuring reputation is the longevity of the text, that is, the
time that an article is left unedited or unmodified.
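The longevity metric can be made concrete with a short sketch. Measuring longevity in days
and the function name `text_longevity_days` are choices made for this illustration only.

```python
from datetime import datetime

def text_longevity_days(inserted_at, removed_at=None, now=None):
    """Days a piece of text has survived without being edited out."""
    end = removed_at or now or datetime.now()
    return (end - inserted_at).total_seconds() / 86400

inserted = datetime(2010, 1, 1)
print(text_longevity_days(inserted, removed_at=datetime(2010, 1, 11)))  # 10.0
```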
3.4 User-driven Reputation System
The user-driven reputation system is a system that computes the reputation of the contents
and the authors based on the behavior and ratings of other authors. An example is the
reputation system used on the website eBay, where users of the website and the items sold
are rated by other users.
3.5 Nodes
Nodes are the host computers in a mobile ad-hoc network. Each node acts as a router in a
MANET: nodes serve each other by forwarding packets and performing other network functions.
Nodes support wireless connections because a MANET is implemented as a wireless network.
Appendix A. Bibliography
[1] Adler, B., Chatterjee, K., & de Alfaro, L., (2007, November). Assigning trust to
Press, 2008.
[2] Adler, B. & L. de Alfaro (2007). A content-driven reputation system for the wikipe-
[3] Adler, B., de Alfaro, L., Pye, I., & Raman, V. (2008). Measuring author contributions
[4] Chatterjee, K., de Alfaro, L., & Pye, I. (2008). Robust content-driven reputation. In
[5] Buchegger, S. & Boudec, J. Y. (2004). A robust reputation system for P2P and mo-
Peer-to-Peer Systems.
[6] Buchegger, S. & Boudec, J. Y. (2002, January). Nodes bearing grudges: Towards
[7] Buchegger, S., & Le Boudec, J. (2002). Performance Analysis of the CONFIDANT
networking & computing. New York, NY, USA, ACM pp. 226-236.
[8] Douceur, J. R. (2002). The sybil attack. In IPTPS '01: Revised Papers from the First
[9] Friedman, E. & Resnick, P. (1999, August 11). The social cost of cheap pseudo-
techniques for reputation systems. ACM Computing Surveys, 42(1), pp. 1-31.
[11] Korsgaard, T. R. & Jensen, C. D. (2009, July 30). Reengineering the Wikipedia for
[12] Kuwabara, K., Friedman, E., Resnick, P., & Zeckhauser, R. (2000, December).
[14] Resnick, P., Zeckhauser, R., Friedman, E., & Kuwabara, K. (2000). Reputation sys-
[15] Rishiwal, V., Verma, S. & Yaday, M. (2008, July 16). Power aware routing to support
real time traffic in mobile adhoc networks. In proceedings of the 2008 First Interna-
[16] Cross, T. (2009, September). Puppy smoothies: Improving the reliability of open,
http://pear.accc.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1400/
(February 2, 2010).
(February 1, 2010).
http://answers.yahoo.com/info/scoring_system;_ylt=AsY6SYqCoD9FxeqwH7l283
Czerald G. Guevarra
Computer Technology Department
College of Computer Studies
De La Salle University-Manila
czerald@gmail.com
Amity R. Ruaro
Computer Technology Department
College of Computer Studies
De La Salle University-Manila
amityruaro@gmail.com