CHAPTER 1
INTRODUCTION
This report deals mainly with information retrieval systems. Information retrieval
is the area in which users search for documents, for information within documents,
and for metadata about documents on the web. Many user queries ask for documents
about people by their personal names. Celebrities and experts from various fields
are referred to by their original names on the web, and a considerable share of the
queries sent to web search engines include person names. For example, a user might
enter Michael Jackson as a query to learn about him, and the search engine returns
the documents that meet the information need behind the query. Celebrities and
experts are also referred to by aliases on the web, and many web pages about a
person use an alias rather than the original name. For example, a newspaper article
DEPARTMENT OF CSE
GDMM,NANDIGAMA
AUTOMATIC DISCOVERY OF ASSOCIATION ORDERS BETWEEN NAME AND ALIASES FROM THE
WEB USING ANCHOR TEXT BASED CO-OCCURENCES
might refer to people by their original names, whereas a blogger might refer to
them by their nicknames. A user cannot retrieve all the information about a person
by searching on the personal name alone. To retrieve complete information about a
person, one must also know that person's aliases on the web. Various types of
words are used as aliases, so identifying aliases helps information retrieval.
The aliases are extracted using a previously proposed alias extraction method.
The search engine then expands a query on a person name by tagging it with the
extracted aliases, which retrieves the relevant web pages that refer to the person
by the original name as well as by aliases, thereby improving recall and mean
reciprocal rank (MRR).
1.1 MOTIVATION
Searching for information about people on the web is one of the most common
activities of Internet users. Around 30% of search engine queries include person
names. However, retrieving information about people from web search engines can
become difficult when a person has nicknames or name aliases. For example, the
famous Japanese major league baseball player Hideki Matsui is often called
Godzilla on the web. A newspaper article on the baseball player might use the real
name, Hideki Matsui, whereas a blogger would use the alias, Godzilla, in a blog entry.
We will not be able to retrieve all the information about the baseball player if we
use only his real name. Identifying aliases of a name is therefore important in
information retrieval. To improve the recall of a web search on a person name, a search
engine can automatically expand a query using aliases of the name. In our previous
example, a user who searches for Hideki Matsui might also be interested in retrieving
documents in which Matsui is referred to as Godzilla. Consequently, we can expand a
query on Hideki Matsui using his alias Godzilla.
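The expansion step can be illustrated with a minimal Python sketch. It is not the implementation used in this project: the helper name expand_query, the phrase quoting, and the OR syntax are assumptions about how a search engine might accept alternative terms.

```python
def expand_query(name, aliases):
    """Build one query string that matches the name or any of its aliases.

    Hypothetical helper: the OR syntax and the quoting are assumptions.
    """
    # Keep the real name first and skip aliases that repeat an earlier term.
    terms = [name]
    for alias in aliases:
        if alias.lower() not in (t.lower() for t in terms):
            terms.append(alias)
    # Quote every term so that multiword names are matched as phrases.
    return " OR ".join('"{}"'.format(t) for t in terms)


print(expand_query("Hideki Matsui", ["Godzilla"]))
```

Keeping the real name as the first term means the expanded query still covers everything the original query would match; the aliases only widen it.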
For a given name, we define the set A of its aliases to be the set of all words or
multiword expressions that are used to refer to the same entity on the web. For
example, Godzilla is a one-word alias for Hideki Matsui, whereas the alias the Fresh
Prince contains three words and refers to Will Smith. Various types of terms are used
as aliases on the web. For instance, in the case of an actor, the name of a role or
the title of a drama (or a movie) can later become an alias for the person (e.g.,
Fresh Prince, Knight Rider). Titles or professions such as president, doctor, and
professor are also frequently used as aliases. Variants or abbreviations of names,
such as Bill for William, and acronyms, such as JFK for John Fitzgerald Kennedy, are
also types of name aliases that are observed frequently on the web.
1.2 OVERVIEW
An individual is typically referred to by numerous name aliases on the web.
Accurate identification of the aliases of a given person name is useful in various
web related tasks such as information retrieval, personal name disambiguation, and
relation extraction. We propose a method to extract aliases of a given personal name
from the web. Given a personal name, the proposed method first extracts a set of
candidate aliases. Second, we rank the extracted candidates according to the
likelihood of a candidate being a correct alias of the given name.
We define numerous ranking scores to evaluate candidate aliases using three
approaches: lexical pattern frequency, word co-occurrences in an anchor text graph,
and page counts on the web. To construct a robust alias detection system, we integrate
the different ranking scores into a single ranking function using ranking support
vector machines. We evaluate the proposed method on two data sets: an English
personal names data set and an English place names data set. The proposed method
outperforms numerous baselines and previously proposed name alias extraction
methods, achieving a statistically significant mean reciprocal rank (MRR) of 0.67.
Experiments carried out using location names and Japanese personal names suggest
the possibility of extending the proposed method to extract aliases for different types
of named entities and for different languages. Moreover, the aliases extracted using
the proposed method are successfully utilized in an information retrieval task and
improve recall by 20 percent in a relation detection task.
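For reference, MRR averages, over all query names, the reciprocal of the rank at which the first correct alias appears in the ranked candidate list. The sketch below is a minimal illustration of that definition. The function name and the tiny rankings are invented for the example and are not results from this project; a name with no correct alias in its list is assumed to contribute zero.

```python
def mean_reciprocal_rank(ranked_candidates, correct_aliases):
    """Average of 1/rank of the first correct alias for every name.

    ranked_candidates: {name: [candidate, ...]} ordered from best to worst.
    correct_aliases:   {name: set of known correct aliases}.
    """
    if not ranked_candidates:
        return 0.0
    total = 0.0
    for name, candidates in ranked_candidates.items():
        gold = correct_aliases.get(name, set())
        for rank, candidate in enumerate(candidates, start=1):
            if candidate in gold:
                total += 1.0 / rank  # only the first correct alias counts
                break
    return total / len(ranked_candidates)


# Invented rankings, for illustration only.
rankings = {
    "Hideki Matsui": ["Yankees", "Godzilla", "baseball"],  # first hit at rank 2
    "Will Smith": ["Fresh Prince", "actor"],               # first hit at rank 1
}
gold = {"Hideki Matsui": {"Godzilla"}, "Will Smith": {"Fresh Prince"}}
print(mean_reciprocal_rank(rankings, gold))  # (1/2 + 1/1) / 2 = 0.75
```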
To select the best aliases among the extracted candidates, we propose numerous
ranking scores based upon three approaches: lexical pattern frequency, word
co-occurrences in an anchor text graph, and page counts on the web. Moreover, using
real-world name alias data, we train a ranking support vector machine to learn the
optimal combination of the individual ranking scores and so construct a robust alias
extraction method.
Along with the recent rapid growth of social media such as blogs, extracting and
classifying sentiment on the web has received much attention. However, when people
express their views about a particular entity, they refer to the entity not only by
its real name but also by various aliases of the name. By aggregating texts that use
various aliases to refer to an entity, a sentiment analysis system can produce an
informed judgment related to the sentiment.
1.3 LIMITATIONS
An inherent limitation of approximate string matching methods is that they cannot
identify aliases which share no words or letters with the real name. For example,
approximate string matching methods would not identify Fresh Prince as an alias for
Will Smith.
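A small Python sketch makes the limitation concrete. It uses the standard library's difflib.SequenceMatcher as a stand-in for an approximate string matcher; the methods referred to in the text may use a different measure, such as edit distance, so the choice of measure here is an assumption.

```python
from difflib import SequenceMatcher


def string_similarity(a, b):
    """Similarity in [0, 1] based on matching character blocks (difflib)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# A name variant shares most of its letters with the name ...
variant_score = string_similarity("William", "Bill")
# ... whereas this alias shares almost nothing with the real name.
alias_score = string_similarity("Will Smith", "Fresh Prince")
print(variant_score > alias_score)
```

Any threshold on such a score that accepts Bill for William will still reject Fresh Prince for Will Smith, which is why co-occurrence evidence from the web is needed.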
The most likely correct aliases are assigned a higher rank, so a set of candidate
aliases can be ranked only in descending order of likelihood. Co-occurrences must be
considered not only in anchor texts but also across the web as a whole.
The different ranking scores must be integrated into a single ranking function,
which is a complex process. Comparing the SVM-based ranks with previously proposed
ranks is also difficult.
CHAPTER 2
LITERATURE SURVEY
2.1 INTRODUCTION
A literature survey is the most important step in the software development process.
Before developing the tool, it is necessary to determine the time factor, the economy,
and the company's strength. Once these things are satisfied, the next step is to
determine which operating system and language can be used for developing the tool.
Once the programmers start building the tool, they need a lot of external support.
This support can be obtained from senior programmers, from books, or from websites.
Before building the system, the above considerations are taken into account for
developing the proposed system.
search query on a name by tagging the aliases according to their association orders to
retrieve all relevant pages which in turn will increase the recall and achieve a
substantial MRR.
2.4 CONCLUSION
This paper presents a software application that saves manual hours and helps
the customer to track the status of applications. It is not secure to maintain important
information manually; it is better to store it in a database, which also helps in
avoiding conflicts. No specific training is required for employees to use this
application.
CHAPTER 3
ANALYSIS
3.1 INTRODUCTION
FEASIBILITY STUDY
The feasibility of the project is analyzed in this phase, and a business proposal is
put forth with a very general plan for the project and some cost estimates. During
system analysis, the feasibility study of the proposed system is carried out. This
is to ensure that the proposed system is not a burden to the company. For feasibility
analysis, some understanding of the major requirements for the system is essential.
Three key considerations are involved in the feasibility analysis:
ECONOMICAL FEASIBILITY
TECHNICAL FEASIBILITY
SOCIAL FEASIBILITY
ECONOMICAL FEASIBILITY
This study is carried out to check the economic impact that the system will
have on the organization. The amount of funds that the company can pour into the
research and development of the system is limited, so the expenditures must be
justified. The developed system was well within the budget, and this was achieved
because most of the technologies used are freely available. Only the customized
products had to be purchased.
TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not place a high demand on
the available technical resources, because that would in turn place high demands on
the client. The developed system must have modest requirements, so that only minimal
or no changes are required for implementing it.
SOCIAL FEASIBILITY
This aspect of the study checks the level of acceptance of the system by the
user. This includes the process of training the user to use the system efficiently. The
user must not feel threatened by the system, but must instead accept it as a necessity.
The level of acceptance by the users depends solely on the methods that are employed to
educate the user about the system and to make him familiar with it. His level of
confidence must be raised so that he is also able to make constructive criticism,
which is welcomed, as he is the final user of the system.
people from various fields of cinema, sports, politics, science, and mass media. The
place names data set contains aliases for US states. According to the definition of
co-occurrence, if two anchor texts co-occur in pointing to the same URL, an
undirected edge is drawn between them to denote their co-occurrence. A word
co-occurrence graph is created for the name and its aliases according to the
first order associations among them. Each name and alias is represented by a
node in the graph. Two nodes are connected if they have a first order association,
so an edge indicates that the anchor texts on the two nodes co-occur according to the
definition above. Next, a graph mining algorithm identifies the hop distance between
nodes in order to obtain the first, second, and higher order associations between the
name and its aliases.
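The edge rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the (anchor text, URL) input format, and the example.com URLs are hypothetical and do not come from the project's data.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(links):
    """Build an undirected graph from (anchor_text, url) pairs.

    Two anchor texts are joined by an edge when both point to the same URL.
    Returns {anchor_text: set of neighbouring anchor texts}.
    """
    anchors_by_url = defaultdict(set)
    for anchor, url in links:
        anchors_by_url[url].add(anchor)

    graph = defaultdict(set)
    for anchors in anchors_by_url.values():
        # Every pair of anchor texts sharing this URL co-occurs.
        for a, b in combinations(sorted(anchors), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


# Hypothetical inbound links, for illustration only.
links = [
    ("Hideki Matsui", "http://example.com/player"),
    ("Godzilla", "http://example.com/player"),
    ("Godzilla", "http://example.com/fans"),
    ("Matsui", "http://example.com/fans"),
]
graph = build_cooccurrence_graph(links)
print(sorted(graph["Godzilla"]))
```

In this toy data, Hideki Matsui and Matsui never share a URL, so they are not directly connected; they are linked only through Godzilla.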
3.2.2 SCOPE OF THE PROJECT
Extracting aliases of an entity is important for various tasks such as identification
of relations among entities, web search and entity disambiguation. To extract relations
among entities properly, one must first identify those entities. We propose a novel
approach to find aliases of a given name using automatically extracted lexical
patterns. We exploit a set of known names and their aliases as training data and extract
lexical patterns that convey information related to aliases of names from text snippets
returned by a web search engine. The patterns are then used to find candidate aliases
of a given name. We use anchor texts to design a word co-occurrence model and use it
to define various ranking scores to measure the association between a name and a
candidate alias. The ranking scores are integrated with page-count-based association
measures using support vector machines to leverage a robust alias detection method.
The proposed method outperforms numerous baselines and previous work on alias
extraction on a dataset of personal names, achieving a statistically significant mean
reciprocal rank of 0.6718. Experiments carried out using a dataset of location names
and Japanese personal names suggest the possibility of extending the proposed
method to extract aliases for different types of named entities and for other languages.
SRS MODEL
through the phases of Conception,
fact, is how the term is generally used in writing about software development to
describe a critical view of a commonly used software development practice.
Processor             : Pentium III/IV
Speed                 : 1.1 GHz
RAM                   : 256 MB (min)
Hard Disk             : 20 GB
Floppy Drive          : 1.44 MB
Key Board
Mouse
Monitor               : SVGA
Operating System      : Windows 95/98/2000/XP
Application Server    : Tomcat 5.0/6.X
Front End
Scripts               : JavaScript
Database              : MySQL
Database Connectivity : JDBC
think about encapsulation is as a protective wrapper that prevents the code and data
from being arbitrarily accessed by other code defined outside of the wrapper. Access
to the code and data inside the wrapper is tightly controlled through a well-defined
interface. To relate this to the real world, consider the automatic transmission on an
automobile. It encapsulates hundreds of bits of information about your engine, such as
how much you are accelerating, the pitch of the surface you are on, and the position of
the shift lever.
B) INHERITANCE
Inheritance is the process by which one object acquires the properties of
another object. This is important because it supports the concept of hierarchical
classification. As mentioned earlier, most knowledge is made manageable by
hierarchical classification. Inheritance interacts with encapsulation as well. If a given
class encapsulates some attributes, then any subclass will have the same attributes plus
any that it adds as part of its specialization.
C) POLYMORPHISM
Polymorphism (from the Greek, meaning many forms) is a feature that
allows one interface to be used for a general class of actions. More generally, the
concept of polymorphism is often expressed by the phrase one interface, multiple
methods. This means that it is possible to design a generic interface to a group of
related activities. This helps reduce complexity by allowing the same interface to be
used to specify a general class of action. It is the compiler's job to select the specific
action as it applies to each situation.
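The phrase one interface, multiple methods can be illustrated with a short sketch. It is written in Python for brevity rather than in the project's implementation language, and the class names are hypothetical. Note also that Python selects the method at run time rather than in a compiler; the idea of a single interface with several interchangeable implementations is the same.

```python
class Scorer:
    """Common interface: every scorer exposes score(name, candidate)."""

    def score(self, name, candidate):
        raise NotImplementedError


class ExactMatchScorer(Scorer):
    """Subclass inheriting the interface: 1.0 only for identical strings."""

    def score(self, name, candidate):
        return 1.0 if name.lower() == candidate.lower() else 0.0


class SharedWordScorer(Scorer):
    """Another subclass: fraction of words the two strings have in common."""

    def score(self, name, candidate):
        a = set(name.lower().split())
        b = set(candidate.lower().split())
        return len(a & b) / len(a | b) if a | b else 0.0


def score_all(scorers, name, candidate):
    # The caller relies only on the shared interface, not on concrete classes.
    return [s.score(name, candidate) for s in scorers]


print(score_all([ExactMatchScorer(), SharedWordScorer()], "Will Smith", "Fresh Prince"))
```

New scoring strategies can be added as further subclasses without changing score_all, which is the reduction in complexity the text describes.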
MySQL
MySQL is a relational database management system (RDBMS) and ships
with no GUI tools to administer MySQL databases or manage the data contained within
them. Users may use the included command line tools, or use MySQL "front-ends",
desktop software and web applications that create and manage MySQL
databases, build database structures, back up data, inspect status, and work with data
records. The official MySQL front-end tool, MySQL Workbench, is actively
developed by Oracle and is freely available for use.
Graphical Package
The official MySQL Workbench is a free integrated environment developed by
MySQL AB, that enables users to graphically administer MySQL databases and
Cross-platform support
Stored procedures
Triggers
Cursors
Updatable Views
Information schema
Strict mode
Transactions with the InnoDB and Cluster storage engines; savepoints with InnoDB
SSL support
Limitations
Like other SQL databases, MySQL does not currently comply with the full
SQL standard for some of the implemented functionality, including foreign key
references when using storage engines other than the 'standard' InnoDB.
Triggers are currently limited to one per action and timing, i.e. at most one after
insert and one before insert on the same table. There are no triggers on views.
MySQL, like most other transactional relational databases, is strongly limited by hard
disk performance. This is especially true in terms of write latency. Given the recent
appearance of very affordable consumer grade SATA solid-state drives that
offer zero mechanical latency, a fivefold speedup over even an eight drive RAID
array can be had for a smaller investment.
Windows Deployment
MySQL can be built and installed manually from source code, but this can be
tedious, so it is more commonly installed from a binary package unless special
customizations are required. On most Linux distributions the package management
system can download and install MySQL with minimal effort, though further
configuration is often required to adjust security and optimization settings. Though
MySQL began as a low-end alternative to more powerful proprietary databases, it has
gradually evolved to support higher-scale needs as well. It is still most commonly
used in small to medium scale single-server deployments.
Community
The MySQL server software itself and the client libraries use dual-licensing
distribution. They are offered under GPL version 2, beginning from 28 June 2000
(extended in 2009 with a FLOSS License Exception), or under a
proprietary license.
Support can be obtained from the official manual. Free support additionally is
available in different IRC channels and forums. Oracle offers paid support via its
MySQL Enterprise products.
Apache Tomcat
Apache Tomcat (or simply Tomcat, formerly also Jakarta Tomcat) is an open
source web server and servlet container developed by the Apache Software
Foundation (ASF). Tomcat implements the Java Servlet and the JavaServer Pages
(JSP) specifications from Sun Microsystems, and provides a "pure Java" HTTP web
server environment for Java code to run. Apache Tomcat includes tools for
configuration and management.
Cluster
This component has been added to manage large applications. It is used for load
balancing, which can be achieved through many techniques. Clustering support currently
requires JDK version 1.5 or later.
High availability
A high-availability feature has been added to facilitate the scheduling of
system upgrades (e.g. new releases, change requests) without affecting the live
environment. This is done by dispatching live traffic requests to a temporary server on
a different port while the main server is upgraded on the main port. It is very useful in
handling user requests on high-traffic web applications.[2]
Web Application
Tomcat has also added user-based as well as system-based web application enhancements
to support deployment across a variety of environments. It also tries to
manage sessions as well as applications across the network.
A number of additional components may be used with Apache Tomcat. These components
may be built by users should they need them, or they can be downloaded from one of
the mirrors.
Features
Tomcat 7.x implements the Servlet 3.0 and JSP 2.2 specifications. It requires
Java version 1.6, although previous versions have run on Java 1.1 through 1.5.
Versions 5 through 6 saw improvements in garbage collection, JSP parsing,
Fig3.2: Architecture of proposed method
The architecture is outlined in Fig3.2 and comprises four main components, namely
computation of word co-occurrence statistics, ranking of anchor texts, creation of the
anchor text co-occurrence graph, and discovery of association orders. Anchor text mining
is used to measure the associations between anchor texts. A ranking support vector
machine (SVM) ranks the anchor texts with respect to each anchor text in order to
identify the highest ranking anchor text, which forms the first order associations among
anchor texts. The whole process carried out in the project is described as follows.
The input is submitted to the Google search engine in the form allinanchor: followed
by the input name. The search engine retrieves the corresponding anchor texts and URLs
for the given input, and these are kept in a table called the contingency table, in
which the anchor texts and URLs are arranged in two columns. After the contingency
table has been created, the word co-occurrence frequency can be computed.
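As a minimal sketch of this step, the two-column table can be held as a list of (anchor text, URL) rows, and the co-occurrence frequency of two anchor texts taken as the number of distinct URLs that both point to. That definition, the function name, and the example.com rows are assumptions made for illustration; the project's own statistics may be defined differently.

```python
from collections import defaultdict


def cooccurrence_frequency(table, text_a, text_b):
    """Count the distinct URLs that both anchor texts point to.

    table: list of (anchor_text, url) rows, i.e. the two-column table.
    """
    urls_by_anchor = defaultdict(set)
    for anchor, url in table:
        urls_by_anchor[anchor].add(url)
    return len(urls_by_anchor[text_a] & urls_by_anchor[text_b])


# Hypothetical rows, for illustration only.
table = [
    ("Hideki Matsui", "http://example.com/a"),
    ("Godzilla", "http://example.com/a"),
    ("Hideki Matsui", "http://example.com/b"),
    ("Godzilla", "http://example.com/b"),
    ("Godzilla", "http://example.com/c"),
]
print(cooccurrence_frequency(table, "Hideki Matsui", "Godzilla"))  # 2
```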
In the second part of the architecture, the training data set is presented as
input to the support vector machine. The support vector machine generates a
ranking function, which is passed to the ranking algorithm. The ranking
algorithm generates the first order associations.
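The role of the ranking function can be sketched as follows, assuming it is linear, as is the case for a ranking SVM with a linear kernel: each candidate is described by a vector of ranking scores, and the function is a weighted sum of them. The weights and feature values below are invented for illustration and are not learned from any data; training the SVM itself is outside this sketch.

```python
def rank_candidates(features, weights):
    """Order candidates by a linear ranking function, best first.

    features: {candidate: [score_1, score_2, ...]} individual ranking scores.
    weights:  one weight per score (hypothetical here, normally learned).
    """
    def combined(vector):
        return sum(w * x for w, x in zip(weights, vector))

    return sorted(features, key=lambda c: combined(features[c]), reverse=True)


# Invented scores for three candidate anchor texts of one name.
features = {
    "Yankees": [0.4, 0.6],
    "Godzilla": [0.9, 0.8],
    "baseball": [0.2, 0.1],
}
weights = [0.7, 0.3]
print(rank_candidates(features, weights))
```

The top-ranked candidate is the one that would be taken as the first order association for the name.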
In the third part of the architecture, a graph drawing algorithm is applied to obtain
the connections among the name and its aliases, and the word co-occurrence graph is
created from it. The word co-occurrence graph is then mined by a graph mining
algorithm, which finds the hop distances. Finally,
the association orders are discovered from the hop distances.
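The text does not name the graph mining algorithm. As one plausible sketch, breadth-first search gives the hop distance from the name to every reachable node, and a hop distance of k can be read as a k-th order association. The adjacency dictionary and the node labels below are hypothetical.

```python
from collections import deque


def association_orders(graph, name):
    """Hop distance from name to every reachable node, by breadth-first search.

    graph: {node: set of neighbouring nodes} (undirected co-occurrence graph).
    Returns {node: order}, where order 1 means a first order association.
    """
    distance = {name: 0}
    queue = deque([name])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    del distance[name]  # the name itself is not one of its own aliases
    return distance


# Hypothetical co-occurrence graph, for illustration only.
graph = {
    "Hideki Matsui": {"Godzilla"},
    "Godzilla": {"Hideki Matsui", "Matsui"},
    "Matsui": {"Godzilla"},
}
print(association_orders(graph, "Hideki Matsui"))
```

Breadth-first search visits nodes in order of increasing distance, so the first time a node is reached its hop distance is already the shortest one.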
3.6 ALGORITHM
3.6.1 Keyword Extraction Algorithm
Matsuo and Ishizuka proposed a keyword extraction algorithm that
applies to a single document without using a corpus. Frequent terms are extracted
first, and then a set of co-occurrences between each term and the frequent terms, i.e.,
occurrences in the same sentences, is generated. The co-occurrence distribution shows
the importance of a term in the document. However, this method only extracts
keywords from a single document; it does not correlate further documents using
anchor text based co-occurrence frequency.
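The first two steps described above, finding frequent terms and counting sentence-level co-occurrences with them, can be sketched as follows. This is a simplified illustration and not the published algorithm: the tokenization, the top_k cut-off, and the function name are assumptions, stop words are not removed, and the statistical scoring of the co-occurrence distribution is left out.

```python
import re
from collections import Counter, defaultdict


def cooccurrence_with_frequent_terms(text, top_k=2):
    """Return (frequent_terms, cooc) for a single document.

    cooc[term][frequent] counts the sentences containing both words.
    """
    # Split into sentences, then into lowercase alphabetic tokens.
    sentences = [re.findall(r"[a-z]+", s.lower()) for s in re.split(r"[.!?]", text)]
    sentences = [s for s in sentences if s]

    counts = Counter(word for s in sentences for word in s)
    frequent = [word for word, _ in counts.most_common(top_k)]

    cooc = defaultdict(Counter)
    for s in sentences:
        present = set(s)
        for term in present:
            for f in frequent:
                if f in present and f != term:
                    cooc[term][f] += 1
    return frequent, cooc


text = "Godzilla plays baseball. Godzilla hits home runs. Fans love baseball."
frequent, cooc = cooccurrence_with_frequent_terms(text)
print(sorted(frequent), dict(cooc["plays"]))
```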
Fig3.3: Flow chart of the project (recoverable steps: Generate Occurrences in the Same Sentence, Extracts a Keyword, Stop)
3.8 CONCLUSION
In this phase, we understand the software requirements specifications for the
project. We arrange all the required components to develop the project in this phase
itself so that we will have a clear idea regarding the requirements before designing the
project. Thus we will proceed to the design phase followed by the implementation
phase of the project.
CHAPTER 4
DESIGN
4.1 INTRODUCTION
4.1.1 INPUT DESIGN
The input design is the link between the information system and the user. It
comprises developing the specifications and procedures for data preparation, that is,
the steps necessary to put transaction data into a usable form for processing. This can
be achieved by instructing the computer to read data from a written or printed document,
or by having people key the data directly into the system. The design
of input focuses on controlling the amount of input required, controlling errors,
avoiding delay, avoiding extra steps, and keeping the process simple. The input is
designed so that it provides security and ease of use while retaining
privacy. Input design considers the following things:
What data should be given as input?
How should the data be arranged or coded?
The dialog to guide the operating personnel in providing input.
Methods for preparing input validations and the steps to follow when errors occur.
4.1.2 OBJECTIVES
Input design is the process of converting a user-oriented description of the input
into a computer-based system. This design is important to avoid errors in the data
input process and to show the management the correct direction for getting correct
information from the computerized system.
It is achieved by creating user-friendly screens for data entry that can handle large
volumes of data. The goal of designing input is to make data entry easier and free
from errors. The data entry screen is designed in such a way that all the data
manipulations can be performed. It also provides record viewing facilities.
When data is entered, it is checked for validity. Data can be entered with
the help of screens. Appropriate messages are provided as and when needed, so that the
user is not left in a maze at any instant. Thus the objective of input design is to
create an input layout that is easy to follow.
4.1.3 OUTPUT DESIGN
A quality output is one that meets the requirements of the end user and
presents the information clearly. In any system, the results of processing are
communicated to the users and to other systems through outputs. In output design it is
determined how the information is to be displayed for immediate need, and also as
hard copy output. It is the most important and direct source of information for the user.
Efficient and intelligent output design improves the system's relationship with the
user and helps decision-making.
1. Designing computer output should proceed in an organized, well thought out
manner; the right output must be developed while ensuring that each output element is
designed so that people find the system easy to use effectively. When analysts
design computer output, they should identify the specific output that is
needed to meet the requirements.
2. Select methods for presenting information.
3. Create documents, reports, or other formats that contain information produced by the
system.
The output form of an information system should accomplish one or more of the
following objectives.
Convey information about past activities, current status, or projections of the future.
Signal important events, opportunities, problems, or warnings.
Trigger an action.
Confirm an action.
of a system encompasses the classes, interfaces, and collaborations that form the
vocabulary of the problem and its solution. The process view of a system
encompasses the threads and processes that form the system's concurrency and
synchronization mechanisms. The implementation view of a system encompasses the
components and files that are used to assemble and release the physical system. The
deployment view of a system encompasses the nodes that form the system's hardware
topology on which the system executes.
Uses of UML
The UML is intended primarily for software intensive systems. It has been used
effectively for such domains as
Enterprise Information System
Banking and Financial Services
Telecommunications
Transportation
Defense/Aerospace
Retail
Medical Electronics
Scientific Fields
Distributed Web
Building blocks of UML
The vocabulary of the UML encompasses 3 kinds of building blocks
Things
Relationships
Diagrams
Things
Things are the abstractions that are first class citizens in a model. Things are of 4
types:
Structural Things, Behavioral Things, Grouping Things, Annotational Things
Relationships
Relationships tie the things together. Relationships in the UML are
Dependency, Association, Generalization, Realization
UML Diagrams
A diagram is the graphical presentation of a set of elements, most often rendered
as a connected graph of vertices (things) and arcs (relationships).
Class diagram: class user; attributes name, password, phone number, gender, keyword, alias, name, description; operations upload(), search(), insert(), retrive()
Use cases
Actors
System boundary
Use case diagram: actor User; use cases enter character, select character
Sequence diagrams in UML show how objects interact with each other and the
order in which those interactions occur. It is important to note that they show the
interactions for a particular scenario. The processes are represented vertically and
the interactions are shown as arrows.
Sequence diagram: participants user and enter character; messages 1: select character(), 2: draw character()
Activity diagram: User login; Authenticated (yes / no); Upload image; Getting results
Diagram elements: user, enter character, select character; Home Page, user, Login, Register, Upload Image, Getting Results
Diagram elements: Search Image with Aliases, Upload Image
Processes
Processes show what system does. Each process has one or more data inputs and
produces one or more data outputs. Processes are represented by round rectangles in
DFD.
Data Stores
A file or data stores is repository of data. Processes can enter data into a store or
retrieve data from data store. The line in the DFD and each store represents each data
store as a unique name.
External Entities
External entities are outside the system, but they either supply input data to the
system or use the system's output. They are entities over which the designer has no
control. They may be organizations or other bodies with which the system interacts.
Data Flows
Data flows model the passage of data in the system and are represented by lines
joining the system components. An arrow indicates the direction of flow, and the line
is labeled with the name of the data flow. Flow of data in the system can take place:
Between two processes
From a data store to a process
From a process to a data store
From a source to a process
4.3.2 Level 0 DFD Diagrams
[Figure: Level 0 DFD with the elements User Login, Upload Image, Search Image with Aliases and Getting Results]
[Figure: data flow diagram with the elements Alias Names, User Login, Upload Image and Getting Results]
This table contains two fields, name and password. The admin logs in with his
name and password, after which he can browse and upload the photos.
Field       Type            Null
name        varchar(255)    NO
password    varchar(255)    NO
This table contains four fields: id, name, comment and date. The table is used
to store the comments put on the photos uploaded by users. The users' names are
also recorded.
Field       Type            Null
id          varchar(255)    NO
name        varchar(255)    NO
comment     varchar(255)    NO
date        varchar(255)    NO
The user can log in with his login id and upload photos with their tags and
descriptions. The likes and comments are also stored here, so that other users
can put a comment on an already uploaded photo.
Field          Type            Null
image          blob            NO
tag            varchar(255)    NO
iname          varchar(255)    NO
id             int(255)        NO
count          int(255)        NO
comment        int(255)        NO
description    varchar(255)    NO
like           int(255)        NO
date           varchar(255)    NO
A new user can register with a name and e-mail. Once the registration process
is successfully completed, that user can log in.
Type            Null
varchar(255)    NO
varchar(255)    NO
varchar(255)    NO
varchar(255)    NO
varchar(255)    NO
[Figure: labels Terminator, The Expendables, Predator and Governator]
[Figure: labels Hideki Matsui, Matsui, Godzilla, Baseball, Yankees, New York and Sports]
4.6 CODING
4.6.1 ADMIN LOGIN PAGE
<%@page import="com.oreilly.servlet.*,java.sql.*,databaseconnection.*,java.util.*,java.io.*,javax.servlet.*, javax.servlet.http.*"%>
<html>
<head>
<title>Untitled Document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>
<%
// Credentials submitted by the admin login form.
String name = request.getParameter("aname");
String password = request.getParameter("password");
Connection con = null;
PreparedStatement ps = null;
ResultSet rs = null;
try{
    Class.forName("com.mysql.jdbc.Driver");
    con = DriverManager.getConnection("jdbc:mysql://localhost:3306/alias","root","root");
    // Bind the inputs as parameters instead of concatenating them into the SQL text.
    ps = con.prepareStatement("select * from admin where name=? AND password=?");
    ps.setString(1, name);
    ps.setString(2, password);
    rs = ps.executeQuery();
    if(!rs.next()){
        out.println("Enter correct username, password");
    }
    else{
        response.sendRedirect("upload.jsp");
    }
}
catch(Exception ex){
    out.println(ex);
}
finally{
    // Release the JDBC resources in reverse order of creation.
    try{ if(rs != null) rs.close(); } catch(SQLException ignore){}
    try{ if(ps != null) ps.close(); } catch(SQLException ignore){}
    try{ if(con != null) con.close(); } catch(SQLException ignore){}
}
%>
</body>
</html>
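How user input reaches the SQL text matters for a login query like the one above. The following is a minimal sketch, not part of the project sources: the class name LoginQuerySketch and its methods are hypothetical, and only the admin table with its name and password columns is taken from the listing. It contrasts a query built by string concatenation with a parameterized template.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical helper used only for illustration. */
final class LoginQuerySketch {

    /** SQL text with placeholders; user input never becomes part of it. */
    static final String LOGIN_SQL = "select * from admin where name=? AND password=?";

    /** Builds the query by concatenation; shown only to illustrate the risk. */
    static String concatenatedQuery(String name, String password) {
        return "select * from admin where name='" + name + "' AND password='" + password + "'";
    }

    /** Returns true when a row matches; the caller supplies an open connection. */
    static boolean isValidAdmin(Connection con, String name, String password) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(LOGIN_SQL)) {
            ps.setString(1, name);
            ps.setString(2, password);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next();
            }
        }
    }

    public static void main(String[] args) {
        // A crafted password closes the quoted literal and appends a condition that is always true.
        String crafted = "x' OR '1'='1";
        System.out.println(concatenatedQuery("admin", crafted));
        // The parameterized template stays the same whatever the user types.
        System.out.println(LOGIN_SQL);
    }
}
```

With concatenation, the crafted password changes the WHERE clause itself, so the check can succeed without valid credentials. With bound parameters the same input is compared as plain data.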
{
response.sendRedirect("search.jsp");
}
con.close();
st.close();
}
catch(Exception ex){
out.println(ex);
}
%>
</body>
</html>
4.6.3 LIKE PAGE
<%@page import="com.oreilly.servlet.*,java.sql.*,databaseconnection.*,java.util.*,java.io.*,javax.servlet.*, javax.servlet.http.*"%>
<%
// Id of the photo being liked, read from the session.
String id=(String)session.getAttribute("id");
Connection con = null;
PreparedStatement ps = null;
try
{
    Class.forName("com.mysql.jdbc.Driver");
    con = DriverManager.getConnection("jdbc:mysql://localhost:3306/alias","root","root");
    // Increment the like counter in a single statement; the id is bound as a parameter.
    ps = con.prepareStatement("Update new set lyke=lyke+1 where id=?");
    ps.setString(1,id);
    int x = ps.executeUpdate();
    // Redirect only when a matching photo row was updated.
    if(x > 0){
        response.sendRedirect("search3.jsp?message=success");
    }
}
catch (Exception ex)
{
    out.println(ex.getMessage());
}
finally
{
    // Release the JDBC resources in reverse order of creation.
    try{ if(ps != null) ps.close(); } catch(SQLException ignore){}
    try{ if(con != null) con.close(); } catch(SQLException ignore){}
}
%>
4.6.4 COMMENT PAGE
<%@page import="com.oreilly.servlet.*,java.sql.*,databaseconnection.*,java.util.*,java.io.*,javax.servlet.*, javax.servlet.http.*"%>
<%
String id=(String)session.getAttribute("id");
Statement st = null;
ResultSet rs1=null;
try{
Class.forName("com.mysql.jdbc.Driver");
Connection con =
DriverManager.getConnection("jdbc:mysql://localhost:3306/alias","root","root");
st=con.createStatement();
4.7 CONCLUSION
In this way we can design the layout of the project that is to be implemented
during the construction phase. This gives a clear picture of the project before it is
coded, so any necessary enhancements can be made during this phase before coding
is started.
CHAPTER 5
IMPLEMENTATION
5.1 INTRODUCTION
Implementation is the stage of the project when the theoretical design is turned
into a working system. It can therefore be considered the most critical stage in
achieving a successful new system and in giving the user confidence that the new
system will work and be effective.
The implementation stage involves careful planning, investigation of the existing
system and its constraints on implementation, design of methods to achieve
changeover, and evaluation of changeover methods. The implementation is carried out
in JSP, which is well suited to dynamic page design, data sharing and mining
concepts. MySQL is used as the database back end for maintaining the data.
Implementation is the process of converting a new system design into operation.
It is the phase that focuses on user training, site preparation and file conversion for
installing a candidate system. The important factor that should be considered here is
that the conversion should not disrupt the functioning of the organization.
Operational Transformation (OT) has been well accepted in group editors for
achieving high local responsiveness and unconstrained collaboration.
Remote operations are transformed before they are executed so that
inconsistencies are repaired.
Existing documents such as employee loan details should be entered into the new
system. Since these files are very large, their conversion may continue long after
the system based on current files has been implemented. Hence responsibility needs
to be assigned for each activity.
The system may come into full operation via a number of possible routes.
A complete changeover at one point in time is conceptually the tidiest, but this
approach requires careful planning and coordination, particularly during the
changeover. A phased approach implements the section of the system relating to one
operation or procedure first and progresses to more novel or complex subsystems in
the fullness of time. This is likely to be less traumatic, and it gives the staff time
to adjust to the new system, but it depends on being able to split the system into
parts that can operate independently. This approach is sensible when the
consequences of failure are disastrous, but it will require extra staff time. The fourth
route, pilot operation, permits any problems to be tackled on a smaller scale.
Pilot operation generally means implementing the complete system, but at one
location or branch only.
5.2.1 FORM DESIGN
The various forms and their processing are given below.
Admin Login Screen
Purpose
The purpose of the screen is to allow the administrator to enter the system.
Details
The edit boxes are used to enter the username and password.
The send button is used to log in to the system.
Screen design
Search screen
Purpose
The purpose of the screen is to search the images that have already been uploaded.
Details
The search bar is used to search the images.
Screen design
Like screen
Purpose
The purpose of the screen is to let the user like any of the images that have already
been uploaded.
Details
The like link is used to like an image.
Screen design
Comment screen
Purpose
The purpose of the screen is to let the user comment on any of the images that have
already been uploaded.
Details
The comment link is used to comment on an image; an edit box appears in which the
comment is entered.
Screen design
5.3 CONCLUSION
In this way we implemented the project successfully, providing easy interaction
between the user and the interfaces and enhanced security with less effort. We now
proceed to the next phase, testing, which is very important before delivering the project.
CHAPTER 6
TESTING AND VALIDATION
6.1 INTRODUCTION
Testing is the process of detecting errors. It plays a critical role in quality
assurance and in ensuring the reliability of software. The results of testing are also
used later, during maintenance.
The aim of testing is often assumed to be demonstrating that a program works by
showing that it has no errors. The basic purpose of the testing phase, however, is to
detect the errors that may be present in the program. Hence one should not start
testing with the intent of showing that a program works; the intent should be to show
that a program does not work. Testing is the process of executing a program with the
intent of finding errors.
Software testing is a critical element of software quality assurance and represents
the ultimate review of software specification, design and coding. The increasing
visibility of software as a system element and the attendant costs associated with a
software failure are motivating forces for well-planned, thorough testing. It is not
unusual for a software development organization to expend 40 percent of total project
effort on testing. Hence the importance of software testing and its implications with
respect to software quality cannot be overemphasized. Different types of testing have
been carried out for this system, and they are briefly explained below.
6.1.1 TESTING OBJECTIVES
The main objective of testing is to uncover a host of errors systematically and
with minimum effort and time. Stated formally:
A good test case is one that has a high probability of finding an error, if one
exists.
for. The testing is conducted on a complete, integrated system to evaluate the system's
compliance with its specified requirements.
The basic levels of testing are as shown below:
Client Needs     -  Acceptance Testing
Requirements     -  System Testing
Design           -  Integration Testing
Code             -  Unit Testing
The test case specification deals with the details of the test data used to test
the system.
In Bottom-Up Integration Testing each sub module is tested separately and then
the full system is tested.
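The bottom-up order can be illustrated with a minimal sketch. It assumes two hypothetical sub-modules that are not taken from the project sources: a normalizer for names and tags, and an alias index built on top of it. The lower-level module is checked on its own first, and only then is the module that depends on it exercised. The name and alias used are sample data.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; none of these names come from the project sources. */
final class BottomUpSketch {

    /** Sub-module 1: normalizes a name or tag so that lookups ignore case and padding. */
    static String normalize(String raw) {
        return raw == null ? "" : raw.trim().toLowerCase(Locale.ROOT);
    }

    /** Sub-module 2: builds on normalize() to map a name to its aliases. */
    static final class AliasIndex {
        private final Map<String, Set<String>> aliases = new HashMap<>();

        void add(String name, String alias) {
            aliases.computeIfAbsent(normalize(name), k -> new HashSet<>()).add(normalize(alias));
        }

        /** Returns the normalized name together with every alias recorded for it. */
        Set<String> expand(String name) {
            String key = normalize(name);
            Set<String> terms = new HashSet<>();
            Set<String> known = aliases.get(key);
            if (known != null) {
                terms.addAll(known);
            }
            terms.add(key);
            return terms;
        }
    }

    public static void main(String[] args) {
        // Step 1: test the lowest-level module on its own.
        System.out.println(normalize("  Godzilla ").equals("godzilla"));

        // Step 2: test the module that builds on it, now that step 1 has passed.
        AliasIndex index = new AliasIndex();
        index.add("Hideki Matsui", "Godzilla");
        System.out.println(index.expand("hideki matsui").contains("godzilla"));
    }
}
```

If the second check fails after the first has passed, the fault is more likely to lie in the alias index than in the normalizer, which is the practical benefit of testing from the bottom up.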
System Testing
Project testing is an important phase without which the system cannot be
released to the end users. It aims to ensure that all processes conform
accurately to the specification.
Acceptance Testing
Acceptance testing is performed with realistic client data to demonstrate that
the software is working satisfactorily. Testing here focuses on the external behavior
of the system; the internal logic of the program is not emphasized.
6.3.1 TESTING ACTIVITIES
The activities of testing include:
Component Inspection: This finds faults in an individual component through
the manual inspection of its source code. Inspections can be conducted before
or after the unit test.
Usability Testing: This finds differences between the system and the user's
expectation of what it should do.
2. Path based testing or White box testing: This white box testing
technique identifies faults in the implementation of the component. The
assumption here is that, by exercising all possible paths through the code
at least once, most faults will trigger failures. The identification of paths
requires knowledge of the source code and data structures.
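As a minimal sketch of path-based testing, consider a hypothetical credential check (the class LoginCheck and its return values are assumptions made for illustration, not project code). The method has three control-flow paths, and the test inputs are chosen by reading the source so that each path is executed at least once.

```java
/** Hypothetical component used only to illustrate path coverage. */
final class LoginCheck {

    /** Three control-flow paths: missing input, wrong credentials, success. */
    static String check(String name, String password, String storedName, String storedPassword) {
        if (name == null || password == null || name.isEmpty() || password.isEmpty()) {
            return "MISSING";
        }
        if (!name.equals(storedName) || !password.equals(storedPassword)) {
            return "REJECTED";
        }
        return "OK";
    }

    public static void main(String[] args) {
        // One input per path through check().
        System.out.println(check("", "root", "admin", "root"));      // path 1: prints MISSING
        System.out.println(check("admin", "bad", "admin", "root"));  // path 2: prints REJECTED
        System.out.println(check("admin", "root", "admin", "root")); // path 3: prints OK
    }
}
```

Choosing these inputs requires knowing where the branches are, which is why this technique depends on access to the source code.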
Two or more
components are integrated and tested and when no new faults are revealed,
additional components are added to the group.
System Testing: This focuses on the complete system, its functional and
nonfunctional requirements, and its target environment. All the testing
activities stated above are to be carried out on large-scale projects to obtain a
consistent system. Not all of them are applicable to small projects that
involve little analysis.
6.4 VALIDATION
Name
Location
Input1
Log1
Input2
Log2
is
displayed.
6.5 CONCLUSION
In this way we completed the testing phase of the project and ensured that
the system is ready to go live.
CHAPTER 7
CONCLUSION
CHAPTER 8
BIBLIOGRAPHY
Good Teachers are worth more than thousand books, we have them in Our
Department
References Made From
[1] J. Artiles, J. Gonzalo, and F. Verdejo, "A Testbed for People Searching
Strategies in the WWW," Proc. SIGIR '05, pp. 569-570, 2005.
[2] R. Guha and A. Garg, "Disambiguating People in Search," technical
report, Stanford Univ., 2004.
[3] D. Bollegala, Y. Matsuo, and M. Ishizuka, "Automatic Discovery of
Personal Name Aliases from the Web," IEEE Transactions on Knowledge
and Data Engineering, vol. 23, no. 6, June 2011.
[4] Y. Matsuo and M. Ishizuka, "Keyword Extraction from a Single
Document using Word Co-occurrence Statistical Information,"
International Journal on Artificial Intelligence Tools, 2004.
[5] W. Lu, L. Chien, and H. Lee, "Anchor Text Mining for Translation of
Web Queries: A Transitive Translation Approach," ACM Transactions on
Information Systems, vol. 22, no. 2, pp. 242-269, April 2004.
[6] Z. Liu, W. Yu, Y. Deng, Y. Wang, and Z. Bian, "A Feature Selection
Method for Document Clustering based on Part-of-Speech and Word
Co-occurrence," Proc. 7th International Conference on Fuzzy Systems and
Knowledge Discovery (FSKD '10), pp. 2331-2334, Aug. 2010.
http://www.sourcefordgde.com
http://www.networkcomputing.com/
http://www.roseindia.com/
http://www.java2s.com/