Você está na página 1de 3

International Journal of Emerging Technologies and Engineering (IJETE)

Vo lu me 1 Issue 10, November 2014, ISSN 2348 8050

Content Search & Query Value Search for Document Annotation


Nishigandha Kute*, Monal Salunke**, Surekha Khaire *** ,Prof. Mrs. S. Patil****
*MMIT, Savitribai Phule Pune University, Lohgaon
** MMIT, Savitribai Phule Pune University, Lohgaon
*** MMIT, Savitribai Phule Pune University, Lohgaon
**** MMIT, Department of Computer Engineering, Savitribai Phule Pune University, Lohgaon

ABSTRACT

II.

Searching the World Wide Web can be both beneficial and


frustrating in todays life. And also all collection of
textual data contains large amount of structured
information which remains hidden in unstructured format.
Relevant information is always difficult to find in these
documents. You may find vast amounts of information, or
you may not find the kinds of information you're looking
for. Searching online will provide you with a wealth of
information, but not all of it will be useful or of the
highest quality.
In this paper we are consider an alternative
approach for content and query searching based on
Facilitating Document Annotation for the structured
metadata by identifying documents in overall system that
are likely to contain information of interest and this
information is going to be useful for content of querying
the database. Here publisher will likely to assign metadata,
structured or unstructured related to documents which they
upload which will easily help the users in retrieving the
documents.
Keywords CADs, Annotation, Query,Attribute

Annotations of documents are comments, notes,


explanations, or other types of external remarks that can
be attached to a Web document or to a selected part of a
document. As they are external, it is possible to annotate
any Web document independently, without needing to edit
the document itself. From a technical point of view,
annotations are usually seen as metadata, as they give
additional information about an existing piece of data.
Annotations of documents can be stored locally or in
one or more database servers. When a document is
searched, content of queries value each of these database
servers, requesting the annotations related to that
document in web server database.
An annotation has many properties including: Document
Annotation Physical location: is the publisher stored in
the local file system or in a database server
Document Annotation Scope : is the user associated to a
whole document or just to a fragment.
Document Annotation type : 'Annotation', 'Comment',
'Query', 'Content'...

III.
I.

ANNOTATION OF DOCUMENT

PROPOSED WORK

INTRODUCTION

Enhancing the search results in large archives is a concern


shared by Collection of huge and textual data. The search
content improvement can come from two directions of
method: Filename based search or Content based search.
Both search content directions are active research areas. In
this filename based search system are search the data
within the filename itself and it produces very low
accurate results. And second one is content based search
the data within the file contents instead of filename. It
also produces very low accurate and large amount of
results. But there is no any use of the results.
In this paper we learn and explore the interaction
between Document Annotation Using Content &
Querying Value.

In Proposed System or in this paper, we describe CADS


(Collaborative Adaptive Data Sharing platform), which is
based on an annotate-as-you-create infrastructure that
facilitates fielded data annotation document. A key or
query content contribution of our system is the direct use
of the query for annotation process, in addition to
examining the content of the document. In other
collection of textual words, we are trying to prioritize the
annotation of documents towards generating attribute
values for attributes that are often used by content
querying users.
IV.

ALGORITHM IMPLEMENTATION

4.1 INFORMATION EXTRACTION :


253

www.ijete.org

International Journal of Emerging Technologies and Engineering (IJETE)


Vo lu me 1 Issue 10, November 2014, ISSN 2348 8050

Information extraction (IE) systems:


Step 1: Find and understand limited relevant
of texts in annotation documents

parts

Step 2: Gather information from many pieces of textual


data or documents
Step3: Produce a structured representation of relevant
information from the document:
-relations (in the database sense), a.k.a.,
-a knowledge base
4.2 QV, CV COMPUTATION AND COMBINING
ALGORITHM:
Step 1: Enter the content queries as input for retrieving
document form the given attribute
Example: College='MMIT' and address='Lohgaon'
Step 2: Divide the quires and send to or pass it to database
for retrieving
Step 3: Check all annotation document related results and
show the related result from the database to user
Step 4: The efficient and accurate and correct result, users
should try to enter maximum queries they can
V.

SYSTEM ARCHITECTURE

We describe the actual working of the system.

This system is very useful for users. This system describes


the Document Annotation Using Content & Querying for
based on online and offline system. It is also useful for
publisher or author of annotation document. Show the
following module for system.
Modules:
1. User or Publisher Registration
2. User or Publisher Login
3. Document Upload By Publisher (Author)
4. Content & Querying Search Techniques
5. Get (Show) Result
Modules Description
User or Publisher Registration:
In this module Publisher (Creator) or User
have to register first, then only registered user or publisher
has to access the data base from the system.
User or Publisher Login:
In this module registered person has login to
database for purpose of authentication and then entered in
the system as user or publisher.
Document Upload By Publisher (Author) :
In this module publisher uploads an unstructured
document as file(along with meta data) into system
database, with the help of this metadata and its Document
Annotation Using Content , the end user has to download
the file on the system . User or publisher has to enter
content/query for download the file.

Fig1.1 System Architecture


254
www.ijete.org

International Journal of Emerging Technologies and Engineering (IJETE)


Vo lu me 1 Issue 10, November 2014, ISSN 2348 8050

Content & Querying Search Techniques:


Here we are using two techniques for
searching the annotation document first one is Content
Search, second one is Query Search. In the content search
document will be downloaded by giving the content
search which is present in the corresponding annotation
document. If its search result present the corresponding
document will be downloaded, otherwise it is not
downloaded. And second one is query search that the
document will downloaded by using simple query which
has present in the base paper .if the result is matches the
document then this document will be downloaded
otherwise it rejected.
Show Result and Download Document:
The User or publisher has to download the document
using query search /content search values which have
given in the base paper or the database .user or publisher
enters the correct data in the text boxes, if its correct or
matches it will download the document file. Otherwise it
is rejected.

VI

CONCLUSION

We presented in this paper the two ways to combine these


two techniques of evidence, content value search and
querying value search. The main use of our system is
mainly that when users of author perform query based
search, they could get minimum and distinct accurate
results where it could be easy for retrieval data from the
database. By using these techniques two techniques,
workload of system can reduce by large amount. And it
Also, given the fact the efficiency of searching annotation
document will be faster because of using the query-based
searching technique or content value searching .
Query-based searching will be the future in
information retrieval from the database as this searching
techniques may be applied on other file formats like
.docx,.pdf,.xml etc which can give users better, faster and
accurate results produce and will also increase the
performance of system or application . This system gets
the good result for searching of database.

REFERENCES
[1] Eduardo J. Ruiz, Vangelis Hristidis, Panagiotis
G. Ipeirotis, Facilitating Document Annotation using
Content and Querying Value, ieee transactions on
knowledge and data engineering vol.pp no.99 2013.
[2] Akshay Shingote, Nikhil Vispute, Priyanka Dhikale,
Facilitating Document Annotation Using Content &
Querying Value, International Journal of Computer
Trends and Technology (IJCTT) volume 9 number 4
Mar 2014
[3]http://www.w3.org , Annotating Document
[4] http://www.stanford.edu/, Information Extraction and
Named Entity Recognition, Stanford University
[5] Vagelis Hristidis, Eduardo Ruiz, CADS: A
Collaborative Adaptive Data Sharing Platform, School of
Computing
and
Information
Sciences, Florida
International University.
[6] S. R. Jeffery, M. J. Franklin, and A. Y. Halevy, Payas-you-go user feedback for dataspace systems, in ACM
SIGMOD, 2008
[7] J. Madhavan and et al., Web-scale data integration:
You can only afford to pay as you go, in CIDR, 2007.

255
www.ijete.org

Você também pode gostar