A SEMINAR REPORT
Submitted by
AANCHAL GARG
BACHELOR OF ENGINEERING
IN
At
ABSTRACT
Today's search engines are equipped with specialized agents known as Web crawlers, which download content from the Web. These contents are then analyzed, indexed and made available to users. Crawlers interact with thousands of Web servers over periods extending from a few weeks to several years. A crawling process of this kind must therefore satisfy certain criteria, such as robustness and extensibility. This report provides details of the various crawling strategies and crawling policies.
ABSTRACT
ACKNOWLEDGMENT
1. WEB CRAWLER
1.1 Introduction
1.2 Prerequisites of a Crawling System
1.3 General Crawling Strategies
1.4 Crawling Policies
1.4.1 Selection Policy
1.4.2 Re-Visit Policy
1.4.3 Politeness Policy
1.4.4 Parallelization Policy
4. REFERENCES
ACKNOWLEDGMENT
With great pleasure and pride, I take this opportunity to express my gratitude and thanks
to my respected guide and teacher, Mr. Lalit Goel, who has been a continuous
source of inspiration and without whose help the completion of this report would
have been impossible.
AANCHAL GARG
061002