
INT. J. NEW. INN., 2012, 1(1), 105-110

ISSN: 2277-4459

A REVIEW OF WEB-CRAWLER AND P2P OVERLAY NETWORKS


Praveen Kumar1, Arpit Kumar2
1 Department of Computer Science & Engineering, Govt. Polytechnic, Lisana (Rewari), Distt. Rewari - 123401, Haryana, India. E-mail: praveensatija@gmail.com
2 Department of Computer Science & Engineering, D.A.V. College of Engineering & Technology, Kanina, Mohindergarh - 123027, Haryana, India. E-mail: arpitrewari@gmail.com

ABSTRACT
Web crawlers are the heart of search engines. They continuously crawl the web to discover newly added pages as well as pages that have been removed. Due to the growing and dynamic nature of the web, it has become a challenge to traverse all the URLs found in web documents and to handle them efficiently. In this paper, we review the challenges and issues faced in using a single type of crawler. We also explore the concept of exploiting network proximity in peer-to-peer overlay networks [11]. Peer-to-peer overlay networks offer a novel platform for a variety of scalable and decentralized distributed applications. These systems provide efficient and fault-tolerant routing, object location and load balancing within a self-organizing overlay network.

KEYWORDS: Crawler; URI; P2P; HTML; HTTP

1. INTRODUCTION
The World Wide Web [1] is a global, read-write information space. Text documents, images, multimedia and many other items of information, referred to as resources, are identified by short, unique, global identifiers called Uniform Resource Identifiers (URIs) so that each can be found, accessed and cross-referenced in the simplest possible way. The web is a client-server architecture that allows a user to initiate a search by providing a keyword and some optional additional information to a search engine, which in turn collects and returns the required web pages from the Internet. Web crawlers are software programs that traverse the World Wide Web information space by following the hypertext links extracted from hypertext documents. Since a crawler identifies a document by its URL, it picks up a seed URL and downloads the corresponding robots.txt file, which contains download permissions and information about the files that the crawler should exclude. On the basis of the host protocol, it then downloads the document, whether from the same machine or from one on the other side of the world. When a user accesses a web page through its URL, the document is transferred to the client machine using the Hyper Text Transfer Protocol (HTTP). The browser interprets the document and makes it available to the user, who follows the links in the presented page to access other pages. A crawler often has to download hundreds of millions of pages in a short period of time and has to constantly monitor and refresh the downloaded pages. In addition, the crawler should avoid putting too much pressure on the visited web sites and on its own local network, because these are intrinsically shared resources.

Web crawlers utilize the graph structure of the web to move from one page to another. A crawler stores a web page and then extracts the URLs appearing in that page. The same process is repeated for all web pages whose URLs have been extracted from earlier pages. For this, a queue data structure is used: all encountered URLs are put in the queue, and the process is repeated until the queue is empty or the crawler decides to stop. The key purpose of designing web crawlers is to retrieve web pages and add them to a local repository.
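The queue-driven traversal just described can be sketched in a few lines of code. The following is a minimal Python sketch using only the standard library; the seed URL, user-agent string, page limit and the restriction of the crawl to the seed host are illustrative assumptions of ours, not details specified in this paper.

```python
# Minimal sketch of a queue-based (breadth-first) crawler, assuming an
# illustrative seed URL, user agent and page limit (not from the paper).
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags found in an HTML page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_url, max_pages=50, user_agent="ReviewCrawler"):
    """Dequeue a URL, fetch it over HTTP, store it in the local repository,
    extract its links, and enqueue any URLs not seen before."""
    # Honour the robots.txt of the seed host before fetching anything.
    robots = RobotFileParser()
    robots.set_url(urljoin(seed_url, "/robots.txt"))
    robots.read()
    seed_host = urlparse(seed_url).netloc  # stay on one host so robots.txt applies

    queue = deque([seed_url])   # frontier of URLs still to visit
    seen = {seed_url}           # avoids enqueuing the same page twice
    repository = {}             # local repository: URL -> HTML text

    while queue and len(repository) < max_pages:
        url = queue.popleft()
        if not robots.can_fetch(user_agent, url):
            continue            # excluded by the site's robots.txt
        try:
            with urlopen(url, timeout=10) as response:
                page = response.read().decode("utf-8", errors="replace")
        except OSError:
            continue            # skip unreachable or broken pages
        repository[url] = page

        extractor = LinkExtractor()
        extractor.feed(page)
        for link in extractor.links:
            absolute = urljoin(url, link)
            parts = urlparse(absolute)
            if parts.scheme in ("http", "https") and parts.netloc == seed_host \
                    and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)

    return repository


if __name__ == "__main__":
    pages = crawl("https://example.com/")
    print(f"Crawled {len(pages)} pages")
```

A production crawler would additionally spread requests over time per host, refresh stale pages, and check robots.txt separately for every host it visits; the sketch only illustrates the seed-URL, robots.txt and queue mechanics described above.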

