INTRODUCTION
The deep Web (also called the Deepnet, the invisible Web, or the hidden Web, and sometimes loosely called the dark Web) refers to World Wide Web content that is not indexed by standard search engines.
Searching on the Internet today can be compared to dragging a net across the surface of the ocean.
Surface Web
The surface Web (also known as the visible Web or indexable Web) is that portion of the World Wide Web that is indexed by conventional search engines.
Search engines construct a database of the Web by using programs called spiders or Web crawlers that begin with a list of known Web pages.
The surface Web contains an estimated 2.5 billion documents, growing at a rate of 7.5 million documents per day.
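The crawling process described above can be sketched in a few lines. This is a minimal illustration, not how any particular search engine works: the `TOY_WEB` dictionary, its URLs, and the `crawl` function are hypothetical stand-ins, and pages are read from memory rather than fetched over HTTP.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical in-memory "Web" (URL -> HTML); a real spider would download pages.
TOY_WEB = {
    "/home": '<a href="/about">About</a> <a href="/news">News</a>',
    "/about": '<a href="/home">Home</a>',
    "/news": '<a href="/news/1">Story</a> <a href="/missing">Broken</a>',
    "/news/1": "<p>No outgoing links.</p>",
    "/orphan": "<p>Nothing links here, so the spider never sees it.</p>",
}


def crawl(seeds, fetch=TOY_WEB.get):
    """Breadth-first crawl from a list of known pages; returns URLs in visit order."""
    seen = set(seeds)
    queue = deque(seeds)
    indexed = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # dead link: nothing to index
            continue
        indexed.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:  # queue each newly discovered page once
                seen.add(link)
                queue.append(link)
    return indexed


print(crawl(["/home"]))  # ['/home', '/about', '/news', '/news/1']
```

Note that `/orphan` never appears in the result: a spider only reaches pages that some already-known page links to.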
HISTORY
In the earliest days of the Web, there were relatively few documents and sites. Database technology was first introduced to the Internet through Bluestone's Sapphire/Web, which was bought by HP and later Oracle. Los Alamos National Laboratory (LANL) founded Innovative Web Applications in 1996. Finally, the first "deep web" application in the federal government was deployed in February 1999.
A wealth of information remains deep and is therefore missed. The reason is that most of the Web's information is buried in dynamically generated sites, which standard search engines never find. Traditional search engines create their indices by spidering, or crawling, static surface Web pages. Deep Web sources, by contrast, store their content in searchable databases that produce results only in response to a direct query.
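The difference between static pages and database-backed content can be made concrete with a small sketch. Everything here is an assumption for illustration: the `papers` table, the `STATIC_PAGES` dictionary, and the function names are hypothetical, and an in-memory SQLite database stands in for a real site's back end.

```python
import re
import sqlite3


def build_site():
    """Creates the database behind a hypothetical search form."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE papers (id INTEGER PRIMARY KEY, title TEXT)")
    db.executemany(
        "INSERT INTO papers (title) VALUES (?)",
        [
            ("Crawling the Hidden Web",),
            ("The Deep Web: Surfacing Hidden Value",),
            ("Exploring a Deep Web",),
        ],
    )
    return db


# The only pages that exist as static HTML: a home page and a search form.
STATIC_PAGES = {
    "/": '<a href="/search">Search our catalogue</a>',
    "/search": '<form action="/results"><input name="q"></form>',
}


def results_page(db, query):
    """Dynamically generated page: it exists only in response to a query."""
    rows = db.execute(
        "SELECT title FROM papers WHERE title LIKE ? ORDER BY id",
        (f"%{query}%",),
    ).fetchall()
    return "\n".join(f"<li>{title}</li>" for (title,) in rows)


def link_crawl(start="/"):
    """Follows <a href> links only and never submits forms."""
    seen, stack, text = set(), [start], []
    while stack:
        url = stack.pop()
        if url in seen or url not in STATIC_PAGES:
            continue
        seen.add(url)
        html = STATIC_PAGES[url]
        text.append(html)
        stack.extend(re.findall(r'<a href="([^"]+)"', html))
    return " ".join(text)


db = build_site()
print("Hidden" in link_crawl())    # False: the crawler saw only the form
print(results_page(db, "Hidden"))  # two <li> rows, produced on demand
```

The link-following crawler indexes the form itself but none of the records behind it, which is the sense in which such content is "deep".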
CHARACTERISTICS
The deep Web is massive, approximately 500 times larger than the content visible to conventional search engines, with much higher quality throughout. It is fast and economical to search, and it provides in-depth knowledge.
Invisible Web search engines are built to construct queries that connect with dynamic content in real time in order to obtain current information. They focus on searching pre-selected data sources.
FIGURE 1.2: Harvesting the Deep and Surface Web with a Directed Query Engine
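One way to picture a directed query engine is as a loop over pre-selected sources, each with its own query format. The sketch below is only an illustration under stated assumptions: the `Source` class, the two sample sources, their field names, and their records are all hypothetical, and local functions stand in for live requests to real databases.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Source:
    name: str
    # Translates the user's terms into this source's own form fields.
    build_query: Callable[[str], Dict[str, str]]
    # Stand-in for a live request to the source's search interface.
    search: Callable[[Dict[str, str]], List[str]]


def make_backend(records: List[str], field: str):
    """Builds a fake searchable database that answers one form field."""
    def search(params: Dict[str, str]) -> List[str]:
        term = params[field].lower()
        return [r for r in records if term in r.lower()]
    return search


# Hypothetical pre-selected sources; each expects a different parameter name.
SOURCES = [
    Source("patents", lambda q: {"keyword": q},
           make_backend(["Solar cell coating", "Wind turbine blade"], "keyword")),
    Source("journals", lambda q: {"title_contains": q},
           make_backend(["Solar flare forecasting", "Deep sea vents"], "title_contains")),
]


def directed_query(terms: str, sources=SOURCES) -> List[Tuple[str, str]]:
    """Queries every source at request time and merges the hits."""
    results = []
    for source in sources:
        params = source.build_query(terms)  # source-specific query
        for hit in source.search(params):   # results are current, not pre-indexed
            results.append((source.name, hit))
    return results


print(directed_query("solar"))
# [('patents', 'Solar cell coating'), ('journals', 'Solar flare forecasting')]
```

Nothing is indexed ahead of time in this design: each search is sent to the sources when the user asks, which is why the results reflect their current content.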
Throughout this study, we have avoided the term "invisible Web" because it is inaccurate.
The only thing "invisible" about searchable databases is that they are not indexable.
Using BrightPlanet technology, they are totally "visible" to those who need to access them.
BrightPlanet's technology is uniquely suited to tap the deep Web and bring its results to the surface.
TECHNICAL CHALLENGES
Some Applications
1. Deep Web as a Search Engine
When you're searching the Web for what you need, you're missing about 90 percent of all the information on the Web if you aren't using deep Web search engines. Deep Web search engines let us run specific searches against sites whose stored data can't easily be spidered by Google or any other surface Web search engine.
The concept of the deep Web is becoming more complex as search engines such as Google have found ways to integrate deep Web content into their central search function. However, even a search engine as far-reaching as Google provides access to only a very small part of the deep Web.
DeepWebTech
DeepPeep
Complete Planet
Infomine
CONCLUSION
The deep Web thus appears to be a critical source when it is imperative to find a "needle in a haystack." It is likely to play a major role in the future of the search engine industry, and it is a rich and vast source of information for any seeker.
REFERENCES
Wright, Alex (2009-02-22). "Exploring a 'Deep Web' That Google Can't Grasp". New York Times. http://www.nytimes.com/2009/02/23/technology/internet/23search.html?th&emc=th. Retrieved 2009-02-23.
Bergman, Michael K. (August 2001). "The Deep Web: Surfacing Hidden Value". The Journal of Electronic Publishing 7 (1). doi:10.3998/3336451.0007.104. http://quod.lib.umich.edu/cgi/t/text/text-idx?c=jep;view=text;rgn=main;idno=3336451.0007.104
Raghavan, Sriram; Hector Garcia-Molina (2000) (PDF). Crawling the Hidden Web. Stanford Digital Libraries Technical Report. http://ilpubs.stanford.edu:8090/456/1/2000-36.pdf
Barbosa, Luciano; Juliana Freire (2007). An Adaptive Crawler for Locating Hidden-Web Entry Points. WWW Conference 2007.