Engineering Challenges in Vertical Search Engines
Aleksandar Bradic, Senior Director,
Engineering and R&D, Vast.com
+
Introduction
Vertical Search
Search focused on vertical data
Vertical Data – data inherently described by its structure:
Items/Properties for sale (Automotive, Real Estate..)
We’re hiring! :)
+
Challenges in Vertical Search Engines
Web Data Retrieval
Unstructured Data
Vertical Search
Data Analytics
Computational Advertising
+
Web Data Retrieval
Crawler Architecture
Queue Management
Crawl Ordering Policies
Duplicate URL Detection
Content Hash Management
Politeness Management
Coverage Measurement
Freshness Optimization
Incremental Crawling
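Two of the items above, Duplicate URL Detection and Content Hash Management, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the architecture described in the talk: the names `normalize_url`, `content_hash` and `SeenSet` are hypothetical, the normalization rules are a small subset of what a production crawler would apply, and the hash only catches exact duplicates (after whitespace collapsing), not near-duplicates.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit

# Default ports that can be dropped without changing the resource.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Canonicalize a URL so trivially different spellings compare equal."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # Keep the port only when it differs from the scheme's default.
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    path = parts.path or "/"
    # The fragment is dropped: it is never sent to the server.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def content_hash(body: bytes) -> str:
    """Hash the page body after collapsing runs of ASCII whitespace."""
    collapsed = b" ".join(body.split())
    return hashlib.sha256(collapsed).hexdigest()


class SeenSet:
    """In-memory record of URLs and page bodies already processed."""

    def __init__(self) -> None:
        self.urls: set[str] = set()
        self.hashes: set[str] = set()

    def is_new_url(self, url: str) -> bool:
        key = normalize_url(url)
        if key in self.urls:
            return False
        self.urls.add(key)
        return True

    def is_new_content(self, body: bytes) -> bool:
        digest = content_hash(body)
        if digest in self.hashes:
            return False
        self.hashes.add(digest)
        return True
```

At crawl scale the two in-memory sets would likely be replaced by a disk-backed or probabilistic structure; the sketch only shows where the two checks sit.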
+
Web Data Retrieval
Adversarial Crawling
Web Spam Detection
Cloaked Content Detection
+
Unstructured Data
Faceted Search
fac-et (fas’it) :
1. One of the flat polished surfaces cut on a gemstone or occurring
naturally on a crystal.
2. One of numerous aspects, as of a subject.
Learning to rank
Inventory/Market Trends
Price Prediction
Challenges:
“Good Deal” detection
Recommendation Systems for Vertical Data with no explicit user
feedback
Accuracy of Automatic Valuation Models
Data-driven feature design
Click Prediction
User Behavior Modeling
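Faceted Search, listed at the top of this slide, can be made concrete with a small sketch: each structured listing exposes fields ("facets"), the user narrows results by fixing some of them, and the engine reports value counts for the rest. The listings, field names and helper functions below are hypothetical and chosen only for illustration.

```python
from collections import Counter


def apply_filters(items, filters):
    """Keep the items whose fields match every (field, value) filter."""
    return [
        item for item in items
        if all(item.get(field) == value for field, value in filters.items())
    ]


def facet_counts(items, facet_fields):
    """Count how many items carry each value of each facet field."""
    counts = {field: Counter() for field in facet_fields}
    for item in items:
        for field in facet_fields:
            value = item.get(field)
            if value is not None:
                counts[field][value] += 1
    return counts


# Hypothetical automotive listings (illustrative data only).
listings = [
    {"make": "Honda", "year": 2010, "body": "sedan"},
    {"make": "Honda", "year": 2012, "body": "coupe"},
    {"make": "Ford", "year": 2010, "body": "sedan"},
]

# Narrow to sedans, then show the remaining choices for make and year.
sedans = apply_filters(listings, {"body": "sedan"})
sedan_facets = facet_counts(sedans, ["make", "year"])
```

A real engine would compute these counts inside the index rather than by scanning result lists; the scan is kept here because it makes the definition easy to follow.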
+
Computational Advertising
[Figure: page layout showing ads and search results]
+
Computational Advertising
[Figure: page layout with open ad slots, each labeled "ad ?"]
+
Computational Advertising
Central challenge:
Find the “best match” between a given user in a given context
and a suitable advertisement
“best match” – maximizing the value for:
Users
Advertisers
Publishers
Each of the parties has a different set of utilities:
Users want relevance
P(click) = ?
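One common way to put a number on P(click) is a logistic model over features of the user, context and ad. The sketch below assumes that approach purely for illustration; the slide does not say which model is used, and the feature names, weights and bias are made up rather than fitted to any data.

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def p_click(features: dict, weights: dict, bias: float) -> float:
    """Logistic model: P(click) = sigmoid(bias + sum_i w_i * x_i)."""
    score = bias + sum(
        weights.get(name, 0.0) * value for name, value in features.items()
    )
    return sigmoid(score)


# Hypothetical weights (in practice these would be learned from click logs).
weights = {"query_ad_similarity": 2.0, "slot_position": -0.5}
bias = -3.0

top_slot = p_click({"query_ad_similarity": 0.8, "slot_position": 1}, weights, bias)
low_slot = p_click({"query_ad_similarity": 0.8, "slot_position": 4}, weights, bias)
```

With these made-up weights the same ad gets a lower predicted click probability in a lower slot, which is the kind of position effect such models are typically asked to capture.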
+
Computational Advertising
Analytical Apparatus:
Regression Analysis (linear, logistic, probit, high-dimensional methods)
Game Theory (Nash Equilibria, dominant strategy)
Auction Theory (Vickrey, GSP, VCG…)
Graph Theory (random walks on graphs, graph matching, etc.)
Information Retrieval Techniques (similarity metrics, etc.)
…
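As an illustration of the auction-theory item above, here is a minimal sketch of generalized second-price (GSP) pricing: advertisers are ranked by bid, and the advertiser in each slot pays the bid of the advertiser ranked immediately below. The simplifications are assumptions of this sketch, not claims about any production system: ranking uses the bid alone (no quality or click-probability weighting), the reserve price is zero, and `gsp_allocate` is a hypothetical name.

```python
def gsp_allocate(bids: dict, num_slots: int) -> list:
    """Assign ad slots by descending bid and price each slot GSP-style.

    bids maps advertiser -> bid per click.
    Returns [(advertiser, price_per_click), ...] from the top slot down.
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    results = []
    for i in range(min(num_slots, len(ranked))):
        advertiser, _own_bid = ranked[i]
        # Price is the next-highest bid, or 0.0 (the assumed reserve)
        # when no lower-ranked bidder exists.
        next_bid = ranked[i + 1][1] if i + 1 < len(ranked) else 0.0
        results.append((advertiser, next_bid))
    return results


# Hypothetical bids in currency units per click.
allocation = gsp_allocate({"a": 3.0, "b": 2.0, "c": 1.0}, num_slots=2)
```

With a single slot this reduces to the Vickrey (second-price) auction, which is why the two are usually discussed together.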
+
Conclusion