Corona

Corona: A High Performance Publish-Subscribe System for the World Wide Web
Venugopalan Ramasubramanian Ryan Peterson Emin Gün Sirer

Cornell University, Ithaca, NY 14853
{ramasv,ryanp,egs}@cs.cornell.edu
Abstract

Despite the abundance of frequently changing information, the Web lacks a publish-subscribe interface for delivering updates to clients. The use of naïve polling for detecting updates leads to poor performance and limited scalability as clients do not detect updates quickly and servers face high loads imposed by active polling. This paper describes a novel publish-subscribe system for the Web called Corona, which provides high performance and scalability through optimal resource allocation. Users register interest in Web pages through existing instant messaging services. Corona monitors the subscribed Web pages, detects updates efficiently by allocating polling load among cooperating peers, and disseminates updates quickly to users. Allocation of resources for polling is driven by a distributed optimization engine that achieves the best update performance without exceeding load limits on content servers. Large-scale simulations and measurements from PlanetLab deployment demonstrate that Corona achieves orders of magnitude improvement in update performance at a modest cost.

1 Introduction

Even though Web content changes rapidly, existing Web protocols do not provide a mechanism for automatically notifying users of updates. The growing popularity of frequently updated content, such as Weblogs, collaboratively authored Web pages (wikis), and news sites, motivates a publish-subscribe mechanism that can deliver updates to users quickly and efficiently. This need for asynchronous update notification has led to the emergence of micronews syndication tools based on naïve, repeated polling. The wide acceptance of such micronews syndication tools indicates that backwards compatibility with existing Web tools and protocols is critical for rapid adoption.

However, publish-subscribe through uncoordinated polling, similar to the current micronews syndication, suffers from poor performance and scalability. Subscribers do not receive updates quickly due to the fundamental limit posed by the polling period, and are tempted to poll at faster rates in order to detect updates quickly. Consequently, content providers have to handle the high bandwidth load imposed by subscribers, each polling independently and repeatedly for the same content. Moreover, such a workload tends to be “sticky;” that is, users subscribed to popular content tend not to unsubscribe after their interest diminishes, causing a large amount of wasted bandwidth. Existing micronews syndication systems provide ad hoc, stop-gap measures to alleviate these problems. Content providers currently impose hard rate-limits based on IP addresses, which render the system inoperable for users sharing an IP address, or they provide hints for when not to poll, which are discretionary and imprecise. The fundamental problem is that an architecture based on naïve, uncoordinated polling leads to ineffective use of server bandwidth.

This paper describes a novel, decentralized system for detecting and disseminating Web page updates. Our system, called Corona, provides a high-performance update notification service for the Web without requiring any changes to the existing infrastructure, such as Web servers. Corona enables users to subscribe for updates to any existing Web page or micronews feed, and delivers updates asynchronously. The key contribution that enables such a general and backwards-compatible system is a distributed, peer-to-peer, cooperative resource management framework that can determine the optimal resources to devote to polling data sources in order to meet system-wide goals.

The central resource-performance tradeoff in a publish-subscribe system in which publishers serve content only when polled involves bandwidth versus update latency. Clearly, polling data sources more frequently will enable the system to detect and disseminate updates earlier. Yet polling every data source constantly would place a large burden on publishers, congest the network, and potentially run afoul of server-imposed polling limits that would ban the system from monitoring the micronews feed or Web page. The goal of Corona, then, is to maximize the effective benefit of the aggregate bandwidth available to the system while remaining within server-imposed bandwidth limits.

Corona resolves the fundamental tradeoff between bandwidth and update latency by expressing it formally
as a mathematical optimization problem. The system then computes the optimal way to allocate bandwidth to monitored Web objects using a decentralized algorithm that works on top of a distributed peer-to-peer overlay. This allocation takes object popularity, update rate, content size, and internal system overhead stemming from accounting and dissemination of meta-information into account, and yields a polling schedule for different objects that will achieve global performance goals, subject to resource constraints. Corona can optimize the system for different performance goals and resource limits. In this paper, we examine two relevant goals: how to minimize update latency while ensuring that the average load on publishers is no more than what it would have been without Corona, and how to achieve a targeted update latency while minimizing bandwidth consumption. We also examine variants of these two main approaches where the load is more fairly balanced across objects.

The front-end client interface to Corona is through existing instant messaging (IM) services. Users subscribe for content by sending instant messages to a registered Corona IM handle, and receive update notifications asynchronously. Internally, Corona consists of a cloud of nodes that monitor the set of active feeds or Web pages called channels. The Corona resource allocation algorithm determines the number of nodes designated to monitor each channel. Cooperative polling ensures that the system can detect updates quickly while no single node exceeds server-designated limits on polling frequency. Each node dedicated to monitoring a channel has a copy of the latest version of the channel contents. A feed-specific difference engine determines whether detected changes are germane by filtering out superficial differences such as timestamps and advertisements, extracts the relevant portions that have changed, and distributes the delta-encoded changes to all internal nodes assigned to monitor the channel, which in turn distribute them to subscribed clients via IM.

We have implemented a prototype of Corona and deployed it on PlanetLab. Evaluation of this deployment shows that Corona achieves more than an order of magnitude improvement in update performance. In experiments parameterized by real RSS workload collected at Cornell [19] and spanning 60 PlanetLab nodes and involving 150,000 subscriptions for 7500 different channels, Corona clients see fresh updates in intervals of 45 seconds on average compared to legacy RSS clients, which see a mean update interval of 15 minutes. At all times during the experiment, Corona limits the total polling load on the content servers within the load imposed by the legacy RSS clients.

Overall, Corona is a new overlay-based publish-subscribe system for the Web that provides asynchronous notifications, fast update detection, and optimal bandwidth utilization. This paper makes three contributions: (i) it outlines the general design of a publish-subscribe system that does not require any changes to content sources, (ii) formalizes the tradeoffs as an optimization problem and presents a novel distributed numerical solution technique for determining the allocation of bandwidth that will achieve globally targeted goals while respecting resource limits, and (iii) presents results from extensive simulations and a live deployment that demonstrate that the system is practical.

The rest of the paper is organized as follows. The next section provides background on publish-subscribe systems and discusses other related work. Section 3 describes the architecture of Corona in detail. Implementation details are presented in Section 4 and experimental results based on simulations and deployment are described in Section 5. Finally, Section 6 summarizes our contributions and concludes.

2 Background and Related Work

Publish-subscribe systems have raised considerable interest in the research community over the years. In this section, we provide background on publish-subscribe based content distribution and summarize the current state of the art.

Publish-Subscribe Systems: The publish-subscribe paradigm consists of three components: publishers, who generate and feed the content into the system, subscribers, who specify content of their interest, and an infrastructure for matching subscriber interests with published content and delivering matched content to the subscribers. Based on the expressiveness of subscriber interests, pub-sub systems can be classified as topic-based or content-based. In topic-based systems, publishers and subscribers are connected together by pre-defined topics, called channels; content is published on well-advertised channels to which users subscribe to receive asynchronous updates. Content-based systems enable subscribers to express elaborate queries on the content and use sophisticated content filtering techniques to match subscriber interests with published content.

Prior research on pub-sub systems has primarily focused on the design and implementation of content filtering and event delivery mechanisms. Topic-based publish-subscribe systems have been built based on several decentralized mechanisms, such as group communication in Isis [13], shared object spaces in Linda [5], TSpace [36], and Java Spaces [16] and rendezvous points in TIBCO [35] and Herald [4]. Content-based publish-subscribe systems that use in-network content filtering and aggregation include SIENA [6], Gryphon [34], Elvin [32], and Astrolabe [37]. While the above publish-subscribe systems impose well-defined structures for the content, few systems have been proposed for semi-structured and unstructured content. YFilter [8], Quark [3], XTrie [7], and XTreeNet [11] are recent architectures for supporting complex content-based queries on semi-structured XML data. Conquer [21] and WebCQ [20] support unstructured Web content.

The fundamental drawback of the preceding publish-subscribe systems is their non-compatibility with the current Web architecture. They require substantial changes in the way publishers serve content, expect subscribers to learn sophisticated query languages, or propose to lay out middle-boxes in the core of the Internet. On the other hand, Corona interoperates with the current pull-based Web architecture, requires no changes to legacy Web servers, and provides an easy-to-use IM based interface to the users. Optimal resource management in Corona aimed at bounding network load insulates Web servers from high load during flash-crowds.

Micronews Systems: Micronews feeds are short descriptions of frequently updated information, such as news stories and blog updates, in XML-based formats such as RSS [30] and Atom [1]. They are accessed via HTTP through URLs and supported by client applications and browser plug-ins called feed readers, which check the contents of micronews feeds periodically and automatically on the user’s behalf and display the returned results. The micronews standards envision a publish-subscribe model of content dissemination and define XML tags such as cloud that tell clients how to receive asynchronous updates, as well as TTL, SkipHours, and SkipDays that inform clients when not to poll. Yet few content providers currently use the cloud tag to deliver asynchronous updates.

Recently, commercial services such as Bloglines, NewsGator, and Queoo have started disseminating micronews updates to users. Corona differs fundamentally from these commercial services, which use fragile centralized servers and relentless polling to detect updates. Corona is layered on a self-organizing overlay comprised of cooperative peers that share updates, judiciously determine the amount of bandwidth consumed by polling, and can provide strong bandwidth guarantees.

FeedTree [31] is a recently proposed system for disseminating micronews that also uses a structured overlay and shares updates between peers. FeedTree nodes perform cooperative update detection in order to reduce update dissemination latencies, and Corona shares the insight with FeedTree that cooperative polling can drastically reduce update latencies. FeedTree decides on the number of nodes to dedicate to polling each channel based on heuristics. Corona’s key contribution is the use of informed tradeoffs to optimal resource management. This principled approach enables Corona to provide the best update performance for its users, while ensuring that content servers are lightly loaded and do not get overwhelmed due to flash-crowds or sticky traffic.

CAM [26] and WIC [25] are two techniques for allocating bandwidth for polling data sources on a single node. Similar to Corona, they perform resource allocation using analytical models for the tradeoff and numerical techniques to find near-optimal allocations. However, these techniques are limited to a single node. Corona performs resource allocation in a decentralized, cooperative environment and targets globally optimal update performance.

Overlay Networks: Corona is layered on structured overlays and leverages the underlying structure to facilitate optimal resource management. Recent years have seen a large number of structured overlays that organize the network based on rings [33, 29, 39, 23], hyper-dimensional cubes [28], butterfly structures [22], de Bruijn graphs [17, 38], or skip-lists [14]. Corona is agnostic about the choice of the overlay and can be easily layered on any overlay with uniform node degree, including the ones listed here.

Corona’s approach to a peer-to-peer resource management problem has a similar flavor to that of Beehive [27], a structured replication framework that resolves space-time tradeoffs optimizations in structured overlays. Corona differs fundamentally from Beehive in three ways. First, the Beehive problem domain is limited to object replication in systems where objects have homogeneous popularity, size, and update rate properties, whereas Corona is designed for the Web environment where such properties can vary by several orders of magnitude between objects [10, 19]. Thus, Corona takes object characteristics into account during optimization. Second, the more complex optimization problem renders the Beehive solution technique, based on mathematical derivation, fundamentally unsuitable for the problem tackled by Corona. Hence, Corona employs a more general and sophisticated numerical algorithm to perform its optimizations. Finally, the resource-performance tradeoffs that arise in Corona are fundamentally different from the tradeoffs that Beehive addresses.

3 Corona

Corona (Cornell Online News Aggregator) is a topic-based publish-subscribe system for the Web. It provides asynchronous update notifications to clients while interoperating with the current pull-based architecture of the Web. URLs of Web content serve as topics or channels in Corona; users register their interest in some Web content by providing its URL and receive updates asynchronously about changes posted to that URL. Any Web object identifiable by a URL can be monitored with Corona. In the background, Corona checks for updates
on registered channels by cooperatively polling the content servers from geographically distributed nodes.

We envision Corona as an infrastructure service offered by a set of widely distributed nodes. These nodes may be all part of the same administrative domain, such as Akamai, or consist of server-class nodes contributed by participating institutions. By participating in Corona, institutions can significantly reduce the network bandwidth consumed by frequent redundant polling for content updates, as well as reduce the peak loads seen at content providers that they themselves may host. Corona nodes self-organize to form a structured overlay system. We use structured overlays to organize the system, as they provide decentralization, good failure resilience, and high scalability [33, 29, 39, 28, 9, 14, 17, 23, 24, 38]. Figure 1 illustrates the overall architecture of Corona.

Figure 1: Corona Architecture: Corona is a distributed publish-subscribe system for the Web. It detects Web updates by polling cooperatively and notifies clients through instant messaging.

The key feature that enables Corona to achieve fast update detection is cooperative polling. Corona assigns multiple nodes to periodically poll the same channel and shares updates detected by any polling node. In general, $n$ nodes polling with the same polling interval and randomly distributed polling times can detect updates $n$ times faster if they share updates with each other. While it is tempting to take the maximum advantage of cooperative polling by having every Corona node poll for every feed, such a naïve approach is clearly unscalable and would impose substantial network load on both Corona and content servers.

Corona makes informed decisions on distributing polling tasks among nodes. The number of nodes that poll for each channel is determined based on an analysis of the fundamental tradeoff between update performance and network load. Corona poses this tradeoff as an optimization problem and obtains the optimal solution using Honeycomb, a light-weight toolkit for computing optimal performance-overhead tradeoffs in structured distributed systems. This principled approach enables Corona to efficiently resolve the tradeoff between performance and scalability.

In this section, we provide detailed descriptions of the components of Corona’s architecture, including the analytical models, optimization framework, update detection and notification mechanisms, and user interface.

3.1 Analytical Modeling

Our analysis-driven approach can be easily applied on any distributed system organized as a structured overlay with uniform node degree. In this paper, we describe Corona using Pastry [29] as the underlying substrate.

Pastry organizes the network into a ring by assigning identifiers from a circular numeric space to each node. The identifiers are treated as a sequence of digits of base $b$. In addition to neighbors along the ring, each node maintains contact with nodes that have matching prefix digits. These long-distance contacts are represented in a tabular structure called a routing table. The entry in the $i$-th row and $j$-th column of a node’s routing table points to a node whose identifier shares $i$ prefix digits and whose $(i+1)$-th digit is $j$. Essentially, the routing table defines a directed acyclic graph (DAG) rooted at each node, enabling a node to reach any other node in $\log_b N$ hops.

Corona assigns nodes in well-defined wedges of the Pastry ring for polling each channel. Each channel is assigned an identifier from the same circular numeric space. A wedge is a set of nodes sharing a common number of prefix digits with a channel’s identifier. A channel with polling level $l$ is polled by all nodes with at least $l$ matching prefix digits in their identifiers. Thus, polling level 0 indicates that all the nodes in the system poll for the channel. Figure 2 illustrates the concept of polling levels in Corona.

Figure 2: Cooperative Polling in Corona: Each channel is assigned a wedge of nodes to poll the content servers and detect updates. Corona determines the optimal wedge size for each channel through analysis of the global performance-overhead tradeoff.

Assigning well-defined portions of the ring to each channel enables Corona to manage polling efficiently with little overhead. The set of nodes polling for a channel can be represented by just a single number, the
polling level, eliminating the expensive $O(n)$ complexity for managing state about cooperating nodes. Moreover, this also facilitates efficient update sharing, as a wedge is a subset of the DAG rooted at each node, and all the nodes in a wedge can be reached quickly using the contacts in the routing table.

The polling level of a channel quantifies its performance-overhead tradeoff. A channel at level $l$ has, on average, $\frac{N}{b^l}$ nodes polling it, which can cooperatively detect updates in $\frac{\tau}{2}\frac{b^l}{N}$ time on average, where $\tau$ is the polling interval. We estimate the average update detection time at a single node polling periodically at an interval $\tau$ to be $\frac{\tau}{2}$. Simultaneously, the collective load placed on the content server of the channel is $\frac{N}{\tau b^l}$. Note that we do not include the propagation delay for sharing updates in this analysis because updates can be detected by comparing against any old version of the content. Hence, even if an update detected at a different node in the system is received late, the time to detect the next update at the current node does not change.

An easy way to set polling levels is to independently choose a level for each channel based on these estimates. However, such an approach involves investigating heuristics for determining the appropriate performance requirement for each channel and for dividing the total load between different channels. Moreover, it does not provide fine-grained control over the performance of the system, often causing it to operate far from optimally. The rest of this section describes how the tradeoffs can be posed as mathematical optimization problems to achieve different performance requirements.

Corona-Lite: The first performance goal we explore is minimizing the average update detection time while bounding the total network load placed on content servers. Corona-Lite improves the update performance seen by the clients while ensuring that the content servers handle a light load: no more than they would handle from the clients if the clients fetched the objects directly from the servers.

The optimization problem for Corona-Lite is defined in Table 1. The overall update performance is measured by taking an average of the update detection time of each channel weighted by the number of clients subscribed to the channels. We weigh the average using the number of subscriptions because update performance is an end user experience, and each client counts as a separate unit in the average. The target network load for this optimization is simply the total number of subscriptions in the system.

Corona-Lite clients experience the maximum benefits of cooperation. Clients of popular channels gain greater benefits than clients of less popular channels. Yet, Corona-Lite does not suffer from “diminishing returns,” using its surplus polling capacity on less popular channels where the extra bandwidth yields higher marginal benefit. Since improvement in update performance is inversely proportional to the number of polling nodes, a naïve heuristic-based scheme that assigns polling nodes in proportion to number of subscribers would clearly suffer from diminishing returns. Corona, on the other hand, distributes the surplus load to other, less popular channels, improving their update detection times and achieving a better global average.

Corona-Lite:
$$\min \sum_{i=1}^{M} q_i \frac{b^{l_i}}{N} \quad \text{s.t.} \quad \sum_{i=1}^{M} s_i \frac{N}{b^{l_i}} \le \sum_{i=1}^{M} q_i$$
Minimize average update detection time, while bounding the load placed on content servers.

Corona-Fast:
$$\min \sum_{i=1}^{M} s_i \frac{N}{b^{l_i}} \quad \text{s.t.} \quad \sum_{i=1}^{M} q_i \frac{b^{l_i}}{N} \le T \sum_{i=1}^{M} q_i$$
Achieve a target average update detection time, while minimizing the load placed on content servers.

Corona-Fair:
$$\min \sum_{i=1}^{M} q_i \frac{\tau}{u_i} \frac{b^{l_i}}{N} \quad \text{s.t.} \quad \sum_{i=1}^{M} s_i \frac{N}{b^{l_i}} \le \sum_{i=1}^{M} q_i$$
Minimize average update detection time w.r.t. expected update frequency, bounding load on content servers.

Corona-Fair-Sqrt:
$$\min \sum_{i=1}^{M} q_i \frac{\sqrt{\tau}}{\sqrt{u_i}} \frac{b^{l_i}}{N} \quad \text{s.t.} \quad \sum_{i=1}^{M} s_i \frac{N}{b^{l_i}} \le \sum_{i=1}^{M} q_i$$
Corona-Fair with sqrt weight on the latency ratio to emphasize infrequently changing channels.

Corona-Fair-Log:
$$\min \sum_{i=1}^{M} q_i \frac{\log \tau}{\log u_i} \frac{b^{l_i}}{N} \quad \text{s.t.} \quad \sum_{i=1}^{M} s_i \frac{N}{b^{l_i}} \le \sum_{i=1}^{M} q_i$$
Corona-Fair with log weight on the latency ratio to emphasize infrequently changing channels.

Notation
$\tau$    polling interval
$M$    number of channels
$N$    number of nodes
$b$    base of structured overlay
$T$    performance target
$l_i$    polling level of channel $i$
$q_i$    number of clients for channel $i$
$s_i$    content size for channel $i$
$u_i$    update interval for channel $i$

Table 1: Performance-Overhead Tradeoffs: This table summarizes the optimization problems for different performance goals in Corona.

Corona-Fast: While Corona-Lite bounds the network load on the content servers and minimizes update latency, the update performance it provides can vary depending on the current workload. Corona-Fast provides stable update performance, which can be maintained steadily at a
desired level through changes in the workload. Corona-Fast solves the converse of the previous optimization problem; that is, it minimizes the total network load on the content servers while meeting a target average update detection time. Corona-Fast enables us to tune the update performance of the system according to application needs. For example, a stock-tracker application may choose a target of 30 seconds to quickly detect changes to stock prices.

Corona-Fast shields legacy Web servers from sudden increases in load. A sharp increase in the number of subscribers for a channel does not trigger a corresponding increase in network load on the Web server since Corona-Fast does not increase polling after diminishing returns sets in. In contrast, in legacy RSS, popularity spikes cause a significant increase in network load on content providers. Moreover, the increased load typically continues unabated in legacy RSS as subscribers forget to unsubscribe, creating “sticky” traffic. Corona-Fast protects content servers from flash-crowds and sticky traffic.

Corona-Fair: Corona-Fast and Corona-Lite do not consider the actual rate of change of content in a channel. While some Web objects are updated every few minutes, others do not change for days at a time [10, 19]. Corona-Fair incorporates the update rate of channels into the performance tradeoff in order to achieve a fairer distribution of update performance between channels. It defines a modified update performance metric as the ratio of the update detection time and the update interval of the channel, which it minimizes to achieve a target load.

While the new metric accounts for the wide difference in update characteristics, it biases the performance unfavorably against channels with large update interval times. A channel that does not change for several days experiences long update detection times, even if there are many subscribers for the channel. We compensate for this bias by exploring other update performance metrics based on square root and logarithmic functions, which grow sub-linearly. A sub-linear metric dampens the tendency of the optimization algorithm to punish slow-changing yet popular feeds. Table 1 summarizes the optimization problems for different versions of Corona.

3.2 Decentralized Optimization

Corona determines the optimal polling levels using the Honeycomb optimization toolkit. Honeycomb provides numerical algorithms and decentralized mechanisms for solving optimization problems that can be expressed as follows:

$$\min \sum_{i=1}^{M} f_i(l_i) \quad \text{s.t.} \quad \sum_{i=1}^{M} g_i(l_i) \le T.$$

Here, $f_i(l)$ and $g_i(l)$ can define the performance or the cost for channel $i$ as a function of the polling level $l$.

The preceding optimization problem is NP-Hard, as the polling levels only take integral values. Hence, instead of using computationally intensive techniques to find the exact solution, Honeycomb finds an approximate solution quickly in time comparable to a sorting algorithm. Honeycomb’s optimization algorithm runs in $O(M \log M \log N)$ time.

The solution provided by Honeycomb is accurate and deviates from the optimal in at most one channel. Honeycomb achieves this accuracy by finding two solutions that optimize the problem with slightly altered constraints: one with a constraint $T_d \le T$ and another with constraint $T_u \ge T$. The corresponding solutions $L^*_u$ and $L^*_d$ are exactly optimal for the optimization problems with constraints $T_u$ and $T_d$ respectively, and differ in at most one channel. That is, one channel has a different polling level in $L^*_u$ than in $L^*_d$. Note that the optimal solution $L^*$ for the original problem with constraint $T$ may actually decide to allocate channels differently from $L^*_d$ and $L^*_u$. Yet, the minimum determined by $L^*$ will be bounded by the minima determined by $L^*_d$ and $L^*_u$ due to monotonicity. Honeycomb then chooses $L^*_d$ as the final solution because it satisfies the constraint $T$ strictly.

Honeycomb computes $L^*_d$ and $L^*_u$ using a Lagrange multiplier to transform the optimization problem as follows:

$$L^* = \arg\min \sum_{i=1}^{M} f_i(l_i) - \lambda \Big[ \sum_{i=1}^{M} g_i(l_i) - T \Big].$$

Honeycomb iterates over $\lambda$ and obtains the two solutions $L^*_d$ and $L^*_u$ that bracket the minimum using standard bracketing methods for function optimization in one dimension.

Two observations enable Honeycomb to speed up the optimization algorithm. First, $L^*(\lambda_0)$ for a single iteration can be computed by finding $\arg\min f_i(l_i) - \lambda_0 g_i(l_i)$ independently for each channel. This takes $O(M \log N)$ time as the number of levels is bounded by $\lceil \log N \rceil$. Second, for each channel there are only $\log N$ values of $\lambda$ that change $\arg\min f_i(l_i) - \lambda_0 g_i(l_i)$. Pre-computing these $\lambda$ values for each object provides a discrete iteration space of $M \log N$ $\lambda$ values. By keeping a sorted list of the $\lambda$ values, Honeycomb computes the optimal solution in $O(\log M)$ iterations. Overall, the run-time complexity of the optimization algorithm is $O(M \log M \log N)$ time, including the time spent in pre-computation, sorting, and iterations.

The preceding algorithm requires the tradeoff functions $f_i(l)$ and $g_i(l)$ of all channels in the system in order to compute the global optimum. Solving the optimization problem using limited data available locally can produce highly inaccurate solutions. On the other hand, collecting the tradeoff factors for all the channels
at each node is clearly expensive and impractical. It is possible to gather the tradeoff data at a central node, run the optimization algorithm at a single location, and then distribute the optimal levels to peers from the central location. We avoid using a centralized infrastructure as it introduces a single point of failure in the system and has limited scalability.

Instead, Honeycomb internally aggregates coarse-grained information about global tradeoff factors. It combines channels with similar tradeoff factors into a tradeoff cluster. Each cluster summarizes the tradeoff factors for multiple channels and provides coarse-grained tradeoff information. A ratio of performance and cost factors, $f_i/g_i$, is used as a metric to combine channels. For example, channels with comparable values for $\frac{q_i}{u_i s_i}$ are combined into a cluster in Corona-Fair.

of the content, and the interval at which servers update channel content. The latter is estimated based on the time between updates detected by Corona.

Corona manages cooperative polling through a periodic protocol consisting of an optimization phase, a maintenance phase, and an aggregation phase. In the optimization phase, Corona nodes apply the optimization algorithm on fine-grained tradeoff data for locally polled channels and coarse-grained tradeoff clusters obtained from overlay contacts. In the maintenance phase, changes to polling levels are communicated to peer nodes in the routing table through maintenance messages. Finally, the aggregation phase enables nodes to receive new aggregates of tradeoff factors. In practice, the three phases occur concurrently at a node with aggregation data piggy-backed on maintenance messages.
Honeycomb nodes periodically exchange the clusters Corona nodes operate independently and make deci-
with contacts in the routing table and aggregate the clus- sions to increase or decrease polling levels locally. Ini-
ters received from the contacts. Honeycomb keeps the tially, only the owner nodes at level K = dlog N e poll
overhead for cluster aggregation low by limiting the for the channels. If an owner decides to lower the polling
number of clusters for each polling level to a constant level to K − 1 (based on local optimization), it sends a
Tradeoff Bins. Each node receives Tradeoff Bins clus- message to the contacts in its routing table at row K−1 in
ters for every polling level from each contact in the rout- the next maintenance phase. As a result, a small wedge
ing table. Combined, these clusters summarize the trade- of level K − 1 nodes start polling for that channel. Sub-
off characteristics of all the channels in the system. The sequently, each of these nodes may independently decide
cluster aggregation overhead in terms of memory state as to further lower the polling level of that channel. Simi-
well as network bandwidth is limited by the size of the larly, if an owner node decides to raise the level from
routing table, and scales with the logarithm of the system K − 1 to K, it asks its contact in the K − 1 wedge to
size. stop polling.
In general, when a level i node lowers the level to i−1
3.3 System Management or raises the level from i − 1 back to i, it instructs its
Corona is a completely decentralized system, in which contact in row i − 1 of its routing table to start or stop
nodes act independently, share load, and achieve glob- polling for that channel. This control path closely fol-
ally optimal performance through mutual cooperation. lows the DAG rooted at the owner node. Nodes at level i
Corona spreads load uniformly among the nodes through (depth K − i) in this DAG decide whether their children
consistent-hashing [18]. Each channel in Corona has a at level i − 1 should poll a channel and convey these de-
unique identifier and one or more owner nodes managing cisions periodically every maintenance interval. When a
it. The identifier is a content-hash of the channel’s URL, node is instructed to begin polling for a channel, it waits
and the primary owner of a channel is the Corona node for a random interval of time between 0 and the polling
with the numerically closest identifier to the channel’s interval before the first poll. This ensures that polls for a
identifier. Corona adds additional owners for a channel channel at different nodes are randomly distributed over
in order to tolerate failures. These owners are the F clos- time.
est neighbors of the primary owner along the ring. In the Corona nodes gather current estimates of tradeoff fac-
event an owner fails, a new neighbor automatically re- tors in the aggregation phase. Owners monitor the num-
places it. ber of subscribers and send out fresh estimates along
Owners take responsibility for managing subscrip- with the maintenance message. Subsequent maintenance
tions, polling, and updates for a channel. Owners receive messages sent out by descendant nodes in the DAG prop-
subscriptions through the underlying overlay, which au- agate these estimates to all the nodes in the wedge. The
tomatically routes all subscription requests of a channel update interval and size of a feed only change during up-
to its owner. The owners keep state about the subscribers dates and are therefore sent along with updates. Tradeoff
of a channel and send notifications to them when fresh clusters are also sent by contacts in the routing table in
updates are detected. In addition, owners also keep track response to maintenance messages.
of channel-specific factors that affect the performance Corona inherits robustness and failure-resilience from
tradeoffs, namely the number of subscribers, the size the underlying structured overlay. If the current contact
in the routing table fails, the underlying overlay automatically replaces it with another contact. When new nodes join the system or nodes fail, Corona ensures the transfer of subscription state to new owners. A node that is no longer an owner simply erases its subscription state, and a node that becomes a new owner receives the state from other owners of the channel. Simultaneous failure of more than F adjacent nodes poses a problem for Corona, as well as to many other peer-to-peer systems; we assume that F is chosen to make such an occurrence rare. Note that clients can easily renew subscriptions should a catastrophic failure lose some subscription state.

Overall, Corona manages polling using light-weight mechanisms that impose a small, predictable overhead on the nodes and network. Its algorithms do not rely on expensive constructs such as consensus, leader election, or clock synchronization. Networking activity is limited to contacts in the nodes’ routing tables.

3.4 Update Dissemination

Updates are central to the operation of Corona; hence, we ensure that they are detected and disseminated efficiently. Corona uses monotonically increasing numbers to identify versions of content. The version numbers are based on content modification times whenever the content carries such a timestamp. For other channels, the primary owner node assigns version numbers in increasing order based on the updates it receives.

Corona nodes share updates only as deltas, the differences between old and new content, rather than the entire content. A measurement study on micronews feeds conducted at Cornell shows that the amount of change in content during an update is typically tiny. The study reports that the average update consists of 17 lines of XML, or 6.8% of the content size [19], which implies that a significant amount of bandwidth can be saved through delta-encoding.

A difference engine enables Corona to identify when a channel carries new information that needs to be disseminated to subscribed clients. The difference engine parses the HTML or XML content to discover the core content in the channel, ignoring frequently changing elements such as timestamps, counters, and advertisements. The difference engine generates a delta if it detects an update after isolating the core content. The data in a delta resembles the typical output of the POSIX ‘diff’ command: it carries the line numbers where the change occurs, the changed content, an indication of whether it is an addition, omission, or replacement, and a version number of the old content to compare against.

When a delta is generated by a node, it shares the update with all other nodes in the channel’s polling wedge. To achieve this, the node simply disseminates the delta along the DAG rooted at it up to a depth equal to the polling level of the channel. The dissemination along the DAG takes place using contacts in the routing table of the underlying overlay. For channels that cannot obtain a reliable modification timestamp from the server, the node detecting the update sends the delta to the primary owner, which assigns a new version number and initiates the dissemination to other nodes polling that channel. Two different nodes may detect a change “simultaneously” and send deltas to the primary owner. The primary owner always checks the current delta with the latest updated version of the content and ignores redundant deltas.

3.5 User Interface

Corona employs instant messaging (IM) as its user interface. Users add Corona as a “buddy” in their favorite instant messaging system; both subscriptions and update notifications are then communicated as instant messages between the users and Corona. Users send request messages of the form “subscribe url” and “unsubscribe url” to subscribe and unsubscribe for a channel. A subscribe or unsubscribe message delivered by the IM system to Corona is routed to all the owner nodes of the channel, which update their subscription state. When a new update is detected by Corona, the current primary owner sends an instant message with the delta to all the subscribers through the IM system. If a subscriber is off-line at the time an update is generated, the IM system buffers the update and delivers it when the subscriber subsequently joins the network.

Delivering updates through instant messaging systems incurs additional latency since messages are sent through a centralized service. However, the additional latency is modest as IM systems are designed to reduce such latencies during two-way communication. Moreover, IM systems that allow peer-to-peer communication between their users, such as Skype, deliver messages in quick time.

Instant messaging enables Corona to be easily accessible to a large user population, as no computer skills other than an ability to “chat” are required, and ubiquitous IM deployment ensures that hosts behind NATs and firewalls are supported. Moreover, instant messages also guarantee the authenticity of the source of update messages to the clients, as instant messaging systems pre-authenticate Corona as the source through password verification.

4 Implementation

We have implemented a prototype of Corona as an application layered on Pastry [29], a prefix-matching structured overlay system. The implementation uses a 160-bit SHA-1 hash function to generate identifiers for both the nodes (based on their IP addresses) and channels (based on their URLs). Both the base of Pastry and the number of tradeoff clusters per polling level are set to 16.
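
As a rough illustration of the identifier scheme described above, the following Python sketch derives 160-bit SHA-1 identifiers, counts matching base-16 prefix digits, and picks the numerically closest node on the ring. The helper names (`identifier`, `shared_prefix_digits`, `ring_distance`, `closest_node`) and the sample inputs are hypothetical and are not taken from the Corona or Pastry code; measuring "numerically closest" as wraparound distance on the ring is an assumption.

```python
import hashlib

ID_BITS = 160                 # SHA-1 output size in bits
RING_SIZE = 1 << ID_BITS      # identifiers live on a ring of 2^160 points


def identifier(key: str) -> str:
    """Return the SHA-1 digest of key as 40 hex (base-16) digits."""
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def shared_prefix_digits(a: str, b: str) -> int:
    """Count the leading base-16 digits two identifiers have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def ring_distance(a: str, b: str) -> int:
    """Shortest distance between two identifiers on the circular id space
    (assumed interpretation of 'numerically closest')."""
    diff = abs(int(a, 16) - int(b, 16))
    return min(diff, RING_SIZE - diff)


def closest_node(channel_id: str, node_ids: list[str]) -> str:
    """Pick the node whose identifier is closest to the channel identifier."""
    return min(node_ids, key=lambda n: ring_distance(channel_id, n))


# Hypothetical inputs: node ids from IP addresses, channel id from a URL.
nodes = [identifier(ip) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")]
channel = identifier("http://example.com/feed.xml")
owner = closest_node(channel, nodes)
print(owner, shared_prefix_digits(channel, owner))
```

With base 16, each hex digit of the digest corresponds to one prefix-matching digit, so a 160-bit identifier has 40 digits.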
Prefix matching overlays occasionally create orphans, that is, channels with no owners having enough matching prefix digits. Orphans are created when the required wedge of the identifier space, corresponding to level $\lceil \log N \rceil - 1$, is empty. Corona cannot assign additional nodes to poll an orphan channel to improve its update performance. Moreover, orphans adversely affect the computation of globally optimal allocation. Corona handles orphan channels by adjusting the tradeoffs appropriately. The tradeoff factors of orphan channels are aggregated into a slack cluster, which is used to adjust the performance target prior to optimization.

Corona’s reliance on IM as an asynchronous communication mechanism poses some operational challenges. Corona interacts with IM systems using GAIM [12], an open source instant messaging client for Unix-based platforms that supports multiple IM systems including Yahoo Instant Messenger, AOL Instant Messenger, and MSN Messenger. Several IM systems have a limitation that only one instance of a user can be logged on at a time, preventing all Corona nodes from being logged on simultaneously. While we hope that IM systems will support simultaneous logins from automated users such as Corona in the near future, as they have for several chat robots, our implementation uses a centralized server to talk to IM systems as a stop-gap measure. This server acts as an intermediary for all updates sent to clients as well as subscription messages sent by clients. Also, IM systems such as Yahoo rate-limit instant messages sent by unprivileged clients. Corona’s implementation limits the rate of updates sent to clients to avoid sending updates in bursts.

Corona trusts the nodes in the system to behave correctly and generate authentic updates. However, it is possible that in a collaborative deployment, where nodes under different administrative domains are part of the Corona network, some nodes may be malicious and generate spurious updates. This problem can be easily solved if content providers are willing to publish digitally signed certificates along with their content. An alternative solution that does not require changes to servers is to use threshold-cryptography to generate a certificate for content [40, 15]. The responsibility for generating partial signatures can be shared among the owners of a node ensuring that rogue nodes below the threshold level cannot corrupt the system. Designing and implementing such a threshold-cryptographic scheme is, however, beyond the scope of this paper.

5 Evaluation

We evaluate the performance of Corona through large-scale simulations and wide-area experiments on PlanetLab [2]. In all our evaluations, we compare the performance of Corona to the performance of legacy RSS, a widely-used micronews syndication system. The simulations and experiments are driven by real-life RSS traces.

We collected characteristics of micronews workloads and content by passively logging user activity and actively polling RSS feeds [19]. User activity recorded between March 22 and May 3 of 2005 at the gateway of the Cornell University Computer Science Department provided a workload of 158 clients making approximately 62,000 requests for 667 different feeds. The channel popularity closely follows a Zipf distribution with exponent 0.5. The survey analyzes the update rate of micronews content by actively polling approximately 100,000 RSS feeds obtained from syndic8.com. We poll these feeds at one hour intervals for 84 hours, and subsequently select a subset of 1000 feeds and poll them at a finer granularity of 10 minutes for five days. Comparing periodic snapshots of the feeds shows that the update interval of micronews content is widely distributed: about 10% of channels changed within an hour, while 50% of channels did not change at all during the five days of polling.

5.1 Simulations

We use tradeoff parameters based on the RSS survey in our simulations. In order to scale the workload to the larger scale of our simulations, we extrapolate the distribution of feed popularity from the workload traces and set the popularity to follow a Zipf distribution with exponent 0.5. We use a distribution for the update rates of channels obtained through active polling, setting the update interval of the channels that do not see any updates to one week.

We perform simulations for a system of 1024 nodes, 100,000 channels, and 5,000,000 subscriptions. We start each simulation with an empty state and issue all subscriptions at once before collecting performance data. We run the simulations for six hours with a polling interval of 30 minutes and maintenance interval of one hour. We study the performance of the three schemes, Corona-Lite, Corona-Fast, and Corona-Fair, and compare the performance with that of legacy RSS clients polling at the same rate of 30 minutes.

Corona-Lite

Figures 3 and 4 show the network load and update performance, respectively, for Corona-Lite, which minimizes average update detection time while bounding the total load on content servers. The figures plot the network load, in terms of the average bandwidth load placed on content servers, and update performance, in terms of the average update detection time. Figure 3 shows that Corona-Lite stabilizes at the load imposed by legacy RSS clients. Starting from a clean slate where only owner
nodes poll for each channel, Corona-Lite quickly converges to its target in two maintenance phases. The average load exceeds the target for a brief period before stabilization. This slight delay is due to nodes not having complete information about tradeoff factors of other channels in the system. However, the discrepancy is corrected automatically when aggregated global tradeoff factors are available to each node.

Figure 3: Network Load on Content Servers: Corona-Lite converges quickly to match the network load imposed by legacy RSS clients. [Plot: Network Load per Channel (kbps) vs. Time (hours); series: Legacy RSS, Corona Lite, Corona Fast.]

Figure 4: Average Update Detection Time: Corona-Lite provides 15-fold improvement in update detection time compared to legacy RSS clients for the same network load. [Plot: Update Detection Time (min) vs. Time (hours); series: Legacy RSS, Corona Lite, Corona Fast.]

At the same time, Figure 4 shows that Corona-Lite achieves an average update detection time of about one minute. The update performance of Corona-Lite represents an order of magnitude improvement over the average update detection time of 15 minutes provided by legacy RSS clients. This substantial difference in performance is achieved through judicious distribution of polling load between cooperating nodes, while imposing no more load on the servers than the legacy clients.

Figures 5 and 6 show the number of polling nodes assigned by Corona-Lite to different channels and the resulting distribution of update detection times. The x-axis shows channels in reverse order of popularity. We only plot 20,000 channels for clarity. The load imposed by legacy RSS is equal to the number of clients. For Corona-Lite, three levels of polling can be identified in Figure 5: channels clustered around 100 at level 1, channels with fewer than 10 clients at level 2, and orphan channels close to the X-axis with just one owner node polling them. The sharp change in the distribution after 75,000 channels indicates the point where the optimal solution changes polling levels.

Figure 5: Number of Pollers per Channel: Corona trades off network load from popular channels to decrease update detection time of less popular channels and achieve a lower system-wide average. [Plot: Number of Polling Nodes vs. Channel Rank by Popularity; series: Legacy RSS, Corona Lite.]

Figure 6: Update Detection Time per Channel: Popular channels gain greater decrease in update detection time than less popular channels. [Plot: Update Detection Time (sec) vs. Channel Rank by Popularity; series: Legacy RSS, Corona Lite.]

Figure 5 shows that Corona-Lite favors popular channels over unpopular ones when assigning polling levels. Yet, it significantly reduces the load on servers of popular content compared to legacy clients, which impose a highly skewed load on content servers and overload servers of popular content. Corona-Lite reduces the load of the over-loaded servers and transfers the extra load to servers of less popular content to improve update performance.
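
To see why moving pollers from popular channels to less popular ones lowers the system-wide average, consider a simple model (an assumption made here for illustration, not a formula quoted from this section): if n nodes poll a channel every τ seconds and their polls are evenly staggered, an update waits τ/(2n) seconds on average before it is detected. With τ = 30 minutes and a single poller this gives the 15-minute average of legacy RSS. The function name `detection_time` is hypothetical.

```python
def detection_time(poll_interval_s: float, pollers: int) -> float:
    """Average wait before an update is seen, assuming `pollers` nodes
    whose polls are evenly staggered across one polling interval."""
    if pollers < 1:
        raise ValueError("at least one poller is required")
    return poll_interval_s / (2 * pollers)


TAU = 30 * 60  # polling interval used in the simulations, in seconds

# One poller per channel: the legacy RSS baseline of 15 minutes.
baseline = detection_time(TAU, 1)                                 # 900.0

# Spending 100 extra pollers on one popular channel (100 -> 200 pollers)...
gain_popular = detection_time(TAU, 100) - detection_time(TAU, 200)   # 4.5

# ...versus giving one extra poller to a channel that had only one.
gain_unpopular = detection_time(TAU, 1) - detection_time(TAU, 2)     # 450.0

print(baseline, gain_popular, gain_unpopular)
```

Under this model the hundredth extra poller on a popular channel buys a few seconds, while the first extra poller on an unpopular channel buys minutes, which is the diminishing-returns effect behind the load transfer.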
The favorable behavior of Corona-Lite is due to diminishing returns caused by the inverse relation between the update detection time and the number of polling nodes. It is more beneficial to distribute the polling across many channels than to devote a large percentage of the bandwidth to polling the most popular channels. Nevertheless, load distribution in Corona-Lite respects the popularity distribution of channels: popular channels are polled by more nodes than less popular channels (see Figure 5). The upshot is that popular channels gain an order of magnitude improvement in update performance over less popular ones (see Figure 6).

Scheme              Average Update Detection Time (sec)    Average Load (polls per 30 min per channel)
Legacy-RSS          900                                    50.00
Corona-Lite         53                                     48.97
Corona-Fair         142                                    50.14
Corona-Fair-Sqrt    55                                     49.46
Corona-Fair-Log     53                                     49.43
Corona-Fast         32                                     58.75

Table 2: Performance Summary: This table provides a summary of average update detection time and network load for different versions of Corona. Overall, Corona provides significant improvement in update detection time compared to Legacy RSS, while placing the same load on servers.

Corona-Fast

Unlike Corona-Lite, Corona-Fast minimizes the total load on servers while aiming to achieve a target update detection latency. Figures 3 and 4 show the network load and update performance, respectively, for Corona-Fast.

Figure 4 confirms that Corona-Fast closely meets the desired target of 30 seconds. This improvement in update detection time entails an increase in server load over Corona-Lite. Unlike Corona-Lite, whose update performance may vary depending on the workload seen by the system, Corona-Fast provides a stable average update performance. Moreover, it enables us to set the performance depending on the requirements of the application and ensures that the targeted performance is achieved with minimal load on content servers.

Corona-Fair

Finally, we examine the performance of Corona-Fair, which uses the update rates of channels to fine-tune the distribution of load. It takes advantage of the wide distribution of update intervals among channels and aims to poll frequently updated channels at a higher rate than channels with long update intervals. Figure 7 shows the distribution of update detection times achieved by Corona-Lite and Corona-Fair for different channels ranked by their update intervals. Channels with the same update intervals are further ranked by popularity. For clarity of presentation, we plot the distribution for 200 uniformly chosen channels.

Figure 7 shows that Corona-Lite achieves an unfair distribution of update detection times by ignoring update interval information. Many channels with large update intervals have short update detection times (shown in the lower-right of the graph), while some rapidly changing channels have long update detection times (shown in the upper-left of the graph). Corona-Fair fixes this unfair distribution of update detection time by using update intervals of channels to influence the choice of polling levels. Figure 7 shows that Corona-Fair has a fairer distribution of update detection times with update intervals; that is, channels with shorter update intervals have faster update detection times and vice-versa.

Figure 7: Update Detection Time per Channel: Corona-Fair provides better update detection time for channels that change rapidly than for channels that change rarely. [Plot: Update Detection Time (sec) vs. Channel Rank by Update Interval; series: Corona Lite, Corona Fair.]

Corona-Fair optimizes for update performance measured as the ratio of update detection time and update interval. Thus, channels with long update intervals may also have proportionally long update detection times, leading to long wait times for clients. Section 3.1 proposed to compensate for this bias using two metrics with sub-linear growth based on the square root and logarithm of the update interval. Figure 8 shows that Corona-Fair-Sqrt and Corona-Fair-Log achieve update detection times that are fairer and lower than Corona-Fair. Between the two metrics, Corona-Fair-Sqrt is better than Corona-Fair-Log, which has a few channels with small update intervals but long update detection times.

Figure 8: Update Detection Time per Channel: Corona-Fair-Sqrt and Corona-Fair-Log fix the bias against channels that change rarely and provide better update detection time for them than Corona-Fair does. [Plot: x-axis Channel Rank by Update Interval; series: Corona Fair Sqrt, Corona Fair Log.]

Overall, the Corona-Fair schemes provide fair distributions of polling between channels with low average update detection times without exceeding bandwidth load on the servers. The average update detection time and load for different Corona-Fair schemes is shown in Table 2. The average update detection time suffers a little in Corona-Fair compared to Corona-Lite, but the modified Corona-Fair schemes provide an average performance close to that of Corona-Lite.

5.2 Deployment

We deployed Corona on a set of 60 PlanetLab nodes and measured its performance. The deployment is based on the Corona-Lite scheme, which minimizes update detection time while bounding network load. For this experiment, we use 7500 real channels providing RSS feeds obtained from www.syndic8.com. We issue 150,000 subscriptions for them based on a Zipf popularity distribution with exponent 0.5. Subscriptions are issued at a uniform rate during the first hour and a half of the experiment. The maintenance interval and the polling interval are both set to 30 minutes. We collected data for a period of six hours.

Figure 9 shows the average update detection time for Corona deployment compared to legacy RSS. Corona decreases the average update time to about 45 seconds compared to 15 minutes for legacy RSS. Figure 10 shows the corresponding polling load imposed by Corona on content servers. Corona gradually increases the number of nodes polling each channel and reaches a load limit of around 4500 polls per minute. Corona’s total network load is bounded by the load imposed by legacy RSS, which averages to just above 5000 polls per minute. These graphs highlight that while imposing comparable load as legacy RSS, Corona achieves a substantial improvement in update detection time.

Figure 9: Average Update Detection Time: Corona provides an order of magnitude lower update detection time compared to legacy RSS. [Plot: x-axis Time (hours); series: Corona, Legacy RSS.]

Figure 10: Total Polling Load on Servers: The total load generated by Corona is well below the load generated by clients using legacy RSS. [Plot: Total Network Polls per Min vs. Time (hours); series: Corona, Legacy RSS.]

5.3 Summary

The results from simulations and wide-area experiments confirm that Corona achieves a balance between update latency and network load. It dynamically learns the parameters of the system such as number of nodes, number of subscriptions, and tradeoff factors of all channels, and uses the new parameters to periodically adjust the optimal polling levels of channels and meet performance and load targets. Corona offers considerable flexibility in the kind of performance goals it can achieve. In this section, we showed three specific schemes targeting update detection time, network load, and fair distribution of load under different metrics of fairness. Measurements from the deployment showed that achieving globally optimal performance in a distributed wide-area system is practical and efficient. Overall, Corona proves to be a high performance, scalable publish-subscribe system.

6 Conclusions

This paper proposes a novel publish-subscribe architecture that is compatible with the existing pull-based architecture of the Web. Motivated by the growing demand for micronews feeds and the paucity of infrastructure to provide asynchronous notifications, we develop a unique solution that addresses the shortcomings of pull based content dissemination and delivers a real, deployable, easy-to-use publish-subscribe system.

Many real-world applications require quick and efficient dissemination of information from data sources to clients. It is quite common for legacy data sources, such as Web pages, sensors, stock feeds, event trackers and so forth, to be deployed piecemeal, and thus to force clients to poll them manually and explicitly to receive the latest updates. As the numbers of data sources increase, the task of monitoring so many event sources quickly becomes overwhelming for humans. At sufficiently large scales, the task of allocating bandwidth is difficult even for computers. We can see examples of such applications in large scale sensor networks, in investment management systems that track commodity prices, and in many adaptive distributed systems for detecting events. All of these applications pose a fundamental tension between the polling bandwidth required to achieve fast event detection and the corresponding load imposed by periodic polling.

Our unique contribution is the optimal resolution of performance-overhead tradeoffs in such event detection systems. This paper provides a general approach based on analytical modeling of the cost-performance tradeoff and mathematical optimization that enables applications to make informed, near-optimal decisions on which data sources to monitor, and with what frequency. We develop techniques to solve typical resource allocation problems that arise in distributed systems through decentralized, low-overhead mechanisms.

The techniques at the core of this system are easily applicable to any domain where a set of nodes monitor exogenous events. The Corona approach to monitoring oblivious, pull-based data sources makes it unnecessary to change the data publishing workflow, agree on new dissemination protocols, or deploy new software on data sources. This is particularly relevant when the sources to be monitored are large in number, and deploying new software is logistically difficult. For instance, large scale Web spiders that monitor changes to Websites to incrementally update a Web index could benefit from the principled approach developed here.

Corona applies this general approach to disseminating updates to the Web, where the resource-performance tradeoff is affected by the popularity, size, and update rate of Web content and the network capacities of clients and content servers. Performance measurements based on simulations and real-life deployment show that Corona clients can achieve several orders of magnitude improvement in update latency without an increase in average load. Corona acts as a buffer between clients and servers, shielding servers from the impact of flash-crowds and sticky traffic. Our implementation is currently deployed on PlanetLab and available for public use. We hope that a backwards-compatible, high-performance, efficient publish-subscribe system will make it possible for people to easily track frequently changing content on the Web.

Acknowledgements

We would like to thank Rohan Murty for implementing an earlier prototype of the Corona system, Yee Jiun Song for his help with the system, and our shepherd Dave Andersen for his guidance and feedback. This work was supported in part by NSF CAREER grant 0546568, and TRUST (The Team for Research in Ubiquitous Secure Technology), which receives support from the National Science Foundation (CCF-0424422) and the following organizations: Cisco, ESCHER, HP, IBM, Intel, Microsoft, ORNL, Qualcomm, Pirelli, Sun, and Symantec.

References

[1] Atom. Atom Syndication Format. http://www.atomenabled.org/developers/syndication, Oct. 2005.
[2] A. Bavier, M. Bowman, B. Chun, D. Culler, S. Karlin, S. Muir, L. Peterson, T. Roscoe, T. Spalink, and M. Wawrzoniak. Operating System Support for Planetary-Scale Network Services. In Proc. of Symposium on Networked Systems Design and Implementation, Boston, MA, Mar. 2004.
[3] C. Botev and J. Shanmugasundaram. Context Sensitive Keyword Search and Ranking for XML. In Proc. of International Workshop on Web and Databases, Baltimore, MD, June 2005.
[4] L. F. Cabera, M. B. Jones, and M. Theimer. Herald: Achieving a Global Event Notification Service. In Proc. of Workshop on Hot Topics in Operating Systems, Elmau, Germany, May 2001.
[5] N. Carriero and D. Gelernter. Linda in Context. Communications of the ACM, 32(4):444–458, Apr. 1989.
[6] A. Carzaniga, D. S. Rosenblum, and A. L. Wolf. Design and Evaluation of a Wide-Area Event Notification Service. ACM Transactions on Computer Systems, 19(3):332–383, Aug. 2001.
[7] C.-Y. Chan, P. Felber, M. Garofalakis, and R. Rastogi. Efficient Filtering of XML Documents with XPath Expressions. In Proc. of International Conference on Data Engineering, San Jose, CA, Feb. 2002.
[8] Y. Diao, S. Rizvi, and M. Franklin. Towards an Internet-Scale XML Dissemination Service. In Proc. of International Conference on Very Large Databases (VLDB), Toronto, Canada, Aug. 2004.
[9] J. R. Douceur, A. Adya, W. J. Bolosky, and D. Simon. Reclaiming Space from Duplicate Files in a Serverless Distributed File System. In Proc. of International Conference on Distributed Computing Systems, Vienna, Austria, July 2002.
[10] F. Douglis, A. Feldman, B. Krishnamurthy, and J. Mogul. Rate of Change and Other Metrics: a Live Study of the World Wide Web. In Proc. of USENIX Symposium on
Internet Technologies and Systems, Monterey, CA, Dec. [25] S. Pandey, K. Dhamdere, and C. Olston. WIC: A
1997. General-Purpose Algorithm for Monitoring Web Infor-
[11] W. Fenner, M. Rabinovich, K. K. Ramakrishnan, D. Sri- mation Sources. In Proc. of the Conference on Very Large
vastava, and Y. Zhang. XTreeNet: Scalable Overlay Data Bases (VLDB), Toronto, Canada, Aug. 2004.
Networks for XML Content Dissemination and Query- [26] S. Pandey, K. Ramamritham, and S. Chakraborti. Mon-
ing (Synopsis). In Proc. of International Workshop on itoring the Dynamic Web to Respond to Continuous
Web Content Caching and Distribution, Sophia Antipo- Queries. In Proc. of the International World Wide Web
lis, France, Sept. 2005. Conference, Budapest, Hungary, May 2003.
[12] GAIM. A Multi-Protocol Instant Messaging Client. [27] V. Ramasubramanian and E. G. Sirer. Beehive: Exploit-
http://gaim.sourceforge.net, Oct. 2005. ing Power Law Query Distributions for 0(1) Lookup Per-
[13] B. Glade, K. P. Birman, R. Cooper, and R. van Renesse. formance in Peer-to-Peer Overlays. In Proc. of Sympo-
Light-Weight Process Groups in the ISIS System. Dis- sium on Networked Systems Design and Implementation,
tributed Systems Engineering, 1(1):29–36, Sept. 1993. San Francisco, CA, Mar. 2004.
[14] N. Harvey, M. Jons, S. Saroiu, M. Theimer, and A. Wol- [28] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and
man. SkipNet: A Scalable Overlay Network with Practi- S. Shenker. A Scalable Content-Addressable Network.
cal Locality Properties. In Proc. of USENIX Symposium In Proc. of ACM SIGCOMM, San Diego, CA, Aug. 2001.
on Internet Technologies and Systems, Seattle, WA, Mar. [29] A. Rowstron and P. Druschel. Pastry: Scalable, De-
2003. centralized Object Location and Routing for Large-scale
[15] W. K. Josephson, E. G. Sirer, and F. B. Schneider. Peer- Peer-to-Peer Systems. In Proc. of IFIP/ACM Interna-
to-Peer Authentication With a Distributed Single Sign-On tional Conference on Distributed Systems Platforms, Hei-
Service. In Proc. of International Workshop on Peer-to- delberg, Germany, Nov. 2001.
Peer Systems, San Diego, CA, Feb. 2004. [30] RSS 2.0 Specifications. http://blogs.law.harvard.edu/-
[16] JS-JavaSpaces Service Specification. http://www.jini.- tech/rss, Oct. 2005.
org/nonav/standards/davis/doc/specs/html/js-spec.html, [31] D. Sandler, A. Mislove, A. Post, and P. Druschel.
2002. FeedTree: Sharing Web Micronews with Peer-to-Peer
[17] F. Kaashoek and D. Karger. Koorde: A Simple Degree- Event Notification. In Proc. of International Workshop
Optimal Distributed Hash Table. In Proc. of International on Peer-to-Peer Systems, Ithaca, NY, Feb. 2005.
Workshop on Peer-to-Peer Systems, Berkeley, CA, Feb. [32] B. Segall, D. Arnold, J. Boot, M. Henderson, and
2003. T. Phelps. Content Based Routing with Elvin. In Proc.
[18] D. Karger, E. Lehman, T. Leighton, M. Levine, D. Lewin, of AUUG2K, Canberra, Australia, June 2000.
and R. Panigraphy. Consistent Hashing and Random [33] I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Bal-
Trees: Distributed Caching Protocols for Relieving Hot akrishnan. Chord: A Scalable Peer-to-peer Lookup Ser-
Spots on the World Wide Web. In Proc. of ACM Sympo- vice for Internet Applications. In Proc. of ACM SIG-
sium on Theory of Computing, El Paso, TX, Apr. 1997. COMM, San Diego, CA, Aug. 2001.
[19] H. Liu, V. Ramasubramanian, and E. G. Sirer. Client [34] R. Strom, G. Banavar, T. Chandra, M. Kaplan, K. Miller,
Behavior and Feed Characteristics of RSS, a Publish- B. Mukherjee, D. Sturman, and M. Ward. Gryphon: An
Subscribe System for Web Micronews. In Proc. of ACM Information Flow Based Approach to Message Broker-
Internet Measurement Conference, Berkeley, CA, Oct. ing. In Proc. of International Symposium on Software
2005. Reliability Engineering, Paderborn, Germany, Nov. 1998.
[20] L. Liu, C. Pu, and W. Tang. WebCQ: Detecting and De- [35] TIBCO Publish-Subscribe. http://www.tibco.com, Oct.
livering Information Changes on the Web. In Proc. of the 2005.
International Conference on Information and Knowledge [36] TSpaces. http://www.almaden.ibm.com/cs/TSpaces/,
Management, McLean, VA, Nov. 2000. Oct. 2005.
[21] L. Liu, C. Pu, W. Tang, and W. Han. CONQUER: A [37] R. van Renesse, K. P. Birman, and W. Vogels. Astrolabe:
Continual Query System for Update Monitoring in the A Robust and Scalable Technology For Distributed Sys-
WWW. International Journal of Computer Systems, Sci- tems Monitoring, Management, and Data Mining. ACM
ence and Engineering, 14(2):99–112, 1999. Transactions on Computer Systems, 21(3), May 2003.
[22] D. Malkhi, M. Naor, and D. Ratajczak. Viceroy: A Scal- [38] U. Wieder and M. Naor. A Simple Fault Tolerant Dis-
able and Dynamic Emulation of the Butterfly. In Proc. of tributed Hash Table. In Proc. of International Workshop
ACM Symposium on Principles of Distributed Comput- on Peer-to-Peer Systems, Berkeley, CA, Feb. 2003.
ing, Monterey, CA, Aug. 2002. [39] B. Zhao, L. Huang, J. Stribling, S. Rhea, A. Joseph, and
[23] P. Maymounkov and D. Mazieres. Kademlia: A Peer-to- J. Kubiatowicz. Tapestry: A Resilient Global-scale Over-
peer Information System Based on the XOR Metric. In lay for Service Deployment. IEEE Journal on Selected
Proc. of International Workshop on Peer-to-Peer Systems, Areas in Communications, 22(1):41–53, Jan. 2004.
Cambridge, CA, Mar. 2002. [40] L. Zhou, F. B. Schneider, and R. van Renesse. COCA: A
[24] A. Mizrak, Y. Cheng, V. Kumar, and S. Savage. Struc- Secure Distributed On-line Certification Authority. ACM
tured Superpeer: Leveraging Heterogeneity to Provide Transactions on Computer Systems, 20(4):329–368, Nov.
Constant-time Lookup. In Proc. of IEEE Workshop on 2002.
Internet Applications, San Francisco, CA, Apr. 2003.

