Examples:
DNS
Single naming authority per zone
Lazy propagation of updates
WWW
No write-write conflicts
Usually acceptable to serve slightly out-of-date pages from a cache
Eventual Consistency
The principle of a mobile user accessing different replicas of a distributed database.
If no updates take place for some time, all replicas gradually converge to a consistent state
Monotonic Reads
WS(x1) is part of WS(x2)
If a process has seen a value of x at time t, it will never see an older value at a later time.
Example: replicated mailboxes with on-demand propagation of updates
The read operations performed by a single process P at two different local copies of the same data store.
a) A monotonic-read consistent data store.
b) A data store that does not provide monotonic reads.
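The condition that WS(x1) is part of WS(x2) can be checked mechanically. The following is a minimal sketch, assuming each read is summarised by the set of write IDs already applied at the replica that served it; the function name and the write IDs are hypothetical.

```python
def is_monotonic_read(read_write_sets):
    """Return True if each read's write set contains the previous read's.

    read_write_sets: list of sets, one per read issued by the same process,
    in issue order. Each set holds the IDs of the writes that had been
    applied at the replica when that read was served.
    """
    for earlier, later in zip(read_write_sets, read_write_sets[1:]):
        if not earlier <= later:  # WS(x1) must be a subset of WS(x2)
            return False
    return True


# Case a): the second replica has applied everything the first read saw.
consistent = [{"w1"}, {"w1", "w2"}]
# Case b): the second replica is missing w1, so an older value may be seen.
inconsistent = [{"w1"}, {"w2"}]
```

With these inputs, the check accepts the first history and rejects the second.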
Monotonic Writes
If an update is made to a copy, all preceding updates must have been completed first.
A write may affect only part of the state of a data item
No guarantee that x at L2 has the same value as x at L1 at the time W(x1) completed
The write operations performed by a single process P at two different local copies of the same data store.
a) A monotonic-write consistent data store.
b) A data store that does not provide monotonic-write consistency.
a) A data store that provides read-your-writes consistency.
b) A data store that does not.
a) A writes-follow-reads consistent data store.
b) A data store that does not provide writes-follow-reads consistency.
Per-client state:
Read set
IDs of writes relevant to the client's read operations
Write set
IDs of writes performed by the client
After a read, the client's read set is updated with the server's relevant writes
Monotonic write:
When a client issues a write, the server is given the client's write set to ensure that all specified writes have been applied (in order)
Writes-follow-reads:
The server is brought up to date with the writes in the client's read set
After the write, the new ID is added to the client's write set, along with the IDs in the read set, as these have become relevant for the write just performed
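The bookkeeping above can be sketched in a few lines. This is an illustrative model only: the `Server` and `Client` classes, their method names, and the idea that a server can "catch up" simply by recording a missing write ID are assumptions of the sketch, not part of the slides. Bringing the server up to date with the read set before a read is also an assumption, since the slides only state how the read set is updated afterwards.

```python
class Server:
    """A replica that records the IDs of the writes it has applied, in order."""

    def __init__(self):
        self.applied = []

    def ensure_applied(self, write_ids):
        # Stands in for fetching and applying missing writes from other replicas.
        for wid in write_ids:
            if wid not in self.applied:
                self.applied.append(wid)


class Client:
    def __init__(self):
        self.read_set = []   # IDs of writes relevant to this client's reads
        self.write_set = []  # IDs of writes performed by this client

    def read(self, server):
        # Assumed step: the server first catches up with the read set.
        server.ensure_applied(self.read_set)
        # After the read, the read set absorbs the server's relevant writes.
        for wid in server.applied:
            if wid not in self.read_set:
                self.read_set.append(wid)
        return list(server.applied)

    def write(self, server, wid):
        # Writes-follow-reads: bring the server up to date with the read set.
        server.ensure_applied(self.read_set)
        # Monotonic write: all earlier writes by this client are applied first.
        server.ensure_applied(self.write_set)
        server.applied.append(wid)
        # The read-set IDs have become relevant for the write just performed.
        for seen in self.read_set:
            if seen not in self.write_set:
                self.write_set.append(seen)
        self.write_set.append(wid)


l1, l2 = Server(), Server()
client = Client()
client.write(l1, "w1")
client.write(l2, "w2")  # l2 is first brought up to date with w1
```

After the second write, `l2` has applied `w1` before `w2`, which is the monotonic-write guarantee for this client.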
The logical organization of different kinds of copies of a data store into three concentric rings.
Server-initiated
Push caches
Dynamic replication to handle bursts
Read-only
Client-initiated
Improve access time to data
Danger of stale data
Server-Initiated Replicas
Counting access requests from different clients.
P := closest server for both C1 & C2
CntQ(P, F)
At each server:
Count of accesses for each file
Originating clients
Routing DB to determine closest server for client C
Deletion threshold: del(S, F)
Replication threshold: rep(S, F)
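The counting and threshold scheme can be illustrated with a short sketch. The decision rule used here is an assumption: a replica of F at S becomes a deletion candidate when its access count drops below del(S, F), and a replication candidate when the count exceeds rep(S, F). The slides list only the thresholds, and the function names and return labels below are hypothetical.

```python
from collections import Counter


def count_by_closest_server(requests, closest_server):
    """Count accesses per (closest server, file).

    requests: iterable of (client, file) pairs.
    closest_server: dict mapping each client to its closest server,
    standing in for the routing database.
    """
    counts = Counter()
    for client, file in requests:
        counts[(closest_server[client], file)] += 1
    return counts


def placement_action(access_count, del_threshold, rep_threshold):
    """Assumed rule: delete below del(S, F), replicate above rep(S, F)."""
    if del_threshold >= rep_threshold:
        raise ValueError("deletion threshold must be below replication threshold")
    if access_count < del_threshold:
        return "delete"
    if access_count > rep_threshold:
        return "replicate"
    return "keep"


# P is the closest server for both C1 and C2, so their requests are pooled.
requests = [("C1", "F"), ("C2", "F"), ("C1", "F")]
counts = count_by_closest_server(requests, {"C1": "P", "C2": "P"})
```

Pooling requests by closest server is what lets a server see that demand for F is concentrated near P rather than spread across unrelated clients.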
Update propagation
State vs Operations
Notification of an update
Invalidation protocols
Best for a low read/write ratio
Pull vs Push
Push: replicas maintain a high degree of consistency
Updates are expected to be of use to multiple readers
Pull: best for a low read/write ratio
Hybrid scheme based on the lease model
Unicast vs Multicast
Push: multicast group
Pull: a single server or client requests an update
Leases
A promise by a server that it will push updates for a specified time period
After expiration, client has to pull for updates
Alternatives:
Age-based leases
Depending on the last time an item was modified
Long-lasting leases for items that are expected to remain unmodified
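A lease can be modelled as an expiry time that switches the client between waiting for pushes and pulling. The sketch below is a hedged illustration: the class and function names are hypothetical, time is passed in explicitly rather than read from a clock, and the age-based formula (lease length proportional to the time since the last modification) is an assumed example of the idea, not a formula taken from the slides.

```python
class Lease:
    """A server's promise to push updates until `expires_at`."""

    def __init__(self, granted_at, duration):
        self.expires_at = granted_at + duration

    def is_valid(self, now):
        return now < self.expires_at


def client_action(lease, now):
    # While the lease holds, the server pushes; afterwards the client pulls.
    return "wait_for_push" if lease.is_valid(now) else "pull"


def age_based_duration(now, last_modified, factor=0.5, minimum=1.0):
    """Assumed rule: items unmodified for longer get longer leases."""
    return max(minimum, factor * (now - last_modified))


lease = Lease(granted_at=0.0, duration=10.0)
```

An item last modified long ago receives a long lease, while a recently modified item falls back to the minimum duration.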
Comparison between push-based & pull-based protocols in the case of multiple-client, single-server systems.
Primary-based remote-write protocol with a fixed server to which all read & write operations are forwarded.
Primary-backup protocols
Blocking updates
straightforward implementation of sequential consistency
The primary orders all updates
Processes see the effects of their most recent write
Non-blocking updates
reduce blocking delay for the process that initiated the update
The process only waits for the primary's ACK
Fault tolerance ?
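The difference between blocking and non-blocking updates can be made concrete with a small in-memory model. This is a sketch under simplifying assumptions: the `Primary` and `Backup` classes are hypothetical, there is no network or failure handling, and forwarding to backups is a plain method call.

```python
class Backup:
    def __init__(self):
        self.store = {}

    def apply(self, key, value):
        self.store[key] = value


class Primary:
    def __init__(self, backups):
        self.store = {}
        self.backups = backups
        self.pending = []  # updates acknowledged but not yet forwarded

    def write_blocking(self, key, value):
        # The ACK is returned only after every backup has applied the update.
        self.store[key] = value
        for backup in self.backups:
            backup.apply(key, value)
        return "ACK"

    def write_nonblocking(self, key, value):
        # The ACK is returned as soon as the primary has applied the update.
        self.store[key] = value
        self.pending.append((key, value))
        return "ACK"

    def flush(self):
        # Forward queued updates to the backups in the order they were issued.
        while self.pending:
            key, value = self.pending.pop(0)
            for backup in self.backups:
                backup.apply(key, value)


b1, b2 = Backup(), Backup()
primary = Primary([b1, b2])
primary.write_blocking("x", 1)
primary.write_nonblocking("x", 2)
```

At this point the backups still hold `x = 1` while the primary holds `x = 2`. If the primary failed before `flush()`, the acknowledged write would be lost, which is the fault-tolerance concern raised above.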
Keeping track of each data item's current location?
Primary-based local-write protocol in which a single copy is migrated between processes.
Suitable for disconnected operation
Primary-backup protocol in which the primary migrates to the process wanting to perform an update.
(a) Forwarding an invocation request from a replicated object. (b) Returning a reply to a replicated object.
Write:
Version number inquiries to find a set g of RMs whose up-to-date copies hold enough votes: totV(g) >= W
If there are insufficient up-to-date copies, replace a non-current copy with a copy of the current copy
After obtaining a read quorum, a read may be carried out on the local copy if it is up-to-date
Blocking probability:
In some cases, a quorum cannot be obtained
Examples assume 99% availability for RMs
Ex1: file with a high read/write ratio
Ex2: file with a moderate read/write ratio
Reads can be satisfied by the local RM, but writes must also access one remote RM
Ex3: file with a very high read/write ratio
Derived performance of file suite:
       Read                            Write
       Latency  Blocking probability   Latency  Blocking probability
Ex1    65       0.01                   75       0.01
Ex2    75       0.0002                 100      0.0101
Ex3    75       0.000001               750      0.03
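The blocking probabilities can be reproduced by enumerating which RMs are up. The sketch below is a worked check, not the original derivation. It assumes three RMs that fail independently, each available with probability 0.99. It also assumes specific vote assignments and quorum sizes, which are not stated above: votes (1, 0, 0) with R = 1, W = 1 for Ex1; votes (2, 1, 1) with R = 2, W = 3 for Ex2; votes (1, 1, 1) with R = 1, W = 3 for Ex3.

```python
from itertools import product


def blocking_probability(votes, quorum, availability=0.99):
    """Probability that the available RMs hold fewer votes than the quorum."""
    blocked = 0.0
    # Enumerate every up/down combination of the RMs.
    for states in product((True, False), repeat=len(votes)):
        p = 1.0
        for is_up in states:
            p *= availability if is_up else (1.0 - availability)
        available_votes = sum(v for v, is_up in zip(votes, states) if is_up)
        if available_votes < quorum:
            blocked += p
    return blocked


# Assumed vote assignments and quorum sizes (not given in the slides).
examples = {
    "Ex1": {"votes": (1, 0, 0), "R": 1, "W": 1},
    "Ex2": {"votes": (2, 1, 1), "R": 2, "W": 3},
    "Ex3": {"votes": (1, 1, 1), "R": 1, "W": 3},
}

results = {
    name: (
        blocking_probability(cfg["votes"], cfg["R"]),
        blocking_probability(cfg["votes"], cfg["W"]),
    )
    for name, cfg in examples.items()
}
```

Under these assumptions the computed values agree with the figures listed above after rounding. For example, a write in Ex3 needs all three RMs, so it blocks with probability 1 - 0.99^3, which is about 0.03.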
Quorum-Based Protocols
Three examples of the voting algorithm:
a) A correct choice of read & write set
b) A choice that may lead to write-write conflicts
c) A correct choice, known as ROWA (read one, write all)
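Whether a choice of read and write quorum is correct follows from two standard constraints of quorum-based voting: N_R + N_W > N, so that every read quorum overlaps every write quorum, and N_W > N/2, so that any two write quorums overlap. The sketch below checks both. The function name is hypothetical, and the numbers (N = 12 replicas) are illustrative values chosen to match the three cases, not figures taken from the slides.

```python
def check_quorum(n, n_r, n_w):
    """Check the two overlap constraints for N replicas, read quorum N_R, write quorum N_W."""
    return {
        "read_write_overlap": n_r + n_w > n,   # prevents stale reads
        "write_write_overlap": 2 * n_w > n,    # prevents write-write conflicts
    }


case_a = check_quorum(12, 3, 10)   # both constraints hold
case_b = check_quorum(12, 7, 6)    # two disjoint write quorums of 6 are possible
case_c = check_quorum(12, 1, 12)   # ROWA: read one, write all
```

In case b) the write quorum is exactly half of the replicas, so two writers can each collect a quorum without sharing a replica.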
Replication transparency
One-copy serializability
Read one, write all
Failures must be observed to have happened before any active transactions (Tx) at other servers
Network Partitions
Separate but viable groups of servers
Optimistic schemes validate on recovery
Available copies with validation
partition
Fault Tolerance
Design to recover after a failure with no loss of (committed) data.
Designs for fault tolerance:
Single server, fail and recover
Primary server with trailing backups
Replicated service
Fault Tolerance = ?
Define correctness criteria
When 2 replicas are separated by a network partition:
Both are deemed incorrect & stop serving.
One (the master) continues & the other ceases service.
One (the master) continues to accept updates & both continue to supply reads (of possibly stale data).
Both continue service & subsequently synchronise.
[Figures: front ends (FE) communicating with replica managers (RM); one diagram includes a backup RM.]
Replication Architectures
How many replicas are required?
All or majority?
[Figure: getBalance(A) and deposit(B) issued to replica managers holding copies of A and B.]
[Figure: getBalance(A), deposit(B), deposit(A), and getBalance(B) issued to replica managers, including Y, M, N, and P, holding copies of A and B.]
Local Validation
Failure & recovery events do not occur during a Tx.
Example:
T reads A before server X's failure, therefore T -> failX
T observes server N's failure when it writes B, therefore failN -> T
failN -> T.getBalance(A) -> T.deposit(B) -> failX
failX -> U.getBalance(B) -> U.deposit(A) -> failN
Server X fails, followed by transaction U, followed by server N's failure, followed by transaction T, followed by server X's failure. This ordering is inconsistent, so the transactions must not be allowed to commit.
Failure and recovery must be serialised just like a Tx: they occur before or after a Tx, but not during.
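The inconsistency in the example amounts to a cycle in the "happens before" ordering between transactions and failure events. A minimal sketch of that check follows, assuming the ordering is given as a list of (before, after) pairs. The function name and the edge representation are hypothetical, and real local validation is performed by the servers at commit time rather than on a precomputed graph.

```python
def has_cycle(edges):
    """Detect a cycle in a directed 'happens before' graph via depth-first search."""
    graph = {}
    for before, after in edges:
        graph.setdefault(before, set()).add(after)
        graph.setdefault(after, set())

    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNVISITED for node in graph}

    def visit(node):
        state[node] = IN_PROGRESS
        for nxt in graph[node]:
            if state[nxt] == IN_PROGRESS:
                return True  # reached a node on the current path: cycle
            if state[nxt] == UNVISITED and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[node] == UNVISITED and visit(node) for node in graph)


# Ordering constraints from the example: T precedes X's failure, N's failure
# precedes T, X's failure precedes U, and U precedes N's failure.
example = [("T", "failX"), ("failN", "T"), ("failX", "U"), ("U", "failN")]
inconsistent = has_cycle(example)
```

Dropping transaction U from the example removes the cycle, which corresponds to T being allowed to commit on its own.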