
Distributed Systems

Principles and Paradigms

Chapter 02
(version 31st August 2001)

Maarten van Steen


Vrije Universiteit Amsterdam, Faculty of Science
Dept. Mathematics and Computer Science
Room R4.20. Tel: (020) 444 7784
E-mail: steen@cs.vu.nl, URL: www.cs.vu.nl/~steen

01 Introduction
02 Communication
03 Processes
04 Naming
05 Synchronization
06 Consistency and Replication
07 Fault Tolerance
08 Security
09 Distributed Object-Based Systems
10 Distributed File Systems
11 Distributed Document-Based Systems
12 Distributed Coordination-Based Systems
00 – 1 /
Layered Protocols

Low-level layers

Transport layer

Application layer

Middleware layer

02 – 1 Communication/2.1 Layered Protocols


Basic Networking Model

Layer   Name           Protocol between peers
7       Application    Application protocol
6       Presentation   Presentation protocol
5       Session        Session protocol
4       Transport      Transport protocol
3       Network        Network protocol
2       Data link      Data link protocol
1       Physical       Physical protocol

(Figure: the seven layers stacked on top of the network.)

Drawbacks:

• Focus on message-passing only
• Often unneeded or unwanted functionality

Question: Violates transparency?

02 – 2 Communication/2.1 Layered Protocols


Low-level layers

Physical layer: contains the specification and implementation of bits, and their transmission between sender and receiver

Data link layer: prescribes the transmission of a series of bits into a frame to allow for error and flow control

Network layer: describes how packets in a network of computers are to be routed.

Observation: for many distributed systems, the lowest-level interface is that of the network layer.

02 – 3 Communication/2.1 Layered Protocols


Transport Layer

Important: The transport layer provides the actual communication facilities for most distributed systems.

Standard Internet protocols:

• TCP: connection-oriented, reliable, stream-oriented communication
• UDP: unreliable (best-effort) datagram communication

Note: IP multicasting is generally considered a standard available service.

02 – 4 Communication/2.1 Layered Protocols


Client–Server TCP

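TCP for transactions (T/TCP): A transport protocol aimed to support client–server interaction

(a) Normal operation of TCP:
  1. Client → Server: SYN
  2. Server → Client: SYN, ACK(SYN)
  3. Client → Server: ACK(SYN)
  4. Client → Server: request
  5. Client → Server: FIN
  6. Server → Client: ACK(req+FIN)
  7. Server → Client: answer
  8. Server → Client: FIN
  9. Client → Server: ACK(FIN)

(b) Transactional TCP:
  1. Client → Server: SYN, request, FIN
  2. Server → Client: SYN, ACK(FIN), answer, FIN
  3. Client → Server: ACK(FIN)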

02 – 5 Communication/2.1 Layered Protocols


Application Layer

Observation: Many application protocols are directly implemented on top of transport protocols, doing a lot of application-independent work.

              News           FTP                                             WWW
Transfer      NNTP           FTP                                             HTTP
Encoding      7-bit + MIME   7-bit text + 8-bit binary (user has to guess)   8-bit + content type
Naming        Newsgroup      Host + path                                     URL
Distribution  Push           Pull                                            Pull
Replication   Flooding       Caching + DNS tricks                            Caching + DNS tricks
Security      None (PGP)     Username + Password                             Username + Password

02 – 6 Communication/2.1 Layered Protocols


Middleware Layer
Observation: Middleware was invented to provide common services and protocols that can be used by many different applications:

• A rich set of communication protocols that nevertheless allow different applications to communicate
• Marshaling and unmarshaling of data, necessary for integrated systems
• Naming protocols, so that different applications can easily share resources
• Security protocols, to allow different applications to communicate in a secure way
• Scaling mechanisms, such as support for replication and caching

Note: what remains are truly application-specific protocols

Question: Such as...?


02 – 7 Communication/2.1 Layered Protocols
Remote Procedure Call (RPC)

Basic RPC operation

Parameter passing

Variations

02 – 8 Communication/2.2 Remote Procedure Call


Basic RPC Operation

Observations:

• Application developers are familiar with a simple procedure model
• Well-engineered procedures operate in isolation (black box)
• There is no fundamental reason not to execute procedures on a separate machine

Conclusion: communication between caller and callee can be hidden by using a procedure-call mechanism.

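(Figure: timeline of an RPC. The client calls the remote procedure and waits for the result; the request goes to the server, which calls the local procedure and returns the results in a reply; the client then returns from the call.)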

02 – 9 Communication/2.2 Remote Procedure Call


RPC Implementation (1/2)

Local procedure call, e.g., read(fd, buf, bytes):

1: Push parameter values of the procedure on a stack
2: Call procedure
3: Use stack for local variables
4: Pop results (in parameters)

(Figure: (a) the stack before the call holds the main program's local variables; (b) during the call it additionally holds bytes, buf, fd, the return address, and read's local variables, with the stack pointer at the top.)

Principle: “communication” with local procedure is handled by copying data to/from the stack (with a few exceptions)
02 – 10 Communication/2.2 Remote Procedure Call
RPC Implementation (2/2)
(Figure: a client process on the client machine calls k = add(i,j); the server process on the server machine holds the implementation of add.)

1. Client call to procedure: k = add(i,j)
2. Client stub builds message: proc: "add", int: val(i), int: val(j)
3. Message is sent across the network
4. Server OS hands message to server stub
5. Server stub unpacks message
6. Server stub makes local call to "add"
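
The six steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular RPC system is built: JSON stands in for the message encoding, `network_send` is a hypothetical stand-in for both operating systems and the network, and `PROCEDURES` is an assumed dispatch table.

```python
import json


def add(i, j):
    """Server-side implementation of the procedure."""
    return i + j


# Hypothetical dispatch table used by the server stub.
PROCEDURES = {"add": add}


def server_stub(message: bytes) -> bytes:
    # Steps 5-6: unpack the message and make a local call to the procedure.
    request = json.loads(message.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def network_send(message: bytes) -> bytes:
    # Steps 3-4: stand-in for client OS, network, and server OS.
    return server_stub(message)


def client_stub_add(i, j):
    # Step 2: build a message naming the procedure and carrying the values.
    message = json.dumps({"proc": "add", "args": [i, j]}).encode("utf-8")
    reply = network_send(message)
    return json.loads(reply.decode("utf-8"))["result"]


# Step 1: to the caller this looks like an ordinary local call.
k = client_stub_add(2, 3)
```

The caller never sees the message format; swapping `network_send` for a real transport would not change the call site.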

02 – 11 Communication/2.2 Remote Procedure Call


RPC: Parameter Passing (1/2)

Parameter marshaling: There’s more than just wrapping parameters into a message:

• Client and server machines may have different data representations (think of byte ordering)
• Wrapping a parameter means transforming a value into a sequence of bytes
• Client and server have to agree on the same encoding:
  – How are basic data values represented (integers, floats, characters)
  – How are complex data values represented (arrays, unions)
• Client and server need to properly interpret messages, transforming them into machine-dependent representations.
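
A small sketch of why both sides must agree on an encoding, using Python's standard `struct` module. The choice of big-endian ("network") order for the wire format and the helper names `marshal_add_args` and `unmarshal_add_args` are assumptions made for illustration.

```python
import struct


def marshal_add_args(i, j):
    # "!" selects network byte order (big-endian) with standard sizes;
    # "ii" means two 32-bit signed integers.
    return struct.pack("!ii", i, j)


def unmarshal_add_args(message):
    # The receiver applies the same agreed-upon format to get the values back.
    return struct.unpack("!ii", message)


wire = marshal_add_args(5, 7)

# The same four bytes read with the wrong byte order give a different integer.
big_endian_five = struct.pack(">i", 5)
misread = struct.unpack("<i", big_endian_five)[0]
```

Here `misread` is not 5, which is exactly the byte-ordering problem the slide mentions.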

02 – 12 Communication/2.2 Remote Procedure Call


RPC: Parameter Passing (2/2)
RPC parameter passing:

• RPC assumes copy in/copy out semantics: while procedure is executed, nothing can be assumed about parameter values (only Ada supports this model).
• RPC assumes all data that is to be operated on is passed by parameters. Excludes passing references to (global) data.

Conclusion: full access transparency cannot be realized.

Observation: If we introduce a remote reference mechanism, access transparency can be enhanced:

• Remote reference offers unified access to remote data
• Remote references can be passed as parameter in RPCs
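
Copy in/copy out can be illustrated with a short Python sketch. The helper `call_by_copy_restore` is hypothetical and runs everything in one process; it only mimics what a pair of stubs would do.

```python
import copy

caller_data = [1, 2, 3]
seen_during_call = []


def increment_all(values):
    """Plays the role of the remote procedure; it works on its own copy."""
    for n in range(len(values)):
        values[n] += 1
    # Record the caller's list while the procedure runs: it is still unchanged.
    seen_during_call.append(list(caller_data))
    return values


def call_by_copy_restore(procedure, argument):
    shipped = copy.deepcopy(argument)   # copy in: callee gets its own copy
    result = procedure(shipped)         # callee mutates only that copy
    argument[:] = result                # copy out: overwrite caller's data


call_by_copy_restore(increment_all, caller_data)
```

While the procedure runs, caller and callee hold different values for the same parameter, which is why nothing can be assumed about parameter values during the call.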

02 – 13 Communication/2.2 Remote Procedure Call


Local RPCs: Doors

Essence: Try to use the RPC mechanism as the only mechanism for interprocess communication (IPC). Doors are RPCs implemented for processes on the same machine.

(Figure: a client process and a server process on the same computer, with the operating system in between.)

Client process:

    main()
    {
        ...
        fd = open(door_name, ... );
        door_call(fd, ... );
        ...
    }

Server process:

    server_door(...)
    {
        ...
        door_return(...);
    }

    main()
    {
        ...
        fd = door_create(...);
        fattach(fd, door_name, ... );    /* register door */
        ...
    }

Operating system: invokes the registered door at the other process, then returns to the calling process.

02 – 14 Communication/2.2 Remote Procedure Call


Asynchronous RPCs
Essence: Try to get rid of the strict request-reply behavior, but let the client continue without waiting for an answer from the server.

(Figure: (a) traditional RPC: the client calls the remote procedure and waits for the result until the reply arrives; (b) asynchronous RPC: the client waits only until the server accepts the request, then returns from the call while the server calls the local procedure.)

Variation: deferred synchronous RPC:

(Figure: the client calls the remote procedure, waits for acceptance, and returns from the call. When the server has finished the local procedure, it calls the client with a one-way RPC to return the results; this interrupts the client, which acknowledges.)
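
The difference can be sketched with Python's standard `concurrent.futures` module. This is an analogy only: a thread pool stands in for the server, `remote_add` is a hypothetical procedure, and the done-callback plays the role of the server calling the client back with the result.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_add(i, j):
    time.sleep(0.05)          # pretend the server needs some time
    return i + j


log = []

with ThreadPoolExecutor(max_workers=1) as server:
    # The request is handed over and the call returns at once.
    future = server.submit(remote_add, 2, 3)
    log.append("client continues")

    # Deferred delivery of the result: invoked once the server is done.
    future.add_done_callback(lambda f: log.append(("result", f.result())))
# Leaving the with-block waits for outstanding work to finish.
```

A fully synchronous client would instead call `future.result()` right after `submit` and block until the answer arrives.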

02 – 15 Communication/2.2 Remote Procedure Call


RPC in Practice

Essence: Let the developer concentrate on only the client- and server-specific code; let the RPC system (generators and libraries) do the rest.

(Figure: steps in writing a client and a server. Uuidgen produces the interface definition file; the IDL compiler turns it into a client stub, a header, and a server stub. Client code and server code #include the header. The C compiler produces client, client stub, server stub, and server object files. The linker combines the client and client stub object files with the runtime library into the client binary, and the server and server stub object files with the runtime library into the server binary.)

02 – 16 Communication/2.2 Remote Procedure Call


Client-to-Server Binding (DCE)

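Issues: (1) Client must locate the server machine, and (2) locate the server on that machine.

Example: DCE uses a separate daemon for each server machine.

(Figure: a client machine, a server machine running the server and the DCE daemon with its endpoint table, and a directory machine running the directory server.)

1. Server registers its endpoint with the DCE daemon (endpoint table)
2. Server registers its service with the directory server
3. Client looks up the server at the directory server
4. Client asks the DCE daemon for the endpoint
5. Client does the RPC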
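
The two-level lookup can be sketched as two dictionaries. This is a toy model, not the DCE API: the names `directory`, `endpoint_tables`, `register`, and `bind`, as well as the host name and endpoint number, are all made up for illustration.

```python
# Directory server: service name -> server machine.
directory = {}
# One endpoint table per machine, kept by that machine's daemon:
# machine -> {service name -> endpoint}.
endpoint_tables = {}


def register(service, machine, endpoint):
    endpoint_tables.setdefault(machine, {})[service] = endpoint  # 1. endpoint
    directory[service] = machine                                 # 2. service


def bind(service):
    machine = directory[service]                    # 3. look up server
    endpoint = endpoint_tables[machine][service]    # 4. ask for endpoint
    return machine, endpoint                        # 5. RPC goes here


register("add-service", "server-host", 4711)
target = bind("add-service")
```

Splitting the lookup this way means only the daemon on the server machine needs to know which endpoint the server currently uses.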

02 – 17 Communication/2.2 Remote Procedure Call


Remote Object Invocation

Distributed objects

Remote method invocation

Parameter passing

02 – 18 Communication/2.3 Remote Object Invocation


Remote Distributed Objects (1/2)

• Data and operations encapsulated in an object
• Operations are implemented as methods, and are accessible through interfaces
• Object offers only its interface to clients
• Object server is responsible for a collection of objects
• Client stub (proxy) implements interface
• Server skeleton handles (un)marshaling and object invocation

(Figure: on the client machine, the client invokes a method on a proxy that offers the same interface as the object. The marshalled invocation is passed across the network. On the server machine, the skeleton invokes the same method at the object, which holds state and methods behind its interface.)

02 – 19 Communication/2.3 Remote Object Invocation


Remote Distributed Objects (2/2)

Compile-time objects: Language-level objects, from which proxy and skeletons are automatically generated.

Runtime objects: Can be implemented in any language, but require use of an object adapter that makes the implementation appear as an object.

Transient objects: live only by virtue of a server: if the server exits, so will the object.

Persistent objects: live independently from a server: if a server exits, the object’s state and code remain (passively) on disk.

02 – 20 Communication/2.3 Remote Object Invocation


Client-to-Object Binding (1/2)

Object reference: Having an object reference allows a client to bind to an object:

• Reference denotes server, object, and communication protocol
• Client loads associated stub code
• Stub is instantiated and initialized for specific object

Two ways of binding:

• Implicit: Invoke methods directly on the referenced object
• Explicit: Client must first explicitly bind to object before invoking it
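
The two binding styles can be contrasted in a short Python sketch. All class and method names here (`Proxy`, `ObjectReference`, `bind`, `do_something`) are hypothetical, and no network is involved; the point is only where the bind step happens.

```python
class Proxy:
    """Client-side stub for one specific remote object (hypothetical)."""

    def __init__(self, server, object_id):
        self.server = server
        self.object_id = object_id

    def do_something(self):
        # A real proxy would marshal the invocation and send it to the server.
        return f"invoked {self.object_id} at {self.server}"


class ObjectReference:
    """Systemwide reference naming the server and the object."""

    def __init__(self, server, object_id):
        self.server = server
        self.object_id = object_id
        self._proxy = None

    def bind(self):
        # Explicit binding: the client asks for a proxy before invoking.
        return Proxy(self.server, self.object_id)

    def __getattr__(self, name):
        # Implicit binding: reached only for attributes the reference lacks;
        # bind on first use and forward the lookup to the proxy.
        if self._proxy is None:
            self._proxy = self.bind()
        return getattr(self._proxy, name)


obj_ref = ObjectReference("server-1", "obj-42")
implicit_result = obj_ref.do_something()   # implicit: bind happens inside

obj_ptr = obj_ref.bind()                   # explicit: bind first ...
explicit_result = obj_ptr.do_something()   # ... then invoke on the proxy
```

Both calls reach the same object; implicit binding hides the extra step at the cost of doing work behind the client's back.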

02 – 21 Communication/2.3 Remote Object Invocation


Client-to-Object Binding (2/2)

(Code examples: one showing implicit binding, one showing explicit binding.)

Some remarks:

• Reference may contain a URL pointing to an implementation file
• (Server,object) pair is enough to locate target object
• We need only a standard protocol for loading and instantiating code

Observation: Remote-object references allow us to pass references as parameters. This was difficult with ordinary RPCs.
02 – 22 Communication/2.3 Remote Object Invocation
Remote Method Invocation

Basics: (Assume client stub and server skeleton are in place)

• Client invokes method at stub
• Stub marshals request and sends it to server
• Server ensures referenced object is active:
  – Create separate process to hold object
  – Load the object into server process
  – ...
• Request is unmarshaled by object’s skeleton, and referenced method is invoked
• If request contained an object reference, invocation is applied recursively (i.e., server acts as client)
• Result is marshaled and passed back to client
• Client stub unmarshals reply and passes result to client application

02 – 23 Communication/2.3 Remote Object Invocation


RMI: Parameter Passing (1/2)

Object reference: Much easier than in the case of RPC:

• Server can simply bind to referenced object, and invoke methods
• Unbind when referenced object is no longer needed

Object-by-value: A client may also pass a complete object as parameter value:

• An object has to be marshaled:
  – Marshal its state
  – Marshal its methods, or give a reference to where an implementation can be found
• Server unmarshals object. Note that we have now created a copy of the original object.
• Object-by-value passing tends to introduce nasty problems
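
A minimal sketch of object-by-value passing, assuming JSON as the wire format and a hypothetical `CLASS_REGISTRY` that plays the role of "a reference to where an implementation can be found". It shows the copy problem directly: changes made at the receiving side do not reach the original.

```python
import json


class Counter:
    def __init__(self, value=0):
        self.value = value

    def increment(self):
        self.value += 1


# Assumed lookup table: class name on the wire -> implementation.
CLASS_REGISTRY = {"Counter": Counter}


def marshal(obj):
    # Ship the state plus the name of the class that implements the methods.
    payload = {"class": type(obj).__name__, "state": vars(obj)}
    return json.dumps(payload).encode("utf-8")


def unmarshal(data):
    payload = json.loads(data.decode("utf-8"))
    cls = CLASS_REGISTRY[payload["class"]]
    obj = cls.__new__(cls)               # create without running __init__
    obj.__dict__.update(payload["state"])
    return obj


original = Counter()
server_copy = unmarshal(marshal(original))
server_copy.increment()                  # affects only the copy
```

After the call the two objects have diverged, which is one of the nasty problems the slide alludes to.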

02 – 24 Communication/2.3 Remote Object Invocation


RMI: Parameter Passing (2/2)
(Figure: client code on machine A holds a local reference L1 to local object O1 and a remote reference R1 to remote object O2 on machine B. The client does a remote invocation, through a proxy, on the server at machine C with L1 and R1 as parameters. The server code (method implementation) on machine C receives a copy of O1 with a new local reference, and a copy of R1 to O2.)

Question: What’s an alternative implementation for a remote-object reference?

02 – 25 Communication/2.3 Remote Object Invocation


Message-Oriented Communication

Synchronous versus asynchronous communications

Message-Queuing System

Message Brokers

Example: IBM MQSeries

02 – 26 Communication/2.4 Message-Oriented Communication


Synchronous Communication
Some observations: Client/Server computing is generally based on a model of synchronous communication:

• Client and server have to be active at the time of communication
• Client issues request and blocks until it receives reply
• Server essentially waits only for incoming requests, and subsequently processes them

Drawbacks of synchronous communication:

• Client cannot do any other work while waiting for reply
• Failures have to be dealt with immediately (the client is waiting)
• In many cases the model is simply not appropriate (mail, news)

02 – 27 Communication/2.4 Message-Oriented Communication


Asynchronous Communication: Messaging

Message-oriented middleware: Aims at high-level asynchronous communication:

• Processes send each other messages, which are queued
• Sender need not wait for immediate reply, but can do other things
• Middleware often ensures fault tolerance
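
The queuing idea can be sketched with Python's standard `queue` and `threading` modules. This is an in-process analogy, not a messaging middleware: the names `mailbox` and `receiver` are made up, and `None` is used as an assumed end-of-messages marker.

```python
import queue
import threading

mailbox = queue.Queue()
received = []


def receiver():
    while True:
        message = mailbox.get()
        if message is None:        # assumed marker: no more messages
            break
        received.append(message)


# The sender queues its messages and carries on; the receiver is not even
# running yet.
for text in ("m1", "m2"):
    mailbox.put(text)

worker = threading.Thread(target=receiver)
worker.start()
mailbox.put(None)
worker.join()
```

The sender never blocks on the receiver; the queue decouples the two in time.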

(Figure: general organization of a messaging system. Application programs on the sending and receiving hosts use a messaging interface on top of a local buffer. Communication servers, reached over a local network and connected by an internetwork, buffer incoming messages independently of the communicating hosts and route them, also to other remote communication servers.)

02 – 28 Communication/2.4 Message-Oriented Communication


Persistent vs. Transient Communication

Persistent communication: A message is stored at a communication server as long as it takes to deliver it at the receiver.

Transient communication: A message is discarded by a communication server as soon as it cannot be delivered at the next server, or at the receiver.
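
The two definitions differ only in what a server does with a message it cannot deliver right now. A toy model, with a hypothetical `CommunicationServer` class and a plain list as the receiver's inbox:

```python
class CommunicationServer:
    def __init__(self, persistent):
        self.persistent = persistent
        self.stored = []

    def submit(self, message, inbox, receiver_running):
        if receiver_running:
            inbox.append(message)
            return "delivered"
        if self.persistent:
            self.stored.append(message)   # keep until delivery is possible
            return "stored"
        return "discarded"                # transient: drop it

    def receiver_started(self, inbox):
        # Hand over everything that was kept for this receiver.
        inbox.extend(self.stored)
        self.stored.clear()


persistent_inbox, transient_inbox = [], []
persistent = CommunicationServer(persistent=True)
transient = CommunicationServer(persistent=False)

persistent_outcome = persistent.submit("hello", persistent_inbox, receiver_running=False)
transient_outcome = transient.submit("hello", transient_inbox, receiver_running=False)

persistent.receiver_started(persistent_inbox)
transient.receiver_started(transient_inbox)
```

Only the persistent server delivers the message once the receiver comes up.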

02 – 29 Communication/2.4 Message-Oriented Communication


Messaging Combinations
(Figure: six combinations, each shown as a timeline for sender A and receiver B.)

(a) Persistent asynchronous: A sends message and continues. The message is stored at B's location for later delivery; B is not running, then starts and receives the message.
(b) Persistent synchronous: A sends message and waits until accepted. B is not running when the message is accepted, then starts and receives the message.
(c) Transient asynchronous: A sends message and continues. The message can be sent only if B is running; B receives the message.
(d) Receipt-based transient synchronous: A sends request and waits until received. The request is received and ACKed while B is running, but doing something else; B processes the request later.
(e) Delivery-based transient synchronous: A sends request and waits until accepted. The request is received while B is doing something else; A continues once B accepts and starts processing the request.
(f) Response-based transient synchronous: A sends request and waits for reply. The request is received, B processes it, and A continues once the reply arrives.

02 – 30 Communication/2.4 Message-Oriented Communication


Message-Oriented Middleware

Essence: Asynchronous persistent communication through support of middleware-level queues. Queues correspond to buffers at communication servers.

Canonical example: IBM MQSeries

02 – 31 Communication/2.4 Message-Oriented Communication


IBM MQSeries (1/3)
Basic concepts:

• Application-specific messages are put into, and removed from queues
• Queues always reside under the regime of a queue manager
• Processes can put messages only in local queues, or through an RPC mechanism

Message transfer:

• Messages are transferred between queues
• Message transfer between queues at different processes requires a channel
• At each endpoint of a channel is a message channel agent
• Message channel agents are responsible for:
  – Setting up channels using lower-level network communication facilities (e.g., TCP/IP)
  – (Un)wrapping messages from/in transport-level packets
  – Sending/receiving packets

02 – 32 Communication/2.4 Message-Oriented Communication


IBM MQSeries (2/3)
(Figure: general organization of an MQSeries queuing network. Programs on the sending and receiving clients access a queue manager through the MQ interface, either locally or through stubs and server stubs using synchronous RPC. Queue managers hold a routing table, send queues, and the client's receive queue. Their MCAs exchange messages by asynchronous message passing over the local network and the internetwork, also to other remote queue managers.)

• Channels are inherently unidirectional
• MQSeries provides mechanisms to automatically start MCAs when messages arrive, or to have a receiver set up a channel
• Any network of queue managers can be created; routes are set up manually (system administration)

02 – 33 Communication/2.4 Message-Oriented Communication


IBM MQSeries (3/3)

Routing: By using logical names, in combination with name resolution to local queues, it is possible to put a message in a remote queue
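
Name resolution can be sketched as two table lookups at one queue manager. The table contents below are illustrative values chosen for this sketch, and `resolve` is a hypothetical helper, not an MQSeries call.

```python
# Alias table: logical name -> destination queue manager.
alias_table = {"LA1": "QMC", "LA2": "QMD"}
# Routing table: destination queue manager -> local send queue.
routing_table = {"QMB": "SQ1", "QMC": "SQ1", "QMD": "SQ2"}


def resolve(logical_name):
    """Return (destination queue manager, local send queue to put into)."""
    queue_manager = alias_table[logical_name]
    send_queue = routing_table[queue_manager]
    return queue_manager, send_queue


first = resolve("LA1")
second = resolve("LA2")
```

The application only ever names `LA1` or `LA2`; which local send queue the message lands in is decided by the tables.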

(Figure: a network of four queue managers QMA, QMB, QMC, and QMD. Each has a routing table that maps destination queue managers to local send queues (SQ1, SQ2); alias tables map the logical names LA1 and LA2 to queue managers.)

Question: What’s a major problem here?

02 – 34 Communication/2.4 Message-Oriented Communication


Message Broker
Observation: Message queuing systems assume a common messaging protocol: all applications agree on message format (i.e., structure and data representation)

Message broker: Centralized component that takes care of application heterogeneity in a message-queuing system:

• Transforms incoming messages to target format, possibly using intermediate representation
• May provide subject-based routing capabilities
• Acts very much like an application gateway
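
A toy broker that converts between two made-up message formats through an intermediate representation. The formats "A" and "B", their field names, and the `RULES` table are assumptions for illustration only.

```python
def from_a(message):
    # Format A -> intermediate representation.
    return {"name": message["customer"], "amount": message["total"]}


def to_b(common):
    # Intermediate representation -> format B.
    return {"client": common["name"], "sum": common["amount"]}


# Conversion rules, keyed by (source format, target format).
RULES = {("A", "B"): (from_a, to_b)}


def broker(message, source, target):
    decode, encode = RULES[(source, target)]
    return encode(decode(message))


converted = broker({"customer": "Ann", "total": 12}, "A", "B")
```

With an intermediate representation, each new format needs one decoder and one encoder rather than a converter per pair of formats.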
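(Figure: a source client and a destination client communicate through a message broker over the network. On the broker, the broker program sits on top of the queuing layer and the OS, and uses a database with conversion rules.)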

02 – 35 Communication/2.4 Message-Oriented Communication


Stream-Oriented Communication

Support for continuous media

Streams in distributed systems

Stream management

02 – 36 Communication/2.5 Stream-Oriented Communication


Continuous Media
Observation: All communication facilities discussed so far are essentially based on a discrete, that is, time-independent exchange of information

Continuous media: Characterized by the fact that values are time dependent:

• Audio
• Video
• Animations
• Sensor data (temperature, pressure, etc.)

Transmission modes: Different timing guarantees with respect to data transfer:

• Asynchronous: no restrictions with respect to when data is to be delivered
• Synchronous: define a maximum end-to-end delay for individual data packets
• Isochronous: define a maximum and minimum end-to-end delay (jitter is bounded)
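
The three modes can be read as increasingly strict checks on observed end-to-end delays. The function `satisfies` and the sample delays (in milliseconds) are hypothetical.

```python
def satisfies(mode, delays, min_delay=None, max_delay=None):
    """Check whether a list of per-packet delays meets the given mode."""
    if mode == "asynchronous":
        return True                                   # no timing restriction
    if mode == "synchronous":
        return all(d <= max_delay for d in delays)    # upper bound only
    if mode == "isochronous":
        return all(min_delay <= d <= max_delay for d in delays)
    raise ValueError(f"unknown mode: {mode}")


delays = [10, 40, 25]

async_ok = satisfies("asynchronous", delays)
sync_ok = satisfies("synchronous", delays, max_delay=50)
iso_ok = satisfies("isochronous", delays, min_delay=20, max_delay=50)
```

The same delays pass the synchronous check but fail the isochronous one, because one packet arrived earlier than the minimum delay allows.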

02 – 37 Communication/2.5 Stream-Oriented Communication


Stream (1/2)
Definition: A (continuous) data stream is a connection-oriented communication facility that supports isochronous data transmission

Some common stream characteristics:

• Streams are unidirectional
• There is generally a single source, and one or more sinks
• Often, the sink and/or the source is a wrapper around hardware (e.g., camera, CD device, TV monitor, dedicated storage)

Stream types:

• Simple: consists of a single flow of data, e.g., audio or video
• Complex: multiple data flows, e.g., stereo audio or combination audio/video

02 – 38 Communication/2.5 Stream-Oriented Communication


Stream (2/2)

Issue: Streams can be set up between two processes at different machines, or directly between two different devices. Combinations are possible as well.

(Figure: (a) a stream between a program in a sending process and a receiving process, passing through the OS on each machine and the network; (b) a stream set up directly between a camera and a display.)

02 – 39 Communication/2.5 Stream-Oriented Communication


Streams and QoS

Essence: Streams are all about timely delivery of data. How do you specify this Quality of Service (QoS)? Make distinction between specification and implementation of QoS.

Flow specification: Use a token-bucket model and express QoS in that model

(Figure: token bucket. The application produces an irregular stream of data units; one token is added to the bucket every ∆T; the output is a regular stream.)

Input characteristics                  Required Service
Maximum data unit size (bytes)         Loss sensitivity (bytes)
Token bucket rate (bytes/sec)          Loss interval (µsec)
Token bucket size (bytes)              Burst loss sensitivity (data units)
Max. transmission rate (bytes/sec)     Min. delay noticed (µsec)
                                       Max. delay variation (µsec)
                                       Quality of guarantee
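
A discrete-time sketch of the token-bucket model. It assumes one tick corresponds to ∆T and that one token lets one data unit pass; the function name `token_bucket` and the sample arrival pattern are made up for illustration.

```python
def token_bucket(arrivals, bucket_size, tokens_per_tick=1):
    """arrivals[t] is the number of data units arriving in tick t.

    Returns the number of data units released in each tick.
    """
    tokens = 0
    backlog = 0
    released = []
    for arriving in arrivals:
        # A token is added every tick, but the bucket cannot overflow.
        tokens = min(bucket_size, tokens + tokens_per_tick)
        backlog += arriving
        # Each waiting data unit needs one token to pass.
        sent = min(backlog, tokens)
        tokens -= sent
        backlog -= sent
        released.append(sent)
    return released


# Irregular input: a burst of 3, silence, then a burst of 2.
output = token_bucket([3, 0, 0, 0, 2, 0], bucket_size=2)
```

The first burst is smoothed to one unit per tick, while the second passes at once because tokens were saved up during the idle ticks, never more than the bucket size.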

02 – 40 Communication/2.5 Stream-Oriented Communication


Implementing QoS

Problem: QoS specifications translate to resource reservations in the underlying communication system. There is no standard way of (1) specifying QoS, (2) describing resources, or (3) mapping specs to reservations.

Approach: Use the Resource reSerVation Protocol (RSVP) as a first attempt. RSVP is a transport-level protocol.

(Figure: organization of an RSVP-enabled host. The application in the sender process passes its application data stream to the local OS. The RSVP process, with its RSVP program, policy control, and admission control, receives reservation requests from other RSVP hosts and sends setup information to other RSVP hosts. The data link layer carries the data link layer data stream over the local network and the internetwork.)

02 – 41 Communication/2.5 Stream-Oriented Communication


Stream Synchronization

Problem: Given a complex stream, how do you keep the different substreams in sync?

Example: Think of playing out two channels that together form stereo sound. The difference should be less than 20–30 µsec!

(Figure: on the receiver's machine, incoming streams pass from the network through the OS to the middleware layer. Multimedia control is part of the middleware; the application tells the middleware what to do with the incoming streams.)

Alternative: multiplex all substreams into a single stream, and demultiplex at the receiver. Synchronization is handled at the multiplexing/demultiplexing point (MPEG).
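
The multiplexing alternative can be sketched by interleaving the samples of two substreams. This is a bare illustration of the idea, not the MPEG format; the function names and sample values are made up.

```python
def multiplex(left, right):
    # Interleave samples so that matching left/right samples travel together.
    return [sample for pair in zip(left, right) for sample in pair]


def demultiplex(stream):
    # Even positions belong to the left channel, odd ones to the right.
    return stream[0::2], stream[1::2]


left = [1, 2, 3]
right = [10, 20, 30]

single_stream = multiplex(left, right)
left_out, right_out = demultiplex(single_stream)
```

Because matching samples sit next to each other in the single stream, the receiver gets them at the same time and no separate synchronization between substreams is needed.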
02 – 42 Communication/2.5 Stream-Oriented Communication
