Você está na página 1de 25

Distributed system

Distributed File System

Nguyen Huu Tuong Vinh

Huynh Thi Thu Thuy
Dang Trang Tri
• File system: support convenient
programming interface to disk storage,
access control and file looking mechanism.
• Developing of intranet, internet  sharing
data between many PC on net  Distributed
file system.( many OS and many file system
types. )
Storage systems and their
Sharing Persis- Distributed Consistency Example
tence cache/replicas maintenance

Main memory 1 RAM

File system 1 UNIX file system
Distributed file system Sun NFS
Web Web server

Distributed shared memory Ivy (DSM, Ch. 18)

Remote objects (RMI/ORB) 1 CORBA
Persistent object store 1 CORBA Persistent
Object Service
Peer-to-peer storage system 2 OceanStore (Ch. 10)

Types of consistency:
1: strict one-copy. 3: slightly weaker guarantees. 2: considerably weaker guarantees.
File system modules

Directory module: relates file names to file IDs

File module: relates file IDs to particular files

Access control module: checks permission for operation requested

File access module: reads or writes file data or attributes

Block module: accesses and allocates disk blocks

Device module: disk I/O and buffering

File attribute record structure
File length
Creation timestamp
Read timestamp
Write timestamp
Attribute timestamp
Reference count
File type
Access control list
File system
• Responsibilities: organization, storage,
retrieval, naming, sharing, protection.
• Functions: friendly programming interface,
freeing programmers from concern with
the details of storage allocation and
UNIX file system operations
filedes = open(name, mode) Opens an existing file with the given name.
filedes = creat(name, mode) Creates a new file with the given name.
Both operations deliver a file descriptor referencing the open
file. The mode is read, write or both.
status = close(filedes) Closes the open file filedes.
count = read(filedes, buffer, n) Transfers n bytes from the file referenced by filedes to buffer.
count = write(filedes, buffer, n) Transfers n bytes to the file referenced by filedes from buffer.
Both operations deliver the number of bytes actually transferred
and advance the read-write pointer.
pos = lseek(filedes, offset, Moves the read-write pointer to offset (relative or absolute,
whence) depending on whence).
status = unlink(name) Removes the file name from the directory structure. If the file
has no other names, it is deleted.
status = link(name1, name2) Adds a new name (name2) for a file (name1).
status = stat(name, buffer) Gets the file attributes for file name into buffer.
Distributed File System
• Requirements:
– Transparency: access, location, mobility,
– Concurrent File Updates
– File replication
– Hardware and OS heterogeneity
– Fault tolerance
– Consistency
– Security
– Efficiency
Case studies
• File service architecture
• Sun NFS
• Andrew file system
File service architecture
• Flat file service
• Directory service
• Client module
• Access control
• Hierarchic file system
• File group
File service architecture
Client computer Server computer

Application Application Directory service

program program

Flat file service

Client module
Flat file service operations

Read(FileId, i, n) -> Data If 1 ≤ i ≤ Length(File): Reads a sequence of up to n items

— throws BadPosition from a file starting at item i and returns it in Data.
Write(FileId, i, Data) If 1 ≤ i ≤ Length(File)+1: Writes a sequence of Data to a
— throws BadPosition file, starting at item i, extending the file if necessary.
Create() -> FileId Creates a new file of length 0 and delivers a UFID for it.
Delete(FileId) Removes the file from the file store.
GetAttributes(FileId) -> Attr Returns the file attributes for the file.
SetAttributes(FileId, Attr) Sets the file attributes (only those attributes that are not
shaded in Figure 8.3).
Directory service operations

Lookup(Dir, Name) -> FileId Locates the text name in the directory and returns the
— throwsNotFound relevant UFID. If Name is not in the directory, throws an
AddName(Dir, Name, FileId) If Name is not in the directory, adds (Name, File) to the
— throwsNameDuplicate directory and updates the file’s attribute record.
If Name is already in the directory: throws an exception.
UnName(Dir, Name) If Name is in the directory: the entry containing Name is
— throwsNotFound removed from the directory.
If Name is not in the directory: throws an exception.
GetNames(Dir, Pattern) -> NameSeqReturns all the text names in the directory that match the
regular expression Pattern.
• Virtual file system
• Client integration
• Access control and authentication
• Mount service
• Path name translation
• Automounter
• Server caching
• Client caching
• Other optimizations
• Securing NFS with Kerberos
• Performance
NFS architecture
Client computer Server computer

Application Application
program program
system calls
UNIX kernel
UNIX kernel Virtual file system Virtual file system
Local Remote
file system


file file

client server
system system
NFS server operations (simplified) – 1
lookup(dirfh, name) -> fh, attr Returns file handle and attributes for the file name in the directory
create(dirfh, name, attr) -> Creates a new file name in directory dirfh with attributes attr and
newfh, attr returns the new file handle and attributes.
remove(dirfh, name) status Removes file name from directory dirfh.
getattr(fh) -> attr Returns file attributes of file fh. (Similar to the UNIX stat system
setattr(fh, attr) -> attr Sets the attributes (mode, user id, group id, size, access time
and time of a file). Setting the size to 0 truncates the file.
read(fh, offset, count) -> attr, data Returns up to count bytes of data from a file starting at offset.
Also returns the latest attributes of the file.
write(fh, offset, count, data) -> attrWrites count bytes of data to a file starting at offset. Returns the
attributes of the file after the write has taken place.
rename(dirfh, name, todirfh, toname)
Changes the name of file name in directory dirfh to toname in
-> status directory to todirfh
link(newdirfh, newname, dirfh, name)
Creates an entry newname in the directory newdirfh which refers to
-> status file name in the directory dirfh.
Continues on next slide ...
NFS server operations (simplified) – 2
symlink(newdirfh, newname, string)Creates an entry newname in the directory newdirfh of type
-> status symbolic link with the value string. The server does not interpret
the string but makes a symbolic link file to hold it.
readlink(fh) -> string Returns the string that is associated with the symbolic link file
identified by fh.
mkdir(dirfh, name, attr) -> Creates a new directory name with attributes attr and returns the
newfh, attr new file handle and attributes.
rmdir(dirfh, name) -> status Removes the empty directory name from the parent directory dirfh.
Fails if the directory is not empty.
readdir(dirfh, cookie, count) -> Returns up to count bytes of directory entries from the directory
entries dirfh. Each entry contains a file name, a file handle, and an opaque
pointer to the next directory entry, called a cookie. The cookie is
used in subsequent readdir calls to start reading from the following
entry. If the value of cookie is 0, reads from the first entry in the
statfs(fh) -> fsstats Returns file system information (such as block size, number of
free blocks and so on) for the file system containing a file fh.
Local and remote file systems
accessible on an NFS client
Server 1 Client Server 2
(root) (root) (root)

export ... vmunix usr nfs

Remote Remote
people students x staff users
mount mount

big jon bob ... jim ann jane joe

Note: The file system mounted at /usr/students in the client is actually the sub-tree located at /export/people in Server 1;
the file system mounted at /usr/staff in the client is actually the sub-tree located at /nfs/users in Server 2.
Andrew file system
• 2 unusual design characteristics: whole
file serving and whole file caching
• Scenario
• Cache consistency: update semantics
• Others aspects: Unix kernel modification,
location database, thread, read only
replicas, bulk transfers, partial file caching,
performance, wide-area support.
Distribution of processes in the
Andrew File System
Workstations Servers

User Venus
UNIX kernel

UNIX kernel

User Venus Network

UNIX kernel


program UNIX kernel
UNIX kernel
File name space seen by clients
of AFS
Local Shared
/ (root)

tmp bin . . . vmunix cmu


System call interception in AFS

User Venus
UNIX file Non-local file
system calls operations

UNIX kernel
UNIX file system

Implementation of file system calls in AFS
U s er p r o ces s U N IX ker n el Ven u s Ne t Vice
o p en (FileN a m e, If FileN a m erefers to a
m o d e) file in s h ared file s p ace,
p as s th e req u es t to C h eck lis t o f files in
lo cal cach e. If n o t
V en u s . p res en t o r th ere is n o
v alidca llb a ck p r o m ,is e
s en d a req u es t fo r th e
file to th e V ice s erv er
th at is cu s to d ian o f th e
v o lu me co n tain in g th e Tran s fer a co p y o f th e
file. file an d aca llb a ck
p r o m is eto th e
w o rk s tatio n . Lo g th e
P lace th e co p y o f th e callb ack p ro mis e.
file in th e lo cal file
O p en th e lo cal file an ds y s tem, en ter its lo cal
retu rn th e file n ame in th e lo cal cach e
d es crip to r to th e lis t an d retu rn th e lo cal
ap p licatio n . n ame to U N IX .
r ea d (FileD es cr ip to r ,P erfo rm a n o rmal
Bu ffer , len g th ) U N IX read o p eratio n
o n th e lo cal co p y .
w r ite(FileD es cr ip to rP, erfo rm a n o rmal
Bu ffer , len g th ) U N IX w rite o p eratio n
o n th e lo cal co p y .
clo s e(FileD es cr ip to rC) lo s e th e lo cal co p y
an d n o tify V en u s th at
th e file h as b een clo s edIf . th e lo cal co p y h as
b een ch an g ed , s en d a
co p y to th e V ice s erv er R ep lace th e file
th at is th e cu s to d ian o f co n ten ts an d s en d a
th e file. ca llb a ckto all o th er
clien ts h o ld incag llba ck
p r o m is eso n th e file.
The main components of the Vice
service interface
Fetch(fid) -> attr, dataReturns the attributes (status) and, optionally, the contents of file
identified by the fid and records a callback promise on it.
Store(fid, attr, data) Updates the attributes and (optionally) the contents of a specified
Create() -> fid Creates a new file and records a callback promise on it.
Remove(fid) Deletes the specified file.
SetLock(fid, mode) Sets a lock on the specified file or directory. The mode of the
lock may be shared or exclusive. Locks that are not removed
expire after 30 minutes.
ReleaseLock(fid) Unlocks the specified file or directory.
RemoveCallback(fid) Informs server that a Venus process has flushed a file from its
BreakCallback(fid) This call is made by a Vice server to a Venus process. It cancels
the callback promise on the relevant file.
Recent advances
• NFS enhancement
• AFS enhancement
• Improvements in storage organization
• New design approaches