Você está na página 1de 61

File Systems

Fred Kuhns
(fredk@arl.wustl.edu, http://www.arl.wustl.edu/~fredk)

Department of Computer Science and Engineering


Washington University in St. Louis

Washington
WASHINGTON UNIVERSITY IN ST LOUIS
Why a file system
• There is a general need for long-term
and shared data storage:
– need to store large amount of information
– persistent storage (outlives process and
system reboots)
– concurrent sharing of information
• Files meet these requirements
• The file manager or file system within
the OS

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 2


File Concept
• Abstraction presented to the user
• Named collection of related information on
secondary storage.
• File name may encode the file type
– file extensions in UNIX and Windows
• Common examples of File types
– Regular files, directories
– Executable files
– special files (block and character)
– Archives
Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 3
File Structure
• None - sequence of words, bytes
• Simple record structure
– Lines, Fixed length, Variable length
• Complex Structures
– Formatted document, multi-media documents
• Who decides:
– Operating system
– Application
– “Middleware”
– DBMS

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 4


File Attributes
• Name – only information kept in human-readable
form.
• Type – needed for systems that support
different types.
• Location – pointer to file location on device.
• Size – current file size.
• Protection – controls who can do reading, writing,
executing.
• Time, date, and user identification – data for
protection, security, and usage monitoring.
• Information about files are kept in the directory
structure, which is maintained on the disk.
Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 5
File Operations
• create
• write
• read
• reposition within file – file seek
• delete
• truncate
• open(Fi) – search the directory structure on
disk for entry Fi, and move the content of
entry to memory.
• close (Fi) – move the content of entry Fi in
memory to directory structure on disk.
Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 6
File Types – name, extension
File Type Usual extension Function
Executable exe, com, bin or ready­to­run machine­
none language program
Object obj, o complied, machine
language, not linked
Source code c, p, pas, 177, source code in various
asm, a languages
Batch bat, sh commands to the
command interpreter
Text txt, doc textual data documents
Word processor wp, tex, rrf, etc. various word­processor
formats
Library lib, a libraries of routines
Print or view ps, dvi, gif ASCII or binary file
Archive arc, zip, tar related files grouped
into one file, sometimes
compressed.
Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 7
Access Methods
• Sequential Access -
– read next
– write next
– reset
– no read after last write
– (rewrite)
• Direct Access: n = relative block number
– read n
– write n
– position to n
– read next
– write next
– rewrite n
Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 8
Directory Structure
• A collection of nodes containing information
about all files.

Directory

Files
F 1 F 2 F 4
F 3
F n

Both the directory structure and the files reside on disk.


Backups of these two structures are kept on tapes.
Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 9
Information in a Device Directory
• Name
• Type
• Address
• Current length
• Maximum length
• Date last accessed (for archival)
• Date last updated (for dump)
• Owner ID (who pays)
• Protection information (discuss later)

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 10


Operations Performed on Directory

• Search for a file


• Create a file
• Delete a file
• List a directory
• Rename a file
• Traverse the file system

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 11


Organize the Directory (Logically)
• Efficiency – locating a file quickly.
• Naming – convenient to users.
– Two users can have same name for different
files.
– The same file can have several different
names.
• Grouping – logical grouping of files by
properties, (e.g., all Pascal programs, all
games, …)

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 12


Single-Level Directory

• A single directory for all users.

• Naming problem
• Grouping problem

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 13


Two-Level Directory
• Separate directory for each user.

• Path name
• Can have the same file name for different user
• Efficient searching
• No grouping capability
Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 14
Tree-Structured Directories

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 15


Tree-Structured Directories
• Efficient searching
• Grouping Capability
• Current directory (working directory)
– cd /spell/mail/prog
– type list

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 16


Tree-Structured Directories
• Absolute or relative path name
• Creating a new file is done in current directory.
• Delete a file
rm <file-name>
• Creating a new subdirectory is done in current
directory.
mkdir <dir-name>
Example: if in current directory /spell/mail
mkdir count
mail

prog copy prt exp count

Deleting mail ⇒ deleting the entire subtree rooted by ‘mail’


Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 17
Acyclic-Graph Directories
• Have shared subdirectories and files.

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 18


Acyclic-Graph Directories
• Two different names (aliasing)
• If dict deletes all ⇒ dangling pointer.
Solutions:
– Backpointers, so we can delete all pointers.
Variable size records a problem.
– Backpointers using a daisy chain organization.
– Entry-hold-count solution.

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 19


General Graph Directory

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 20


General Graph Directory (Cont.)

• How do we guarantee no cycles?


– Allow only links to file not
subdirectories.
– Garbage collection.
– Every time a new link is added use a
cycle detection algorithm to determine
whether it is OK.

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 21


Protection
• File owner/creator should be able to
control:
– what can be done
– by whom
• Types of access
– Read
– Write
– Execute
– Append
– Delete
– List

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 22


Access Lists and Groups
• Mode of access: read, write, execute
• Three classes of users
RWX
a) owner access 7 ⇒ 111
RWX
b) groups access 6 ⇒ 110
RWX
c) public access 1 ⇒ 001

• Ask manager to create a group (unique name), say G, and add


some users to the group.
• For particular file or subdirectory, define an appropriate
access. owner group public

chmod 761 game

Attach a group to a file


chgrp G game
Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 23
File-System Structure
• Disk divided into one or more partitions
– independent FS on each partition
– Sector 0 contains the Master Boot Record (MBR)
– MBR contains partition table
– one partition marked as active
– boot block – first block of active partition
– BIOS reads and executes MBR, which reads boot
block and executes it.
– program in boot block loads OS and runs it.
– Often FS contains superblock which contains key
FS parameters

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 24


Example Disk and Filesystem Layout
Partition Table

MBR partition 1 partition 2 (active) partition 3

boot block super block free space management inode list root dir files & dirs

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 25


Files: Contiguous Allocation
• Each file occupies a set of contiguous blocks on
the disk.
• Simple – only starting location (block #) and
length (number of blocks) are required.
• Random access.
• Wasteful of space external fragmentation, may
use compaction to fix.
• Files cannot grow.
• Mapping from logical to physical.

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 26


Linked Allocation
• Each file is a linked list of disk blocks:
blocks may be scattered anywhere on the
disk.

block      = pointer

Allocate as needed, link together; e.g., file


starts at block 9

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 27


Linked Allocation (Cont.)

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 28


Linked Allocation (Cont.)
• Simple – need only starting address
• Free-space management system – no
waste of space
• No random access
• Mapping

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 29


Indexed Allocation
• Brings all pointers together into the index
block.
• Logical view.

index table

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 30


Example of Indexed Allocation

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 31


Indexed Allocation (Cont.)
• Need index table
• Random access
• Dynamic access without external
fragmentation, but have overhead of
index block.
• Wasted space
• Index levels

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 32


Indexed Allocation – Mapping

outer­index

index table file

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 33


Combined Scheme: UNIX

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 34


Disk Space Management
• Block size – disk utilization and
performance dependent on file size.
– For example, assume medium size of 2KB
– 130,000 B/track, 7200RPM (8.33ms), 10ms
average seek time
– Txfer time = 10 + 4.165 + 8.33*k/130000
– rate = k/(10 + 4.165 + 8.33*k/130000)
• Disk space utilization
– with larger blocks you increase internal
fragmentation
Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 35
Free-Space Management
• Bit vector (n blocks)
0 1 2 n­1

0 ⇒ block[i] free
bit[i] =


1  ⇒ block[i] occupied

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 36


Free-Space Management (Cont.)
• Bit map requires extra space. Example:
block size = 212 bytes (4096 Bytes)
disk size = 234 bytes (16 GByte)
n = 234/212 = 222 bits (4Mbits=512 KBytes)
• Linked list (free list)
– Cannot get contiguous space easily
– No waste of space
• Grouping
• Counting

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 37


Free-Space Management (Cont.)
• Need to protect:
– Pointer to free list
– Bit map
• Must be kept on disk
• Copy in memory and disk may differ.
• Cannot allow for block[i] to have a situation
where bit[i] = 1 in memory and bit[i] = 0 on disk.
– Solution:
• Set bit[i] = 1 in disk.
• Allocate block[i]
• Set bit[i] = 1 in memory

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 38


Directory Implementation
• Linear list of file names with pointer to the
data blocks.
– simple to program
– time-consuming to execute
• Hash Table – linear list with hash data
structure.
– decreases directory search time
– collisions – situations where two file names hash
to the same location
– fixed size

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 39


Efficiency and Performance
• Efficiency dependent on:
– disk allocation and directory algorithms
– types of data kept in file’s directory entry
• Performance
– disk cache – separate section of main memory
for frequently sued blocks
– free-behind and read-ahead – techniques to
optimize sequential access
– improve PC performance by dedicating section
of memory as virtual disk, or RAM disk.

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 40


Various Disk-Caching Locations

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 41


Recovery

• Consistency checker – compares data


in directory structure with data
blocks on disk, and tries to fix
inconsistencies.
• Use system programs to back up data
from disk to another storage device
(floppy disk, magnetic tape).
• Recover lost file or disk by restoring
data from backup.
Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 42
File System Implementations
• UNIX Examples - SVR4 and BSD

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 43


UNIX FS Framework

Provides persistent storage


Facilities for managing data
Interface exported abstractions: files, directories, file
descriptors and file systems
kernel does not interpret file contents
files and directories form tree structure

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 44


File and Directory Organization

(hard) links /

bin etc dev usr vmunix

sh local etc
/usr/local/bin/bash
bin

bash

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 45


File Attributes

Type - for example regular, FIFO, special.


Reference count
size in bytes
device id
ownership
access modes
timestamps

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 46


User View of Files

File Descriptors (open, dup, dup2, fork)


All I/O is through file descriptors
references the open file object
per process object
File Object - holds context
created by an open() system call
stores file offset
reference to vnode
vnode - abstract representation of a file

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 47


How it works

File Descriptors Open File Objects


{{0, uf_ofile} {*f_vnode,f_offset,f_count,...},
{1, uf_ofile} {*f_vnode,f_offset,f_count,...},
{2 , uf_ofile} {*f_vnode,f_offset,f_count,...},
{3 , uf_ofile} {*f_vnode,f_offset,f_count,...},
{4 , uf_ofile} {*f_vnode,f_offset,f_count,...}}
{5 , uf_ofile}}

Vnode/vfsVnode/vfs
In-memory Vnode/vfsVnode/vfs
In-memory Vnode/vfs
representation In-memory
representation In-memory
In-memory
representation
of file of file representation
representation
of file of file
of file
Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 48
File Systems

File hierarchy composed of one or more File Systems


One File System is designated the Root File System
Attached to mount points
File can not span multiple File Systems
resides on one logical disk

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 49


Logical Disks

Viewed as linear sequence of fixed sized, randomly


accessible blocks.
A file system must reside in a logical disk, however a logical
disk need not contain a file system.
Typically physical disks divided into partitions that
correspond to logical disks

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 50


FS Overview

System calls

vnode interface

tmpfs swapfs UFS HSFS PCFS RFS /proc NFS

disk cdrom diskette Process


Anonymous
address
memory
space

Example from Solaris


Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 51
Local Filesystems

S5fs - System V file system. Based on the original


implementation.
FFS/UFS - BSD developed filesystem with optimized disk
usage algorithms

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 52


S5fs - Disk layout

Viewed as a linear array of blocks


Typical disk block size 512, 1024, 2048 bytes
Physical block number is the block’s index
disk uses cylinder, track and sector
first few blocks are the boot area, which is followed by the
inode list (fixed size)

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 53


Disk Layout

tract sector heads

cylinder

Rotational speed platters


disk seek time

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 54


S5fs disk layout

bootarea superblock inode list data

Boot area - code to initialize bootstrap the system

Superblock - metadata for filesystem. Size of FS, size


of inode list, number of free blocks/inodes,
free block/inode list

inode list - linear array of 64byte inode structs

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 55


s5fs - some details

inode name
8 . Di_mode (2)
45 .. di_nlinks (2)
0 “” di_uid (2)
123 myfile di_gid (2)
di_size (4)
di_addr (39)
di_gen (1)
di_atime (4)
di_mtime (4)
2 byte 14byte di_ctime (4)

directory On-disk inode

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 56


Locating file data blocks

Assume 1024 Byte Blocks


0
1
2
3
4

cks
5
6 blo
7
256

8 k s
l o c
9 b
3 6
10 - indirect 5 , 5
11 - double indirect 6
12 - triple indirect 16,777,216 blocks
Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 57
BSD - FFS

Disk partition divided into cylinder groups


superblocks restructured and replicated across partition
Constant information
cylinder group summary info such as free inodes and free
block
support block fragments
Long file names
new disk block allocation strategy

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 58


FFS Allocation strategy

Goal: Collocate similar data/info.


file inodes located in same cyl group as dir.
new dirs created in different cyl groups.
Place file data blocks/inode in same cyl group - for size < 48K
allocate sequential blocks at a rotationally optimal position.
Choose cyl group with “best” free count

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 59


Is FFS/UFS Better?

Measurements have shown substantial performance benefits


over s5fs
FFS however, is sub-optimal when the disk is nearly full.
Thus 10% is always kept free.
Modern disks however, no longer match the underlying
assumptions of FFS

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 60


Example Disk Buffer Cache

Free
Hash (device,inode) (LRU)

Fred Kuhns (01/17/09) Cs422 – Operating Systems Organization 61

Você também pode gostar