Você está na página 1de 7

Lecture Outine

• Why Automatic Memory Management?

Automatic Memory Management • Garbage Collection

• Three Techniques
Lecture 17 – Mark and Sweep
– Stop and Copy
– Reference Counting

Prof. Aiken CS 143 Lecture 17 1 Prof. Aiken CS 143 Lecture 17 2

Why Automatic Memory Management? Type Safety and Memory Management

• Storage management is still a hard problem in modern • Can types prevent errors in programs with
programming
manual allocation and deallocation of memory?
– some fancy type systems (linear types) were
• C and C++ programs have many storage bugs
– forgetting to free unused memory
designed for this purpose but they complicate
– dereferencing a dangling pointer programming significantly
– overwriting parts of a data structure by accident
– and so on...
• Currently, if you want type safety then you
• Storage bugs are hard to find must use automatic memory management
– a bug can lead to a visible effect far away in time and
program text from the source

Prof. Aiken CS 143 Lecture 17 3 Prof. Aiken CS 143 Lecture 17 4

Automatic Memory Management The Basic Idea

• This is an old problem: • When an object is created, unused space is


– studied since the 1950s for LISP automatically allocated
– In Cool, new objects are created by new X
• There are well-known techniques for
completely automatic memory management • After a while there is no more unused space

• Became mainstream with the popularity of • Some space is occupied by objects that will
Java never be used again
– This space can be freed to be reused later

Prof. Aiken CS 143 Lecture 17 5 Prof. Aiken CS 143 Lecture 17 6

1
The Basic Idea (Cont.) Garbage

• How can we tell whether an object will “never • An object x is reachable if and only if:
be used again”? – a register contains a pointer to x, or
– in general, impossible to tell – another reachable object y contains a pointer to x
– we will use heuristics
• You can find all reachable objects by starting
• Observation: a program can use only the from registers and following all the pointers
objects that it can find:
let x : A  new A in { x  y; ... } • An unreachable object can never be used
– After x  y there is no way to access the newly – such objects are garbage
allocated object
Prof. Aiken CS 143 Lecture 17 7 Prof. Aiken CS 143 Lecture 17 8

Reachability is an Approximation Tracing Reachable Values in Coolc

• Consider the program: • In coolc, the only register is the accumulator


x  new A; – it points to an object
y  new B – and this object may point to other objects, etc.
x  y;
if alwaysTrue() then x  new A else x.foo() fi • The stack is more complex
– each stack frame contains pointers
• After x  y (assuming y becomes dead there) • e.g., method parameters
– the first object A is unreachable – each stack frame also contains non-pointers
– the object B is reachable (through x) • e.g., return address
– thus B is not garbage and is not collected – if we know the layout of the frame we can find the pointers in
• but object B is never going to be used it

Prof. Aiken CS 143 Lecture 17 9 Prof. Aiken CS 143 Lecture 17 10

A Simple Example Elements of Garbage Collection

acc A B C D E • Every garbage collection scheme has the


following steps
1. Allocate space as needed for new objects
SP Frame 1 Frame 2 2. When space runs out:
a) Compute what objects might be used again
• In Coolc we start tracing from acc and stack (generally by tracing objects reachable from a
– These are the roots set of “root” registers)
b) Free the space used by objects not found in (a)
• Note B and D are unreachable from acc and
stack
• Some strategies perform garbage collection
– Thus we can reuse their storage before the space actually runs out
Prof. Aiken CS 143 Lecture 17 11 Prof. Aiken CS 143 Lecture 17 12

2
Mark and Sweep The Mark Phase

• When memory runs out, GC executes two let todo = { all roots }
phases while todo   do
– the mark phase: traces reachable objects pick v  todo
– the sweep phase: collects garbage objects todo  todo - { v }
if mark(v) = 0 then (* v is unmarked yet *)
• Every object has an extra bit: the mark bit mark(v)  1
– reserved for memory management let v1,...,vn be the pointers contained in v
– initially the mark bit is 0
todo  todo  {v1,...,vn}
fi
– set to 1 for the reachable objects in the mark
phase od
Prof. Aiken CS 143 Lecture 17 13 Prof. Aiken CS 143 Lecture 17 14

The Sweep Phase The Sweep Phase (Cont.)

• The sweep phase scans the heap looking for (* sizeof(p) is the size of block starting at p *)
p  bottom of heap
objects with mark bit 0
while p < top of heap do
– these objects were not visited in the mark phase if mark(p) = 1 then
– they are garbage mark(p)  0
else
add block p...(p+sizeof(p)-1) to freelist
• Any such object is added to the free list
fi
p  p + sizeof(p)
• The objects with a mark bit 1 have their mark od
bit reset to 0
Prof. Aiken CS 143 Lecture 17 15 Prof. Aiken CS 143 Lecture 17 16

Mark and Sweep Example Details

root A 0 B 0 C 0 D 0 E 0 F 0 • While conceptually simple, this algorithm has a


number of tricky details
free
– typical of GC algorithms
After mark:
root A 1 B 0 C 1 D 0 E 0 F 1
• A serious problem with the mark phase
free – it is invoked when we are out of space
After sweep: – yet it needs space to construct the todo list
root A 0 B 0 C 0 D 0 E 0 F 0 – the size of the todo list is unbounded so we cannot
reserve space for it a priori
free
Prof. Aiken CS 143 Lecture 17 17 Prof. Aiken CS 143 Lecture 17 18

3
Mark and Sweep: Details Evaluation of Mark and Sweep

• The todo list is used as an auxiliary data • Space for a new object is allocated from the new list
structure to perform the reachability analysis – a block large enough is picked
– an area of the necessary size is allocated from it
– the left-over is put back in the free list
• There is a trick that allows the auxiliary data
to be stored in the objects themselves • Mark and sweep can fragment the memory
– pointer reversal: when a pointer is followed it is
reversed to point to its parent
• Advantage: objects are not moved during GC
– no need to update the pointers to objects
• Similarly, the free list is stored in the free – works for languages like C and C++
objects themselves
Prof. Aiken CS 143 Lecture 17 19 Prof. Aiken CS 143 Lecture 17 20

Another Technique: Stop and Copy Stop and Copy Garbage Collection

• Memory is organized into two areas • Starts when the old space is full
– old space: used for allocation
– new space: used as a reserve for GC • Copies all reachable objects from old space
heap pointer into new space
– garbage is left behind
– after the copy phase the new space uses less space
old space new space than the old one before the collection

• The heap pointer points to the next free word • After the copy the roles of the old and new
in the old space spaces are reversed and the program resumes
• allocation just advances the heap pointer
Prof. Aiken CS 143 Lecture 17 21 Prof. Aiken CS 143 Lecture 17 22

Example of Stop and Copy Garbage Collection Implementation of Stop and Copy

Before collection: • We need to find all the reachable objects, as for mark
and sweep
root A B C D E F new space
• As we find a reachable object we copy it into the new
heap pointer space
After collection: root – And we have to fix ALL pointers pointing to it!

new space A C F free • As we copy an object we store in the old copy a


forwarding pointer to the new copy
– when we later reach an object with a forwarding pointer we
know it was already copied

Prof. Aiken CS 143 Lecture 17 23 Prof. Aiken CS 143 Lecture 17 24

4
Implementation of Stop and Copy (Cont.) Stop and Copy. Example (1)

• We still have the issue of how to implement • Before garbage collection


the traversal without using extra space
• The following trick solves the problem:
– partition the new space in three contiguous regions
start scan alloc

copied and scanned copied empty


root A B C D E F new space
copied objects copied objects
whose pointer whose pointer
fields were followed fields were NOT
followed
Prof. Aiken CS 143 Lecture 17 25 Prof. Aiken CS 143 Lecture 17 26

Stop and Copy. Example (2) Stop and Copy. Example (3)

• Step 1: Copy the objects pointed to by roots • Step 2: Follow the pointer in the next
and set forwarding pointers unscanned object (A)
– copy the pointed-to objects (just C in this case)
alloc – fix the pointer in A alloc
– set forwarding pointer scan
scan

root A B C D E F A root A B C D E F A C

Prof. Aiken CS 143 Lecture 17 27 Prof. Aiken CS 143 Lecture 17 28

Stop and Copy. Example (4) Stop and Copy. Example (5)

• Follow the pointer in the next unscanned • Follow the pointer in the next unscanned
object (C) object (F)
– copy the pointed objects (F in this case) – the pointed object (A) was already copied. Set the
alloc pointer same as the forwading pointer alloc
scan scan

root A B C D E F A C F root A B C D E F A C F

Prof. Aiken CS 143 Lecture 17 29 Prof. Aiken CS 143 Lecture 17 30

5
Stop and Copy. Example (6) The Stop and Copy Algorithm

• Since scan caught up with alloc we are done while scan <> alloc do
let O be the object at scan pointer
• Swap the role of the spaces and resume the for each pointer p contained in O do
program find O’ that p points to
if O’ is without a forwarding pointer
alloc copy O’ to new space (update alloc pointer)
root scan set 1st word of old O’ to point to the new copy
change p to point to the new copy of O’
else
new space A C F set p in O equal to the forwarding pointer
fi
end for
increment scan pointer to the next object
od
Prof. Aiken CS 143 Lecture 17 31 Prof. Aiken CS 143 Lecture 17 32

Details of Stop and Copy Evauation of Stop and Copy

• As with mark and sweep, we must be able to • Stop and copy is generally believed to be the fastest
GC technique
tell how large an object is when we scan it
– and we must also know where the pointers are
• Allocation is very cheap
inside the object – just increment the heap pointer

• We must also copy any objects pointed to by • Collection is relatively cheap


the stack and update pointers in the stack – especially if there is a lot of garbage
– only touch reachable objects
– this can be an expensive operation
• But some languages do not allow copying
– C, C++

Prof. Aiken CS 143 Lecture 17 33 Prof. Aiken CS 143 Lecture 17 34

Why Doesn’t C Allow Copying? Conservative Garbage Collection

• Garbage collection relies on being able to find • But it is Ok to be conservative:


all reachable objects – if a memory word looks like a pointer it is considered a
– and it needs to find all pointers in an object pointer
• it must be aligned
• it must point to a valid address in the data segment
• In C or C++ it is impossible to identify the – all such pointers are followed and we overestimate the set of
contents of objects in memory reachable objects
– E.g., a sequence of two memory words might be
• A list cell (with data and next fields) • But we still cannot move objects because we cannot
• A binary tree node (with left and right fields) update pointers to them
– Thus we cannot tell where all the pointers are – what if what we thought to be a pointer is actually an account
number?

Prof. Aiken CS 143 Lecture 17 35 Prof. Aiken CS 143 Lecture 17 36

6
Reference Counting Implementation of Reference Counting

• Rather that wait for memory to be exhausted, • new returns an object with reference count 1
try to collect an object when there are no • Let rc(x) be the reference count of x
more pointers to it

• Store in each object the number of pointers • Assume x, y point to objects o, p


to that object
– this is the reference count • Every assignment x  y must be changed:
rc(p)  rc(p) + 1
• Each assignment operation manipulates the rc(o)  rc(o) - 1
reference count if(rc(o) == 0) then mark o as free
xy
Prof. Aiken CS 143 Lecture 17 37 Prof. Aiken CS 143 Lecture 17 38

Evaluation of Reference Counting Evaluation of Garbage Collection

• Advantages: • Automatic memory management prevents


– easy to implement serious storage bugs
– collects garbage incrementally without large pauses
in the execution
• But reduces programmer control
– e.g., layout of data in memory
• Disadvantages: – e.g., when is memory deallocated
– cannot collect circular structures
– manipulating reference counts at each assignment
is very slow
• Pauses problematic in real-time applications
• Memory leaks possible (even likely)
Prof. Aiken CS 143 Lecture 17 39 Prof. Aiken CS 143 Lecture 17 40

Evaluation of Garbage Collection

• Garbage collection is very important

• Researchers are working on advanced garbage


collection algorithms:
– concurrent: allow the program to run while the
collection is happening
– generational: do not scan long-lived objects at
every collection
– parallel: several collectors working in parallel

Prof. Aiken CS 143 Lecture 17 41

Você também pode gostar