Você está na página 1de 41

Intermediate Code Generation

dr. Hristo Nikolov


LIACS
Leiden University
Leiden, Fall 2011
Why Intermediate Code?
Front-end: generates intermediate representation
Back-end: generates target code

Why Intermediate code?
Retargeting is facilitated
A machine-independent code optimizer can be applied
Intermediate
Code
Parser
Static
Checker
Intermediate
Code
Generator
Code
Generator
Intermediate Languages
Syntax trees, Postfix notation,
Three-address code (x := y op z)
Example: a := b * -c + b * -c

assign
+
*
a
b
uniminus
c
*
b
uniminus
c
Syntax
tree:
assign
+
*
a
b
uniminus
c
DAG:
Postfix notation: a b c uniminus * b c uniminus * + assign
Three-Address Code
Sequence of statements:
x := y op z
x + y * z => t1 := y * z
t2 := x + t1

Example: a := b * -c + b * -c

t1 := - c
t2 := b *t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5
Linearized representation
of a syntax tree
t1 := - c
t2 := b *t1
t5 := t2 + t2
a := t5
Linearized representation
of a syntax DAG
Types of Three-Address Statements
Assignment statements: x := y op z and x := op y
Indexed assignments: x := y[ i ] and x[ i ] := y
Pointer assignments: x := & y, x := * y, * x := y
Copy statements: x := y
Unconditional jumps: goto L
Conditional jumps: if x relop y goto L
Function calls: param x, call p, n, and return y

p(x1,..., xn) => param x1
. . .
param xn
call p, n
Syntax-directed Translation into
Three-Address Code
Temporary names are created
E E1 + E2 => t := E1 + E2,
e.g. t5 := t2 + t4
a := t5
When expression is an identifier no new temporary
Nonterminal E has two attributes:
E.place the name which holds the value of E
E.code three-address code sequence
Syntax-directed Translation
Production Semantic Rule
S id := E S.code := E.code || gen( id.place := E.place )
E E 1 + E2 E.place := newtemp();
E.code := E1.code || E2.code || gen( E.place := E1.place + E2.place )
E E1 * E2 E.place := newtemp();
E.code := E1.code || E2.code || gen( E.place := E1.place * E2 .place )
E - E1 E.place := newtemp();
E.code := E1.code || gen( E.place := uminus E1.place )
E ( E1 ) E.place := E1.place
E.code := E.code
E id E.place := id.place
E .code :=
Function newtemp() returns distinct names (t1, t2)
gen(x := y op z) => x := y op z
To produce three-address code for assignments
Flow-of-control Statements
Production Semantic Rule
S while E do S1 S.begin := newlabel()
S.after := newlabel()
S.code := gen( S.begin : ) ||
E.code ||
gen ( if E.place = 0 goto S.after ) ||
S1.code ||
gen ( goto S.begin) ||
gen( S.after : )
While statements
E.code
...
if E.place = 0 goto S.after
S1.code
...
goto S.begin
...
S.begin :
S.after :
Flow-of-control Statements
Example
i := 2 * n + k
while

i := 2 * n + k
while i do
i := i - k
Source program
fragment
t1 := 2



L1: if i = 0 goto L2



L2:
t1 := 2
t2 := t1 * n
t3 := t2 + k
i := t3
L1: if i = 0 goto L2
t4 := i - k
i := t4
goto L1
L2:
Three-address
code sequence
Implementation (1)
Quadruple is record: op, arg 1, arg 2, result

Quadruples, triples, and idirect triples
op arg 1 arg 2 result
(0) uniminus c t1
(1) * b t1 t2
(2) uniminus c t3
(3) * b t3 t4
(4) + t2 t4 t5
(5) := t5 a
Pro: easy to rearrange code for global optimization
Cons: lots of temporaries
t1 := - c
t2 := b *t1
t3 :=
t4 := b * t3
t5 := t2 + t4
a := t5
t1 := - c
t2 := b *t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5
a := b * -c + b * -c
Three-address code Quads (quadruples)
Implementation (2)
Triple is record: op, arg 1, arg 2

Quadruples, triples, and idirect triples
op arg 1 arg 2
(0) uniminus c
(1) * b (0)
(2) uniminus c
(3) * b (2)
(4) + (1) (3)
(5) := a (4)
Pro: temporaries are implicit
Cons: difficult to rearrange code
t1 := - c
t2 := b *t1
t3 :=
t4 := b * t3
t5 := t2 + t4
a := t5
t1 := - c
t2 := b *t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5
a := b * -c + b * -c
Three-address code Triples
Implementation (3)
Implicit Triples: pointers to triples
Quadruples, triples, and idirect triples
op arg 1 arg 2
(14) uniminus c
(15) * b (14)
(16) uniminus c
(17) * b (16)
(18) + (15) (17)
(19) := a (18)
Pro: temporaries are implicit and it is
easier to rearrange code
t1 := - c
t2 := b *t1
t3 :=
t4 := b * t3
t5 := t2 + t4
a := t5
t1 := - c
t2 := b *t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5
a := b * -c + b * -c
Three-address code Implicit Triples Triple Container
statement
(0) (14)
(1) (15)
(2) (16)
(3) (17)
(4) (18)
(5) (19)
Names and Scope
The three-address code is simplistic
it assumes that the names of variables can be easily
resolved by the back end in global or local variables

We need local symbol tables to
record global and local declarations in procedures,
blocks, and structs to resolve names
Declarations in a Procedure
P M D
M { offset := 0 }
D D ; D
D id : T { enter( id.name, T.type, offset);
offset := offset + T.width }
T integer { T.type := integer;
T.width := 4 }
T real { T.type := real;
T.width := 8 }
T array [ num ] of T1 { T.type := array( num.val, T1.type );
T.width := num.val x T1.width }
T ^ T1 { T.type := pointer( T1.type );
T.width := 4 }
Keeping Track of Scope Information
Nested procedures
P D
D D ; D | id : T | proc id ; D ; S

D proc id ; D1 ; S: then a new symbol table is created
The new table points to the enclosing procedures table

Procedure enter() knows in which table to make an entry
Symbol Table Operations
mktable( previous ): returns a pointer to a new table
which is linked to a previous table in the outer scope

enter( table, name, type, offset ): creates a new entry
in table

addwidth( table, width ): accumulates the total width
of all entries in table

enterproc( table, name, newtable ): creates a new
entry in table for procedure with local scope newtable
Processing Declarations
Nested Procedures
P M D { addwidth( top( tblptr ), top( offset ) );
pop( tblptr ); pop( offset ) }
M { t := mktable( nil );
push( t, tblptr ); push( 0, offset ) }
D D1 ; D2
D proc id ; N D1 ; S

{ t := top( tblptr );
addwidth( t, top( offset ) );
pop( tblptr ); pop( offset );
enterproc( top( tblptr ), id.name, t ) }
D id : T { enter( top ( tblptr ), id.name, T.type, top( offset ) );
top( offset ) := top( offset ) + T.width }
N { t := mktable( top( tblptr ) );
push( t, tblptr ); push( 0, offset ) }
Assignment Statements
S id := E { p := lookup( id.name );
if p <> nil then
emit( p := E.place )
else error }
E E1 + E2 { E.place := newtemp;
emit( E.place := E1.place + E2.place ) }
E E1 * E2 { E.place := newtemp;
emit( E.place := E1.place * E2.place ) }
E - E1

{ E.place := newtemp;
emit( E.place := uniminus E1.place )}
E ( E1 ) { E.place := E1.place }
E id { p := lookup( id.name );
if p <> nil then
E.place := p
else error }
A Translation Scheme
Reusing Temporary Names
Each time newtemp() generates a new temporary name
Can we reuse temporaries?
E E1 + E2 results in:




Modify newtemp() to use a stack :
Keep a counter c, initialized to 0
newtemp() increments c and returns temporary $c
Decrement counter on each use of a $i in a three-address
statement
evaluate E1 into t1
evaluate E2 into t2
t3 := t1 + t2
evaluate E1 into t1
evaluate E2 into t2
t3 := t1 + t2
If t1 is no longer used, can reuse t1
instead of using new temp t3
Reusing Temporary Names
Statement Value of c
0
$0 := a * b 1
$1 := c * d 2
$0 := $0 + $1 1
$1 := e * f 2
$0 := $0 - $1 1
X := $0 0
Example: x := a * b + c * d e * f
Addressing Array Elements
A : array [low .. high] of type;

i
th
element of A begins in location
base + (i low) x W
base = A[low], W = sizeof( type )

Compile time evaluation if rewritten as
i x W + ( base low x W )

c = base low x W is stored in symbol-table entry for A
A[i] = i x W + c


One-Dimensional Arrays
t1 := c
t2 := i * W
t3 := t1[t2]
:= t3
t1 := c
t2 := i * W
t3 := t1[t2]
:= t3
Addressing Array Elements
In Row-major form A[i,j] is equivalent to A[i][j]



Two-Dimensional Arrays... (1)
A[1,1]
A[1,2]
A[1,3]
A[2,1]
A[2,2]
A[2,3]
A[1,1]
A[2,1]
A[1,2]
A[2,2]
A[1,3]
A[2,3]
Row-major Column-major
First
Column
First
Row
A : array [1..2, 1..3] of integer;
Addressing Array Elements
A[i,j]= base+((ilow
i
)x N
j
+j-low
j
)x W
N
j
= high
j
-low
j
+1

((i x Nj) + j) x W + c, where
c = base - ((low
i
x N
j
) + low
j
) x W


Two-Dimensional Arrays... (2)
A : array [1..2, 1..3] of integer; // Row-major form
t1 := i * 3
t1 := t1 + j
t2 := c
t3 := t1 * 4
t4 := t2[t3]
:= t4
t1 := i * 3
t1 := t1 + j
t2 := c
t3 := t1 * 4
t4 := t2[t3]
:= t4
Addressing Array Elements
Grammar
Synthesized Attributes:

E.place name of temp holding value of E
Elist.array array name
Elist.place name of temp holding index value
Elist.ndim number of array dimensions
L.place lvalue (=name of temp)
L.offset index into array (=name of temp)
null indicates non-array simple id
S L := E
E


L

Elist

S L := E
E E + E
| ( E )
| L
L Elist ]
| id
Elist Elist , E
| id [ E
Addressing Array Elements
A Translation Scheme ... (1)
S L := E { if L.offset = null then /* L is a simple id */
emit( L.place := E.place )
else
emit( L.place [ L.offset ] := E.place ) }
E E1 + E2 { E.place := newtemp;
emit( E.place := E1.place + E2.place ) }
E ( E1 ) { E.place := E1.place }
E L

{ if L.offset = null then /* L is a simple id */
E.place := L.place
else begin
E.place := newtemp;
emit( E.place :=L.place [ L.offset ] )
end }
Addressing Array Elements
A Translation Scheme... (2)
L Elist ] { L.place := newtemp; L.offset := newtemp
emit( L.place := c( Elist.array ) )
emit( L.offset := Elist.place * width( Elist.array )) }
L id { L.place := id.place;
L.offset := null }
Elist Elist 1, E { t := newtemp; m := Elist 1.ndim +1;
emit( t := Elist 1.place * limit( Elist 1.array, m ));
emit( t := t + E.place );
Elist.array := Elist 1.array;
Elist.place := t;
Elist.ndim := m }
Elist id [ E

{ Elist.array := id.place;
Elist.place := E.place;
Elist.ndim := 1 }
Translating Boolean Expressions
Composed by boolean operators and, or, not

Applied to boolean variables or relational expressions

Relational expression: E1 relop E2, where
E1 and E2 are arithmetic expressions

The grammar:


Attribute op: <, , =, , >, represented by relop

E E or E | E and E | not E | ( E ) | id relop id | true | false E E or E | E and E | not E | ( E ) | id relop id | true | false
Translating Boolean Expressions
True = 1, false = 0

Example:
a or b and not c

a < b: if a<b then 1 else 0

Numerical representation
t1 := not c
t2 := b and t1
t3 := a or t2
t1 := not c
t2 := b and t1
t3 := a or t2
100: if a < b goto 103
101: t := 0
102: goto 104
103: t := 1
104:
100: if a < b goto 103
101: t := 0
102: goto 104
103: t := 1
104:
Translating Boolean Expressions
A Translation Scheme
E E1 or E2 { E.place := newtemp;
emit( E.place := E1.place or E2.place ) }
E E1 and E2 { E.place := newtemp;
emit( E.place := E1.place and E2.place ) }
E not E1 { E.place := newtemp;
emit( E.place := not E1.place ) }
E ( E1 ) { E.place := E1.place }
E id1 relop id2 { E.place := newtemp;
emit( if id1.place relop.op id2.place goto nextstat + 3 );
emit( E.place := 0 );
emit( goto nextstat + 2 );
emit( E.place := 1 ) }
E true { E.place := newtemp;
emit( E.place := 1 ) }
E false { E.place := newtemp;
emit( E.place := 0 ) }
Backpatching
Code generation problem:
Labels (addresses) that control must go to may not be
known at the time the jump statements are generated

A solution - backpatching:
Generate branching statements with empty jumps
Put the statements in a goto list
Fill the labels of these statements when the proper label
is determined
Backpatching
Quadruples in a quadruple array
Labels are indexes into the array

Manipulate the list of lables:
makelist( i ) creates a new list containing three-address
location i, returns a pointer to the list

merge( p1 , p2 ) concatenates lists pointed to by p1 and
p2, returns a pointer to the concatenates list

backpatch( p, i ) inserts i as the target label for each of
the statements in the list pointed to by p
Backpatching
Grammar for boolean expressions
Synthesized Attributes:

E.code three-address code
E.truelist backpatch list for jumps on true
E.falselist backpatch list for jumps on false
M.quad location of current three-address
quad
E E1 or M E2
M E2


id1 relop id2


M
E E1 or M E2
| E1 and M E2
| not E1
| ( E1 )
| id1 relop id2
| true
| false
M
Backpatching
A Translation Scheme... (1)
E E1 or M E2 { backpatch( E1.falselist, M.quad );
E.truelist := merge( E1.truelist, E2.truelist );
E.falselist := E2.falselist }

E E1 and M E2 { backpatch( E1.truelist, M.quad );
E.truelist := E2.truelist;
E.falselist := merge( E1.falselist, E2.falselist ) }

E not E1 { E.truelist := E1.falselist;
E.falselist := E1.truelist }

E ( E1 ) { E.truelist := E1.truelist;
E.falselist := E1.falselist }
Backpatching
A Translation Scheme... (2)
E id1 relop id2 { E.truelist := makelist( nextquad );
E.falselist := makelist( nextquad + 1 );
emit( if id1.place relop.op id2.place goto_ );
emit( goto_ ) }
E true { E.truelist := makelist( nextquad );
emit( goto_ ) }
E false { E.falselist := makelist( nextquad );
emit( goto_ ) }
M { M.quad := nextquad }
Backpatching
Example: a<b or c<d and e<f
E id1 relop id2
E E1 or M E2
(M.quad := 102)

E E1 and M E2
(M.quad := 104)

backpatch( {102}, 104 )
backpatch( {101}, 102 )


100: if a < b goto_
101: goto_
102: if c < d
103: goto_
104: if e < f
105: goto_
100: if a < b goto_
101: goto_
102: if c < d goto_
103: goto_
104: if e < f goto_
105: goto_
100: if a < b goto TRUE
101: goto 102
102: if c < d
103: goto FALSE
104: if e < f
105: goto FALSE
100: if a < b goto TRUE
101: goto 102
102: if c < d goto 104
103: goto FALSE
104: if e < f goto TRUE
105: goto FALSE
backpatch( {102}, 104 )
backpatch( {101}, 102 )
Flow-of-control Statements
Grammar
Synthesized Attributes:

E.code three-address code
E.truelist backpatch list for jumps on true
E.falselist backpatch list for jumps on false
M.quad location of current three-address
quad
S.nextlist backpatch list for jumps to the
next statement after S (or nil)
L.nextlist backpatch list for jumps to the
next statement after L (or nil)

S if E then S




L

S if E then S
| if E then S else S
| while E do S
| begin L end
| A
L L ; S
| S
Flow-of-Control Statements
A Translation Scheme using backpatching... (1)
S if E then M1 S1 N
else M2 S2
{ backpatch( E.truelist, M1.quad );
backpatch( E.falselist, M2.quad );
S.nextlist := merge(S1.nextlist,
merge(N.nextlist,S2.nextlist) }
N { N.nextlist := makelist( nextquad );
emit( goto_) }
M { M.quad := nextquad }
S if E then M S1 { backpatch( E.truelist, M.quad );
S.nextlist := merge( E.falselist, S1.nextlist) }
S wihle M1 E do
M2 S1
{ backpatch( S1.nextlist, M1.quad );
backpatch( E.truelist, M2.quad );
S.nextlist := E.falselist;
emit( goto_ M1.quad ) }
Flow-of-Control Statements
A Translation Scheme using backpatching... (2)
S begin L end { S.nextlist := L.nextlist }
S A { S.nextlist := nil }
L L1 ; M S { backpatch( L1.nextlist, M.quad );
L.nextlist := S.nextlist }
L S { L.nextlist := S.nextlist }
Intermediate Code Generation

Chapters for reading:
8.1 to 8.6

Você também pode gostar