CS 230
이준원
System Software
• components
– translator
• assembler
• compiler
• interpreter
– system manager
• operating system
– other utilities
• loader
• linker
• DBMS, editor, debugger, ...
• purpose of this course
– understand how to build system software
– understand how these components work
Issues in System Software
Assembler Overview
• functions
– translate programs written in assembly language to machine code
• mnemonic code to machine code
• symbols to addresses
– handles
• constants
• literals
• addressing
• 32 bit constant or address
• 32 bit offset
Assembler Overview (cont’d)
• pass 1: loop until the end of the program
1. read in a line of assembly code
2. assign an address to this line
• increment N (word addressing or byte addressing)
3. save address values assigned to labels
• in symbol tables
4. process assembler directives
• constant declaration
• space reservation
• pass 2: same loop
1. read in a line of code
2. translate op code using the op code table
3. change labels to addresses using the symbol table
4. process assembler directives
5. produce object program
Data Structures for Assembler
add $t0, $t1, $t2 => 000000 01001 01010 01000 00000 100000
• op code table
– looked up for the translation of mnemonic code
• key: mnemonic code
• result: bits
– hashing is usually used
• once prepared, the table is not changed
• efficient lookup is desired
• since mnemonic codes are predefined, the hashing function can be tuned a priori
– the table may also hold the instruction format and length
• format: to decide where to put op code bits, operand bits, offset bits
• length: for variable instruction sizes, used to calculate the address
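As a rough sketch of the op code table, the Python below uses a dict (a hash table) keyed by mnemonic and encodes the slide's `add $t0, $t1, $t2` example. The names `OPTAB`, `REGS`, `encode_r_type`, and `bit_fields` are hypothetical, and only a few R-format entries are included; a real table would cover every format.

```python
# Hypothetical miniature OPTAB: mnemonic -> (format, opcode, funct).
# A Python dict is a hash table, in line with "hashing is usually used".
OPTAB = {
    "add": ("R", 0x00, 0x20),
    "sub": ("R", 0x00, 0x22),
    "and": ("R", 0x00, 0x24),
}

# Register numbers for the names used here (MIPS convention: $t0-$t7 are 8-15).
REGS = {f"$t{i}": 8 + i for i in range(8)}


def encode_r_type(line: str) -> int:
    """Encode 'op rd, rs, rt' as a 32-bit R-format word."""
    mnemonic, operands = line.split(None, 1)
    fmt, opcode, funct = OPTAB[mnemonic]
    if fmt != "R":
        raise ValueError(f"{mnemonic} is not an R-format instruction")
    rd, rs, rt = (REGS[r.strip()] for r in operands.split(","))
    shamt = 0
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def bit_fields(word: int) -> str:
    """Show the word as the six R-format fields (6, 5, 5, 5, 5, 6 bits)."""
    bits = f"{word:032b}"
    out, pos = [], 0
    for width in (6, 5, 5, 5, 5, 6):
        out.append(bits[pos:pos + width])
        pos += width
    return " ".join(out)


word = encode_r_type("add $t0, $t1, $t2")
print(bit_fields(word))  # 000000 01001 01010 01000 00000 100000
```

The format field stored with each entry is what tells the encoder where the op code, operand, and offset bits go.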
Data Structures for Assembler (cont’d)
• symbol table
– stored and looked up to assign address to labels
• efficient insertion and retrieval is needed
• deletion does not occur
    .text
    .globl main
main:
    la $t0, array
    lw $t1, count
    lw $t2, ($t0)
loop:
    lw $t3, 4($t0)
    ble $t3, $t2, loop2
    move $t2, $t3
Symbol Table Construction
    .text
    .globl main
main:
    la $t0, array
    lw $t1, count
    lw $t2, ($t0)
loop:
    lw $t3, 4($t0)
    ble $t3, $t2, loop2
    move $t2, $t3
loop2: add $t1, $t1, -1
    add $t0, $t0, 4
    bnez $t1, loop
    …
    ….
    .data
array: .word 3, 5, 5, 1, 6, 7, …..
count: .word 15
string1: .asciiz "\nmax = "
bad: .word 7

symbol name    value
main           0
loop           12
loop2          24
…
array          408
count          468
string1        472
bad            480
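The data-section addresses can be checked with a small sketch of how directives advance the location counter. It assumes 4-byte words, word alignment for `.word`, and a terminating NUL for `.asciiz`. The array is taken to hold 15 words (an assumption drawn from `count`, since the slide elides the values), and the start address 408 comes from the table. `align` and `layout_data` are hypothetical helper names. Under these assumptions the 8-byte string places `bad` at 480.

```python
def align(addr: int, boundary: int) -> int:
    """Round addr up to the next multiple of boundary."""
    return (addr + boundary - 1) // boundary * boundary


def layout_data(start: int, directives):
    """Assign an address to each (label, directive, argument) entry.

    '.word' takes a word count, '.asciiz' takes the string itself.
    """
    symtab, locctr = {}, start
    for label, directive, arg in directives:
        if directive == ".word":
            locctr = align(locctr, 4)            # assumption: words are 4-byte aligned
            size = 4 * arg
        elif directive == ".asciiz":
            size = len(arg.encode("ascii")) + 1  # +1 for the terminating NUL
        else:
            raise ValueError(f"unsupported directive {directive}")
        symtab[label] = locctr
        locctr += size
    return symtab


# 15 words for the array is an assumption; the slide elides the actual values.
data = [
    ("array", ".word", 15),
    ("count", ".word", 1),
    ("string1", ".asciiz", "\nmax = "),
    ("bad", ".word", 1),
]
print(layout_data(408, data))
```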
Assembler Algorithm: pass1
begin
    if starting address is given
        LOCCTR = starting address;
    else
        LOCCTR = 0;
    read the first line from the code
    while OPCODE != END do            ;; or EOF
    begin
        if there is a label
            if this label is in SYMTAB, then error
            else insert (label, LOCCTR) into SYMTAB
        search OPTAB for the op code
        if found
            LOCCTR += N               ;; N is the length of this instruction (4 for MIPS)
        else if this is an assembly directive
            update LOCCTR as directed
        else error
        write line to intermediate file
        read the next line from the code
    end
    program size = LOCCTR - starting address;
end
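The pass 1 loop can be sketched in Python roughly as follows, run on the text section of the example program. This is a simplified illustration: every instruction (including pseudo-instructions such as `la`) is counted as one 4-byte word, directives starting with "." reserve no space, and `OPTAB`, `INSTR_LEN`, `SOURCE`, and `pass1` are hypothetical names.

```python
OPTAB = {"la", "lw", "ble", "move", "add", "bnez"}  # mnemonics only; pass 1 needs no encodings
INSTR_LEN = 4                                       # assumption: every instruction is 4 bytes

SOURCE = """
    .text
    .globl main
main:
    la $t0, array
    lw $t1, count
    lw $t2, ($t0)
loop:
    lw $t3, 4($t0)
    ble $t3, $t2, loop2
    move $t2, $t3
loop2: add $t1, $t1, -1
    add $t0, $t0, 4
    bnez $t1, loop
"""


def pass1(lines, start=0):
    """Build SYMTAB and an intermediate list of (address, statement) pairs."""
    symtab, locctr, intermediate = {}, start, []
    for raw in lines:
        line = raw.strip()
        if ":" in line:                      # a label precedes the colon
            label, line = (part.strip() for part in line.split(":", 1))
            if label in symtab:
                raise ValueError(f"duplicate label {label}")
            symtab[label] = locctr
        if not line:
            continue
        opcode = line.split()[0]
        if opcode in OPTAB:
            intermediate.append((locctr, line))
            locctr += INSTR_LEN
        elif opcode.startswith("."):
            pass                             # .text/.globl reserve no space in this sketch
        else:
            raise ValueError(f"unknown op code {opcode}")
    return symtab, intermediate, locctr - start


symtab, intermediate, size = pass1(SOURCE.splitlines())
print(symtab)  # {'main': 0, 'loop': 12, 'loop2': 24}
print(size)    # 36
```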
Assembler Algorithm: pass2
begin
    read a line;
    if op code = START then           ;; .globl xxx for MIPS
        write header record;
    while op code != END do           ;; or EOF
    begin
        search OPTAB for the op code;
        if found
            if the operand is a symbol then
                replace it with an address using SYMTAB;
            assemble the object code;
        else if it is a defined directive
            convert it to object code;
        add object code to the text;
        read next line;
    end
    write End record to the text;
    output text;
end

example: add $t0, $t1, $t2 => 000000 01001 01010 01000 00000 100000
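The symbol-replacement step of pass 2 can be illustrated with a minimal sketch. It handles only the MIPS J-format instructions `j` and `jal`, assumes the text segment starts at address 0, and omits header and End records. The `j`/`jal` statements and their addresses are hypothetical additions to the example program, and `pass2` is an illustrative name.

```python
OPTAB = {"j": 0x02, "jal": 0x03}   # J-format op codes


def pass2(intermediate, symtab):
    """Turn (address, 'op operand') pairs into (address, 32-bit word) pairs."""
    text = []
    for addr, line in intermediate:
        mnemonic, operand = line.split()
        if mnemonic not in OPTAB:
            raise ValueError(f"unknown op code {mnemonic}")
        if operand in symtab:              # operand is a symbol: look up its address
            target = symtab[operand]
        else:                              # otherwise expect a numeric address
            target = int(operand, 0)
        # J-format: 6-bit op code, then the word address in the low 26 bits.
        word = (OPTAB[mnemonic] << 26) | ((target >> 2) & 0x03FFFFFF)
        text.append((addr, word))
    return text


symtab = {"main": 0, "loop": 12, "loop2": 24}
intermediate = [(36, "j loop"), (40, "jal main"), (44, "j 0x18")]
for addr, word in pass2(intermediate, symtab):
    print(f"{addr:4d}: {word:08x}")
```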
Program Relocation
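(figure: a program of length 1076 containing "jump to 1004". Loaded at address 0 it occupies 0 to 1076 and the jump reaches location 1004 inside the program. Loaded at address 5000 it occupies 5000 to 6076, but the unmodified "jump to 1004" still refers to 1004.)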
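One common way to handle this is for the assembler to record which address fields are absolute, so that the loader can add the load address to them. The sketch below illustrates that idea. `ObjectProgram`, `load`, and the offsets 16 and 20 are hypothetical, and the slide itself does not prescribe this record layout.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectProgram:
    length: int
    operands: dict                                   # offset -> address or constant stored there
    modifications: set = field(default_factory=set)  # offsets that hold absolute addresses


def load(program: ObjectProgram, load_address: int) -> dict:
    """Place the program at load_address, rebasing only the recorded address fields."""
    memory = {}
    for offset, operand in program.operands.items():
        if offset in program.modifications:
            operand += load_address      # e.g. "jump to 1004" becomes "jump to 6004"
        memory[load_address + offset] = operand
    return memory


# Hypothetical layout: the jump sits at offset 16, a plain constant 7 at offset 20.
prog = ObjectProgram(length=1076, operands={16: 1004, 20: 7}, modifications={16})
print(load(prog, 0))      # {16: 1004, 20: 7}
print(load(prog, 5000))   # {5016: 6004, 5020: 7}
```

The constant at offset 20 is left untouched because it is not listed in the modification set.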
Program Relocation (cont’d)
Literals
• usage
– encoded as an operand (similar to the immediate in MIPS, but different)
• load $7, =X’0A7F’
– simple way to declare a constant
– the assembler
• declares a constant with a label
• uses that label wherever the value is needed
• comparison with immediate
– a literal is handled by the assembler, like an assembler directive
• an immediate is data the machine itself recognizes
– a full word can be used for a literal
• an immediate gets only the bits of the word left after the opcode and register fields
– a literal's value is fetched from data memory, which is slow
• immediate data is within the instruction itself
Literals (cont’d)
• literal pool
– assembler collects all the literals into one or more literal pools
– default location is at the end of the program
• for better code reading
– programmer can declare a place (LTORG)
• to use PC-relative addressing
• to keep data close to instruction
• optimization
– make one literal for the same value
• compare character string or value?
– x’454F46’ = c’EOF’
• value comparison needs evaluation
• literal table
– name(label), operand value, operand length, address in the table
– both name and value are used as keys
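To see why value comparison needs evaluation, the sketch below converts both literal notations to raw bytes before comparing them. `literal_value` is a hypothetical helper, and only the `X'..'` and `C'..'` forms are handled, with ASCII assumed for character literals.

```python
def literal_value(literal: str) -> bytes:
    """Evaluate =X'..' (hex digits) or =C'..' (characters) to raw bytes."""
    body = literal.lstrip("=")
    kind, quoted = body[0].upper(), body[1:]
    if len(quoted) < 2 or quoted[0] != "'" or quoted[-1] != "'":
        raise ValueError(f"malformed literal {literal}")
    text = quoted[1:-1]
    if kind == "X":
        return bytes.fromhex(text)
    if kind == "C":
        return text.encode("ascii")      # assumption: ASCII character set
    raise ValueError(f"unsupported literal type {kind}")


# The strings differ, but the evaluated values are the same three bytes.
print("=X'454F46'" == "=C'EOF'")                                # False
print(literal_value("=X'454F46'") == literal_value("=C'EOF'"))  # True
```

Comparing the source strings alone would create two pool entries for what is really one constant.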
Literal Handling Algorithm
pass 1
    on recognizing a literal
        search LITTAB by name
        if found but with a different value, error
        else if found with the same value, no action
        else (not found) insert a new literal (no address yet)
    if the code is LTORG or END
        allocate each literal, assigning it an address
pass 2
    replace each literal with its address in the LITTAB
    if these addresses are absolute,
        prepare modification for relocation
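The literal table steps can be sketched as a small class. `LiteralTable`, its method names, and the LOCCTR value 0x100 are hypothetical. The sketch keeps only the pass 1 bookkeeping and the pass 2 lookup, and leaves out relocation records.

```python
class LiteralTable:
    def __init__(self):
        self.entries = {}   # name -> {"value": bytes, "address": int or None}

    def note(self, name: str, value: bytes) -> None:
        """Pass 1: record a literal when it is recognized in an operand."""
        entry = self.entries.get(name)
        if entry is None:
            self.entries[name] = {"value": value, "address": None}
        elif entry["value"] != value:
            raise ValueError(f"literal {name} found with a different value")
        # same name and same value: no action

    def allocate(self, locctr: int) -> int:
        """Pass 1, at LTORG or END: place unallocated literals, return the new LOCCTR."""
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += len(entry["value"])
        return locctr

    def address_of(self, name: str) -> int:
        """Pass 2: the address that replaces the literal operand."""
        return self.entries[name]["address"]


littab = LiteralTable()
littab.note("=C'EOF'", b"EOF")
littab.note("=X'05'", b"\x05")
littab.note("=C'EOF'", b"EOF")        # repeated use: no new entry
end = littab.allocate(0x100)          # hypothetical: LTORG reached at LOCCTR 0x100
print(hex(littab.address_of("=C'EOF'")), hex(littab.address_of("=X'05'")), hex(end))
```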
Symbol Defining Statement
Expressions
Expression Rules
• basic
– constant is absolute
– address is relative
• using expressions
– expression with only absolute terms is absolute
– multiplication and division may be applied only to absolute terms; the result is absolute
– relative ± constant is relative
– relative_1 - relative_2 is absolute
• dependencies on starting address are canceled out
– all other expressions having relative terms are neither relative nor absolute (flagged as an error)
• constant - relative
• relative_1 + relative_2
• 3 x relative_1
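One way to apply these rules mechanically is to track, for each value, the net number of relative terms it contains: 0 means absolute, 1 means relative, and anything else is an error. The sketch below assumes that counting scheme, and additionally treats relative plus or minus a constant as relative. `Term` is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    value: int
    rel: int = 0    # net count of relative terms: 0 = absolute, 1 = relative

    def __add__(self, other):
        return Term(self.value + other.value, self.rel + other.rel)

    def __sub__(self, other):
        return Term(self.value - other.value, self.rel - other.rel)

    def __mul__(self, other):
        if self.rel or other.rel:
            raise ValueError("relative term in a multiplication")
        return Term(self.value * other.value)

    def kind(self) -> str:
        if self.rel == 0:
            return "absolute"
        if self.rel == 1:
            return "relative"
        raise ValueError("neither relative nor absolute")


loop, loop2 = Term(12, rel=1), Term(24, rel=1)   # label addresses are relative
four = Term(4)                                   # constants are absolute

print((loop2 - loop).kind())   # absolute: the starting address cancels out
print((loop + four).kind())    # relative
```

Calling `kind()` on `four - loop` or `loop + loop2`, or evaluating `Term(3) * loop`, raises `ValueError`, matching the three error cases listed on the slide.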
Program Blocks
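(figure: in the source program, pieces of block 0, block 1, and block 2 are interleaved; in the assembled program each block's pieces are gathered together, in the order block 0, block 1, block 2)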
Program Blocks (cont’d)
• motivation
– programmer’s view may be different from machine’s view
• affects only efficiency not functionality
– addressing can be simplified
• large data area can be moved to the end of code while source code places it close to the instructions that use this data
• data structure and algorithm
– block table (name, block number, address, length)
– pass 1
• maintain separate LOCCTR for each block
– each label is assigned address relative to the start of the block that contains it
• SYMTAB stores block number for each symbol
• store starting address of each block in block table
– pass 2
• assign address to each symbol by adding the relative address to the block starting address
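The two steps can be sketched as follows: the block table is filled from the block lengths known at the end of pass 1, and each symbol's final address is its block's starting address plus its block-relative address. The block names, lengths, and symbols are hypothetical, as are the function names.

```python
def assign_block_addresses(block_lengths: dict, start: int = 0) -> dict:
    """Block table: name -> (block number, starting address, length), laid out in order."""
    table, addr = {}, start
    for number, (name, length) in enumerate(block_lengths.items()):
        table[name] = (number, addr, length)
        addr += length
    return table


def resolve(symtab: dict, block_table: dict) -> dict:
    """Final address = block starting address + address relative to the block."""
    return {sym: block_table[block][1] + rel for sym, (block, rel) in symtab.items()}


# Hypothetical lengths, taken from each block's own LOCCTR at the end of pass 1.
block_table = assign_block_addresses({"code": 100, "data": 20, "buffers": 4096})

# SYMTAB from pass 1: symbol -> (block name, address relative to that block).
symtab = {"main": ("code", 0), "count": ("data", 4), "buf": ("buffers", 0)}

print(block_table)   # {'code': (0, 0, 100), 'data': (1, 100, 20), 'buffers': (2, 120, 4096)}
print(resolve(symtab, block_table))   # {'main': 0, 'count': 104, 'buf': 120}
```

The large `buffers` block ends up after the code and small data, whatever its position in the source.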
Control Sections
Control Sections (cont’d)
One-Pass Assembler
• problem
– forward reference: reference to symbols that are not defined yet
• why do we need one-pass assembler?
– fast
• useful for program development and testing
• university computing environment
• load-and-go assembler
– writes the object code to memory, not to a disk file
– since it is in memory, it is easy to modify a part of the object code
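Because the object code sits in memory, a forward reference can be left as a placeholder and patched once the label is defined. The sketch below shows that backpatching idea for a toy language with only `label:` lines and `j target` instructions. It is one common approach, not necessarily the one the course uses, and `one_pass` is a hypothetical name.

```python
def one_pass(lines):
    """Assemble 'label:' and 'j target' lines in one pass, patching forward references."""
    memory = []     # one entry per 4-byte instruction: the jump target address
    symtab = {}     # label -> address
    pending = {}    # undefined label -> indices in memory waiting for its address
    for raw in lines:
        line = raw.strip()
        if line.endswith(":"):
            label = line[:-1]
            symtab[label] = 4 * len(memory)
            for index in pending.pop(label, []):
                memory[index] = symtab[label]    # patch earlier forward references
        else:
            op, target = line.split()
            if op != "j":
                raise ValueError(f"unknown op code {op}")
            if target in symtab:
                memory.append(symtab[target])
            else:
                pending.setdefault(target, []).append(len(memory))
                memory.append(None)              # placeholder until the label appears
    if pending:
        raise ValueError(f"undefined symbols: {sorted(pending)}")
    return memory, symtab


memory, symtab = one_pass(["start:", "j done", "j start", "done:", "j start"])
print(memory)   # [8, 0, 0]
print(symtab)   # {'start': 0, 'done': 8}
```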
One-Pass Assembler (cont’d)
Multi-Pass Assembler
• supports forward references, even though they are bad for program readability