
CS133 GROUP A

Compiler Construction
HIGH-LEVEL LANGUAGE - uses words and notation close to human language
- focuses on logic rather than complex computer architecture

MACHINE LANGUAGE - the only language a computer understands
HIGH-LEVEL LANGUAGES → MACHINE LANGUAGES

WE NEED A COMPILER!
• Compiler phases
– Lexical analysis
– Syntax analysis
– Semantic analysis
– Intermediate (machine-independent) code generation
– Intermediate code optimization
– Target (machine-dependent) code generation
– Target code optimization
Structure of a compiler (diagram):
Source program
→ FRONT END (ANALYSIS): Lexical Analyzer → Syntax Analyzer → Semantic Analyzer
→ BACK END (SYNTHESIS): Intermediate Code Generator → Code Optimizer → Target Code Generator and Optimizer
→ Target Code Program
The Symbol Table Manager and the Error Handler work alongside these phases.


Compilation Phases and Passes
• Compilation of a program proceeds through a fixed series of phases
– Each phase uses an (intermediate) form of the program produced by an
earlier phase
– Subsequent phases operate on lower-level code representations
• Each phase may consist of a number of passes over the program
representation
– Pascal, FORTRAN, and C were designed for one-pass compilation,
which explains the need for function prototypes
– Single-pass compilers need less memory to operate
– Java and Ada compilers are multi-pass
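To see why one-pass compilation leads to function prototypes, here is a minimal Python sketch. The program representation (a list of ("proto" | "def" | "call", name) pairs) and both function names are hypothetical, invented only for illustration.

```python
def check_one_pass(items):
    """Single pass: a call is valid only if the function was declared earlier."""
    declared = set()
    errors = []
    for kind, name in items:
        if kind in ("proto", "def"):
            declared.add(name)
        elif kind == "call" and name not in declared:
            errors.append(f"call to undeclared function '{name}'")
    return errors


def check_two_pass(items):
    """Two passes: collect every declaration first, then check the calls."""
    declared = {name for kind, name in items if kind in ("proto", "def")}
    return [f"call to undeclared function '{name}'"
            for kind, name in items
            if kind == "call" and name not in declared]


# main calls helper before helper is defined.
program = [("def", "main"), ("call", "helper"), ("def", "helper")]
# The same program with a prototype for helper placed first.
with_prototype = [("proto", "helper")] + program
```

In this sketch the one-pass checker rejects `program` but accepts `with_prototype`, while the two-pass checker accepts both.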
Symbol Table Manager
• During the analysis, the compiler manages
a SYMBOL TABLE by
– recording the identifiers of the source program
– collecting information (called ATTRIBUTES)
about them: storage allocation, type, scope,
and (for functions) signature.
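A symbol table can be sketched as a stack of dictionaries, one per scope. The class and method names below are assumptions for illustration, not a standard API, and the attributes stored (type, offset, signature) are only examples.

```python
class SymbolTable:
    """A stack of scopes; each scope maps an identifier to its attributes."""

    def __init__(self):
        self.scopes = [{}]  # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, **attributes):
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"'{name}' already declared in this scope")
        scope[name] = attributes

    def lookup(self, name):
        # Search from the innermost scope outward.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.declare("main", type="function", signature="() -> int")
table.enter_scope()
table.declare("i", type="int", offset=0)
```

Inside the inner scope, `table.lookup("i")` finds the local entry; after `table.exit_scope()` the same lookup returns `None`.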
Lexical Analysis
• Lexical analysis breaks up the source program into tokens.
• It groups characters into non-separable units, called tokens.
• The result is a series of characters converted into a stream of tokens.

Source program:
    int main() {
        for (i = 0; i < 5; i++) {
            printf("Hello World");
        }
    }
Token stream:
    int  main  (  )  {
    for  (  i  =  0  ;  i  <  5  ;  i  ++  )  {
    printf  (  "Hello World"  )  ;
    }
    }
• Tokens are typically recognized by feeding the input characters to a finite automaton, which either accepts or rejects them.
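The tokenizing step can be sketched in Python with regular expressions, where the `re` module stands in for a hand-built finite automaton. The token classes, the keyword set, and the function name `tokenize` are assumptions covering only a tiny C-like subset.

```python
import re

# Token classes for a tiny C-like subset (an illustrative assumption).
TOKEN_SPEC = [
    ("STRING", r'"[^"\n]*"'),
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"\+\+|[-+*/<>=;(){}]"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))
KEYWORDS = {"int", "for"}


def tokenize(source):
    """Turn a string of characters into a list of (kind, text) tokens."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not a token
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {text!r}")
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens
```

Note that `++` is matched before the single `+`, so `i++` yields two tokens, `i` and `++`.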
Syntax Analysis
• Checks whether the token stream meets the grammatical specification of
the language and generates the syntax tree.
• If the program is grammatically correct, this phase generates an internal
representation that is easy to manipulate in later phases, typically a syntax
tree (also called a parse tree).
• The grammar of a programming language is typically described by a
context-free grammar (CFG), which also defines the structure of the parse tree.
• Grammars are commonly written in notations such as Backus-Naur Form (BNF).
CFG for arithmetic expressions:
<expression> --> number
<expression> --> ( <expression> )
<expression> --> <expression> + <expression>
<expression> --> <expression> - <expression>
<expression> --> <expression> * <expression>
<expression> --> <expression> / <expression>
Parsing 1-1+1*1 gives, for example, this tree (the grammar is ambiguous, so other trees are possible):
          -
         / \
    number   +
            / \
       number   *
               / \
          number  number
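A parser for these expressions can be sketched as a recursive-descent parser in Python. Because the grammar above is left-recursive and ambiguous, this sketch assumes a rewritten grammar with the usual precedence (expr → term {(+|-) term}, term → factor {(*|/) factor}, factor → number | ( expr )), so it produces the conventional grouping (1-1)+(1*1). The function name `parse` and the nested-tuple tree format are illustrative choices.

```python
import re


def parse(text):
    """Parse an arithmetic expression into a nested-tuple syntax tree."""
    tokens = re.findall(r"\d+|[-+*/()]", text)
    if "".join(tokens) != "".join(text.split()):
        raise SyntaxError("unexpected character in input")
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def factor():
        token = peek()
        if token is not None and token.isdigit():
            return int(advance())
        if token == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        node = factor()
        while peek() in ("*", "/"):
            node = (advance(), node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            node = (advance(), node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

For example, `parse("1-1+1*1")` returns `('+', ('-', 1, 1), ('*', 1, 1))`, and a malformed input such as `1+*1` raises `SyntaxError`.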
Semantic Analysis
• Semantic analysis is applied by a compiler to discover
the meaning of a program by analyzing its parse tree or
abstract syntax tree.
• A program without grammatical errors is not necessarily a
correct program.
• Static semantic checks: performed at
compile time.
• Dynamic semantic checks: performed at
run time, and the compiler produces code
that performs these checks.
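The difference between the two kinds of check can be sketched with array indexing. In this hypothetical Python example, `compile_index` plays the role of the compiler: a constant index is checked immediately (static), while a variable index produces "generated code" that carries its own bounds check (dynamic). All names here are assumptions for illustration.

```python
class CompileError(Exception):
    """Raised by the static (compile-time) check."""


def compile_index(array_length, index):
    """Return a function that reads one element of an array.

    `index` is either an int constant or the name of a variable (str).
    """
    if isinstance(index, int):
        # Static check: the index is known at compile time.
        if not 0 <= index < array_length:
            raise CompileError(f"index {index} out of range")
        return lambda array, env: array[index]

    def checked_read(array, env):
        # Dynamic check: performed by the generated code at run time.
        value = env[index]
        if not 0 <= value < array_length:
            raise IndexError(f"index {value} out of range")
        return array[value]

    return checked_read
```

Here `compile_index(3, 7)` fails at "compile time", whereas `compile_index(3, "i")` succeeds and the resulting function fails only when it is run with an out-of-range value of `i`.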
Code Generation and Intermediate Code Forms
• A typical intermediate form of code produced by the
semantic analyzer is an abstract syntax tree (AST)
• The AST is annotated with useful information such
as pointers to the symbol table entry of identifiers
AST of the code:
    while b ≠ 0
        if a > b
            a := a − b
        else
            b := b − a
    return a
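One way to picture an AST for this code is as nested tuples in Python. The node kinds ("seq", "while", "if", "assign", "return", "var", "const") and the small tree-walking functions are assumptions chosen for illustration; a real compiler would also attach symbol-table pointers to the identifier nodes.

```python
# AST for the subtraction-based program above, as nested tuples.
AST = ("seq",
       ("while", ("!=", ("var", "b"), ("const", 0)),
        ("if", (">", ("var", "a"), ("var", "b")),
         ("assign", "a", ("-", ("var", "a"), ("var", "b"))),
         ("assign", "b", ("-", ("var", "b"), ("var", "a"))))),
       ("return", ("var", "a")))


def evaluate(node, env):
    """Compute the value of an expression node."""
    kind = node[0]
    if kind == "const":
        return node[1]
    if kind == "var":
        return env[node[1]]
    left, right = evaluate(node[1], env), evaluate(node[2], env)
    return {"-": left - right, ">": left > right, "!=": left != right}[kind]


def execute(node, env):
    """Run a statement node; return a value only when a 'return' is reached."""
    kind = node[0]
    if kind == "seq":
        for statement in node[1:]:
            result = execute(statement, env)
            if result is not None:
                return result
    elif kind == "while":
        while evaluate(node[1], env):
            result = execute(node[2], env)
            if result is not None:
                return result
    elif kind == "if":
        return execute(node[2] if evaluate(node[1], env) else node[3], env)
    elif kind == "assign":
        env[node[1]] = evaluate(node[2], env)
    elif kind == "return":
        return evaluate(node[1], env)
    return None
```

Walking the tree with `execute(AST, {"a": 12, "b": 18})` runs the program directly from its AST and returns 6.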
• There are other intermediate code forms,
such as three-address code and static
single assignment (SSA) form.
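Three-address code can be sketched by flattening an expression tree so that each instruction has at most one operator. The nested-tuple tree format, the temporary names t1, t2, ..., and the function name are assumptions for illustration.

```python
def to_three_address(tree):
    """Flatten a nested-tuple expression tree into three-address instructions."""
    code = []

    def emit(node):
        if not isinstance(node, tuple):
            return str(node)  # a constant or variable name is its own operand
        op, left, right = node
        left_name, right_name = emit(left), emit(right)
        temp = f"t{len(code) + 1}"  # a fresh temporary for this result
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = emit(tree)
    return code, result
```

For the tree `("+", ("-", "a", "b"), ("*", "c", "d"))` this yields `t1 = a - b`, `t2 = c * d`, `t3 = t1 + t2`, with the final value in `t3`. Each temporary is assigned exactly once, which is also the defining property of SSA form.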
Target Code Generation and Optimization
• From the machine-independent form, the compiler generates assembly or
object code
• This machine-specific code is optimized to exploit
specific hardware features
• In short, a compiler's code generator converts the intermediate
representation of the source code (the intermediate code forms produced by
intermediate code generation and optimization) into a
form (e.g., machine code) that can be readily executed by
a machine.
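As a sketch of target code generation, the Python example below translates an expression tree into instructions for a hypothetical stack machine and then simulates that machine. The instruction names (PUSH, ADD, SUB, MUL) and both function names are invented for illustration; real target code would be assembly or object code for actual hardware.

```python
def generate(tree):
    """Emit post-order stack-machine instructions for an expression tree."""
    if not isinstance(tree, tuple):
        return [("PUSH", tree)]
    op, left, right = tree
    names = {"+": "ADD", "-": "SUB", "*": "MUL"}
    return generate(left) + generate(right) + [(names[op],)]


def run(instructions):
    """Execute the instructions on a simulated stack machine."""
    stack = []
    for instruction in instructions:
        if instruction[0] == "PUSH":
            stack.append(instruction[1])
            continue
        # Binary operators pop the right operand first, then the left.
        right, left = stack.pop(), stack.pop()
        if instruction[0] == "ADD":
            stack.append(left + right)
        elif instruction[0] == "SUB":
            stack.append(left - right)
        elif instruction[0] == "MUL":
            stack.append(left * right)
    return stack.pop()
```

For the tree `("+", ("-", 1, 1), ("*", 1, 1))`, `generate` produces PUSH 1, PUSH 1, SUB, PUSH 1, PUSH 1, MUL, ADD, and `run` on that sequence returns 1.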
Basic Understanding: Compiler (Summary)
• Compiler front-end: lexical analysis, syntax analysis,
semantic analysis
– Tasks: understanding the source code, making sure
the source code is written correctly
• Compiler back-end: intermediate code
generation/improvement and machine code
generation/improvement
– Tasks: translating the program into a semantically
equivalent program (in a different language) that can be
easily understood and executed by the machine.
Error Detection in Compilers
• A compiler should detect all errors in the
source code and report them to the user.
Error Detection: Types of Error
• Compilation Errors
– Lexical Errors
– Syntactic Errors
– Semantic Errors
• Execution Errors
– Run-time Errors
Errors during Lexical Analysis
• Strange characters (ñ, श, £, etc.)
• Overly long quoted strings
• Invalid numbers (e.g., 12231.545.23)
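These three kinds of lexical error can be detected with a short Python sketch. The allowed character set, the 80-character string limit, and the function name `lexical_errors` are all assumptions; a real language defines its own rules. For simplicity the sketch also scans the contents of quoted strings.

```python
import re
import string

# Characters this hypothetical language accepts.
ALLOWED = set(string.ascii_letters + string.digits + string.whitespace + "_+-*/=<>(){};.,\"")
MAX_STRING_LENGTH = 80  # hypothetical limit on quoted strings


def lexical_errors(source):
    """Return a list of lexical error messages found in the source text."""
    errors = []
    for char in sorted(set(source) - ALLOWED):
        errors.append(f"strange character {char!r}")
    for literal in re.findall(r'"[^"\n]*"', source):
        if len(literal) - 2 > MAX_STRING_LENGTH:
            errors.append("quoted string too long")
    for number in re.findall(r"\d[\d.]*", source):
        if number.count(".") > 1:
            errors.append(f"invalid number {number}")
    return errors
```

With these assumptions, `lexical_errors("x = 12231.545.23;")` reports the malformed number, and a source line containing `£` reports a strange character.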
Errors during Syntax Analysis
• A syntax error is produced by the compiler when the
program does not meet the grammatical
specification.
• Example: for the arithmetic-expression CFG shown earlier, 1+*1 is a syntax error.
Errors during Semantic Analysis
• One of the most common errors reported during
semantic analysis is "identifier not declared"; either you
have omitted a declaration or you have misspelt an
identifier.
• Another error is assignment of incompatible types.
• Other possible sources of semantic errors are parameter
miscount and subscript miscount.
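The checks listed above can be sketched as a small Python function. The statement format (tuples such as ("declare", name, type)), the `functions` mapping, and the error messages are hypothetical, invented for illustration; subscript miscount is left out to keep the sketch short.

```python
def check(statements, functions):
    """Report semantic errors in a list of simple statements.

    statements: ("declare", name, type) | ("assign", name, value_type)
                | ("call", function_name, argument_count)
    functions:  maps a function name to its expected parameter count
    """
    declared = {}
    errors = []
    for statement in statements:
        kind = statement[0]
        if kind == "declare":
            declared[statement[1]] = statement[2]
        elif kind == "assign":
            _, name, value_type = statement
            if name not in declared:
                errors.append(f"identifier '{name}' not declared")
            elif declared[name] != value_type:
                errors.append(f"cannot assign {value_type} to {declared[name]} '{name}'")
        elif kind == "call":
            _, name, argument_count = statement
            if name not in functions:
                errors.append(f"identifier '{name}' not declared")
            elif functions[name] != argument_count:
                errors.append(
                    f"'{name}' expects {functions[name]} argument(s), got {argument_count}")
    return errors


program = [
    ("declare", "count", "int"),
    ("assign", "count", "string"),  # incompatible types
    ("assign", "cuont", "int"),     # misspelt identifier
    ("call", "max", 3),             # parameter miscount
]
```

Calling `check(program, {"max": 2})` returns one message for each of the three faulty statements, in order.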