
Conceptual Structure: two major phases

The front-end performs the analysis of the source language:
• Recognises legal and illegal programs and reports errors.
• "Understands" the input program and collects its semantics in an intermediate representation (IR).
• Produces IR and shapes the code for the back-end.
The back-end performs the synthesis of the target language:
• Translates IR into target code.

Lexical Analysis
Reads the characters of the source program and groups them into output units called tokens; each token is a pair.
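
As a minimal sketch of this phase, the scanner below turns a character string into a list of tokens. The notes do not say what the two components of a token pair are, so the (type, lexeme) shape used here is an assumption, and the token classes in TOKEN_SPEC are hypothetical names for a tiny expression language.

```python
import re

# Hypothetical token classes for a tiny expression language (not from the notes).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=()]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (token_type, lexeme) pairs; raise on an illegal character."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":      # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling tokenize("x = 3 + 42") yields one pair per lexeme, with whitespace discarded.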

Syntax (or syntactic) Analysis (Parsing)


Imposes a hierarchical structure on the token stream.
The structure reflects the ordering of operations, for example the precedence of arithmetic operators.
Semantic Analysis (context handling)
Collects context (semantic) information, checks for semantic errors, and annotates the nodes of the
tree with the results.
An important part is type checking, where the compiler checks that each operator has matching
operands.

Examples:
• Type checking: report an error if an operator is applied to an incompatible operand.
• E.g., an array index must be an integer; report an error if the index is a floating-point value.
• Flow-of-control checks.
• Uniqueness or name-related checks.
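
The type-checking examples above can be sketched as a recursive walk over an expression tree. The tuple encoding of tree nodes, the node kinds, and the rule that every array holds integers are all assumptions made for this illustration, not something the notes define.

```python
# Hypothetical AST encoding: ("num", 3), ("float", 2.5),
# ("add", left, right) and ("index", array_name, index_expr).

def type_of(node, array_names):
    """Return "int" or "float" for an expression node, or raise TypeError."""
    kind = node[0]
    if kind == "num":
        return "int"
    if kind == "float":
        return "float"
    if kind == "add":
        left = type_of(node[1], array_names)
        right = type_of(node[2], array_names)
        if left != right:
            raise TypeError(f"operands of + have mismatched types: {left} and {right}")
        return left
    if kind == "index":
        name, index_expr = node[1], node[2]
        if name not in array_names:
            raise TypeError(f"{name} is not a declared array")
        if type_of(index_expr, array_names) != "int":
            raise TypeError("array index must be an integer")
        return "int"  # assumption: every array holds ints
    raise ValueError(f"unknown node kind: {kind}")
```

An index expression such as ("index", "a", ("float", 2.5)) is rejected, while an integer-valued index is accepted.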

Intermediate code generation


Translate language-specific constructs in the AST into more general constructs. The intermediate code has two requirements:
• It should be easy to produce.
• It should be easy to translate into target machine code.
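
One common intermediate form is three-address code. The notes do not name a specific IR, so the choice here is an assumption, as are the tuple-shaped AST nodes and the temporary names t1, t2, and so on. A minimal sketch:

```python
import itertools

def gen_tac(node, code, temps):
    """Append three-address instructions for `node` to `code`; return the operand holding its value."""
    if isinstance(node, (int, str)):      # literal or variable name: already an operand
        return str(node)
    op, left, right = node                # hypothetical AST node: (operator, left, right)
    a = gen_tac(left, code, temps)
    b = gen_tac(right, code, temps)
    target = f"t{next(temps)}"            # fresh temporary for this subexpression
    code.append(f"{target} = {a} {op} {b}")
    return target

def translate(ast):
    """Return the list of three-address instructions for one expression tree."""
    code = []
    gen_tac(ast, code, itertools.count(1))
    return code
```

For the tree ("+", "a", ("*", "b", 2)), translate returns the two instructions "t1 = b * 2" and "t2 = a + t1".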

Code Optimisation
The goal is to improve the intermediate code so that better target code results, for example code that runs faster or is smaller.
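
Constant folding is one simple improvement of this kind; the notes do not list specific optimisations, so it is chosen here only as an illustration. The sketch assumes expressions are encoded as hypothetical (operator, left, right) tuples with integer literals and variable names as leaves.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(node):
    """Return an equivalent expression with constant subexpressions evaluated."""
    if not isinstance(node, tuple):
        return node                       # leaf: integer literal or variable name
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)       # both operands known at compile time
    return (op, left, right)
```

Here fold(("+", "x", ("*", 2, 3))) gives ("+", "x", 6): the multiplication is done at compile time and only the addition remains.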

Code Generation Phase


Map the AST onto a linear list of target machine instructions in a symbolic form:
• Instruction selection: a pattern-matching problem.
• Register allocation: each value should be in a register when it is used.
• Instruction scheduling: take advantage of multiple functional units.
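
A very naive sketch of the first two tasks follows. The instruction mnemonics (LOAD, LOADI, ADD, MUL), the register names, and the tuple-shaped expression tree are all hypothetical, and the register assignment simply uses one more register per nesting level rather than a real allocation algorithm. Instruction scheduling is not attempted.

```python
def gen(node, out, next_reg=0):
    """Emit symbolic instructions computing `node` into register r<next_reg>; return that register."""
    reg = f"r{next_reg}"
    if isinstance(node, int):
        out.append(f"LOADI {reg}, {node}")        # load an immediate constant
    elif isinstance(node, str):
        out.append(f"LOAD {reg}, {node}")         # load a variable from memory
    else:
        op, left, right = node
        gen(left, out, next_reg)                  # left operand ends up in `reg`
        other = gen(right, out, next_reg + 1)     # right operand in the next register
        mnemonic = {"+": "ADD", "*": "MUL"}[op]   # instruction selection by operator
        out.append(f"{mnemonic} {reg}, {reg}, {other}")
    return reg
```

For ("+", "a", ("*", "b", 2)) this emits five instructions, ending with an ADD that leaves the result in r0.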

Compiler Construction Tools

• Parser generators: automatically produce syntax analyzers from a grammatical description of a programming language.
• Scanner generators: produce lexical analyzers from a regular-expression description of the tokens of a language.
• Syntax-directed translation engines: produce collections of routines for walking a parse tree and generating intermediate code.
• Code-generator generators: produce a code generator from a collection of rules for translating each operation of the intermediate language into the machine language for a target machine.
• Data-flow engines: facilitate the gathering of information about how values are transmitted from one part of a program to each other part.
• Compiler-construction toolkits: provide an integrated set of routines for constructing various phases of a compiler.

Some commonly used compiler-construction tools include:


1. Parser generators.
2. Scanner generators.
3. Syntax-directed translation engines.
4. Automatic code generators.
5. Data-flow analysis engines.
6. Compiler-construction toolkits.
Parser Generators
Input: Grammatical description of a programming language
Output: Syntax analyzers.
A parser generator takes the grammatical description of a programming language and produces a
syntax analyzer.
Scanner Generators
Input: Regular expression description of the tokens of a language
Output: Lexical analyzers.
A scanner generator generates a lexical analyzer from a regular-expression description of the tokens
of a language.
Syntax-directed Translation Engines
Input: Parse tree.
Output: Intermediate code.
Syntax-directed translation engines produce collections of routines that walk a parse tree and
generate intermediate code.
Automatic Code Generators
Input: Intermediate language.
Output: Machine language.
An automatic code generator takes a collection of rules that define the translation of each operation of the
intermediate language into the machine language for a target machine.
Data-flow Analysis Engines
A data-flow analysis engine gathers information about how values are transmitted from one part of
a program to each of the other parts. Data-flow analysis is a key part of code optimization.
Compiler Construction Toolkits
Compiler-construction toolkits provide an integrated set of routines for constructing the various
phases of a compiler.

1. Available scanner and parser generators for C/C++:
• Scanner generators for C/C++: Flex, Lex.
• Parser generators for C/C++: Bison, Yacc.
2. Available scanner generators for Java:
• JLex, a scanner generator for Java, very similar to Lex.
• JFlex, Flex for Java.
3. Available parser generators for Java:
• CUP, a parser generator for Java, very similar to YACC.
• BYACC/J, a version of Berkeley YACC extended for Java. It is an extension of the standard
YACC (a -J flag has been added to generate Java code).
4. Other compiler tools:
• JavaCC, a parser generator for Java, including a scanner generator and a parser generator. Its input
specifications differ from those suitable for Lex/YACC. Also, unlike YACC, JavaCC
generates a top-down parser.
• ANTLR, a set of language translation tools (formerly PCCTS). Includes scanner/parser
generators for C, C++, and Java.

Error Recovery
Panic mode
When a parser encounters an error anywhere in a statement, it discards the rest of the statement,
skipping input from the point of the error up to a delimiter such as a semicolon.
Statement mode
When a parser encounters an error, it tries to take corrective measures so that the rest of the
statement can still be parsed, for example by inserting a missing semicolon or replacing a comma with a semicolon.
Error productions
Some common errors that may occur in the code are known to the compiler designers.
The designers can augment the grammar with productions that generate these erroneous
constructs, so that the parser recognises the error when such a production is used.
Global correction
The parser considers the program as a whole, tries to figure out what it is
intended to do, and tries to find the closest error-free match for it.
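
Panic mode, the first strategy above, is the simplest to sketch in code. The statement form `ID = NUMBER ;` and the list-of-strings token stream below are hypothetical, chosen only to keep the example short.

```python
def parse_statements(tokens):
    """Parse statements of the hypothetical form `ID = NUMBER ;`, recovering in panic mode."""
    statements, errors = [], []
    i = 0
    while i < len(tokens):
        window = tokens[i:i + 4]
        if (len(window) == 4 and window[0].isidentifier() and window[1] == "="
                and window[2].isdigit() and window[3] == ";"):
            statements.append((window[0], int(window[2])))
            i += 4
            continue
        errors.append(f"syntax error near token {i}: {tokens[i]!r}")
        # Panic mode: discard tokens up to and including the next ';'.
        while i < len(tokens) and tokens[i] != ";":
            i += 1
        i += 1
    return statements, errors
```

With the input `x = 1 ; y = = 2 ; z = 3 ;` the malformed middle statement is reported once and skipped, and parsing resumes with the statement for z.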
Input Buffering

Buffer Pairs

Parsing

Parsing methods
• Top-down parsing: construction starts at the root and proceeds towards the leaves.
• Bottom-up parsing: construction starts at the leaves and proceeds towards the root.
1. Top-down parsers are easy to build by hand.
2. Bottom-up parsers can handle a larger class of grammars. They are not as easy to build, but tools
for generating them directly from a grammar are available.
Predictive Parsing
Recursive-descent parsing is a top-down method of syntax analysis in which a set of recursive
procedures is used to process the input.
Predictive parsing is a simple form of recursive descent.

Backtracking

A Context Free Grammar


A context-free grammar has four components: a set of terminals, a set of nonterminals, a set of productions, and a start symbol.
Top-Down Parsing
The parse tree is created from the top (root) to the bottom (leaves).
Top-down parsers:
• Recursive-descent parsing
  - Backtracking is needed.
  - It is a general parsing technique, but not widely used.
  - Not efficient.
• Predictive parsing
  - No backtracking.
  - Efficient.
  - Needs a special form of grammars (LL(1) grammars).
  - Recursive predictive parsing is a special form of recursive-descent parsing without backtracking.
  - A non-recursive (table-driven) predictive parser is also known as an LL(1) parser.
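
A recursive predictive parser of the kind described above can be sketched with one procedure per nonterminal. The expression grammar, the list-of-strings token stream, and the tuple-shaped tree it returns are assumptions made for this example; the repetition in E and T is written as a loop, which is a common way to code the tail-recursive productions.

```python
class Parser:
    """Predictive recursive-descent parser for a hypothetical grammar:
        E -> T { + T }      T -> F { * F }      F -> NUMBER | ( E )
    One token of lookahead selects each production, so no backtracking occurs."""

    def __init__(self, tokens):
        self.tokens = tokens + ["$"]   # "$" marks end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def parse(self):
        tree = self.expr()
        self.expect("$")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() == "+":
            self.pos += 1
            node = ("+", node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() == "*":
            self.pos += 1
            node = ("*", node, self.factor())
        return node

    def factor(self):
        token = self.peek()
        if token == "(":
            self.pos += 1
            node = self.expr()
            self.expect(")")
            return node
        if token.isdigit():
            self.pos += 1
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")
```

Parsing the tokens 1 + 2 * 3 gives ("+", 1, ("*", 2, 3)), so the tree built top-down also reflects the higher precedence of multiplication.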
Syntax-directed translation

The grammars used are called attribute grammars.

• An attribute has a name and an associated value.
• With each production in a grammar, give semantic rules or actions.
• The general approach to syntax-directed translation is to construct a parse tree or syntax
tree and compute the values of attributes at the nodes of the tree by visiting them in some
order.
(Semantic rules use an attribute grammar, which includes production rules with attribute names and
values.)
There are two ways to represent the semantic rules associated with grammar symbols:
• Syntax-Directed Definitions (SDD)
• Syntax-Directed Translation Schemes (SDT)

A syntax-directed definition (SDD) is a context-free grammar together with attributes and rules.
Attributes are associated with grammar symbols and rules are associated with productions.
PRODUCTION          SEMANTIC RULE
E → E1 + T          E.code = E1.code || T.code || '+'
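
Reading || as string concatenation, the rule above builds the postfix form of an expression from the code of its operands. A minimal sketch, assuming a hypothetical (operator, left, right) tuple for each interior node and assuming that the code of a leaf is simply its own text:

```python
def code(node):
    """Return the synthesized `code` attribute (postfix notation) of a syntax-tree node."""
    if isinstance(node, tuple):
        op, left, right = node
        # Mirrors E.code = E1.code || T.code || '+', with || as string concatenation.
        return code(left) + code(right) + op
    return str(node)   # assumption: a leaf's code is its own text
```

For the tree of a + b + c, written ("+", ("+", "a", "b"), "c"), the attribute value is "ab+c+".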

A syntax-directed translation scheme (SDT) embeds program fragments, called semantic actions,
within production bodies.

SDTs can be more efficient than SDDs because they specify the order in which the semantic actions
associated with a production are evaluated.
Annotated parse-tree

An annotated parse tree is a parse tree showing the values of the attributes at each
node. The process of computing the attribute values at the nodes is called annotating or
decorating the parse tree.
Example 3: The following syntax-directed definition is from a desk calculator.
