Chapter 1
CSE431
Introduction to Compilers
Programming Languages and Algorithms
Theory of Computing & Software Engineering
Computer Architecture & Operating Systems
Source program
Compiler
Target Program
Classifications of Compilers
Multiple Pass
Load & Go
Construction
Debugging
Optimizing
Functional
The Model
Analysis:
Important Notes
Today: There are many software tools for helping with the analysis part. This wasn't the case in the early days.
(Some) analysis is also important in:
Pretty Printers: Standardized version for program structure (i.e., blank space, indenting, etc.)
Analyzes the source program and prints it in such a way that the structure of the program becomes clearly visible.
Examples:
Comments may appear in a special font.
Statements may appear with an amount of indentation proportional to the depth of their nesting in the hierarchical organization of the statements.
Examples:
Detects parts of the program that can never be executed
A variable used before it is defined
Text Formatters
LaTeX & troff are languages whose commands format text (paragraphs, figures, mathematical structures, etc.)
Silicon Compilers
Textual / Graphical: Take Input and Generate Circuit Design
Lexical Analyzer
Syntax Analyzer
Symbol-table Manager
Code Optimizer
Code Generator
Target Program
Language-Processing System
Skeleton Source Program
Pre-Processor
Source Program
Compiler
Assembler
Three Phases:
Hierarchical Analysis:
Grouping of Tokens Into Meaningful Collection
Semantic Analysis:
Checking to ensure Correctness of Components
All are tokens
Blanks, line breaks, etc. are scanned out
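The scanning step can be sketched in a few lines. This is a minimal illustration, not the course's scanner: the token class names (ID, NUMBER, ASSIGN, OP) are assumptions, and the input is the statement position := initial + rate * 60 that appears in the later slides.

```python
import re

# Token classes for the running example; the class names are illustrative.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP",     r"[+*]"),
    ("SKIP",   r"\s+"),   # blanks and line breaks are scanned out
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def tokenize(text):
    """Return (kind, lexeme) pairs in input order, dropping white space."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling tokenize("position := initial + rate * 60") yields seven tokens; the scan is a single left-to-right loop with no recursion.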
[Figure: parse tree for position := initial + rate * 60, with expression and identifier nodes]
What is a Grammar?
Grammar is a Set of Rules Which Govern the Interdependencies & Structure Among the Tokens
A statement is an assignment statement, or while statement, or if statement, or ...
An assignment statement is an identifier := expression ;
An expression is an (expression), or expression + expression, or expression * expression, or number, or identifier, or ...
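The rules above can be turned into a small recursive-descent parser. This is a sketch under stated assumptions: the rules as written are ambiguous, so the code layers them into expression, term and factor to give * higher precedence than + (the usual convention, not something the slide states), it handles only assignment statements, and it takes a list of already separated tokens. The function name parse_statement and the nested-tuple tree format are hypothetical.

```python
def parse_statement(tokens):
    """Parse 'identifier := expression ;' and return a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {peek()!r}")
        pos += 1

    def factor():
        # factor: ( expression ) | number | identifier
        nonlocal pos
        tok = peek()
        if tok == "(":
            expect("(")
            node = expression()
            expect(")")
            return node
        if tok is not None and (tok.isidentifier() or tok.isdigit()):
            pos += 1
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        # term: factor { * factor }
        node = factor()
        while peek() == "*":
            expect("*")
            node = ("*", node, factor())
        return node

    def expression():
        # expression: term { + term }
        node = term()
        while peek() == "+":
            expect("+")
            node = ("+", node, term())
        return node

    target = peek()
    if target is None or not target.isidentifier():
        raise SyntaxError("statement must start with an identifier")
    pos += 1
    expect(":=")
    tree = (":=", target, expression())
    expect(";")
    if pos != len(tokens):
        raise SyntaxError("trailing tokens after ';'")
    return tree
```

For example, parse_statement("position := initial + rate * 60 ;".split()) groups rate * 60 below the + node, which is the hierarchical grouping the grammar is meant to capture.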
Lexical Analysis - Scans Input, Its Linear Actions Are Not Recursive
Identify Only Individual Words That Are the Tokens of the Language
Determine Whether the Sentences Have One and Only One Unambiguous Interpretation and do something about it!
e.g. John Took Picture of Mary Out on the Patio
Find More Complicated Semantic Errors and Support Code Generation
Parse Tree Is Augmented With Semantic Actions
[Figure: compressed tree for position := initial + rate * 60, and the same tree after the conversion action wraps 60 in inttoreal]
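The conversion action can be sketched as a type-checking walk over the tree. This is an illustration only: it assumes position, initial and rate are all declared real (the slides do not show the declarations), and the function name check_types and the nested-tuple tree format are hypothetical.

```python
def check_types(node, symbol_types):
    """Return (annotated_tree, type) for a nested-tuple tree.

    Leaves are identifier names or integer literals written as strings.
    """
    if isinstance(node, str):
        if node.isdigit():
            return node, "int"
        if node not in symbol_types:
            raise NameError(f"undeclared identifier {node!r}")
        return node, symbol_types[node]

    op, left, right = node
    left, left_type = check_types(left, symbol_types)
    right, right_type = check_types(right, symbol_types)

    if op == ":=" and left_type == "int" and right_type == "real":
        raise TypeError("cannot assign a real value to an integer variable")

    # Conversion action: widen an integer operand that meets a real operand.
    if left_type == "real" and right_type == "int":
        right = ("inttoreal", right)
    elif left_type == "int" and right_type == "real":
        left = ("inttoreal", left)

    result_type = "real" if "real" in (left_type, right_type) else "int"
    return (op, left, right), result_type
```

Applied to the tree for position := initial + rate * 60, the only change is that the leaf 60 becomes ("inttoreal", "60"); an undeclared name or a real-to-integer assignment is reported as an error instead.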
Symbol Table
Contains Info (storage, type, scope, args) on Each Meaningful Token, Typically Identifiers
Data Structure Created / Initialized During Lexical Analysis
Utilized / Updated During Later Analysis & Synthesis
Error Handling
Detection of Different Errors Which Correspond to All Phases
What Kinds of Errors Are Found During the Analysis Phase?
What Happens When an Error Is Found?
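The symbol table described above (one record per identifier, created during lexical analysis and updated by later phases) could be sketched as follows. The class and field names are hypothetical; a real compiler would also handle nested scopes.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SymbolEntry:
    name: str
    type: Optional[str] = None      # filled in during semantic analysis
    scope: str = "global"
    storage: Optional[int] = None   # e.g. an offset assigned during synthesis

class SymbolTable:
    def __init__(self):
        self._entries = {}

    def insert(self, name):
        """Create the entry on first sight; return the existing one otherwise."""
        return self._entries.setdefault(name, SymbolEntry(name))

    def lookup(self, name):
        """Return the entry for name, or None if it was never inserted."""
        return self._entries.get(name)
```

The lexical analyzer would call insert for each identifier it scans, and later phases would call lookup and fill in type and storage.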
Code Optimization
Find More Efficient Ways to Execute Code
Replace Code With More Optimal Statements
Code Generation
Generate Relocatable Machine Dependent Code
[Figure: translation of a statement. Syntax tree for id1 := id2 + id3 * 60 -> semantic analyzer -> the same tree with inttoreal applied to 60 -> intermediate code generator. Side boxes: Symbol Table (position, initial, rate) and Errors.]
intermediate code generator (3 address code)
    temp1 := inttoreal(60)
    temp2 := id3 * temp1
    temp3 := id2 + temp2
    id1 := temp3
code optimizer
    temp1 := id3 * 60.0
    id1 := id2 + temp1
final code generator
    MOVF id3, R2
    MULF #60.0, R2
    MOVF id2, R1
    ADDF R2, R1
    MOVF R1, id1
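The intermediate code generation step can be sketched as a post-order walk that emits one three-address statement per interior tree node. The function name three_address_code and the nested-tuple tree format are assumptions made for illustration; the temporaries are numbered in the order they are created.

```python
import itertools

def three_address_code(tree):
    """Emit three-address statements for a nested-tuple assignment tree."""
    code = []
    counter = itertools.count(1)

    def emit(node):
        # Leaves (names or literals) need no instruction of their own.
        if isinstance(node, str):
            return node
        if node[0] == "inttoreal":
            rhs = f"inttoreal({emit(node[1])})"
        else:
            op, left, right = node
            left_place = emit(left)
            right_place = emit(right)
            rhs = f"{left_place} {op} {right_place}"
        temp = f"temp{next(counter)}"
        code.append(f"{temp} := {rhs}")
        return temp

    _, target, expr = tree
    result = emit(expr)
    code.append(f"{target} := {result}")
    return code
```

For the tree (":=", "id1", ("+", "id2", ("*", "id3", ("inttoreal", "60")))) this returns the four unoptimized statements, from temp1 := inttoreal(60) through id1 := temp3. The optimizer's job is then to fold the conversion into the constant 60.0 and drop the redundant temporaries.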
Assemblers
Assembly code: names are used for instructions, and names are used for memory addresses.
    MOV a, R1
    ADD #2, R1
    MOV R1, b
Two-pass Assembly:
First Pass: all identifiers are assigned to memory addresses (0-offset), e.g. substitute 0 for a, and 4 for b
Second Pass: produce relocatable machine code
[Figure: the Load, Add and Store instructions as relocatable machine code, with a relocation bit]
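The two passes can be sketched as follows. This is a simplified illustration: instead of bit patterns it produces tuples of the form (op, source, destination, relocation_bit), and it assumes a word size of 4 so that a gets address 0 and b gets address 4. The function name assemble and the tuple layout are hypothetical.

```python
def assemble(lines, word_size=4):
    """Two-pass assembly of 'OP src, dst' lines into relocatable tuples."""
    parsed = []
    for line in lines:
        op, operands = line.split(None, 1)
        src, dst = [part.strip() for part in operands.split(",")]
        parsed.append((op, src, dst))

    def is_identifier(operand):
        # Registers (R1, R2, ...) and immediates (#2) are not memory names.
        is_register = operand.startswith("R") and operand[1:].isdigit()
        return not operand.startswith("#") and not is_register

    # First pass: assign each identifier an address relative to 0.
    addresses = {}
    for _, src, dst in parsed:
        for operand in (src, dst):
            if is_identifier(operand) and operand not in addresses:
                addresses[operand] = len(addresses) * word_size

    # Second pass: replace names by addresses and set the relocation bit.
    def encode(operand):
        if is_identifier(operand):
            return addresses[operand], True
        return operand, False

    code = []
    for op, src, dst in parsed:
        src_value, src_reloc = encode(src)
        dst_value, dst_reloc = encode(dst)
        code.append((op, src_value, dst_value, src_reloc or dst_reloc))
    return addresses, code
```

Running assemble(["MOV a, R1", "ADD #2, R1", "MOV R1, b"]) assigns a and b the offsets 0 and 4 and marks the two MOV instructions, which mention memory addresses, as relocatable.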
Loader: takes relocatable machine code, alters the addresses, and places the altered instructions into memory.
Link-editor: takes many (relocatable) machine code programs (with cross-references) and produces a single file.
Need to keep track of correspondence between variable names and corresponding addresses in each piece of code.
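The address alteration done by the loader can be sketched as adding the load address to every operand flagged as relocatable. The tuple layout (op, source, destination, relocation_bit), the function name load, and the load address 15 used in the example are all assumptions made for illustration.

```python
def load(code, load_address):
    """Return the code with every relocatable address shifted by load_address."""
    loaded = []
    for op, src, dst, relocatable in code:
        if relocatable:
            # Only integer operands are addresses; registers and immediates stay.
            src = src + load_address if isinstance(src, int) else src
            dst = dst + load_address if isinstance(dst, int) else dst
        loaded.append((op, src, dst))
    return loaded

# Relocatable code for: MOV a, R1 / ADD #2, R1 / MOV R1, b (a at 0, b at 4).
relocatable_code = [
    ("MOV", 0, "R1", True),
    ("ADD", "#2", "R1", False),
    ("MOV", "R1", 4, True),
]
absolute_code = load(relocatable_code, 15)
```

With a load address of 15, a ends up at address 15 and b at address 19, while the immediate #2 is left untouched because its instruction carries no relocation bit.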
2. File Inclusion
#include in C - bring in another file before compiling
[Figure: main.c contains the line #include "defs.h"; after preprocessing, the contents of defs.h appear in main.c in place of that line]
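File inclusion can be sketched as a textual substitution done before compiling. This is a toy model, not the C preprocessor: it handles only the quoted form #include "name", reads files from an in-memory dictionary rather than the disk, and the function name expand_includes is hypothetical.

```python
import re

INCLUDE_RE = re.compile(r'^\s*#include\s+"([^"]+)"\s*$')

def expand_includes(name, files, active=()):
    """Return the lines of files[name] with each #include line replaced
    by the (recursively expanded) lines of the named file."""
    if name in active:
        raise ValueError(f"circular include of {name!r}")
    output = []
    for line in files[name].splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            output.extend(expand_includes(match.group(1), files, active + (name,)))
        else:
            output.append(line)
    return output

files = {
    "defs.h": "#define N 10\nint f(void);",
    "main.c": '#include "defs.h"\nint main(void) { return N; }',
}
expanded = expand_includes("main.c", files)
```

Here expanded holds the two lines of defs.h followed by the remaining line of main.c, which is the text the compiler proper would see.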
3. Rational Preprocessors
Augment Old Languages With Modern Constructs
Add Macros for If - Then, While, Etc.
is Preprocessed into:
ingres_system(Retr..Research,____,____);
Data-Flow Engines:
Support Optimization
The End