
GCC COMPILER

Optimization Levels
-------------------

-O0  disable optimizations (the default in GCC)
-O1  optimize for speed, but disable some optimizations which increase code size for a small speed benefit
-O   same as -O1
-O2  optimize for maximum speed
-O3  optimize for maximum speed and enable more aggressive optimizations that may not improve performance on some programs
-Os  enable speed optimizations, but disable some which increase code size for a small speed benefit

1. Introduction:
The GNU Compiler Collection (GCC) is a compiler system produced by the GNU Project that supports various programming languages. At the time of writing, the stable GCC release is 4.6.2; it is written in C and C++, targets the GNU platform, and is cross-platform. GCC was originally written as the compiler for the GNU operating system. It is a compiler generation framework which generates production-quality optimizing compilers from descriptions of target platforms.

=> GNU is a recursive acronym for "GNU's Not Unix". It is a Unix-like OS developed by the GNU Project, aiming to be a complete Unix-compatible software system composed wholly of free software. GNU development was initiated by Richard Stallman in 1983. The latest alpha release of the GNU system is GNU 0.401, released on 1 April 2011, featuring GNU Hurd as the system's kernel.

Figure: Compiler flow of the GCC compiler:
C source code -> Parser -> C tree -> Genericizer -> GENERIC -> Gimplifier -> GIMPLE -> Tree-SSA Optimizer -> RTL Generator -> RTL -> RTL Optimizer -> optimized RTL -> Code Generator -> Assembly Code

GCC Front-end:
The front end has to produce trees that can be handled by the back end, but initially the meaning of a tree was somewhat different for different language front ends. This was simplified with the introduction of GENERIC and GIMPLE, two new forms of language-independent trees that arrived with GCC 4.0.

=> GENERIC simply provides a language-independent way of representing an entire function in trees. GIMPLE is a simplified GENERIC, in which various constructs are lowered to multiple GIMPLE instructions. The C, C++ and Java front ends produce GENERIC directly. GENERIC is the intermediate representation handed to the middle end while compiling source code into executable binaries. The middle stage of GCC does all the code analysis and optimization, working independently of both the compiled language and the target architecture, starting from the GENERIC representation and expanding it to Register Transfer Language (RTL, an intermediate representation (IR) that is very close to assembly language). In transforming the source code to GIMPLE, complex expressions are split into three-address code using temporary variables.

Optimization:
Optimization can occur during any phase of compilation; however, the bulk of optimizations are performed after the syntax and semantic analysis of the front end and before the code generation of the back end. This part of the compiler is called the middle end, a somewhat contradictory name.

Back-end:
The behavior of the GCC back end is partly specified by preprocessor macros and functions specific to a target architecture, for instance to define the endianness, word size and calling conventions. The front part of the back end uses these to help decide RTL generation, so although GCC's RTL is nominally processor-independent, the actual RTL instructions forming the program representation have to comply with the machine description of the target architecture.
The machine description file contains the RTL patterns, along with operand constraints and code snippets to output the final assembly.

Figure: Architecture of the GCC compiler:
Front End (C, C++, Java -> AST -> GENERIC) -> Middle End (GIMPLE -> SSA -> Opt Pass 1 ... Opt Pass N -> un-SSA -> RTL) -> Back End -> Machine code

Debugging GCC:
The GNU Debugger (gdb) is the primary tool used to debug GCC code. More specialized tools are Valgrind, for finding memory errors and leaks, and the graph profiler (gprof), which can determine how much time is spent in which routines and how often they are called; gprof requires the program to be compiled with profiling options.

2. Register Transfer Language (RTL):
RTL is a low-level intermediate representation in which the instructions to be output are described, one by one, in an algebraic form that says what each instruction does. It is used in the later stages of compilation. In GCC, RTL is generated from the GIMPLE representation, transformed by various passes in the GCC 'middle-end', and converted to assembly language. RTL is a convenient tool for describing the internal organization of digital computers in a concise and precise manner. Its notation is inspired by Lisp S-expressions: e.g. register(a) = register(b) + register(c) is expressed in RTL as
(set (reg:SI a) (plus:SI (reg:SI b) (reg:SI c)))
where SI specifies the access mode of each register. Thus RTL is a design abstraction which models a synchronous digital circuit in terms of the flow of digital signals (data) between hardware registers, and the logical operations performed on those signals.

Example RTL representation of a 32-bit integer plus operation:
(insn UID PREV NEXT (set (reg:SI 1) (plus:SI (reg:SI 2) (reg:SI 3))))
where
- the insn node is a container whose UID field holds the unique identifier of the instruction, and whose PREV and NEXT fields link it into the list of instructions;
- the set node represents a store to its first operand (pseudo register 1); the second operand is the actual expression;
- SI is the mode representing a 32-bit integer.

RTL advantages:
- It has some dependency on the characteristics of the processor for which GCC is generated.
- Knowledge of the target processor is not necessary to understand RTL code.
- Its meaning doesn't depend on the high-level (source) language of the program.

RTL semantics are target independent, making it possible to write common optimizers for all targets; however, the syntax (the set of allowed instructions) is target dependent. For instance, an i386 conditional jump is described as:
(insn 56 13 57 (set (reg:CCGC 17 flags) (compare:CCGC (reg:SI 61) (reg:SI 62))))
(jump_insn 57 56 33 (set (pc) (if_then_else (ge (reg:CCGC 17 flags) (const_int 0)) (label_ref 22) (pc))))
These insns show the presence of the flags register (register 17) on the i386 architecture and the split between the compare and the conditional jump.

Consider the example: RTL for the i386 translation of a=a+1;
Dump file: test.c.141r.expand
(insn 12 11 13 4 t.c:24 (parallel [
    (set (mem/c/i:SI (plus:SI (reg/f:SI 54 virtual-stack-vars)
                              (const_int -4 [0xfffffffc])) [0 a+0 S4 A32])
         (plus:SI (mem/c/i:SI (plus:SI (reg/f:SI 54 virtual-stack-vars)
                                       (const_int -4 [0xfffffffc])) [0 a+0 S4 A32])
                  (const_int 1 [0x1])))
    (clobber (reg:CC 17 flags))
  ]) -1 (nil))

Annotations:
- 12: current instruction
- 4: basic block
- t.c:24: file name and line number
- SI: single integer
- mem: memory reference
- /i: scalar that is not part of an aggregate
- reg/f: register that holds a pointer
- (clobber (reg:CC 17 flags)): here plus modifies the condition code register non-deterministically

=> Clobber register:- We use the clobber list to inform gcc that the values stored in these registers are used and modified by ourselves, so gcc will not assume that the values it loads into these registers will remain valid. We shouldn't list the input and output registers in this list. If our instruction can alter the condition code register, we have to add CC to the list of clobbered registers, like (clobber (reg:CC 17 flags)).

=> Control Flow Graph:- The CFG is a data structure built on top of the intermediate code representation (the RTL instruction chain or trees), abstracting the control flow behavior of the compiled function.

References:-
1. http://www.ucw.cz/~hubicka/papers/proj/node6.html
2. http://kcchao.wikidot.com/gcc-internals
3. http://ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#ss5.3
4. http://en.wikipedia.org/wiki/Intel_80386

3. Gray Box Probing of GCC:
Black Box probing:- The user is only aware of what the software is supposed to do, but not how; i.e. only the input and output relationship of the system is examined.
White Box probing:- The internal structures or workings of an application are examined for a given input, as opposed to its functionality as in black box probing.
Gray Box probing:- It is a combination of the two: the input and output relationship is inspected with knowledge of the internal structure and the algorithms of the application. It includes:
- Overview of the translation sequence in GCC
- Overview of intermediate representations
- Intermediate representations of programs across important phases

Basic Translation in GCC
Transformation from a high-level language to a low-level language:
- Target independent (GIMPLE passes): Parse -> Gimplify -> Tree SSA Optimize
- Target dependent (RTL passes): Generate RTL (GIMPLE -> RTL) -> Optimize RTL -> Generate ASM (RTL -> ASM)

Transformation Passes in GCC
The middle end of GCC performs SSA-based optimizations on GIMPLE, then converts the GIMPLE to RTL and does more optimizations. Finally it hands the optimized RTL off to the back end. There are 203 unique pass names initialized in ${SOURCE}/gcc/passes.c, but the total number of passes is 239. Some passes are listed below:

Parsing pass-
- This pass reads the entire text of a function definition, constructing a partial syntax tree.
- The tree representation does not entirely follow C syntax, because it is intended to support other languages as well.
- Language-specific data type analysis is also done in this pass; every node that represents an expression has a data type attached. Variables are represented as declaration nodes.
- Constant folding and some arithmetic simplification are also done here.

RTL generation pass-
- It is actually done statement-by-statement during parsing, but for most purposes it is considered a separate pass.
- Optimization is done in this pass, and decisions are made about how best to arrange loops and how to output 'switch' statements.
- The decision of whether the function can and should be expanded inline in its subsequent callers is made at the end of RTL generation.
- The option '-dr' causes a debugging dump of the RTL code after this pass. The dump file's name is made by appending '.rtl' to the input file name.

Jump Optimization pass-
- This pass simplifies jumps to the following instruction, jumps across jumps, and jumps to jumps.
- It converts some code originally written with jumps into sequences of instructions that directly set values from the results of comparisons, if the machine has such instructions.
- It deletes unreferenced labels and unreachable code (with some restrictions).
- This pass is performed two or three times. The first time is immediately following RTL generation; the second is after common subexpression elimination (CSE), but only if CSE requires jump optimization to be repeated.
- The option '-dj' causes a debugging dump of the RTL code after this pass is run for the first time. The dump file's name is made by appending '.jump' to the input file name.

Register Scan pass-
- This pass finds the first and the last use of each register, as a guide for CSE.

Taken together, the passes are broadly divided into two parts:
1. Passes on GIMPLE:- Approximately everything passes through here at least once. These passes also check whether an expression is a language-specific construct or not, etc.
2. Passes on RTL:- Optimizations are done here, along with generation of exception landing pads, etc.

Passes on GIMPLE:
Pass Group                               Example                                                             Number of Passes
Lowering                                 GIMPLE IR, CFG Construction                                         12
Interprocedural Optimization             Conditional Constant Propagation, Inlining, SSA construction, LTO   49
Intraprocedural Optimization             Constant Propagation, Dead Code Elimination, PRE                    42
Loop Optimizations                       Vectorization, Parallelization                                      27
Remaining Intraprocedural Optimizations  Value Range Propagation, Rename SSA                                 23
Generating RTL                           -                                                                   01
Total number of passes on GIMPLE                                                                             154

Passes on RTL:
Pass Group                         Example                                                               Number of Passes
Intraprocedural Optimizations      CSE, Jump Optimization                                                21
Loop Optimizations                 Loop Invariant Movement, Peeling, Unswitching                         7
Machine Dependent Optimizations    Register Allocation, Instruction Scheduling, Peephole optimizations   54
Assembly Emission and Finishing    -                                                                     03
Total number of passes on RTL                                                                            85

Total number of dumps at different optimization levels:
Optimization Level   Number of Dumps   Goals
Default              47                Fast compilation
O1                   134
O2                   156
O3                   165
Os                   154               Optimize for space

Command-line options for listing optimizations and dumping passes:
- List the optimizations with a brief description:
  $ gcc -c --help=optimizers
- List the optimizations enabled at level 2 (the other levels are 0, 1, 3 and s):
  $ gcc -c -O2 --help=optimizers -Q
- The dump option format is -fdump-<ir>-<passname>, where <ir> can be tree, rtl, or ipa (interprocedural passes on GIMPLE).
- To see all dumps:
  $ gcc -fdump-tree-all -fdump-rtl-all test.c

Diagrammatic representation of passes for first-level gray-box probing of GCC:
C Source Code -> Parser -> AST -> Gimplifier -> GIMPLE -> CFG Generator -> CFG -> RTL Generator -> RTL (expand) -> Reg Allocator -> IRA -> pro_epilogue generation -> Prologue-epilogue -> Pattern Matcher -> ASM Program

==>> Inline assembly with GCC:
If your assembler instruction can alter the condition code register, add 'cc' to the list of clobbered registers; 'cc' serves to name this register. The input operands are guaranteed not to use any of the clobbered registers, and neither will the output operands' addresses, so you can read and write the clobbered registers as many times as you like.

Conventions:
- Register naming: register names are prefixed with %, like %eax.
- Source/destination ordering: the source is always on the left, and the destination is always on the right. For example, to load ebx with the value in eax:
  movl %eax, %ebx
- Constant/immediate value format: constants must be prefixed with $. For example, to load eax with 0xd00d:
  movl $0xd00d, %eax
- Operator size specification: suffix the instruction with one of b, w, or l to specify the width of the destination register as a byte, word or longword. If you omit this, GAS (the GNU assembler) will attempt to guess.
  movw %ax, %bx
- Referencing memory: 386 protected mode is assumed. The canonical format for 32-bit addressing is:
  immed32(basepointer, indexpointer, indexscale)
  Addressing a variable offset by a value in a register:
  _variable(%eax)
  where the underscore (_) is how you get at a static (global) C variable from assembler.
  Addressing a value in an array of integers (scaling up by 4):
  _array(,%eax,4)

Basic inline assembly:- It's very simple, like
asm ("statement");
You can even push your registers onto the stack, use them, and put them back, like
asm ("pushl %eax\n\t"
     "movl $0, %eax\n\t"
     "popl %eax");

Extended inline assembly:- The basic format is:
asm ("statement" : output_registers : input_registers : clobbered_registers);

==> Load Effective Address, abbreviated lea, does an address calculation without affecting any flags.

Types of GAS (GNU Assembler) instructions:
- opcode (e.g. pushal)
- opcode operand (e.g. pushl %edx)
- opcode source, dest (e.g. movl %edx, %eax or addl %edx, %eax)

Important processor register set:
- EAX, EBX, ECX, EDX: general purpose, more or less interchangeable
- EBP: used to access data on the stack
- ESI, EDI: index registers
- SS (stack segment), DS (data segment), CS (code segment), ES, FS, GS: segmentation registers
- EIP: program counter (instruction pointer)
- ESP: stack pointer
- EFLAGS: condition codes

Example:-
CODE
void function() {
    int A = 10;
    A += 66;
}

ASSEMBLY
pushl %ebp            # push ebp
movl  %esp, %ebp      # copy stack pointer to ebp
subl  $4, %esp        # make space on stack for local data
movl  $10, -4(%ebp)   # put value 10 in A
leal  -4(%ebp), %eax  # load address of A into EAX
addl  $66, (%eax)     # add 66 to A

References:-
1. http://www.ibm.com/developerworks/linux/library/l-ia/index.html
2. http://stackoverflow.com/questions/4003894/leal-assembler-instruction
3. http://www.hep.wisc.edu/~pinghc/x86AssmTutorial.htm

GCC Configuration and Building the GCC Native Compiler:
Today I went through topics related to GCC configuration and building. The first half of my session went on reading the workshop slides and searching the internet; after that I was able to build the needed libraries and configure GCC. Installing GCC is a tedious task that tests our patience: it takes approximately 40 minutes. First of all, we need three other libraries for a successful build of GCC: MPC-0.9, MPFR-3.1.0 and GMP-5.0.3. Download the latest version of each of them from the links in the references below, then download the latest version of GCC, which is GCC-4.6.2. Now follow the steps below to build the three libraries and GCC, one by one:

Build GMP-5.0.3:
$ mkdir ~/gmp-5.0.3 ~/build
$ cd ~/build
$ tar xjf /home/rfs54/Downloads/gmp-5.0.3.tar.bz2
$ cd gmp-5.0.3
$ ./configure --prefix=/home/rfs54/gmp-5.0.3 --enable-cxx
$ nice -n 19 time make -j8
$ make install

Build MPFR-3.1.0:
$ mkdir ~/mpfr-3.1.0
$ cd ~/build
$ tar xjf /home/rfs54/Downloads/mpfr-3.1.0.tar.bz2
$ cd mpfr-3.1.0
$ ./configure --prefix=/home/rfs54/mpfr-3.1.0 --with-gmp=/home/rfs54/gmp-5.0.3
$ nice -n 19 time make -j8
$ make install

Build MPC-0.9:
$ mkdir ~/mpc-0.9
$ cd ~/build
$ tar xf /home/rfs54/Downloads/mpc-0.9.tar.bz2
$ cd mpc-0.9
$ LD_LIBRARY_PATH=/home/rfs54/gmp-5.0.3/lib:/home/rfs54/mpfr-3.1.0/lib \
    ./configure --prefix=/home/rfs54/mpc-0.9 --with-gmp=/home/rfs54/gmp-5.0.3 --with-mpfr=/home/rfs54/mpfr-3.1.0
$ LD_LIBRARY_PATH=/home/rfs54/gmp-5.0.3/lib:/home/rfs54/mpfr-3.1.0/lib \
    nice -n 19 time make -j8
$ make install

Build GCC-4.6.2:
$ mkdir ~/gcc-4.6.2
$ cd ~/build
$ tar xjf /home/rfs54/Downloads/gcc-4.6.2.tar.bz2
$ cd gcc-4.6.2
$ LD_LIBRARY_PATH=/home/rfs54/gmp-5.0.3/lib:/home/rfs54/mpfr-3.1.0/lib:/home/rfs54/mpc-0.9/lib \
    ./configure --prefix=/home/rfs54/gcc-4.6.2 --with-gmp=/home/rfs54/gmp-5.0.3 --with-mpfr=/home/rfs54/mpfr-3.1.0 --with-mpc=/home/rfs54/mpc-0.9 --disable-multilib
$ LD_LIBRARY_PATH=/home/rfs54/gmp-5.0.3/lib:/home/rfs54/mpfr-3.1.0/lib:/home/rfs54/mpc-0.9/lib \
    nice -n 19 time make -j8
$ make install

Conclusion:- All the other libraries were built, but while running the make command for GCC it showed one error, listed below. I will get back to it tomorrow and try to resolve the problem.

Error:-
1. error while loading shared libraries: libgmp.so.10: cannot open shared object file: No such file or directory

References:-
1. http://studystuff.in/content/steps-install-and-configure-gcc-462
2. http://gcc.gnu.org/install/build.html
3. http://solarianprogrammer.com/2011/12/01/compiling-gcc-4-6-2-on-mac-osx-lion/
4. http://www.multiprecision.org/index.php?prog=mpc&page=download
5. http://www.mpfr.org/mpfr-current/#download
6. http://openwall.info/wiki/internal/gcc-local-build
