An Introduction to GCC

Chapter 1: Introduction

1 Introduction
The purpose of this book is to explain the use of the GNU C and C++ compilers, gcc and g++. After reading this book you should understand how to compile a program, and how to use basic compiler options for optimization and debugging. This book does not attempt to teach the C or C++ languages themselves, since this material can be found in many other places (see [Further reading]).

Experienced programmers who are familiar with other systems, but new to the GNU compilers, can skip the early sections of the chapters Compiling a C program, Using the preprocessor and Compiling a C++ program. The remaining sections and chapters should provide a good overview of the features of GCC for those who already know how to use other compilers.

1.1 A brief history of GCC


The original author of the GNU C Compiler (GCC) is Richard Stallman, the founder of the GNU Project.

The GNU Project was started in 1984 to create a complete Unix-like operating system as free software, in order to promote freedom and cooperation among computer users and programmers. Every Unix-like operating system needs a C compiler, and as there were no free compilers in existence at that time, the GNU Project had to develop one from scratch. The work was funded by donations from individuals and companies to the Free Software Foundation, a non-profit organization set up to support the work of the GNU Project.

The first release of GCC was made in 1987. This was a significant breakthrough, being the first portable ANSI C optimizing compiler released as free software. Since that time GCC has become one of the most important tools in the development of free software.

A major revision of the compiler came with the 2.0 series in 1992, which added the ability to compile C++. In 1997 an experimental branch of the compiler (EGCS) was created, to improve optimization and C++ support. Following this work, EGCS was adopted as the new main-line of GCC development, and these features became widely available in the 3.0 release of GCC in 2001.

Over time GCC has been extended to support many additional languages, including Fortran, Ada, Java and Objective-C. The acronym GCC is now used to refer to the GNU Compiler Collection. Its development is guided by the GCC Steering Committee, a group composed of representatives from GCC user communities in industry, research and academia.

1.2 Major features of GCC


This section describes some of the most important features of GCC.

First of all, GCC is a portable compiler: it runs on most platforms available today, and can produce output for many types of processors. In addition to the processors used in personal computers, it also supports microcontrollers, DSPs and 64-bit CPUs.

GCC is not only a native compiler; it can also cross-compile any program, producing executable files for a different system from the one used by GCC itself. This allows software to be compiled for embedded systems which are not capable of running a compiler. GCC is written in C with a strong focus on portability, and can compile itself, so it can be adapted to new systems easily.

GCC has multiple language front-ends, for parsing different languages. Programs in each language can be compiled, or cross-compiled, for any architecture. For example, an Ada program can be compiled for a microcontroller, or a C program for a supercomputer.

GCC has a modular design, allowing support for new languages and architectures to be added. Adding a new language front-end to GCC enables the use of that language on any architecture, provided that the necessary run-time facilities (such as libraries) are available. Similarly, adding support for a new architecture makes it available to all languages.

Finally, and most importantly, GCC is free software, distributed under the GNU General Public License (GNU GPL).(1) This means you have the freedom to use and to modify GCC, as with all GNU software. If you need support for a new type of CPU, a new language, or a new feature you can add it yourself, or hire someone to enhance GCC for you. You can hire someone to fix a bug if it is important for your work.

Furthermore, you have the freedom to share any enhancements you make to GCC. As a result of this freedom you can also make use of enhancements to GCC developed by others. The many features offered by GCC today show how this freedom to cooperate works to benefit you, and everyone else who uses GCC.

(1) For details see the license file COPYING distributed with GCC.


1.3 Programming in C and C++


C and C++ are languages that allow direct access to the computer's memory. Historically, they have been used for writing low-level systems software, and applications where high performance or control over resource usage is critical. However, great care is required to ensure that memory is accessed correctly, to avoid corrupting other data structures. This book describes techniques that will help in detecting potential errors during compilation, but the risk in using languages like C or C++ can never be eliminated.

In addition to C and C++ the GNU Project also provides other high-level languages, such as GNU Common Lisp (gcl), GNU Smalltalk (gst), the GNU Scheme extension language (guile) and the GNU Compiler for Java (gcj). These languages do not allow the user to access memory directly, eliminating the possibility of memory access errors. They are a safer alternative to C and C++ for many applications.

1.4 Conventions used in this manual


This manual contains many examples which can be typed at the keyboard. A command entered at the terminal is shown like this,

    $ command

followed by its output. For example:

    $ echo "hello world"
    hello world

The first character on the line is the terminal prompt, and should not be typed. The dollar sign $ is used as the standard prompt in this manual, although some systems may use a different character.

When a command in an example is too long to fit in a single line it is wrapped and then indented on subsequent lines, like this:

    $ echo "an example of a line which is too long to fit
        in this manual"

When entered at the keyboard, the entire command should be typed on a single line.

The example source files used in this manual can be downloaded from the publisher's website,(2) or entered by hand using any text editor, such as the standard GNU editor, emacs. The example compilation commands use gcc and g++ as the names of the GNU C and C++ compilers, and cc to refer to other compilers. The example programs should work with any version of GCC. Any command-line options which are only available in recent versions of GCC are noted in the text.

The examples assume the use of a GNU operating system; there may be minor differences in the output on other systems. Some non-essential and verbose system-dependent output messages (such as very long system paths) have been edited in the examples for brevity. The commands for setting environment variables use the syntax of the standard GNU shell (bash), and should work with any version of the Bourne shell.

(2) See http://www.network-theory.co.uk/gcc/intro/

Chapter 2: Compiling a C program

2 Compiling a C program
This chapter describes how to compile C programs using gcc. Programs can be compiled from a single source file or from multiple source files, and may use system libraries and header files.

Compilation refers to the process of converting a program from the textual source code, in a programming language such as C or C++, into machine code, the sequence of 1s and 0s used to control the central processing unit (CPU) of the computer. This machine code is then stored in a file known as an executable file, sometimes referred to as a binary file.

2.1 Compiling a simple C program


The classic example program for the C language is Hello World.

[...] they are enabled. Compiler warnings are an essential aid in detecting problems when programming in C and C++.

In this case, the compiler does not produce any warnings with the -Wall option, since the program is completely valid. Source code which does not produce any warnings is said to compile cleanly.

To run the program, type the path name of the executable like this:

    $ ./hello
    Hello, world!

This loads the executable file into memory and causes the CPU to begin executing the instructions contained within it. The path ./ refers to the current directory, so ./hello loads and runs the executable file hello located in the current directory.

2.2 Finding errors in a simple program


As mentioned above, compiler warnings are an essential aid when programming in C and C++. To demonstrate this, the program below contains a subtle error: it uses the function printf incorrectly, by specifying a floating-point format %f for an integer value:

    #include <stdio.h>

    int
    main (void)
    {
      printf ("Two plus two is %f\n", 4);
      return 0;
    }

This error is not obvious at first sight, but can be detected by the compiler if the warning option -Wall has been enabled.

Compiling the program above, bad.c, with the warning option -Wall produces the following message:

    $ gcc -Wall bad.c -o bad
    bad.c: In function main:
    bad.c:6: warning: double format, different type arg (arg 2)

This indicates that a format string has been used incorrectly in the file bad.c at line 6. The messages produced by GCC always have the form file:line-number:message. The compiler distinguishes between error messages, which prevent successful compilation, and warning messages, which indicate possible problems (but do not stop the program from compiling).

In this case, the correct format specifier would have been %d (the allowed format specifiers for printf can be found in any general book on C, such as the GNU C Library Reference Manual, see [Further reading]).

Without the warning option -Wall the program appears to compile cleanly, but produces incorrect results:

    $ gcc bad.c -o bad
    $ ./bad
    Two plus two is 2.585495    (incorrect output)

The incorrect format specifier causes the output to be corrupted, because the function printf is passed an integer instead of a floating-point number. Integers and floating-point numbers are stored in different formats in memory, and generally occupy different numbers of bytes, leading to a spurious result. The actual output shown above may differ, depending on the specific platform and environment.

Clearly, it is very dangerous to develop a program without checking for compiler warnings. If there are any functions which are not used correctly they can cause the program to crash, or to produce incorrect results. Turning on the compiler warning option -Wall will catch many of the most common errors which occur in C programming.
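For comparison, a corrected version of the call can be sketched as follows. This is only an illustration: the helper name format_sum and the use of snprintf to write into a caller-supplied buffer are assumptions made here so that the result can be inspected, and are not part of the original bad.c. The point of the sketch is the %d specifier, which matches the int argument.

```c
#include <stdio.h>

/* Hypothetical helper (not part of bad.c): formats the message into buf
   using %d, the specifier that matches an int argument.
   Returns the value of snprintf, i.e. the length of the full message
   excluding the terminating null character. */
int
format_sum (char *buf, size_t size)
{
  return snprintf (buf, size, "Two plus two is %d\n", 2 + 2);
}
```

Called with a 64-byte buffer, format_sum stores the text Two plus two is 4 followed by a newline. Because the specifier and the argument type agree, the format check enabled by -Wall has nothing to report for this call.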

2.3 Compiling multiple source files


A program can be split up into multiple files. This makes it easier to edit and understand, especially in the case of large programs; it also allows the individual parts to be compiled independently.

In the following example we will split up the program Hello World into three files: main.c, hello_fn.c and the header file hello.h. Here is the main program main.c:

    #include "hello.h"

    int
    main (void)
    {
      hello ("world");
      return 0;
    }

The original call to the printf system function in the previous program hello.c has been replaced by a call to a new external function hello, which we will define in a separate file hello_fn.c.

The main program also includes the header file hello.h which will contain the declaration of the function hello. The declaration is used to ensure that the types of the arguments and return value match up correctly between the function call and the function definition. We no longer need to include the system header file stdio.h in main.c to declare the function printf, since the file main.c does not call printf directly.

The declaration in hello.h is a single line specifying the prototype of the function hello:

    void hello (const char * name);

The definition of the function hello itself is contained in the file hello_fn.c:

    #include <stdio.h>
    #include "hello.h"

    void
    hello (const char * name)
    {
      printf ("Hello, %s!\n", name);
    }

This function prints the message Hello, name! using its argument as the value of name.

Incidentally, the difference between the two forms of the include statement #include "FILE.h" and #include <FILE.h> is that the former searches for FILE.h in the current directory before looking in the system header file directories. The include statement #include <FILE.h> searches the system header files, but does not look in the current directory by default.

To compile these source files with gcc, use the following command:

    $ gcc -Wall main.c hello_fn.c -o newhello

In this case, we use the -o option to specify a different output file for the executable, newhello. Note that the header file hello.h is not specified in the list of files on the command line. The directive #include "hello.h" in the source files instructs the compiler to include it automatically at the appropriate points.

To run the program, type the path name of the executable:

    $ ./newhello
    Hello, world!

All the parts of the program have been combined into a single executable file, which produces the same result as the executable created from the single source file used earlier.

2.4 Compiling files independently


If a program is stored in a single file then any change to an individual function requires the whole program to be recompiled to produce a new executable. The recompilation of large source files can be very time-consuming.

When programs are stored in independent source files, only the files which have changed need to be recompiled after the source code has been modified. In this approach, the source files are compiled separately and then linked together, a two-stage process. In the first stage, a file is compiled without creating an executable. The result is referred to as an object file, and has the extension .o when using GCC.

In the second stage, the object files are merged together by a separate program called the linker. The linker combines all the object files together to create a single executable.

An object file contains machine code where any references to the memory addresses of functions (or variables) in other files are left undefined. This allows source files to be compiled without direct reference to each other. The linker fills in these missing addresses when it produces the executable.

2.4.1 Creating object files from source files


The command-line option -c is used to compile a source file to an object file. For example, the following command will compile the source file main.c to an object file:

    $ gcc -Wall -c main.c

This produces an object file main.o containing the machine code for the main function. It contains a reference to the external function hello, but the corresponding memory address is left undefined in the object file at this stage (it will be filled in later by linking).

The corresponding command for compiling the hello function in the source file hello_fn.c is:

    $ gcc -Wall -c hello_fn.c

This produces the object file hello_fn.o.

Note that there is no need to use the option -o to specify the name of the output file in this case. When compiling with -c the compiler automatically creates an object file whose name is the same as the source file, with .o instead of the original extension.

There is no need to put the header file hello.h on the command line, since it is automatically included by the #include statements in main.c and hello_fn.c.

2.4.2 Creating executables from object files


The final step in creating an executable file is to use gcc to link the object files together and fill in the missing addresses of external functions. To link object files together, they are simply listed on the command line:

    $ gcc main.o hello_fn.o -o hello

This is one of the few occasions where there is no need to use the -Wall warning option, since the individual source files have already been successfully compiled to object code. Once the source files have been compiled, linking is an unambiguous process which either succeeds or fails (it fails only if there are references which cannot be resolved).

To perform the linking step gcc uses the linker ld, which is a separate program. On GNU systems the GNU linker, GNU ld, is used. Other systems may use the GNU linker with GCC, or may have their own linkers. The linker itself will be discussed later (see Chapter 11 [How the compiler works]). By running the linker, gcc creates an executable file from the object files.

The resulting executable file can now be run:

    $ ./hello
    Hello, world!

It produces the same output as the version of the program using a single source file in the previous section.

2.4.3 Link order of object files


On Unix-like systems, the traditional behavior of compilers and linkers is to search for external functions from left to right in the object files specified on the command line. This means that the object file which contains the definition of a function should appear after any files which call that function.

In this case, the file hello_fn.o containing the function hello should be specified after main.o itself, since main calls hello:

    $ gcc main.o hello_fn.o -o hello    (correct order)

With some compilers or linkers the opposite ordering would result in an error,

    $ cc hello_fn.o main.o -o hello    (incorrect order)
    main.o: In function main:
    main.o(.text+0xf): undefined reference to hello

because there is no object file containing hello after main.o.

Most current compilers and linkers will search all object files, regardless of order, but since not all compilers do this it is best to follow the convention of ordering object files from left to right.

This is worth keeping in mind if you ever encounter unexpected problems with undefined references, and all the necessary object files appear to be present on the command line.

2.5 Recompiling and relinking


To show how source files can be compiled independently we will edit the main program main.c and modify it to print a greeting to everyone instead of world:

    #include "hello.h"

    int
    main (void)
    {
      hello ("everyone");  /* changed from "world" */
      return 0;
    }

The updated file main.c can now be recompiled with the following command:

    $ gcc -Wall -c main.c

This produces a new object file main.o. There is no need to create a new object file for hello_fn.c, since that file and the related files that it depends on, such as header files, have not changed.

The new object file can be relinked with the hello function to create a new executable file:

    $ gcc main.o hello_fn.o -o hello

The resulting executable hello now uses the new main function to produce the following output:

    $ ./hello
    Hello, everyone!

Note that only the file main.c has been recompiled, and then relinked with the existing object file for the hello function. If the file hello_fn.c had been modified instead, we could have recompiled hello_fn.c to create a new object file hello_fn.o and relinked this with the existing file main.o.(1)

In general, linking is faster than compilation. In a large project with many source files, recompiling only those that have been modified can make a significant saving. The process of recompiling only the modified files in a project can be automated using GNU Make (see [Further reading]).

(1) If the prototype of a function has changed, it is necessary to modify and recompile all of the other source files which use it.

2.6 Linking with external libraries


A library is a collection of precompiled object files which can be linked into programs. The most common use of libraries is to provide system functions, such as the square root function sqrt found in the C math library.

Libraries are typically stored in special archive files with the extension .a, referred to as static libraries. They are created from object files with a separate tool, the GNU archiver ar, and used by the linker to resolve references to functions at compile-time. We will see later how to create libraries using the ar command (see Chapter 10 [Compiler-related tools]). For simplicity, only static libraries are covered in this section; dynamic linking at runtime using shared libraries will be described in the next chapter.

The standard system libraries are usually found in the directories /usr/lib and /lib.(2) For example, the C math library is typically stored in the file /usr/lib/libm.a on Unix-like systems. The corresponding prototype declarations for the functions in this library are given in the header file /usr/include/math.h. The C standard library itself is stored in /usr/lib/libc.a and contains functions specified in the ANSI/ISO C standard, such as printf; this library is linked by default for every C program.

Here is an example program which makes a call to the external function sqrt in the math library libm.a:

    #include <math.h>
    #include <stdio.h>

    int
    main (void)
    {
      double x = sqrt (2.0);
      printf ("The square root of 2.0 is %f\n", x);
      return 0;
    }

Trying to create an executable from this source file alone causes the compiler to give an error at the link stage:

    $ gcc -Wall calc.c -o calc
    /tmp/ccbR6Ojm.o: In function main:
    /tmp/ccbR6Ojm.o(.text+0x19): undefined reference to sqrt

The problem is that the reference to the sqrt function cannot be resolved without the external math library libm.a. The function sqrt is not defined in the program or the default library libc.a, and the compiler does not link to the file libm.a unless it is explicitly selected. Incidentally, the file mentioned in the error message, /tmp/ccbR6Ojm.o, is a temporary object file created by the compiler from calc.c, in order to carry out the linking process.

To enable the compiler to link the sqrt function to the main program calc.c we need to supply the library libm.a. One obvious but cumbersome way to do this is to specify it explicitly on the command line:

    $ gcc -Wall calc.c /usr/lib/libm.a -o calc

The library libm.a contains object files for all the mathematical functions, such as sin, cos, exp, log and sqrt. The linker searches through these to find the object file containing the sqrt function.

Once the object file for the sqrt function has been found, the main program can be linked and a complete executable produced:

    $ ./calc
    The square root of 2.0 is 1.414214

The executable file includes the machine code for the main function and the machine code for the sqrt function, copied from the corresponding object file in the library libm.a.

To avoid the need to specify long paths on the command line, the compiler provides a short-cut option -l for linking against libraries. For example, the following command,

    $ gcc -Wall calc.c -lm -o calc

is equivalent to the original command above using the full library name /usr/lib/libm.a.

In general, the compiler option -lNAME will attempt to link object files with a library file libNAME.a in the standard library directories. Additional directories can be specified with command-line options and environment variables, to be discussed shortly. A large program will typically use many -l options to link libraries such as the math library, graphics libraries and networking libraries.

(2) On systems supporting both 64-bit and 32-bit executables the 64-bit versions of the libraries will often be stored in /usr/lib64 and /lib64, with the 32-bit versions in /usr/lib and /lib.

2.6.1 Link order of libraries


The ordering of libraries on the command line follows the same convention as for object files: they are searched from left to right, so a library containing the definition of a function should appear after any source files or object files which use it. This includes libraries specified with the short-cut -l option, as shown in the following command:

    $ gcc -Wall calc.c -lm -o calc    (correct order)

With some compilers the opposite ordering (placing the -lm option before the file which uses it) would result in an error,

    $ cc -Wall -lm calc.c -o calc    (incorrect order)
    main.o: In function main:
    main.o(.text+0xf): undefined reference to sqrt

because there is no library or object file containing sqrt after calc.c. The option -lm should appear after the file calc.c.

When several libraries are being used, the same convention should be followed for the libraries themselves. A library which calls an external function defined in another library should appear before the library containing the function.

For example, a program data.c using the GNU Linear Programming library libglpk.a, which in turn uses the math library libm.a, should be compiled as,

    $ gcc -Wall data.c -lglpk -lm

since the object files in libglpk.a use functions defined in libm.a.

As for object files, most current compilers will search all libraries, regardless of order. However, since not all compilers do this it is best to follow the convention of ordering libraries from left to right.

2.7 Using library header files


When using a library it is essential to include the appropriate header files, in order to declare the function arguments and return values with the correct types. Without declarations, the arguments of a function can be passed with the wrong type, causing corrupted results.

The following example shows another program which makes a function call to the C math library. In this case, the function pow is used to compute the cube of two (2 raised to the power of 3):

    #include <stdio.h>

    int
    main (void)
    {
      double x = pow (2.0, 3.0);
      printf ("Two cubed is %f\n", x);
      return 0;
    }

However, the program contains an error: the #include statement for math.h is missing, so the prototype double pow (double x, double y) given there will not be seen by the compiler.

Compiling the program without any warning options will produce an executable file which gives incorrect results:

    $ gcc badpow.c -lm
    $ ./a.out
    Two cubed is 2.851120    (incorrect result, should be 8)

The results are corrupted because the arguments and return value of the call to pow are passed with incorrect types.(3) This can be detected by turning on the warning option -Wall:

    $ gcc -Wall badpow.c -lm
    badpow.c: In function main:
    badpow.c:6: warning: implicit declaration of function pow

This example shows again the importance of using the warning option -Wall to detect serious problems that could otherwise easily be overlooked.

(3) The actual output shown above may differ, depending on the specific platform and environment.
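The fix is to include the missing header. The sketch below is an illustration rather than a listing from the example files: the function name cube_of_two is hypothetical, introduced here so that the computation can be called and checked separately from main.

```c
#include <math.h>   /* declares double pow (double x, double y) */

/* Hypothetical helper: computes 2 raised to the power of 3.
   With math.h included the compiler sees the prototype of pow, so the
   arguments are passed as doubles and the result is read as a double. */
double
cube_of_two (void)
{
  return pow (2.0, 3.0);
}
```

With the prototype visible the function returns 8. When code like this is linked into a program, the math library is selected with -lm as before, placed after the files which use it.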


Chapter 3: Compilation options


3 Compilation options
This chapter describes other commonly-used compiler options available in GCC. These options control features such as the search paths used for locating libraries and include files, the use of additional warnings and diagnostics, preprocessor macros and C language dialects.

3.1 Setting search paths


In the last chapter, we saw how to link a program with functions in the C math library libm.a, using the short-cut option -lm and the header file math.h.

A common problem when compiling a program using library header files is the error:

    FILE.h: No such file or directory

This occurs if a header file is not present in the standard include file directories used by gcc. A similar problem can occur for libraries:

    /usr/bin/ld: cannot find library

This happens if a library used for linking is not present in the standard library directories used by gcc.

By default, gcc searches the following directories for header files:

    /usr/local/include/
    /usr/include/

and the following directories for libraries:

    /usr/local/lib/
    /usr/lib/

The list of directories for header files is often referred to as the include path, and the list of directories for libraries as the library search path or link path.

The directories on these paths are searched in order, from first to last in the two lists above.(1) For example, a header file found in /usr/local/include takes precedence over a file with the same name in /usr/include. Similarly, a library found in /usr/local/lib takes precedence over a library with the same name in /usr/lib.

When additional libraries are installed in other directories it is necessary to extend the search paths, in order for the libraries to be found. The compiler options -I and -L add new directories to the beginning of the include path and library search path respectively.

(1) The default search paths may also include additional system-dependent or site-specific directories, and directories in the GCC installation itself. For example, on 64-bit platforms additional lib64 directories may also be searched by default.

3.1.1 Search path example


The following example program uses a library that might be installed as an additional package on a system: the GNU Database Management Library (GDBM). The GDBM Library stores key-value pairs in a DBM file, a type of data file which allows values to be stored and indexed by a key (an arbitrary sequence of characters). Here is the example program dbmain.c, which creates a DBM file containing a key testkey with the value testvalue:

    #include <stdio.h>
    #include <gdbm.h>

    int
    main (void)
    {
      GDBM_FILE dbf;
      datum key = { "testkey", 7 };     /* key, length */
      datum value = { "testvalue", 9 }; /* value, length */

      printf ("Storing key-value pair... ");
      dbf = gdbm_open ("test", 0, GDBM_NEWDB, 0644, 0);
      gdbm_store (dbf, key, value, GDBM_INSERT);
      gdbm_close (dbf);
      printf ("done.\n");
      return 0;
    }

The program uses the header file gdbm.h and the library libgdbm.a. If the library has been installed in the default location of /usr/local/lib, with the header file in /usr/local/include, then the program can be compiled with the following simple command:

    $ gcc -Wall dbmain.c -lgdbm

Both these directories are part of the default gcc include and link paths.

However, if GDBM has been installed in a different location, trying to compile the program will give the following error:

    $ gcc -Wall dbmain.c -lgdbm
    dbmain.c:1: gdbm.h: No such file or directory

For example, if version 1.8.3 of the GDBM package is installed under the directory /opt/gdbm-1.8.3 the location of the header file would be,

    /opt/gdbm-1.8.3/include/gdbm.h

which is not part of the default gcc include path. Adding the appropriate directory to the include path with the command-line option -I allows the program to be compiled, but not linked:

    $ gcc -Wall -I/opt/gdbm-1.8.3/include dbmain.c -lgdbm
    /usr/bin/ld: cannot find -lgdbm
    collect2: ld returned 1 exit status

The directory containing the library is still missing from the link path. It can be added to the link path using the following option:

    -L/opt/gdbm-1.8.3/lib/

The following command line allows the program to be compiled and linked:

    $ gcc -Wall -I/opt/gdbm-1.8.3/include -L/opt/gdbm-1.8.3/lib dbmain.c -lgdbm

This produces the final executable linked to the GDBM library. Before seeing how to run this executable we will take a brief look at the environment variables that affect the -I and -L options.

Note that you should never place the absolute paths of header files in #include statements in your source code, as this will prevent the program from compiling on other systems. The -I option or the C_INCLUDE_PATH variable described below should always be used to set the include path for header files.

3.1.2 Environment variables


The search paths for header files and libraries can also be controlled through environment variables in the shell. These may be set automatically for each session using the appropriate login file, such as .bash_profile.

Additional directories can be added to the include path using the environment variable C_INCLUDE_PATH (for C header files) or CPLUS_INCLUDE_PATH (for C++ header files). For example, the following commands will add /opt/gdbm-1.8.3/include to the include path when compiling C programs:

    $ C_INCLUDE_PATH=/opt/gdbm-1.8.3/include
    $ export C_INCLUDE_PATH

This directory will be searched after any directories specified on the command line with the option -I, and before the standard default directories /usr/local/include and /usr/include. The shell command export is needed to make the environment variable available to programs outside the shell itself, such as the compiler. It is only needed once for each variable in each shell session, and can also be set in the appropriate login file.

Similarly, additional directories can be added to the link path using the environment variable LIBRARY_PATH. For example, the following commands will add /opt/gdbm-1.8.3/lib to the link path:

    $ LIBRARY_PATH=/opt/gdbm-1.8.3/lib
    $ export LIBRARY_PATH

This directory will be searched after any directories specified on the command line with the option -L, and before the standard default directories /usr/local/lib and /usr/lib.

With the environment variable settings given above the program dbmain.c can be compiled without the -I and -L options,

    $ gcc -Wall dbmain.c -lgdbm

because the default paths now use the directories specified in the environment variables C_INCLUDE_PATH and LIBRARY_PATH.

3.1.3 Extended search paths


Following the standard Unix convention for search paths, several directories can be specified together in an environment variable as a colon-separated list:

  DIR1:DIR2:DIR3:...

The directories are then searched in order from left to right. A single dot '.' can be used to specify the current directory.(2)

For example, the following settings create default include and link paths that cover the current directory '.' together with the include and lib directories, respectively, under /opt/gdbm-1.8.3 and /net:

  $ C_INCLUDE_PATH=.:/opt/gdbm-1.8.3/include:/net/include
  $ LIBRARY_PATH=.:/opt/gdbm-1.8.3/lib:/net/lib

To specify multiple search path directories on the command line, the options -I and -L can be repeated. For example, the following command,

  $ gcc -I. -I/opt/gdbm-1.8.3/include -I/net/include -L. -L/opt/gdbm-1.8.3/lib -L/net/lib .....

is equivalent to the environment variable settings given above.

When environment variables and command-line options are used together the compiler searches the directories in the following order:

1. command-line options -I and -L, from left to right
2. directories specified by environment variables, such as C_INCLUDE_PATH and LIBRARY_PATH
3. default system directories

In day-to-day usage, directories are usually added to the search paths with the options -I and -L.

(2) The current directory can also be specified using an empty path element. For example, :DIR1:DIR2 is equivalent to .:DIR1:DIR2.
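The colon-separated convention, including the empty-element rule from footnote (2), can be sketched in a few lines of C. This is only an illustration of the list format, not how gcc itself parses these variables; the function name search_path_element and the limit MAX_DIR_LEN are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DIR_LEN 256

/* Copy element number `index` (counting from 0, left to right) of a
   colon-separated search path into out, a buffer of MAX_DIR_LEN
   characters.  An empty element is reported as ".", the current
   directory.  Returns 1 on success, or 0 if the list has no such
   element or the element does not fit in the buffer. */
int
search_path_element (const char *list, int index, char *out)
{
  const char *start = list;
  const char *end;
  size_t len;

  assert (list != NULL && out != NULL && index >= 0);

  /* Skip over the elements that come before the requested one. */
  while (index > 0)
    {
      start = strchr (start, ':');
      if (start == NULL)
        return 0;
      start++;                  /* step past the colon */
      index--;
    }

  end = strchr (start, ':');
  len = (end != NULL) ? (size_t) (end - start) : strlen (start);

  if (len == 0)
    {
      strcpy (out, ".");        /* empty element: current directory */
      return 1;
    }
  if (len >= MAX_DIR_LEN)
    return 0;

  memcpy (out, start, len);
  out[len] = '\0';
  return 1;
}
```

Calling the function with increasing values of index until it returns 0 visits the directories in the same left-to-right order in which they would be searched.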

3.2 Shared libraries and static libraries


Although the example program above has been successfully compiled and linked, a final step is needed before being able to load and run the executable file. If an attempt is made to start the executable directly, the following error will occur on most systems:

  $ ./a.out
  ./a.out: error while loading shared libraries: libgdbm.so.3: cannot open shared object file: No such file or directory

This is because the GDBM package provides a shared library. This type of library requires special treatment: it must be loaded from disk before the executable will run.

External libraries are usually provided in two forms: static libraries and shared libraries. Static libraries are the .a files seen earlier. When a program is linked against a static library, the machine code from the object files for any external functions used by the program is copied from the library into the final executable.

Shared libraries are handled with a more advanced form of linking, which makes the executable file smaller. They use the extension .so, which stands for shared object.

An executable file linked against a shared library contains only a small table of the functions it requires, instead of the complete machine code from the object files for the external functions. Before the executable file starts running, the machine code for the external functions is copied into memory from the shared library file on disk by the operating system, a process referred to as dynamic linking.

Dynamic linking makes executable files smaller and saves disk space, because one copy of a library can be shared between multiple programs. Most operating systems also provide a virtual memory mechanism which allows one copy of a shared library in physical memory to be used by all running programs, saving memory as well as disk space.


Furthermore, shared libraries make it possible to update a library without recompiling the programs which use it (provided the interface to the library does not change).

Because of these advantages gcc compiles programs to use shared libraries by default on most systems, if they are available. Whenever a static library libNAME.a would be used for linking with the option -lNAME the compiler first checks for an alternative shared library with the same name and a .so extension.

In this case, when the compiler searches for the libgdbm library in the link path, it finds the following two files in the directory /opt/gdbm-1.8.3/lib:

  $ cd /opt/gdbm-1.8.3/lib
  $ ls libgdbm.*
  libgdbm.a libgdbm.so

Consequently, the libgdbm.so shared object file is used in preference to the libgdbm.a static library.

However, when the executable file is started its loader function must find the shared library in order to load it into memory. By default the loader searches for shared libraries only in a predefined set of system directories, such as /usr/local/lib and /usr/lib. If the library is not located in one of these directories it must be added to the load path.(3)

The simplest way to set the load path is through the environment variable LD_LIBRARY_PATH. For example, the following commands set the load path to /opt/gdbm-1.8.3/lib so that libgdbm.so can be found:

  $ LD_LIBRARY_PATH=/opt/gdbm-1.8.3/lib
  $ export LD_LIBRARY_PATH
  $ ./a.out
  Storing key-value pair... done.

The executable now runs successfully, prints its message and creates a DBM file called test containing the key-value pair testkey and testvalue.

To save typing, the LD_LIBRARY_PATH environment variable can be set once for each session in the appropriate login file, such as .bash_profile for the GNU Bash shell.

Several shared library directories can be placed in the load path, as a colon-separated list DIR1:DIR2:DIR3:...:DIRN. For example, the following command sets the load path to use the lib directories under /opt/gdbm-1.8.3 and /opt/gtk-1.4:

  $ LD_LIBRARY_PATH=/opt/gdbm-1.8.3/lib:/opt/gtk-1.4/lib
  $ export LD_LIBRARY_PATH

If the load path contains existing entries, it can be extended using the syntax LD_LIBRARY_PATH=NEWDIRS:$LD_LIBRARY_PATH. For example, the following command adds the directory /opt/gsl-1.5/lib to the load path shown above:

  $ LD_LIBRARY_PATH=/opt/gsl-1.5/lib:$LD_LIBRARY_PATH
  $ echo $LD_LIBRARY_PATH
  /opt/gsl-1.5/lib:/opt/gdbm-1.8.3/lib:/opt/gtk-1.4/lib

It is possible for the system administrator to set the LD_LIBRARY_PATH variable for all users, by adding it to a default login script, such as /etc/profile. On GNU systems, a system-wide path can also be defined in the loader configuration file /etc/ld.so.conf.

Alternatively, static linking can be forced with the -static option to gcc to avoid the use of shared libraries:

  $ gcc -Wall -static -I/opt/gdbm-1.8.3/include/ -L/opt/gdbm-1.8.3/lib/ dbmain.c -lgdbm

This creates an executable linked with the static library libgdbm.a which can be run without setting the environment variable LD_LIBRARY_PATH or putting shared libraries in the default directories:

  $ ./a.out
  Storing key-value pair... done.

As noted earlier, it is also possible to link directly with individual library files by specifying the full path to the library on the command line. For example, the following command will link directly with the static library libgdbm.a,

  $ gcc -Wall -I/opt/gdbm-1.8.3/include dbmain.c /opt/gdbm-1.8.3/lib/libgdbm.a

and the command below will link with the shared library file libgdbm.so:

  $ gcc -Wall -I/opt/gdbm-1.8.3/include dbmain.c /opt/gdbm-1.8.3/lib/libgdbm.so

In the latter case it is still necessary to set the library load path when running the executable.

(3) Note that the directory containing the shared library can, in principle, be stored (hard-coded) in the executable itself using the linker option -rpath, but this is not usually done since it creates problems if the library is moved or the executable is copied to another system.

3.3 C language standards


By default, gcc compiles programs using the GNU dialect of the C language, referred to as GNU C. This dialect incorporates the official ANSI/ISO standard for the C language with several useful GNU extensions, such as nested functions and variable-size arrays. Most ANSI/ISO programs will compile under GNU C without changes.

There are several options which control the dialect of C used by gcc. The most commonly used options are -ansi and -pedantic. The specific dialects of the C language for each standard can also be selected with the -std option.

3.3.1 ANSI/ISO
Occasionally a valid ANSI/ISO program may be incompatible with the extensions in GNU C. To deal with this situation, the compiler option -ansi disables those GNU extensions which conflict with the ANSI/ISO standard. On systems using the GNU C Library (glibc) it also disables extensions to the C standard library. This allows programs written for ANSI/ISO C to be compiled without any unwanted effects from GNU extensions.

For example, here is a valid ANSI/ISO C program which uses a variable called asm:

  #include <stdio.h>

  int
  main (void)
  {
    const char asm[] = "6502";
    printf ("the string asm is '%s'\n", asm);
    return 0;
  }

The variable name asm is valid under the ANSI/ISO standard, but this program will not compile in GNU C because asm is a GNU C keyword extension (it allows native assembly instructions to be used in C functions). Consequently, it cannot be used as a variable name without giving a compilation error:

  $ gcc -Wall ansi.c
  ansi.c: In function `main':
  ansi.c:6: parse error before `asm'
  ansi.c:7: parse error before `asm'

In contrast, using the -ansi option disables the asm keyword extension, and allows the program above to be compiled correctly:

  $ gcc -Wall -ansi ansi.c
  $ ./a.out
  the string asm is '6502'
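Source code can also find out whether strict mode is in force. The following is a minimal sketch, assuming a GCC-compatible compiler: GCC predefines the macro __STRICT_ANSI__ when -ansi (or a strict -std= option) is given, and leaves it undefined in the default GNU C dialect. The function name strict_ansi_enabled is hypothetical.

```c
/* Return 1 if the file was compiled in strict ANSI/ISO mode
   (for example with -ansi), and 0 if the GNU dialect is in use.
   The test relies on the GCC predefined macro __STRICT_ANSI__. */
int
strict_ansi_enabled (void)
{
#ifdef __STRICT_ANSI__
  return 1;
#else
  return 0;
#endif
}
```

Compiling the same file with and without -ansi changes the value returned, which makes the effect of the option visible from inside a program.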


For reference, the non-standard keywords and macros defined by the GNU C extensions are asm, inline, typeof, unix and vax. More details can be found in the GCC Reference Manual "Using GCC" (see [Further reading], page 91).

The next example shows the effect of the -ansi option on systems using the GNU C Library, such as GNU/Linux systems. The program below prints the value of pi, π = 3.14159..., from the preprocessor definition M_PI in the header file math.h:

  #include <math.h>
  #include <stdio.h>

  int
  main (void)
  {
    printf("the value of pi is %f\n", M_PI);
    return 0;
  }

The constant M_PI is not part of the ANSI/ISO C standard library (it comes from the BSD version of Unix). In this case, the program will not compile with the -ansi option:

  $ gcc -Wall -ansi pi.c
  pi.c: In function `main':
  pi.c:7: `M_PI' undeclared (first use in this function)
  pi.c:7: (Each undeclared identifier is reported only once
  pi.c:7: for each function it appears in.)

The program can be compiled without the -ansi option. In this case both the language and library extensions are enabled by default:

  $ gcc -Wall pi.c
  $ ./a.out
  the value of pi is 3.141593

It is also possible to compile the program using ANSI/ISO C, by enabling only the extensions in the GNU C Library itself. This can be achieved by defining special macros, such as _GNU_SOURCE, which enable extensions in the GNU C Library:(4)

  $ gcc -Wall -ansi -D_GNU_SOURCE pi.c
  $ ./a.out
  the value of pi is 3.141593

The GNU C Library provides a number of these macros (referred to as feature test macros) which allow control over the support for POSIX extensions (_POSIX_C_SOURCE), BSD extensions (_BSD_SOURCE), SVID extensions (_SVID_SOURCE), XOPEN extensions (_XOPEN_SOURCE) and GNU extensions (_GNU_SOURCE).

The _GNU_SOURCE macro enables all the extensions together, with the POSIX extensions taking precedence over the others in cases where they conflict. Further information about feature test macros can be found in the GNU C Library Reference Manual, see [Further reading], page 91.

(4) The -D option for defining macros will be explained in detail in the next chapter.
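Where a program needs only one constant such as M_PI, a common alternative to feature test macros is to supply a fallback definition in the source itself. The sketch below assumes nothing beyond the standard header math.h; the function circle_area is a hypothetical example, and the fallback value is simply pi written out to double precision.

```c
#include <math.h>

/* M_PI is an extension, so <math.h> may not define it when strict
   ANSI/ISO mode is selected.  A fallback definition keeps the code
   independent of the -ansi option and of feature test macros. */
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Area of a circle of radius r. */
double
circle_area (double r)
{
  return M_PI * r * r;
}
```

Because the definition is guarded by #ifndef, the library's own value is used whenever it is available, and the fallback only takes effect when it is not.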

3.3.2 Strict ANSI/ISO


The command-line option -pedantic in combination with -ansi will cause gcc to reject all GNU C extensions, not just those that are incompatible with the ANSI/ISO standard. This helps you to write portable programs which follow the ANSI/ISO standard.

Here is a program which uses variable-size arrays, a GNU C extension. The array x[n] is declared with a length specified by the integer variable n.

  int
  main (int argc, char *argv[])
  {
    int i, n = argc;
    double x[n];

    for (i = 0; i < n; i++)
      x[i] = i;

    return 0;
  }

This program will compile with -ansi, because support for variable-length arrays does not interfere with the compilation of valid ANSI/ISO programs; it is a backwards-compatible extension:

  $ gcc -Wall -ansi gnuarray.c

However, compiling with -ansi -pedantic reports warnings about violations of the ANSI/ISO standard:

  $ gcc -Wall -ansi -pedantic gnuarray.c
  gnuarray.c: In function `main':
  gnuarray.c:5: warning: ISO C90 forbids variable-size array `x'

Note that an absence of warnings from -ansi -pedantic does not guarantee that a program strictly conforms to the ANSI/ISO standard. The standard itself specifies only a limited set of circumstances that should generate diagnostics, and these are what -ansi -pedantic reports.
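For comparison, the array in the example above can be given a run-time length without any extension by allocating it with malloc. The following is a sketch of that approach, not part of the original program; the function name sum_of_indices and its error convention are assumptions made for the example. Code written this way should compile quietly with -ansi -pedantic.

```c
#include <stdlib.h>

/* C90-conforming alternative to the variable-size array x[n]:
   the storage is obtained from malloc and released with free.
   Returns the sum 0 + 1 + ... + (n - 1) of the stored values, or
   -1.0 if n is not positive or the allocation fails. */
double
sum_of_indices (int n)
{
  double *x;
  double sum = 0.0;
  int i;

  if (n <= 0)
    return -1.0;

  x = malloc ((size_t) n * sizeof *x);
  if (x == NULL)
    return -1.0;

  for (i = 0; i < n; i++)
    x[i] = i;

  for (i = 0; i < n; i++)
    sum += x[i];

  free (x);
  return sum;
}
```

The cost of portability is that the program must check for allocation failure and free the memory itself, which the variable-size array handled automatically.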


3.3.3 Selecting specific standards


The specific language standard used by GCC can be controlled with the -std option. The following C language standards are supported:

-std=c89 or -std=iso9899:1990
  The original ANSI/ISO C language standard (ANSI X3.159-1989, ISO/IEC 9899:1990). GCC incorporates the corrections in the two ISO Technical Corrigenda to the original standard.

-std=iso9899:199409
  The ISO C language standard with ISO Amendment 1, published in 1994. This amendment was mainly concerned with internationalization, such as adding support for multibyte characters to the C library.

-std=c99 or -std=iso9899:1999
  The revised ISO C language standard, published in 1999 (ISO/IEC 9899:1999).

The C language standards with GNU extensions can be selected with the options -std=gnu89 and -std=gnu99.
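The standard selected with -std is visible to the program through the predefined macro __STDC_VERSION__. As a brief sketch: the macro is not defined by the original 1989/1990 standard, has the value 199409L for Amendment 1, and 199901L for the 1999 standard. The function name c_standard_version is hypothetical.

```c
/* Return the value of __STDC_VERSION__ as a long, or 0 if the
   macro is not defined, as is the case for the original C89/C90
   standard. */
long
c_standard_version (void)
{
#ifdef __STDC_VERSION__
  return __STDC_VERSION__;
#else
  return 0L;
#endif
}
```

A test of this kind is typically used in header files to choose between a newer language feature and a fallback for older compilers.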

3.4 Warning options in -Wall


As described earlier (see Section 2.1 [Compiling a simple C program], page 7), the warning option -Wall enables warnings for many common errors, and should always be used. It combines a large number of other, more specific, warning options which can also be selected individually. Here is a summary of these options:

-Wcomment (included in -Wall)
  This option warns about nested comments. Nested comments typically arise when a section of code containing comments is later commented out:

    /* commented out
    double x = 1.23 ; /* x-position */
    */

  Nested comments can be a source of confusion. The safe way to comment out a section of code containing comments is to surround it with the preprocessor directive #if 0 ... #endif:

    /* commented out */
    #if 0
    double x = 1.23 ; /* x-position */
    #endif


-Wformat (included in -Wall)
  This option warns about the incorrect use of format strings in functions such as printf and scanf, where the format specifier does not agree with the type of the corresponding function argument.

-Wunused (included in -Wall)
  This option warns about unused variables. When a variable is declared but not used this can be the result of another variable being accidentally substituted in its place. If the variable is genuinely not needed it can be removed from the source code.

-Wimplicit (included in -Wall)
  This option warns about any functions that are used without being declared. The most common reason for a function to be used without being declared is forgetting to include a header file.

-Wreturn-type (included in -Wall)
  This option warns about functions that are defined without a return type but not declared void. It also catches empty return statements in functions that are not declared void.

  For example, the following program does not use an explicit return value:

    #include <stdio.h>

    int
    main (void)
    {
      printf ("hello world\n");
      return;
    }

  The lack of a return value in the code above could be the result of an accidental omission by the programmer. The value returned by the main function is then not well defined; in practice it is often whatever the preceding call left behind, here the return value of the printf function (the number of characters printed). To avoid ambiguity, it is preferable to use an explicit value in the return statement, either as a variable or a constant, such as return 0.

The complete set of warning options included in -Wall can be found in the GCC Reference Manual "Using GCC" (see [Further reading], page 91). The options included in -Wall have the common characteristic that they report constructions which are always wrong, or can easily be rewritten in an unambiguously correct way. This is why they are so useful: any warning produced by -Wall can be taken as an indication of a potentially serious problem.
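As an illustration of the kind of mistake -Wformat reports, the sketch below shows a mismatched format string in a comment next to the corrected call. The function format_reading and its message text are hypothetical; sprintf is checked by -Wformat in the same way as printf.

```c
#include <stdio.h>

/* Format a count and a measurement into buf, which must hold at
   least 64 characters.  Each conversion matches its argument:
   %d for the int and %g for the double.  Returns the number of
   characters written, not counting the terminating null. */
int
format_reading (char *buf, int count, double value)
{
  /* The call
       sprintf (buf, "%d readings, mean %d", count, value);
     would be reported by -Wformat, because the second %d expects
     an int but value is a double. */
  return sprintf (buf, "%d readings, mean %g", count, value);
}
```

Without the warning, the mismatched version would compile and then print a meaningless number at run time, which is why this check is part of -Wall.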


3.5 Additional warning options


GCC provides many other warning options that are not included in -Wall, but are often useful. Typically these produce warnings for source code which may be technically valid but is very likely to cause problems. The criteria for these options are based on experience of common errors. They are not included in -Wall because they only indicate possibly problematic or suspicious code.

Since these warnings can be issued for valid code it is not necessary to compile with them all the time. It is more appropriate to use them periodically and review the results, checking for anything unexpected, or to enable them for some programs or files.

-W
  This is a general option similar to -Wall which warns about a selection of common programming errors, such as functions which can return without a value (also known as "falling off the end" of the function body), and comparisons between signed and unsigned values. For example, the following function tests whether an unsigned integer is negative (which is impossible, of course):

    int
    foo (unsigned int x)
    {
      if (x < 0)
        return 0; /* cannot occur */
      else
        return 1;
    }

  Compiling this function with -Wall does not produce a warning,

    $ gcc -Wall -c w.c

  but does give a warning with -W:

    $ gcc -W -c w.c
    w.c: In function `foo':
    w.c:4: warning: comparison of unsigned expression < 0 is always false

  In practice, the options -W and -Wall are normally used together.
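When a signed quantity really does have to be compared with an unsigned one, the comparison can be written so that the intent is explicit and -W (spelled -Wextra in newer GCC releases) has nothing to report. The following is one possible sketch; the function name index_in_range is hypothetical.

```c
/* Return 1 if index is a valid position in an array of `size`
   elements, and 0 otherwise.  The sign is tested while index is
   still a signed int; only a value known to be non-negative is
   converted, with an explicit cast, for the comparison with the
   unsigned size. */
int
index_in_range (int index, unsigned int size)
{
  if (index < 0)
    return 0;

  return (unsigned int) index < size;
}
```

Testing the sign first avoids the situation in the function foo above, where the test for a negative value was applied after the value had already become unsigned.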

-Wconversion
  This option warns about implicit type conversions that could cause unexpected results. For example, the assignment of a negative value to an unsigned variable, as in the following code,

    unsigned int x = -1;

  is technically allowed by the ANSI/ISO C standard (with the negative integer being converted to a positive integer, according to the machine representation) but could be a simple programming error. If you need to perform such a conversion you can use an explicit cast, such as ((unsigned int) -1), to avoid any warnings from this option. On two's-complement machines the result of the cast gives the maximum number that can be represented by an unsigned integer.
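The explicit cast can be checked against the constant UINT_MAX from the standard header limits.h, which names the largest unsigned int directly. The sketch below uses two hypothetical helper functions; note that the C standard defines the conversion of -1 to an unsigned type arithmetically, so the two values are expected to agree on any conforming implementation.

```c
#include <limits.h>

/* Return the largest unsigned int by converting -1 with an
   explicit cast, which documents that the conversion is
   intentional. */
unsigned int
largest_unsigned (void)
{
  return (unsigned int) -1;
}

/* Return 1 if the cast agrees with UINT_MAX from <limits.h>. */
int
matches_uint_max (void)
{
  return largest_unsigned () == UINT_MAX;
}
```

In new code, writing UINT_MAX is usually clearer than either form of the conversion, since it states the intended value by name.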

-Wshadow
  This option warns about the redeclaration of a variable name in a scope where it has already been declared. This is referred to as variable shadowing, and causes confusion about which occurrence of the variable corresponds to which value.

  The following function declares a local variable y that shadows the declaration in the body of the function:

    double
    test (double x)
    {
      double y = 1.0;
      {
        double y;
        y = x;
      }
      return y;
    }

  This is valid ANSI/ISO C, where the return value is 1. The shadowing of the variable y might make it seem (incorrectly) that the return value is x, when looking at the line y = x (especially in a large and complicated function).

  Shadowing can also occur for function names. For example, the following program attempts to define a variable sin which shadows the standard function sin(x).

    double
    sin_series (double x)
    {
      /* series expansion for small x */
      double sin = x * (1.0 - x * x / 6.0);
      return sin;
    }

  This error will be detected by the -Wshadow option.

-Wcast-qual
  This option warns about pointers that are cast to remove a type qualifier, such as const. For example, the following function discards the const qualifier from its input argument, allowing it to be overwritten:

    void
    f (const char * str)
    {
      char * s = (char *)str;
      s[0] = '\0';
    }

  The modification of the original contents of str is a violation of its const property. This option will warn about the improper cast of the variable str which allows the string to be modified.

-Wwrite-strings
  This option implicitly gives all string constants defined in the program a const qualifier, causing a compile-time warning if there is an attempt to overwrite them. The result of modifying a string constant is not defined by the ANSI/ISO standard, and the use of writable string constants is deprecated in GCC.

-Wtraditional
  This option warns about parts of the code which would be interpreted differently by an ANSI/ISO compiler and a "traditional" pre-ANSI compiler.(5) When maintaining legacy software it may be necessary to investigate whether the traditional or ANSI/ISO interpretation was intended in the original code for warnings generated by this option.

The options above produce diagnostic warning messages, but allow the compilation to continue and produce an object file or executable. For large programs it can be desirable to catch all the warnings by stopping the compilation whenever a warning is generated. The -Werror option changes the default behavior by converting warnings into errors, stopping the compilation whenever a warning occurs.
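Returning to the -Wcast-qual and -Wwrite-strings options described above, the usual way to avoid both warnings is to leave const data untouched and modify a private copy instead. The following sketch shows one way to do this; the function name truncated_copy is hypothetical, and the caller is assumed to release the result with free.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of str, shortened to at most
   max_len characters.  The const input is only read, never cast
   or written to.  Returns NULL if the allocation fails; otherwise
   the caller must free the result. */
char *
truncated_copy (const char *str, size_t max_len)
{
  size_t len = strlen (str);
  char *copy;

  if (len > max_len)
    len = max_len;

  copy = malloc (len + 1);
  if (copy == NULL)
    return NULL;

  memcpy (copy, str, len);
  copy[len] = '\0';
  return copy;
}
```

Because the function never removes the const qualifier, it can safely be passed a string constant, which would not be true of the function f shown under -Wcast-qual.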

(5) The traditional form of the C language was described in the original C reference manual "The C Programming Language (First Edition)" by Kernighan and Ritchie.
