1 Introduction
The purpose of this book is to explain the use of the GNU C and C++ compilers, gcc and g++. After reading this book you should understand how to compile a program, and how to use basic compiler options for optimization and debugging. This book does not attempt to teach the C or C++ languages themselves, since this material can be found in many other places (see [Further reading], page 91). Experienced programmers who are familiar with other systems, but new to the GNU compilers, can skip the early sections of the chapters "Compiling a C program", "Using the preprocessor" and "Compiling a C++ program". The remaining sections and chapters should provide a good overview of the features of GCC for those who already know how to use other compilers.
An Introduction to GCC
GCC is now used to refer to the GNU Compiler Collection. Its development is guided by the GCC Steering Committee, a group composed of representatives from GCC user communities in industry, research and academia.
See http://www.network-theory.co.uk/gcc/intro/
version of GCC. Any command-line options which are only available in recent versions of GCC are noted in the text. The examples assume the use of a GNU operating system; there may be minor differences in the output on other systems. Some non-essential and verbose system-dependent output messages (such as very long system paths) have been edited in the examples for brevity. The commands for setting environment variables use the syntax of the standard GNU shell (bash), and should work with any version of the Bourne shell.
2 Compiling a C program
This chapter describes how to compile C programs using gcc. Programs can be compiled from a single source file or from multiple source files, and may use system libraries and header files. Compilation refers to the process of converting a program from the textual source code, in a programming language such as C or C++, into machine code, the sequence of 1s and 0s used to control the central processing unit (CPU) of the computer. This machine code is then stored in a file known as an executable file, sometimes referred to as a binary file.
they are enabled. Compiler warnings are an essential aid in detecting problems when programming in C and C++. In this case, the compiler does not produce any warnings with the -Wall option, since the program is completely valid. Source code which does not produce any warnings is said to compile cleanly.

To run the program, type the path name of the executable like this:

  $ ./hello
  Hello, world!

This loads the executable file into memory and causes the CPU to begin executing the instructions contained within it. The path ./ refers to the current directory, so ./hello loads and runs the executable file hello located in the current directory.
C, such as the GNU C Library Reference Manual, see [Further reading], page 91). Without the warning option -Wall the program appears to compile cleanly, but produces incorrect results:

  $ gcc bad.c -o bad
  $ ./bad
  Two plus two is 2.585495    (incorrect output)

The incorrect format specifier causes the output to be corrupted, because the function printf is passed an integer instead of a floating-point number. Integers and floating-point numbers are stored in different formats in memory, and generally occupy different numbers of bytes, leading to a spurious result. The actual output shown above may differ, depending on the specific platform and environment.

Clearly, it is very dangerous to develop a program without checking for compiler warnings. If there are any functions which are not used correctly they can cause the program to crash, or to produce incorrect results. Turning on the compiler warning option -Wall will catch many of the commonest errors which occur in C programming.
longer need to include the system header file stdio.h in main.c to declare the function printf, since the file main.c does not call printf directly.

The declaration in hello.h is a single line specifying the prototype of the function hello:

  void hello (const char * name);

The definition of the function hello itself is contained in the file hello_fn.c:

  #include <stdio.h>
  #include "hello.h"

  void
  hello (const char * name)
  {
    printf ("Hello, %s!\n", name);
  }

This function prints the message "Hello, name!" using its argument as the value of name.

Incidentally, the difference between the two forms of the include statement #include "FILE.h" and #include <FILE.h> is that the former searches for FILE.h in the current directory before looking in the system header file directories. The include statement #include <FILE.h> searches the system header files, but does not look in the current directory by default.

To compile these source files with gcc, use the following command:

  $ gcc -Wall main.c hello_fn.c -o newhello

In this case, we use the -o option to specify a different output file for the executable, newhello. Note that the header file hello.h is not specified in the list of files on the command line. The directive #include "hello.h" in the source files instructs the compiler to include it automatically at the appropriate points.

To run the program, type the path name of the executable:

  $ ./newhello
  Hello, world!

All the parts of the program have been combined into a single executable file, which produces the same result as the executable created from the single source file used earlier.
This is worth keeping in mind if you ever encounter unexpected problems with undefined references, and all the necessary object files appear to be present on the command line.
If the prototype of a function has changed, it is necessary to modify and recompile all of the other source files which use it.
files in a project can be automated using GNU Make (see [Further reading], page 91).
On systems supporting both 64 and 32-bit executables the 64-bit versions of the libraries will often be stored in /usr/lib64 and /lib64, with the 32-bit versions in /usr/lib and /lib.
  $ gcc -Wall calc.c -o calc
  /tmp/ccbR6Ojm.o: In function main:
  /tmp/ccbR6Ojm.o(.text+0x19): undefined reference to sqrt

The problem is that the reference to the sqrt function cannot be resolved without the external math library libm.a. The function sqrt is not defined in the program or the default library libc.a, and the compiler does not link to the file libm.a unless it is explicitly selected. Incidentally, the file mentioned in the error message /tmp/ccbR6Ojm.o is a temporary object file created by the compiler from calc.c, in order to carry out the linking process.

To enable the compiler to link the sqrt function to the main program calc.c we need to supply the library libm.a. One obvious but cumbersome way to do this is to specify it explicitly on the command line:

  $ gcc -Wall calc.c /usr/lib/libm.a -o calc

The library libm.a contains object files for all the mathematical functions, such as sin, cos, exp, log and sqrt. The linker searches through these to find the object file containing the sqrt function. Once the object file for the sqrt function has been found, the main program can be linked and a complete executable produced:

  $ ./calc
  The square root of 2.0 is 1.414214

The executable file includes the machine code for the main function and the machine code for the sqrt function, copied from the corresponding object file in the library libm.a.

To avoid the need to specify long paths on the command line, the compiler provides a short-cut option -l for linking against libraries. For example, the following command,

  $ gcc -Wall calc.c -lm -o calc

is equivalent to the original command above using the full library name /usr/lib/libm.a. In general, the compiler option -lNAME will attempt to link object files with a library file libNAME.a in the standard library directories. Additional directories can be specified with command-line options and environment variables, to be discussed shortly.
A large program will typically use many -l options to link libraries such as the math library, graphics libraries and networking libraries.
containing the definition of a function should appear after any source files or object files which use it. This includes libraries specified with the short-cut -l option, as shown in the following command:

  $ gcc -Wall calc.c -lm -o calc    (correct order)

With some compilers the opposite ordering (placing the -lm option before the file which uses it) would result in an error,

  $ cc -Wall -lm calc.c -o calc    (incorrect order)
  main.o: In function main:
  main.o(.text+0xf): undefined reference to sqrt

because there is no library or object file containing sqrt after calc.c. The option -lm should appear after the file calc.c.

When several libraries are being used, the same convention should be followed for the libraries themselves. A library which calls an external function defined in another library should appear before the library containing the function. For example, a program data.c using the GNU Linear Programming library libglpk.a, which in turn uses the math library libm.a, should be compiled as,

  $ gcc -Wall data.c -lglpk -lm

since the object files in libglpk.a use functions defined in libm.a.

As for object files, most current compilers will search all libraries, regardless of order. However, since not all compilers do this it is best to follow the convention of ordering libraries from left to right.
}

However, the program contains an error: the #include statement for math.h is missing, so the prototype double pow (double x, double y) given there will not be seen by the compiler. Compiling the program without any warning options will produce an executable file which gives incorrect results:

  $ gcc badpow.c -lm
  $ ./a.out
  Two cubed is 2.851120    (incorrect result, should be 8)

The results are corrupted because the arguments and return value of the call to pow are passed with incorrect types.(3) This can be detected by turning on the warning option -Wall:

  $ gcc -Wall badpow.c -lm
  badpow.c: In function main:
  badpow.c:6: warning: implicit declaration of function pow

This example shows again the importance of using the warning option -Wall to detect serious problems that could otherwise easily be overlooked.

(3) The actual output shown above may differ, depending on the specific platform and environment.
3 Compilation options
This chapter describes other commonly-used compiler options available in GCC. These options control features such as the search paths used for locating libraries and include files, the use of additional warnings and diagnostics, preprocessor macros and C language dialects.
The default search paths may also include additional system-dependent or site-specific directories, and directories in the GCC installation itself. For example, on 64-bit platforms additional lib64 directories may also be searched by default.
When additional libraries are installed in other directories it is necessary to extend the search paths, in order for the libraries to be found. The compiler options -I and -L add new directories to the beginning of the include path and library search path respectively.
For example, if version 1.8.3 of the GDBM package is installed under the directory /opt/gdbm-1.8.3 the location of the header file would be,

  /opt/gdbm-1.8.3/include/gdbm.h

which is not part of the default gcc include path. Adding the appropriate directory to the include path with the command-line option -I allows the program to be compiled, but not linked:

  $ gcc -Wall -I/opt/gdbm-1.8.3/include dbmain.c -lgdbm
  /usr/bin/ld: cannot find -lgdbm
  collect2: ld returned 1 exit status

The directory containing the library is still missing from the link path. It can be added to the link path using the following option:

  -L/opt/gdbm-1.8.3/lib/

The following command line allows the program to be compiled and linked:

  $ gcc -Wall -I/opt/gdbm-1.8.3/include -L/opt/gdbm-1.8.3/lib dbmain.c -lgdbm

This produces the final executable linked to the GDBM library. Before seeing how to run this executable we will take a brief look at the environment variables that affect the -I and -L options.

Note that you should never place the absolute paths of header files in #include statements in your source code, as this will prevent the program from compiling on other systems. The -I option or the C_INCLUDE_PATH variable described below should always be used to set the include path for header files.
variable in each shell session, and can also be set in the appropriate login file.

Similarly, additional directories can be added to the link path using the environment variable LIBRARY_PATH. For example, the following commands will add /opt/gdbm-1.8.3/lib to the link path:

  $ LIBRARY_PATH=/opt/gdbm-1.8.3/lib
  $ export LIBRARY_PATH

This directory will be searched after any directories specified on the command line with the option -L, and before the standard default directories /usr/local/lib and /usr/lib.

With the environment variable settings given above the program dbmain.c can be compiled without the -I and -L options,

  $ gcc -Wall dbmain.c -lgdbm

because the default paths now use the directories specified in the environment variables C_INCLUDE_PATH and LIBRARY_PATH.
The current directory can also be specified using an empty path element. For example, :DIR1:DIR2 is equivalent to .:DIR1:DIR2.
2. directories specified by environment variables, such as C_INCLUDE_PATH and LIBRARY_PATH
3. default system directories

In day-to-day usage, directories are usually added to the search paths with the options -I and -L.
Furthermore, shared libraries make it possible to update a library without recompiling the programs which use it (provided the interface to the library does not change).

Because of these advantages gcc compiles programs to use shared libraries by default on most systems, if they are available. Whenever a static library libNAME.a would be used for linking with the option -lNAME the compiler first checks for an alternative shared library with the same name and a .so extension.

In this case, when the compiler searches for the libgdbm library in the link path, it finds the following two files in the directory /opt/gdbm-1.8.3/lib:

  $ cd /opt/gdbm-1.8.3/lib
  $ ls libgdbm.*
  libgdbm.a  libgdbm.so

Consequently, the libgdbm.so shared object file is used in preference to the libgdbm.a static library.

However, when the executable file is started its loader function must find the shared library in order to load it into memory. By default the loader searches for shared libraries only in a predefined set of system directories, such as /usr/local/lib and /usr/lib. If the library is not located in one of these directories it must be added to the load path.(3)

The simplest way to set the load path is through the environment variable LD_LIBRARY_PATH. For example, the following commands set the load path to /opt/gdbm-1.8.3/lib so that libgdbm.so can be found:

  $ LD_LIBRARY_PATH=/opt/gdbm-1.8.3/lib
  $ export LD_LIBRARY_PATH
  $ ./a.out
  Storing key-value pair... done.

The executable now runs successfully, prints its message and creates a DBM file called test containing the key-value pair testkey and testvalue.

To save typing, the LD_LIBRARY_PATH environment variable can be set once for each session in the appropriate login file, such as .bash_profile for the GNU Bash shell.

Several shared library directories can be placed in the load path, as a colon separated list DIR1:DIR2:DIR3:...:DIRN. For example, the following command sets the load path to use the lib directories under /opt/gdbm-1.8.3 and /opt/gtk-1.4:

  $ LD_LIBRARY_PATH=/opt/gdbm-1.8.3/lib:/opt/gtk-1.4/lib
  $ export LD_LIBRARY_PATH

If the load path contains existing entries, it can be extended using the syntax LD_LIBRARY_PATH=NEWDIRS:$LD_LIBRARY_PATH. For example, the following command adds the directory /opt/gsl-1.5/lib to the load path shown above:

  $ LD_LIBRARY_PATH=/opt/gsl-1.5/lib:$LD_LIBRARY_PATH
  $ echo $LD_LIBRARY_PATH
  /opt/gsl-1.5/lib:/opt/gdbm-1.8.3/lib:/opt/gtk-1.4/lib

It is possible for the system administrator to set the LD_LIBRARY_PATH variable for all users, by adding it to a default login script, such as /etc/profile. On GNU systems, a system-wide path can also be defined in the loader configuration file /etc/ld.so.conf.

Alternatively, static linking can be forced with the -static option to gcc to avoid the use of shared libraries:

  $ gcc -Wall -static -I/opt/gdbm-1.8.3/include/ -L/opt/gdbm-1.8.3/lib/ dbmain.c -lgdbm

This creates an executable linked with the static library libgdbm.a which can be run without setting the environment variable LD_LIBRARY_PATH or putting shared libraries in the default directories:

  $ ./a.out
  Storing key-value pair... done.

As noted earlier, it is also possible to link directly with individual library files by specifying the full path to the library on the command line. For example, the following command will link directly with the static library libgdbm.a,

  $ gcc -Wall -I/opt/gdbm-1.8.3/include dbmain.c /opt/gdbm-1.8.3/lib/libgdbm.a

and the command below will link with the shared library file libgdbm.so:

  $ gcc -Wall -I/opt/gdbm-1.8.3/include dbmain.c /opt/gdbm-1.8.3/lib/libgdbm.so

In the latter case it is still necessary to set the library load path when running the executable.

(3) Note that the directory containing the shared library can, in principle, be stored (hard-coded) in the executable itself using the linker option -rpath, but this is not usually done since it creates problems if the library is moved or the executable is copied to another system.
ANSI/ISO standard for the C language with several useful GNU extensions, such as nested functions and variable-size arrays. Most ANSI/ISO programs will compile under GNU C without changes.

There are several options which control the dialect of C used by gcc. The most commonly-used options are -ansi and -pedantic. The specific dialects of the C language for each standard can also be selected with the -std option.
3.3.1 ANSI/ISO
Occasionally a valid ANSI/ISO program may be incompatible with the extensions in GNU C. To deal with this situation, the compiler option -ansi disables those GNU extensions which conflict with the ANSI/ISO standard. On systems using the GNU C Library (glibc) it also disables extensions to the C standard library. This allows programs written for ANSI/ISO C to be compiled without any unwanted effects from GNU extensions.

For example, here is a valid ANSI/ISO C program which uses a variable called asm:

  #include <stdio.h>

  int
  main (void)
  {
    const char asm[] = "6502";
    printf ("the string asm is %s\n", asm);
    return 0;
  }

The variable name asm is valid under the ANSI/ISO standard, but this program will not compile in GNU C because asm is a GNU C keyword extension (it allows native assembly instructions to be used in C functions). Consequently, it cannot be used as a variable name without giving a compilation error:

  $ gcc -Wall ansi.c
  ansi.c: In function main:
  ansi.c:6: parse error before asm
  ansi.c:7: parse error before asm

In contrast, using the -ansi option disables the asm keyword extension, and allows the program above to be compiled correctly:

  $ gcc -Wall -ansi ansi.c
  $ ./a.out
  the string asm is 6502
For reference, the non-standard keywords and macros defined by the GNU C extensions are asm, inline, typeof, unix and vax. More details can be found in the GCC Reference Manual "Using GCC" (see [Further reading], page 91).

The next example shows the effect of the -ansi option on systems using the GNU C Library, such as GNU/Linux systems. The program below prints the value of pi, π = 3.14159..., from the preprocessor definition M_PI in the header file math.h:

  #include <math.h>
  #include <stdio.h>

  int
  main (void)
  {
    printf("the value of pi is %f\n", M_PI);
    return 0;
  }

The constant M_PI is not part of the ANSI/ISO C standard library (it comes from the BSD version of Unix). In this case, the program will not compile with the -ansi option:

  $ gcc -Wall -ansi pi.c
  pi.c: In function main:
  pi.c:7: M_PI undeclared (first use in this function)
  pi.c:7: (Each undeclared identifier is reported only once
  pi.c:7: for each function it appears in.)

The program can be compiled without the -ansi option. In this case both the language and library extensions are enabled by default:

  $ gcc -Wall pi.c
  $ ./a.out
  the value of pi is 3.141593

It is also possible to compile the program using ANSI/ISO C, by enabling only the extensions in the GNU C Library itself. This can be achieved by defining special macros, such as _GNU_SOURCE, which enable extensions in the GNU C Library:(4)

  $ gcc -Wall -ansi -D_GNU_SOURCE pi.c
  $ ./a.out
  the value of pi is 3.141593

The GNU C Library provides a number of these macros (referred to as feature test macros) which allow control over the support for POSIX extensions (_POSIX_C_SOURCE), BSD extensions (_BSD_SOURCE), SVID extensions (_SVID_SOURCE), XOPEN extensions (_XOPEN_SOURCE) and GNU extensions (_GNU_SOURCE).

The _GNU_SOURCE macro enables all the extensions together, with the POSIX extensions taking precedence over the others in cases where they conflict. Further information about feature test macros can be found in the GNU C Library Reference Manual, see [Further reading], page 91.

(4) The -D option for defining macros will be explained in detail in the next chapter.
-Wformat (included in -Wall)
  This option warns about the incorrect use of format strings in functions such as printf and scanf, where the format specifier does not agree with the type of the corresponding function argument.

-Wunused (included in -Wall)
  This option warns about unused variables. When a variable is declared but not used this can be the result of another variable being accidentally substituted in its place. If the variable is genuinely not needed it can be removed from the source code.

-Wimplicit (included in -Wall)
  This option warns about any functions that are used without being declared. The most common reason for a function to be used without being declared is forgetting to include a header file.

-Wreturn-type (included in -Wall)
  This option warns about functions that are defined without a return type but not declared void. It also catches empty return statements in functions that are not declared void.

  For example, the following program does not use an explicit return value:

    #include <stdio.h>

    int
    main (void)
    {
      printf ("hello world\n");
      return;
    }

  The lack of a return value in the code above could be the result of an accidental omission by the programmer; the value returned by the main function is actually the return value of the printf function (the number of characters printed). To avoid ambiguity, it is preferable to use an explicit value in the return statement, either as a variable or a constant, such as return 0.

The complete set of warning options included in -Wall can be found in the GCC Reference Manual "Using GCC" (see [Further reading], page 91). The options included in -Wall have the common characteristic that they report constructions which are always wrong, or can easily be rewritten in an unambiguously correct way. This is why they are so useful: any warning produced by -Wall can be taken as an indication of a potentially serious problem.
-Wconversion
  This option warns about implicit type conversions that could cause unexpected results. For example, the assignment of a negative value to an unsigned variable, as in the following code,

    unsigned int x = -1;

  is technically allowed by the ANSI/ISO C standard (with the negative integer being converted to a positive integer, according to the machine representation) but could be a simple programming error. If you need to perform such a conversion you can use an explicit cast, such as ((unsigned int) -1), to avoid any warnings from this option. On two's-complement machines the result of the cast gives the maximum number that can be represented by an unsigned integer.
-Wshadow
  This option warns about the redeclaration of a variable name in a scope where it has already been declared. This is referred to as variable shadowing, and causes confusion about which occurrence of the variable corresponds to which value.

  The following function declares a local variable y that shadows the declaration in the body of the function:

    double
    test (double x)
    {
      double y = 1.0;
      {
        double y;
        y = x;
      }
      return y;
    }

  This is valid ANSI/ISO C, where the return value is 1. The shadowing of the variable y might make it seem (incorrectly) that the return value is x, when looking at the line y = x (especially in a large and complicated function).

  Shadowing can also occur for function names. For example, the following program attempts to define a variable sin which shadows the standard function sin(x).

    double
    sin_series (double x)
    {
      /* series expansion for small x */
      double sin = x * (1.0 - x * x / 6.0);
      return sin;
    }

  This error will be detected by the -Wshadow option.

-Wcast-qual
  This option warns about pointers that are cast to remove a type qualifier, such as const. For example, the following function discards the const qualifier from its input argument, allowing it to be overwritten:

    void
    f (const char * str)
    {
      char * s = (char *)str;
      s[0] = '\0';
    }

  The modification of the original contents of str is a violation of its const property. This option will warn about the improper cast of the variable str which allows the string to be modified.

-Wwrite-strings
  This option implicitly gives all string constants defined in the program a const qualifier, causing a compile-time warning if there is an attempt to overwrite them. The result of modifying a string constant is not defined by the ANSI/ISO standard, and the use of writable string constants is deprecated in GCC.

-Wtraditional
  This option warns about parts of the code which would be interpreted differently by an ANSI/ISO compiler and a traditional pre-ANSI compiler.(5) When maintaining legacy software it may be necessary to investigate whether the traditional or ANSI/ISO interpretation was intended in the original code for warnings generated by this option.

The options above produce diagnostic warning messages, but allow the compilation to continue and produce an object file or executable. For large programs it can be desirable to catch all the warnings by stopping the compilation whenever a warning is generated. The -Werror option changes the default behavior by converting warnings into errors, stopping the compilation whenever a warning occurs.
(5) The traditional form of the C language was described in the original C reference manual "The C Programming Language (First Edition)" by Kernighan and Ritchie.