And other Unixy bits & pieces. A tutorial.

To follow along with this tutorial you will need a Unix-like system such as Linux, BSD, Mac OS X, or (on Windows) Cygwin.

A simple C program

We’ll begin with a very simple C (not C++) program. Unless mentioned otherwise, everything we’ll see about C also applies to C++ (which was designed with C compatibility in mind); in due course we will move onto C++ specifics.

Our program, reproduced below in full, is a complete implementation of a read-only “database” of the number of passengers on a set of flights. Take a moment to understand it fully, as we will be using this program throughout this study.

paxCount.c

int flights[] = { 20, 15, 0 };
int getCount(char* flightNumber)
{
if (flightNumber[0] == '0') return flights[0];
if (flightNumber[0] == '1') return flights[1];
if (flightNumber[0] == '2') return flights[2];
return 0;
}
int main(int argc, char** argv)
{
if (argc > 1)
return getCount(argv[1]);
return 0;
}

The program returns, as its shell exit status, the number of passengers on the specified flight. Let’s try it:

$ gcc -Wall -Werror paxCount.c && ( ./a.out 1 ; echo $? )
15
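
Using the exit status to carry a passenger count has a limit worth knowing about: on POSIX systems the shell only sees the low 8 bits of the value returned from main. The following is a minimal sketch of that behaviour; asShellStatus is a hypothetical helper written for illustration, not part of the tutorial's program.

```cpp
// Sketch: the value that "echo $?" would show for a given return value of
// main(). Only the low 8 bits survive, so the result is always 0..255.
// (asShellStatus is a made-up name, used here for illustration only.)
int asShellStatus(int returnValue)
{
    return returnValue & 0xFF;
}
```

So a flight with 15 passengers is reported faithfully, but a return value of -1 would show up as 255, and a flight with 256 passengers would look empty.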

Object files & the linker

Our boss is very impressed by our database of flights. He’s asked us to extract it from the paxCount program so that other programs can use it too. Our plan is to split the program into two separate source files, paxCount.c and paxDB.c:

paxCount.c

int main(int argc, char** argv)
{
if (argc > 1)
return getCount(argv[1]);
return 0;
}

paxDB.c

int flights[] = { 20, 15, 0 };
int getCount(char* flightNumber)
{
if (flightNumber[0] == '0') return flights[0];
if (flightNumber[0] == '1') return flights[1];
if (flightNumber[0] == '2') return flights[2];
return 0;
}

What we usually think of as compilation consists of several stages, some of which we will cover later. For now, we are interested in what we’ll call the actual compilation, followed by linking.

In the compilation stage, the compiler takes each source file and produces a corresponding object file. Even when you specify all the sources in a single command, each source file is compiled separately, into its own object file.

In the linking stage, the linker combines all of the object files into a single executable file.

The compiler comes from your compiler vendor (in this case, GNU), whereas the linker comes with your system (in the case of Linux, the linker is also made by GNU). This implies that the compiler must create object files in a format that the linker will understand (on Linux, and many other systems, this is the “Executable and Linkable Format”, or ELF).

Let’s try to compile our new paxCount.c:

$ gcc -Wall -Werror paxCount.c
paxCount.c: In function ‘main’:
paxCount.c:4: warning: implicit declaration of function ‘getCount’

Oh-oh! The definition of getCount is no longer in paxCount.c, so the compiler has never heard of it. To compile a call to a function, the compiler needs to see at least a declaration of that function. Now our paxCount.c looks like this:

paxCount.c

int getCount(char* flightNumber); /* function declaration */
int main(int argc, char** argv)
{
if (argc > 1)
return getCount(argv[1]);
return 0;
}

$ gcc -Wall -Werror paxCount.c
Undefined symbols:
"_getCount", referenced from:
_main in ccjyTfqz.o
ld: symbol(s) not found
collect2: ld returned 1 exit status

This time the compilation to object code succeeded, but the linker failed: nothing that we gave it defines getCount. To tell gcc to stop after compilation, without linking, we use the “-c” flag:

$ gcc -c -Wall -Werror paxCount.c
$ ls paxCount.o
paxCount.o

Finally! Now let’s compile paxDB.c:

$ gcc -c -Wall -Werror paxDB.c
$ ls paxDB.o
paxDB.o

We can use the nm tool to list the symbols in each object file:

$ nm paxCount.o
                 U _getCount
0000000000000000 T _main
$ nm paxDB.o
0000000000000058 D _flights
0000000000000000 T _getCount

The last column lists the symbols in the object file. The first column shows each symbol’s value (its location or offset within the file) and the second column shows the symbol’s type. We will investigate the various types later on, but for now all we need to know is that “U” means undefined: paxCount.o refers to getCount but does not define it. The other symbols are all defined, so let’s be content with that for now.

Now we can tell gcc to link the object files we compiled just before. Note that we use gcc to run the linker for us, rather than calling the linker directly:

$ gcc -Wall -Werror paxCount.o paxDB.o
$ ls a.out
a.out
$ ./a.out 1 ; echo $?
15

Just like our previous monolithic program!

To take advantage of our modular database code, let’s write another program. This one is called “paxCheck”:

paxCheck.c

int getCount(char* flightNumber);
int main(int argc, char** argv)
{
int count = 0;
if (argc > 1)
count = getCount(argv[1]);
if (count == 0)
return 1; /* error */
else
return 0; /* success */
}

So that we don’t get confused about which “a.out” is which, we use the “-o” flag to name the executable:

$ gcc -Wall -Werror -o paxCheck paxCheck.c paxDB.o
$ ./paxCheck 1 && echo "OK" || echo "ERROR"
OK
$ ./paxCheck 5 && echo "OK" || echo "ERROR"
ERROR

All that hard work paid off! See how easy it was to re-use the flights database code.

Before we move on, let’s ask ourselves: What would happen if we tried to compile and link paxDB.c on its own?

$ gcc -Wall -Werror paxDB.c
Undefined symbols:
"_main", referenced from:
start in crt1.10.6.o
ld: symbol(s) not found
collect2: ld returned 1 exit status

The compilation (to object code) worked, but the linker reported an error. Every C (or C++) program must define a “main” function.

The assembler

In the previous section we saw some of the stages of what we usually think of as compilation. We saw that compilation produced an object file from a source code file, and that linking combined one or more object files into a single executable file.

Now we will see that what we called “compilation” in the previous section actually consists of two separate stages: compilation proper and assembly. Compilation converts the C (or C++) code to assembler code; assembly, performed by the assembler, “assembles” the assembler code into object code (machine instructions). Like the linker, the assembler comes with your system.

To tell gcc to stop after compilation proper and write out the assembler code, we use the “-S” flag:

$ gcc -S -Wall -Werror paxCount.c
$ cat paxCount.s
_main:
LFB2:
pushq %rbp
LCFI0:
movq %rsp, %rbp
LCFI1:
subq $32, %rsp
LCFI2:
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
cmpl $1, -4(%rbp)
jle L2
movq -16(%rbp), %rax
addq $8, %rax
movq (%rax), %rdi
call _getCount
movl %eax, -20(%rbp)
jmp L4
L2:
movl $0, -20(%rbp)
L4:
movl -20(%rbp), %eax
leave
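ret

In the assembler code for the main function, the call to getCount still refers to its target by name (“call _getCount”). After linking with paxDB.o, the name has been replaced by the address of the function:

$ gcc -Wall -Werror -o paxCount paxCount.c paxDB.c
$ nm paxCount
0000000100000e9a T _getCount
0000000100000e64 T _main
$ echo "disassemble main" > gdb.instructions
$ gdb -n -batch -x gdb.instructions paxCount
Reading symbols for shared libraries .. done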
Dump of assembler code for function main:
0x0000000100000e64 <main+0>: push %rbp
0x0000000100000e65 <main+1>: mov %rsp,%rbp
0x0000000100000e68 <main+4>: sub $0x20,%rsp
0x0000000100000e6c <main+8>: mov %edi,-0x4(%rbp)
0x0000000100000e6f <main+11>: mov %rsi,-0x10(%rbp)
0x0000000100000e73 <main+15>: cmpl $0x1,-0x4(%rbp)
0x0000000100000e77 <main+19>: jle 0x100000e8e <main+42>
0x0000000100000e79 <main+21>: mov -0x10(%rbp),%rax
0x0000000100000e7d <main+25>: add $0x8,%rax
0x0000000100000e81 <main+29>: mov (%rax),%rdi
0x0000000100000e84 <main+32>: callq 0x100000e9a <getCount>
0x0000000100000e89 <main+37>: mov %eax,-0x14(%rbp)
0x0000000100000e8c <main+40>: jmp 0x100000e95 <main+49>
0x0000000100000e8e <main+42>: movl $0x0,-0x14(%rbp)
0x0000000100000e95 <main+49>: mov -0x14(%rbp),%eax
0x0000000100000e98 <main+52>: leaveq
0x0000000100000e99 <main+53>: retq
End of assembler dump.
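
To make the linker's part in this more concrete, here is a toy model of symbol resolution. It is only a sketch under simplifying assumptions: a real linker works on relocation records inside object files, not on a list of strings, and the names CallSite and resolveCalls are invented for this illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// A call instruction that still refers to its target by symbol name.
struct CallSite
{
    std::string symbol;          // e.g. "_getCount"
    unsigned long long address;  // 0 until the symbol has been resolved
};

// Toy model of symbol resolution: look up each call's symbol in the table
// of defined symbols and record the address found there. Returns the names
// that could not be found, which is the situation in which a real linker
// reports "Undefined symbols".
std::vector<std::string> resolveCalls(
    std::vector<CallSite>& calls,
    const std::map<std::string, unsigned long long>& defined)
{
    std::vector<std::string> undefined;
    for (std::size_t i = 0; i < calls.size(); ++i)
    {
        std::map<std::string, unsigned long long>::const_iterator found =
            defined.find(calls[i].symbol);
        if (found == defined.end())
            undefined.push_back(calls[i].symbol);
        else
            calls[i].address = found->second;
    }
    return undefined;
}
```

Given a table that maps “_getCount” to 0x100000e9a, the call site in main ends up with that address, just as in the disassembly. Take “_getCount” out of the table and the function reports it as undefined, which mirrors the error we saw when we tried to link paxCount.c on its own.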

Header files & the preprocessor

After we shipped Easy enough. For future flexibility, we decide to add a “default” parameter to getCount, to be returned when the flight is not found:

paxDB.c

int flights[] = { 20, 15, 0 };
int getCount(char* flightNumber, int deflt)
{
if (flightNumber[0] == '0') return flights[0];
if (flightNumber[0] == '1') return flights[1];
if (flightNumber[0] == '2') return flights[2];
return deflt;
}

paxCount.c

int getCount(char* flightNumber, int deflt);
int main(int argc, char** argv)
{
if (argc > 1)
return getCount(argv[1], -1);
return 0;
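}

This works pretty much as expected (the shell reports the exit status -1 as 255, because an exit status is a single unsigned byte):

$ gcc -Wall -Werror -o paxCount paxCount.c paxDB.c
$ ./paxCount 2 ; echo $?
0
$ ./paxCount 5 ; echo $?
255

However, next morning our automated tests of paxCheck fail:

$ gcc -Wall -Werror -o paxCheck paxCheck.c paxDB.c
$ ./paxCheck 5 && echo "OK" || echo "ERROR"
OK

What’s going on? Flight 5 doesn’t exist, so paxCheck should have reported an error. Let’s look at paxCheck.c:

paxCheck.c

int getCount(char* flightNumber);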
int main(int argc, char** argv)
{
int count = 0;
if (argc > 1)
count = getCount(argv[1]);
if (count == 0)
return 1; /* error */
else
return 0; /* success */
}

Oh right! We changed the parameters of getCount in paxDB.c, but paxCheck.c still declares the old version. Even if you are confused by how this still compiles and runs, at least it is clear what the cause of the problem is. It’s easy enough to fix, by correcting the declaration of getCount in paxCheck.c:

paxCheck.c

int getCount(char* flightNumber, int deflt);
int main(int argc, char** argv)
{
int count = 0;
if (argc > 1)
count = getCount(argv[1]);
if (count == 0)
return 1; /* error */
else
return 0; /* success */
}

$ gcc -Wall -Werror -o paxCheck paxCheck.c paxDB.c
paxCheck.c: In function ‘main’:
paxCheck.c:8: error: too few arguments to function ‘getCount’

Aha! Now that the compiler knows the correct prototype for getCount, it can tell us that our call is wrong. The correct program is:

paxCheck.c

int getCount(char* flightNumber, int deflt);
int main(int argc, char** argv)
{
int count = 0;
if (argc > 1)
count = getCount(argv[1], 0);
if (count == 0)
return 1; /* error */
else
return 0; /* success */
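}

$ gcc -Wall -Werror -o paxCheck paxCheck.c paxDB.c
$ ./paxCheck 5 && echo "OK" || echo "ERROR"
ERROR

Well, that works, but imagine that we had 10, 20, or 100 programs all using getCount: each one would carry its own copy of the declaration, and each copy would need fixing. This is the problem that header files solve. The idea is that paxDB.h holds the one and only declaration of getCount, and every source file that needs the declaration includes the header:

paxDB.h

int getCount(char* flightNumber, int deflt);

paxDB.c

#include "paxDB.h"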
int flights[] = { 20, 15, 0 };
int getCount(char* flightNumber, int deflt)
{
if (flightNumber[0] == '0') return flights[0];
if (flightNumber[0] == '1') return flights[1];
if (flightNumber[0] == '2') return flights[2];
return deflt;
}

paxCount.c

#include "paxDB.h"
int main(int argc, char** argv)
{
if (argc > 1)
return getCount(argv[1], -1);
return 0;
}

paxCheck.c

#include "paxDB.h"
int main(int argc, char** argv)
{
int count = 0;
if (argc > 1)
count = getCount(argv[1], 0);
if (count == 0)
return 1; /* error */
else
return 0; /* success */
}

So how exactly does this work? Time to introduce another stage in the compilation process: preprocessing. Although we have studied it last of all, it is the first stage to take place in the compilation process.

The C preprocessor (cpp) replaces each “#include” line with the contents of the named file. To tell gcc to stop after preprocessing, we use the “-E” flag:

$ gcc -E -Wall -Werror paxCount.c
# 1 "paxCount.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "paxCount.c"
# 1 "paxDB.h" 1
int getCount(char* flightNumber, int deflt);
# 2 "paxCount.c" 2
int main(int argc, char** argv)
{
if (argc > 1)
return getCount(argv[1], -1);
return 0;
}

That output, called a translation unit, is what is fed to the compiler. The lines beginning with “#” are line markers: they record which file and line the text that follows came from, so that the compiler can report errors against the original source files.

That’s really all there is to header files: textual inclusion. There is nothing magic about them. We could call our file anything we like; the “.h” extension is only a convention.

In spite of their simplicity, header files and the preprocessor provide a powerful mechanism for specifying the interface to a source code library or module.

To recap all the stages of compilation, the following commands show all the operations that take place for the command “gcc -Wall -Werror -o paxCount paxCount.c paxDB.c”:

$ gcc -E -Wall -Werror paxCount.c > paxCount.i
$ gcc -S -Wall -Werror paxCount.i
$ gcc -c -Wall -Werror paxCount.s
$ gcc -E -Wall -Werror paxDB.c > paxDB.i
$ gcc -S -Wall -Werror paxDB.i
$ gcc -c -Wall -Werror paxDB.s
$ gcc -Wall -Werror -o paxCount paxCount.o paxDB.o
$ ./paxCount 1 ; echo $?
15
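
Since inclusion is purely textual, we can picture everything that matters in a single file. The sketch below puts the declaration, a caller and the definition together in one C++ translation unit, to show that a call can be compiled knowing only the declaration. It makes two assumptions for the sake of illustration: lookUp is a hypothetical caller standing in for main, and the parameter is a const char* (rather than the tutorial's char*) so that string literals can be passed in C++.

```cpp
// What paxDB.h contributes after preprocessing: a declaration only.
int getCount(const char* flightNumber, int deflt);

// The compiler can check and compile this call knowing just the declaration.
// (lookUp is a hypothetical caller, standing in for paxCount.c's main.)
int lookUp(const char* flightNumber)
{
    return getCount(flightNumber, -1);
}

// What paxDB.c contributes: the data and the definition. Because the
// declaration above is visible here too, the compiler would reject a
// definition with the same parameters but a different return type.
static int flights[] = { 20, 15, 0 };

int getCount(const char* flightNumber, int deflt)
{
    if (flightNumber[0] == '0') return flights[0];
    if (flightNumber[0] == '1') return flights[1];
    if (flightNumber[0] == '2') return flights[2];
    return deflt;
}
```

This is also why paxDB.c includes its own header: the compiler then sees the declaration and the definition side by side.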

Preprocessor macros

The preprocessor can also make other substitutions to your source files; there’s more than just “#include”. With “#define” we can give names to constants:

paxCheck.c

#include "paxDB.h"
#define ERROR 1
#define SUCCESS 0
int main(int argc, char** argv)
{
int count = 0;
if (argc > 1)
count = getCount(argv[1], 0);
if (count == 0)
return ERROR;
else
return SUCCESS;
}

The preprocessor replaces every occurrence of the macro’s name with the macro’s definition:

$ gcc -E -Wall -Werror paxCheck.c
# 1 "paxCheck.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "paxCheck.c"
# 1 "paxDB.h" 1
int getCount(char* flightNumber, int deflt);
# 2 "paxCheck.c" 2
int main(int argc, char** argv)
{
int count = 0;
if (argc > 1)
count = getCount(argv[1], 0);
if (count == 0)
return 1;
else
return 0;
}
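
Macros can also take parameters, and inside a macro definition the “#” operator turns an argument into a string literal. The following sketch shows both, together with the classic way in which text substitution differs from a function call. All of the names (DOUBLE_IT, DIGIT_CHAR, doubleIt and the three small functions) are made up for this illustration.

```cpp
// Function-like macro: the argument text is pasted in as-is, unevaluated.
#define DOUBLE_IT(x) x + x

// The # operator turns a macro argument into a string literal,
// so DIGIT_CHAR(1) expands to "1"[0], which is the character '1'.
#define DIGIT_CHAR(n) #n[0]

// An ordinary function evaluates its argument before using it.
int doubleIt(int x) { return x + x; }

int viaMacro()    { return DOUBLE_IT(2) * 3; }  // expands to 2 + 2 * 3
int viaFunction() { return doubleIt(2) * 3; }   // computes (2 + 2) * 3
char digitOne()   { return DIGIT_CHAR(1); }
```

Here viaMacro returns 8 whereas viaFunction returns 12, because the macro's expansion is subject to the operator precedence of the surrounding expression. Wrapping the macro body in parentheses avoids this particular surprise.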

paxDB.c

#include "paxDB.h"
int flights[] = { 20, 15, 0 };
#define GET(n) if (flightNumber[0] == #n[0]) return flights[n]
int getCount(char* flightNumber, int deflt)
{
GET(0);
GET(1);
GET(2);
return deflt;
}

$ gcc -E -Wall -Werror paxDB.c
# 1 "paxDB.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "paxDB.c"
# 1 "paxDB.h" 1
int getCount(char* flightNumber, int deflt);
# 2 "paxDB.c" 2
int flights[] = { 20, 15, 0 };
int getCount(char* flightNumber, int deflt)
{
if (flightNumber[0] == "0"[0]) return flights[0];
if (flightNumber[0] == "1"[0]) return flights[1];
if (flightNumber[0] == "2"[0]) return flights[2];
return deflt;
}

$ gcc -Wall -Werror -o paxCount paxCount.c paxDB.c
$ ./paxCount 1 ; echo $?
15

Preprocessor macros (and constants) don’t have to be all-uppercase; that’s a convention. It’s a useful convention because macros are plain text substitution, so they act very differently from run-time function calls, sometimes in unexpected ways. Much has been written on the evils of preprocessor macros; search the web for it. They are very useful, very occasionally.

Internal vs. external linkage

Remember how we looked at the symbols in paxDB.o?

$ nm paxDB.o
0000000000000058 D _flights
0000000000000000 T _getCount

The flights symbol is visible to the linker just as getCount is, so code in any other object file can reach it:

paxCount.c

#include "paxDB.h"
extern int flights[];
int main(int argc, char** argv)
{
flights[1]++;
if (argc > 1)
return getCount(argv[1], -1);
return 0;
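}

$ gcc -Wall -Werror -o paxCount paxCount.c paxDB.o
$ ./paxCount 1 ; echo $?
16

The line starting with “extern” declares flights without defining it, just as a function declaration does for a function. In paxDB.c we can prevent this kind of access by declaring flights “static”, which gives it internal linkage:

paxDB.c

#include "paxDB.h"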
static int flights[] = { 20, 15, 0 };
int getCount(char* flightNumber, int deflt)
{
if (flightNumber[0] == '0') return flights[0];
if (flightNumber[0] == '1') return flights[1];
if (flightNumber[0] == '2') return flights[2];
return deflt;
}

Looking at the symbols in paxDB.o, we see that the type of flights has changed from “D” to “d”:

$ gcc -c -Wall -Werror paxDB.c
$ nm paxDB.o
0000000000000058 d _flights
0000000000000000 T _getCount

Now the code outside of paxDB.c can no longer refer to flights:

$ gcc -c -Wall -Werror paxCount.c
$ gcc -Wall -Werror -o paxCount paxCount.o paxDB.o
Undefined symbols:
"_flights", referenced from:
_main in paxCount.o
_main in paxCount.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
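
With flights hidden, the only way for other files to touch the data is through functions that the module chooses to export. The sketch below shows that pattern. It assumes a hypothetical second function, addPassenger, which the tutorial's database does not have, and it again takes const char* so that string literals can be passed from C++.

```cpp
#include <cstddef>

// Internal linkage: no other object file can refer to this array by name.
static int flights[] = { 20, 15, 0 };

// Internal helper: returns the entry for a flight, or NULL if unknown.
static int* findFlight(const char* flightNumber)
{
    if (flightNumber == NULL) return NULL;
    if (flightNumber[0] == '0') return &flights[0];
    if (flightNumber[0] == '1') return &flights[1];
    if (flightNumber[0] == '2') return &flights[2];
    return NULL;
}

// External linkage: this is the module's public interface.
int getCount(const char* flightNumber, int deflt)
{
    const int* entry = findFlight(flightNumber);
    return entry ? *entry : deflt;
}

// Hypothetical addition: the one sanctioned way to change a count.
// Returns false if the flight does not exist.
bool addPassenger(const char* flightNumber)
{
    int* entry = findFlight(flightNumber);
    if (entry == NULL) return false;
    ++*entry;
    return true;
}
```

A caller that wants what “flights[1]++” did earlier would now call addPassenger("1"), and the module stays free to change how it stores its data.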

C++ name mangling

Suppose we want to provide a second version of getCount, one that takes the flight number as an integer:

paxDB.h

int getCount(char* flightNumber, int deflt);
int getCount(int flightNumber, int deflt);

paxDB.c

#include "paxDB.h"
static int flights[] = { 20, 15, 0 };
int getCount(char* flightNumber, int deflt)
{
if (flightNumber[0] == '0') return flights[0];
if (flightNumber[0] == '1') return flights[1];
if (flightNumber[0] == '2') return flights[2];
return deflt;
}
int getCount(int flightNumber, int deflt)
{
if (flightNumber >= 0 && flightNumber <= 2)
return flights[flightNumber];
return deflt;
}

$ gcc -c -Wall -Werror paxDB.c
In file included from paxDB.c:1:
paxDB.h:2: error: conflicting types for ‘getCount’
paxDB.h:1: error: previous declaration of ‘getCount’ was here
paxDB.c:14: error: conflicting types for ‘getCount’
paxDB.c:6: error: previous definition of ‘getCount’ was here

Error! C doesn’t allow function overloading, that is, two functions with the same name but taking different parameters.

This makes sense if you think about the linker’s job: As we saw, the linker has to take a machine instruction like “call _getCount” and replace the name with the address of the function. With two functions both named “_getCount”, it would have no way of knowing which one was meant.

Instead of trying to solve this problem in C, let’s start looking at C++. We’ll compile and link C++ code by calling g++ instead of gcc:

$ mv paxDB.c paxDB.cpp
$ g++ -c -Wall -Werror paxDB.cpp
$ mv paxCount.c paxCount.cpp
$ g++ -c -Wall -Werror paxCount.cpp
$ g++ -Wall -Werror -o paxCount paxCount.o paxDB.o
$ ./paxCount 1 ; echo $?
15

Now let’s investigate how C++ goes about supporting function overloading:

$ nm paxDB.o
0000000000000000 T __Z8getCountPci
0000000000000058 T __Z8getCountii
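0000000000000098 d __ZL7flights
$ nm paxCount.o
                 U __Z8getCountPci
0000000000000000 T _main

To tell the two getCount functions apart, the C++ compiler encodes each function’s parameter types into its symbol name. This is called name mangling.

The exact format of the name mangling isn’t defined by the C++ standard so it depends on your compiler. Here the two functions are distinguished by the suffixes “Pci” (pointer to char, int) and “ii” (int, int).

Let’s look at the assembler code for the main function:

$ g++ -S -Wall -Werror paxCount.cpp
$ cat paxCount.s
_main: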
LFB2:
pushq %rbp
LCFI0:
movq %rsp, %rbp
LCFI1:
subq $32, %rsp
LCFI2:
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
cmpl $1, -4(%rbp)
jle L2
movq -16(%rbp), %rax
addq $8, %rax
movq (%rax), %rdi
movl $-1, %esi
call __Z8getCountPci
movl %eax, -20(%rbp)
jmp L4
L2:
movl $0, -20(%rbp)
L4:
movl -20(%rbp), %eax
leave
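ret

The compiler generated assembler code using the mangled names. The linker doesn’t know anything about the compiler’s name-mangling scheme; all the linker knows is that there is a call to “__Z8getCountPci”, and that paxDB.o defines a symbol with exactly that name.

The tool c++filt converts mangled names back into readable form:

$ nm paxDB.o | c++filt
0000000000000000 T getCount(char*, int)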
0000000000000058 T getCount(int, int)
0000000000000098 d flights
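
The same conversion is available from inside a program. The sketch below assumes GCC or Clang, whose C++ runtime provides abi::__cxa_demangle in the header cxxabi.h; other compilers use other mangling schemes and other interfaces. The wrapper name demangle is our own. Note that it expects names with a single leading underscore, as nm shows them on Linux; Mac OS X adds one more underscore in front of every symbol.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the readable form of a mangled symbol name, or the input
// unchanged if the runtime cannot demangle it.
// (demangle is a made-up wrapper name; abi::__cxa_demangle is the
// GCC/Clang runtime function that does the work.)
std::string demangle(const char* mangled)
{
    int status = 0;
    char* readable = abi::__cxa_demangle(mangled, NULL, NULL, &status);
    if (status != 0 || readable == NULL)
        return mangled;
    std::string result(readable);
    std::free(readable);  // the buffer was allocated with malloc
    return result;
}
```

For example, demangle("_Z8getCountPci") yields “getCount(char*, int)”, matching the c++filt output.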

Linking C++ code with C libraries

Suppose that paxDB is a C library, compiled with a C compiler, whose source code we do not control:

paxDB.h

int getCount(char* flightNumber, int deflt);

paxDB.c

#include "paxDB.h"
static int flights[] = { 20, 15, 0 };
int getCount(char* flightNumber, int deflt)
{
if (flightNumber[0] == '0') return flights[0];
if (flightNumber[0] == '1') return flights[1];
if (flightNumber[0] == '2') return flights[2];
return deflt;
}

However, we want to write our own program in C++:

paxCount.cpp

#include "paxDB.h"
int main(int argc, char** argv)
{
if (argc > 1)
return getCount(argv[1], -1);
return 0;
}

What happens if we try to link these as they are?

$ g++ -c -Wall -Werror paxCount.cpp
$ g++ -Wall -Werror -o paxCount paxCount.o paxDB.o
Undefined symbols:
"getCount(char*, int)", referenced from:
_main in paxCount.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
$ nm paxDB.o
0000000000000058 d _flights
0000000000000000 T _getCount
$ nm paxCount.o
                 U __Z8getCountPci
0000000000000000 T _main

It should be clear, from the above mismatch, why this didn’t work: paxCount.o asks for the mangled C++ name, while paxDB.o defines the plain C name. We can fix this situation by declaring that getCount is a C function, using “extern "C"”:

paxCount.cpp

extern "C" {
#include "paxDB.h"
}
int main(int argc, char** argv)
{
if (argc > 1)
return getCount(argv[1], -1);
return 0;
}

$ g++ -E -Wall -Werror paxCount.cpp
# 1 "paxCount.cpp"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "paxCount.cpp"
extern "C" {
# 1 "paxDB.h" 1
int getCount(char* flightNumber, int deflt);
# 3 "paxCount.cpp" 2
}
int main(int argc, char** argv)
{
if (argc > 1)
return getCount(argv[1], -1);
return 0;
}

Now the linking will succeed:

$ g++ -c -Wall -Werror paxCount.cpp
$ nm paxCount.o
                 U _getCount
0000000000000000 T _main
$ g++ -Wall -Werror -o paxCount paxCount.o paxDB.o
$ ./paxCount 1 ; echo $?
15
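
When we do control the header, a common convention is to put the extern "C" wrapper inside the header itself, guarded by the __cplusplus macro that every C++ compiler predefines. C programs then see a plain declaration, and C++ programs see one with C linkage. The sketch below shows the header text followed by a stand-in definition, included only so that the example links on its own; in the scenario above the definition would come from the C library.

```cpp
/* paxDB.h, written so that both C and C++ programs can include it. */
#ifdef __cplusplus
extern "C" {
#endif

int getCount(char* flightNumber, int deflt);

#ifdef __cplusplus
}
#endif

/* Stand-in for the C library's definition (an assumption of this sketch,
   so that it is self-contained). It is given C linkage too, so its
   symbol name is not mangled. */
static int flights[] = { 20, 15, 0 };

extern "C" int getCount(char* flightNumber, int deflt)
{
    if (flightNumber[0] == '0') return flights[0];
    if (flightNumber[0] == '1') return flights[1];
    if (flightNumber[0] == '2') return flights[2];
    return deflt;
}
```

With a header like this, paxCount.cpp can go back to a plain #include "paxDB.h".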

Namespaces

Let’s forget about the previous chapter, and assume that once again we have control over the source code of paxDB, and that we compile it as C++.

In this chapter we are going to write a cargo database, and a new program that retrieves either the passenger information, or the cargo information, based on a command-line flag. It’s all much simpler than it sounds because our cargo database only stores the number of containers on each flight, and, like the passenger database, it offers a single function, getCount:

cargoDB.h

int getCount(char* flightNumber, int deflt);

cargoDB.cpp

#include "cargoDB.h"
static int flights[] = { 0, 8, 9 };
int getCount(char* flightNumber, int deflt)
{
if (flightNumber[0] == '0') return flights[0];
if (flightNumber[0] == '1') return flights[1];
if (flightNumber[0] == '2') return flights[2];
return deflt;
}

As you surely noticed, the interface and implementation of the cargo database are identical to the passenger database. Only the data is different (flights 0, 1 and 2 have 0, 8 and 9 cargo containers, compared to 20, 15 and 0 passengers, respectively).

However, since we chose the same name for our interface (getCount) we won’t be able to use the cargo database together with the passenger database; both versions of getCount have the same signature, and therefore the same mangled name:

$ g++ -c -Wall -Werror paxDB.cpp
$ g++ -c -Wall -Werror cargoDB.cpp
$ g++ -c -Wall -Werror paxCount.cpp
$ g++ -Wall -Werror -o paxCount paxCount.o paxDB.o cargoDB.o
ld: duplicate symbol getCount(char*, int)in cargoDB.o and paxDB.o
collect2: ld returned 1 exit status

This error is due to the one definition rule: “Every program shall contain exactly one definition of every non-inline function or object that is used in that program” (quoted from the C++ standard, section 3.2).

Namespaces to the rescue! We can group related functions and data in a namespace, and we can disambiguate between the different getCount functions by prefixing each with the name of its namespace:

paxDB.h

namespace pax
{
int getCount(char* flightNumber, int deflt);
}

paxDB.cpp

#include "paxDB.h"
static int flights[] = { 20, 15, 0 };
int pax::getCount(char* flightNumber, int deflt)
{
if (flightNumber[0] == '0') return flights[0];
if (flightNumber[0] == '1') return flights[1];
if (flightNumber[0] == '2') return flights[2];
return deflt;
}

cargoDB.h

namespace cargo
{
int getCount(char* flightNumber, int deflt);
}

cargoDB.cpp

#include "cargoDB.h"
static int flights[] = { 0, 8, 9 };
int cargo::getCount(char* flightNumber, int deflt)
{
if (flightNumber[0] == '0') return flights[0];
if (flightNumber[0] == '1') return flights[1];
if (flightNumber[0] == '2') return flights[2];
return deflt;
}

See below how the C++ compiler incorporates the namespace names into its name mangling. Note also that the two different flights arrays end up with the same symbol name; this is harmless because both have internal linkage (type “d”).

$ g++ -c -Wall -Werror paxDB.cpp
$ nm paxDB.o
0000000000000058 d __ZL7flights
0000000000000000 T __ZN3pax8getCountEPci
$ g++ -c -Wall -Werror cargoDB.cpp
$ nm cargoDB.o
0000000000000058 d __ZL7flights
0000000000000000 T __ZN5cargo8getCountEPci

Now we can write our new program, flightInfo:

flightInfo.cpp

#include "cargoDB.h"
#include "paxDB.h"
int main(int argc, char** argv)
{
if (argc > 2)
{
if (*argv[1] == 'c') return cargo::getCount(argv[2], -1);
if (*argv[1] == 'p') return pax::getCount(argv[2], -1);
}
return 0;
}

$ g++ -Wall -Werror -o flightInfo flightInfo.cpp cargoDB.o paxDB.o
$ ./flightInfo c 1 ; echo $?
8
$ ./flightInfo p 1 ; echo $?
15

Unnamed namespaces

We mentioned in chapter 6 that C++ deprecated usage of the “static” keyword for giving internal linkage to objects at namespace scope. (C++11 later reversed this deprecation.) Deprecated features are not removed altogether, so our programs still behave correctly. But can we achieve the same effect without using deprecated features?

We could try something like this:

paxDB.cpp

#include "paxDB.h"
namespace nobodywilleverguessthisname
{
int flights[] = { 20, 15, 0 };
}
int pax::getCount(char* flightNumber, int deflt)
{
if (flightNumber[0] == '0') return nobodywilleverguessthisname::flights[0];
if (flightNumber[0] == '1') return nobodywilleverguessthisname::flights[1];
if (flightNumber[0] == '2') return nobodywilleverguessthisname::flights[2];
return deflt;
} We can make usage of paxDB.cpp#include "paxDB.h"
namespace nobodywilleverguessthisname
{
int flights[] = { 20, 15, 0 };
}
int pax::getCount(char* flightNumber, int deflt)
{
using nobodywilleverguessthisname::flights;
if (flightNumber[0] == '0') return flights[0];
if (flightNumber[0] == '1') return flights[1];
if (flightNumber[0] == '2') return flights[2];
return deflt;
}

This simply allows flights to be used within getCount without the namespace prefix.

If the namespace had multiple entities inside it, instead of separate using declarations for each entity, we could make the entire namespace’s contents visible with a using directive:

paxDB.cpp

#include "paxDB.h"
namespace nobodywilleverguessthisname
{
int flights[] = { 20, 15, 0 };
}
int pax::getCount(char* flightNumber, int deflt)
{
using namespace nobodywilleverguessthisname;
if (flightNumber[0] == '0') return flights[0];
if (flightNumber[0] == '1') return flights[1];
if (flightNumber[0] == '2') return flights[2];
return deflt;
}

$ g++ -S -Wall -Werror paxDB.cpp
$ cat paxDB.s
__ZN3pax8getCountEPci:
LFB2:
pushq %rbp
LCFI0:
movq %rsp, %rbp
LCFI1:
movq %rdi, -8(%rbp)
movl %esi, -12(%rbp)
movq -8(%rbp), %rax
movzbl (%rax), %eax
cmpb $48, %al
jne L2
movl __ZN27nobodywilleverguessthisname7flightsE(%rip), %eax
movl %eax, -16(%rbp)
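jmp L4

Note that the compiler, thanks to the using directive, converted references to “flights” into references to “nobodywilleverguessthisname::flights”.

$ g++ -c -Wall -Werror paxDB.cpp
$ nm paxDB.o
0000000000000058 D __ZN27nobodywilleverguessthisname7flightsE
0000000000000000 T __ZN3pax8getCountEPci

Note that flights once again has external linkage (type “D”), so code in other files can still reach it if it knows the namespace name.

The alternative recommended by the C++ standard is the unnamed (or anonymous) namespace:

paxDB.cpp

#include "paxDB.h"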
namespace
{
int flights[] = { 20, 15, 0 };
}
int pax::getCount(char* flightNumber, int deflt)
{
if (flightNumber[0] == '0') return flights[0];
if (flightNumber[0] == '1') return flights[1];
if (flightNumber[0] == '2') return flights[2];
return deflt;
}

cargoDB.cpp

#include "cargoDB.h"
namespace
{
int flights[] = { 0, 8, 9 };
}
int cargo::getCount(char* flightNumber, int deflt)
{
if (flightNumber[0] == '0') return flights[0];
if (flightNumber[0] == '1') return flights[1];
if (flightNumber[0] == '2') return flights[2];
return deflt;
}

Unnamed namespaces have an implicit using directive placed at the translation unit’s global scope. Depending on your compiler implementation, names inside an unnamed namespace will be given internal linkage; or the compiler will generate a random namespace name, guaranteed to be unique.

$ g++ -c -Wall -Werror paxDB.cpp
$ nm paxDB.o
0000000000000058 d __ZN12_GLOBAL__N_17flightsE
0000000000000000 T __ZN3pax8getCountEPci
$ g++ -c -Wall -Werror cargoDB.cpp
$ nm cargoDB.o
0000000000000058 d __ZN12_GLOBAL__N_17flightsE
0000000000000000 T __ZN5cargo8getCountEPci

It seems our compiler chooses the internal linkage method, with the same generated name for both unnamed namespaces.

$ nm paxDB.o | c++filt
0000000000000058 d (anonymous namespace)::flights
0000000000000000 T pax::getCount(char*, int)

For the record, the compiler I used to generate this material is:

$ g++ --version
GCC 4.2.1 (Apple Inc. build 5664)
Copyright (C) 2007 Free Software Foundation, Inc.
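
A word of caution about unnamed namespaces in header files: every source file that includes the header gets its own, separate copy of the names inside. In the following example, counter.cpp and main.cpp each have their own counter, so main returns 0 rather than 1:

counter.h

namespace {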
int counter = 0;
}
void count();

counter.cpp

#include "counter.h"
void count() { ++counter; }

main.cpp

#include "counter.h"
int main()
{
count();
return counter;
}

Include guards

Include guards are placed around the contents of a header file to prevent the contents being seen twice by the compiler:

paxDB.h

#ifndef PAX_DB_H
#define PAX_DB_H
namespace pax
{
int getCount(char* flightNumber, int deflt);
}
#endif

These prevent the preprocessor from outputting the contents between “#ifndef” and “#endif” more than once in the same translation unit. This is more likely to happen on large codebases, where a header file may be included indirectly, by way of several other header files.

Include guards do not affect the inclusion into separate translation units, so they won’t help if you are seeing duplicate symbol errors at link time.

Static libraries

Multiple object files can be packaged together into a single archive called a static library. The tool for this is ar:

$ ar -r libFlightDBs.a paxDB.o cargoDB.o
$ nm libFlightDBs.a | c++filt
libFlightDBs.a(paxDB.o):
0000000000000058 d (anonymous namespace)::flights
0000000000000000 T pax::getCount(char*, int)
libFlightDBs.a(cargoDB.o):
0000000000000058 d (anonymous namespace)::flights
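0000000000000000 T cargo::getCount(char*, int)

As a library supplier, you would deliver the archive file together with the relevant header files (paxDB.h and cargoDB.h).

The linker will look inside archive files specified with the “-l” flag (searching the directories given with “-L”), and copy into the executable only those object files that the program actually uses. (With the GNU linker on Linux, put “-lFlightDBs” after paxCount.cpp, because archives are searched in command-line order.)

$ g++ -Wall -Werror -o paxCount -L. -lFlightDBs paxCount.cpp
$ nm paxCount | c++filt
0000000100001068 d (anonymous namespace)::flights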
0000000100000e4c T pax::getCount(char*, int)
0000000100000ea4 T _main

Shared libraries

When a library is used by many different programs (think, for example, of the C POSIX library), copying the used functions into each executable program is an inefficient use of disk and memory.

Functions in shared libraries aren’t linked into an executable program directly; instead, the linker generates code that, at run time, will look up the address of the shared library’s symbols. The run-time overhead is minimal (only one extra jump, via a jump table containing the addresses of all shared library symbols used by the program).

At run time, only one copy of the shared library needs to be loaded in memory, regardless of how many different programs are using it. Another advantage is that a shared library can be upgraded independently of the programs that use it (as long as the library’s interface hasn’t changed).

To generate a shared library, the object files must be compiled with the “-fPIC” flag, which requests position-independent code:

$ g++ -c -fPIC -Wall -Werror paxDB.cpp
$ g++ -c -fPIC -Wall -Werror cargoDB.cpp

To build the shared library, we use the “-shared” flag:

$ g++ -shared -fPIC -o libFlightDBs.so paxDB.o cargoDB.o
$ nm libFlightDBs.so | c++filt
0000000000001014 d (anonymous namespace)::flights
0000000000001008 d (anonymous namespace)::flights
0000000000000e50 T pax::getCount(char*, int)
0000000000000ea8 T cargo::getCount(char*, int)
0000000000000000 t __mh_dylib_header
                 U dyld_stub_binder

After we compile a program that uses the shared library, pax::getCount remains undefined in the executable:

$ g++ -fPIC -Wall -Werror -o paxCount -L. -lFlightDBs paxCount.cpp
$ nm paxCount | c++filt
                 U pax::getCount(char*, int)
0000000100000ee4 T _main

When we execute the program, the OS first invokes the dynamic linker (or loader) which loads the required shared libraries. The dynamic loader searches for libraries in standard locations like ‘/usr/lib’, as well as (on Linux) the directories specified by the environment variable LD_LIBRARY_PATH (DYLD_LIBRARY_PATH on Mac OS X):

$ DYLD_LIBRARY_PATH=. ./paxCount 1 ; echo $?
15

On Linux, ldd lists the shared libraries that a program needs; on Mac OS X, “otool -L” does the same:

$ otool -L paxCount
paxCount:
libFlightDBs.so (compatibility version 0.0.0, current version 0.0.0)
/usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.9.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.2.1)

Makefiles

No discussion of C++ compilation would be complete without mentioning make.

A makefile contains a set of rules. Each rule specifies a target (or multiple targets), prerequisites, and a recipe (a shell command) for generating the target from its prerequisites. Each recipe line must begin with a tab character. For example:

makefile

paxCount: paxCount.cpp paxDB.h libFlightDBs.so
	g++ -fPIC -Wall -Werror -o paxCount -L. -lFlightDBs paxCount.cpp
libFlightDBs.so: paxDB.o cargoDB.o
	g++ -shared -fPIC -o $@ $^
paxDB.o cargoDB.o: %.o: %.cpp %.h
	g++ -c -fPIC -Wall -Werror $<
clean:
	rm paxCount libFlightDBs.so paxDB.o cargoDB.o

The first rule specifies how to build the paxCount program.

The second rule specifies how to build the shared library. It uses the automatic variables “$@” (the target) and “$^” (all of the prerequisites).

The third rule is a static pattern rule. Its effect is the same as specifying separate rules for each of the object files: a rule with target paxDB.o and prerequisites paxDB.cpp and paxDB.h, and a similar rule for cargoDB.o. In the recipe, “$<” stands for the first prerequisite.

The final rule specifies how to remove all generated files. It has no prerequisites so it will be run whenever you specify the target name (“make clean”).

If we type “make paxCount”, make first builds everything that paxCount depends on:

$ make paxCount
g++ -c -fPIC -Wall -Werror paxDB.cpp
g++ -c -fPIC -Wall -Werror cargoDB.cpp
g++ -shared -fPIC -o libFlightDBs.so paxDB.o cargoDB.o
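g++ -fPIC -Wall -Werror -o paxCount -L. -lFlightDBs paxCount.cpp

If we run make again, it finds nothing to do. If we then modify one source file, only the targets that depend on that file are rebuilt:

$ make paxCount
make[1]: `paxCount' is up to date.
$ touch paxDB.cpp
$ make paxCount
g++ -c -fPIC -Wall -Werror paxDB.cpp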
g++ -shared -fPIC -o libFlightDBs.so paxDB.o cargoDB.o
g++ -fPIC -Wall -Werror -o paxCount -L. -lFlightDBs paxCount.cpp

In large projects, tracking the prerequisites of a source file by hand is tedious and error-prone. The compiler can list them for us with the “-M” flag:

$ g++ -M paxDB.cpp
paxDB.o: paxDB.cpp paxDB.h

(Integrating this output with the project’s makefiles is beyond the scope of this tutorial; see “Generating Prerequisites Automatically” in the GNU make manual.)