Virtual functions are at the heart of object-oriented programming and runtime polymorphism in C++. Countless programmers rely on them for creating and operating intuitively on large class hierarchies. They are a vital part of the language. But how are they actually implemented by the compiler?

How they are implemented is a common C++ question. The usual answer mentions a pointer to a table of functions. But what exactly does that table contain? Which parts of the implementation are done at compile time and which at runtime? In this article I’ll take a closer look at what happens behind the scenes when virtual functions are involved.

It’s important to note that the C++ Standard does not specify how virtual functions should be implemented, so it’s entirely up to each compiler how to solve it. For reference, at the time of writing I’m using the following compiler and architecture:

$ clang --version
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.5.0
Let’s begin with some background.

Polymorphism

C++ effectively supports three types of polymorphism:
Virtual functions allow for late binding of function calls based on object type. This comes into play when a derived object is addressed via a pointer or a reference to a base class. They effectively enable an inheritable common interface with potentially overridden implementations in the derived classes.

In order to support this late binding of function calls, the compiler needs to augment the qualifying objects with information that makes the function calls possible at runtime. To understand this augmentation, let’s first look at how class objects are represented in memory.

Object Memory Layout

Only nonstatic data members are part of an object. Member functions and static data members, despite being part of the class declaration, are “hosted” outside the object. The nonstatic data members are laid out in memory in the order of their declaration:
f will be represented in memory as:

0:  ---
   | p |
4:  ---
   | q |
8:  ---
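The layout can also be checked from inside the program itself. The following is a minimal sketch, assuming a class Foo with two int members p and q as in the diagram (the class body, the initial values and the helper names OffsetOfP, OffsetOfQ, OnlyDataMembers and CheckFooLayout are assumptions made for illustration):

```cpp
#include <cassert>
#include <cstddef>  // offsetof, std::size_t

// Assumed definition: two int members declared in the order p, q,
// matching the diagram above. The initial values are placeholders.
class Foo {
public:
    int p { 5 };
    int q { 7 };
};

// Offsets of the members from the start of the object. offsetof is well
// defined here because Foo is a standard-layout class.
std::size_t OffsetOfP() { return offsetof(Foo, p); }
std::size_t OffsetOfQ() { return offsetof(Foo, q); }

// True when the object consists of nothing but its two data members.
bool OnlyDataMembers() { return sizeof(Foo) == 2 * sizeof(int); }

// Call this from main() to run the checks.
void CheckFooLayout() {
    assert(OffsetOfP() == 0);
    assert(OffsetOfQ() == sizeof(int));
    assert(OnlyDataMembers());
}
```

On a platform with 4-byte ints this corresponds to the offsets 0, 4 and 8 in the diagram.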
We can confirm this with a debugger:
Indeed, it’s only the nonstatic data members that contribute to the object size. Well, that and any compiler augmentation that may go into it - potential padding of the nonstatic data members, as well as the virtual pointer. We can check for compiler-added padding by inspecting the object’s memory as before:

class Foo {
public:
char c[3] { 0, 0, 0}; // 3 bytes
int p { 5 }; // 4 bytes
};
Foo f;
Here the compiler has added 1 byte of padding after c so that p is aligned on a 4-byte boundary.

Base class nonstatic data members are contained directly in the derived class object:

class Base {
public:
int x { 3 };
};
class Derived : public Base {
public:
int p { 5 };
int q { 7 };
};
Derived d;
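As a rough cross-check that does not need a debugger, the sketch below measures where the Base part and the individual members sit inside a Derived object. The helper names (OffsetWithin, OffsetOfBase, BaseMembersComeFirst, CheckDerivedLayout) are hypothetical, and the result reflects what common compilers do rather than something the Standard guarantees:

```cpp
#include <cassert>
#include <cstddef>  // std::ptrdiff_t

class Base {
public:
    int x { 3 };
};

class Derived : public Base {
public:
    int p { 5 };
    int q { 7 };
};

// Byte distance from the start of `object` to `part`, where `part`
// must point into the same object.
std::ptrdiff_t OffsetWithin(const void* object, const void* part) {
    return static_cast<const char*>(part) - static_cast<const char*>(object);
}

// Offset of the Base subobject inside a Derived object.
std::ptrdiff_t OffsetOfBase() {
    Derived d;
    const Base* b = &d;  // implicit derived-to-base conversion
    return OffsetWithin(&d, b);
}

// True when x (from Base) precedes p, and p precedes q.
bool BaseMembersComeFirst() {
    Derived d;
    return OffsetWithin(&d, &d.x) < OffsetWithin(&d, &d.p) &&
           OffsetWithin(&d, &d.p) < OffsetWithin(&d, &d.q);
}

// Call this from main() to run the checks.
void CheckDerivedLayout() {
    assert(OffsetOfBase() == 0);
    assert(BaseMembersComeFirst());
}
```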
Again, we verify using a debugger:
Base class nonstatic data members are laid out in the derived object exactly as they are in the base class, including any padding:

class Base {
public:
char c[3] { 0, 0, 0 };
int x { 3 };
};
class Derived : public Base {
public:
char d;
};
Derived d;
Here we might expect that the size of a Derived object would be 8 bytes, the total size of the nonstatic data members in the two classes. However, Base has been padded with 1 byte after c so that x is aligned on a 4-byte boundary. This padding is carried over to the derived class. At this point the Derived object is 9 bytes, which the compiler pads with an additional 3 bytes so that the object size is a multiple of its 4-byte alignment. Hence the final size of 12 bytes, with effectively 4 bytes wasted due to alignment padding.

That may sound insignificant, but imagine that Derived was instead a Particle in a particle system. Imagine further that there were 500,000 particles active in this system; then we’d be wasting 500,000 × 4 bytes = 2 MB due to padding. 2 MB might not sound too bad either, but when you consider that the total memory usage in this case is 6 MB and you’re wasting a third of that on padding, you realise that these things add up quickly.

Of course there’s a good reason for the compiler adding this padding - performance. The CPU’s load and store operations perform best when working with its “natural data size”, which is a word.

Now, let’s see what happens when we add a virtual function:

class Foo {
public:
virtual ~Foo();
int p { 5 }; // 4 bytes
};
Foo f;
First let’s check the size of the object.
Interesting: 16 bytes, yet we only have a 4-byte data member. This implies that the compiler has augmented our object. At this point we can guess with what: a virtual pointer, due to the virtual function being present, and padding. Let’s have a look at the object:

(lldb) x/16b &f
0x7fff5fbffb48: 0x30 0x10 0x00 0x00 0x01 0x00 0x00 0x00
                |                                     |
                .......................................
                            virtual pointer
0x7fff5fbffb50: 0x05 0x00 0x00 0x00 0xff 0x7f 0x00 0x00
                |                 | |                 |
                ................... ...................
                         p                padding
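The cost of the virtual pointer can also be measured directly by comparing two classes that differ only in the virtual destructor. This is a small sketch; the class names Plain and WithVirtual and the helpers VirtualOverheadBytes and CheckVirtualOverhead are hypothetical and not part of the example above:

```cpp
#include <cassert>
#include <cstddef>      // std::size_t
#include <type_traits>  // std::is_polymorphic

// Hypothetical pair of classes that differ only in the virtual destructor.
class Plain {
public:
    int p { 5 };
};

class WithVirtual {
public:
    virtual ~WithVirtual() {}
    int p { 5 };
};

// Extra bytes per object caused by the virtual destructor:
// the virtual pointer plus any alignment padding.
std::size_t VirtualOverheadBytes() {
    return sizeof(WithVirtual) - sizeof(Plain);
}

// Call this from main() to run the checks.
void CheckVirtualOverhead() {
    static_assert(std::is_polymorphic<WithVirtual>::value, "has a virtual function");
    static_assert(!std::is_polymorphic<Plain>::value, "no virtual functions");
    // The overhead is at least one pointer; padding may add more.
    assert(VirtualOverheadBytes() >= sizeof(void*));
}
```

With the 16-byte object shown in the dump and a 4-byte Plain, the overhead would come out as 12 bytes: 8 for the virtual pointer and 4 for padding.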
This memory dump also highlights an important fact: the compiler has inserted the virtual pointer at the start of the object, ahead of the data members. Let’s take a closer look at the virtual pointer and virtual table layout.

Virtual Pointer and Virtual Table

As soon as a class either derives from a virtual base class or has virtual functions, either directly or through inheritance, the compiler will synthesize a pointer into the class object. This is the virtual pointer, which points to the virtual table of the class.
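The virtual table contains the following, as the diagrams further down show: the virtual base offset (for classes with virtual bases), the offset to the top of the object, a pointer to the RTTI information, and the virtual function pointers.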
The set of virtual functions you can invoke on an object is known at compile time and it’s invariant, meaning it can’t change during runtime. Thus the virtual table is set up during compilation. Each virtual function gets assigned a fixed position in the virtual table that remains the same throughout class inheritance. The compiler will transform a virtual function call through a pointer, ptr->function(), into an indirect call through the virtual table, roughly (*ptr->__vptr[n])(ptr), where n is the associated slot in the virtual table. Note how the pointer itself is passed as the first argument to the function; that corresponds to the this pointer.

The virtual table is ordered based on the function declaration order in the class. For example:

class Foo {
public:
virtual ~Foo();
virtual void SomeFunction();
int p { 5 };
int q { 5 };
};
Foo f;
As always a debugger is our friend:

(lldb) x/4w &f
0x7fff5fbffb68: 0x00001030 0x00000001 0x00000005 0x00000005
                |                   | |        | |        |
                ..................... .......... ..........
                   virtual pointer        p          q
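Let’s look at the virtual table associated with f:

Oh, interesting. There are two destructors in the virtual table. How come? It turns out that destructors come in pairs: one entry only destroys the object, while the other, the deleting destructor, also frees its memory. However, the virtual table holds more than function pointers. The virtual pointer points at the first function slot, and the offset to the top of the object and the RTTI pointer sit in front of it at negative indices:

f:
 --------           ---------------
| __vptr |----     | offset_to_top |  -2
 --------     |     ---------------
|   p    |    |    |     RTTI      |  -1
 --------     |     ---------------
|   q    |     --> |    ~Foo()     |   0
 --------           ---------------
                   | SomeFunction  |   1
                    ---------------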
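The slot-based call described above can be emulated by hand with a table of function pointers. This is only a sketch of the idea, not what any particular compiler generates; Object, Slot, kTable, GetP, DoubleP, CallSlot, Demo and CheckDispatch are all hypothetical names:

```cpp
#include <cassert>

struct Object;

// All slots share one signature to keep the sketch small. The object
// pointer is passed explicitly, playing the role of the hidden `this`.
using Slot = int (*)(Object* self);

struct Object {
    const Slot* vptr;  // hand-written stand-in for the virtual pointer
    int p;
};

int GetP(Object* self)    { return self->p; }
int DoubleP(Object* self) { return self->p * 2; }

// The "virtual table": a fixed slot order, set up at compile time.
const Slot kTable[] = { &GetP, &DoubleP };

// What `ptr->function()` turns into: look up slot n, then call it
// with the object pointer as the first argument.
int CallSlot(Object* ptr, int n) {
    return (*ptr->vptr[n])(ptr);
}

// Builds an object with p == 21 and dispatches through slot n.
int Demo(int n) {
    Object o { kTable, 21 };
    return CallSlot(&o, n);
}

// Call this from main() to run the checks.
void CheckDispatch() {
    assert(Demo(0) == 21);  // slot 0 -> GetP
    assert(Demo(1) == 42);  // slot 1 -> DoubleP
}
```

The call site only needs the slot index, which is why the position of each function in the table has to stay fixed.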
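Now that we have a good understanding of how objects are laid out in memory and what the virtual table looks like, let’s see how inheritance influences things.

Single Inheritance

With single inheritance, the derived class gets a virtual table of its own in which the entries inherited from the base class keep their slots. Simplified it looks like this: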
// Virtual table for d:
[ 0 ] - Derived::~Derived
[ 1 ] - Base::SomeFunction
[ 2 ] - Derived::AnotherFunction
// If 'd' had overridden SomeFunction() it would look like this:
[ 0 ] - Derived::~Derived
[ 1 ] - Derived::SomeFunction
[ 2 ] - Derived::AnotherFunction
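The two simplified tables can be observed with real classes. The sketch below assumes a Base with a virtual SomeFunction and a Derived that adds AnotherFunction, as the table entries suggest; the class Overriding and the helpers CallThroughBase, CallInherited, CallOverridden and CheckSlots are hypothetical additions for illustration:

```cpp
#include <cassert>
#include <string>

class Base {
public:
    virtual ~Base() {}
    virtual std::string SomeFunction() { return "Base::SomeFunction"; }
};

// Inherits SomeFunction unchanged and adds a new virtual function.
class Derived : public Base {
public:
    virtual std::string AnotherFunction() { return "Derived::AnotherFunction"; }
};

// Hypothetical variant that overrides SomeFunction.
class Overriding : public Base {
public:
    std::string SomeFunction() override { return "Overriding::SomeFunction"; }
};

// The call site only knows about Base; what the slot holds decides what runs.
std::string CallThroughBase(Base& b) { return b.SomeFunction(); }

std::string CallInherited()  { Derived d;    return CallThroughBase(d); }
std::string CallOverridden() { Overriding o; return CallThroughBase(o); }

// Call this from main() to run the checks.
void CheckSlots() {
    assert(CallInherited() == "Base::SomeFunction");
    assert(CallOverridden() == "Overriding::SomeFunction");
}
```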
This is pretty much as expected. It gets slightly more complicated with multiple inheritance.

Multiple Inheritance

Remember how with single inheritance the base data members are contained at the start of the derived object? This effectively means that under single inheritance the Base and Derived parts of the object start at the same address. This is not the case for subsequent base classes in multiple inheritance. And therein lies the complexity - multiple inheritance requires patching the location of the base object pointer and of the this pointer.

Let’s first consider the base object pointer patching. Let’s say we have a class hierarchy like this:
In memory a Derived object will be laid out like this:

Derived:
 ---------  0
| Base0   |
 ---------
| Base1   |
 ---------
| Derived |
 ---------  n
That means we can easily do a conversion from Derived to Base0 because the start of Derived and the start of Base0 point to the same address:
However, what happens if we want to assign a Derived object to a Base1 pointer, which is not at the same address? The compiler will add an offset:

// For this:
Derived* d = new Derived;
Base1* b = d;
// the compiler will transform the code to (conceptually, with the offset in bytes):
Base1* b = d + sizeof(Base0);
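The adjustment can be made visible by comparing addresses before and after the conversion. The hierarchy below is an assumed stand-in for the one discussed here (two polymorphic bases with one int each), and BaseOffset, RoundTripRestoresAddress and CheckPointerAdjustment are hypothetical helper names:

```cpp
#include <cassert>
#include <cstddef>  // std::ptrdiff_t

// Assumed hierarchy: two polymorphic bases and a class deriving from both.
class Base0 {
public:
    virtual ~Base0() {}
    int a { 1 };
};

class Base1 {
public:
    virtual ~Base1() {}
    int b { 2 };
};

class Derived : public Base0, public Base1 {
public:
    int c { 3 };
};

// Byte distance between a Derived object and one of its base subobjects.
template <typename BaseT>
std::ptrdiff_t BaseOffset() {
    Derived d;
    BaseT* base = &d;  // the compiler inserts any needed adjustment here
    return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(&d);
}

// Converting back must undo the adjustment and yield the original address.
bool RoundTripRestoresAddress() {
    Derived d;
    Base1* b = &d;
    return static_cast<Derived*>(b) == &d;
}

// Call this from main() to run the checks.
void CheckPointerAdjustment() {
    assert(BaseOffset<Base0>() == 0);  // first base shares the object's address
    assert(BaseOffset<Base1>() > 0);   // second base sits further in
    assert(RoundTripRestoresAddress());
}
```

The exact offset of Base1 depends on the compiler and target, so the sketch only checks that it is non-zero.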
A similar patching process also happens on function calls where a base virtual function is called via a pointer to a derived object. This is the patching of the this pointer. A thunk, a small piece of compiler-generated code, performs the adjustment before jumping to the real function:

// for a virtual function call needing pointer adjustment:
ptr->function();
// becomes:
ptr->__function_thunk(ptr);
__function_thunk:
ptr += offset;
function(ptr);
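The effect of the this adjustment can be observed without looking at the generated code. In this sketch a virtual function returns the address it sees as this; the hierarchy and the names Self, OverrideSeesDerivedAddress, Base1PointerIsAdjusted and CheckThisAdjustment are assumptions for illustration. How the adjustment is carried out (typically a thunk) is up to the compiler:

```cpp
#include <cassert>

class Base0 {
public:
    virtual ~Base0() {}
    int a { 1 };
};

class Base1 {
public:
    virtual ~Base1() {}
    // Returns the address the function body sees as `this`.
    virtual const void* Self() const { return this; }
    int b { 2 };
};

class Derived : public Base0, public Base1 {
public:
    // Reached through Base1's table, so `this` has to be adjusted back
    // to the start of the Derived object before this body runs.
    const void* Self() const override { return this; }
    int c { 3 };
};

// True when the Base1 pointer differs from the Derived address.
bool Base1PointerIsAdjusted() {
    Derived d;
    const Base1* b1 = &d;
    return static_cast<const void*>(b1) != static_cast<const void*>(&d);
}

// Calls Self() through a Base1 pointer and reports whether the override
// ran with `this` pointing at the start of the full Derived object.
bool OverrideSeesDerivedAddress() {
    Derived d;
    const Base1* b1 = &d;
    return b1->Self() == static_cast<const void*>(&d);
}

// Call this from main() to run the checks.
void CheckThisAdjustment() {
    assert(Base1PointerIsAdjusted());
    assert(OverrideSeesDerivedAddress());
}
```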
Both of these transformations happen at runtime because the type of object being addressed is not known at compile time.

Finally let’s look at what happens in terms of the virtual pointer and virtual table. The derived class object ends up with a virtual table for each base class that has one. This set is made up of a primary virtual table followed by one or more secondary virtual tables. It looks like this:

                        ----------------
                       | offset_to_top  |
Derived:                ----------------
 -------------         | Derived RTTI   |
| _vptr_Base0 |---      ----------------
 -------------    ---> | Base0 virtuals |
|     ...     |         ----------------
 -------------         |      ...       |
| _vptr_Base1 |---      ----------------
 -------------    |    | offset_to_top  |
|     ...     |   |     ----------------
 -------------    |    | Derived RTTI   |
                  |     ----------------
                   --> | Base1 virtuals |
                        ----------------
                       |      ...       |
                        ----------------
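That the object really carries one virtual pointer per polymorphic base can be estimated from its size. The sketch below uses classes without data members so that, on common ABIs, the size consists of virtual pointers only; the classes and the helpers VirtualPointerCount, OverrideViaBase0, OverrideViaBase1 and CheckTwoTables are hypothetical:

```cpp
#include <cassert>
#include <cstddef>  // std::size_t

class Base0 {
public:
    virtual ~Base0() {}
    virtual int Id0() const { return 0; }
};

class Base1 {
public:
    virtual ~Base1() {}
    virtual int Id1() const { return 1; }
};

class Derived : public Base0, public Base1 {
public:
    int Id0() const override { return 10; }
    int Id1() const override { return 11; }
};

// No data members, so the object size is made up of virtual pointers only
// (an assumption that holds on common ABIs, not a Standard guarantee).
std::size_t VirtualPointerCount() {
    return sizeof(Derived) / sizeof(void*);
}

// Each base pointer dispatches through its own table to the override.
int OverrideViaBase0() { Derived d; const Base0& b = d; return b.Id0(); }
int OverrideViaBase1() { Derived d; const Base1& b = d; return b.Id1(); }

// Call this from main() to run the checks.
void CheckTwoTables() {
    assert(VirtualPointerCount() == 2);
    assert(OverrideViaBase0() == 10);
    assert(OverrideViaBase1() == 11);
}
```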
The reason we end up with multiple virtual pointers and tables is to support the object address adjustment mentioned above. If we pass a pointer to a Derived object to a function taking a pointer to Base1, we pass in an address that has been adjusted to start at _vptr_Base1. Thus any virtual function calls will map into the correct slot in the virtual table. This is also the reason we end up with the same content in the virtual tables - for better runtime performance. If the entries weren’t duplicated, more runtime pointer adjustments would have to take place. With this setup we just need one adjustment, and then call into the virtual table as normal.

Finally, let’s take a look at virtual inheritance.

Virtual Inheritance

Let’s consider the simple case with only one virtual base class:
As usual, let’s check the size and memory layout:

(lldb) p sizeof(d)
(unsigned long) $0 = 16
(lldb) x/4w &d
0x7fff5fbffb68: 0x00001028 0x00000001 0x00000007 0x00000005
                |                   | |        | |        |
                ..................... .......... ..........
                   virtual pointer        q          p
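The ordering in the dump can be checked from code as well. The class definitions below are an assumption reconstructed from the dump (a Base holding p { 5 } and a Derived holding q { 7 } that inherits virtually), and OffsetWithin, OffsetOfOwnMember, OffsetOfVirtualBase and CheckVirtualBaseLayout are hypothetical helpers:

```cpp
#include <cassert>
#include <cstddef>  // std::ptrdiff_t

// Assumed definitions, reconstructed from the memory dump above.
class Base {
public:
    int p { 5 };
};

class Derived : virtual public Base {
public:
    int q { 7 };
};

// Byte distance from the start of `object` to `part`, where `part`
// must point into the same object.
std::ptrdiff_t OffsetWithin(const void* object, const void* part) {
    return static_cast<const char*>(part) - static_cast<const char*>(object);
}

// Offset of Derived's own member q.
std::ptrdiff_t OffsetOfOwnMember() {
    Derived d;
    return OffsetWithin(&d, &d.q);
}

// Offset of the virtual Base subobject.
std::ptrdiff_t OffsetOfVirtualBase() {
    Derived d;
    const Base* b = &d;  // conversion to a virtual base
    return OffsetWithin(&d, b);
}

// Call this from main() to run the checks.
void CheckVirtualBaseLayout() {
    // The virtual base comes after the derived class's own members.
    assert(OffsetOfVirtualBase() > OffsetOfOwnMember());
}
```

With the layout in the dump, q would sit at offset 8 and the Base subobject at offset 12.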
Oh, this is interesting. We immediately see that the virtual base nonstatic data members are laid out in memory after the derived class members. This is different from non-virtual inheritance, where the base class members came first. This means that our base and derived parts don’t start at the same address, like they do for non-virtual inheritance. The virtual base subobject is contained directly in Derived, which makes sense since there can only be one copy of a virtual base subobject. Let’s take a look at the virtual table:
Curious, what’s this VTT for Derived? During object construction the object takes the form of the class whose constructor is currently being executed. So during the Base constructor execution the Derived object we’re creating is of type Base. During this construction process the compiler needs to make sure that the virtual pointer points to the correct virtual table. This information is stored in the VTT, the virtual table table.

Finally let’s take a look at the classic diamond shaped inheritance graph:

class Root {
public:
virtual ~Root();
int a { 3 };
};
class Left : virtual public Root {
public:
virtual ~Left();
int b { 5 };
};
class Right : virtual public Root {
public:
virtual ~Right();
int c { 7 };
};
class Derived : public Left, public Right {
public:
virtual ~Derived();
int d { 9 };
};
Derived d;
Let’s inspect the memory layout:
Here we again see that there’s only one virtual base object and that it’s contained directly in the Derived object. What does the virtual table look like in this instance? It’s similar to the one for regular multiple inheritance, except we have another offset entry, this time to the virtual base subobject contained in Derived. The virtual table looks like this:

 -----------------
|  vbase_offset   |
 -----------------
|  offset_to_top  |
 -----------------
|      RTTI       |
 -----------------
|  Left entries   |
 -----------------
|       ...       |
 -----------------
|  vbase_offset   |
 -----------------
|  offset_to_top  |
 -----------------
|      RTTI       |
 -----------------
|  Right entries  |
 -----------------
|       ...       |
 -----------------
|  vbase_offset   |
 -----------------
|  offset_to_top  |
 -----------------
|      RTTI       |
 -----------------
| Derived entries |
 -----------------
|       ...       |
 -----------------
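That the diamond contains a single shared Root subobject can be confirmed from code. The sketch reuses the class names from the diamond example, with inline destructor bodies added so that it links on its own; RootViaLeft, RootViaRight, SingleSharedRoot, WriteViaLeftReadViaRight and CheckDiamond are hypothetical helpers:

```cpp
#include <cassert>

class Root {
public:
    virtual ~Root() {}
    int a { 3 };
};

class Left : virtual public Root {
public:
    virtual ~Left() {}
    int b { 5 };
};

class Right : virtual public Root {
public:
    virtual ~Right() {}
    int c { 7 };
};

class Derived : public Left, public Right {
public:
    virtual ~Derived() {}
    int d { 9 };
};

// Reaches the Root subobject by going through the Left part.
const Root* RootViaLeft(const Derived& d) { const Left* l = &d; return l; }

// Reaches the Root subobject by going through the Right part.
const Root* RootViaRight(const Derived& d) { const Right* r = &d; return r; }

// Both paths must end at the same Root subobject.
bool SingleSharedRoot() {
    Derived d;
    return RootViaLeft(d) == RootViaRight(d);
}

// A write through the Left path is visible through the Right path.
int WriteViaLeftReadViaRight(int value) {
    Derived d;
    Left& l = d;
    Right& r = d;
    l.a = value;
    return r.a;
}

// Call this from main() to run the checks.
void CheckDiamond() {
    assert(SingleSharedRoot());
    assert(WriteViaLeftReadViaRight(42) == 42);
}
```

Without the virtual keyword on the two inheritance edges, Derived would hold two separate Root subobjects and the conversion from Derived to Root would be ambiguous.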
That’s quite a lot of information. With this it’s also easy to see the potential memory overhead of supporting large inheritance graphs with virtual base classes, especially if there are a lot of virtual functions. Please note that for all the multiple inheritance examples in this article, if Derived had added any virtual functions of its own we would’ve gotten yet another virtual pointer and virtual table entries for that as well, as demonstrated by this last diagram. It’s handled in the same way as the other virtual pointers.

Summary

Virtual functions are at the heart of designing intuitive class hierarchy interfaces. The implementation support for them is quite intuitive and allows for good runtime performance, at the cost of some memory overhead. When designing these class hierarchies it’s worth considering the object layout to minimize wasting memory due to padding for alignment. While there is some runtime overhead for invoking virtual functions, don’t assume they are much more expensive than normal function calls without proper profiling.