Decentralized modules declarations in C using ELF sections

17 Aug 2016 by David Corvoysier

In modular programming, a standard practice is to define common interfaces allowing the same type of operation to be performed on a set of otherwise independent modules.
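To implement this pattern, two mechanisms are required: instantiation, which lets each module provide its own implementation of the interface, and registration, which lets client code find the instances provided by the modules.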
Instantiation is typically supported natively in high-level languages. Registration is more difficult and usually requires writing specific code or relying on external frameworks. Let’s see how these two mechanisms can be implemented for C programs.
Interface instantiation

In C programs, interface instantiation is implemented using function pointers: basically, the common interface is specified using a struct whose members are the functions that need to be implemented.
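As an illustration, here is a minimal sketch of such an interface and of one module instantiating it. All names (`struct module_interface`, `doubler_module`, `run_module`) are hypothetical and chosen for this example only; they do not come from the original article.

```c
/* Hypothetical common interface: each module fills in these members. */
struct module_interface {
    const char *name;
    int (*init)(void);
    int (*process)(int input);
};

/* A hypothetical module implementing the interface. */
static int doubler_init(void) { return 0; }
static int doubler_process(int input) { return 2 * input; }

/* The interface instance: a struct populated with the module's functions. */
const struct module_interface doubler_module = {
    .name = "doubler",
    .init = doubler_init,
    .process = doubler_process,
};

/* Client code depends only on the interface, never on a specific module. */
int run_module(const struct module_interface *m, int input)
{
    if (m->init() != 0)
        return -1;
    return m->process(input);
}
```

Calling `run_module(&doubler_module, 21)` goes through the function pointers, so the client never needs to know which module it is talking to.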
Interface registration

The goal here is to allow client code to ‘find’ the interface instances provided by the modules. The first question we need to address is whether we register interfaces statically at design time or dynamically at runtime. Some systems like Linux provide mechanisms for special ‘constructor’ functions to be called at program initialization. We could take advantage of that feature to allow each module to register its interfaces: see a full example here. In this article, I assume that we are on a system without such a capability, and that we can consequently only rely on static registration.
A first solution for static registration of modules is to give the client code direct access to the interface instances, by exposing them in public headers.
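A sketch of this first approach, with the contents of several hypothetical files (`foo.h`, `bar.h`, `foo.c`, `bar.c`, `client.c`) collapsed into one listing so that it compiles on its own. The file, struct and function names are assumptions made for the example.

```c
/* --- shared header (hypothetical): the common interface --- */
struct module_interface {
    const char *name;
    int (*process)(int input);
};

/* --- foo.h / bar.h (hypothetical): each module exposes its instance --- */
extern const struct module_interface foo_module;
extern const struct module_interface bar_module;

/* --- foo.c --- */
static int foo_process(int input) { return input + 1; }
const struct module_interface foo_module = { "foo", foo_process };

/* --- bar.c --- */
static int bar_process(int input) { return input * 2; }
const struct module_interface bar_module = { "bar", bar_process };

/* --- client.c: has to name every module explicitly --- */
int process_all(int input)
{
    return foo_module.process(input) + bar_module.process(input);
}
```

Note how `process_all` refers to `foo_module` and `bar_module` by name: adding a third module means editing the client.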
This works, but it is not quite satisfactory: as more modules are added to the program, the client code needs to be modified. A better solution would be to store the instances anonymously in a static array:
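A sketch of the static array variant, again with hypothetical names and with the modules defined in the same listing to keep it self-contained (in a real program they would live in their own files and be declared `extern` here).

```c
#include <stddef.h>

struct module_interface {
    const char *name;
    int (*process)(int input);
};

/* Hypothetical modules; normally defined in separate compilation units. */
static int foo_process(int input) { return input + 1; }
static const struct module_interface foo_module = { "foo", foo_process };

static int bar_process(int input) { return input * 2; }
static const struct module_interface bar_module = { "bar", bar_process };

/* The central table: the only place that lists the modules. */
static const struct module_interface *const modules[] = {
    &foo_module,
    &bar_module,
};

#define MODULES_COUNT (sizeof(modules) / sizeof(modules[0]))

/* Client code iterates anonymously over whatever the table contains. */
int process_all(int input)
{
    int sum = 0;
    for (size_t i = 0; i < MODULES_COUNT; i++)
        sum += modules[i]->process(input);
    return sum;
}
```

The client loop no longer changes when a module is added; only the table does.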
This is quite neat, as we only need to modify the file containing the array when a module is added.

This could be even better though: what if we could add modules without editing any other files?

Taking advantage of ELF sections to create decentralized module tables

The only reason why we need to edit that file is to add the new instance to the array. The array in itself is just a bunch of pointers written one after the other in a contiguous memory space: what if we could find a way to populate it directly from the modules themselves? This cannot be achieved by either the preprocessor or the compiler, as they process compilation units atomically (when a file is processed, the compiler has no knowledge of the other files it has compiled or will compile in the future). The linker, however, knows all the symbols declared in the program, and is even capable of grouping them according to section definitions, as long as we specify them in a custom linker script. We can take advantage of that to make sure that all references to the interface instances are stored in the same section, and define the modules array as being the start address of the section.
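With GNU ld, such a section definition could look like the fragment below. It is only a sketch: the section name `.modules` and the boundary symbols `__modules_start` and `__modules_end` are hypothetical names picked for this example, and the fragment is meant to be placed inside the `SECTIONS` command of an existing linker script.

```ld
/* Fragment for the SECTIONS command of the linker script. */
.modules :
{
    __modules_start = .;   /* address of the first stored pointer */
    *(.modules)            /* gather .modules input sections from all objects */
    __modules_end = .;     /* address just past the last stored pointer */
}
```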
What we do here is extend the generic linker script with a new section that gathers the references to the interface instances.
The modules have to be slightly modified, to make sure they assign their interfaces to the new section:
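With GCC or Clang, one way to do this is the `section` variable attribute. The sketch below assumes a hypothetical section named `.modules` and hypothetical module names; the `used` attribute is added so that the compiler does not discard the pointer, which nothing references by name.

```c
struct module_interface {
    const char *name;
    int (*process)(int input);
};

static int foo_process(int input) { return input + 1; }

static const struct module_interface foo_module = { "foo", foo_process };

/* Store a pointer to the instance in the (hypothetical) .modules section
 * instead of the default data section. */
static const struct module_interface *const foo_entry
    __attribute__((used, section(".modules"))) = &foo_module;
```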
The syntax is quite ugly, so you would probably hide it inside a preprocessor macro:
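A possible shape for such a macro, assuming GCC or Clang. `REGISTER_MODULE` and the `.modules` section name are hypothetical; the macro simply generates the pointer declaration with the attributes attached.

```c
struct module_interface {
    const char *name;
    int (*process)(int input);
};

/* Hypothetical helper: declares <instance>_entry, a pointer to the
 * instance stored in the .modules section. */
#define REGISTER_MODULE(instance)                                   \
    static const struct module_interface *const instance##_entry    \
        __attribute__((used, section(".modules"))) = &(instance)

/* A module then registers itself with a single line. */
static int foo_process(int input) { return input + 1; }
static const struct module_interface foo_module = { "foo", foo_process };
REGISTER_MODULE(foo_module);
```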
Now we just have to access the global array from the client code using the variables defining its boundaries:
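The sketch below shows the client-side loop. To keep it runnable without a custom linker script, it swaps in a different source for the boundary symbols: GNU ld automatically defines `__start_<name>` and `__stop_<name>` for any section whose name is a valid C identifier, so the section is called `modules` (no leading dot) here. With a custom script, the loop would use the symbols defined in that script instead. All other names are hypothetical.

```c
struct module_interface {
    const char *name;
    int (*process)(int input);
};

#define REGISTER_MODULE(instance)                                   \
    static const struct module_interface *const instance##_entry    \
        __attribute__((used, section("modules"))) = &(instance)

/* Two hypothetical modules registering themselves. */
static int foo_process(int input) { return input + 1; }
static const struct module_interface foo_module = { "foo", foo_process };
REGISTER_MODULE(foo_module);

static int bar_process(int input) { return input * 2; }
static const struct module_interface bar_module = { "bar", bar_process };
REGISTER_MODULE(bar_module);

/* Section boundaries, provided by the linker rather than by C code. */
extern const struct module_interface *const __start_modules[];
extern const struct module_interface *const __stop_modules[];

/* Walk the table from its start address up to its end address. */
int process_all(int input)
{
    int sum = 0;
    const struct module_interface *const *entry;

    for (entry = __start_modules; entry < __stop_modules; entry++)
        sum += (*entry)->process(input);
    return sum;
}
```

The order of the entries depends on link order, so client code should not rely on it.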
What we have now is a modules framework that can be extended without modifying its core. Since module registration is static, this is very efficient in terms of both RAM and CPU consumption.

Pitfalls with interface sections

There are a few things that you need to be aware of when using this framework. First, you need to make sure that the linker aligns the modules in the same way the compiler would: otherwise, when going through the table, you may be shifted and access the wrong data. This is usually taken care of by enforcing alignment in the linker script:
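A sketch of what this could look like with GNU ld, using the `ALIGN` builtin. The section and symbol names are hypothetical, and the value 8 assumes 8-byte pointers; a 32-bit target would typically use 4.

```ld
.modules :
{
    . = ALIGN(8);          /* assumption: table entries are 8-byte pointers */
    __modules_start = .;
    *(.modules)
    __modules_end = .;
}
```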
Second, depending on your link configuration, your modules section may be optimized out, as the linker has no way of knowing that it is actually used. The workaround is to explicitly tell the linker that it should keep these symbols:
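With GNU ld, this is what the `KEEP` keyword is for: it marks input sections that must survive link-time garbage collection, which is typically enabled with `--gc-sections`. A sketch, with hypothetical section and symbol names:

```ld
.modules :
{
    __modules_start = .;
    KEEP(*(.modules))      /* never discard these input sections */
    __modules_end = .;
}
```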
Last, if some of your modules are distributed as static libraries, the linker may also optimize out the corresponding symbols when linking the whole binary. The workaround in that case is to prevent this optimization with the appropriate linker option.