On 9/15/2024 12:46 PM, Anton Ertl wrote:
Michael S <already5chosen@yahoo.com> writes:
Padding is another thing that should be Implementation Defined.
It is. It's defined in the ABI, so when the compiler documents to
follow some ABI, you automatically get that ABI's structure layout.
And if a compiler does not follow an ABI, it is practically useless.
Though, there also isn't a whole lot of freedom of choice here regarding layout.
If member ordering or padding differs from typical expectations, then any code which serializes structures to files is liable to break, and this practice isn't particularly uncommon.
Say, typical pattern:
Members are organized in the same order they appear in the source code;
If the current position is not a multiple of the member's alignment, it is padded to an offset that is a multiple of the member's alignment;
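For primitive types, the alignment is typically equal to the size, which is also a power of 2 (there are exceptions, e.g. the i386 System V ABI aligns an 8-byte double to 4 bytes inside a struct);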
If needed, the total size of the struct is padded to a multiple of the largest alignment of the struct members.
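
As a rough sketch of the rules above (the Record struct and all the names here are made up for illustration, not taken from any particular ABI document), one can apply the padding rules by hand and compare against what the compiler actually did:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical example struct; names and members are illustrative only.
struct Record {
    std::uint8_t  tag;
    std::uint32_t count;
    std::uint16_t flags;
    std::uint64_t stamp;
};

// Round pos up to the next multiple of align (align must be a power of 2).
constexpr std::size_t align_up(std::size_t pos, std::size_t align) {
    return (pos + align - 1) & ~(align - 1);
}

// Apply the rules by hand: members in source order, each one padded up to
// a multiple of its own alignment.
constexpr std::size_t kTag   = 0;
constexpr std::size_t kCount = align_up(kTag + sizeof(std::uint8_t), alignof(std::uint32_t));
constexpr std::size_t kFlags = align_up(kCount + sizeof(std::uint32_t), alignof(std::uint16_t));
constexpr std::size_t kStamp = align_up(kFlags + sizeof(std::uint16_t), alignof(std::uint64_t));
// The total size is padded to the largest member alignment (uint64_t here).
constexpr std::size_t kSize  = align_up(kStamp + sizeof(std::uint64_t), alignof(std::uint64_t));

// Compare the hand-computed layout with the compiler's layout.
void check_record_layout() {
    assert(offsetof(Record, tag)   == kTag);
    assert(offsetof(Record, count) == kCount);
    assert(offsetof(Record, flags) == kFlags);
    assert(offsetof(Record, stamp) == kStamp);
    assert(sizeof(Record)          == kSize);
}
```

On a typical 64-bit ABI this works out to offsets 0, 4, 8, 16 and a total size of 24, but the checks only compare the rule against the compiler, so they should also hold where the alignments differ.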
For C++ classes, it is more chaotic (and more compiler dependent), but:
If no virtual methods are present, layout matches that of a struct (POD class);
If virtual methods are present, an implicit first member is added holding the vtable pointer;
Normal child classes will append new members to the end of the parent class, and new virtual methods to the end of the parent's vtable;
In multiple inheritance, each parent class is packed end-to-end, followed by any new members for the child class (unsure ATM, but I think the child's new virtual methods go on the end of the first parent's vtable rather than getting a new vtable pointer, at least in the Itanium C++ ABI);
If virtual inheritance is used, IIRC, the parent classes are instead represented as object pointers, with object creation initializing these pointers to the location of the parent class within the aggregate;
...
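
A small sketch of the simpler cases above, assuming a common ABI (Itanium C++ ABI or MSVC); none of this is guaranteed by the C++ standard, and the class names are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>

struct PlainStruct { int a; int b; };
class  PlainClass  { public: int a; int b; };  // no virtual methods

class Base {
public:
    virtual ~Base() {}
    virtual int id() const { return 1; }
    int x = 0;
};

class Child : public Base {
public:
    int id() const override { return 2; }
    virtual int extra() const { return 3; }  // new virtual method
    int y = 0;                               // new data member
};

// Byte offset of a member within its object, measured at run time
// (offsetof is only conditionally supported for non-standard-layout classes).
template <class Obj, class Member>
std::ptrdiff_t offset_in(const Obj& obj, const Member& m) {
    return reinterpret_cast<const char*>(&m) - reinterpret_cast<const char*>(&obj);
}

// Assumption: the vtable pointer is an implicit first member, as in common ABIs.
void check_class_layout() {
    // Without virtual methods, a class is laid out like the equivalent struct.
    assert(sizeof(PlainClass) == sizeof(PlainStruct));
    assert(offsetof(PlainClass, b) == offsetof(PlainStruct, b));

    Base b;
    Child c;
    // The vtable pointer pushes the first data member past offset 0.
    assert(offset_in(b, b.x) >= static_cast<std::ptrdiff_t>(sizeof(void*)));
    // Parent members keep their offsets in the child; new members follow them.
    assert(offset_in(c, c.x) == offset_in(b, b.x));
    assert(offset_in(c, c.y) > offset_in(c, c.x));
}
```

Note that the child's new member may land in the parent's tail padding (the Itanium ABI does this for non-POD parents), so the sketch only checks ordering rather than exact offsets.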
I didn't really bother with MI in my own stuff (nor is there a POD special case, as class and struct were regarded as separate entities), but instead used a single-inheritance model internally (but can fake MI by turning the parent classes into members).
If interfaces were used, they were handled as interface vtable pointers that were added following any data members:
<main-vtable> <members> <interface1> <interface2>
With an append pattern still being followed:
<main-vtable> <parent-members> <parent-interface1> <parent-interface2>
<child-members> <child-interface>
...
There is usually a convention that the first 4 vtable entries are not used for methods, say:
index 0: points to the ClassInfo object;
index 1: offset relative to the object base
Subtracted from current pointer to get 'this' for the method call;
index 2: typically NULL;
index 3: typically NULL.
Though, in my case, I had ended up using index 2/3 for IPC method routing.
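
To make the role of index 1 a bit more concrete, here is a hand-rolled sketch of the scheme in C++. All of the names (VTable, Widget, call_via, and so on) are hypothetical and this is not the actual implementation, just one way the described layout could be modeled, assuming pointer-sized slots:

```cpp
#include <cassert>
#include <cstddef>

struct ClassInfo { const char* qname; };  // hypothetical stand-in for RTTI data

struct VTable {
    const ClassInfo* info;          // index 0: ClassInfo object
    std::ptrdiff_t   offset;        // index 1: offset relative to the object base
    void*            slot2;         // index 2: typically NULL
    void*            slot3;         // index 3: typically NULL
    int (*methods[1])(void* self);  // index 4 onward: methods
};

// Object layout: <main-vtable> <members> <interface1>
struct Widget {
    const VTable* vt;
    int           value;
    const VTable* iface;
};

int widget_get(void* self) { return static_cast<Widget*>(self)->value; }

const ClassInfo kWidgetInfo = { "Widget" };
const VTable kWidgetVT = { &kWidgetInfo, 0, nullptr, nullptr, { widget_get } };
const VTable kWidgetIfaceVT = {
    &kWidgetInfo,
    static_cast<std::ptrdiff_t>(offsetof(Widget, iface)),  // distance from object base
    nullptr, nullptr, { widget_get }
};

// Call method 0 through a pointer to any vtable slot inside the object:
// the offset at index 1 is subtracted from that pointer to recover 'this'.
int call_via(const VTable* const* slot) {
    const VTable* vt = *slot;
    char* here = const_cast<char*>(reinterpret_cast<const char*>(slot));
    void* self = here - vt->offset;
    return vt->methods[0](self);
}
```

With `Widget w = { &kWidgetVT, 42, &kWidgetIfaceVT };`, both `call_via(&w.vt)` and `call_via(&w.iface)` end up calling `widget_get` with the same 'this', which is the point of storing the offset in the vtable.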
A ClassInfo object would usually be used for RTTI or similar, but there isn't really a standardized layout. In my case it contains a Self-RVA (used to find the image base address), RVAs for the QName and similar, and lists of virtual methods, members, and the superclass and interfaces.
For dynamic classes (like in ActionScript), another hidden member was added, which contained a key-value mapping encoded as a sort of B-Tree (with each node containing a small fixed number of keys). On average, small B-Trees were more memory-efficient than allocating variable-sized key/value tables (though, granted, a key/value table is simpler). In these cases, any fixed members would be accessed as fixed members, but if an undefined member was accessed, rather than failing, the compiler would add a dynamic lookup to deal with it (generally with these members always being 'variant').
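
A very rough model of the fixed-versus-dynamic split, with std::map standing in for the small B-Tree (so this shows the lookup behavior only, not the memory layout). The DynPoint and Variant types and their member names are invented for the example:

```cpp
#include <cassert>
#include <map>
#include <string>

// Minimal 'variant' value: undefined, integer, or string.
struct Variant {
    enum Kind { Undefined, Int, Str } kind;
    long i;
    std::string s;
    Variant() : kind(Undefined), i(0) {}
    static Variant from_int(long v) { Variant r; r.kind = Int; r.i = v; return r; }
    static Variant from_str(const std::string& v) { Variant r; r.kind = Str; r.s = v; return r; }
};

struct DynPoint {
    long x, y;                               // fixed members: accessed directly
    std::map<std::string, Variant> dynamic;  // hidden member (B-Tree stand-in)

    DynPoint() : x(0), y(0) {}

    // What a compiler might emit for a read of a member it does not know:
    // a lookup by name that yields 'undefined' if the key is absent.
    Variant get_dynamic(const std::string& name) const {
        std::map<std::string, Variant>::const_iterator it = dynamic.find(name);
        return it == dynamic.end() ? Variant() : it->second;
    }

    // Likewise for a write to an unknown member: insert or overwrite.
    void set_dynamic(const std::string& name, const Variant& v) {
        dynamic[name] = v;
    }
};
```

So `p.x` stays a plain fixed-offset access, while something like `p.color` would turn into `p.get_dynamic("color")` or `p.set_dynamic("color", ...)`.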
But, none of this is particularly relevant for C proper.
...
- anton