On 6/6/2024 4:38 PM, Scott Lurndal wrote:
BGB-Alt <bohannonindustriesllc@gmail.com> writes:
On 5/31/2024 4:11 PM, Scott Lurndal wrote:
jak <nospam@please.ty> writes:
bart ha scritto:
On 31/05/2024 15:34, Michael S wrote:
On Fri, 31 May 2024 15:04:46 +0100
bart <bc@freeuk.com> wrote:
>
<snip>
>
Instead of one compiler, here I used two compilers, a tool 'objcopy'
(which bizarrely needs to generate ELF format files) and lots of extra
ugly code. I also need to disregard whatever the hell _binary_..._size
does.
>
But it works.
>
>
>
You could use the pe-x86-64 format instead of the elf64-x86-64 to reduce
the size of the object.
>
By a half dozen bytes, perhaps, and only if your binutils have been
built to support pe-x86-64:
>
$ objcopy -I binary -O pe-x86-64 main.cpp /tmp/test1.o
objcopy:/tmp/test1.o: Invalid bfd target
>
The ELF64 format has a 64 byte header, the string table and the
symbol table, and the remainder is the binary
data. The PE header may save a few bytes by using 32-bit fields in
the PE COFF header and symbol table.
>
Note, you might want to trim your posts when replying with a one-sentence reply.
>
>
While I can't say much for using objcopy here (it is likely to be
hindered by however the program was compiled and linked, in any case),
in some other contexts PE/COFF can save more significant amounts of
space vs ELF.
>
>
In particular:
>
PE/COFF typically only stores symbols for imports and exports, rather
than for every symbol in the binary (though, IIRC, GCC+LD does tend to
generate PE/COFF output with every symbol present, *1, so this advantage
is mostly N/A if using GCC).
$ man 1 strip
>
The PE/COFF base relocation format is more compact than the ELF64
relocation formats:
ELF64 tends to spend 24 bytes for every symbol, and 24 bytes for each
reloc; along with an ASCII string for every symbol.
Use ELF32 then.
Generally, using ELF32 on 64-bit targets isn't a thing...
Granted, if it were done, it could make sense. After all, this is more or less what PE/COFF is doing. There were changes to some of the headers for 64-bit PE32+, but all of the address fields remain as 32 bits, etc.
>
It also tends to redirect most calls and loads/stores for global
variables through the GOT, rather than using PC-relative / RIP-relative
addressing (or fixed displacements relative to a Global Pointer),
causing the generated code to be larger (along with the size of the GOT).
That has nothing to do with ELF, per se. The ELF format supports
dynamic linking. It does not require it.
Generally, ELF binaries seem to come in two major variants:
ET_EXEC: Flat static-linked binary that can only be loaded at a certain address;
ET_DYN: Can be loaded at any address, but requires symbols and relocations, and does everything via a GOT.
If you want the ability to load at any address, you need PIE, which is an ET_DYN binary with all of the dynamic linking stuff, even if the program itself is static-linked.
And, seemingly, within the format as it exists, there is no way to make a relocatable binary that does not have a GOT and symbol tables.
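To see which of the two variants a given binary is, one can read the e_type field directly from the ELF header. A minimal sketch in Python follows; the field offset and the ET_* values are from the ELF specification, while fake_header is a hypothetical helper that exists only to make the example self-contained:

```python
import struct

# e_type values from the ELF specification.
ET_NAMES = {1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}

def elf_type(header: bytes) -> str:
    """Return the e_type name given the first 18+ bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type is the 16-bit field right after the 16-byte e_ident.
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return ET_NAMES.get(e_type, hex(e_type))

def fake_header(e_type: int) -> bytes:
    """Hypothetical helper: a little-endian ELF64 header stub (e_ident + e_type)."""
    e_ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)  # class, data, version, padding
    return e_ident + struct.pack("<H", e_type)

print(elf_type(fake_header(2)))  # non-PIE executable
print(elf_type(fake_header(3)))  # PIE executable or shared object
```

For a real file, something like elf_type(open(path, "rb").read(18)) should do the same job.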
Contrast PE/COFF:
The import/export tables and base-relocation tables exist independently of each other;
The base relocations do not depend on a symbol table;
...
Also, in contrast to 24-byte relocation entries, the average size of a base-reloc in PE/COFF is closer to 2 bytes (though with an 8-byte header per 4K page, plus 2 bytes for each reloc within that page).
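As a rough worked example of that size difference, here is a small Python sketch. The 24-byte Elf64_Rela size and the PE block layout (8-byte header, 2-byte entries, blocks padded to a 4-byte boundary) are from the respective specs; the reloc counts are made-up illustrative numbers, not measurements from any real binary:

```python
ELF64_RELA_SIZE = 24   # sizeof(Elf64_Rela): r_offset, r_info, r_addend
PE_BLOCK_HEADER = 8    # page RVA (4 bytes) + block size (4 bytes)
PE_ENTRY_SIZE = 2      # 4-bit type + 12-bit offset within the page

def elf64_rela_bytes(n_relocs):
    """Bytes needed for n_relocs entries in an ELF64 RELA section."""
    return n_relocs * ELF64_RELA_SIZE

def pe_base_reloc_bytes(relocs_per_page):
    """Bytes needed for a PE .reloc section, given reloc counts per 4K page."""
    total = 0
    for n in relocs_per_page:
        if n == 0:
            continue  # a page with no relocs needs no block
        block = PE_BLOCK_HEADER + n * PE_ENTRY_SIZE
        total += (block + 3) & ~3  # each block is padded to a 4-byte boundary
    return total

# Illustrative input only: 1000 relocs spread evenly over 20 pages.
pages = [50] * 20
print(elf64_rela_bytes(sum(pages)))              # 24000
print(pe_base_reloc_bytes(pages))                # 2160
print(pe_base_reloc_bytes(pages) / sum(pages))   # 2.16 bytes per reloc
```

The denser the relocs are within each page, the closer the PE figure gets to 2 bytes per reloc.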
Experimentally, I had developed a more compact variant of the base relocs by using almost exclusively 16-bit values:
0000: No-Op / Pad / End
00zz: Adjust current position forward by zz pages;
1zzz..Bzzz: Apply a base-reloc of various types.
Czzz..Fzzz: Escape into a larger set of reloc types.
Generally, the high 4 bits give the relocation type, and the low 12 give the offset within the page.
While it was effective, for now my compiler is still using the original format (the new format breaks compatibility with my previous loaders).
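The 16-bit encoding listed above could be picked apart roughly as follows. This is only a sketch based on the four cases given in the post, not the actual loader code: the function name and the returned labels are invented for illustration, and words in the range 0100..0FFF are reported as "unspecified" because the post does not say what they mean:

```python
def classify(word):
    """Split one 16-bit reloc word into (kind, type_nibble, value)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit value")
    top = word >> 12       # high 4 bits: relocation type
    low = word & 0x0FFF    # low 12 bits: offset within the page
    if word == 0x0000:
        return ("pad", 0, 0)                    # No-Op / Pad / End
    if word <= 0x00FF:
        return ("skip_pages", 0, word & 0xFF)   # move forward by zz pages
    if 0x1 <= top <= 0xB:
        return ("reloc", top, low)              # apply base reloc of type 'top'
    if top >= 0xC:
        return ("escape", top, low)             # larger set of reloc types
    return ("unspecified", top, low)            # 0100..0FFF: not described

for w in (0x0000, 0x0003, 0x3ABC, 0xC001):
    print(hex(w), classify(w))
```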
Can note that, for example, with building Doom for RV64G:
445K, ".text"
42K, ".rodata"
141K, ".data"
18K, ".got"
skip, various smaller sections
skip ".bss" (1442K), not present in binary.
Dynamic stuff:
75K, ".dynsym"
43K, ".dynstr"
49K, ".rela.dyn"
32K, ".rela.plt"
21K, ".plt"
21K, ".hash"
25K, ".gnu.hash"
So:
646K, stuff that is present either way.
266K, dynamic linking metadata.
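For reference, the two totals can be reproduced from the section sizes listed above, along with the share of the listed bytes that is dynamic-linking metadata. A quick Python sketch, using only the numbers from this post (the skipped smaller sections are not included, so the percentage is approximate):

```python
# Section sizes in KB, as listed above for the RV64G build.
present_either_way = {".text": 445, ".rodata": 42, ".data": 141, ".got": 18}
dynamic = {".dynsym": 75, ".dynstr": 43, ".rela.dyn": 49, ".rela.plt": 32,
           ".plt": 21, ".hash": 21, ".gnu.hash": 25}

base_total = sum(present_either_way.values())
dyn_total = sum(dynamic.values())
share = dyn_total / (base_total + dyn_total)

print(base_total)           # 646
print(dyn_total)            # 266
print(round(share * 100))   # 29 (percent of the listed sections)
```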
It was smaller in its non-PIE form, but PIE kinda ruined it.
This is with "-ffunction-sections -fdata-sections -Wl,-gc-sections"; otherwise it would have been bigger.
Comparison, Doom built for BJX2-XG2 (with a PE/COFF variant):
283K, ".text"
22K, ".strtab"
1K, ".rodata"
2K, ".reloc"
142K, ".data"
(not present in binary) 1274K, ".bss"
This is for an image that still supports base relocations, using an ABI capable of NOMMU operation, and an ISA variant that does not use 16-bit ops.
If I switch to "Baseline" mode (which still has 16-bit instructions), ".text" drops to around 250K.
In all of these cases, it is the same program static-linked with the same C library.
Generally, my ISA also seems to be winning in terms of performance in my tests.
An x86-64 build of Doom (as ELF, also PIE) is a little smaller than the RV64 build (but is dynamically linked with glibc), at around 297K for ".text".
And, an MSVC / VS2022 build of Doom dwarfs all of them with its roughly 1.1MB ".text" section. Though, this drops to 515K if I use "/MD", which tells MSVC to use a dynamically-linked C library.
But, still not great (it is still the biggest even with the handicap of using a dynamic-linked C library).
...