On 7/6/2024 2:15 PM, bart wrote:
On 06/07/2024 19:46, BGB wrote:
But, yeah, for most people, writing the compiler is always "someone else's problem".
Granted, even with as much suck as it was to write my own C compiler to target my ISA, it is still likely a better experience than it would have been trying to retarget GCC or Clang.
GCC is sort of a tangled mess of code; trying to add a new target would likely involve weaving it in all across the codebase, which is fairly large (well into MLOC territory, IIRC).
>
Any desire to retarget Clang is limited by the issue that it would involve needing to edit and recompile LLVM a large number of times, and LLVM takes an absurdly long time to recompile...
They are like "well, use -j N". But, like:
  "-j 1": takes like 4 hours;
  "-j 4": still takes around 1 hour, PC still usable;
  "-j 8": takes 45 minutes, but the PC is made unusable.
Doing so also eats most of the RAM, on a PC with 112 GB... (Can't install a full 128 GB with my MOBO/chipset, it seems.)
At "-j 8", the HDD with the pagefile is basically running at full capacity.
Those figures are extraordinary.
But do you really need to recompile everything in LLVM even when changing only target-specific components?
I haven't looked into it too much, but as I understand it, changing target stuff will require rebuilding LLVM each time. Either way, I didn't feel terribly inclined to find out.
Seemingly the usual answer is "have a Linux box with some absurd hardware stats to use as a build server". This is... not ideal.
Like, I don't exactly have the money to spend on a "dual-socket Threadripper with 256GB of RAM" or similar just to get semi-fast LLVM rebuilds.
How much of it is due to it using C++ rather than C?
Probably a fair bit.
When I looked at the LLVM code before, it seemed like they were writing C++ in much the same way one would write code in Java. This sort of thing seems "kinda evil" for build times...
Maybe you should consult with David Brown who doesn't believe in the benefits of fast compilers, and always knows some tricks to get a build-time in seconds no matter how slow the compiler.
I can generally recompile BGBCC in a few seconds...
Though, for the most part, it is a monolithic unity build, for better or for worse.
Could potentially split it up into multiple components, but doing so would make it more of a hassle than leaving it as a single monolithic entity.
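For reference, "unity build" here just means one top-level translation unit that #includes all the other C sources, so a single compiler invocation rebuilds everything. A minimal self-contained illustration of the pattern (file and function names made up; not BGBCC's actual layout):

  /* unity.c -- minimal illustration of the unity-build pattern.
   * In a real project the pieces would be pulled in directly, e.g.
   * #include "lexer.c", #include "parser.c", ... (hypothetical names);
   * here they are inlined so the example compiles standalone. */
  #include <stdio.h>

  static int lexer_init(void)  { puts("lexer up");  return 0; }
  static int parser_init(void) { puts("parser up"); return 0; }

  int main(void)
  {
      /* Everything is in one translation unit, so "cc unity.c"
       * rebuilds the entire program in one go. */
      lexer_init();
      parser_init();
      return 0;
  }

The trade-off is that touching any one source file recompiles the whole thing, which only works out because the whole thing compiles in seconds anyway.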
With my own tools, that run on one core with 8GB (2GB of which seems tied up with the display for some reason, even though the HD display needs only a 0.006GB frame buffer), I can't do anything else when compiling, because it will have finished before I've taken my finger off the Enter key!
This is largely thanks to machines now being so fast rather than my coding skills, but also because I've never managed to learn how to write huge, bloated, inefficient software.
Code tends to get kinda bulky in my case, but still fairly small vs a lot of OSS projects.
As noted, BGBCC is around 250 kLOC.
Overall project is around 2 MLOC, but around half of this is from ported software.
So, say (very loosely):
  250 kLOC: BGBCC
  100 kLOC: jx2vm (CPU emulator)
  200 kLOC: TestKern "OS" and runtime libraries:
    Highly modified PDPCLIB;
    Stuff for memory management, filesystem, etc.;
    A makeshift GUI of sorts;
    An OpenGL 1.x implementation;
    ...
Then, say, other stuff:
  50 kLOC: Doom
  75 kLOC: Hexen
  150 kLOC: ROTT
  300 kLOC: Quake3
  ...
Comparably, Doom 3 is around 1 MLOC, but it is also C++ and has no chance of running on BJX2 as-is, so it's not worth bothering with.
I had just barely crossed the threshold to where it seemed worthwhile to resume an effort to port Quake3 (after looking at it a few years ago, and noting that at the time, I didn't have a lot of the infrastructure needed for a Quake3 port).
Never mind the open question of whether it has any hope of being playable on a 50MHz CPU.
Though, if anything gives hope here, it is that ironically Quake3 seems to throw significantly less geometry at the OpenGL backend than Quake1 did (mostly I suspect because QBSP carves all the map surfaces into little pieces along all of the BSP planes, whereas Quake3 tends to leave surfaces intact across planes, duplicating a reference to each surface within each leaf rather than carving it up).
So, for example, Quake3 tended to average (IIRC) around 300-500 polygons per frame (despite the much more elaborate scenery) vs around 1000-1500 per frame for Quake1 (Quake3 also has LOD for the 3D models, so models use simpler geometry the further they are from the camera, etc).
But, to really see what happens, I need to get it fully working...
My porting effort did make some aggressive corner cutting in a few areas, like replacing the ZIP based PK3 files with WAD4, and replacing most of the use of JPEG images with ".DDS" files.
Where WAD4 is effectively: similar to the WAD2 format, but supporting directory trees and 32-character names. WAD2 was a format originally used in Quake 1 for storing various graphics (and for textures in the QBSP tools); it was also used for similar purposes in Half-Life.
I had also used it for various purposes in my compilers and in my "OS".
Technically, the ".rsrc" section in my PE output is also based on the WAD2 format (replacing the original format used by Windows, which seemed hopelessly convoluted). It still serves the purpose of mostly holding BMP images and icons.
Or, like:
  IWAD/PWAD:
    Used in Doom and friends;
    No compression;
    8-character lump names (nominally upper case).
  WAD2:
    Used in Quake and Half-Life;
    Optional compression;
    16-character lump names (nominally lower case);
    Names omit file extensions, which are stored as a 1-byte lump type.
  WAD3:
    Used by Half-Life / GoldSrc; a slightly tweaked WAD2.
  WAD4:
    Custom format (used in my projects), an expanded form of WAD2;
    Optional compression;
    Directory trees;
    32-character lump names (case sensitive, nominally lower case).
I felt OK with calling it WAD4 as it seemed like no one else had used the WAD4 name, and the format design does mostly carry on the legacy of its WAD predecessors.
I am mostly using LZ4 and RP2 compression, say:
  0: Store
  1: Unused (IIRC, was an LZSS-like format in Quake)
  2: Unused
  3: RP2
  4: LZ4
  5..7: Unused
  8: Deflate (unused at present)
  9: Deflate64 (unused at present)
  ...
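As a sketch, the per-lump compression byte could be declared along these lines (the enumerator names are made up here; only the numeric values come from the list above):

  /* Hypothetical names for the WAD4 compression codes listed above. */
  enum wad4_compress {
      WAD4_CMP_STORE     = 0,  /* stored, no compression             */
      WAD4_CMP_RSVD1     = 1,  /* unused (LZSS-like format in Quake) */
      WAD4_CMP_RSVD2     = 2,  /* unused                             */
      WAD4_CMP_RP2       = 3,  /* RP2                                */
      WAD4_CMP_LZ4       = 4,  /* LZ4                                */
      /* 5..7 unused */
      WAD4_CMP_DEFLATE   = 8,  /* Deflate   (unused at present)      */
      WAD4_CMP_DEFLATE64 = 9   /* Deflate64 (unused at present)      */
  };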
Deflate gives decent compression, but is mostly a bit slow and heavyweight for my uses here.
Using ZIP for a VFS, while technically possible, has the drawback that one needs to read in and process the central directory (and doing it the way Quake3 did it ate a lot of memory).
Comparably, using fixed-length 64-byte directory entries both avoids a lot of the up-front processing and still uses less memory (vs representing every file name in the PK3 as a full-path string allocated via the Z_Malloc allocator; with thousands of files, that is a lot of bytes...).
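To illustrate the point, a 64-byte directory entry with a 32-character name could plausibly be laid out something like this (my own guess at a layout; the actual WAD4 field order and widths may differ):

  #include <stdint.h>

  /* Hypothetical 64-byte WAD4-style directory entry.  The directory
   * is then just a flat array of these; no per-file string allocation
   * is needed to walk it. */
  typedef struct {
      uint32_t offset;       /* file offset of the lump's data       */
      uint32_t size_comp;    /* stored (possibly compressed) size    */
      uint32_t size_raw;     /* uncompressed size                    */
      uint8_t  type;         /* lump type                            */
      uint8_t  compress;     /* compression code (see list above)    */
      uint16_t flags;        /* e.g. directory-vs-file bit           */
      uint32_t parent;       /* index of the parent directory entry  */
      uint32_t resv[3];      /* pad out to 64 bytes                  */
      char     name[32];     /* 32-character name, case sensitive    */
  } Wad4DirEnt;              /* 12+1+1+2+4+12+32 = 64 bytes          */

With thousands of files that stays at a flat 64 bytes per entry, vs a Z_Malloc'ed full-path string per file as in stock Quake3.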
For some uses where TGA or PNG may have been used, I am using the QOI format (moderately fast to decode, like a "Poor man's PNG").
I don't like products like LLVM; one reason I'm still doing this stuff is to make a stand.
(About 10 years ago I needed a new side-gate for my house. Rather than buy one, I made one myself that was a perfect size, 6' high. If the people behind LLVM made garden gates, theirs would have been 9 MILES high! Probably 15 miles now.)
Possibly...
It is a mystery why anyone uses LLVM; I would have considered it a dead end due to all this...
Apparently it takes care of all that pesky back-end stuff. For me it would make the task a thousand times harder. Plus I'd end up with a compiler of which only 0.4% was my own work; 99.6% somebody else's. Based on typical LLVM-based compilers being 100MB.
Yeah.
Main EXE for BGBCC is ~4MB, but this is built with "/Zi" for VS2022, which doesn't exactly generate the most compact binaries possible (similar to "-O0 -g" with GCC).
Essentially, the binary in this case takes on the role of the entire compiler toolchain (to mimic a GCC cross-compiler setup, I can symlink the various tools to the BGBCC binary, and it tries to mimic the CLI for the tool it is called as).
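The mechanism is the usual busybox-style trick: take the basename of argv[0] and dispatch on the name the binary (or its symlink) was invoked as. A rough sketch, where the tool_* functions are hypothetical stand-ins for whatever BGBCC actually does internally (and Windows-style '\\' paths are ignored):

  #include <stdio.h>
  #include <string.h>

  /* Hypothetical per-tool entry points inside the one binary. */
  static int tool_cc(int argc, char **argv) { (void)argc; (void)argv; return 0; }
  static int tool_ld(int argc, char **argv) { (void)argc; (void)argv; return 0; }
  static int tool_ar(int argc, char **argv) { (void)argc; (void)argv; return 0; }

  int main(int argc, char **argv)
  {
      /* Basename of argv[0]: "/usr/bin/jx2-ld" -> "jx2-ld". */
      const char *name = strrchr(argv[0], '/');
      name = name ? name + 1 : argv[0];

      /* Dispatch on whatever name the symlink gave us. */
      if (strstr(name, "ld")) return tool_ld(argc, argv);
      if (strstr(name, "ar")) return tool_ar(argc, argv);
      if (strstr(name, "gcc") || strstr(name, "cc"))
          return tool_cc(argc, argv);

      fprintf(stderr, "%s: don't know how to mimic this tool\n", name);
      return 1;
  }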
Luckily, in past small-scale experiments, GNU autotools doesn't seem to notice/care what file-formats are being used, but is fairly particular about some other things (like how the files are named).
For example, if called with no output file, the compiler needs to output "a.out" or "a.exe" as the default output, otherwise autotools gets confused (vs, the MSVC behavior of using the name of the first C source file or similar to derive the name of the default output EXE).
Well, along with reliance on behaviors and command-line options outside the scope of the POSIX spec (where GCC is similar to POSIX 'cc' or 'c89' in terms of its CLI, but not exactly the same).
Though, there are a lot of tools that it does not yet mimic:
nm, objdump, objcopy, etc...
But these were not used by "./configure" or the Makefiles that configure spits out, so...
It cares not that:
  ".o" files are RIL3 bytecode rather than COFF or ELF;
  ".a" files are also RIL3 bytecode;
  the 'ar' tool links the '.o' files together, rather than producing an '!<arch>' file;
  ...
Though, technically, POSIX specifies that the '!<arch>' format is used.
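For reference, the traditional '!<arch>' layout (per the classic <ar.h>) is just an 8-byte magic string followed by fixed-width, all-ASCII member headers:

  #define ARMAG  "!<arch>\n"    /* 8-byte archive magic at offset 0   */
  #define ARFMAG "`\n"          /* terminator on each member header   */

  struct ar_hdr {               /* 60 bytes, all printable ASCII      */
      char ar_name[16];         /* member name ('/'-terminated, SysV) */
      char ar_date[12];         /* mtime, decimal seconds             */
      char ar_uid[6];           /* owner uid, decimal                 */
      char ar_gid[6];           /* group gid, decimal                 */
      char ar_mode[8];          /* file mode, octal                   */
      char ar_size[10];         /* member size in bytes, decimal      */
      char ar_fmag[2];          /* ARFMAG                             */
  };

Simple enough, but it is flat (no directories) and has no compression.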
For the stalled new compiler effort, the idea was to use WAD2 or similar instead. Well, unless someone can make a strong case for why using the original format actually matters.