Re: 80386 C compiler

Subject : Re: 80386 C compiler
From : cr88192 (at) *nospam* gmail.com (BGB)
Newsgroups : comp.lang.c
Date : 25. Nov 2024, 00:46:59
Organization : A noiseless patient Spider
Message-ID : <vi0dtn$2ehti$1@dont-email.me>
References : 1 2 3
User-Agent : Mozilla Thunderbird
On 11/24/2024 12:00 PM, Bart wrote:
> On 24/11/2024 17:51, fir wrote:
>> Paul Edwards wrote:
>>> Hi.
>>>
>>> I have been after a public domain C compiler for decades.
>>> None of them reach C90 compliance. SubC comes the
>>> closest but was written without full use of C90, which
>>> makes it difficult to read. I'm after C90 written in C90.
>>>
>>> A number of people have tried, but they always seem
>>> to fall short. One of those attempts is pdcc. The
>>> preprocessor was done, but the attempt (by someone
>>> else) to add C code generation was abandoned.
>>>
>>> I decided to take a look at it, and it looks to me like
>>> a significant amount of work has already been done.
>>>
>>> Also, my scope is limited - I am only after enough
>>> functionality to get my 80386 OS (PDOS) compiled,
>>> and I don't mind short=int=long = 32 bits, I don't
>>> mind not having float. I don't use bitfields.
>>>
>>> Anyway, I have had some success in making enhancements
>>> to it, and here is one:
>>>
>>> https://sourceforge.net/p/pdos/gitcode/ci/3356e623785e2c2e16c28c5bf8737e72dfd39e04/
>>>
>>> But I don't really know what I'm doing (I do know some
>>> of the theory - but this is a particular design).
>>>
>>> E.g. now that I have managed to get a variable passed to
>>> a function, I now want the address of that variable passed
>>> to the function - ie I want to do &x instead of x - and I am
>>> not sure whether to create a new ADDRESS type, or
>>> whether it is part of VARREF or what - in the original
>>> (incomplete) concept. Or CC_EXPR_AMPERSAND.
>>>
>>> I am happy to do the actual coding work - I'm just looking
>>> for some nudges in the right direction if anyone can assist.
>>>
>>> Thanks. Paul.
>>>
>>
>> you mean there is no such a compiler? rise a fund for some to
>> write it and they will write it..and if few thousand of people
>> will give some money there it will be written
> There are any number of open source C compilers. But they need to be good enough (too many support only a subset, which may not be enough for the OP) and they need to be public domain for the OP's purposes.
 
I am more in the camp that an MIT or BSD license should be good enough for most things.
Trying to go full public domain has a few of its own issues:
* Not always recognized as valid;
* Implicitly lacks "No Warranty" and "No Liability" protections for the author (say, if someone wanted to file a lawsuit over the code being buggy, etc).
* ...
There could almost be a "MIT Minus" or something, which could be, say, MIT with a clause saying one is allowed to discard the license terms for the sake of derived works (while still offering protection from liability).
As for C compilers, I have a compiler for my own uses, but:
* MIT licensed;
* Doesn't target x86.
* Sorta implements C99 with various fragments of newer standards.
** Though, is a bit hit/miss on the now-optional parts.
** VLAs sorta exist but do not necessarily work correctly.
*** Currently unsupported in DLLs;
*** Seemingly may result in memory leaks if used.
*** Essentially, they are implemented via runtime library calls.
**** With memory provided indirectly via malloc.
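As a rough illustration of what "VLAs via runtime library calls" can look like, here is a minimal sketch in C. The names `vla_frame`, `vla_alloc`, and `vla_release` are hypothetical and are not BGBCC's actual runtime interface. The assumption is only that the compiler rewrites something like `long a[n];` into a heap allocation tied to the current function, which then leaks if the release at function exit is ever skipped (say, by a longjmp).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header placed in front of each heap block backing a VLA.
   The max_align_t member keeps the payload suitably aligned. */
union vla_node { union vla_node *next; max_align_t align; };

/* Hypothetical per-call record of all VLA blocks owned by one function. */
struct vla_frame { union vla_node *head; };

/* What a compiler might emit for "T a[n];": malloc the storage and
   link it to the frame so it can be freed when the function returns. */
static void *vla_alloc(struct vla_frame *f, size_t bytes)
{
    union vla_node *n = malloc(sizeof *n + bytes);
    if (!n) return NULL;
    n->next = f->head;
    f->head = n;
    return n + 1;               /* payload follows the header */
}

/* What a compiler might emit on every normal exit path. */
static void vla_release(struct vla_frame *f)
{
    while (f->head) {
        union vla_node *n = f->head;
        f->head = n->next;
        free(n);
    }
}

/* Source form:  long sum_squares(int n) { long a[n]; ... }  */
static long sum_squares(int n)
{
    struct vla_frame frame = { NULL };
    long total = 0;
    long *a = vla_alloc(&frame, (size_t)n * sizeof *a);
    if (a) {
        for (int i = 0; i < n; i++) a[i] = (long)i * i;
        for (int i = 0; i < n; i++) total += a[i];
    }
    vla_release(&frame);        /* if this is skipped, the block leaks */
    return total;
}
```

A stack-based VLA would be reclaimed automatically when the frame is popped; with a malloc-backed scheme like this one, every exit path has to reach the release call.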
Old target list (for which the code still exists):
* SH-4 (AKA: SuperH, most well known for SEGA Saturn and Dreamcast)
* BJX-1 (Was a highly modified version of SH-4)
* BTSR1 (a small SH inspired ISA, intended to be comparable to MSP430).
** Not maintained, RV32IC seems like a better option.
Currently active targets:
* BJX-2: Now a group of several closely related variants.
** All are 64-bit, most using a 48-bit VAS (some had a 32-bit VAS)
** Baseline: 16/32/64/96 bit instructions, 32 or 64 GPRs
** XG2: 32/64/96 bit, 64 GPRs
* RISC-V, RV64G + Custom Extensions
** Has some extensions which can help notably with performance.
** Can support plain RV64G as well.
** No current support for the 16-bit 'C' encodings.
* XG3RV
** Mostly a tweaked and repacked version of XG2 used alongside RV64G.
** The XG3 encoding space replaces the RV64 'C' (Compressed) extension.
** Both XG3 and RV64 instructions may be encoded at the same time.
** XG3 is used in a functionally-similar subset, just with 64 GPRs.
Targets I have not bothered with:
* RV32IC: GCC does this well enough.
* x86/x86-64/ARM: We generally have GCC and Clang.
Granted, GCC and Clang are both very large and slow/painful to rebuild from source. My compiler is at least a lot smaller and easy to rebuild.
Likely, far more of the total effort of my project has ended up going into my compiler than into the emulator or Verilog implementation though.
The BJX-2 register space had 64 registers and was split in half for the RV64G modes (32 GPRs and 32 FPRs), whereas XG3 and my jumbo-prefix extensions partly undo this split.
( Decided to try changing the way I write my ISA name as maybe adding a hyphen will get me less trouble... ).
Though, partly this is because for performance BGBCC seems to need a lot of registers (it could barely operate with the SH4's 16 GPRs, and still has a fairly high spill-and-fill rate with 32 GPRs).
Though, I can note that my compiler with XG3RV, despite not adding much over RV64+Jumbo, beats RV64G via "GCC -O3" on both code density and performance (and also beats the code density of RV64GC since, in this case, fewer instructions is better than smaller instructions).
A big part of the performance delta between the ISAs could be addressed by adding a few major features to RV64:
* Jumbo Prefixes: Prefix may extend 12-bit imm/disp fields to 33 bits;
** Also extends LUI, AUIPC, and JAL to 33-bit forms.
* Load/Store with a register index;
* Load/Store Pair.
With these extensions, BGBCC gets around a 30% speedup over GCC targeting RV64G.
* It is closer to 70% if comparing against BGBCC with plain RV64G.
* BGBCC can't match GCC if both are targeting RV64G.
** I am not sure what GCC would do if it had my extensions.
The specific extensions here mostly target the dominant sources of inefficiency in the RV64G encodings as they exist (the ISA design deals poorly with exceeding what can be encoded directly in an immediate, ...).
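To make the immediate problem concrete: in plain RV64G, a 32-bit constant that does not fit a signed 12-bit field needs a LUI plus ADDI-style pair, and because the low 12 bits are sign-extended, the upper part has to be pre-adjusted. A small sketch of that split follows; the names `split_hi_lo` and `recombine` are made up for illustration, and the recombination is done modulo 2^32.

```c
#include <stdint.h>

struct hi_lo {
    int32_t hi20;   /* value for the 20-bit LUI field */
    int32_t lo12;   /* signed 12-bit immediate, -2048..2047 */
};

/* Split a 32-bit constant into LUI and 12-bit-immediate parts. */
static struct hi_lo split_hi_lo(int32_t value)
{
    struct hi_lo r;
    /* Low 12 bits, sign-extended into the range -2048..2047. */
    r.lo12 = (int32_t)(((uint32_t)value & 0xFFFu) ^ 0x800u) - 0x800;
    /* Upper bits, adjusted so that adding lo12 back gives value. */
    r.hi20 = (int32_t)(((uint32_t)value - (uint32_t)r.lo12) >> 12);
    return r;
}

/* Rebuild the constant the way the instruction pair would (mod 2^32). */
static int32_t recombine(struct hi_lo p)
{
    return (int32_t)(((uint32_t)p.hi20 << 12) + (uint32_t)p.lo12);
}
```

For example, 0x12345FFF splits into hi20 = 0x12346 and lo12 = -1: two instructions, where a wider immediate would have needed no split at all.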
The jumbo prefixes may also be used to merge the register space back into a single 64-register space (at the cost of using 64-bit instruction encodings to do so), but in this case the imm/disp fields are only extended to 23 bits (except for LUI/AUIPC/JAL, which always have an expanded 6-bit register field with jumbo prefixes).
Note that J+AUIPC loads an address of PC +/- 4GB into Rd. Likewise, J+JAL is +/- 4GB (with LSB as MBZ).
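The field widths mentioned above (12, 23, and 33 bits) can be turned into a small range check. This is only my reading of the description, not the actual encoder logic; `classify_imm` and the enum names are hypothetical. A signed 33-bit field covers -2^32 to 2^32-1, which is where the +/- 4GB figure comes from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if v fits in a two's-complement field of `bits` bits. */
static bool fits_signed(int64_t v, int bits)
{
    assert(bits >= 1 && bits <= 63);
    int64_t lim = (int64_t)1 << (bits - 1);
    return v >= -lim && v < lim;
}

enum imm_form {
    IMM_PLAIN12,    /* ordinary 32-bit instruction, 12-bit imm/disp */
    IMM_JUMBO23,    /* prefix spent partly on 6-bit register fields */
    IMM_JUMBO33,    /* prefix spent entirely on the immediate */
    IMM_TOO_WIDE    /* constant has to be built some other way */
};

/* Pick the smallest form for an immediate, assuming (hypothetically)
   that using registers 32..63 forces the prefixed 23-bit form. */
static enum imm_form classify_imm(int64_t imm, bool uses_ext_regs)
{
    if (uses_ext_regs)
        return fits_signed(imm, 23) ? IMM_JUMBO23 : IMM_TOO_WIDE;
    if (fits_signed(imm, 12)) return IMM_PLAIN12;
    if (fits_signed(imm, 33)) return IMM_JUMBO33;
    return IMM_TOO_WIDE;
}
```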
The relative performance gains of XG3RV over extended RV64G were smaller; it mostly serves to improve code density (it makes Doom roughly 16% smaller, and around 44% smaller than plain RV64G).
Main thing it has (in theory) is access to a lot of the specialized SIMD instructions and similar that exist in my ISA but lack equivalents on the RV64G side of things.
There are a few instructions that exist here which are tempting to add as extended instructions to RV64:
* Compare Equal, Not-Equal, and Greater-Equal instructions (SEQ, SNE, SGE);
* Load/Store relative to GP with a larger displacement (TBD, 2).
Some notable features from BJX-2 were effectively made optional in XG3, such as support for an SR.T bit (originally carried over from SuperH) and predication (in BJX-2, instructions could be encoded to execute or not based on the status of the SR.T bit). However, no direct architectural equivalent exists in RV64.
In XG3RV, the questionable design choice was made of conceptually holding these parts of the architectural state in the high-order bits of PC and LR/RA (in my other ISA variants, LR merely captured these bits from SR).
2: It is tempting to consider, possibly:
   LW/LD, SW/SD, with an addressing mode like: [GP+Disp14u*4|8]
So, it would be able to encode an access within 64K or 128K of GP, rather than +/- 2K. This would save some space over the use of a jumbo prefix (at least with my compiler tending to use GP to access global variables).
Where, it would be "better" here if one could access most of the global variables in a single 32-bit instruction. But, wouldn't fit in as well with the existing ISA encodings.
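The 64K and 128K figures follow directly from a 14-bit unsigned displacement scaled by the access size (2^14 * 4 = 65536, 2^14 * 8 = 131072). A small sketch of the arithmetic, with hypothetical helper names, since this addressing mode is only a proposal:

```c
#include <stdbool.h>
#include <stdint.h>

#define DISP14_COUNT 16384u   /* 2^14 distinct unsigned displacements */

/* Bytes reachable above GP for a given access size
   (4 for LW/SW, 8 for LD/SD). */
static uint32_t gp_reach(uint32_t access_size)
{
    return DISP14_COUNT * access_size;
}

/* True if byte offset `off` from GP could be encoded as
   Disp14u * access_size. */
static bool gp_disp_encodable(int64_t off, uint32_t access_size)
{
    if (off < 0) return false;                  /* unsigned displacement */
    if (off % access_size != 0) return false;   /* must be aligned */
    return (uint64_t)off / access_size < DISP14_COUNT;
}
```

Compared with the standard signed 12-bit byte displacement, the scaling trades away negative and misaligned offsets for 32x or 64x more reach in one direction.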
Generally, BGBCC uses a modified PE/COFF variant.
* For RV64G, I switched it to default to using plain PE/COFF.
* Some people might find this slightly easier to deal with.
Though, can note that GNU binutils still has no idea how to handle RV64 PE/COFF, as it seemingly treats every machine-type as its own file format (and does not support any RV64 + PE/COFF targets).
Where, for some of the other ISAs, BGBCC generates LZ4 compressed binaries (file headers are uncompressed, but the rest of the image is compressed). Rationale is mostly that loading binaries from an SDcard is IO bound, and within the limits tested, LZ4 did best for executable code.
I have another byte-oriented LZ format (RP2) which works better for general data, but seemingly worse on program binaries. Entropy-coded formats were not used, as the time cost of Huffman decoding is higher than the time spent reading the data from an SDcard.
The main difference seems to be that RP2 correlates match length and distance (encoding a larger distance also encodes a longer match-length field). This correlation holds for most data, but less so for program binaries. LZ4 has a fixed 16-bit match distance, and came out ahead.
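For reference, the LZ4 side of this comparison is simple enough to show in full. Below is a minimal sketch of a decoder for a single raw LZ4 block (no frame header, no checksums), written from the published block format rather than taken from BGBCC or its loader. Note how the match distance is always exactly two bytes, independent of the match length. RP2 is not shown, as its layout is not described here.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one raw LZ4 block. Returns the number of bytes written,
   or -1 on malformed input or insufficient output space. */
static long lz4_block_decode(const uint8_t *src, size_t src_len,
                             uint8_t *dst, size_t dst_cap)
{
    size_t ip = 0, op = 0;
    while (ip < src_len) {
        unsigned token = src[ip++];
        unsigned b;

        /* Literal run: high nibble, extended by bytes while they are 255. */
        size_t lit = token >> 4;
        if (lit == 15) {
            do {
                if (ip >= src_len) return -1;
                b = src[ip++];
                lit += b;
            } while (b == 255);
        }
        if (lit > src_len - ip || lit > dst_cap - op) return -1;
        for (size_t i = 0; i < lit; i++) dst[op++] = src[ip++];

        if (ip == src_len) break;   /* last sequence holds literals only */

        /* Match distance: fixed 16-bit little-endian field. */
        if (src_len - ip < 2) return -1;
        size_t dist = (size_t)src[ip] | ((size_t)src[ip + 1] << 8);
        ip += 2;
        if (dist == 0 || dist > op) return -1;

        /* Match length: low nibble plus 4, extended the same way. */
        size_t len = token & 15u;
        if (len == 15) {
            do {
                if (ip >= src_len) return -1;
                b = src[ip++];
                len += b;
            } while (b == 255);
        }
        len += 4;
        if (len > dst_cap - op) return -1;

        /* Byte-wise copy so overlapping matches (dist < len) repeat data. */
        for (size_t i = 0; i < len; i++, op++) dst[op] = dst[op - dist];
    }
    return (long)op;
}
```

As a usage example, the 8-byte block `{ 0x32, 'a', 'b', 'c', 0x03, 0x00, 0x10, '!' }` decodes to the 10 bytes `abcabcabc!`: three literals, a 6-byte match at distance 3, then one trailing literal.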
TBD if I should add support for 64-bit ELF, but ELF kinda sucks IMHO (and for ELF PIE binaries, roughly around half of the binary ends up eaten by metadata).
Where, bloated binaries are bad for both loading time and memory use (it is bad to have a 900K binary as an "ELF tax" when PE/COFF would have only needed 400K; well, and say the binary LZ4 compresses down to around 260K in the latter case; though one will still need 400K in RAM).
Further not helped in this case by ELF needing to load a new copy of the binaries for every program instance, whereas with my ABI, I was able to share the read-only sections across multiple instances (only the data/bss sections need to be instantiated per-process).
Can note that in my case, the PE/COFF "Global Pointer" entry in the Data Directory is effectively used to express the start of ".data" (which is also where the Global Pointer points), along with the combined size of data and bss (if the size is non-zero).
Global Pointer:
   RVA=Size=0: No Global Pointer
   RVA!=0, Size=0: Global Pointer points here, may not be relocated.
   RVA!=0, Size!=0: Start of data area, may be relocated per instance.
...
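The three Global Pointer cases listed above can be read as a small decision function. This is a sketch under the assumption that the entry is an ordinary PE/COFF data-directory pair of 32-bit RVA and Size fields; the enum and function names are hypothetical, and the RVA=0 with Size!=0 combination is not covered by the table, so it is treated as invalid here.

```c
#include <stdint.h>

enum gp_kind {
    GP_NONE,          /* RVA == 0, Size == 0: no Global Pointer */
    GP_FIXED,         /* RVA != 0, Size == 0: GP points here, not relocated */
    GP_PER_INSTANCE,  /* RVA != 0, Size != 0: start of data+bss area,
                         may be relocated per instance */
    GP_INVALID        /* RVA == 0, Size != 0: not listed in the table */
};

/* Classify the Global Pointer data-directory entry of an image. */
static enum gp_kind classify_gp(uint32_t rva, uint32_t size)
{
    if (rva == 0)
        return size == 0 ? GP_NONE : GP_INVALID;
    return size == 0 ? GP_FIXED : GP_PER_INSTANCE;
}
```

In the per-instance case, a loader would presumably allocate `size` bytes for each new process and point GP at that copy, leaving the read-only sections shared.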

