Subject : Re: transpiling to low level C
From : bc (at) *nospam* freeuk.com (bart)
Newsgroups : comp.lang.c
Date : 18. Dec 2024, 13:08:24
Organization : A noiseless patient Spider
Message-ID : <vjudvn$28ulf$1@dont-email.me>
References : 1 2 3 4 5 6 7
User-Agent : Mozilla Thunderbird
On 17/12/2024 18:51, BGB wrote:
On 12/17/2024 6:04 AM, bart wrote:
C can apparently compile to WASM via Clang, so I tried this program:
>
void F(void) {
    int i=0;
    while (i<10000) ++i;
}
>
which compiled to 128 lines of WASM (technically, some form of 'WAT', as WASM is a binary format). The 60 lines corresponding to F are shown below, and below that is my own stack IL code.
I'm not even sure what format that code is in, as WAT is supposed to use S-expressions. The generated code is flat. It differs in other ways from examples of WAT.
Hmm... It looks like the WASM example is already trying to follow SSA rules, then mapped to a stack IL... Not necessarily the best way to do it IMO.
I hadn't considered that SSA could be represented in stack form.
But couldn't each push be converted to an assignment to a fresh variable, and the same with pop?
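A minimal sketch of that idea in C, for straight-line code only. The opcodes and names (OP_PUSH_CONST, stack_to_temps, and so on) are invented for illustration and are not taken from WASM or from either poster's IL. The translator models the stack at translation time: each push becomes an assignment to a fresh temporary, and each pop becomes a use of the temporary that was pushed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mini stack IL, invented for this sketch. */
enum op { OP_PUSH_CONST, OP_PUSH_VAR, OP_ADD, OP_MUL, OP_STORE };

struct insn {
    enum op op;
    int imm;            /* constant for OP_PUSH_CONST */
    const char *name;   /* variable for OP_PUSH_VAR / OP_STORE */
};

/* Translate stack code into three-address text, one line per instruction.
   Each temporary tN is assigned exactly once.
   Returns the number of characters written to out. */
size_t stack_to_temps(const struct insn *code, size_t n, char *out, size_t cap)
{
    int stack[64];      /* holds temporary numbers, not run-time values */
    int sp = 0, next = 0;
    size_t len = 0;

    assert(cap > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        const struct insn *p = &code[i];
        char line[64] = "";
        int a, b;

        switch (p->op) {
        case OP_PUSH_CONST:             /* push -> fresh temporary */
            snprintf(line, sizeof line, "t%d = %d\n", next, p->imm);
            assert(sp < 64);
            stack[sp++] = next++;
            break;
        case OP_PUSH_VAR:
            snprintf(line, sizeof line, "t%d = %s\n", next, p->name);
            assert(sp < 64);
            stack[sp++] = next++;
            break;
        case OP_ADD:
        case OP_MUL:                    /* two pops -> two uses, one push */
            assert(sp >= 2);
            b = stack[--sp];
            a = stack[--sp];
            snprintf(line, sizeof line, "t%d = t%d %c t%d\n",
                     next, a, p->op == OP_ADD ? '+' : '*', b);
            stack[sp++] = next++;
            break;
        case OP_STORE:                  /* pop -> use of a temporary */
            assert(sp >= 1);
            a = stack[--sp];
            snprintf(line, sizeof line, "%s = t%d\n", p->name, a);
            break;
        }

        size_t l = strlen(line);
        assert(len + l < cap);
        memcpy(out + len, line, l + 1);
        len += l;
    }
    return len;
}
```

For x = (a + 2) * b written as push a, push 2, add, push b, mul, store x, this produces t0 = a, t1 = 2, t2 = t0 + t1, t3 = b, t4 = t2 * t3, x = t4. Control flow is deliberately left out; joins are where it stops being this simple.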
As for Phi functions, the only similar thing I encounter (but I could be mistaken) is when there is a choice of paths to yield a value (such as (c ? a : b) in C; my language has several such constructs).
With stack code, the result conveniently ends up on top of the stack whichever path is taken, which is a big advantage. Unless you then have to convert that to register code, and need to ensure the values end up in the same register when the control paths join up again.
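The point about (c ? a : b) can be illustrated with a tiny stack machine in C. This is only a sketch: the opcodes (V_PUSH, V_JZ, V_JMP, V_HALT) and function names are made up here and do not correspond to any real IL. Both arms push one value, so at the join point the result sits on top of the stack whichever path was taken.

```c
#include <assert.h>

/* Hypothetical opcodes, invented for this sketch. */
enum vop { V_PUSH, V_JZ, V_JMP, V_HALT };

struct vinsn { enum vop op; int imm; };

/* Run stack code until V_HALT and return the single value left on the stack. */
int run_stack(const struct vinsn *code)
{
    int stack[16];
    int sp = 0, pc = 0;

    for (;;) {
        const struct vinsn *p = &code[pc++];
        switch (p->op) {
        case V_PUSH:
            assert(sp < 16);
            stack[sp++] = p->imm;
            break;
        case V_JZ:                  /* pop; jump if the value was zero */
            assert(sp > 0);
            if (stack[--sp] == 0) pc = p->imm;
            break;
        case V_JMP:
            pc = p->imm;
            break;
        case V_HALT:
            assert(sp == 1);        /* exactly one result on either path */
            return stack[0];
        }
    }
}

/* Stack code for (c ? a : b). */
int select_via_stack(int c, int a, int b)
{
    const struct vinsn code[] = {
        { V_PUSH, c },   /* 0 */
        { V_JZ,   4 },   /* 1: if c == 0 goto 4 */
        { V_PUSH, a },   /* 2: true arm */
        { V_JMP,  5 },   /* 3 */
        { V_PUSH, b },   /* 4: false arm */
        { V_HALT, 0 },   /* 5: join point, result on top of stack */
    };
    return run_stack(code);
}
```

In a register version, instructions 2 and 4 would each have to write the same destination register (or a Phi would have to merge two different ones) before reaching instruction 5.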
But, yeah, in BGBCC I am also using a stack-based IL (RIL), which follows rules more in a similar category to .NET CIL (in that, stack items carry type, and the stack is generally fully emptied on branch).
In my IL, labels are identified with a LABEL opcode (with an immediate), and things like branches work by giving the branch target and the label the same immediate (label ID).
So, you jump to label L123, and the label looks like:
L123:
I think that is pretty standard! But it sounds like you use a very tight encoding for bytecode, while mine uses a 32-byte descriptor for each IL instruction.
(One quibble with labels is whether a label definition occupies an actual IL instruction. With my IL used as a backend for static languages, it does. And there can be clusters of labels at the same spot.
With dynamic bytecode designed for interpretation, it doesn't. It uses a different structure. This means labels don't need to be 'executed' when encountered.)
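The two treatments of labels can be related by a small pre-pass, sketched here in C. Everything in it is hypothetical (the IL_* opcodes, the struct layout, strip_labels, run_loop); it is not the encoding used by either IL discussed above. The input form has LABEL instructions with an ID immediate, and branches carry the same ID. The pass removes the labels and rewrites each branch to hold an instruction index, so an interpreter never has to step over a label. Clusters of labels at one spot all resolve to the same index.

```c
#include <assert.h>

#define MAX_LABELS 16

/* Hypothetical IL: one counter, labels identified by an immediate ID. */
enum il_op { IL_LABEL, IL_JUMP, IL_JLT, IL_INC, IL_HALT };

struct il_insn {
    enum il_op op;
    int imm;     /* label ID before stripping, instruction index after */
    int limit;   /* IL_JLT only: branch if counter < limit */
};

/* Copy 'in' to 'out' without LABEL instructions, turning label IDs in
   branches into indices into 'out'. Returns the length of 'out'. */
int strip_labels(const struct il_insn *in, int n, struct il_insn *out)
{
    int target[MAX_LABELS];
    int count = 0, k = 0;

    for (int i = 0; i < MAX_LABELS; i++) target[i] = -1;

    /* Pass 1: a label's address is the number of real instructions before it. */
    for (int i = 0; i < n; i++) {
        if (in[i].op == IL_LABEL) {
            assert(in[i].imm >= 0 && in[i].imm < MAX_LABELS);
            target[in[i].imm] = count;
        } else {
            count++;
        }
    }

    /* Pass 2: copy real instructions and patch branch targets. */
    for (int i = 0; i < n; i++) {
        if (in[i].op == IL_LABEL) continue;
        out[k] = in[i];
        if (in[i].op == IL_JUMP || in[i].op == IL_JLT) {
            assert(in[i].imm >= 0 && in[i].imm < MAX_LABELS);
            assert(target[in[i].imm] >= 0);   /* label must be defined */
            out[k].imm = target[in[i].imm];
        }
        k++;
    }
    return k;
}

/* Interpret label-free code; returns the final counter value. */
int run_loop(const struct il_insn *code, int n)
{
    int i = 0, pc = 0;

    while (pc < n) {
        const struct il_insn *p = &code[pc++];
        switch (p->op) {
        case IL_INC:   i++; break;
        case IL_JUMP:  pc = p->imm; break;
        case IL_JLT:   if (i < p->limit) pc = p->imm; break;
        case IL_HALT:  return i;
        case IL_LABEL: assert(0 && "labels should have been stripped"); break;
        }
    }
    return i;
}
```

Feeding it a seven-instruction encoding of the while (i<10000) ++i loop (jump to the test, a body label, the increment, two labels clustered at the test, the conditional branch, halt) yields four label-free instructions.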