Subject: Re: transpiling to low level C
From: cr88192 (at) *nospam* gmail.com (BGB)
Newsgroups: comp.lang.c
Date: 17 Dec 2024, 08:03:20
Organization: A noiseless patient Spider
Message-ID : <vjr7np$1j57r$2@dont-email.me>
User-Agent : Mozilla Thunderbird
On 12/16/2024 5:21 AM, Thiago Adams wrote:
> On 15/12/2024 20:53, BGB wrote:
>> On 12/15/2024 3:32 PM, bart wrote:
>>> On 15/12/2024 19:08, Bonita Montero wrote:
>>>> C++ is more readable because it is magnitudes more expressive than C.
>>>> You can easily write a C++ statement that would take hundreds of lines in
>>>> C (imagine specializing an unordered_map by hand). Making a language
>>>> less expressive makes it even less readable, and that's also true for
>>>> your reduced C.
>>>
>>> That's not really the point of it. This reduced C is used as an intermediate language for a compiler target. It will not usually be read, or maintained.
>>>
>>> An intermediate language needs to be at a lower level than the source language.
>>>
>>> And for this project, it needs to be compilable by any C89 compiler.
>>>
>>> Generating C++ would be quite useless.
>>
>> As an IL, even C is a little overkill, unless turned into a restricted subset (say, along similar lines to GCC's GIMPLE).
>>
>> Say:
>>   Only function-scope variables allowed;
>>   No high-level control structures;
>>   ...
>>
>> Say:
>>   int foo(int x)
>>   {
>>       int i, v;
>>       for(i=x, v=0; i>0; i--)
>>           v=v*i;
>>       return(v);
>>   }
>>
>> Becoming, say:
>>   int foo(int x)
>>   {
>>       int i;
>>       int v;
>>       i=x;
>>       v=0;
>>       if(i<=0)goto L1;
>>   L0:
>>       v=v*i;
>>       i=i-1;
>>       if(i>0)goto L0;
>>   L1:
>>       return v;
>>   }
>>
>> ...
>
> I have considered removing loops and keeping only goto.
> But I don't think this brings much simplification.

It depends.
If the compiler works like an actual C compiler, with a full parser and AST stage, yeah, it may not save much.
If the parser is a thin wrapper over 3AC operations (only allowing statements that map 1:1 with a 3AC IR operation), it may save a bit more...
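
To make the "1:1 with a 3AC operation" idea concrete, here is a rough sketch of how the goto-only foo() from earlier might look once each statement has become one three-address op, along with a tiny interpreter to show the ops are sufficient on their own. Everything here is made up for illustration (the OP_* names, the Tac layout, tac_run, foo_tac); it is not taken from any actual compiler. Also, v starts at 1 rather than 0 here, purely so that the result is non-zero.

```c
#include <assert.h>

/* Hypothetical 3AC opcodes, one per statement form the reduced C allows. */
enum { OP_MOV, OP_MOVI, OP_MUL, OP_SUBI, OP_JLE0, OP_JGT0, OP_RET };

typedef struct {
    int op;   /* one of the OP_* codes                               */
    int dst;  /* destination variable, or tested variable for jumps  */
    int a;    /* source variable, or immediate for OP_MOVI           */
    int b;    /* source variable, immediate, or jump target index    */
} Tac;

/* Variables: 0 = x, 1 = i, 2 = v. Each source line is exactly one op. */
static const Tac foo_code[] = {
    { OP_MOV,  1, 0, 0 },  /*     i = x;              */
    { OP_MOVI, 2, 1, 0 },  /*     v = 1;  (assumed)   */
    { OP_JLE0, 1, 0, 6 },  /*     if(i<=0) goto L1;   */
    { OP_MUL,  2, 2, 1 },  /* L0: v = v*i;            */
    { OP_SUBI, 1, 1, 1 },  /*     i = i-1;            */
    { OP_JGT0, 1, 0, 3 },  /*     if(i>0) goto L0;    */
    { OP_RET,  0, 2, 0 }   /* L1: return v;           */
};

/* Minimal interpreter: runs ops until OP_RET, with arg in variable 0. */
static int tac_run(const Tac *code, int n, int arg)
{
    int var[8] = {0};
    int pc = 0;
    var[0] = arg;
    for (;;) {
        const Tac *t;
        assert(pc >= 0 && pc < n);
        t = &code[pc++];
        switch (t->op) {
        case OP_MOV:  var[t->dst] = var[t->a];             break;
        case OP_MOVI: var[t->dst] = t->a;                  break;
        case OP_MUL:  var[t->dst] = var[t->a] * var[t->b]; break;
        case OP_SUBI: var[t->dst] = var[t->a] - t->b;      break;
        case OP_JLE0: if (var[t->dst] <= 0) pc = t->b;     break;
        case OP_JGT0: if (var[t->dst] >  0) pc = t->b;     break;
        case OP_RET:  return var[t->a];
        default:      assert(0); return 0;
        }
    }
}

int foo_tac(int x)
{
    return tac_run(foo_code, (int)(sizeof foo_code / sizeof foo_code[0]), x);
}
```

With this table, foo_tac(5) walks the loop five times and returns 120, and foo_tac(0) takes the early jump to L1 and returns 1. A parser for such a subset would only ever need to recognize these few statement shapes.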
As for whether or not it makes sense to use a C-like syntax here, that is more up for debate (for practical use within a compiler, I would assume a binary serialization rather than an ASCII syntax, though ASCII may be better in terms of interoperability or human readability).
But, as can be noted, I would assume a binary serialization that is oriented around operators; and *not* about serializing the structures used to implement those operators. Also I would assume that the IR need not be in SSA form (conversion to full SSA could be done when reading in the IR operations).
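
As a rough sketch of what "oriented around operators" could mean: the in-memory op below carries a list link, but only the operation fields are written out, at four bytes per op. The IrOp layout, the byte format, and the ir_encode/ir_decode names are all hypothetical, and fields are assumed to fit in 0..255 to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

typedef struct IrOp {
    int op, dst, a, b;     /* the operation itself: this is what gets saved */
    struct IrOp *next;     /* in-memory list link: never written out        */
} IrOp;

/* Returns 1 if v fits in one byte of the (hypothetical) format. */
static int fits_byte(int v) { return v >= 0 && v <= 255; }

/* Encode a linked list of ops as 4 bytes each (op, dst, a, b).
   Returns bytes written; 0 if the buffer is too small, a field is
   out of range, or the list is empty. */
size_t ir_encode(const IrOp *head, unsigned char *buf, size_t cap)
{
    size_t n = 0;
    const IrOp *p;
    assert(buf != NULL);
    for (p = head; p != NULL; p = p->next) {
        if (cap - n < 4)
            return 0;
        if (!fits_byte(p->op) || !fits_byte(p->dst) ||
            !fits_byte(p->a)  || !fits_byte(p->b))
            return 0;
        buf[n++] = (unsigned char)p->op;
        buf[n++] = (unsigned char)p->dst;
        buf[n++] = (unsigned char)p->a;
        buf[n++] = (unsigned char)p->b;
    }
    return n;
}

/* Decode into a caller-provided array and rebuild the links on the
   reader's side. Returns the number of ops, or 0 on malformed input. */
size_t ir_decode(const unsigned char *buf, size_t len,
                 IrOp *out, size_t max_ops)
{
    size_t count = len / 4, i;
    assert(buf != NULL && out != NULL);
    if (len % 4 != 0 || count > max_ops)
        return 0;
    for (i = 0; i < count; i++) {
        out[i].op   = buf[4 * i];
        out[i].dst  = buf[4 * i + 1];
        out[i].a    = buf[4 * i + 2];
        out[i].b    = buf[4 * i + 3];
        out[i].next = (i + 1 < count) ? &out[i + 1] : NULL;
    }
    return count;
}
```

The reader is free to rebuild whatever structures it likes (here, just the list links), so the file format does not change when the compiler's internal structs do.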
My argument is that not using SSA form means fewer issues for both the serialization format and the compiler front-end to deal with (and SSA is comparatively easy to regenerate in the backend, which can keep its internal IR in SSA form).
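
For the easy part of regenerating SSA on the backend side, a minimal sketch: within a single straight-line block, each assignment bumps a per-variable version counter and each use refers to the current version. The names here (Op, SsaOp, NVARS, ssa_rename_block) are invented for the example, the operator kind is left out for brevity, and phi placement across branches and loops is deliberately not handled; that is the harder part.

```c
#include <assert.h>

#define NVARS 4  /* hypothetical limit on variables per function */

typedef struct { int dst, a, b; } Op;   /* non-SSA: dst = a OP b */

typedef struct {
    int dst, dst_ver;   /* variable written, and the new version it defines */
    int a,   a_ver;     /* first source and the version it reads            */
    int b,   b_ver;     /* second source and the version it reads           */
} SsaOp;

/* Rename one basic block into SSA. Version 0 of each variable is its
   value on entry to the block. */
void ssa_rename_block(const Op *in, SsaOp *out, int n)
{
    int ver[NVARS] = {0};
    int i;
    for (i = 0; i < n; i++) {
        assert(in[i].dst >= 0 && in[i].dst < NVARS);
        assert(in[i].a   >= 0 && in[i].a   < NVARS);
        assert(in[i].b   >= 0 && in[i].b   < NVARS);
        /* Sources are looked up before the destination is bumped,
           so "v = v*i" reads the old v and defines a new one. */
        out[i].a = in[i].a;  out[i].a_ver = ver[in[i].a];
        out[i].b = in[i].b;  out[i].b_ver = ver[in[i].b];
        out[i].dst = in[i].dst;
        out[i].dst_ver = ++ver[in[i].dst];
    }
}
```

With i as variable 1 and v as variable 2, the sequence "v = v OP i; i = i OP v; v = v OP i" comes out as "v1 = v0 OP i0; i1 = i0 OP v1; v2 = v1 OP i1". The serialized form only ever needs the first, unversioned shape.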
Well, in contrast to LLVM, which assumes everything is always in SSA form.
...