Subject : Re: question about linker
From : bc (at) *nospam* freeuk.com (Bart)
Newsgroups : comp.lang.c
Date : 27 Nov 2024, 12:57:29
Organisation : A noiseless patient Spider
Message-ID : <vi71f9$7be$1@dont-email.me>
References : 1 2 3 4 5 6
User-Agent : Mozilla Thunderbird
On 27/11/2024 01:52, Thiago Adams wrote:
> On 11/26/2024 9:59 PM, Bart wrote:
>> Hard C Features
>
> ...
> I think K&R C is simpler than C89, which is simpler than C99, etc.
> My objective is to find the minimum code generator (backend) in C and leave the other complexities (like warnings, static analysis, constexpr, preprocessor) to the front end.
> This also makes it easier to have more than one backend sharing the job done by the front end.
> For instance, I may move all constant expressions out of the generated code. So the backend does not have to compute constant expressions any more, and it is still C89 compatible. I removed enums, for instance.
> So my question is not about how to create a simple C compiler but about how to separate and move most of the job to the front end, creating a very simple backend (code generator) whose input is C89-compatible code.
>
> I believe that K&R C is simpler than C89, which in turn is simpler than C99, and so on.
> My goal is to design a minimal code generator (backend) in C reading C89, while delegating other complexities (such as warnings, static analysis, constexpr, and preprocessing) to the front end.
> This approach also facilitates using multiple backends that share the work handled by the front end.
> For example, I might remove all constant expressions from the generated code, so the backend no longer needs to compute them and remains C89-compatible. I've already removed features like enums and typedefs, for instance.
> Therefore, my question isn't about how to create a simple C compiler.
> Instead, it's about how to shift most of the workload to the front end, resulting in a very simple backend (code generator) that processes C89-compatible code as input.
>
> Does it make sense?
Not really. You're basically talking about using an IR or IL, which most compilers already do, including mine now. Clang for example uses LLVM IR.
Some languages use C as an intermediate language.
But you seem to be getting C, and lower level ILs, mixed up.
If you are transpiling to C, then just generate C code, C89 if you like. In that case you don't need to discard 90% of the language to make it simpler! Simpler for whom? C89 compilers that can deal with function prototypes etc already exist; you said you are not writing your own compiler.
This shows your confusion:
> I was wondering if splitting expressions would make the backend simpler
>
> for instance
>
> int r = a + b * c;
>
> converted to
>
> int r1 = b * c;
> int r2 = a + r1;
> int r = r2;
You are still talking as though YOU are writing the backend! You will either use an existing C compiler or an existing IL backend, but there aren't that many of the latter.
The most famous is LLVM, but it is fantastically complex, huge and slow.
(To see examples of LLVM IR, go to godbolt.org, choose C language, choose a Clang compiler, and enter '-S -emit-llvm' as the compiler options. Then try an example C function in the left panel.)
I also use ILs for my compilers, but I write my own backends. I've worked on two different kinds. One looks like a HLL, and only exists for my language. So this original source:
    proc F=
        int r, a, b, c
        r := a + b*c
    end
Generates this IL:
    Proc f():
        i64 r
        i64 a
        i64 b
        i64 c
    !------------------------
        T1 := b * c          i64
        T2 := a + T1         i64
        r := T2              i64
    !------------------------
        retproc
    End
It looks great, but was hard to work with. Instead I settled on this lower level IL, which looks like assembly. That one also works with C, so given this C function:
    void F() {
        int r, a, b, c;
        r = a + b*c;
    }
it produces this IL code:
    proc F::
        local    i32    r.1
        local    i32    a.1
        local    i32    b.1
        local    i32    c.1
    !------------------------
        load     i32    a.1        ! 00005
        load     i32    b.1        ! 00006
        load     i32    c.1        ! 00007
        mul      i32               ! 00008
        add      i32               ! 00009
        store    i32    r.1        ! 00010
    !------------------------
    #1:
        retproc                    ! 00013
    endproc
(The '::' indicates an exported function, as 'static' was not used. The .1 suffixes are to do with block scopes, since there can be multiple 'a' identifiers in a function.)
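To make the difference between the two IL styles above concrete, here is a minimal sketch in C that lowers the same expression tree both ways. It is not taken from any actual compiler: the names Expr, Out, out_line, emit_tac, emit_stack, lower_tac and lower_stack are all hypothetical, only '+' and '*' are handled, and the type annotations (i32/i64) and the '.1' scope suffixes are left out.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical expression tree: op == 0 marks a leaf naming a variable. */
typedef struct Expr {
    char op;                      /* '+' or '*', or 0 for a leaf */
    const char *name;             /* variable name (leaves only) */
    const struct Expr *lhs, *rhs; /* operands (inner nodes only) */
} Expr;

/* Hypothetical output buffer; next_temp numbers the T1, T2, ... temporaries. */
typedef struct Out {
    char text[512];
    size_t len;
    int next_temp;
} Out;

/* Append one formatted line to the buffer. */
void out_line(Out *o, const char *fmt, ...)
{
    va_list ap;
    int n;
    va_start(ap, fmt);
    n = vsnprintf(o->text + o->len, sizeof o->text - o->len, fmt, ap);
    va_end(ap);
    assert(n >= 0 && (size_t)n < sizeof o->text - o->len);
    o->len += (size_t)n;
}

/* Three-address style: every operator writes a fresh temporary.
   The name of the operand holding the value is copied to result. */
void emit_tac(Out *o, const Expr *e, char *result, size_t size)
{
    char l[32], r[32];
    if (e->op == 0) {
        snprintf(result, size, "%s", e->name);
        return;
    }
    emit_tac(o, e->lhs, l, sizeof l);
    emit_tac(o, e->rhs, r, sizeof r);
    snprintf(result, size, "T%d", ++o->next_temp);
    out_line(o, "%s := %s %c %s\n", result, l, e->op, r);
}

/* Stack style: a post-order walk loads operands; operators name no operands. */
void emit_stack(Out *o, const Expr *e)
{
    if (e->op == 0) {
        out_line(o, "load %s\n", e->name);
        return;
    }
    emit_stack(o, e->lhs);
    emit_stack(o, e->rhs);
    assert(e->op == '+' || e->op == '*');
    out_line(o, "%s\n", e->op == '*' ? "mul" : "add");
}

/* Lower the assignment "dest = e" in three-address style. */
void lower_tac(Out *o, const char *dest, const Expr *e)
{
    char v[32];
    emit_tac(o, e, v, sizeof v);
    out_line(o, "%s := %s\n", dest, v);
}

/* Lower the assignment "dest = e" in stack style. */
void lower_stack(Out *o, const char *dest, const Expr *e)
{
    emit_stack(o, e);
    out_line(o, "store %s\n", dest);
}
```

Given a tree for a + b*c and destination r, lower_tac fills the buffer with the three lines T1 := b * c, T2 := a + T1, r := T2, while lower_stack produces load a, load b, load c, mul, add, store r. The stack form needs no temporary names at all, which is one way it ends up closer to assembly.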
No function prototypes are needed, since everything needed is specified at the call-site. The front-end will provide any necessary conversions or promotions. There are attributes that appear to mark variadic calls, for example.
Only imported functions need to be listed.
This sounds vaguely like what you are trying to achieve, but you have the idea that this IL must be C.
C however will generally need that extra info (function signatures etc) or it will cause problems. But it is very little trouble to provide them.
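As a small illustration of that extra info, here is a sketch of what generated C might look like. The function half and the two callers are hypothetical names invented for this example; the point is only that the one-line prototype is what lets the C compiler get the call right.

```c
#include <assert.h>

/* The "extra info": a prototype for a hypothetical function 'half'.
   In generated C, a front end would emit this line from its own
   symbol information. */
double half(double x);

/* Call site as plain C. The prototype in scope makes the compiler
   convert the int argument to double before the call. */
int call_plain(int n)
{
    return (int)half(n);
}

/* Call site with the conversions spelled out, in the spirit of an IL
   where the front end inserts them. C still needs the declaration
   above to know that 'half' returns double. */
int call_explicit(int n)
{
    return (int)half((double)n);
}

/* Definition, standing in for an imported function. */
double half(double x)
{
    return x / 2.0;
}
```

Without the prototype, a C89 compiler would implicitly declare half as returning int and pass n unconverted, which is undefined behaviour against this definition; from C99 on, the undeclared call is not valid at all.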