Subject: Re: question about linker
From: bc (at) *nospam* freeuk.com (Bart)
Newsgroups: comp.lang.c
Date: 27 Nov 2024, 16:12:39
Organization: A noiseless patient Spider
Message-ID: <vi7ct8$28ej$1@dont-email.me>
References: 1 2 3 4 5 6 7 8 9 10 11
User-Agent: Mozilla Thunderbird
On 27/11/2024 13:27, Thiago Adams wrote:
On 27/11/2024 09:57, Thiago Adams wrote:
On 27/11/2024 09:38, Thiago Adams wrote:
On 27/11/2024 09:10, Bart wrote:
On 27/11/2024 11:57, Bart wrote:
On 27/11/2024 01:52, Thiago Adams wrote:
>
I also use ILs for my compilers, but I write my own backends. I've worked on two different kinds. One looks like a HLL, and only exists for my language. So this original source:
>
>
>
I was wondering if it is possible to write C programs without struct/union?
>
I did this experiment.
>
struct X {
int a, b;
};
>
void F1() {
struct X x;
x.a = 1;
x.b = 2;
printf("%d, %d", x.a, x.b);
}
>
The equivalent C89 program in a subset without structs could be
>
#define M(T, obj, OFF) (*((T*)(((char*)&(obj)) + (OFF))))
>
void F2() {
char x[8];
M(int, x, 0 /*offset of a*/) = 1;
M(int, x, 4 /*offset of b*/) = 2;
printf("\n");
printf("%d, %d", M(int, x, 0), M(int, x, 4));
}
>
The char array represents the struct X memory, then we have to find the offset of the members and cast to their types.
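One caveat with the cast in M(): a plain char array is not guaranteed to be suitably aligned for int, and reading it through an int* raises aliasing questions. A minimal sketch of the same idea that avoids both, going through memcpy, might look like the following. The names put_int, get_int and f2_sum are made up for illustration, and struct X is kept here only as a reference for sizeof/offsetof; a front end lowering to a struct-free subset would emit the literal numbers instead.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <string.h>  /* memcpy */

/* Reference layout, used only to obtain the size and member offsets. */
struct X {
    int a, b;
};

/* Hypothetical helper: store an int at a byte offset in a raw buffer.
   memcpy sidesteps the alignment and aliasing issues of a pointer cast. */
static void put_int(unsigned char *buf, size_t off, int value) {
    memcpy(buf + off, &value, sizeof value);
}

/* Hypothetical helper: load an int from a byte offset in a raw buffer. */
static int get_int(const unsigned char *buf, size_t off) {
    int value;
    memcpy(&value, buf + off, sizeof value);
    return value;
}

/* Same job as F2 above, but returns a + b so the result can be checked. */
int f2_sum(void) {
    unsigned char x[sizeof(struct X)];   /* raw storage standing in for struct X */
    put_int(x, offsetof(struct X, a), 1);
    put_int(x, offsetof(struct X, b), 2);
    return get_int(x, offsetof(struct X, a)) + get_int(x, offsetof(struct X, b));
}
```

Compilers usually turn these small fixed-size memcpy calls into ordinary loads and stores, so this form need not cost anything over the cast.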
>
>
Does your IL have structs?
No. It has a 'block' type which defines a fixed-length memory block. So your struct ST type below I think would be represented as the type 'mem:824', as it is 824 bytes.
(This works fine for WinABI. The SYS V ABI, however, has a much more complex set of rules, where struct passing may depend on the types of the members, which may be split up amongst different registers.
I'm not too worried about that however; it will only apply to structs passed by value across an FFI, and most external libraries don't pass by-value structs. There will also be workarounds.)
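For anyone wondering where 824 comes from, a small sketch like this reports the size and the relevant member offsets of the struct ST from the LLVM example quoted further down. The function names are made up for illustration. The figures in the comments assume a typical x86-64 layout (4-byte int, 8-byte double aligned to 8); other targets may differ.

```c
#include <stddef.h>  /* offsetof, size_t */

struct RT {
    char A;
    int B[10][20];
    char C;
};
struct ST {
    int X;
    double Y;
    struct RT Z;
};

/* Sizes and offsets as laid out by whichever compiler builds this. */
size_t st_size(void)  { return sizeof(struct ST); }        /* 824 on x86-64 */
size_t rt_size(void)  { return sizeof(struct RT); }        /* 808 on x86-64 */
size_t st_off_z(void) { return offsetof(struct ST, Z); }   /* 16 on x86-64  */
size_t rt_off_b(void) { return offsetof(struct RT, B); }   /* 4 on x86-64   */
```

So a 'mem:824' block only needs the total size; the member offsets are applied separately by whatever generates the IL.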
The QBE IL has aggregate types. I think this removes the need for the front end to calculate the offsets.
>
https://c9x.me/compile/doc/il.html#Aggregate-Types
>
>
>
>
I tried this sample with clang and -S -emit-llvm to see if it generates structs. The answer is yes.
https://llvm.org/docs/LangRef.html#getelementptr-instruction
struct RT {
char A;
int B[10][20];
char C;
};
struct ST {
int X;
double Y;
struct RT Z;
};
int *foo(struct ST *s) {
return &s[1].Z.B[5][13];
}
The LLVM code generated by Clang is approximately:
%struct.RT = type { i8, [10 x [20 x i32]], i8 }
%struct.ST = type { i32, double, %struct.RT }
define ptr @foo(ptr %s) {
entry:
%arrayidx = getelementptr inbounds %struct.ST, ptr %s, i64 1, i32 2, i32 1, i64 5, i64 13
ret ptr %arrayidx
}
This example is misleading. That's the output from using -O3. Unoptimised LLVM output is this:
---------------------------------
define dso_local ptr @foo(ptr noundef %0) #0 !dbg !10 {
%2 = alloca ptr, align 8
store ptr %0, ptr %2, align 8
#dbg_declare(ptr %2, !34, !DIExpression(), !35)
%3 = load ptr, ptr %2, align 8, !dbg !36
%4 = getelementptr inbounds %struct.ST, ptr %3, i64 1, !dbg !36
%5 = getelementptr inbounds nuw %struct.ST, ptr %4, i32 0, i32 2, !dbg !37
%6 = getelementptr inbounds nuw %struct.RT, ptr %5, i32 0, i32 1, !dbg !38
%7 = getelementptr inbounds [10 x [20 x i32]], ptr %6, i64 0, i64 5, !dbg !36
%8 = getelementptr inbounds [20 x i32], ptr %7, i64 0, i64 13, !dbg !36
ret ptr %8, !dbg !39
}
---------------------------------
If you are writing the IR code, then it will be up to you to combine that chain of constant offsets into a single offset. Otherwise it will still be you needing to do so the other side of the IR!
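As a rough illustration of what combining the chain means, each step of the getelementptr sequence is a constant byte offset, so they can be summed once and applied with a single addition. This is only a sketch in C, not what Clang or LLVM does internally. The names folded_offset and foo_folded are made up, and the total of 1296 bytes assumes a typical x86-64 layout.

```c
#include <stddef.h>  /* offsetof, size_t */

struct RT {
    char A;
    int B[10][20];
    char C;
};
struct ST {
    int X;
    double Y;
    struct RT Z;
};

/* Each step of the chain as a constant byte offset, summed once. */
size_t folded_offset(void) {
    return 1 * sizeof(struct ST)      /* s[1]  : skip one whole ST   */
         + offsetof(struct ST, Z)     /* .Z    : member within ST    */
         + offsetof(struct RT, B)     /* .B    : member within RT    */
         + 5 * sizeof(int[20])        /* [5]   : five rows of 20 int */
         + 13 * sizeof(int);          /* [13]  : thirteen ints       */
}

/* The address computation done with one addition. */
int *foo_folded(struct ST *s) {
    return (int *)((char *)s + folded_offset());
}

/* The original function from the example, for comparison. */
int *foo(struct ST *s) {
    return &s[1].Z.B[5][13];
}
```

Both functions should return the same address for any valid s pointing at an array of at least two elements.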
(I don't know if the reduction above is done pre-LLVM or by LLVM.)
In my IL, it will generate multiple instructions, and there will be a reduction pass, to combine instructions where possible. That's a WIP, but examples like yours are incredibly rare in my code-base, while the speed-up achieved is likely to be minor. Modern CPUs are good at running poor code fast. Mostly this just makes code more compact.
My IL for your example (I translated to my language) starts off as this:
---------------------
proc t.foo:
param u64 s
rettype u64
load u64 s
load i64 1
addpx mem:824 /824/-824 # /scale factor /extra byte offset
load i64 20
addpx u64 /1
load i64 5
addpx mem:80 /80
load i64 13
addpx i32 /4
jumpret u64 #1
#1:
retfn u64
endproc
---------------------
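To make the listing easier to follow, here is a sketch of it in C. It assumes, going only by the comment in the listing, that addpx computes base + index * scale + extra byte offset. The functions addpx and foo_il below are illustrative stand-ins, not the actual IL implementation.

```c
#include <stdint.h>

/* Assumed semantics of addpx, inferred from the "/scale factor /extra byte
   offset" comment: pointer + index * scale + extra. */
uint64_t addpx(uint64_t base, int64_t index, int64_t scale, int64_t extra) {
    return base + (uint64_t)(index * scale + extra);
}

/* The four addpx steps of the listing, in order. */
uint64_t foo_il(uint64_t s) {
    uint64_t p = s;
    p = addpx(p, 1, 824, -824);  /* addpx mem:824 /824/-824 : nets to +0 */
    p = addpx(p, 20, 1, 0);      /* addpx u64 /1            : +20        */
    p = addpx(p, 5, 80, 0);      /* addpx mem:80 /80        : +400       */
    p = addpx(p, 13, 4, 0);      /* addpx i32 /4            : +52        */
    return p;
}
```

Under that reading the steps add 0, 20, 400 and 52 bytes, 472 in total.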
The reductions could also be applied during codegen to native code. But as it is, no reductions are done, and the body of the function generates this x64 code:
mov rax, [rbp + `t.foo.s] # or mov rax, rcx with reg allocator
lea rax, [rax+20]
lea rax, [rax+400]
lea rax, [rax+52]
Here the reduction could also be done with a peephole optimiser to combine the three LEAs into one instruction. With 's' in a register, probably the optimum code here would be one LEA instruction.
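A peephole pass of that kind can be quite small. The following is a minimal sketch over a made-up, much simplified instruction record (struct insn, enum op and fold_leas are all hypothetical, not taken from any real backend). It only handles the case above: adjacent "lea r, [r+disp]" instructions on the same register.

```c
#include <stddef.h>  /* size_t */

/* Hypothetical instruction kinds: a load into a register, and
   "lea reg, [reg+disp]" where source and destination are the same. */
enum op { OP_MOV_LOAD, OP_LEA_SELF };

struct insn {
    enum op op;
    int reg;     /* destination register (also the base for OP_LEA_SELF) */
    long disp;   /* displacement for OP_LEA_SELF; unused otherwise */
};

/* Merge runs of adjacent OP_LEA_SELF on the same register by summing
   their displacements. Rewrites code[] in place, returns the new length. */
size_t fold_leas(struct insn *code, size_t n) {
    size_t out = 0, i;
    for (i = 0; i < n; i++) {
        if (out > 0 &&
            code[i].op == OP_LEA_SELF &&
            code[out - 1].op == OP_LEA_SELF &&
            code[out - 1].reg == code[i].reg) {
            code[out - 1].disp += code[i].disp;  /* fold into previous LEA */
        } else {
            code[out++] = code[i];               /* keep instruction as is */
        }
    }
    return out;
}
```

Fed the mov plus the three LEAs above, this should leave the mov and a single lea with displacement 472. A real pass would also need to check that the combined displacement still fits the instruction encoding, which this sketch ignores.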