Sujet : Re: transpiling to low level C
De : cr88192 (at) *nospam* gmail.com (BGB)
Groupes : comp.lang.cDate : 17. Dec 2024, 19:51:04
Autres entêtes
Organisation : A noiseless patient Spider
Message-ID : <vjsh6r$1s5j5$1@dont-email.me>
References : 1 2 3 4 5 6
User-Agent : Mozilla Thunderbird
On 12/17/2024 6:04 AM, bart wrote:
On 16/12/2024 21:23, Lawrence D'Oliveiro wrote:
On Sun, 15 Dec 2024 17:53:30 -0600, BGB wrote:
>
As an IL, even C is a little overkill, unless turned into a restricted
subset ...
>
Why not use WASM as your IL?
Have you tried it? I mean, directly generating WASM from a compiler front-end, not just using somebody else's tool to do so.
WASM is a stack-based language, but one that supposedly doesn't even have branching, although there is a 'br' statement, with some restrictions.
Information about it is quite elusive; it took me 5 minutes to even get examples of what it looks like (and I've seen it before).
C can apparently compile to WASM via Clang, so I tried this program:
void F(void) {
int i=0;
while (i<10000) ++i;
}
which compiled to 128 lines of WASM (technically, some form of 'WAT', as WASM is a binary format). The 60 lines correspondoing to F are shown below, and below that, is my own stack IL code.
So, what do you with your WASM/WAT program once generated? I've no idea, except that WASM is inextricably typed up with with browsers and with JavaScript, in which I have no interest.
With C, you run a compiler; with ASM, an assembler; these formats are well understood.
You can appreciate that it can be easier to devise your own format and your own tools that you understand 100%.
Hmm... It looks like the WASM example is already trying to follow SSA rules, then mapped to a stack IL... Not necessarily the best way to do it IMO.
But, yeah, in BGBCC I am also using a stack-based IL (RIL), which follows rules more in a similar category to .NET CIL (in that, stack items carry type, and the stack is generally fully emptied on branch).
In my IL, labels are identified with a LABEL opcode (with an immediate), and things like branches work by having the branch target and label having the same immediate (label ID).
I ended up considering this preferable to byte offsets, as:
Easier to generate from the front-end;
LABEL also marks the start/end of basic blocks;
...
There are also opcodes to convey the source filename and line number, these don't generate any output but merely serve to transport filename and line number information (useful for debugging).
RIL was a little weird in that functions and variables are themselves defined via bytecode operations. This is unlike both JVM and .NET CIL, which had used external metadata/structures for defining functions and variables (nevermind the significant differences between JVM and .NET in this area).
This is pros/cons, main downside of the current format is that it requires the bytecode modules to be loaded sequentially and fully. This works OK for a compiler on a modern PC, but does impose on RAM somewhat for a compiler on a more memory-constrained target. One idea would be to individually wrap functions and have a mechanism so that they can be loaded dynamically. But, this hasn't really been done for my existing IL. Most likely option is that metadata continues to be defined via bytecode operations, just that each function is separately wrapped, and there may be an index to map function names to the corresponding "lump" (say, if using a WAD variant as the top-level container).
Say:
Lump name is "FNC01234" (IWAD) or "func_1234" (WAD2).
And there is a table mapping "FOO_SomeFunction" to "FNC01234" or "func_1234".
But, this sort of things, along with past ideas to try moving this over to a format along similar lines to RIFF/AVI, have generally fizzled (along with possible debate over to to the merits of a WAD-like or RIFF-like format).
Though, an arguably simpler option might be to just individually wrap the bytecode for each translation unit, and have an manifest of what symbols are present. In this way, it would function more like a traditional static library (as opposed to the current strategy of globing all of the translation units in the library into a single large blob of bytecode); and probably dumping the bytecode for each translation unit into a WAD (again, possibly either IWAD or WAD2, though probably WAD2 in this case, as the comparably larger lumps would eliminate most concern over the larger directory entries).
When converting to the 3AC IR, there is the quirk that function calls are split into multiple parts:
The CALL operation, which ends the current basic-block;
A CSRV operation, which is at the start of a new basic block.
CSRV = Caller Save Return Value.
In cases where the 3AC was being interpreted, this was better, as the CSRV operation serves to save the return value from the called function to the correct place in the caller's frame (where the interpreter does not use recursion for its own operation).
Internal conversion to 3AC was faster than trying to directly interpret a stack bytecode (as well as 3AC being a better format for code generation).
--------------------------------------
F: # @F
.functype F () -> ()
.local i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32
# %bb.0:
global.get __stack_pointer
local.set 0
i32.const 16
local.set 1
local.get 0
local.get 1
i32.sub
local.set 2
i32.const 0
local.set 3
local.get 2
local.get 3
i32.store 12
.LBB0_1: # =>This Inner Loop Header: Depth=1
block
loop # label1:
local.get 2
i32.load 12
local.set 4
i32.const 10000
local.set 5
local.get 4
local.set 6
local.get 5
local.set 7
local.get 6
local.get 7
i32.lt_s
local.set 8
i32.const 1
local.set 9
local.get 8
local.get 9
i32.and
local.set 10
local.get 10
i32.eqz
br_if 1 # 1: down to label0
# %bb.2: # in Loop: Header=BB0_1 Depth=1
local.get 2
i32.load 12
local.set 11
i32.const 1
local.set 12
local.get 11
local.get 12
i32.add
local.set 13
local.get 2
local.get 13
i32.store 12
br 0 # 0: up to label1
.LBB0_3:
end_loop
end_block # label0:
return
end_function
-----------------------------
proc F::
local i32 i.1
load i32 0
store i32 i.1
jump #2
#4:
load u64 &i.1
incrto i32 /1
#2:
load i32 i.1
load i32 10000
jumplt i32 #4
#3:
#1:
retproc
endproc