On 06/12/2024 23:30, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
(For example, I got tcc.c working at one point. My generated tcc.exe
could compile tcc.c, but that second-generation tcc.c didn't work.)
Clearly, you work in stages: first you find out what is wrong with the second-generation tcc.exe.
Ha, ha, ha!
While C /can/ be written reasonably clearly, the tcc sources are more typical: very dense, a mix of lower and upper case everywhere, and an apparent over-use of macros, e.g.:
    for_each_elem(symtab_section, 1, sym, ElfW(Sym)) {
        if (sym->st_shndx == SHN_UNDEF) {
            name = (char *) symtab_section->link->data + sym->st_name;
            sym_index = find_elf_sym(s1->dynsymtab_section, name);
If I were looking to develop this product, it might be worth spending days or weeks learning how it all works. But it's not worth mastering this codebase inside out just to discover I wrote 0 instead of 1 somewhere in my compiler.
I need whatever error it is to manifest itself in a simpler way, or to have two versions (e.g. one interpreted, the other native code) that give different results. The problem with this app is that those different results appear too far down the line; I don't want to trace a billion instructions first.
So, when I get back to it, I'll test other open source C code. (The annoying thing though is that either it won't compile for reasons I've lost interest in, or it works completely fine.)
(In my interpreter, it grows downwards!)
You probably meant upwards?
Yes.
And handling such things is natural when you have portability in mind: either you parametrise stdarg.h so that it works for both stack directions, or you make sure that the interpreter and compiler use the same direction (the latter seems to be much easier).
This is quite a tricky one actually. There is currently conditional code in my stdarg.h that detects whether the compiler has set a flag saying the result will be interpreted. But it doesn't always know that.
For example, the compiler might be told to do -E (preprocess) and the result compiled later. The stack direction is baked into the output.
Or it will do -p (generate discrete IL), where it doesn't know whether that will be interpreted.
But this is not a serious issue; the interpreted option is for either debugging or novelty uses.
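(To illustrate the parametrised approach, a stdarg.h could switch on such a flag along these lines. A sketch only: the ARGS_GROW_UP name is invented, it assumes 8-byte argument slots, and it isn't the actual header from either implementation:)

    /* stdarg.h sketch: pick the argument traversal direction at
       compile time, based on a flag the compiler defines. */
    typedef char *va_list;

    #define va_size(t)  ((sizeof(t) + 7) & ~7)   /* round up to 8-byte slot */

    #ifdef ARGS_GROW_UP
        /* successive arguments live at increasing addresses */
        #define va_start(ap, last)  ((ap) = (char *)&(last) + va_size(last))
        #define va_arg(ap, t)       (*(t *)(((ap) += va_size(t)) - va_size(t)))
    #else
        /* successive arguments live at decreasing addresses */
        #define va_start(ap, last)  ((ap) = (char *)&(last))
        #define va_arg(ap, t)       (*(t *)((ap) -= va_size(t)))
    #endif

    #define va_end(ap)              ((void)0)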
Actually, I think the most natural way is to have the data structure layout in the interpreter as close as possible to the compiler's data layout.
I don't want my hand forced in this. The point of interpreting is to be independent of hardware. A downward growing stack is unnatural.
They'd have to use it from the start. But then they may want to use
libraries which only work with gcc ...
Well, you see that there are reasons to use 'gcc'.
Self-perpetuating ones, which are the wrong reasons.
The next version was cross-compiled on Linux using gcc. This version used inline assembly for rounding and was significantly faster than what Borland C produced. Note: the images to process were largish (think of, say, 12000 by 20000 pixels) and speed was an important factor. So using gcc-specific code was IMO justified (this code was used conditionally; other compilers would get a slow portable version using 'floor').
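(The pattern would have looked roughly like this; a sketch of the technique only, with an invented function name, not the original code:)

    #include <math.h>

    /* Fast x87 rounding for gcc on x86, with a portable floor()
       fallback for other compilers. fistpl stores st(0) as a rounded
       32-bit integer using the current rounding mode and pops it,
       hence the "st" clobber. */
    static inline int round_to_int(double x)
    {
    #if defined(__GNUC__) && defined(__i386__)
        int r;
        __asm__ ("fistpl %0" : "=m" (r) : "t" (x) : "st");
        return r;
    #else
        return (int)floor(x + 0.5);   /* slow portable version */
    #endif
    }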
I have a little image editor written entirely in interpreted code. (It was supposed to be a mixed-language project, but that's some way off.)
However, it is just about usable. E.g. inverting the colours (negative to positive etc.) of a 6Mpix colour image takes 1/8th of a second. Splitting into separate R, G, B 8-bit planes takes half a second. This is with bytecode working on a pixel at a time.
It uses no optimised code in the interpreter, only a mildly accelerated dispatcher.
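(By a 'mildly accelerated dispatcher', think of something along these lines: computed goto, a gcc extension which tcc also accepts. A generic sketch, not my actual dispatch loop:)

    #include <stdio.h>

    /* Threaded dispatch via gcc's "labels as values" extension;
       a plain switch() would be the portable equivalent. */
    int run(const unsigned char *code)
    {
        static const void *labels[] = { &&op_push1, &&op_add, &&op_halt };
        int stack[64], *sp = stack;

        #define NEXT goto *labels[*code++]
        NEXT;

    op_push1: *sp++ = 1;              NEXT;
    op_add:   sp--; sp[-1] += sp[0];  NEXT;
    op_halt:  return sp[-1];
    }

    int main(void)
    {
        const unsigned char prog[] = { 0, 0, 1, 2 };  /* push1 push1 add halt */
        printf("%d\n", run(prog));                    /* prints 2 */
        return 0;
    }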
You need to improve your propaganda for faster C compilers...
I actually don't know why I care. I get the benefit of my fast tools every day; they're a joy to use. So I shouldn't be bothered that other people are so tolerant of slow, cumbersome build systems.
But then, people in this group do like to belittle small, fast products
(tcc for example as well as my stuff), and that's where it gets annoying.
I tried tcc compiling TeX. Long ago it did not work due to limitations of tcc; this time it worked. A small comparison on the main file (19062 lines):
    Command       time (s)   code size (bytes)   data size (bytes)
    tcc -g           0.017              290521                1188
    tcc              0.015              290521                1188
    gcc -O0 -g       0.440              248467                  14
    gcc -O0          0.413              248467                  14
This is demonstrating that tcc is translating C code at over 1.2 million lines per second (19062 lines / 0.015s), and generating binary code at about 17MB per second (290521 bytes / 0.017s). You're not impressed by that?
Here are a couple of reasonably substantial one-file programs that can be run, both interpreters:
https://github.com/sal55/langs/blob/master/lua.c

This is a one-file Lua interpreter, which I modified to take input from a file. (For the original, see the comment at the start.)
On my machine, these are typical results:
    Compiler      compile time   size     runtime
    gcc -s -O3    14    secs     378KB    3.0 secs
    gcc -s -O0     3.3  secs     372KB   10.0 secs
    tcc            0.12 secs     384KB    8.5 secs
    cc             0.14 secs     315KB    8.3 secs
The runtime refers to running this Fibonacci test (fib.lua):
    function fibonacci(n)
        if n<3 then
            return 1
        else
            return fibonacci(n-1) + fibonacci(n-2)
        end
    end

    for n = 1, 36 do
        f = fibonacci(n)
        io.write(n, " ", f, "\n")
    end
The second is a version of my interpreter, minus ASM acceleration, transpiled to C, and for Linux:
https://github.com/sal55/langs/blob/master/qc.c

Compile using, for example:
gcc qc.c -oqc -fno-builtin -lm -ldl
tcc qc.c -oqc -fdollars-in-identifiers -lm -ldl
The input there can be (fib.q):
    func fib(n)=
        if n<3 then
            1
        else
            fib(n-1)+fib(n-2)
        fi
    end

    for i to 36 do
        println i,fib(i)
    od
Run like this:
./qc -nosys fib
On my Windows machine, the gcc-O3-compiled version takes 4.1 seconds, and tcc's takes 9.3 seconds. The gap is narrower than with the Lua version, which uses a C style that depends more on function inlining. (Note that being in one file allows gcc to do whole-program optimisations.)
My cc-compiled version runs in 5.1 seconds, so only 25% slower than gcc-O3's. It also produces a 360KB executable, compared with gcc's 467KB, even with -s. tcc's executable size is about the same as gcc-O3's.
(My cc compiler doesn't yet have the optimising pass that makes code smaller. The original qc project, built from its own source rather than the C transpilation, comes to 266KB with that pass enabled, while gcc's -Os on qc.c manages 280KB.)
But my 266KB version runs faster than gcc's 280KB! And accelerated code runs 5 times as fast. (6 secs vs 1.22 secs.)