Bart <bc@freeuk.com> wrote:
On 24/11/2024 00:24, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
I'm not saying no optimisation is needed, ever, I'm saying that the NEED
for optimisation is far smaller than most people seem to think.
There is also question of disc space. 'tcc' compiled by itself is
404733 bytes (code + data) (0.024s compile time), by gcc (default) is
340950 (0.601s compile time), by gcc -O is 271229 (1.662s compile
time), by gcc -Os is 228855 (2.470s compile time), by gcc -O2
is 323392 (3.364s compile time), gcc -O3 is 407952 (4.627s compile
time). As you can see gcc -Os can save quite a bit of disc space
for still moderate compile time.
I thought David Brown said that disk space is irrelevant?
I am not David Brown.
Anyway this is
the exact copy of what I tried just now, compiling a 5-line hello.c
program. I hadn't used these compilers since earlier today:
c:\c>tm gcc hello.c
TM: 5.80
c:\c>tm tcc hello.c
TM: 0.19
c:\c>tm gcc hello.c
TM: 0.19
c:\c>tm tcc hello.c
TM: 0.03
From cold, gcc took nearly 6 seconds (if you've been used to instant
feedback all day, it can feel like an age). tcc took 0.2 seconds.
Doing it a second time, now gcc takes 0.2 seconds, and tcc takes 0.03
seconds! (It can't get much faster on Windows.)
gcc is just a lumbering giant, a 870MB installation, while tcc is 2.5MB.
Yes, but exact size depends which version you install and how you
install it. I installed version 6.5 and removed debugging info from
executables. The result is 177MB, large but significantly smaller
than what you have. Debian package for gcc-12.2 is something like
144MB (+ about 8MB libraries which are usable for other purpose but
mainly for gcc), but it only gives C compiler. To that one should
add 'libc6-dev' (about 12MB) which is needed to create useful
programs. C++ adds 36MB, Fortran 35MB, Ada 94MB so my installation
is something like 330MB. Note: my 177MB reuses probably about 50MB
from system installation and includes C and C++. Also, in both cases
I do not count libc which is about 13MB (but needed by almost
anything in the system), shell, kernel, etc.
On Windows some space-saving tricks do not work and traditionally
programs ship their own libraries, so size may be bigger.
For me it is problematic that each gcc language and each extra
target adds a lot of space. I have extra targets (not counted in
size above) and together this is closer to 1G. In this aspect
LLVM is somewhat better, it gives me more targets than I have
installed for gcc for total "cost" of something like 210MB (plus
about 50MB shared with gcc).
As for sizes:
c:\c>dir hello.exe
24/11/2024 00:44 2,048 hello.exe
c:\c>dir a.exe
24/11/2024 00:44 91,635 a.exe (48K with -s)
(At least that's one good thing of gcc writing out that weird a.exe each
time; I can compare both exes!)
AFAICS this is one-time Windows overhead + default layout rules for
the linker. On Linux I get 15952 bytes by default, 14472 after
stripping. However, the actual code + data size is 1904 and even
of this most is crap needed to support extra features of the C library.
In other words, this is mostly irrelevant, as people who want to
get the size down can link with different options. Actual hello
world code size is 99 bytes when compiled by gcc (default options)
and 64 bytes by tcc. Again, gcc adds things like exception handling
which increase size for tiny files, but do not add much in a bigger
file.
I did
hebisch@komp:~/kompi$ gcc -c hell2.c
hebisch@komp:~/kompi$ tcc -o hell2.gcc hell2.o
hebisch@komp:~/kompi$ tcc -c hell2.c
hebisch@komp:~/kompi$ tcc -o hell2.tcc hell2.o
hebisch@komp:~/kompi$ ls -l hell2.gcc hell2.tcc
-rwxr-xr-x 1 hebisch hebisch 3680 Nov 24 04:21 hell2.gcc
-rwxr-xr-x 1 hebisch hebisch 3560 Nov 24 04:21 hell2.tcc
As you can see, when using tcc as a linker there is small size
difference due to extra exception handling code put there by gcc.
This size difference will vanish in the noise when there is
bigger real code. And when you are really determined, linker
tricks can completely remove the exception handling code (AFAICS
it is not needed for simple programs).
As for mine (however it's possible I used it more recently):
c:\c>tm cc hello
Compiling hello.c to hello.exe
TM: 0.04
c:\c>dir hello.exe
24/11/2024 00:52 2,560 hello.exe
My installation is 0.3MB (excluding windows.h which is 0.6MB). Being
self-contained, I can trivially apply UPX compression to get a 0.1MB
compiler, which can be easily copied to a memory stick or bundled in one
of my apps. However compiling hello.c now takes 0.05 seconds.
(I don't use UPX because my apps are already tiny; it's just to marvel
at how much redundancy they still contain, and how much tinier they
could be.)
I know none of this will cut any ice; for various reasons you don't want
to use tcc.
Well, I tried to use tcc when it first appeared. Unfortunately it
could not compile some valid C code that I passed to it. I filed
a bug report, but it was not fixed for several years. Shortly after
that I got AMD-64 machine and configured it as only 64-bit (one
reason to do this was to avoid bloat due to having both 64-bit
and 32-bit libraries). At that time and in following several
years tcc did not support 64-bit code, so was not usable for me.
Later IIRC it got 64-bit support, but I needed also ARM (and
on ARM faster compiler would make more difference).
There is question of trust: when what I reported remained unfixed
I lost faith in quality of tcc. I still need to check if it is
fixed now, but at least now tcc seems to have some development.
One of them being that your build process involves N slow stages so
speeding up just one makes little difference.
Yes.
This however is very similar to my argument about optimisation; a
running app consists of lots of parts which take up execution time, not
all of which can be speeded up by a factor of 9. The net benefit will be
a lot less, just like your reduced build time.
If I do not have good reasons to write program in C, then likely I
will write it in some higher-level language. One good reason
to use C is to code performance-critical routines.
And of course, there is a question why program with runtime that
does not matter is written in a low level language?
I mean it doesn't matter if it's half the speed. It might matter if it
was 40 times slower.
If you code bottlenecks in C, then 40 times slower may be OK for the
rest. And there are compiled higher-level languages, you pay for
higher-level features, but overhead is much lower, closer to your
half speed (and that is mostly due to simpler code generator).
There's quite a gulf between even unoptimised native code and even a
fast dynamic language interpreter.
People seem to think that the only choices are the fastest possible C
code at one end, and slow CPython at the other:
gcc/O3-tcc-----------------------------------------------------CPython
On this scale, gcc/O3 code and tcc code are practically the same!
There is Ocaml, it offers interpreter (faster than Python) and a
compiler (which probably gives faster code than your 'mcc -opt').
There are Lisp compilers. There is Java and C# (I am avoiding
them as they depend on sizeable runtime and due to proprietary
games played by the vendors).
IME big productivity boost comes from garbage collection. But
nobody knows how to make cooperating garbage collectors. So
each garbage collected runtime forms its own island which has
trouble reusing code from other garbage collected environments.
ATM Python is biggest kind-of garbage collected environment so
people are attracted to it to reuse existing code.
-- Waldek Hebisch