Re: else ladders practice

Subject : Re: else ladders practice
From : antispam (at) *nospam* fricas.org (Waldek Hebisch)
Newsgroups : comp.lang.c
Date : 07. Dec 2024, 00:30:40
Organisation : To protect and to server
Message-ID : <vj01eu$3u86n$1@paganini.bofh.team>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13 14
User-Agent : tin/2.6.2-20221225 ("Pittyvaich") (Linux/6.1.0-9-amd64 (x86_64))
Bart <bc@freeuk.com> wrote:
On 01/12/2024 13:04, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
On 28/11/2024 12:37, Michael S wrote:
On Wed, 27 Nov 2024 21:18:09 -0800
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
>
>
    c:\cx>tm gcc sql.c            #250Kloc file
    TM:  7.38
>
Your example illustrates my point.  Even 250 thousand lines of
source takes only a few seconds to compile.  Only people nutty
enough to have single source files over 25,000 lines or so --
over 400 pages at 60 lines/page! -- are so obsessed about
compilation speed.
>
My impression was that Bart is talking about machine-generated code.
For machine generated code 250Kloc is not too much.
>
This file mostly comprises sqlite3.c which is a machine-generated
amalgamation of some 100 actual C files.
>
You wouldn't normally do development with that version, but in my
scenario, where I was trying to find out why the version built with my
compiler was buggy, I might try adding debug info to it then building
with a working compiler (eg. gcc) to compare with.
 
Even in the context of developing a compiler I would not blindly
run many compilations of a large file.
Difficult bugs always occur in larger codebases, but with C, these are
in a language that I can't navigate, and for programs which are not mine,
and which tend to be badly written, bristling with typedefs and macros.
 
It could take a week to track down where the error might be ...

It could be.  You could declare that the program is hopeless or do
what is needed.  Which frequently means effectively using the available
debugging features.  For example, I got a strange crash.  Looking at
data in the debugger suggested that the data was malformed.  So I used
data breakpoints to figure out which instruction initialized the data.
That needed several runs of the program, in each run looking at what
happened to the suspected memory location.  At the end I localized the
problem and the rest was easy.
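
To illustrate the data breakpoint technique, here is a minimal sketch
in C.  All names (pool, fill_samples, TAG_MAGIC) are invented for this
example and have nothing to do with the program discussed above.  The
bug is a deliberate off-by-one store that stays inside one array, so
the behaviour is well defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: slots 0..3 hold samples, slot 4 holds a tag. */
enum { SAMPLE_COUNT = 4, TAG_SLOT = 4, POOL_SIZE = 8, TAG_MAGIC = 0x5A5A };

static int pool[POOL_SIZE];

/* Clear the pool and store the tag that later code relies on. */
void init_pool(void)
{
    assert(TAG_SLOT < POOL_SIZE);
    for (size_t i = 0; i < POOL_SIZE; i++)
        pool[i] = 0;
    pool[TAG_SLOT] = TAG_MAGIC;
}

/* Deliberate bug: "<=" writes one slot too many and clobbers the tag. */
void fill_samples(int value)
{
    for (size_t i = 0; i <= SAMPLE_COUNT; i++)
        pool[i] = value;
}

/* Returns nonzero while the tag is intact. */
int tag_is_valid(void)
{
    return pool[TAG_SLOT] == TAG_MAGIC;
}
```

If this is built with debug info and run under gdb, stopping after
init_pool() and giving the commands "watch pool[4]" and "continue"
should stop execution at the store in fill_samples that overwrites the
tag.  Other debuggers offer the same feature under different names.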

Some problems are easy, for example a significant percentage of
segfaults: you have something which is not a valid address
and frequently you immediately see why the address is wrong and
how to fix this.  Still, finding this usually takes longer
than compilation.

At the first stage I would debug the
compiled program, to find out what is wrong with it.
 
... within the C program. Except there's nothing wrong with the C
program! It works fine with a working compiler.
 
The problem will be in the generated code, so in an entirely different
program.

Of course the problem is in the generated code.  But debug info (I had
at least _some_ debug info, apparently you do not have it) shows you
which part of the source is responsible for a given piece of machine
code.  And you can see the data, so you can see what is happening in
the generated program.  And you have the C source, so you can see what
should happen.  Once you know the place where "what is happening"
differs from "what should happen" you can normally produce quite a
small reproducing example.

So normal debugging tools are useful when several sets of
source code are involved, in different languages, or the error occurs
in the second generation version of either the self-hosted tool, or the
program under test if it is to do with languages.
 
(For example, I got tcc.c working at one point. My generated tcc.exe
could compile tcc.c, but that second-generation tcc.c didn't work.)

Clearly, you work in stages: first you find out what is wrong with
the second-generation tcc.exe.  Then you find the piece of tcc.c that
was miscompiled by the first-generation tcc.exe (producing the wrong
second-generation compiler).  Then you find the piece of tcc.c which
was responsible for this miscompilation.  And finally you look at why
your compiler miscompiled this piece of tcc.c.

Tedious, yes.  It is easier if you have a good testsuite, that is,
a collection of small programs that exercise various constructs
and potentially problematic combinations.
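
A testsuite entry of the kind meant here can be as small as the
following sketch.  The constructs chosen (a struct passed by value, a
variadic call, division of negative numbers) are only examples, and
all function names are made up for illustration.

```c
#include <assert.h>
#include <stdarg.h>

struct pair { int a; long b; };

/* Struct passed and returned by value, with the fields swapped. */
static struct pair swap_pair(struct pair p)
{
    struct pair r = { (int)p.b, (long)p.a };
    return r;
}

/* Variadic function: sums `count` int arguments. */
static int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Runs every check; assert() aborts on the first failure.
   Returns the number of checks that passed. */
int run_small_tests(void)
{
    int passed = 0;

    struct pair r = swap_pair((struct pair){ 3, 40L });
    assert(r.a == 40 && r.b == 3);
    passed++;

    assert(sum_ints(4, 1, 2, 3, 4) == 10);
    passed++;

    /* C99 and later: integer division truncates toward zero.
       volatile keeps the compiler from folding the expression. */
    volatile int n = -7, d = 2;
    assert(n / d == -3 && n % d == -1);
    passed++;

    return passed;
}
```

Each such file compiles in a fraction of a second, so what matters
for the whole suite is mostly the number of files times the compiler's
per-file cost.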

Anyway, most of the work involves executing programs in a debugger
and observing critical things.  Re-creating executables is rare
in comparison.  The main point where compiler speed matters is the
time to run the compiler testsuite.

After that I would try to minimize the testcase, removing code which
does not contribute to the bug.
 
Again, there is nothing wrong with the C program, but in the code
generated for it. The bug can be very subtle, but it usually turns out
to be something silly.
 
Removing code from 10s of 1000s of lines (or 250Kloc for sql) is not
practical. But yet, the aim is to isolate some code which can be used to
recreate the issue in a smaller program.

If you have a "good" version (say one produced by 'gcc' or by an
earlier working version of your compiler), then you can isolate the
problem by linking parts produced by different compilers.  Even if you
have one huge file, typically you can split it into parts (if it is one
huge function, normally it is possible to split it into smaller
ones).  Yes, it is work, but getting a quality product needs work.
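
The splitting approach can be organised as a bisection.  The sketch
below assumes exactly one miscompiled translation unit and a
hypothetical callback build_ok(k) that links the first k units from
the suspect compiler with the rest from the reference compiler and
reports whether the result works.  Here the callback is only
simulated; a real one would drive the two compilers and the linker.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: nonzero if a build with the first
   `suspect_count` units from the suspect compiler works. */
typedef int (*build_ok_fn)(size_t suspect_count, void *ctx);

/* Returns the index of the single miscompiled unit.
   Invariant: build_ok(lo) is true, build_ok(hi) is false. */
size_t find_miscompiled_unit(size_t unit_count, build_ok_fn build_ok,
                             void *ctx)
{
    assert(unit_count >= 1);
    size_t lo = 0;           /* all units from the reference compiler */
    size_t hi = unit_count;  /* all units from the suspect compiler */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (build_ok(mid, ctx))
            lo = mid;
        else
            hi = mid;
    }
    return lo;
}

/* Stand-in for a real build: ctx points to the index of the bad unit.
   The build works as long as that unit is not taken from the suspect
   compiler, that is, while suspect_count <= bad_unit. */
int simulated_build_ok(size_t suspect_count, void *ctx)
{
    size_t bad_unit = *(const size_t *)ctx;
    return suspect_count <= bad_unit;
}
```

With this scheme the number of link-and-run steps grows with the
logarithm of the number of units, and no unit has to be recompiled
more than once per compiler.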

Debugging can involve comparing two versions, one working, the other
not, looking for differences. And here there may be tracking statements
added.
 
If the only working version is via gcc, then that's bad news because it
makes the process even more of a PITA.

Well, IME tracking statements frequently produce too much or too little
data.  When dealing with C code I tend to depend more on the debugger,
setting breakpoints in crucial places and examining data there.  Extra
printing functions can help, for example gcc has printing functions
for its main data structures.  Such functions can be called from the
debugger and give nicer output than generic debugger functions.
But even if you need extra printing functions you can put them
in a separate file, compile once and use multiple times.
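
A minimal sketch of such a printing function follows.  The expression
node type and the names format_expr and debug_expr are hypothetical;
they are not taken from gcc or from any compiler mentioned here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical expression node of a small compiler. */
struct expr {
    char op;                      /* 0 for a leaf, else '+', '*', ... */
    int value;                    /* used only when op == 0 */
    const struct expr *lhs, *rhs; /* used only when op != 0 */
};

/* Append a textual form of e to the NUL-terminated string in buf.
   Output is silently cut off once the buffer is full. */
static void format_expr(const struct expr *e, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    size_t used = strlen(buf);
    if (used + 1 >= cap)
        return;
    if (e == NULL) {
        snprintf(buf + used, cap - used, "<null>");
        return;
    }
    if (e->op == 0) {
        snprintf(buf + used, cap - used, "%d", e->value);
        return;
    }
    snprintf(buf + used, cap - used, "(");
    format_expr(e->lhs, buf, cap);
    used = strlen(buf);
    if (used + 1 < cap)
        snprintf(buf + used, cap - used, " %c ", e->op);
    format_expr(e->rhs, buf, cap);
    used = strlen(buf);
    if (used + 1 < cap)
        snprintf(buf + used, cap - used, ")");
}

/* Entry point meant to be called by hand from the debugger. */
void debug_expr(const struct expr *e)
{
    char buf[256] = "";
    format_expr(e, buf, sizeof buf);
    fprintf(stderr, "%s\n", buf);
}
```

In gdb this would be used as "call debug_expr(e)" at a breakpoint
where e is in scope.  Since the file does not depend on the code being
debugged, it only needs compiling once.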

I added an interpreter mode to my IL, because I assume that would give a
solid, reliable reference implementation to compare against.
 
If turned out to be even more buggy than the generated native code!
 
(One problem was to do with my stdarg.h header which implements VARARGS
used in function definitions. It assumes the stack grows downwards.

This is true on most machines, but not all.

In
my interpreter, it grows downwards!)

You probably meant upwards?  And handling such things is natural
when you have portability in mind: either you parametrise stdarg.h
so that it works for both stack directions, or you make sure that the
interpreter and compiler use the same direction (the latter seems to
be much easier).  Actually, I think that the most natural way is to
have the data structure layout in the interpreter as close as
possible to the compiler data layout.  Of course, there are some
unavoidable differences: the interpreter needs registers for its
operation, so some variables that could be in registers in compiled
code will end up in the stack frame.
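
What parametrising by stack direction could look like is sketched
below.  This is not the stdarg.h discussed above: the macro
STACK_GROWS_DOWN, the my_va_* names and the model of one 64-bit slot
per argument, pushed right to left, are all assumptions made for the
example, and the argument area is simulated with an ordinary array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical configuration switch: 1 if the stack grows towards
   lower addresses, 0 if it grows towards higher addresses. */
#ifndef STACK_GROWS_DOWN
#define STACK_GROWS_DOWN 1
#endif

#if STACK_GROWS_DOWN
#define MY_VA_STEP 1      /* later arguments sit at higher addresses */
#else
#define MY_VA_STEP (-1)   /* later arguments sit at lower addresses */
#endif

typedef struct {
    const int64_t *slots; /* simulated argument area */
    ptrdiff_t index;      /* slot holding the next argument */
    ptrdiff_t step;       /* +1 or -1 */
    size_t remaining;     /* arguments not yet fetched */
} my_va_list;

/* Start walking `count` arguments, the first one in slots[first]. */
void my_va_start_step(my_va_list *ap, const int64_t *slots,
                      ptrdiff_t first, size_t count, ptrdiff_t step)
{
    ap->slots = slots;
    ap->index = first;
    ap->step = step;
    ap->remaining = count;
}

/* Same, with the direction taken from the configuration macro. */
void my_va_start(my_va_list *ap, const int64_t *slots,
                 ptrdiff_t first, size_t count)
{
    my_va_start_step(ap, slots, first, count, MY_VA_STEP);
}

/* Fetch the next argument and advance in the configured direction. */
int64_t my_va_next(my_va_list *ap)
{
    assert(ap->remaining > 0);
    int64_t v = ap->slots[ap->index];
    ap->index += ap->step;
    ap->remaining--;
    return v;
}
```

Only the step differs between the two directions, which is why
making the interpreter use the same direction as the compiled code
avoids the issue altogether.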

That involves several compilations
of files with quickly decreasing sizes.
 
Tim isn't asking the right questions (or any questions!). WHY does gcc
take so long to generate indifferent code when the task can clearly be
done at least a magnitude faster?
 
The simple answer is: users tolerate long compile times.  If users
abandoned 'gcc' for some other compiler due to long compile times,
then 'gcc' developers would notice.
 
People use gcc. They come to depend on its features, or they might use
(perhaps unknowingly) some extensions. On Windows, gcc includes some
headers and libraries that belong to Linux, but other compilers don't
provide them.
 
The result is that if they were to switch to a smaller, faster compiler,
their program may not work.
 
They'd have to use it from the start. But then they may want to use
libraries which only work with gcc ...
 
Well, you see that there are reasons to use 'gcc'.  Long ago I
produced an image processing DLL for Windows.  The first version was
developed on Linux using 'gcc' and then compiled on Windows
using Borland C.  It turned out that in Borland C 'setjmp/longjmp'
did not work, so I had to work around this.  Not nice, but
manageable.  At that time the C standard did not include a function
to round floats to integers and that proved to be problematic.
The C default, that is truncation, produced artifacts that were not
acceptable.  So I used an emulation of rounding based on 'floor',
that worked OK, but turned out to be slow (something like 70%
of runtime went into rounding).  So I replaced this by assembler
code.  With Borland C I had to call a separate assembler routine,
which had some overhead.

The next version was cross-compiled on Linux using gcc.  This version
used inline assembly for rounding and was significantly faster
than what Borland C produced.  Note: the images to process were
largish (think of say 12000 by 20000 pixels) and speed was an
important factor.  So using 'gcc' specific code was IMO justified
(this code was used conditionally, other compilers would get the
slow portable version using 'floor').
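
The difference between truncation and floor-based rounding can be
seen in the sketch below.  It is not the code from that DLL: the
function names are invented, the floor step is written as
truncate-and-adjust so that the example does not depend on the math
library (the original used the library 'floor'), and the gcc-specific
inline assembly is not reproduced.

```c
#include <assert.h>

/* C default conversion: truncates toward zero. */
long to_long_trunc(double x)
{
    return (long)x;
}

/* Floor for values that fit in a long: truncate, then step down
   by one if truncation moved the value up (negative non-integers). */
static long floor_to_long(double x)
{
    long t = (long)x;
    return ((double)t > x) ? t - 1 : t;
}

/* Round to nearest as floor(x + 0.5); ties go toward +infinity.
   The range check is an arbitrary safety margin for this sketch. */
long to_long_round(double x)
{
    assert(x > -1.0e9 && x < 1.0e9);
    return floor_to_long(x + 0.5);
}
```

Truncation maps the whole interval from -1 to 1 (exclusive) to 0,
twice the width that maps to any other integer, which is one way a
visible bias can arise.  In C99 and later, lround or lrint from
<math.h> would be the usual choice.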
 
You need to improve your propaganda for faster C compilers...
 
I actually don't know why I care. I get the benefit of my fast tools
every day; they're a joy to use. So I'm not bothered that other people
are that tolerant of slow, cumbersome build systems.
 
But then, people in this group do like to belittle small, fast products
(tcc for example as well as my stuff), and that's where it gets annoying.

I tried tcc compiling TeX.  Long ago it did not work due to limitations
of tcc.  This time it worked.  Small comparison on main file (19062
lines):

Command           time              size code    size data
tcc -g            0.017              290521        1188
tcc               0.015              290521        1188
gcc -O0 -g        0.440              248467          14
gcc -O0           0.413              248467          14
gcc -O -g         1.385              167565           0
gcc -O            1.151              167565           0
gcc -Os -g        1.998              142336           0
gcc -Os           1.724              142336           0
gcc -O2 -g        2.683              207913           0
gcc -O2           2.257              207913           0
gcc -O3 -g        3.510              255909           0
gcc -O3           2.934              255909           0
clang -O0 -g      0.302              232755          14
clang -O0         0.189              232755          14
clang -O -g       1.996              223410           0
clang -O          1.683              223410           0
clang -Os -g      1.693              154421           0
clang -Os         1.451              154421           0
clang -O2 -g      2.774              259569           0
clang -O2         2.359              259569           0
clang -O3 -g      2.970              280235           0
clang -O3         2.537              280235           0

I have duly provided both the time when using '-g' and without.
Both are supposed to produce the same code (so the code
and data sizes are also the same), but you can see that '-g'
measurably increases compile time.  AFAIK compiler data
structures contain slots for debug info even if '-g' is
not given and the compiler generates no debug info.  So the
actual cost of supporting '-g' is higher than the difference:
you pay part of this cost even if you do not use the
capability.

ATM I do not have data handy to compare runtimes (TeX needs
extra data to do useful work), so I provide code and data
size as a proxy.  As you can see, even at -O0 gcc and clang
manage to put almost all data into instructions (actually
in tex.c _all_ initialized data is constant), while tcc
keeps it as data which requires extra instructions to
access.  gcc at -O and -Os and clang at -Os produce code
which is about half the size of the tcc result.  Some part
of it may be due to using smaller instructions, but most
is likely because the gcc and clang results simply have far
fewer instructions.  At higher optimization levels code
size grows, this is probably due to inlining and code
duplication.  This usually gives some small speedup at the
cost of bigger code, but one would have to measure
(sometimes attempts at optimization backfire and lead
to slower code).

Anyway, 19062 lines is much larger than the typical file that
I work with and even for such a size the compile time is reasonable.
Maybe less typical is the modest use of include files: tex.c
uses a few standard C headers and 1613 lines of project-specific
headers.  Still, there are macros and the macro-expanded result
is significantly bigger than the source.

In the past TeX execution time correlated reasonably well with
Dhrystone.  On Dhrystone tcc compiled code is about 4 times
slower than gcc/clang, so one can expect tcc compiled TeX to
be significantly slower than one compiled by gcc or clang.

--
                              Waldek Hebisch
