On 19/11/2024 01:53, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
On 10/11/2024 06:00, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
I would consider a much more elaborate one, putting the onus on external
tools and still having an unpredictable result, to be the poorer of the two.
>
You want to create a language that is easily compilable, no matter how
complex the input.
Normally, time spent _using_ a compiler should be greater than the time
spent writing it. If a compiler gets enough use, that
justifies some complexity.
That doesn't add up: the more the compiler gets used, the slower it
should get?!
More complicated does not mean slower. Binary search or hash tables
are more complicated than linear search, but for larger data they may
be much faster.
That's not the complexity I had in mind. The 100-200MB sizes of
LLVM-based compilers are not because they use hash-tables over linear
search.
More generally, I want to minimize the time spent by the programmer,
that is, the _sum over all iterations leading to a correct program_ of
compile time and "think time". A compiler that compiles more slowly,
but needs fewer iterations due to better diagnostics, may win.
Also, humans perceive a 0.1s delay almost like no delay at all.
So it does not matter whether a single compilation step takes 0.1s or
0.1ms. Modern computers can do a lot of work in 0.1s.
What's the context of this 0.1 seconds? Do you consider it long or short?
My tools can generally build my apps from scratch in 0.1 seconds; big
compilers tend to take a lot longer. Only Tiny C is in that ballpark.
So I'm failing to see your point here. Maybe you picked up that 0.1
seconds from an earlier post of mine and are suggesting I ought to be
able to do a lot more analysis within that time?
Yes. This may lead to some complexity. A simple approach is to
avoid obviously useless recompilation ('make' does this).
A more complicated approach may keep some intermediate data and
try to "validate" it first. If the previous analysis is still valid,
then it can be reused. If something significant changes, then
it needs to be re-done. But many changes have only a very local
effect, so at least theoretically re-using analyses could
save substantial time.
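To illustrate the "keep and validate intermediate data" idea in C terms
(just a sketch with invented names, not anyone's actual build system):
store a content hash per source file next to its analysis results, and
redo the analysis only for files whose hash has changed:

/* Sketch of reusing cached per-file analysis results.
   All names here are invented for illustration. */
#include <stdint.h>
#include <stdio.h>

/* FNV-1a hash of a file's contents; returns 0 on I/O error. */
uint64_t hash_file(const char *path) {
    FILE *f = fopen(path, "rb");
    if (!f) return 0;
    uint64_t h = 14695981039346656037ull;   /* FNV offset basis */
    int c;
    while ((c = fgetc(f)) != EOF) {
        h ^= (uint64_t)(unsigned char)c;
        h *= 1099511628211ull;               /* FNV prime */
    }
    fclose(f);
    return h;
}

struct cache_entry {
    char     path[256];
    uint64_t hash;       /* hash at the time the analysis was stored */
    void    *analysis;   /* opaque pointer to cached analysis results */
};

/* Reuse the cached analysis if the source is unchanged, otherwise redo it. */
void *get_analysis(struct cache_entry *e, void *(*analyse)(const char *path)) {
    uint64_t now = hash_file(e->path);
    if (e->analysis && now == e->hash)
        return e->analysis;          /* source unchanged: reuse */
    e->hash = now;
    e->analysis = analyse(e->path);  /* changed: re-analyse this file only */
    return e->analysis;
}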
I consider compilation (turning textual source code into a form that can
be run, typically binary native code) to be a completely routine task
that should be as simple and as quick as flicking a light switch.
Anything beyond that, such as a deep analysis of the program, I
consider to be a quite different task. I'm not saying there is no place
for it, but I don't agree that it should be integrated into every compiler
and always invoked.
As it is now, that last statement is the '0' value (any int value will do).
What should my compiler report instead? What analysis should it be
doing? What would that save me from typing?
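For readers without the earlier posts, the example is something along
these lines (a reconstruction with made-up values, not the exact code
from upthread): a function mapping a few known inputs, with a dummy
final return to satisfy the compiler:

/* Reconstruction of the kind of example under discussion;
   the names and values are hypothetical. */
int bitwidth(int category) {
    if (category == 1) return 8;
    if (category == 2) return 16;
    if (category == 3) return 32;
    if (category == 4) return 64;
    return 0;   /* never reached for valid input; any int value would do */
}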
Currently, in the typed language that I use, a literal translation of
the example hits a hole in the checks, that is, the code is accepted.
Concerning the needed analyses: one thing needed is a representation of
the type, either a Pascal range type or an enumeration type (the example
is _very_ unnatural, because in modern programming magic numbers
are avoided and there would be some symbolic representation
adding meaning to the numbers). Second, the compiler must recognize
that this is a "multiway switch" and collect the conditions.
The example came from C. Even if written as a switch, C switches do not
return values (and also are hard to even analyse as to which branch is
which).
In my languages, switches can return values, and a switch written as the
last statement of a function is considered to do so, even if each branch
uses an explicit 'return'. Then, it will consider a missing ELSE a 'hole'.
It will not do any analysis of the range other than what is necessary to
implement the switch (duplicate values, span of values, range-checking when
using jump tables).
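Roughly, that analysis looks something like this (a generic sketch with
invented names, not taken from any particular compiler): find the span,
reject duplicate case values, and decide whether the cases are dense
enough for a jump table.

#include <stdbool.h>
#include <stddef.h>

struct switch_plan {
    long min, max;     /* span of case values */
    bool use_table;    /* dense enough for a jump table? */
};

/* Returns false on an empty or duplicated case list. */
bool plan_switch(const long *cases, size_t n, struct switch_plan *out) {
    if (n == 0) return false;
    long lo = cases[0], hi = cases[0];
    for (size_t i = 1; i < n; i++) {
        for (size_t j = 0; j < i; j++)
            if (cases[i] == cases[j])
                return false;                 /* duplicate case value: error */
        if (cases[i] < lo) lo = cases[i];
        if (cases[i] > hi) hi = cases[i];
    }
    out->min = lo;
    out->max = hi;
    /* Use a table when at least a quarter of the span is populated;
       otherwise fall back to compare-and-branch. The threshold is arbitrary. */
    out->use_table = (hi - lo + 1) <= (long)(n * 4);
    return true;
}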
So the language may require you to supply a dummy 'else x' or 'return
x'; so what?
The alternative appears to be one of:
* Instead of 'else' or 'return', to write 'unreachable', which puts some
trust, not in the programmer, but in some person calling your function
who does not have sight of the source code, to avoid calling it with
invalid arguments (see the C sketch below)
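In C terms that alternative looks roughly like this (using the GCC/Clang
builtin; C23 also provides unreachable() in <stddef.h>). If a caller does
pass an unhandled value, the behaviour is simply undefined rather than
diagnosed:

/* Hypothetical example, reusing the reconstructed function from above. */
int bitwidth(int category) {
    switch (category) {
    case 1: return 8;
    case 2: return 16;
    case 3: return 32;
    case 4: return 64;
    default: __builtin_unreachable();   /* promise: this cannot happen */
    }
}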
Once you have such a representation (which may be desirable for other
reasons), it is easy to determine the set of handled values. More
precisely, in this example we just have a small number of discrete
values. A more ambitious compiler may have a list of ranges.
If the type also specifies a list of values or a list of ranges, then
it is easy to check whether all values of the type are handled.
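A minimal sketch of that check, assuming the type carries an explicit
list of permitted ranges and the switch yields a list of handled ranges
(all names invented for illustration):

#include <stdbool.h>
#include <stddef.h>

struct range { long lo, hi; };   /* inclusive bounds */

/* True if value v lies in one of the handled ranges. */
bool covered(long v, const struct range *handled, size_t nh) {
    for (size_t i = 0; i < nh; i++)
        if (v >= handled[i].lo && v <= handled[i].hi)
            return true;
    return false;
}

/* True if every value of the type is handled; otherwise *hole is set to
   the first unhandled value found. (A real compiler would compare sorted
   range lists instead of scanning value by value.) */
bool exhaustive(const struct range *type_ranges, size_t nt,
                const struct range *handled, size_t nh, long *hole) {
    for (size_t i = 0; i < nt; i++)
        for (long v = type_ranges[i].lo; v <= type_ranges[i].hi; v++)
            if (!covered(v, handled, nh)) { *hole = v; return false; }
    return true;
}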
The types are typically plain integers, with ranges from 2**8 to 2**64.
The ranges associated with application needs will be more arbitrary.
If talking about a language with ranged integer types, then there might
be more point to it, but that is itself a can of worms. (It's hard to do
without getting halfway to implementing Ada.)
You can't do this stuff with the compilers David Brown uses; I'm
guessing you can't do it with your preferred ones either.
To recompile the typed system I use (about 0.4M lines) on a new fast
machine I need about 53s. But that is kind of cheating:
- this time is for a parallel build using 20 logical cores
- the compiler is not written in the language it compiles (but in an
untyped version of it)
- actual compilation of the compiler is a small part of the total
compile time
On a slow machine compile time can be as large as 40 minutes.
40 minutes for 400K lines? That's 160 lines per second; how old is this
machine? Is the compiler written in Python?
An untyped system that I use has about 0.5M lines and recompiles
itself in 16s on the same machine. This one uses a single core.
On a slow machine compile time may be closer to 2 minutes.
So 4K to 30Klps.
Again, compiler compile time is only a part of the build time.
Actually, one time-intensive part is creating the index for the included
documentation.
Which is not going to be part of a routine build.
Another is C compilation for a library file
(the system has image-processing functions and the low-level part of
image processing is done in C). Recompilation starts from a
minimal version of the system; rebuilding this minimal
version takes 3.3s.
My language tools work on a whole program, where a 'program' is a single
EXE or DLL file (or a single OBJ file in some cases).
A 'build' then turns N source files into 1 binary file. This is the task
I am talking about.
A complete application may have several such binaries and a bunch of
other stuff. Maybe some source code is generated by a script. This part
is open-ended.
However each of my current projects is a single, self-contained binary
by design.
Anyway, I do not need the cascaded recompilation that you present.
Both systems above have incremental compilation, the second one
at statement/function level: it offers an interactive prompt
which takes a statement from the user, compiles it and immediately
executes it. Such a statement may define a function or perform compilation.
Even on a _very_ slow machine there is no noticeable delay due to
compilation, unless you feed the system some oversized statement
or function (presumably from a file).
This sounds like a REPL system. There, each line is a new part of the
program which is processed, executed and discarded.
In that regard, it
is not really what I am talking about, which is AOT compilation of a
program represented by a bunch of source files.
Or can a new line redefine something, perhaps a function definition,
previously entered amongst the last 100,000 lines? Can a new line
require compilation of something typed 50,000 lines ago?
What happens if you change the type of a global; are you saying that
none of the program's code needs revising?
An untyped system
What do you mean by an untyped system? To me it usually means
dynamically typed.