Subject: Re: Calling conventions (particularly 32-bit ARM)
From: anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Date: 08. Jan 2025, 23:08:46
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2025Jan8.230846@mips.complang.tuwien.ac.at>
References: 1 2 3
User-Agent: xrn 10.11
Stefan Monnier <monnier@iro.umontreal.ca> writes:
> For languages where the type system ensures that the max number of
> arguments is known (and the same) when compiling the function and when
> compiling the calls to it, you could adjust the number of caller-saved
> argument registers according to the actual number of arguments of the
> function, thus making it "cheap" to allow, say, 13 argument registers
> for those functions that take 13 arguments, since it doesn't impact the
> other functions.
ABI calling conventions tend to be designed to support at least C,
including varargs, and are often also tolerant of differences between
the number of arguments in the caller and callee.

Language-private calling conventions can be a good idea, but then, if
you want to call C code (or be called by C code), you need to handle
ABI calling conventions in addition.
> But in any case, I suspect there are also diminishing returns at some
> point: how much faster is it in practice to pass/return 13 values in
> registers instead of 8 of them in registers and the remaining 5 on
> the stack? I expect a 13-arg function to perform an amount
> of work that will dwarf the extra work of going through the stack.
I certainly have a use for as many arguments as the ABI provides, for
functions that typically contain only a few payload instructions: You
can implement a direct-threaded VM interpreter using tail-call
optimization, along the lines of

void add(VMinst *ip, long *sp, long sp_top)
{
  /* payload start */
  sp_top += *sp++;
  /* payload end */
  /* invoke the next VM instruction */
  (*ip)(ip+1,sp,sp_top);
}

30 years ago gcc could not tail-call-optimize this; by now it
can (and so can clang). However, typical VMs have more than
just these three VM registers (Gforth has ip, sp, rp, fp, lp, up,
fp_top (usually mapped to a real-machine FP register) and registers
for as many sp stack items as practical; we intend to cache rp_top in
a register, too), and ideally you can pass them all as arguments; so
we could make good use of 10+ arguments. If there are not enough
arguments in registers, you have to use explicit register vars (a GNU
C extension) in addition, but that is more architecture-specific.
Some preliminary testing on AMD64 suggests that gcc supports a lot of
explicit registers there, while clang/LLVM supports only one.
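
For readers who want to try the tail-call approach, here is a minimal,
self-contained sketch of the add() example above. Several things in it
are assumptions made for illustration and not taken from Gforth: the
Cell union (used because a C typedef cannot refer to itself), the lit
and halt instructions, the NEXT macro, the names vm_run and run_demo,
and the long return value (the add() above returns void). Further VM
registers would simply become further parameters of every handler.

```c
#include <assert.h>

/* One cell of threaded code: either a handler address or an inline literal.
   The union is an assumption of this sketch; it works around C's lack of
   self-referential function-pointer typedefs. */
typedef union Cell Cell;
typedef long (*Handler)(const Cell *ip, long *sp, long sp_top);
union Cell {
    Handler fn;
    long lit;
};

/* Dispatch: call the handler stored at ip and pass it the address of the
   following cell. This is the last action of a handler, so the call is in
   tail position. */
#define NEXT return ip->fn(ip + 1, sp, sp_top)

/* lit (hypothetical): push the inline literal that follows the opcode */
static long op_lit(const Cell *ip, long *sp, long sp_top)
{
    *--sp = sp_top;     /* spill the old top of stack to memory */
    sp_top = ip->lit;   /* on entry, ip points at the literal cell */
    ip++;               /* skip the literal */
    NEXT;
}

/* add: same payload as in the example above */
static long op_add(const Cell *ip, long *sp, long sp_top)
{
    sp_top += *sp++;
    NEXT;
}

/* halt (hypothetical): end the chain of tail calls, return the top of stack */
static long op_halt(const Cell *ip, long *sp, long sp_top)
{
    (void)ip;
    (void)sp;
    return sp_top;
}

/* Start execution at code with an empty stack growing down from stack_end. */
long vm_run(const Cell *code, long *stack_end)
{
    assert(code != 0 && stack_end != 0);
    return code->fn(code + 1, stack_end, 0);
}

/* Demo program, in Forth notation: 2 3 + 1 + */
long run_demo(void)
{
    static const Cell program[] = {
        { .fn = op_lit }, { .lit = 2 },
        { .fn = op_lit }, { .lit = 3 },
        { .fn = op_add },
        { .fn = op_lit }, { .lit = 1 },
        { .fn = op_add },
        { .fn = op_halt },
    };
    long stack[16];
    return vm_run(program, stack + 16);
}
```

Calling run_demo() should return 6. Without optimization the compiler
is free to emit ordinary calls, so the C stack then grows by one frame
per executed VM instruction; that is harmless for this short program,
but a real interpreter depends on the tail calls actually being
optimized (clang's musttail statement attribute is one way to insist on
that; it is not used here).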
- anton
-- 
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>