Newsportal USENET - Re: Why I've Dropped In

On 6/12/2025 2:05 AM, Anton Ertl wrote:

BGB <cr88192@gmail.com> writes:
On 6/11/2025 11:51 AM, Anton Ertl wrote:
Link register: On some architectures there is a register that is a GPR
as far as most instructions are concerned. But the call instruction
with immediate (relative) target uses that register as implicit target
for the return address. MIPS is an example of that. Power has LR as
a special-purpose register.
>
>
It is GPR like, but in terms of role, I don't consider it as such.
>
In RV64, in theory, JAL and JALR could use any register. But, the C ABI
effectively limits this choice to X1.
So what? The architecture does not. Also, given that static linking
and maybe even whole-program optimization are on the rise, the forces
that coerce you to use the ABI are getting smaller.

Possible, though on some implementations (such as mine), using an arbitrary register would have a performance penalty, as there are only a few registers that the branch predictor can actually see.

Implicitly, the 'C' extension and some other (less standardized)
extensions also tend to hard-code the assumption of X1 being the link
register.
Yes, the C extension is designed for minimizing the code size of
common code, and assumes that the code follows the ABI. But nothing
in the architecture forces you to use compressed instructions.
Whenever it is more advantageous to use an instruction in a way that
cannot be compressed, you just do it. E.g., if you do whole-program
optimization, and you have functions
A (called from 10 sites)
B calls A (called from 2 sites, address not taken for indirect calling)
C calls B
Then one can use JAL or JALR with the target X1 to call A, and these
instructions may be compressible. And one can use JAL or JALR with a
different target to call B. The benefit of that is that B does not
need to save and restore the address it returns to, eliminating the
code needed for that, and the time needed to perform this saving and
restoring.
However, the architecture specification says that x1 and x5 are
considered to be link registers for branch prediction purposes, so
ideally one will use x5 as target for calls to B, and for further
levels the question is if it is good enough to just use plain
indirect-branch prediction for those calls, or if one invests into the
saving and restoring in order to use the return-address stack for
branch prediction.
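
The x1/x5 convention mentioned above can be written down as a small classifier. The following C sketch encodes the return-address-stack hint rules as described for JAL/JALR in the unprivileged RISC-V specification; the names `ras_action`, `jalr_ras_hint` and `jal_ras_hint` are made up for illustration, and an implementation is free to ignore the hints or (as in the core described below) to watch a different set of registers.

```c
#include <assert.h>
#include <stdbool.h>

/* Return-address-stack action hinted by a jump-and-link encoding. */
typedef enum { RAS_NONE, RAS_POP, RAS_PUSH, RAS_POP_THEN_PUSH } ras_action;

/* Only x1 and x5 count as link registers for prediction purposes. */
static bool is_link_reg(unsigned r) { return r == 1 || r == 5; }

/* Hint for "JALR rd, rs1" (register numbers 0..31). */
ras_action jalr_ras_hint(unsigned rd, unsigned rs1)
{
    assert(rd < 32 && rs1 < 32);
    bool rd_link = is_link_reg(rd);
    bool rs1_link = is_link_reg(rs1);

    if (!rd_link)
        return rs1_link ? RAS_POP : RAS_NONE;   /* return, or plain jump */
    if (!rs1_link || rd == rs1)
        return RAS_PUSH;                        /* ordinary call */
    return RAS_POP_THEN_PUSH;                   /* coroutine-style swap */
}

/* JAL has no rs1: it pushes only when rd is a link register. */
ras_action jal_ras_hint(unsigned rd)
{
    assert(rd < 32);
    return is_link_reg(rd) ? RAS_PUSH : RAS_NONE;
}
```

Under these rules a call to B that links through, say, x6 gets no push, so the matching return falls back to ordinary indirect-branch prediction, which is the tradeoff described above.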

Mine doesn't use X5 here, and X5 was mostly being used as a "stomp register" (my reference further down was a typo, had meant X4 for TP).
Mostly a case of, for the RV target, needing a register to be readily available for whenever a value doesn't fit into an immediate and so it needs to be loaded into a register (and/or an address generated because RV doesn't natively support the addressing mode).

Well, it is more a case here of, "try to put something other than the
stack pointer in SP and see how far you get with that".
>
There are multiple levels of systems (ISA design, OS, ...)
Neither the ISA nor the system call interfaces I have looked at would
cause any problems if I use x2 (sp) on RISC-V for something else.
Maybe in case of a handled signal the OS would write to a place
pointed to by x2, but that requires 1) installing a signal handler and
2) not using sigaltstack() to tell the OS where to write in such a
case. Of course the signal handler (if any) will see x2 set to point
to the alternative stack, but all the regular user code can use x2 for
whatever purpose seems appropriate. Some programming languages are
designed to work without stack (e.g., early Fortran), some to use
multiple stacks (e.g., Forth).
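
As an illustration of the sigaltstack() route mentioned above, here is a minimal POSIX C sketch that installs a handler with SA_ONSTACK and checks that the handler's frame really lies on the alternate stack. The helper names (`install_handler_on_alt_stack`, `handler_ran_on_alt_stack`) and the 64 KiB stack size are assumptions made for the example; it says nothing about what a particular kernel does when no alternate stack is registered.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static char *alt_base;                    /* start of the alternate stack */
static size_t alt_size;                   /* its size in bytes */
static volatile sig_atomic_t ran_on_alt;  /* set by the handler */

/* Handler: record whether its own frame lies inside the alternate stack. */
static void on_signal(int signo)
{
    (void)signo;
    char probe;
    uintptr_t p = (uintptr_t)&probe;
    uintptr_t lo = (uintptr_t)alt_base;
    ran_on_alt = (p >= lo && p < lo + alt_size);
}

/* Register an alternate stack and a SIGUSR1 handler that uses it.
 * Returns 0 on success, -1 on failure. */
int install_handler_on_alt_stack(void)
{
    alt_size = (size_t)1 << 16;           /* 64 KiB, assumed large enough */
    alt_base = malloc(alt_size);
    if (alt_base == NULL)
        return -1;

    stack_t ss;
    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_base;
    ss.ss_size = alt_size;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sa.sa_flags = SA_ONSTACK;             /* deliver on the alternate stack */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Raise SIGUSR1 and report whether the handler ran on the alternate stack. */
int handler_ran_on_alt_stack(void)
{
    assert(alt_base != NULL);             /* install must have succeeded */
    ran_on_alt = 0;
    raise(SIGUSR1);
    return ran_on_alt;
}
```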

Dunno about all RISC-V implementations...
Seems the basic case of the CLINT mechanism is fairly minimal:
   Copy PC to MEPC;
   Poke around in MSTATUS;
   Branch to MTVEC.
In this case, seemingly CLINT will use whatever stack is present in SP unless the entry point takes deliberate action to swap stacks before saving off the various registers (but seemingly lacks a dedicated CSR for this purpose in the simple case?...).
The example interrupt handlers would apparently also use the user stack for interrupt handling (and so would break if the program had repurposed SP).
There seem to be other more advanced variants, and other interrupt handler mechanisms.
Granted, one could make a case that no self-respecting OS would use the minimal form of the CLINT mechanism and ISR examples as a base, but, ...
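
The three-step entry sequence listed above can be modelled in a few lines of C. This is a sketch of basic machine-mode trap entry as described in the RISC-V privileged specification (mstatus bit positions MIE=3, MPIE=7, MPP=12:11, and the direct/vectored modes of mtvec); the `hart` struct and the function names are hypothetical, and details such as mtval and delegation are left out.

```c
#include <assert.h>
#include <stdint.h>

#define MSTATUS_MIE       (UINT64_C(1) << 3)
#define MSTATUS_MPIE      (UINT64_C(1) << 7)
#define MSTATUS_MPP_SHIFT 11
#define MSTATUS_MPP_MASK  (UINT64_C(3) << MSTATUS_MPP_SHIFT)

/* "Poke around in MSTATUS": MPIE <- MIE, MIE <- 0, MPP <- previous mode. */
uint64_t mstatus_on_trap(uint64_t mstatus, unsigned priv)
{
    assert(priv == 0 || priv == 1 || priv == 3);   /* U, S or M */
    uint64_t s = mstatus & ~(MSTATUS_MPIE | MSTATUS_MPP_MASK);
    if (s & MSTATUS_MIE)
        s |= MSTATUS_MPIE;
    s &= ~MSTATUS_MIE;
    return s | ((uint64_t)priv << MSTATUS_MPP_SHIFT);
}

/* "Branch to MTVEC": mode 0 is direct, mode 1 vectors interrupts only. */
uint64_t trap_target(uint64_t mtvec, uint64_t cause, int is_interrupt)
{
    uint64_t base = mtvec & ~UINT64_C(3);
    unsigned mode = (unsigned)(mtvec & 3);
    return (mode == 1 && is_interrupt) ? base + 4 * cause : base;
}

typedef struct {
    uint64_t pc;
    uint64_t x[32];      /* x[2] is "sp" by ABI convention only */
    unsigned priv;       /* 0 = U, 1 = S, 3 = M */
    uint64_t mstatus, mepc, mcause, mtvec;
} hart;

/* Whole entry sequence; note that no x register is read or written. */
void take_trap_to_m(hart *h, uint64_t cause, int is_interrupt)
{
    h->mepc = h->pc;                                   /* copy PC to MEPC */
    h->mcause = cause | ((uint64_t)(is_interrupt != 0) << 63);
    h->mstatus = mstatus_on_trap(h->mstatus, h->priv);
    h->priv = 3;
    h->pc = trap_target(h->mtvec, cause, is_interrupt);
}
```

Nothing in this sequence touches x2, so whatever the interrupted code left there is what the handler's first instruction sees; a handler that wants its own stack has to switch explicitly, for which swapping sp with the mscratch CSR is the usual idiom.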
Can also note here that SH-2 (and SH-3) had an interrupt mechanism like:
   Push SR and PC onto stack;
   Branch to entry point.
In this case, the OS would likely explode if SP were used differently.
It was a vaguely similar interrupt mechanism to that used on the 8086 in this sense.
Though, SH-2A had switched to a more advanced interrupt mechanism with bank-swapped register files.
Though, generally, any loaders will typically also need to provide the initial SP, etc, meaning there still needs to be agreement here as to which register holds the stack.
In my case, for BJX2, a mechanism was used where the interrupt handler will cause SP and SSP to swap places. Implicitly, it does assume here that it is swapping the stacks.
So, say, my mechanism:
   Copy PC to SPC;
   Copy SR bits, and status code, into EXSR;
   Twiddle bits in SR to set to ISR mode,
      which causes SP and SSP to switch places;
   Computed branch relative to VBR (roughly equivalent to MTVEC).
   There are entry points for different ISR categories,
   with an 8-byte offset between each.
   Each entry point generally encodes a branch.
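
For comparison with the RISC-V sequence, the steps above might be sketched as follows. Everything about the encoding here is assumed for illustration: the post gives no bit positions, so `SR_ISR_MODE`, the EXSR packing, and the struct layout are hypothetical, SPC is taken to receive the interrupted PC (the counterpart of MEPC), and the SP/SSP switch is modelled as a plain value swap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encodings; not taken from any BJX2 document. */
#define SR_ISR_MODE   (UINT64_C(1) << 28)  /* assumed "in ISR" flag in SR */
#define EXSR_SR_SHIFT 16                   /* assumed: SR bits above the code */

typedef struct {
    uint64_t pc, spc;    /* program counter and its saved copy */
    uint64_t sr, exsr;   /* status register and exception status */
    uint64_t sp, ssp;    /* active and shadow stack pointers */
    uint64_t vbr;        /* vector base, roughly equivalent to MTVEC */
} bjx2_state;

/* Pack the low 32 SR bits and a 16-bit status code into one EXSR value. */
uint64_t bjx2_pack_exsr(uint64_t sr, uint16_t status)
{
    return ((sr & UINT64_C(0xFFFFFFFF)) << EXSR_SR_SHIFT) | status;
}

/* Entry points sit 8 bytes apart, one per ISR category. */
uint64_t bjx2_entry_point(uint64_t vbr, unsigned category)
{
    return vbr + UINT64_C(8) * category;
}

void bjx2_enter_isr(bjx2_state *s, unsigned category, uint16_t status)
{
    assert(s);
    s->spc = s->pc;                           /* save the interrupted PC */
    s->exsr = bjx2_pack_exsr(s->sr, status);  /* SR bits plus status code */
    s->sr |= SR_ISR_MODE;                     /* enter ISR mode */
    uint64_t t = s->sp;                       /* SP and SSP switch places */
    s->sp = s->ssp;
    s->ssp = t;
    s->pc = bjx2_entry_point(s->vbr, category);
}
```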
Early forms of BJX2 had also banked R0 and R1, but this was dropped.
I had a few times also considered dropping the SP/SSP switch, but ended up not doing so.
Early on, the mechanism would actually swap the values of SP and SSP, but this was later changed to having them switch places via the instruction decoder (was cheaper than a value-swap mechanism).
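
The difference between swapping values and switching places in the decoder can be shown with a toy register-index mapper. This is only a sketch under assumed numbering: architectural SP is taken to be register 15 and SSP is given a made-up physical slot 32; the function names are hypothetical as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed numbering, for illustration only. */
enum { REG_SP = 15, PHYS_SSP = 32 };

/* Map a GPR index to a physical slot. In ISR mode the name "SP"
 * selects the SSP slot; no register contents are moved. */
unsigned decode_gpr(unsigned arch_index, bool isr_mode)
{
    assert(arch_index < 32);
    if (isr_mode && arch_index == REG_SP)
        return PHYS_SSP;
    return arch_index;
}

/* The name "SSP" resolves the opposite way, so the two trade places. */
unsigned decode_ssp(bool isr_mode)
{
    return isr_mode ? REG_SP : PHYS_SSP;
}
```

With a mapping like this, entering or leaving ISR mode changes only one mode bit; the value-swap variant needs two register reads and two writes on every transition.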
For general interrupts, the mechanism usually involved manually saving all the registers to the ISR stack.
For interrupts where the task context was known valid, the mechanism became to instead save all the registers into the task context. Though, in this case, this was partly because things like SYSCALL events effectively performed a context switch merely using the ISR handler as a springboard.

I am not saying they don't look like GPRs in the ISA, but rather that
they aren't really GPRs in terms of roles or behavior, but rather they
are essentially SPRs that just so happen to live in the GPR numbering space.
As far as the architecture is concerned, they are GPRs. Yes, an ABI
specifies a special role for some of them, but the ABI is software,
not architecture. E.g., in early MIPS ABIs (in particular, on
Ultrix), there was no GP, in later MIPS ABIs, there was.
One can nicely see the role of the ABI in Table 25.1 of
<http://staff.ustc.edu.cn/~comparch/reference/riscv-spec%EF%BC%880305%EF%BC%89.pdf>
(page 137); it has a column called "Register" (with names like "x1")
and a column called "ABI name" (with names like "ra"). The caption
says: "Assembler mnemonics for RISC-V integer and floating-point
registers, and their role in the first standard calling convention."
So the architects expect the architecture to live longer than this
ABI.

I think there is a definition / tradeoff here:
Once a register's usage pattern is sufficiently entrenched that one can no longer "reasonably" do otherwise, it goes from being a GPR to being an SPR.
And, if the ABI, instruction encodings, and any OS and/or firmware are likely to assume a particular usage (and/or are at risk of breaking if the assumption is not followed), the status applies de facto (regardless of whether or not any special logic exists at the hardware level).
Though, at least in my CPU core, special hardware-level logic does exist for these registers. So, it isn't a stretch to consider them as being SPRs.
In any case, it is a contrast with the other GPRs, where any contents they may hold are purely at the whim of the compiler (and their usage is more subject to categorical distinctions, like caller- vs callee-saved, etc).

It might be even due to things as simple as "well, the OS kernel and
program launcher assume that stack is in X2, and system calls assume
stack is in X2, ...". You have little real choice but to put the stack
in X2, and if you try putting something else there, and a system call or
interrupt happens, ..., there is little to say that things won't "go
sideways", so, not really a GPR.
I don't know what OS you have in mind, but in any OS where there is a
boundary between user space and system space, the system does not use
what may be the user-space stack pointer for storing its data, not on
system calls, and certainly not on interrupts. And when I last looked
at Linux system calls, the actual system call interface (not the C
wrapper around it) passed parameters to system calls in registers, not
on the user-level stack.
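
A small Linux-specific C sketch of that point: the generic `syscall()` entry point from `<unistd.h>` takes the system call number and its arguments as ordinary values, which the kernel interface receives in registers (on RISC-V Linux, a7 for the number and a0 onward for the arguments). The wrapper names `raw_write` and `raw_getpid` are made up for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* write(2) through the generic entry point: number plus three arguments,
 * none of which is passed on the user-level stack. */
long raw_write(int fd, const void *buf, unsigned long len)
{
    assert(buf != 0 || len == 0);
    return syscall(SYS_write, fd, buf, len);
}

/* A system call with no arguments at all. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```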
The only case where a stack pointer register may come into play is
when the OS calls a signal handler, but I have not looked at the
machine-level interface there, so I cannot say for sure. In any case,
that does not affect all the code that is not a signal handler.

I would not be so quick to exclude the likes of uClinux and XV6...
For something like x86 protected mode or long mode, well, yeah, user rSP shouldn't matter, since each protection ring has its own registers here.
But, even with a proper OS, it isn't unreasonable to assume that the kernel could see that the user-process has gone and set SP or GP to something unorthodox and then promptly terminate the whole process out of principle (say, for example, mangled SP or GP being taken as evidence that the process may have been hijacked by shell code).

Global Pointer is assumed as such by the ABI, and OS may care about it,
so not really a GPR.
Why should the OS kernel care about the global pointer of a user-level
program?

For both PE/COFF (*1) and ELF PIE, the loader generally needs to set the value for the global pointer. The OS loader also needs to provide the stack, etc.
Actually, for ELF PIE there is a whole bunch of stuff here, not just the initial state of some registers, but the contents of the stack (argument lists and "auxiliary vector").
*1: At least on targets where a global pointer is used. Here, TestKern would fall into this category.
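
The auxiliary vector mentioned above is visible from user space on Linux through `getauxval()` from `<sys/auxv.h>`. A minimal sketch, with made-up wrapper names, that reads two of the entries the loader places on the initial stack:

```c
#include <assert.h>
#include <sys/auxv.h>
#include <unistd.h>

/* Page size as reported in the auxiliary vector (AT_PAGESZ). */
unsigned long auxv_page_size(void)
{
    unsigned long sz = getauxval(AT_PAGESZ);
    assert(sz != 0);   /* getauxval returns 0 when the entry is missing */
    return sz;
}

/* Address of the running executable's program headers (AT_PHDR). */
unsigned long auxv_phdr_address(void)
{
    return getauxval(AT_PHDR);
}

/* Cross-check the auxv value against the usual sysconf() query. */
int auxv_matches_sysconf(void)
{
    return auxv_page_size() == (unsigned long)sysconf(_SC_PAGESIZE);
}
```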
Granted, this differs from static ELF where typically setting the global pointer, and sometimes zeroing ".bss" is managed by the program itself (though, generally, ELF loaders will also zero the ".bss", *).
*: There is an informal way of loading static ELF which is to simply read the whole ELF image to the load address, in which case it needs to zero its own ".bss".
In PEL, I had also defined FileOffset==RVA, which allows for simply decompressing the PEL image into RAM. Whereas, traditional PE/COFF would require reading into a temporary buffer and then copying over each section.

I decided to classify X5/TP as a GPR as its usage is roughly up to the
discretion of the ABI and C runtime library (at least in RISC-V, there
are no hard-coded ISA level assumptions about TP, nor does it cross into
the OS kernel's realm of concern).
Table 25.1 (mentioned above) gives tp as ABI name for x4, and t0 as
ABI name for x5.

Typo...

- anton
