Subject: Keeping other stuff with addresses (was: What is an N-bit machine?)
From: anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Date: 30 Nov 2024, 07:28:29
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2024Nov30.072829@mips.complang.tuwien.ac.at>
References: 1 2 3
User-Agent: xrn 10.11
John Levine <johnl@taugh.com> writes:
> S/360 had 24 bit addresses and 32 bit registers. When doing address arithmetic
> the high 8 bits of the register were ignored. That turned out to be a really bad
> decision since a few instructions and a lot of programming conventions stored
> other stuff in that high byte, causing severe pain a few years later when
> memories got bigger than 16 meg.
The technique of putting stuff in unused bits of an address has its
drawbacks, but it also has benefits; in particular, type information is
often stored there (even on architectures that do not ignore any
bits). Of course AMD and Intel had the bad examples of S/360 and
68000 in mind, and did not want to have anything to do with that
during the first two decades of AMD64.
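To make the technique concrete, here is a minimal sketch of keeping a type tag in the top byte of a pointer on a machine that does not ignore any address bits. The names (tagged_ptr, tag_pointer, get_tag, untag_pointer) are made up for this example, and it assumes that user-space addresses fit in the low 56 bits, which C does not guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: user-space addresses fit in the low 56 bits, so the top
   byte is free for a tag.  Nothing in C guarantees this. */
#define TAG_SHIFT 56
#define ADDR_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)

typedef uint64_t tagged_ptr;   /* hypothetical name for this sketch */

/* Combine an address and an 8-bit tag into one 64-bit word. */
tagged_ptr tag_pointer(void *p, unsigned tag)
{
    uint64_t bits = (uint64_t)(uintptr_t)p;
    assert((bits & ~ADDR_MASK) == 0);   /* top byte must be unused */
    return bits | ((uint64_t)(tag & 0xff) << TAG_SHIFT);
}

/* Extract the tag from the top byte. */
unsigned get_tag(tagged_ptr t)
{
    return (unsigned)(t >> TAG_SHIFT);
}

/* Without hardware that ignores the top bits, this masking step is
   needed before every memory access through the pointer. */
void *untag_pointer(tagged_ptr t)
{
    return (void *)(uintptr_t)(t & ADDR_MASK);
}
```

A caller would write something like `*(long *)untag_pointer(t)` on each access; the hardware features discussed in this thread are about making that explicit mask unnecessary.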
The designers of ARM A64 could think beyond that and designed in the
top-byte-ignore feature. Apparently this made AMD and Intel see the
light:
AMD added the upper-address ignore feature, which, when enabled,
ignores the top 7 bits. One problem with this in the Linux kernel
(and maybe other OSs) is that the Linux kernel expects the top bit to
be set only for kernel addresses. Not sure how that works with ARM's
top-byte-ignore feature, which has been supported since Linux 5.4 in 2019
using the PR_SET_TAGGED_ADDR_CTRL option of the prctl() call.
Intel added the linear address masking feature, with two variants:
LAM_U57 ignores bits 57-62 (but not the MSB), allowing 6 bits for
other uses; LAM_U48 ignores bits 48-62, allowing 15 bits for other
uses. These variants require bit 63 to have the same value as bit
56/47; another bit could be made available by ignoring bit 56/47 (the
information is in bit 63 anyway), but Intel apparently decided that
programmers don't need that extra bit.
RISC-V has the pointer-masking extension, which ignores the top 7 bits
(like AMD's upper-address ignore) or optionally 16 bits.
See <https://muxup.com/2023q4/storing-data-in-pointers> and
<https://lwn.net/Articles/902094/>.
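The bit ranges above can be written down as masks, which makes the counts easy to check. This is only a sketch: the function and macro names are made up here and do not come from any header, and the ranges are the ones stated in this posting (top byte for ARM, top 7 bits for AMD, bits 57-62 and 48-62 for the two Intel variants).

```c
#include <assert.h>
#include <stdint.h>

/* Mask with bits lo..hi (inclusive) set; requires lo <= hi <= 63. */
uint64_t bit_range_mask(unsigned lo, unsigned hi)
{
    assert(lo <= hi && hi <= 63);
    return (~UINT64_C(0) >> (63 - hi)) & (~UINT64_C(0) << lo);
}

/* Number of set bits, i.e. how many bits are free for other uses. */
unsigned count_bits(uint64_t m)
{
    unsigned n = 0;
    for (; m != 0; m &= m - 1)   /* clear the lowest set bit each round */
        n++;
    return n;
}

/* Ignored ranges as described in the text (hypothetical macro names). */
#define ARM_TBI_MASK  bit_range_mask(56, 63)  /* top byte          */
#define AMD_UAI_MASK  bit_range_mask(57, 63)  /* top 7 bits        */
#define LAM_U57_MASK  bit_range_mask(57, 62)  /* 6 bits, MSB kept  */
#define LAM_U48_MASK  bit_range_mask(48, 62)  /* 15 bits, MSB kept */

/* Per the text, the LAM variants require bit 63 to equal bit 56
   (LAM_U57) or bit 47 (LAM_U48); pass 56 or 47 as sign_bit. */
int lam_bit63_matches(uint64_t addr, unsigned sign_bit)
{
    return ((addr >> 63) & 1) == ((addr >> sign_bit) & 1);
}
```

For example, count_bits(LAM_U57_MASK) gives the 6 bits and count_bits(LAM_U48_MASK) the 15 bits mentioned above.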
Concerning the kernel requirements, as someone who has implemented
Prolog with tagging, having to untag on passing an address to the
kernel would be only a minimal cost. Not having to unmask on every
memory access would be quite useful. Having the top bit always be 0
with the tags in the 6 bits below would not have been a restriction
for us (we used 4-bit tags); OTOH, if one more bit was available,
programmers would find good uses for it.
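As an illustration of untagging only at the kernel boundary, here is a rough sketch with a 4-bit tag placed in bits 57-60 and the top bit left at 0. The layout and the names (add_tag, strip_tag, tagged_write) are assumptions made up for this example, not a description of any particular Prolog system; it also assumes a POSIX system and that bits 57-60 of ordinary user addresses are zero.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical layout: 4-bit tag in bits 57-60, bit 63 always 0. */
#define KTAG_BITS  4
#define KTAG_SHIFT 57
#define KTAG_MASK  (((UINT64_C(1) << KTAG_BITS) - 1) << KTAG_SHIFT)

/* Attach a 4-bit tag; assumes the tag bits of p are zero. */
uint64_t add_tag(const void *p, unsigned tag)
{
    uint64_t bits = (uint64_t)(uintptr_t)p;
    assert((bits & KTAG_MASK) == 0);
    return bits | ((uint64_t)(tag & 0xf) << KTAG_SHIFT);
}

/* Remove the tag, giving back the plain address. */
void *strip_tag(uint64_t t)
{
    return (void *)(uintptr_t)(t & ~KTAG_MASK);
}

/* The tag is stripped once, here, where the address is handed to the
   kernel.  With hardware that ignores the tag bits, ordinary loads
   and stores elsewhere would not need the mask at all. */
ssize_t tagged_write(int fd, uint64_t tagged_buf, size_t len)
{
    return write(fd, strip_tag(tagged_buf), len);
}
```

The cost is one AND per system call that takes a pointer, which is small compared to the system call itself.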
> These days I'd say the relevant N is the size of arithmetic registers but a
> lot of marketers appear to disagree with me.
Which arithmetic registers on an Intel processor? The 64 bits of a
GPR? The 128 bits of an XMM register? The 256 bits of a YMM
register? The 512 bits of a ZMM register? Note that until recently,
Intel sold you the same silicon either with only XMM or with XMM and
YMM registers. They have sold consumer CPUs with XMM, YMM, and ZMM
registers, but more recent consumer and small-server CPUs have
reverted to only supporting XMM and YMM registers.
- anton
-- 
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>