Subject : Re: Arguments for a sane ISA 6-years later
From : anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups : comp.arch
Date : 29. Jul 2024, 13:59:33
Organization : Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID : <2024Jul29.145933@mips.complang.tuwien.ac.at>
References : 1 2 3 4 5
User-Agent : xrn 10.11
BGB <cr88192@gmail.com> writes:
>On 7/26/2024 12:00 PM, Anton Ertl wrote:
>> "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
>>> and it's more efficient
>>
>> That depends on the hardware.
>>
>> Yes, the Alpha 21164 with its imprecise exceptions was "more
>> efficient" than other hardware for a while, then the Pentium Pro came
>> along and gave us precise exceptions and more efficiency. And
>> eventually the Alpha people learned the trick, too, and 21264 provided
>> precise exceptions (although they did not admit this) and more
>> efficiency.
>>
>> Similarly, I expect that hardware that is designed for good TSO or
>> sequential consistency performance will run faster on code written for
>> this model than code written for weakly consistent hardware will run
>> on that hardware. That's because software written for weakly
>> consistent hardware often has to insert barriers or atomic operations
>> just in case, and these operations are slow on hardware optimized for
>> weak consistency.
>
>TSO requires more significant hardware complexity though.
An efficient implementation of TSO or sequential consistency requires
more hardware, yes.

Floating point requires more hardware than fixed point. Precise
exceptions require more hardware than imprecise exceptions. Caches
require more hardware than the local memory of Cell's SPEs. OoO
requires more hardware than in-order; in this case the IA-64
implementations demonstrated that you could then spend the area budget
on more in-order resources (and big caches) and still fail to keep up
on SPECint with the smaller OoO competition. In all these cases we
decided that the benefit is worth the additional hardware. I think
that's the case for strong memory ordering, too.
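
To make the point about barriers on weakly consistent hardware
concrete, here is a minimal sketch of the message-passing idiom in
C++11. The names (message_passing_once, payload, ready) are
hypothetical and chosen only for illustration. On a TSO machine such
as x86 the release store and the acquire load typically compile to
ordinary stores and loads; on weakly ordered architectures the
compiler has to emit fences or special load/store instructions for
them.

```cpp
#include <atomic>
#include <thread>

// Illustrative only: one producer publishes a value, one consumer reads it.
// Returns the value the consumer observed.
int message_passing_once() {
    int payload = 0;                 // plain (non-atomic) data
    std::atomic<bool> ready{false};  // flag that publishes the data
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        // release: the write to payload may not be reordered after this store
        ready.store(true, std::memory_order_release);
    });

    std::thread consumer([&] {
        // acquire: the read of payload may not be reordered before this load
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

With memory_order_relaxed on both sides the read of payload would be a
data race, so the ordering has to be requested explicitly, and on
weakly ordered hardware that request is what costs extra instructions.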
>Seems like it would be harder to debug the hardware since:
>   There is more that has to go on in the hardware for TSO to work;
>   Software will have higher expectations that it actually work.
Possible. Delivering working hardware is the job of hardware
engineers. Intel and AMD apparently have no problems getting the TSO
parts of their architectures right. However, it seems that they don't
go for "really efficient" TSO, or they would just upgrade the parts of
their architecture with weaker consistency to have TSO.
- anton
-- 
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>