Subject: Re: Instruction Tracing
From: terje.mathisen (at) *nospam* tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Date: 12. Aug 2024, 07:10:41
Organization: A noiseless patient Spider
Message-ID : <v9c912$35the$1@dont-email.me>
References : 1 2 3 4 5
User-Agent : Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0 SeaMonkey/2.53.18.2
Lawrence D'Oliveiro wrote:
> On Sun, 11 Aug 2024 14:44:38 GMT, Anton Ertl wrote:
>> Power (IIRC) and Alpha don't have delayed branches.
>
> Not only does POWER not have delayed branches, but I recall the IBM folks
> claiming in the initial publicity that branches could often execute in
> zero clock cycles--that is, fully overlapped with surrounding
> instructions.

AFAIR, the original POWER had 3 chips, with branches in a separate unit from integer/logic ops, right?
The idea, as presented in that month's BYTE magazine, was that the entire round trip (transferring the comparison flags over to the branch unit, selecting the corresponding direction, and then transmitting the resulting IP back to the fetch unit) would happen fast enough that those off-chip latencies would not matter.
It also had multiple (8?) sets of compare result flags in order to avoid making them a speed limiter.
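
A rough way to picture why several flag sets help: each compare names the field it writes, and each conditional branch names the field and bit it tests, so an early compare can park its result while later compares use other fields. The toy Python model below is only a sketch under that assumption; the class and method names are made up for illustration, it ignores the SO (summary overflow) bit, and it says nothing about timing.

```python
# Toy model, not a cycle-accurate simulator: eight independent condition
# fields, in the style of the POWER/PowerPC condition register (CR0..CR7).
# All names here are hypothetical, chosen only for this illustration.
LT, GT, EQ = 0, 1, 2  # bit positions within one field (SO bit omitted)


class ConditionRegister:
    def __init__(self, fields=8):
        # Each field holds its own [LT, GT, EQ] result.
        self.fields = [[False, False, False] for _ in range(fields)]

    def compare(self, field, a, b):
        """Compare a with b and write the result to the chosen field only."""
        self.fields[field] = [a < b, a > b, a == b]

    def branch_taken(self, field, bit):
        """A conditional branch tests a single bit of a single field."""
        return self.fields[field][bit]


cr = ConditionRegister()
cr.compare(0, 3, 7)   # early compare, result parked in field 0
cr.compare(1, 5, 5)   # a later compare targets field 1; field 0 is untouched

# Both branches can still be resolved from their own fields.
first_branch = cr.branch_taken(0, LT)
second_branch = cr.branch_taken(1, EQ)
```

With a single flag set, the second compare would have overwritten the first result, forcing the first branch to be resolved before the second compare could issue.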

> POWER was also “superscalar” (being able to execute more than one
> operation per clock cycle) right from the beginning. Not sure if other
> RISC architectures of the time were like that. I don’t think Alpha was:
> one thing I remember from its early descriptions was its use of very high
> clock speeds. That seemed to me to be the opposite of “(at least) one
> instruction per clock cycle”, which I thought was supposed to be one of
> the defining features of RISC.

Yeah, the R part was intended to make latency a single cycle for _most_ instructions.
Terje
-- 
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"