Subject : Re: Instruction Tracing
From : already5chosen (at) *nospam* yahoo.com (Michael S)
Newsgroups : comp.arch
Date : 12 Aug 2024, 16:14:53
Organization : A noiseless patient Spider
Message-ID : <20240812181453.00004e50@yahoo.com>
References : 1 2 3 4 5 6 7 8 9
User-Agent : Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)
On Mon, 12 Aug 2024 08:42:51 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
> On Mon, 12 Aug 2024 11:09:18 +0300, Michael S wrote:
>> On Mon, 12 Aug 2024 06:33:17 -0000 (UTC)
>> Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
>>> But in spite of having, say, 2½ times the clock speed of POWER,
>>> Alpha was not 2½ times faster, was it?
>> Of course not.
> That’s what I mean: it took several clock cycles per instruction,
> contrary to just about every other RISC architecture.
On EV4, simple ALU instructions took 1 cycle, both for throughput and
for latency.
Shifts and conditional moves had a latency of 2 and a throughput of 1.
The integer multiplier was not pipelined, but a few other RISCs also had
non-pipelined multipliers. Its latency was 19-21 cycles.
On the FP side, both FADD and FMUL were fully pipelined (T=1) and had
a latency of 6 cycles.
L1D cache hits were fully pipelined (T=1) and had a latency of 3 cycles.
So, as long as code and data fit in L1 cache, EV4 IPC was not
far behind the competition. Relative to the MIPS R4K, it was maybe even ahead.
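
To make the latency vs. throughput distinction concrete, here is a rough
Python sketch using only the EV4 figures quoted above. It is a toy model,
not a simulator: it assumes single issue, ignores dual issue and all other
hazards, and treats the non-pipelined integer multiply as having an issue
interval equal to its latency (19 cycles, the low end of the 19-21 range).
The function names and the table layout are made up for this illustration.

```python
# (latency, issue interval) in cycles, taken from the figures in this post.
# Issue interval = cycles between two successive independent operations.
# Assumption: the non-pipelined integer multiply cannot overlap with the
# next multiply, so its issue interval is set equal to its latency.
EV4 = {
    "alu": (1, 1),
    "shift": (2, 1),
    "cmov": (2, 1),
    "imul": (19, 19),
    "fadd": (6, 1),
    "fmul": (6, 1),
    "load_l1_hit": (3, 1),
}


def dependent_chain_cycles(op, n):
    """Cycles for n ops where each one needs the previous result."""
    latency, _ = EV4[op]
    return n * latency


def independent_stream_cycles(op, n):
    """Cycles for n independent ops issued back to back."""
    latency, interval = EV4[op]
    return latency + (n - 1) * interval


if __name__ == "__main__":
    n = 100
    for op in ("alu", "fadd", "load_l1_hit", "imul"):
        dep = dependent_chain_cycles(op, n)
        ind = independent_stream_cycles(op, n)
        print(f"{op:12s} dependent={dep:5d}  independent={ind:5d}")
```

With enough independent work, the pipelined units approach one result per
cycle despite their multi-cycle latency. Only the multiply stays expensive
in both cases.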
Of course, cache misses were relatively more expensive than on much
lower-clocked competitors. DEC's solution to that was a wide and fast
system bus.
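
The reason a miss costs relatively more at a higher clock is simple
arithmetic: if the time to fetch a line from memory is roughly fixed in
nanoseconds, the number of cycles lost scales with the clock. A small
Python sketch follows. All three numbers in it are hypothetical and chosen
only for illustration (the 2.5x clock ratio echoes the figure mentioned
earlier in the thread); none of them is a measured value for any real
machine.

```python
def miss_penalty_cycles(miss_time_ns, clock_mhz):
    """Cycles lost to one miss: time (ns) * frequency (cycles per ns)."""
    return miss_time_ns * clock_mhz / 1000.0


# Hypothetical values, for illustration only.
MISS_TIME_NS = 200.0                     # assumed fixed memory access time
FAST_CLOCK_MHZ = 200.0                   # assumed high-clocked CPU
SLOW_CLOCK_MHZ = FAST_CLOCK_MHZ / 2.5    # assumed competitor at 2.5x lower clock

if __name__ == "__main__":
    for name, mhz in (("fast", FAST_CLOCK_MHZ), ("slow", SLOW_CLOCK_MHZ)):
        cycles = miss_penalty_cycles(MISS_TIME_NS, mhz)
        print(f"{name}: {mhz:.0f} MHz -> {cycles:.0f} cycles per miss")
```

Under these assumed numbers the same miss costs 2.5 times as many cycles on
the faster-clocked part, which is why cutting the miss time itself (for
example with a wider, faster bus) matters more there.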