Newsportal USENET - Re: Instruction Tracing

Subject: Re: Instruction Tracing
From: mitchalsup (at) *nospam* aol.com (MitchAlsup1)
Newsgroups: comp.arch
Date: 10. Aug 2024, 19:33:36

Organization: Rocksolid Light
Message-ID: <4982caec9dafb4d0dac0e86a85220e56@www.novabbs.org>
References: 1 2
User-Agent: Rocksolid Light

On Sat, 10 Aug 2024 10:18:02 +0000, Anton Ertl wrote:

> Lawrence D'Oliveiro <ldo@nz.invalid> writes:
>> One thing these instruction traces would frequently report is that integer
>> multiply and divide instructions were not so common, and so could be
>> omitted and emulated in software, with minimal impact on overall
>> performance. We saw this design decision taken in the early versions of
>> Sun’s SPARC for example, and also IBM’s ROMP as used in the RT PC.
>
> Alpha and IA-64 have no integer division. IIRC IA-64 has no FP
> division.

"Stupid is as stupid does." (Forrest Gump)
{and applicable, too}
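
For readers wondering what "emulated in software" amounts to in the quoted text above, here is a minimal sketch of the classic shift-and-add multiply and restoring (shift-and-subtract) divide on unsigned values. The function names and the 32-bit default width are assumptions chosen for illustration; actual SPARC or ROMP runtime routines were written in assembly and differed in detail.

```python
def soft_mul(a, b, bits=32):
    """Unsigned multiply by shift-and-add, truncated to `bits` bits."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    product = 0
    while b:
        if b & 1:                      # low multiplier bit set: add the shifted multiplicand
            product = (product + a) & mask
        a = (a << 1) & mask            # next partial product
        b >>= 1
    return product


def soft_divmod(n, d, bits=32):
    """Unsigned restoring division: one quotient bit per iteration."""
    if d == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for i in range(bits - 1, -1, -1):
        remainder = (remainder << 1) | ((n >> i) & 1)   # bring down the next dividend bit
        if remainder >= d:
            remainder -= d
            quotient |= 1 << i
    return quotient, remainder


print(soft_mul(6, 7))        # 42
print(soft_divmod(100, 7))   # (14, 2)
```

Each loop iteration costs several simple instructions, so a 32-bit operation takes on the order of a hundred cycles this way. That is tolerable only as long as the traces are right that such operations are rare.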

> One interesting aspect of RISC-V is that they put multiplication and
> division in the same extension (which is included in RV64G, i.e., the
> General version of RISC-V).
>
>> Later, it seems, the CPU designers realized that instruction traces were
>> not the final word on performance measurements, and started to include
>> hardware integer multiply and divide instructions.
>
> When you invest more hardware to increase performance per cycle, at
> some point the best return on investment is to have multiplication and
> division instructions. What is interesting is that the multipliers
> were then soon fully pipelined.

The MUL unit of the Mc88100 was fully pipelined (1985). Integer multiply
was 3 cycles, single precision was 4 cycles, and double precision was 7,
IIRC.

> Or, as Mitch Alsup reports, in
> cases where that was cheaper, have two half-pipelined multipliers.

When the multiplier tree delay is greater than 1 cycle, it becomes
cheaper to have 2×½ multipliers without a stage delay than to have
1 multiplier with 4096 flip-flops in the middle. Here, cheaper means
smaller and consuming less power.
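
To see why two half-pipelined multipliers can stand in for one fully pipelined unit, here is a toy issue model. It is only a sketch under stated assumptions: at most one multiply issues per cycle, each unit accepts a new operation every `initiation_interval` cycles, and the function name and parameters are hypothetical. It says nothing about area or power, which is where the actual saving lies.

```python
def issue_cycles(num_ops, num_units, initiation_interval):
    """Return the cycle in which each of `num_ops` back-to-back multiplies issues.

    Assumes at most one issue per cycle, each op going to the unit that
    becomes free earliest.
    """
    free_at = [0] * num_units          # first cycle each unit can accept a new op
    cycle = 0
    issued = []
    for _ in range(num_ops):
        unit = min(range(num_units), key=lambda u: free_at[u])
        cycle = max(cycle, free_at[unit])   # stall until that unit is free
        issued.append(cycle)
        free_at[unit] = cycle + initiation_interval
        cycle += 1
    return issued


print(issue_cycles(6, 1, 1))   # one fully pipelined unit:  [0, 1, 2, 3, 4, 5]
print(issue_cycles(6, 2, 2))   # two half-pipelined units:  [0, 1, 2, 3, 4, 5]
print(issue_cycles(4, 1, 2))   # one half-pipelined unit:   [0, 2, 4, 6]
```

Under these assumptions the two half-pipelined units sustain the same one-multiply-per-cycle rate as the fully pipelined one, without the mid-tree pipeline register.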

> Apparently there are enough applications that require a huge number of
> multiplications; my guess is that the NSA won't tell us what they are.

AES is greatly sped up with a carry-less multiplication; all one has to
do is deactivate the majority gate in the CAS cell (which adds no
gates of delay or area).
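
As a software illustration of the point above: with the carry (majority) output suppressed, each adder cell just produces the XOR of its inputs, so the multiplier array computes a product of polynomials over GF(2). The sketch below assumes that reading; the function names are made up for this example, and the reduction polynomial 0x11B is the one AES uses for its byte field.

```python
def clmul(a, b):
    """Carry-less multiply: shift-and-add with XOR in place of addition."""
    result = 0
    while b:
        if b & 1:
            result ^= a    # XOR is "add with the carry suppressed"
        a <<= 1
        b >>= 1
    return result


def gf256_reduce(p, modulus=0x11B):
    """Reduce a GF(2) polynomial modulo x^8 + x^4 + x^3 + x + 1 (AES field)."""
    while p.bit_length() > 8:
        # Align the modulus with the top set bit of p and cancel that bit.
        p ^= modulus << (p.bit_length() - 9)
    return p


def gf256_mul(a, b):
    """Multiply two bytes in the AES field GF(2^8)."""
    return gf256_reduce(clmul(a, b))


print(bin(clmul(0b101, 0b11)))     # 0b1111
print(hex(gf256_mul(0x57, 0x83)))  # 0xc1, the worked example in FIPS 197
```

The only difference from an ordinary shift-and-add multiply is the `^=` where `+=` would be, which is the software counterpart of switching off the carry.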

>
> - anton
