Newsportal USENET - Re: Instruction Tracing

On 8/10/2024 2:41 PM, John Dallman wrote:

In article <2024Aug10.121802@mips.complang.tuwien.ac.at>,
anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

IIRC IA-64 has no FP division.
You recall correctly. It has an "approximation to reciprocal" instruction,
which gives you about 8 bits of precision, and then requires the compiler
to generate Newton-Raphson sequences. Intel's manual, 2010 edition, says
this is advantageous because users can generate only the precision they
need. Writing Itanium assembler for customised precision? Not many people
would have wanted to do that in 2001, let alone 2010.
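For anyone who hasn't seen the scheme described above, here is a minimal sketch in C. The helper recip_approx8() is a hypothetical stand-in for the hardware approximation instruction (it cheats by dividing and then throwing away all but about 8 mantissa bits), and the names and the step count of 3 are assumptions for illustration, not what any particular compiler emits. Each Newton-Raphson step roughly doubles the number of correct bits, so 8 -> 16 -> 32 -> 64.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hardware "reciprocal approximation"
   instruction: returns 1/x with only about 8 good mantissa bits. */
static double recip_approx8(double x)
{
    double r = 1.0 / x;                  /* used only to build the seed */
    uint64_t bits;
    memcpy(&bits, &r, sizeof bits);
    bits &= ~((UINT64_C(1) << 44) - 1);  /* keep the top 8 of 52 mantissa bits */
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* Newton-Raphson refinement of the reciprocal: r' = r * (2 - x*r).
   The relative error is squared on every step. */
static double recip_nr(double x, int steps)
{
    double r = recip_approx8(x);
    for (int i = 0; i < steps; i++)
        r = r * (2.0 - x * r);
    return r;
}

/* Division built from the refined reciprocal (assumes b is nonzero
   and finite; no special-case handling, result may be off by a few ulp). */
static double div_nr(double a, double b)
{
    return a * recip_nr(b, 3);
}
```

Passing a smaller step count to recip_nr() gives the "only the precision you need" variant: one step is roughly 16 bits, two steps roughly 32.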
In, I think, 1996, my employers had visitors from Intel trying to
persuade us to adopt their C/C++ compiler for IA-32. They had been able
to speed up one of our competitors' code by a factor of two, and hoped to
do the same for us.
They failed. We already had that factor of two, which was "ordinary
compiler optimisation." That competitor had some rather odd coding
standards at the time, which meant most compilers failed if asked to
optimise their code. Someone from Intel had stayed at their site for most
of a year, reporting the bugs and getting them fixed until Intel's
compiler could optimise the code.
While visiting us, Intel asked what may have been a significant question
about the mixture of floating-point arithmetic instructions we used. We
didn't have precise figures, but were sure that we used at least as many
square roots as divides. IA-64 does square roots like divides, with a
starter approximation and Newton-Raphson sequences. Slowly, because the
N-R instructions all depend on the previous instruction, and can't be run
in parallel.
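A rough sketch of the square-root case, again in C. The seed here comes from the well-known integer bit trick for 1/sqrt(x) rather than from a hardware instruction, so it is only good to a few bits and this sketch uses 4 steps. The function names, the magic constant, and the step count are assumptions of this example, not how IA-64 does it. Note how every step reads the y produced by the step before it, which is the serial dependency described above.

```c
#include <stdint.h>
#include <string.h>

/* Crude seed for 1/sqrt(x) via the integer bit trick (a few percent
   error). Stand-in for a hardware approximation instruction. */
static double rsqrt_seed(double x)
{
    uint64_t bits;
    double y;
    memcpy(&bits, &x, sizeof bits);
    bits = UINT64_C(0x5FE6EB50C7B537A9) - (bits >> 1);
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* Newton-Raphson on y ~ 1/sqrt(x): y' = y * (1.5 - 0.5*x*y*y).
   Each iteration depends on the previous y, so the steps form a
   serial chain and cannot overlap with each other.
   Assumes x is positive, finite, and normal. */
static double sqrt_nr(double x, int steps)
{
    double y = rsqrt_seed(x);
    for (int i = 0; i < steps; i++)
        y = y * (1.5 - 0.5 * x * y * y);
    return x * y;   /* sqrt(x) = x * (1/sqrt(x)) */
}
```

Calling sqrt_nr(2.0, 4) should land within a few ulp of sqrt(2). The only parallelism available is across independent square roots, not inside one.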

FWIW:
In my case it is similar (if not using the FDIV instruction), in that there are approximations for divide / reciprocal / square root.
Meanwhile, I saw a video recently where someone had ported Doom to a 233 MHz PowerPC machine (running Windows NT4), and its performance was not good...
What is not obvious is which combination of factors conspired to make Doom apparently run at single-digit framerates.
The video mentioned that it was drawing using GDI calls, but this by itself wouldn't explain the level of slowness seen in the video.
Like, presumably, this would require around 90%+ of the clock cycles going into overhead, which seems a bit much.
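To put a rough number on the cycle budget, here is a back-of-envelope helper in C. The 5 fps figure and the 320x200 resolution (Doom's native size) are assumptions picked for illustration; the video's actual framerate and output resolution may differ.

```c
/* Average clock cycles available per drawn pixel at a given clock
   rate, framerate, and resolution. Pure arithmetic, no measurement. */
static double cycles_per_pixel(double clock_hz, double fps,
                               int width, int height)
{
    double cycles_per_frame = clock_hz / fps;
    return cycles_per_frame / ((double)width * (double)height);
}
```

With cycles_per_pixel(233e6, 5.0, 320, 200), this works out to roughly 728 cycles per pixel, which gives a sense of how much would have to be lost to overhead for the game to run that slowly.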
Reference:
https://www.youtube.com/watch?v=LAkSJ-HqKw8

John
