Re: What integer C type to use

Subject: Re: What integer C type to use
From: mitchalsup (at) *nospam* aol.com (MitchAlsup1)
Newsgroups: comp.arch
Date: 13 Mar 2024, 16:45:00
Organization: Rocksolid Light
Message-ID: <b95e1b5c526cf9a13a42048297c1b7ec@www.novabbs.org>
User-Agent: Rocksolid Light
Michael S wrote:

On Tue, 12 Mar 2024 19:00:46 +0000
mitchalsup@aol.com (MitchAlsup1) wrote:

Michael S wrote:
 
On Tue, 12 Mar 2024 17:18:36 +0000
mitchalsup@aol.com (MitchAlsup1) wrote: 
 
While theoretically possible, they did not do this because both
halves of a 2×SP would not necessarily arrive from memory
simultaneously. {Consider a gather load: you need a vector of
addresses 2× as long for pairs of SP going into a single vector
register element.}
 
Doctor, it hurts when I do this!
So, what prevents you from providing no gather with resolution
below 64 bits? 
Well, then, you have SP values in a container that could hold 2 and
you don't get any SIMD speedup.

(1) - a need for full gather is hopefully rare. The majority of the time
things are accessed contiguously or, at worst, with non-unit strides.
My personal rule of thumb is "if I need generic gather, most likely I
shouldn't have bothered with vectorizing." Of course, like every
rule of thumb, it's imprecise.
CRAY-1 XMP gained considerable speedup (about 20%) on its benchmarks
of the day after adding Scatter/Gather, and 5 more Livermore Loops
would vectorize.

(2) - there are several important applications that naturally have
pair-wise data layout. Complex numbers are just one example.
What about Quaternions--which relieve the programmer of having to
remember which multiplications are subtracted instead of added?
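
To make the pair-wise layout point concrete, here is a minimal sketch (not from the thread; all function names and data are hypothetical). It models an interleaved array of (re, im) SP values and compares a gather with 64-bit resolution, which takes one index per pair, against a gather with 32-bit resolution, which needs an index vector twice as long to fetch the same data.

```python
def gather32(mem, indices):
    """Gather with 32-bit resolution: one index per SP value."""
    return [mem[i] for i in indices]

def gather64(mem, pair_indices):
    """Gather with 64-bit resolution: one index fetches an adjacent SP pair."""
    out = []
    for p in pair_indices:
        out.extend(mem[2 * p : 2 * p + 2])
    return out

def expand_pair_indices(pair_indices):
    """Index vector a 32-bit gather needs to fetch the same pairs (2x as long)."""
    return [2 * p + k for p in pair_indices for k in (0, 1)]

# Hypothetical interleaved complex data: re0, im0, re1, im1, ...
mem = [1.0, 10.0, 2.0, 20.0, 3.0, 30.0, 4.0, 40.0]
pairs = [2, 0, 3]

wide = gather64(mem, pairs)
narrow = gather32(mem, expand_pair_indices(pairs))
print(wide == narrow, len(pairs), len(expand_pair_indices(pairs)))
```

Both gathers return the same values; the only difference in this toy model is the length of the address vector, which is the cost being argued about.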

Which, of course, leaves the question of what property makes a
vector processor Cray-style. Just having ALU/FPU several times
narrower than VR is, IMHO, not enough to be considered
Cray-style.
That property is that the length of the vector register is chosen
to absorb the latency to memory. SIMD is too short to have this
property.
 
I don't like this definition at all.
For starters, what is "memory"? Does L1D cache count, or only L2 and
higher?
Those machines had no L1 or L2 (or LLC) caches. Consider the problems
for which they were designed--arrays as big as the memory (sometimes
bigger !!) and processed over and over again with numerical
algorithms. Caches would simply miss on each memory reference
(ignoring TLB effects). With the caches never supplying data to the
calculations, why have them at all ??
 
Then, what is "absorb" ?
Absorb means that the first data of a vector arrives and can start
calculation before the last address of the memory reference goes out.
This, in turn, means that one can create a continuous stream of
outbound addresses forever and thus one can create a stream of
continuous calculations forever. {{Where 'forever' means thousands of
cycles but nowhere near the lifetime of the universe.}}
Now, obviously, this means the memory system has to be able to make
forward progress on all those memory accesses continuously.
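
A small sketch of the "absorb" condition as stated above (not from the post; the function names and the latency values are assumptions for illustration). It assumes one address issued per cycle and a fixed memory latency, then checks whether the first element returns before the last address of the same vector reference has gone out.

```python
def issue_and_arrival(vl, latency):
    """Cycle at which each address leaves and each element returns,
    assuming one address per cycle and a fixed latency (hypothetical model)."""
    issue = list(range(vl))
    arrive = [cycle + latency for cycle in issue]
    return issue, arrive

def absorbs_latency(vl, latency):
    """True if the first data arrives before the last address goes out."""
    issue, arrive = issue_and_arrival(vl, latency)
    return arrive[0] < issue[-1]

VL_CRAY = 64   # Cray-1 vector registers hold 64 elements
VL_SIMD = 4    # hypothetical short SIMD register, e.g. 4 lanes

for latency in (11, 50, 200):  # hypothetical latencies, in cycles
    print(latency, absorbs_latency(VL_CRAY, latency), absorbs_latency(VL_SIMD, latency))
```

Under these assumptions a 64-element register absorbs latencies up to a few tens of cycles, while a 4-element register absorbs almost none, which is the sense of "SIMD is too short" in the exchange above.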
 
Is the whole VR file part of the absorbent, or should latency be
covered by one register?

A single register covers a single memory reference latency.

Is OoO machinery part of the absorbent?
The only OoO in the CRAYs was delivery of gather data back to the
vector register*. Scatter stores were sent out in order, as were the
addresses of the gather loads.

(*) bank conflicts would delay conflicting accesses but not those
of other banks, creating an OoO effect of returning data. This was
re-ordered back to in-order (IO) prior to forwarding data into
calculation.
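
The bank-conflict footnote can be illustrated with a toy model (purely hypothetical: the bank count, busy time, and function names are assumptions, not CRAY parameters). Addresses are issued in order, one per cycle; an access to a busy bank waits, so data comes back out of order, and a running maximum over completion times models re-ordering before the data is forwarded to calculation.

```python
from itertools import accumulate

def completion_cycles(addresses, n_banks=4, bank_busy=4):
    """Cycle at which each element's data returns. Addresses go out in order,
    one per cycle; a busy bank delays only the accesses that hit it."""
    bank_free = [0] * n_banks
    done = []
    for cycle, addr in enumerate(addresses):
        bank = addr % n_banks
        start = max(cycle, bank_free[bank])
        bank_free[bank] = start + bank_busy
        done.append(start + bank_busy)
    return done

def return_order(done):
    """Element indices in the order their data arrives."""
    return sorted(range(len(done)), key=lambda i: (done[i], i))

def in_order_delivery(done):
    """Cycle at which each element can be forwarded if delivery must be in order."""
    return list(accumulate(done, max))

done = completion_cycles([0, 4, 1, 2])   # elements 0 and 1 hit the same bank
print(done, return_order(done), in_order_delivery(done))
```

In this example element 1 conflicts with element 0, so elements 2 and 3 return first; in-order delivery then holds them until element 1 has arrived.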
 
Is HW threading part of the absorbent?
 Absolutely not--none of the CRAYs did this--later XMPs and YMPs did
use lanes (SIMD with vector) but always did calculations in order
and always sent out addresses (and data when appropriate) in order.
 
And for any of your possible answers I have my "Why?".
No harm in asking.

It seems, we are talking about different things.
You are talking about Cray vectors, as done in Cray's 1/X-MP/Y-MP
series. I.e. something fixed, known and of interest mostly for
computing historians among us.
Fair enough, but it remains my model for how to discuss vector calculations.

OTOH, I am trying to discuss a vague notion of "Cray-style vectors". My
intentions are to see what was applicable in more recent times and
which ideas are not totally obsolete for the future.
Only after you figure out a way to feed 2 LDs and consume 1 ST per cycle
continuously (cache miss or cache hit; TLB miss or TLB hit) are you in a
position to utilize CRAY-like vector-register architecture effectively.
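
As a back-of-the-envelope sketch of what "2 LDs and 1 ST per cycle, continuously" implies (not from the post; the clock rate, element size, and latency below are assumed values), the required memory bandwidth and the number of references that must be in flight follow from simple arithmetic:

```python
def required_bandwidth_gbs(clock_ghz, loads=2, stores=1, bytes_per_elem=8):
    """Sustained bandwidth in GB/s for the given loads and stores per cycle."""
    return (loads + stores) * bytes_per_elem * clock_ghz

def references_in_flight(latency_cycles, loads=2, stores=1):
    """Little's law: outstanding references = rate per cycle * latency in cycles."""
    return (loads + stores) * latency_cycles

# Hypothetical machine: 2 GHz clock, 8-byte elements, 100-cycle memory latency.
print(required_bandwidth_gbs(2.0))   # 48.0
print(references_in_flight(100))     # 300
```

With these assumed numbers, a single pipeline already needs 48 GB/s sustained regardless of cache or TLB hits, with about 300 references outstanding at any time.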
