Subject : Re: What integer C type to use
From : monnier (at) *nospam* iro.umontreal.ca (Stefan Monnier)
Newsgroups : comp.arch
Date : 13 Mar 2024, 20:02:12
Organization : A noiseless patient Spider
Message-ID : <jwva5n2m1lr.fsf-monnier+comp.arch@gnu.org>
User-Agent : Gnus/5.13 (Gnus v5.13)
> OTH, I am trying to discuss a vague notion of "Cray-style vectors". My
> intentions are to see what was applicable in more recent times and
> which ideas are not totally obsolete for a future.

Another way to look at the difference between SSE-style vectors (which
I'd call "short vectors") and Cray-style vectors (which we could call
"long vectors") at the ISA level is the latency assumption each design
makes. SSE-style vector instructions are designed under the assumption
that the latency of a vector instruction will be more or less the same
as that of a non-vector instruction (i.e. you have enough ALUs to do all
the operations at the same time), whereas Cray-style vector instructions
are designed under the assumption that the latency will be somewhat
proportional to the length of the vector, because the core of the CPU
will only access a chunk of the vector at a time.
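
To put rough numbers on that difference, here is a toy cost model in C.
It is only a sketch: the function names are made up for illustration,
and it assumes a fully pipelined unit that starts one chunk of `lanes`
elements per cycle, which no particular machine is claimed to match.

```c
/*
 * Toy latency model (illustrative assumption, not measured data).
 * "Short" vectors: enough ALUs for every element, so the whole
 * instruction costs about as much as one scalar operation.
 */
unsigned short_vector_latency(unsigned op_latency)
{
    return op_latency;
}

/*
 * "Long" vectors: the core only touches `lanes` elements at a time,
 * so the instruction completes once the last chunk has gone through.
 * Assumes one new chunk can start every cycle (full pipelining).
 */
unsigned long_vector_latency(unsigned op_latency, unsigned vlen,
                             unsigned lanes)
{
    if (vlen == 0 || lanes == 0)
        return 0;                                  /* nothing to do */
    unsigned chunks = (vlen + lanes - 1) / lanes;  /* round up */
    return op_latency + (chunks - 1);
}
```

Under these assumptions, a 4-cycle operation on a 64-element vector with
4 lanes takes 4 + 15 = 19 cycles in the long-vector model, against 4
cycles for a short vector that fits the ALUs in one go.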
So, short vectors have a fairly free hand at shuffling data across their
vector (e.g. bitmatrix transpose), and they can be
implemented/scheduled/dispatched just like any other instruction, but
the vector length tends to be severely limited and exposed all over
the place.
In contrast, long vectors usually depend on specialized implementations
(e.g. chaining) to get good performance, but their length is
easier/cheaper to change.
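
The "exposed length" point can be sketched in plain C. This is not real
SIMD or vector code: the inner loops merely stand in for one vector
instruction each, and `SHORT_WIDTH`, `set_vl`, `add_short` and
`add_long` are hypothetical names chosen for the illustration.

```c
#include <stddef.h>

/* Short-vector style: the width is baked into the code. */
enum { SHORT_WIDTH = 4 };   /* e.g. four floats in a 128-bit register */

void add_short(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + SHORT_WIDTH <= n; i += SHORT_WIDTH)
        for (size_t l = 0; l < SHORT_WIDTH; l++)  /* one 4-wide add */
            dst[i + l] = a[i + l] + b[i + l];
    for (; i < n; i++)                            /* scalar tail */
        dst[i] = a[i] + b[i];
}

/* Long-vector style: ask the "hardware" how many elements it takes. */
static size_t set_vl(size_t remaining, size_t max_vl)
{
    return remaining < max_vl ? remaining : max_vl;
}

void add_long(float *dst, const float *a, const float *b, size_t n,
              size_t max_vl)
{
    if (max_vl == 0)
        max_vl = 1;                               /* avoid a stuck loop */
    for (size_t i = 0; i < n; ) {
        size_t vl = set_vl(n - i, max_vl);        /* strip-mining step */
        for (size_t l = 0; l < vl; l++)           /* one vl-wide add */
            dst[i + l] = a[i + l] + b[i + l];
        i += vl;
    }
}
```

Changing the width in `add_short` means changing the constant (and, with
real short-vector ISAs, typically the instructions and the tail
handling), whereas `add_long` gives the same result for any `max_vl`
without touching the loop.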
AFAICT long vectors made sense when we could build machines whose
memory bandwidth was higher (relative to ALU throughput) and when ALUs
were more expensive. Nowadays we tend to have the opposite.
Also, the massive number of transistors we spend nowadays on OoO means
that a good OoO CPU can dispatch individual non-vector instructions to
ALUs just as well as the Cray did with its vectors with chaining.
Stefan