Subject: Re: Stealing a Great Idea from the 6600
From: tkoenig (at) *nospam* netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Date: 20 Jun 2024, 17:41:16
Organisation : A noiseless patient Spider
Message-ID : <v51m3c$2lh9s$1@dont-email.me>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13 14
User-Agent : slrn/1.0.3 (Linux)
MitchAlsup1 <mitchalsup@aol.com> wrote:
> Now, if &b[1] happens to = &a[0], then your construction fails while VVM
> succeeds--it just runs slower because there IS a dependency checked by
> HW and enforced. In those situations where the dependency is nonexistent,
> then the loop vectorizes--and the programmer remains blissfully unaware.
The performance loss can be significant, unfortunately, depending
on the ratio of the width of the data in question to the width of
the SIMD unit which actually performs the operation. In the case of
8-bit data and 256-bit wide SIMD, this would be a factor of 32
(256/8 elements per operation), which could lead to a slowdown of
a factor of... 25, maybe?
This would be enough to trigger bug reports, I can tell you from
experience :-)
One technique that could get around that would be loop reversal,
with a branch to the correct loop at runtime (or a predicate
choosing the right values for the loop constants).
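A minimal sketch of what such a runtime-selected loop direction could
look like in C. The function name add_const_overlap and the operation
itself are hypothetical, chosen only for illustration. The sketch assumes
array-assignment semantics (every source element is read before it is
overwritten, as with memmove or a Fortran array assignment), so either
direction is permitted. For a loop written with sequential semantics and
a true loop-carried dependency, reversing it would change the result.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dst[i] = src[i] + c for i in [0, n), where dst
   and src may overlap.  Assumed semantics: each src element is read
   before it can be overwritten (like memmove), so the direction is
   free to choose. */
void add_const_overlap(uint8_t *dst, const uint8_t *src, uint8_t c, size_t n)
{
    /* Compare addresses as integers; relational comparison of pointers
       into different objects is undefined in C. */
    if ((uintptr_t)dst <= (uintptr_t)src) {
        /* dst at or below src: the forward loop never overwrites a
           source element that is still to be read. */
        for (size_t i = 0; i < n; i++)
            dst[i] = (uint8_t)(src[i] + c);
    } else {
        /* dst above src: run backwards for the same reason. */
        for (size_t i = n; i-- > 0; )
            dst[i] = (uint8_t)(src[i] + c);
    }
}
```

Each of the two loops is free of a dependency from a store to a later
load, so each can be vectorized on its own; the branch only picks which
one runs.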
An option to raise an exception when there is a slowdown due
to loops running the wrong direction could be helpful in this
context.