Re: auto predicating branches

Subject: Re: auto predicating branches
From: mitchalsup (at) *nospam* aol.com (MitchAlsup1)
Newsgroups: comp.arch
Date: 21 Apr 2025, 18:29:15
Organization: Rocksolid Light
Message-ID: <d47cdad26528b4d2309ac9df60120315@www.novabbs.org>
User-Agent: Rocksolid Light

On Mon, 21 Apr 2025 6:05:32 +0000, Anton Ertl wrote:

> Robert Finch <robfi680@gmail.com> writes:
>> Having branches automatically convert into
>> predicates when they branch forward a short distance <7 instructions.
>
> If-conversion in hardware is a good idea, if done well, because it
> involves issues that tend to be unknown to compilers:

I had little trouble teaching Brian how to put if-conversion into the
compiler with my PRED instructions, relieving HW of having to bother
with anything other than being able to execute PREDicated clauses.

> * How predictable is the condition?  If the condition is very well
>   predictable, if-conversion is not a good idea, because it turns the
>   control dependency (which does not cost latency when the prediction
>   is correct) into a data dependency.  Moreover, in this case the
>   if-conversion increases the resource consumption.  Compilers are not
>   good at predicting the predictability AFAIK.

Rather than basing the choice on the predictability of the condition,
it is based on whether FETCH will pass the join-point before the
condition resolves. On an 8-wide machine this might be "THE next
cycle".
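
A minimal sketch of that decision rule, assuming a simple model in which
FETCH delivers a fixed number of instructions per cycle. The function
name and all three parameters are hypothetical and chosen only for
illustration; a real compiler would estimate the resolve latency from
its own scheduling model.

```python
import math

def fetch_passes_join_first(clause_len, fetch_width, cycles_until_resolve):
    """Return True when FETCH reaches the join-point before the condition resolves.

    clause_len: instructions between the branch and the join-point (assumed known)
    fetch_width: instructions fetched per cycle (e.g. 8 on an 8-wide machine)
    cycles_until_resolve: estimated cycles until the condition is known
    """
    # Cycles FETCH needs to run past the skipped-over instructions.
    cycles_to_join = math.ceil(clause_len / fetch_width)
    return cycles_to_join < cycles_until_resolve

# A 6-instruction clause on an 8-wide machine is passed in one cycle,
# so with a 3-cycle resolve latency predication is the candidate choice.
short_clause = fetch_passes_join_first(6, 8, 3)
# A 40-instruction clause takes 5 fetch cycles, longer than the resolve latency.
long_clause = fetch_passes_join_first(40, 8, 3)
```

Under this model the short clause qualifies for PRED and the long one
stays a branch.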

> * Is the condition available before or after the original data
>   dependencies?  And if afterwards, by how many cycles?  If it is
>   afterwards and the branch prediction would be correct, the
>   if-conversion means that the result of the instruction is available
>   later, which may reduce IPC.

Generally, it only adds latency; if the execution window is not stalled
at either end, this does not harm IPC.

>   OTOH, if the branch prediction would
>   be incorrect, the recovery also depends on when the condition
>   becomes available,

There is no "recovery" from PREDication, just one clause getting
nullified.
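
A toy model of that behaviour, purely illustrative: both clauses flow
through the pipeline and the one whose polarity does not match the
condition is nullified, with no flush or refetch. The function
`run_predicated` and the register-dictionary representation are
hypothetical and are not meant to describe any actual implementation.

```python
def run_predicated(cond, then_clause, else_clause, regs):
    """Issue both clauses; commit results only for the clause matching cond.

    Each clause is a list of (dest_register, operation) pairs, where
    operation takes the register dictionary and returns a value.
    Returns (executed, nullified) instruction counts.
    """
    executed = nullified = 0
    for clause_polarity, clause in ((True, then_clause), (False, else_clause)):
        for dest, op in clause:
            if clause_polarity == cond:
                regs[dest] = op(regs)   # instruction takes effect
                executed += 1
            else:
                nullified += 1          # instruction is dropped, nothing to undo
    return executed, nullified

regs = {"a": 3, "b": 5, "r": 0}
then_clause = [("r", lambda r: r["a"] + r["b"])]
else_clause = [("r", lambda r: r["a"] - r["b"])]
counts = run_predicated(True, then_clause, else_clause, regs)
```

Here the then-clause writes `r` and the else-clause is counted as
nullified; no state has to be rolled back in either case.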

>   and the total latency is higher in the case of no
>   if-conversion.  The compiler may do an ok job at predicting whether
>   a condition is available before or after the original data
>   dependencies (I don't know a paper that evaluates that), but without
>   knowing about the prediction accuracy of a specific condition that
>   does not help much.
>
> So the hardware should take predictability of a condition and the
> availability of the condition into consideration for if-conversion.

My argument is that this is a SW decision (in the compiler) not a
HW decision (other than providing the PREDs). Since PREDs are not
predicted (unless you think they are predicted BOTH ways) they do
not diminish the performance of the branch predictors.

> What about reverse if-conversion in hardware, i.e., converting
> predicated instructions and the like (conditional moves, if-then-else
> instructions and the instructions they control) into branch-predicted
> phantom branches and eliminating the data dependency on the condition
> from the instruction?

The compiler chooses PRED because FETCH reaches the join-point prior
to the branch resolving. PRED is almost always faster--and when
it has both then-clause and else-clause, it always saves a branch
instruction (jumping over the else-clause).
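
The saved branch can be seen by counting static instructions for an
if/else. This sketch assumes a single PRED instruction covers both the
then-clause and the else-clause; the function name and the counting
model are hypothetical and ignore the compare that produces the
condition.

```python
def static_instruction_counts(then_len, else_len):
    """Return (branch_form, pred_form) instruction counts for one if/else."""
    # Branch form: conditional branch, then-clause, and, only when an
    # else-clause exists, an unconditional jump over it, then the else-clause.
    jump_over_else = 1 if else_len else 0
    branch_form = 1 + then_len + jump_over_else + else_len
    # PRED form (assumed): one PRED instruction followed by both clauses.
    pred_form = 1 + then_len + else_len
    return branch_form, pred_form

with_else = static_instruction_counts(2, 3)
without_else = static_instruction_counts(2, 0)
```

With both clauses present the PRED form is one instruction shorter;
with no else-clause the two forms are the same length.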

> For performance, one might consider reverse if-conversion, because the
> same considerations apply; however, there is also a security aspect:
> programmers have used these instructions instead of branches to
> produce constant-time code to avoid timing side channels of code that
> deals with secrets; and the discovery of Spectre has shown additional
> timing side channels of branches.  Because you cannot be sure that the
> predicated instruction is there for security reasons, you must not use
> reverse if-conversion in hardware.
>
> - anton
