Re: Decrement And Branch

Subject: Re: Decrement And Branch
From: anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Date: 15 Aug 2024, 11:39:28
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2024Aug15.123928@mips.complang.tuwien.ac.at>
References: 1 2 3 4
User-Agent: xrn 10.11

mitchalsup@aol.com (MitchAlsup1) writes:
> On Wed, 14 Aug 2024 9:10:01 +0000, Anton Ertl wrote:
>
>> Lawrence D'Oliveiro <ldo@nz.invalid> writes:
>>> Like I said, I wondered why this sort of thing wasn't more common ...
>>
>> For the early RISCs, the pipeline was designed for early branch
>> execution.  Performing an ALU op before the branch did not fit that
>> kind of pipeline.
>
> MIPS would disagree.

In nearly all of the MIPS history, there is no ALU op before the
branch, only a comparison of two registers for equality.  They revised
the branches significantly in 2014, but that's not early MIPS, and by
that time branch predictors were so good that resolving the branch one
cycle later was not a big issue.

> MIPS pipeline performed Branch Target Calculation by pasting bits
> from the instruction onto bits vacated from IP.

Conditional branches in MIPS are relative.  Only J and JAL have this
misfeature.
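
To make the difference concrete, here is a minimal C sketch of the two
target computations as defined for classic 32-bit MIPS. The function
names are made up for illustration; `pc` is assumed to be the address
of the jump or branch instruction itself, and `insn` its 32-bit
encoding.

```c
#include <stdint.h>

/* J/JAL: the 26-bit instr_index field is shifted left by 2 and pasted
   under the top 4 bits of the address of the delay-slot instruction. */
uint32_t mips_jump_target(uint32_t pc, uint32_t insn)
{
    uint32_t index = insn & 0x03FFFFFFu;
    return ((pc + 4) & 0xF0000000u) | (index << 2);
}

/* Conditional branches: the 16-bit offset is sign-extended, shifted
   left by 2, and added to the address of the delay-slot instruction. */
uint32_t mips_branch_target(uint32_t pc, uint32_t insn)
{
    /* Portable sign extension of the low 16 bits. */
    int32_t offset = (int32_t)((insn & 0xFFFFu) ^ 0x8000u) - 0x8000;
    return pc + 4 + ((uint32_t)offset << 2);
}
```

The jump needs no adder at all, which is the "pasting"; the branch
needs a full addition, so it is position-independent but costs an
adder in the pipeline.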

>> For over a decade, Intel decoders have decoded many sequences of ALU
>> and branch instructions into one uop, so they can do at a
>> microarchitectural level what you are asking about at the architecture
>> level.  Other microarchitectures have followed this pattern, and
>> RISC-V seems to make a philosophy out of this.
>
> On the Intel side they mostly depend on prediction.

Every high-performance CPU depends on prediction.  Your point is what?

> On the RISC-V side they mostly depend on fusion. As far as I understand,
> they only fuse pairs, not ADD-CMP-BCs.

RISC-V has compare-and-branch instructions; I don't know if any
implementations fuse that with a preceding addition/subtraction, but
if so, it's a fusion of a pair of instructions.

As for only fusing pairs: in a section called "Fusion Pair
Candidates", Celio et al. <https://arxiv.org/pdf/1607.02318> give, as
one of the patterns, the sequence

slli rd, rs1, {1,2,3}
add rd, rd, rs2
ld rd, 0(rd)

However, as they point out, this may be the result of first pairing
the first two instructions and then pairing the result with the third
instruction.
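
As a rough illustration of what that three-instruction sequence
computes, here is a C sketch that models memory as a byte array. The
helper `load64` and both function names are hypothetical, and the
"fused" version only shows the combined effect; it says nothing about
how any real decoder would implement it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 64-bit load from a byte-array "memory". */
static uint64_t load64(const unsigned char *mem, uint64_t addr)
{
    uint64_t v;
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

/* The three instructions executed one at a time. */
uint64_t indexed_load_steps(const unsigned char *mem,
                            uint64_t rs1, uint64_t rs2, unsigned shamt)
{
    uint64_t rd = rs1 << shamt;   /* slli rd, rs1, shamt */
    rd = rd + rs2;                /* add  rd, rd, rs2    */
    return load64(mem, rd);       /* ld   rd, 0(rd)      */
}

/* The combined effect: one scaled-index load. */
uint64_t indexed_load_fused(const unsigned char *mem,
                            uint64_t rs1, uint64_t rs2, unsigned shamt)
{
    return load64(mem, rs2 + (rs1 << shamt));
}
```

Because every instruction in the sequence overwrites the same rd, the
intermediate values are never visible afterwards, so both functions
leave the same architectural state.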

The paper does not describe any implementation that actually performs
such instruction fusions, so any real implementation may perform the
fusions shown there, or more or fewer fusion patterns.

>> ARM A64 OTOH seems to put everything into an instruction that fits in
>> 32 bits, and while they have instructions (TBNZ and TBZ) that test a
>> specific bit in a register and branch if the bit is set or clear, they
>> have not added a subtract-and-branch or branch-and-subtract
>> instruction.  Apparently the uses for such an instruction are not that
>> frequent.
>
> My 66000 finds use cases all the time, and I also have Branch on bit
> instructions and have my CMP instructions build bit-vectors of outcomes.

If an architecture has the 88000-style treatment of comparison results
(fill a GPR with conditions, one bit per condition), instructions like
TBNZ and TBZ certainly are useful, but ARM A64 uses a condition code
register with NZCV flags for dealing with conditions, so what are TBNZ
and TBZ used for on this architecture?  Looking at a binary I have at
hand, I see a lot of checks of bit #63 and some of bits #31, #15, and
#7, i.e., checks for whether a 64-bit, 32-bit, 16-bit, or 8-bit number
is negative.  There are also a number of uses coming from libgcc, e.g.,

   6f0a8:       37e001c3        tbnz    w3, #28, 6f0e0 <__aarch64_sync_cache_range+0x50>
   6f0e8:       37e801e2        tbnz    w2, #29, 6f124 <__aarch64_sync_cache_range+0x94>
   6f6dc:       b7980b84        tbnz    x4, #51, 6f84c <__addtf3+0x71c>
   6fb28:       b79000a3        tbnz    x3, #50, 6fb3c <__addtf3+0xa0c>
   6fc30:       b79000a3        tbnz    x3, #50, 6fc44 <__addtf3+0xb14>
   70248:       b7980d02        tbnz    x2, #51, 703e8 <__multf3+0x728>
   7036c:       b79809a2        tbnz    x2, #51, 704a0 <__multf3+0x7e0>
   70430:       b77801a2        tbnz    x2, #47, 70464 <__multf3+0x7a4>
   7048c:       b79ffae2        tbnz    x2, #51, 703e8 <__multf3+0x728>
   70498:       b79ffa82        tbnz    x2, #51, 703e8 <__multf3+0x728>

The tf3 routines (__addtf3, __multf3) probably are the implementation
of long doubles.  In any case, in this binary with 26473 instructions,
there are 30 occurrences of tbnz and 41 of tbz, for a total of 71
(0.3% of the static instruction count).
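
The sign checks mentioned above (bit #63, #31, #15, #7) correspond to
source-level tests such as `x < 0`. A minimal C sketch of that
equivalence follows; the function names are made up, and which
instruction a given compiler actually emits for such a test is an
assumption that depends on compiler and flags.

```c
#include <stdint.h>

/* What TBNZ reg, #bit tests: is the given bit set? */
int bit_is_set(uint64_t reg, unsigned bit)
{
    return (int)((reg >> bit) & 1u);
}

/* In two's complement, "negative" means "top bit set". */
int is_negative64(int64_t x) { return bit_is_set((uint64_t)x, 63); }
int is_negative32(int32_t x) { return bit_is_set((uint32_t)x, 31); }
int is_negative16(int16_t x) { return bit_is_set((uint16_t)x, 15); }
int is_negative8(int8_t x)   { return bit_is_set((uint8_t)x, 7); }
```

A single test-bit-and-branch does the job here without touching the
NZCV flags, which may explain why these bit numbers dominate.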

Apparently the usefulness of decrement-and-branch is even lower.

Certainly in my code most loops count upwards.
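
For readers who want to see the two loop shapes side by side, here is
a small C sketch. The function names are made up for illustration,
and whether a compiler turns the second form into a single
decrement-and-branch instruction depends on the target and the
compiler; that part is an assumption, not something shown here.

```c
#include <stddef.h>

/* Counting up: the loop-closing branch compares i against n. */
long sum_up(const long *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Counting down: the loop closes with "decrement, branch if not
   zero", the pattern a decrement-and-branch instruction serves. */
long sum_down(const long *a, size_t n)
{
    long s = 0;
    size_t i = n;
    if (i == 0)
        return 0;
    do {
        s += a[i - 1];
    } while (--i != 0);
    return s;
}
```

Both return the same sum; the second visits the elements in reverse
order and needs the extra zero check up front.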

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
