Re: Decrement And Branch

Subject: Re: Decrement And Branch
From: kegs (at) *nospam* provalid.com (Kent Dickey)
Newsgroups: comp.arch
Date: 09. Sep 2024, 04:31:00
Organization: provalid.com
Message-ID: <vblq5k$2991r$1@dont-email.me>
References: 1 2 3 4
User-Agent: trn 4.0-test76 (Apr 2, 2001)
In article <2024Aug15.123928@mips.complang.tuwien.ac.at>,
Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
mitchalsup@aol.com (MitchAlsup1) writes:
On Wed, 14 Aug 2024 9:10:01 +0000, Anton Ertl wrote:
>
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
Like I said, I wondered why this sort of thing wasn't more common ...
[snip]
My 66000 finds use cases all the time, and I also have Branch on bit
instructions and have my CMP instructions build bit-vectors of outcomes.
>
If an architecture has the 88000-style treatment of comparison results
(fill a GPR with conditions, one bit per condition), instructions like
TBNZ and TBZ certainly are useful, but ARM A64 uses a condition code
register with NZCV flags for dealing with conditions, so what is TBNZ
and TBZ used for on this architecture?  Looking at a binary I have at
hand, I see a lot of checking bit #63 and some checking of #31, #15,
#7, i.e., checking for whether a 64-bit, ... 8-bit number is negative.
There are also a number of uses coming from libgcc, e.g.,
>
  6f0a8:       37e001c3        tbnz    w3, #28, 6f0e0 <__aarch64_sync_cache_range+0x50>
  6f0e8:       37e801e2        tbnz    w2, #29, 6f124 <__aarch64_sync_cache_range+0x94>
  6f6dc:       b7980b84        tbnz    x4, #51, 6f84c <__addtf3+0x71c>
  6fb28:       b79000a3        tbnz    x3, #50, 6fb3c <__addtf3+0xa0c>
  6fc30:       b79000a3        tbnz    x3, #50, 6fc44 <__addtf3+0xb14>
  70248:       b7980d02        tbnz    x2, #51, 703e8 <__multf3+0x728>
  7036c:       b79809a2        tbnz    x2, #51, 704a0 <__multf3+0x7e0>
  70430:       b77801a2        tbnz    x2, #47, 70464 <__multf3+0x7a4>
  7048c:       b79ffae2        tbnz    x2, #51, 703e8 <__multf3+0x728>
  70498:       b79ffa82        tbnz    x2, #51, 703e8 <__multf3+0x728>
>
The tf3 stuff probably is the implementation of long doubles.  In any
case, in this binary with 26473 instructions, there are 30 occurrences
of tbnz and 41 of tbz, for a total of 71 (0.3% of static instruction
count).
>
Apparently the usefulness of decrement-and-branch is even lower.
>
Certainly in my code most loops count upwards.
>
- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
 Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

PA-RISC had "ADDIB,cond,n  imm,reg,target": add a 5-bit signed
immediate to reg, then branch on (effectively) comparing the result to
0, allowing branching on <, <=, =, >, >=, overflow, carry, etc.  There
was also a non-immediate version, ADDB.  The branch target range was
+/-8KB.
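
To make the ADDIB behaviour above concrete, here is a minimal C sketch
of one variant.  The function name addib_ge is hypothetical, only the
>= condition is modelled, and the overflow/carry conditions and the
delay slot are left out, so treat it as an illustration rather than a
description of the real hardware:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of "ADDIB,>= imm,reg,target": add a 5-bit signed
 * immediate to *reg, then report whether the branch would be taken,
 * i.e. whether the updated register is >= 0.
 * Overflow and carry conditions are not modelled. */
static bool addib_ge(int32_t imm5, int64_t *reg)
{
    assert(imm5 >= -16 && imm5 <= 15);  /* range of a 5-bit signed value */
    *reg += imm5;                       /* the add always happens */
    return *reg >= 0;                   /* the branch decision */
}
```

The point is that the register update and the branch decision come from
one instruction, so no separate compare is needed.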

Really simple loops could be done with the loop operation in the delay
slot of ADDIB.

The HP C/C++ Compiler pretty much converted all for() loops to count down
to 0, when it wasn't too awkward.  So:

for(i = 0; i < 100; i++) {
    array[i] = 0;
}

would be effectively transformed to:

ptr = &array[0];
for(i = 99; i >= 0; i--) {
    *ptr++ = 0;
}
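
As a self-contained version of the two loops above, the following sketch
wraps each form in a function so they can be compared.  The function
names, the int64_t element type and the length parameter n are
assumptions added for illustration; the original fragments use a fixed
count of 100:

```c
#include <stdint.h>

/* Original form: count up and index the array. */
static void clear_up(int64_t *array, int n)
{
    for (int i = 0; i < n; i++) {
        array[i] = 0;
    }
}

/* Transformed form: count down to zero and walk a pointer instead. */
static void clear_down(int64_t *array, int n)
{
    int64_t *ptr = &array[0];
    for (int i = n - 1; i >= 0; i--) {
        *ptr++ = 0;
    }
}
```

Both functions write the same n elements in the same order; only the
loop counter runs in the opposite direction, which is what lets the exit
test become a comparison against zero.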

Which becomes the following (PA-RISC lists the target register last and
has branch delay slots; with the ",n" nullify completer, a backward
branch nullifies the instruction in its delay slot if the branch is not
taken):

        MOV     array,r8
        LDI     100,r9            ; r9 = number of iterations
LOOP:   ADDIB,>=,n -1,r9,LOOP     ; r9=r9-1.  If r9 >= 0, jump to LOOP
        STD,ma  r0,8(r8)          ; (r8)=r0; r8=r8+8

So the compiler could use ADDIB for many "for" loops.  The way
nullification works, the loop behaves properly even if it should never
execute: if r9 starts at 0, no STD will be done.  There was no reason to
change the source code; the compiler would do the transform for you.
PA-RISC also had CMPIB, which just does the compare and branch.  ADDIB
is a very simple instruction which costs very little to add, and saves 2
instructions for many loops (ADDI,CMP_0,Bcc -> ADDIB).  I think it is a
mistake for ARM not to have it.  I see a lot of "ADD, CMP, Bcc" in ARM
assembly code.  To avoid inverting the counter, a hypothetical
"ADD1CMPBcc" instruction would add 1 to a counter, compare the counter
to another register, and branch on the condition.
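
The interaction between the decrement, the branch and the nullified
delay slot can be traced with a small C model.  This is only a sketch:
the function name clear_addib is hypothetical, and it assumes r9 is
loaded with the number of iterations, which is what makes a starting
value of 0 perform no store:

```c
#include <stdint.h>

/* Model of the two-instruction loop: ADDIB,>=,n first, with the store
 * in its delay slot.  Returns the number of stores performed. */
static int clear_addib(int64_t *ptr, int64_t count)
{
    int stores = 0;
    int64_t r9 = count;       /* LDI count,r9 */
    for (;;) {
        r9 += -1;             /* ADDIB: r9 = r9 - 1 */
        if (!(r9 >= 0)) {
            break;            /* branch not taken: delay slot nullified */
        }
        *ptr++ = 0;           /* STD,ma r0,8(r8), run in the delay slot */
        stores++;
    }
    return stores;
}
```

In this model a count of N yields exactly N stores, and a count of 0
falls straight through without touching memory.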

As for ARM TBNZ and TBZ, I see it used all the time in my code where I
often use single bit flags in control variables:

if(flags & FLAG_SPECIAL1) {	// FLAG_SPECIAL1 = 0x40
    // Do "SPECIAL1" stuff
}

In one program I've written on ARM, 2.3% of all instructions are TBZ or
TBNZ.
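
A compilable version of that flag test might look like the sketch
below.  The names handle_flags and special1_count are hypothetical
stand-ins for the "SPECIAL1" work.  Since 0x40 is bit 6, an AArch64
compiler may turn the test into a single TBZ or TBNZ on bit 6, although
the exact output depends on the compiler and its options:

```c
#include <stdint.h>

#define FLAG_SPECIAL1 0x40u     /* single-bit flag: bit 6 */

static int special1_count;      /* hypothetical stand-in for real work */

/* Branch on one bit of a control variable. */
static void handle_flags(uint32_t flags)
{
    if (flags & FLAG_SPECIAL1) {
        special1_count++;       /* do "SPECIAL1" stuff */
    }
}
```

Because the mask has exactly one bit set, no separate AND or compare
result is needed to decide the branch.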

Kent

