On 2025-03-28, Chris M. Thomasson <chris.m.thomasson.1@gmail.com> wrote:
> I also remember reading in some SPARC manual about using NOP's in branch
> delay slots. Programming the SPARC was really "fun" when it was in RMO
> mode... :^)
I fixed a bug in GCC almost 20 years ago. It was incorrectly filling
a branch delay slot on MIPS, due to a mistake in the callee-saved
register determination:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34456
What ended up happening was that we had shared library functions being
called using the wrong global-offset-table pointer (that of the
caller rather than their correct one) resulting in a function call
going into an entirely wrong function due to register $28 (GOT)
being wrongly set.
When you put an instruction in the branch delay slot of a conditional
branch, you have to be sure that it's okay to execute that instruction
in both the taken and not-taken path.
If the instruction is taken from the earlier stream, this is
always OK. I.e. if we change this:
    insn0
    bz elsewhere
    nop
    insn1
to this:
    bz elsewhere
    insn0
    insn1
where we of course have to be sure that we can reorder insn0 past the
bz instruction.
Usually the branch delay slot is filled from the target of the branch:
    insn0
    bz elsewhere
    insnX        ;; first instruction of elsewhere block moved here
    insn1
If this is a regular branch delay slot, insnX will be executed in
the not-taken path.
We need careful analysis to determine that executing the instruction is
okay in the not-taken path, even though it is borrowed from a completely
unrelated elsewhere block.
It is okay, for instance, if it clobbers a register that is dead
in the not-taken path.
    insn0
    bz elsewhere
    move R3, R5          ; OK to move this here: R3 dead
    move R9, (R13 + 8)   ; R3 dead here
    move R3, R7          ; R3 dead because of this.
The move R3, R5 instruction doesn't matter because R3 is dead in the
not-taken path. It is dead because the next time it is referenced, it is
overwritten (it is the destination of a move); its old value is never used.
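To make the "dead register" test concrete, here is a toy sketch in Python. It is only an illustration of the idea, not anything taken from GCC: the (defs, uses) representation and the name is_dead are made up for this example. It scans the not-taken path and reports whether a register is overwritten before it is read.

```python
# Toy liveness scan. Each instruction is modelled as (defs, uses),
# two sets of register names. This representation is hypothetical.
def is_dead(reg, insns):
    """Return True if `reg` is overwritten before it is read in `insns`."""
    for defs, uses in insns:
        if reg in uses:
            return False   # read first: the current value is live
        if reg in defs:
            return True    # overwritten first: the current value is dead
    return False           # ran off the end: conservatively assume live

# The not-taken path from the example above.
not_taken = [
    ({"R9"}, {"R13"}),   # move R9, (R13 + 8)
    ({"R3"}, {"R7"}),    # move R3, R7
]

print(is_dead("R3", not_taken))    # True: safe to clobber R3 in the slot
print(is_dead("R13", not_taken))   # False: R13 is read before any write
```

A real compiler pass works on basic blocks and has to account for paths beyond the end of the block; the conservative "assume live" at the end of the scan stands in for all of that here.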
What was happening in the bug is that an instruction to set up the
global offset table pointer, register R28 (or $28), was being moved
into the branch delay slot.
    bz elsewhere         ;; elsewhere makes a call to a shared library
    move R28, whatever   ;; wrongly moved: set up GP for that library
    call something       ;; R28 wrongly believed dead
GCC thought it was okay to put the R28 instruction into the delay
slot because R28 is a register that is clobbered by the function
call in the not-taken path.
Registers that are not callee-saved are considered dead before
a function call; it is assumed the callee clobbers them and so
their current values are not required.
GCC mistakenly considered register $28, the global pointer,
to be callee-clobbered. But in fact it is not, and the
function invoked relied on it having the prior value.
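The effect of that misclassification can be sketched with a toy model in Python. This is an assumption-laden illustration, not GCC's actual code: the function name dead_before_call and the register sets are hypothetical. The rule modelled is simply "a register is dead just before a call if the call is assumed to clobber it and does not read it".

```python
# Toy model of the "dead before a call" rule. Not GCC's real data
# structures; names and register sets are made up for illustration.
def dead_before_call(reg, call_clobbered, call_reads=frozenset()):
    """True if `reg` is assumed clobbered by the call and not read by it."""
    return reg in call_clobbered and reg not in call_reads

buggy_clobbered = {"R2", "R3", "R28"}   # R28 wrongly listed as clobbered
fixed_clobbered = {"R2", "R3"}          # R28 must survive up to the call

# With the buggy set, R28 looks dead, so the delay slot fill is allowed
# and the not-taken path calls with the wrong GOT pointer.
print(dead_before_call("R28", buggy_clobbered))   # True

# With the corrected set, R28 is live and the fill is rejected.
print(dead_before_call("R28", fixed_clobbered))   # False
```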
(There is a feature called annulled delay slots. That's why the bug
report refers to non-annulled. An annulled delay slot would not have
this problem. An annulled delay slot is flagged such that the
branch-not-taken straight path will skip the instruction.
You can blindly move an instruction from the branch target into an
annulled delay slot, without worrying whether it clobbers a live
register.)
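The difference between the two kinds of slot can be shown with a tiny trace simulation in Python. It follows only the description given here (the slot instruction is skipped on the not-taken path when annulled); the function name run and the trace labels are made up, and real architectures differ in the details of their annulment rules.

```python
# Toy trace of "bz elsewhere" with insnX sitting in the delay slot.
# Models only the behaviour described above; names are hypothetical.
def run(taken, annulled):
    """Return the list of instructions executed, in order."""
    trace = ["bz"]
    if taken or not annulled:
        trace.append("insnX")   # slot runs unless annulled on the not-taken path
    trace.append("rest of elsewhere" if taken else "insn1")
    return trace

print(run(taken=False, annulled=False))  # ['bz', 'insnX', 'insn1']
print(run(taken=False, annulled=True))   # ['bz', 'insn1']
print(run(taken=True, annulled=True))    # ['bz', 'insnX', 'rest of elsewhere']
```

Only the first case needs the liveness analysis: that is the one where insnX runs on a path it was never meant for.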
It was a tough debug. Analyzing the crash, determining that it came from
a bad $28 pointer, and noticing that $28 is set in a branch delay slot
was the easy part. After that, I had to ramp up on the
spaghetti-like, labyrinthine gcc internals (full of ugly macros not
resolvable in gdb) far enough that I was able to actually single step
through a basic block optimization in GDB, somehow printing out the
INSNS as they are being visited, and thereby I saw where it was
getting to the branch at the end of the basic block and incorrectly
handling the branch delay slot.
I'm not sure I'd have the energy to dive into someone's open source
ball of mud that deeply today, just for the sake of a "drive by"
bugfix.
I was driven because I was in a startup surrounded by people who
believed I could do anything, and I probably subconsciously wanted to
live up to that.
-- 
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca