Subject: Re: Misc: Ongoing status...
From: mitchalsup (at) *nospam* aol.com (MitchAlsup1)
Newsgroups: comp.arch
Date: 02. Feb 2025, 02:22:47
Organization: Rocksolid Light
Message-ID: <97ead3323da5a89d174910edebcbb815@www.novabbs.org>
References: 1 2 3 4 5 6 7
User-Agent: Rocksolid Light
On Sat, 1 Feb 2025 22:42:39 +0000, BGB wrote:
On 1/31/2025 10:05 PM, MitchAlsup1 wrote:
--------------------------------
> Whereas, if performance is dominated by a piece of code that looks like,
> say:
>   v0=dytf_int2fixnum(123);
>   v1=dytf_int2fixnum(456);
>   v2=dytf_mul(v0, v1);
>   v3=dytf_int2fixnum(789);
>   v4=dytf_add(v2, v3);
>   v5=dytf_wrapsymbol("x");
>   dytf_storeindex(obj, v5, v4);
>   ...
> With, say, N levels of call-graph in each called function, but with this
> sort of code still managing to dominate the total CPU ("Self%" time).
>
> This seems to be a situation where callee-save registers are a big win
> for performance IME.
With callee-save registers, the prologue and epilogue of each subroutine
see all the save/restore memory traffic, sometimes saving a register
that is not "in use" and restoring it later.
With caller-save registers, the caller saves exactly the registers
it needs preserved, while the callee saves/restores none. Moreover,
it saves only registers currently "in use" and may defer a restore
when it does not need that value in that register for a while.
So, instruction path length is shorter with caller saves than with
callee saves: nothing that was "not live" is ever saved or
restored.
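
To put rough numbers on the path-length point, here is a minimal sketch in C. It is a toy model, not a description of any real ABI: registers are bits in a mask, each save/restore pair counts as one unit, and the function names and masks are made up for illustration. It only counts dynamic save/restore pairs for a single call and says nothing about code size.

```c
#include <assert.h>

/* Count the set bits in a register mask (bit i = register i). */
static int popcount32(unsigned mask) {
    int n = 0;
    while (mask) { mask &= mask - 1u; n++; }
    return n;
}

/* Callee-save (hypothetical model): the callee saves every preserved
   register it writes. It cannot see whether the caller holds a live
   value there, so caller_live is deliberately ignored. */
static int callee_save_pairs(unsigned callee_writes, unsigned caller_live) {
    (void)caller_live;
    return popcount32(callee_writes);
}

/* Caller-save (hypothetical model): the caller saves only the
   registers that are live across the call. */
static int caller_save_pairs(unsigned callee_writes, unsigned caller_live) {
    (void)callee_writes;
    return popcount32(caller_live);
}

/* Callee-side save/restore pairs that protected no live value. */
static int wasted_callee_pairs(unsigned callee_writes, unsigned caller_live) {
    int wasted = popcount32(callee_writes & ~caller_live);
    assert(wasted <= callee_save_pairs(callee_writes, caller_live));
    return wasted;
}
```

For example, with a callee that writes four preserved registers (mask 0xF) and a caller that has two of them live (mask 0x3), the model gives 4 pairs under callee-save, 2 under caller-save, and 2 callee-side pairs that preserved nothing live.
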
The arguments for callee save have to do with I-cache footprint.
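
One common reading of the footprint argument is that callee-save code is emitted once per function, while caller-save code is repeated at every call site. A rough sketch of that static count in C follows. The assumptions are mine: one store plus one load per saved register, the same number of live registers at every call site, and a hypothetical 32-register file. The function names are illustrative only.

```c
#include <assert.h>

#define NUM_REGS 32u  /* hypothetical register-file size */

/* Static save/restore instructions for one callee under callee-save:
   the prologue/epilogue exist once, however many call sites there are. */
static unsigned callee_save_static(unsigned saved_regs, unsigned call_sites) {
    assert(saved_regs <= NUM_REGS);
    (void)call_sites;
    return 2u * saved_regs;            /* one store + one load per register */
}

/* Static save/restore instructions under caller-save: the sequence is
   repeated at each call site (assumes the same live count at each). */
static unsigned caller_save_static(unsigned live_regs, unsigned call_sites) {
    assert(live_regs <= NUM_REGS);
    return 2u * live_regs * call_sites;
}
```

Under these assumptions, a function saving 4 registers and reached from 10 call sites costs 8 instructions of footprint with callee-save, while 2 live registers saved at each of the 10 sites cost 40 with caller-save, even though the caller-save version executes fewer saves per call.
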