Subject : Re: "Mini" tags to reduce the number of op codes
From : terje.mathisen (at) *nospam* tmsw.no (Terje Mathisen)
Newsgroups : comp.arch
Date : 11. Apr 2024, 11:22:47
Organization : A noiseless patient Spider
Message-ID : <uv8dlo$1krvp$1@dont-email.me>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13
User-Agent : Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0 SeaMonkey/2.53.18.2
Scott Lurndal wrote:
mitchalsup@aol.com (MitchAlsup1) writes:
BGB wrote:
>
On 4/9/2024 7:28 PM, MitchAlsup1 wrote:
BGB-Alt wrote:
>
>
Also the blob of constants needed to be within 512 bytes of the load
instruction, which was also kind of an evil mess for branch handling
(and extra bad if one needed to spill the constants in the middle of a
basic block and then branch over it).
>
In My 66000 case, the constant is the word following the instruction.
Easy to find, easy to access, no register pollution, no DCache pollution.
It does occupy some icache space, however; have you boosted the icache
size to compensate?
Except that it pretty rarely does so (increase icache pressure):
mov temp_reg, offset const_table
mov reg,qword ptr [temp_reg+const_offset]
looks to me like at least 5 bytes for the first instruction and probably 6 for the second, for a total of 11 (could be as low as 8 for a very small offset), all on top of the 8 bytes of dcache needed to hold the 64-bit value loaded.
In My 66000 this should be a single 32-bit instruction followed by the 8-byte const, so 12 bytes total and no lookaside dcache interference.
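To make the tally explicit, here is a rough sketch of the arithmetic above. The byte counts are just the estimates quoted in this post (5 and 6 instruction bytes for the two x86 moves, 4 for the My 66000 instruction), not measured encodings, and the `footprint` helper is a hypothetical name used only for illustration.

```python
# Rough byte tally for loading one 64-bit constant.
# All sizes are the estimates from the post, not assembled encodings.

def footprint(instr_bytes, inline_const_bytes=0, dcache_bytes=0):
    """Return bytes touched in each cache for one constant load."""
    icache = sum(instr_bytes) + inline_const_bytes
    return {"icache": icache, "dcache": dcache_bytes,
            "total": icache + dcache_bytes}

# x86: mov temp_reg, offset const_table (about 5 bytes)
#      mov reg, qword ptr [temp_reg+const_offset] (about 6 bytes)
#      plus the 8-byte value fetched through the dcache.
x86_table = footprint([5, 6], dcache_bytes=8)

# My 66000: one 32-bit instruction with the 8-byte constant following it
# in the instruction stream, so nothing is fetched through the dcache.
my66000 = footprint([4], inline_const_bytes=8)

print("x86 const table:", x86_table)
print("My 66000 inline:", my66000)
```

With these assumed sizes the table-based sequence touches 19 bytes across two caches, against 12 bytes in the icache alone for the inline constant.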
It is only when you do a lot of 64-bit data loads, all gathered in a single 256-byte buffer holding up to 32 such values, and you can afford to allocate a fixed register pointing to the middle of that range, that you actually gain some total space: Each load can now just do a
mov reg,qword ptr [fixed_base_reg+byte_offset]
which, due to the need for a 64-bit prefix, will probably need 4 instruction bytes on top of the 8 bytes from dcache. At this point we are touching exactly the same number of bytes (12) as My 66000, but from two different caches, so much more likely to suffer dcache misses.
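A small sanity check of the fixed-base case, as a sketch only: it assumes the one-byte offset is a signed 8-bit displacement (-128..127), that the base register points at the middle of the 256-byte pool, and that each load is 4 instruction bytes as estimated above. The names are made up for the example.

```python
# Check that 32 eight-byte constants are all reachable with a signed
# 8-bit displacement when the base register points at the pool's middle.

SLOT_BYTES = 8                      # one 64-bit constant
POOL_BYTES = 256                    # pool size from the post
DISP8_MIN, DISP8_MAX = -128, 127    # assumed signed 8-bit displacement range

def slot_displacements(pool_bytes=POOL_BYTES, slot_bytes=SLOT_BYTES):
    """Displacement of each slot relative to a base at the pool midpoint."""
    base = pool_bytes // 2
    return [offset - base for offset in range(0, pool_bytes, slot_bytes)]

disps = slot_displacements()
reachable = all(DISP8_MIN <= d <= DISP8_MAX for d in disps)

# Bytes touched per load: about 4 instruction bytes plus 8 data bytes.
per_load_bytes = 4 + SLOT_BYTES

print(len(disps), "slots, displacements", disps[0], "to", disps[-1])
print("all reachable with disp8:", reachable)
print("bytes touched per load:", per_load_bytes)
```

The pool tops out at 32 slots because the first slot sits at displacement -128 and the last one at +120; a base pointing at the start of the pool would reach only half of them with a one-byte offset.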
Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"