Subject : Re: Constant Stack Canaries
From : mitchalsup (at) *nospam* aol.com (MitchAlsup1)
Newsgroups : comp.arch
Date : 04. Apr 2025, 22:07:09
Other headers
Organisation : Rocksolid Light
Message-ID : <6a02e29617176252be4814869b64eeba@www.novabbs.org>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13
User-Agent : Rocksolid Light
On Wed, 2 Apr 2025 0:07:41 +0000, Robert Finch wrote:
On 2025-04-01 7:24 p.m., MitchAlsup1 wrote:
On Tue, 1 Apr 2025 22:06:10 +0000, Robert Finch wrote:
-------------------------
Why is it not 13 cycles to get started and then 1 cycle for each
register?
>
The CPU does not do pipe-lined burst loads. To load the cache line it is
two independent loads. 256-bits at a time. Stores post to the bus, but
I seem to remember having to space out the stores so the queue in the
memory controller did not overflow. Needs more work.
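The store-spacing problem can be pictured with a toy queue model. This is only a sketch: the function name, the one-entry-per-drain_gap retire rate, and the fixed issue spacing are all assumptions made for illustration, not a description of the actual memory controller.

```c
#include <assert.h>

/* Toy model (all parameters hypothetical): the core posts one store
   every issue_gap cycles, and the memory controller retires one queued
   store every drain_gap cycles. Returns the deepest the queue ever
   gets, which must stay at or below the real queue depth. */
unsigned peak_occupancy(unsigned n_stores, unsigned issue_gap,
                        unsigned drain_gap)
{
    assert(issue_gap > 0 && drain_gap > 0);

    unsigned queued = 0, peak = 0, issued = 0;

    for (unsigned cycle = 0; issued < n_stores || queued > 0; cycle++) {
        /* The controller retires an entry first, if it has one. */
        if (queued > 0 && cycle % drain_gap == 0)
            queued--;

        /* Then the core posts its next store, if it is due. */
        if (issued < n_stores && cycle % issue_gap == 0) {
            queued++;
            issued++;
            if (queued > peak)
                peak = queued;
        }
    }
    return peak;
}
```

In this model, 8 back-to-back stores against a controller that retires one every 2 cycles peak at 5 entries, while spacing the stores 2 cycles apart keeps the peak at 1.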
>
Stores should be faster, I think they are single cycle. But loads may be
quite slow if things are not in the cache. I should really measure it.
It may not be as bad as I think. It is still 300 LOC, about 100 loads and
stores each way. Lots of move instructions for regs that cannot be
directly loaded or stored. And with CRs serializing the processor. But
the processor should eat up all the moves fairly quickly.
By placing all the CRs together, and treating thread-state as a
write-back cache, all the storing and loading happens without any
serialization, in cache line quanta, where the LD can begin before the
STs begin, giving the overlap that reduces the cycle count.
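A back-of-the-envelope model of that overlap, in C. The struct, the function names, and every number fed to them are illustrative assumptions, not measurements of any real core. The model also assumes the outbound lines drain entirely off the critical path once the inbound lines have arrived.

```c
#include <assert.h>

/* Hypothetical parameters describing one thread-state swap. */
typedef struct {
    unsigned lines;        /* cache lines of thread-state to move     */
    unsigned mem_latency;  /* cycles before the first line arrives    */
    unsigned per_line;     /* cycles per line once transfers stream   */
} swap_params;

/* Serialized swap: push every outbound line, then fetch every
   inbound line, with nothing overlapped. */
unsigned serialized_cycles(swap_params p)
{
    assert(p.lines > 0);
    unsigned stores = p.lines * p.per_line;
    unsigned loads  = p.mem_latency + p.lines * p.per_line;
    return stores + loads;
}

/* Overlapped swap: the loads start first and the outbound lines are
   written back behind them, so only the loads sit on the critical
   path before the new thread can run. */
unsigned overlapped_cycles(swap_params p)
{
    assert(p.lines > 0);
    return p.mem_latency + p.lines * p.per_line;
}
```

With, say, 5 lines, 13 cycles to first data and 1 cycle per line, the serialized version costs 23 cycles and the overlapped version 18; the saving is exactly the store time that is now hidden.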
For example, once a core has decided to run "this-thread" all it has to
do is to execute a single HR instruction which writes a pointer to
thread-state. Then upon SVR, that thread begins running. Between HR and
SVR, HW can preload the inbound data, and push out the outbound data
after the inbound data has arrived.
But, also note: Due to the way CRs are mapped into MMI/O memory, one
core can write that same HR-accessible CR on another core and cause a
remote context switch of that other core.
The main use is more likely to be remote diagnostics of a core that
has quit responding to the system (crashed hard) so its CRs can be
read out and examined to see why it quit responding.
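To make the MMI/O point concrete, here is a small host-side sketch in C. The layout, the register name thread_state_ptr, the core count, and the helper names are all hypothetical; a plain array stands in for the memory-mapped window so the code runs on an ordinary machine. It only shows why address-mapped CRs make the remote write and the remote read-out the same kind of access as the local one.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CORES 4  /* hypothetical core count */

/* Hypothetical per-core control-register block. On real hardware this
   would live at a fixed MMI/O address per core; here it is an array. */
typedef struct {
    volatile uint64_t thread_state_ptr;  /* assumed name for the CR
                                            that HR writes           */
} core_crs;

static core_crs mmio_window[NUM_CORES];

/* Write the thread-state pointer CR of any core. Because the CR is
   just an address, the store is the same whether 'core' is the
   writer itself or some other core. */
void write_thread_state_cr(unsigned core, uint64_t thread_state)
{
    assert(core < NUM_CORES);
    mmio_window[core].thread_state_ptr = thread_state;
}

/* Diagnostic path: read another core's CR, for example after that
   core has stopped responding. */
uint64_t read_thread_state_cr(unsigned core)
{
    assert(core < NUM_CORES);
    return mmio_window[core].thread_state_ptr;
}
```

A healthy core could call read_thread_state_cr(2) to see what core 2 was last told to run, or write_thread_state_cr(2, ptr) to redirect it.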