Subject: Re: Interrupts in OoO
From: anton (at) *nospam* mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Date: 13 Oct 2024, 16:20:37
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2024Oct13.172037@mips.complang.tuwien.ac.at>
References: 1 2 3 4 5 6 7 8
User-Agent: xrn 10.11
EricP <ThatWouldBeTelling@thevillage.com> writes:
>Anton Ertl wrote:
>>EricP <ThatWouldBeTelling@thevillage.com> writes:
>>>That's difficult with a circular buffer for the instruction queue/rob
>>>as you can't edit the order.
>>What's wrong with performing an asynchronous interrupt at the ROB
>>level rather than inserting it at the decoder?  Just stop committing at
>>some point, record this at the interrupt return address and start
>>decoding the interrupt code.
>
>That's worse than a pipeline drain because you toss things you already
>invested in, by fetch, decode, rename, schedule, and possibly execute.
The question is what you want to optimize.
Design simplicity? I think my approach wins here, too.
Interrupt response latency? Use what I propose.
Maximum throughput? Then follow your approach.
The throughput issue is only relevant if you have lots of interrupts.
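To put a (completely made-up) number on that, here is a back-of-the-envelope sketch; the interrupt rate, discarded work per flush, IPC, and clock are all assumptions for illustration, not measurements:

#include <stdio.h>

int main(void)
{
    double clock_hz      = 4e9;     /* assumed 4 GHz core */
    double irqs_per_s    = 1e4;     /* assumed interrupt rate */
    double flushed_insts = 200.0;   /* assumed in-flight work discarded per flush */
    double ipc           = 4.0;     /* assumed sustained IPC */

    /* work thrown away per second, expressed as cycles that must be redone */
    double cycles_lost = irqs_per_s * flushed_insts / ipc;

    printf("throughput lost to flushing: %.4f%%\n",
           100.0 * cycles_lost / clock_hz);
    return 0;
}

With those assumptions the flushes cost about 0.0125% of the machine's cycles; you need orders of magnitude more interrupts before throughput becomes the deciding factor.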
>The way I saw it, the core continues to execute its current stream while
>it prefetches the handler prologue into I$L1, then loads its fetch buffer.
>At that point fetch injects a special INT_START uOp into the instruction
>stream and switches to the handler.  The INT_START uOp travels down the
>pipeline following right behind the tail of the original stream.
>If none of the flow-disrupting events occur to the original stream then
>the handler just tucks in behind it.  When INT_START hits retire, the core
>sends the commit signal to the interrupt controller to confirm the hand-off.
>
>The interrupt handler should start executing at the same time as it would
>otherwise.
Architecturally, an instruction is only executed when it
commits/retires. Only then do I/O devices or other CPUs see any
stores or I/O operations performed in the interrupt handler. With
your approach, if there are long-latency instructions in the pipeline
(say, dependence chains containing multiple cache misses) when the
interrupt strikes, the instructions in your interrupt handler will
have to wait until the preceding instructions retire, which can take
thousands of cycles in the worst case.
By contrast, if you treat an interrupt like a branch misprediction and
cancel all the speculative work, the instructions of the interrupt
handler go through the engine as fast as possible, and you get the
minimum response latency possible in the engine.
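For a feel of the difference, here is a toy retire-time model; every parameter (in-flight window, miss latency, chain length, front-end depth, retire width) is an assumption chosen for illustration, not a description of any real core:

/* Toy model: when does the first interrupt-handler instruction retire
   (i.e., become architecturally visible), tuck-in-behind vs. flush? */
#include <stdio.h>

#define N_INFLIGHT    64   /* assumed instructions in flight at the interrupt */
#define MISS_LATENCY 300   /* assumed cycles per cache miss */
#define CHAIN_MISSES   8   /* assumed dependent misses in the oldest chain */
#define FRONTEND      10   /* assumed fetch-to-rename depth for the handler */
#define RETIRE_WIDTH   4   /* assumed retires per cycle */

int main(void)
{
    /* tuck-in-behind: the handler can only retire after every older
       instruction has retired; the serial miss chain dominates */
    long drain   = (long)CHAIN_MISSES * MISS_LATENCY
                 + (N_INFLIGHT + RETIRE_WIDTH - 1) / RETIRE_WIDTH;
    long tuck_in = drain + 1;

    /* flush like a mispredict: discard the in-flight work, redirect fetch,
       and the handler retires as soon as the front end refills */
    long flush = FRONTEND + 1;

    printf("tuck-in-behind: first handler retire after ~%ld cycles\n", tuck_in);
    printf("flush/redirect: first handler retire after ~%ld cycles\n", flush);
    return 0;
}

With these assumptions the tucked-in handler becomes visible after roughly 2400 cycles and the flushed one after about a dozen; the exact figures do not matter, only that the former is bounded by the slowest in-flight dependence chain and the latter by the front end of the pipeline.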
- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined
behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>