Subject : Re: fractional PCs
From : robfi680 (at) *nospam* gmail.com (Robert Finch)
Newsgroups : comp.arch
Date : 29 Apr 2025, 03:00:17
Organization : A noiseless patient Spider
Message-ID : <vupbrk$jpqb$1@dont-email.me>
User-Agent : Mozilla Thunderbird
On 2025-04-28 6:02 p.m., MitchAlsup1 wrote:
> On Mon, 28 Apr 2025 2:32:52 +0000, Robert Finch wrote:
>> On 2025-04-27 4:53 p.m., MitchAlsup1 wrote:
>>> On Sun, 27 Apr 2025 11:36:05 +0000, Robert Finch wrote:
>>>>
>>>> Representing the PC as a fixed-point number because it records which
>>>> micro-op of the micro-op stream for an instruction got interrupted. It
>>>> was easier to restart the micro-op stream than to defer interrupts to
>>>> the next instruction.
>>>
>>> Why not just backup to the instruction boundary ??
>>
>> I think I was worried about an instruction disabling interrupts or
>> causing an exception that should be processed before the interrupt
>> occurred. (keeping the interrupt precise). I did not want to need to
>> consider other exceptions that might have occurred before the interrupt.
>>
>> Searching for an instruction boundary in either direction is I think
>> more logic than just recording the micro-op number.
>
> You say your interrupt-PC is fixed point so it can point at the
> micro-Op that raised the exception (or was interrupted). It
> seems to me that simply wiping the fractional bits from the PC
> should put you at the instruction boundary. That is:: Round Down.

If the PC is rounded down, then interrupts may not be in precise order with respect to exceptions in the instructions. But I guess interrupts are asynchronous anyway, so it should not make much difference. Rounding down is appealing to me as the ROB entry can be calculated based on the current micro-op number. It cannot be as easily calculated if rounding up.
I had this code which can now be simplified.
pgh (pronounced pug) is the micro-op (pipeline) group header containing information in common to all micro-ops in the group.

// Move a pending hardware interrupt from ROB entry ndx to the lead
// micro-op of the next instruction.
task tDeferToNextInstruction;
input rob_ndx_t ndx;
integer kk;
rob_ndx_t m1;
rob_ndx_t m2;
rob_ndx_t m3;
rob_ndx_t m4;
rob_ndx_t m5;
rob_ndx_t m6;
rob_ndx_t m7;
rob_ndx_t ih;
begin
	// Indexes of the seven ROB entries following ndx, with wrap-around.
	m1 = (ndx + Stark_pkg::ROB_ENTRIES + 1) % Stark_pkg::ROB_ENTRIES;
	m2 = (ndx + Stark_pkg::ROB_ENTRIES + 2) % Stark_pkg::ROB_ENTRIES;
	m3 = (ndx + Stark_pkg::ROB_ENTRIES + 3) % Stark_pkg::ROB_ENTRIES;
	m4 = (ndx + Stark_pkg::ROB_ENTRIES + 4) % Stark_pkg::ROB_ENTRIES;
	m5 = (ndx + Stark_pkg::ROB_ENTRIES + 5) % Stark_pkg::ROB_ENTRIES;
	m6 = (ndx + Stark_pkg::ROB_ENTRIES + 6) % Stark_pkg::ROB_ENTRIES;
	m7 = (ndx + Stark_pkg::ROB_ENTRIES + 7) % Stark_pkg::ROB_ENTRIES;
	// Pick the nearest following entry that is a lead micro-op (non-zero
	// count) and is younger than ndx (greater sequence number).
	if (rob[m1].op.uop.count!=3'd0 && rob[m1].sn > rob[ndx].sn)
		ih = m1;
	else if (rob[m2].op.uop.count!=3'd0 && rob[m2].sn > rob[ndx].sn)
		ih = m2;
	else if (rob[m3].op.uop.count!=3'd0 && rob[m3].sn > rob[ndx].sn)
		ih = m3;
	else if (rob[m4].op.uop.count!=3'd0 && rob[m4].sn > rob[ndx].sn)
		ih = m4;
	else if (rob[m5].op.uop.count!=3'd0 && rob[m5].sn > rob[ndx].sn)
		ih = m5;
	else if (rob[m6].op.uop.count!=3'd0 && rob[m6].sn > rob[ndx].sn)
		ih = m6;
	else if (rob[m7].op.uop.count!=3'd0 && rob[m7].sn > rob[ndx].sn)
		ih = m7;
	// Cannot find lead micro-op, must not be queued yet. Select tail position as
	// place for interrupt. It may be moved again later.
	else
		ih = (tail0 + Stark_pkg::ROB_ENTRIES - 1) % Stark_pkg::ROB_ENTRIES;
	// Transfer the interrupt marking from ndx to ih, in both the ROB entry
	// and the group header (four micro-ops per group, hence ih>>2).
	if (ih != ndx) begin
		rob[ih].op.hwi <= TRUE;
		rob[ndx].op.hwi <= FALSE;
		pgh[ih>>2].hwi <= TRUE;
		pgh[ih>>2].irq <= pgh[ndx>>2].irq;
		pgh[ndx>>2].hwi <= FALSE;
		pgh[ndx>>2].irq.level <= 6'd0;
	end
end
endtask

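
With rounding down, the search above could reduce to index arithmetic. A minimal sketch, assuming the micro-ops of one instruction occupy consecutive ROB entries and that each entry knows its position within the instruction. The package name, the ROB_ENTRIES value, and the uop_num argument are hypothetical and not taken from Stark_pkg:

```systemverilog
// Hypothetical sketch: names, widths, and sizes are assumptions.
package rob_sketch_pkg;
  localparam int ROB_ENTRIES = 16;  // assumed ROB size
  typedef logic [$clog2(ROB_ENTRIES)-1:0] rob_ndx_t;

  // Return the ROB index of the lead micro-op of the instruction that
  // owns entry ndx. uop_num is the micro-op's position within the
  // instruction's micro-op stream (0 = lead micro-op).
  function automatic rob_ndx_t fnLeadUop(
    input rob_ndx_t   ndx,
    input logic [2:0] uop_num
  );
    // Step back uop_num entries, wrapping around the circular buffer.
    return rob_ndx_t'((int'(ndx) + ROB_ENTRIES - int'(uop_num)) % ROB_ENTRIES);
  endfunction
endpackage
```

Under these assumptions, fnLeadUop(4'd1, 3'd3) wraps around to entry 14, and no sequence-number comparators are needed.
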
>> It is more FFs to
>> record the number, but fewer LUTs. There are something like eight 10-bit
>> comparators plus muxes on the re-order buffer to back up to the instruction
>> boundary and mark an interrupt. Recording the micro-op number is just
>> stuffing 3 bits into the PC, plus three bits propagated down the
>> pipeline (FFs). The PC has two zero bits available already.
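
The fixed-point PC and the round-down operation discussed in this thread can be sketched as a packed struct. This is only an illustration: the 32-bit address width, the package name, and the type and function names are assumptions, and the micro-op number is kept in a separate 3-bit field rather than reusing the two spare low PC bits mentioned above:

```systemverilog
// Hypothetical sketch: layout and names are assumptions.
package fxpc_sketch_pkg;
  localparam int PC_BITS  = 32;  // assumed instruction address width
  localparam int UOP_BITS = 3;   // micro-op number width, as in the post

  typedef struct packed {
    logic [PC_BITS-1:0]  insn;   // integer part: instruction address
    logic [UOP_BITS-1:0] uop;    // fractional part: micro-op number
  } fxpc_t;

  // Build a fixed-point PC from an instruction address and micro-op number.
  function automatic fxpc_t fnMakePc(
    input logic [PC_BITS-1:0]  insn,
    input logic [UOP_BITS-1:0] uop
  );
    fxpc_t pc;
    pc.insn = insn;
    pc.uop  = uop;
    return pc;
  endfunction

  // Round down: clear the fractional bits so the PC points at the
  // instruction boundary (micro-op 0).
  function automatic fxpc_t fnRoundDown(input fxpc_t pc);
    fxpc_t r;
    r     = pc;
    r.uop = '0;
    return r;
  endfunction
endpackage
```

Restarting at the interrupted micro-op would use the PC as recorded, while restarting at the instruction boundary would pass it through fnRoundDown first.
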