Subject: Re: MSI interrupts
From: robfi680 (at) *nospam* gmail.com (Robert Finch)
Newsgroups: comp.arch
Date: 17 Mar 2025, 16:23:24
Organization: A noiseless patient Spider
Message-ID: <vr9epd$cqbc$1@dont-email.me>
User-Agent: Mozilla Thunderbird
On 2025-03-17 10:11 a.m., Michael S wrote:
On Mon, 17 Mar 2025 13:38:12 GMT
scott@slp53.sl.home (Scott Lurndal) wrote:
Robert Finch <robfi680@gmail.com> writes:
<please trim posts>
>
>
Consider that you have a pool of 4 cores set up to receive
interrupts. Those 4 cores are running at differing priorities
and the interrupt is at a still different priority. You want the
core operating at the lowest priority (with the right software
stack) to accept the interrupt!!
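A minimal C sketch of that selection rule, just to make it concrete;
the names and the fixed pool size are illustrative, not from the post:

    #include <stdint.h>

    #define NCORES 4

    /* Current operating priority of each core; lower value = lower priority. */
    static uint8_t core_prio[NCORES];
    /* Nonzero = core i is enabled as an interrupt target. */
    static uint8_t core_enabled[NCORES];

    /* Return the index of the core that should take the IRQ: the enabled
       core running at the lowest priority below the IRQ's priority.
       Returns -1 if no core can accept it right now. */
    int pick_target(uint8_t irq_prio)
    {
        int best = -1;
        for (int i = 0; i < NCORES; i++) {
            if (!core_enabled[i])
                continue;
            if (core_prio[i] >= irq_prio)   /* busy at equal or higher priority */
                continue;
            if (best < 0 || core_prio[i] < core_prio[best])
                best = i;
        }
        return best;
    }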
>
Okay, I wrote a naive hardware filter for this, which should work
okay for small numbers of CPUs but does not scale up very well.
What happens if there is no core ready? Place the IRQ back into the
queue (regenerate the IRQ as an IRQ-miss IRQ)? Or just wait for an
available core? I do not like the idea of waiting as it could stall
the system.
>
Use a FIFO to hold up to N pending IRQs. Define a signal that asserts
when the FIFO is non-empty. The CPU can mask the signal to prevent
interruption; when the signal is unmasked the CPU pops the
first IRQ from the FIFO. Or use a bitmap in flops or SRAM
(prioritization happens in the logic that asserts the fact
that an interrupt is pending to the CPU).
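A rough C model of that FIFO scheme; all of the names and the depth
here are illustrative, not from the post:

    #include <stdbool.h>
    #include <stdint.h>

    #define FIFO_DEPTH 16

    struct irq_fifo {
        uint16_t slot[FIFO_DEPTH];   /* pending IRQ numbers, oldest first */
        unsigned head, tail, count;
        bool     masked;             /* CPU-controlled mask bit */
    };

    /* The level presented to the CPU: pending and not masked. */
    bool irq_line(const struct irq_fifo *f)
    {
        return f->count != 0 && !f->masked;
    }

    /* Interrupt controller side: queue a new IRQ; false on overflow. */
    bool push_irq(struct irq_fifo *f, uint16_t irq)
    {
        if (f->count == FIFO_DEPTH)
            return false;
        f->slot[f->tail] = irq;
        f->tail = (f->tail + 1) % FIFO_DEPTH;
        f->count++;
        return true;
    }

    /* CPU side: pop the oldest pending IRQ, or -1 if there is none. */
    int pop_irq(struct irq_fifo *f)
    {
        if (f->count == 0)
            return -1;
        int irq = f->slot[f->head];
        f->head = (f->head + 1) % FIFO_DEPTH;
        f->count--;
        return irq;
    }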
>
Choose whether you want a FIFO/bitmap per target CPU, or global
pending data with the logic to select the highest-priority pending
IRQ when the CPU signals that interrupts are not masked.
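The global-pending variant reduces to a find-first-set over the
pending bits. A sketch, assuming priority is implied by bit position
(bit 0 highest) and using a GCC-style builtin to stand in for the
priority encoder:

    #include <stdint.h>

    static uint64_t pending;   /* bit n set = IRQ n pending; bit 0 = highest priority */

    void raise_irq(unsigned n) { pending |=  (1ULL << n); }
    void ack_irq(unsigned n)   { pending &= ~(1ULL << n); }

    /* Called when a CPU unmasks interrupts: return the highest-priority
       pending IRQ number, or -1 if none are pending. */
    int select_irq(void)
    {
        if (pending == 0)
            return -1;
        return __builtin_ctzll(pending);   /* lowest set bit wins */
    }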
The problem Robert is talking about arises when there are many
interrupt sources and many target CPUs.
The required routing/prioritization/acknowledgment logic (at least the
naive logic I have in mind) would be either non-scalable or
relatively complicated. The selection process in the second case will
take multiple cycles (I am thinking of a ring).
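One reading of the ring idea (an interpretation, not necessarily what
Michael means): the IRQ request hops from CPU node to CPU node, one
hop per cycle, and the first node running below the IRQ's priority
claims it, so the worst case is a full trip around the ring. A rough
C model:

    #include <stdint.h>

    #define NCPU 64

    static uint8_t cpu_prio[NCPU];   /* current running priority per CPU */

    /* Pass the request around the ring starting at 'start'. Returns the
       claiming CPU and the hop count via *cycles, or -1 if the request
       comes back unclaimed (requeue it or raise a miss). */
    int ring_route(unsigned start, uint8_t irq_prio, unsigned *cycles)
    {
        for (unsigned hop = 0; hop < NCPU; hop++) {
            unsigned node = (start + hop) % NCPU;
            if (cpu_prio[node] < irq_prio) {
                *cycles = hop + 1;
                return (int)node;
            }
        }
        *cycles = NCPU;
        return -1;
    }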
Yeah, that was an issue.
My naive filter occupied only about 50 LUTs for eight CPU cores, which
is probably okay timing-wise, but it jumped up to about 1500 LUTs for
63 cores. I think it is O(N^2) logic.
My thought was to try using something like the bit arrays used for an
age matrix.
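A rough software model of an age-matrix arbiter of that kind (names
invented here): age[i][j] set means requester i arrived before
requester j, and the grant goes to the oldest pending requester. The
N^2 bits of state are where the quadratic LUT growth comes from.

    #include <stdbool.h>

    #define NREQ 8

    static bool age[NREQ][NREQ];   /* age[i][j]: requester i is older than j */
    static bool pending[NREQ];

    /* A new request is the youngest: every already-pending requester
       becomes older than it. */
    void request(unsigned i)
    {
        pending[i] = true;
        for (unsigned j = 0; j < NREQ; j++) {
            age[i][j] = false;
            if (j != i && pending[j])
                age[j][i] = true;
        }
    }

    /* Grant the oldest pending requester: the one no other pending
       requester is older than. Returns -1 if nothing is pending. */
    int grant(void)
    {
        for (unsigned i = 0; i < NREQ; i++) {
            if (!pending[i])
                continue;
            bool oldest = true;
            for (unsigned j = 0; j < NREQ; j++)
                if (j != i && pending[j] && age[j][i])
                    oldest = false;
            if (oldest) {
                pending[i] = false;
                return (int)i;
            }
        }
        return -1;
    }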
Race logic is to be avoided. It cannot be processed in an FPGA.