PCIe MSI-X interrupts
Subject : PCIe MSI-X interrupts
From : mitchalsup (at) *nospam* aol.com (MitchAlsup1)
Newsgroups : comp.arch
Date : 21 Jun 2024, 21:35:32
Organization : Rocksolid Light
Message-ID : <bb16865f7675526d4e2b87283e28c2c5@www.novabbs.org>
User-Agent : Rocksolid Light
PCIe has an MSI-X interrupt 'capability' which consists of
a number (n) of interrupt descriptors and an associated Pending
Bit Array (PBA), where each bit in the PBA has a corresponding
128-bit descriptor. A descriptor contains a 64-bit address, a
32-bit message, and a 32-bit vector control word.

There are 2 levels of enablement: one in the MSI-X configuration
control register and a per-vector mask bit in each interrupt
descriptor at vector control bit[0] (set = masked).

As the device raises an interrupt, it sets a bit in the PBA.
When MSI-X is enabled, a bit in the PBA is set (1), and the
corresponding vector control mask bit[0] is clear, the device
sends a write of the message to the address in the descriptor
and clears the bit in the PBA.
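The delivery rule just described can be sketched as a small behavioral model. This is only an illustration, not anything taken from the specification text: the names `MsixEntry`, `MsixFunction`, `raise_interrupt`, and `flush` are made up for the example, the addresses in the usage note are arbitrary, and the per-vector mask is assumed to be bit 0 of vector control (set = masked).

```python
from dataclasses import dataclass, field

# Assumption: per-vector mask is bit 0 of vector control, set = masked.
MASK_BIT = 1 << 0


@dataclass
class MsixEntry:
    address: int                     # 64-bit message address
    data: int                        # 32-bit message
    vector_control: int = MASK_BIT   # vectors start out masked


@dataclass
class MsixFunction:
    entries: list                    # list of MsixEntry, one per vector
    enabled: bool = False            # function-wide MSI-X enable
    pba: int = 0                     # Pending Bit Array, one bit per entry
    sent: list = field(default_factory=list)  # (address, data) writes issued

    def raise_interrupt(self, n):
        """The device wants vector n: mark it pending, then try to send."""
        self.pba |= 1 << n
        self.flush()

    def flush(self):
        """Send every pending vector that both enable levels allow."""
        if not self.enabled:
            return
        for n, entry in enumerate(self.entries):
            pending = (self.pba >> n) & 1
            if pending and not (entry.vector_control & MASK_BIT):
                self.sent.append((entry.address, entry.data))
                self.pba &= ~(1 << n)
```

With a function built from two entries, raising vector 0 while MSI-X is disabled or the vector is masked only leaves its PBA bit set; once `enabled` is true and the mask bit is cleared, the next `flush()` records one write of the message to the address and clears the bit.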
I am assuming that the MSI-X enable bit is used to throttle
a device so that it sends bursts of interrupts to optimize
the caching behavior of the cores handling the interrupts.
run applications->handle k interrupts->run applications.
A home machine would not use this feature as the interrupt
load is small, but a GB server might want more control over
when interrupts arrive.
But does anybody know ??
a) device command to interrupt descriptor mapping {
There is no mention of the mapping of commands to the device
and to these interrupt descriptors. Can anyone supply input
or pointers to this mapping?

A single device (such as a SATA drive) might have a queue of
outstanding commands that it services in whatever order it
thinks best. Many of these commands want to inform some core
when the command is complete (or cannot be completed). To do
this, the device sends a stored interrupt message to the
stored service port.
}
I don't really NEED to know this mapping, but knowing would
significantly enhance my understanding of what is supposed to
be going on, and thus avoid making crippling errors.
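One way to picture such a mapping, purely as an assumption for discussion: the mapping is left to the device and its driver, and the driver binds each command queue to an index in the MSI-X table. NVMe works roughly this way, since the driver names an interrupt vector when it creates a completion queue. The class `QueuedDevice` and its methods below are hypothetical and do not correspond to any real device interface.

```python
class QueuedDevice:
    """Hypothetical device: each command queue is bound to one MSI-X vector."""

    def __init__(self, num_vectors):
        self.num_vectors = num_vectors
        self.queue_to_vector = {}   # programmed by the driver (assumption)
        self.pba = 0                # Pending Bit Array, one bit per vector

    def bind_queue(self, queue_id, vector):
        """Driver side: choose which descriptor a queue's completions use."""
        if not 0 <= vector < self.num_vectors:
            raise ValueError("vector index outside the MSI-X table")
        self.queue_to_vector[queue_id] = vector

    def complete(self, queue_id):
        """Device side: a command on queue_id finished (or failed)."""
        vector = self.queue_to_vector[queue_id]
        self.pba |= 1 << vector     # the descriptor's message goes out later
        return vector
```

Under this sketch several queues may share one vector, in which case the handler has to look at the queues themselves to find out which command completed.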
b) address space of interrupt service port {
The address in the interrupt descriptor points at a service
port (APIC). Since a service port is "not like memory"*, I
want to mandate this address be in MMI/O space, and since
My 66000 has a full 64-bit address space for MMI/O there is
no burden on the size of MMI/O space--it is already as big
as possible on a 64-bit machine. Plus, MMI/O space has the
property of being sequentially consistent whereas DRAM is
only cache consistent.

Most current architectures just partition a hunk of the
physical address space as MMI/O address space.

(*) memory has the property that a read will return the last
bit pattern written, a service port does not.

I assume that service port addresses map to different cores
(or local APICs of a core). I want to directly support the
notion of a virtual core, so while a 'chip' might have a large
number of physical cores, one would want a pool of thousands+
of virtual cores. I want said service ports to support raising
an interrupt directly to a physical or virtual core.
}
Apparently, the message part of the MSI-X interrupt can be
interpreted any way that both SW and HW agree. This works
for already defined architectures, and doing it like one
or more others makes an OS port significantly easier.
However, what these messages contain is difficult to find
via Google.

So, it seems to me that the combination of the 64-bit address
and the 32-bit message must provide::
a) which level of the system to interrupt
{Secure Monitor, HyperVisor, SuperVisor, Application}
b) which core should handle the interrupt
{physical[0..k], virtual[l..m]}
c) what priority level the interrupt has
{There are 64 unique priority levels}
d) something about why the interrupt was raised
{what remains of the message}
I suspect that (a) and (b) are parts of the address while (c)
and (d) are part of the message. Although nothing prevents
(c) from being part of the address.
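To make the split concrete, here is a sketch of one possible encoding with (a) and (b) in the address and (c) and (d) in the message. Every field position is an assumption chosen for illustration: a 2-bit level and a 16-bit core number in the low bits of a service-port address (kept 4-byte aligned), a 6-bit priority at the top of the message, and the remaining 26 bits as the reason. The names `encode` and `decode` and the base address are hypothetical.

```python
# Order follows the list in the text; the index is the 2-bit level field.
LEVELS = ("Secure Monitor", "HyperVisor", "SuperVisor", "Application")

CORE_SHIFT = 2               # keep the address 4-byte aligned
CORE_BITS = 16               # assumed width of the core number
LEVEL_SHIFT = CORE_SHIFT + CORE_BITS
PRIORITY_BITS = 6            # 64 priority levels, from the text
REASON_BITS = 32 - PRIORITY_BITS


def encode(base, level, core, priority, reason):
    """Return (address, message); base must be aligned to 1 << 20."""
    if base & ((1 << (LEVEL_SHIFT + 2)) - 1):
        raise ValueError("base overlaps the level/core fields")
    if not (0 <= level < len(LEVELS) and 0 <= core < (1 << CORE_BITS)):
        raise ValueError("level or core out of range")
    if not (0 <= priority < (1 << PRIORITY_BITS)
            and 0 <= reason < (1 << REASON_BITS)):
        raise ValueError("priority or reason out of range")
    address = base | (level << LEVEL_SHIFT) | (core << CORE_SHIFT)
    message = (priority << REASON_BITS) | reason
    return address, message


def decode(base, address, message):
    """Inverse of encode: return (level, core, priority, reason)."""
    offset = address - base
    level = (offset >> LEVEL_SHIFT) & (len(LEVELS) - 1)
    core = (offset >> CORE_SHIFT) & ((1 << CORE_BITS) - 1)
    priority = message >> REASON_BITS
    reason = message & ((1 << REASON_BITS) - 1)
    return level, core, priority, reason
```

Moving (c) into the address, as the text allows, would only mean shifting the priority field from `message` into `address`; nothing else in the sketch depends on where it lives.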
Once MSI-X is sorted out, MSI becomes a subset.
HostBridge has a service port that provides INT[A,B,C,D] to
MSI-X translation, so only MSI-X messages are used
system-wide.
------------------------------------------------------------
It seems to me that the interrupt address needs translation
via I/O MMU, but which of the 4 levels provides the
translation Root pointers ??
Am I allowed to use bits in Vector Control to provide this ??
But if I put it there then there is cross privilege leakage !
c) interrupt latency {
When "what is running on a core" is timesliced by a HyperVisor,
a core that launched a command to a device may not be running
at the instant the interrupt arrives back.
It seems to me, that the HyperVisor would want to perform ISR
processing of the interrupt (low latency) and then schedule
the softIRQs to the <sleeping> core so when it regains control
the pending I/O stack of "stuff" is properly cleaned up.
So, should all interrupts simply go to HyperVisor and let HV
sort it all out? Or can the <sleeping> virtual core just deal
with it when it is given a next time slice ??
Now, if there were a way to cascade interrupts such that if
an interrupt was routed to a <sleeping virtual core> that
some kind of "poke in the side" of a HyperVisor would cause
HV to find a next time slice for the <sleeping> core post
haste, and just let the core deal with the interrupt !!
Presto, any privilege level can handle its own interrupts.
}
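The cascade idea in (c) can be sketched as follows. This is a toy model under stated assumptions, not a description of any existing hypervisor: `VirtualCore`, `Hypervisor`, `poke`, and `deliver` are hypothetical names, and "finding a next time slice post haste" is modeled simply as moving the target to the front of a run queue.

```python
from collections import deque


class VirtualCore:
    def __init__(self, name):
        self.name = name
        self.running = False      # True while it owns a physical core
        self.pending = deque()    # interrupts waiting for this core


class Hypervisor:
    def __init__(self):
        self.run_queue = deque()  # virtual cores awaiting a time slice

    def poke(self, vcore):
        """Move a sleeping virtual core to the front of the run queue."""
        if vcore in self.run_queue:
            self.run_queue.remove(vcore)
        self.run_queue.appendleft(vcore)


def deliver(hv, vcore, message):
    """Queue the interrupt on its target; poke HV only if the target sleeps."""
    vcore.pending.append(message)
    if not vcore.running:
        hv.poke(vcore)
```

In this model the HyperVisor never looks at the message itself; it only reorders the run queue, and the virtual core handles its own pending interrupts when it next runs.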
Comments ??