Subject: Re: DMA is obsolete
From: mitchalsup (at) *nospam* aol.com (MitchAlsup1)
Newsgroups: comp.arch
Date: 01. May 2025, 23:03:08
Organization: Rocksolid Light
Message-ID: <5a77c46910dd2100886ce6fc44c4c460@www.novabbs.org>
References: 1 2 3 4
User-Agent: Rocksolid Light
On Thu, 1 May 2025 13:07:07 +0000, Dan Cross wrote:
In article <da5b3dea460370fc1fe8ad2323da9bc4@www.novabbs.org>,
MitchAlsup1 <mitchalsup@aol.com> wrote:
On Sat, 26 Apr 2025 17:29:06 +0000, Scott Lurndal wrote:
[snip]
Reminds me of trying to sell a micro x86-64 to AMD as a project.
The µ86 is a small x86-64 core made available as IP in Verilog
where it has/runs the same ISA as main GBOoO x86, but is placed
"out in the PCIe" interconnect--performing I/O services topologically
adjacent to the device itself. This allows 1ns access
latencies to DCRs and performing OS queueing of DPCs,... without
bothering the GBOoO cores.
>
AMD didn't buy the arguments.
>
I can see it either way; I suppose the argument as to whether I
buy it or not comes down to, "it depends". How much control do
I, as the OS implementer, have over this core?
Other than it being placed "away" from the centralized cores,
it runs the same ISA as the main cores, has longer latency to
coherent memory and shorter latency to device control registers
--which is why it is placed close to the device itself:: latency.
The big fast centralized core is going to get microsecond latency
from an MMI/O device whereas the ASIC version will have a handful of
nanoseconds of latency. So the 5 GHz core sees ~1 microsecond while the
little ASIC sees 10 nanoseconds. ...
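To put those figures in terms of core clocks, here is a rough back-of-the-envelope sketch. The 5 GHz clock, the ~1 microsecond latency and the 10 ns latency are the numbers given above; the 1 GHz clock for the small near-device core is purely an assumption for illustration, as are all the names.

```python
def cycles_waiting(clock_ghz, latency_ns):
    """Core clock cycles that elapse during one device-register access.

    GHz times ns gives cycles directly (1e9 * 1e-9 = 1).
    """
    return clock_ghz * latency_ns

BIG_CORE_GHZ = 5       # central core clock (figure from the post)
BIG_CORE_LAT_NS = 1000 # ~1 microsecond MMI/O access (figure from the post)
SMALL_CORE_GHZ = 1     # ASSUMPTION: clock of the small near-device core
SMALL_CORE_LAT_NS = 10 # 10 ns to a device control register (figure from the post)

big = cycles_waiting(BIG_CORE_GHZ, BIG_CORE_LAT_NS)
small = cycles_waiting(SMALL_CORE_GHZ, SMALL_CORE_LAT_NS)
ratio = BIG_CORE_LAT_NS // SMALL_CORE_LAT_NS

print(f"big core:   {big} cycles per register access")
print(f"small core: {small} cycles per register access")
print(f"wall-clock latency ratio: {ratio}x")
```

Under these assumptions the big core burns on the order of 5000 cycles per access, against about 10 for the small core.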
If it is yet another hidden core embedded somewhere deep in the
SoC complex and I can't easily interact with it from the OS,
then no thanks: we've got enough of those between MP0, MP1, MP5,
etc, etc.
>
On the other hand, if it's got a "normal" APIC ID, the OS has
control over it like any other LP, and it's coherent with the big
cores, then yeah, sign me up: I've been wanting something like
that for a long time now.
It is just a core that is cheap enough to put in ASICs, that
can offload some I/O burden without you having to do anything
other than setting some bits in some CRs so interrupts are
routed to this core rather than some more centralized core.
Consider a virtualization application. A problem with, say,
SR-IOV is that very often the hypervisor wants to interpose some
sort of administrative policy between the virtual function and
whatever it actually corresponds to, but get out of the fast
path for most IO. This implies a kind of offload architecture
where there's some (presumably software) agent dedicated to
handling IO that can be parameterized with such a policy.
Interesting:: Could you cite any literature, here !?!
A core very close to the device could handle that swimmingly,
though I'm not sure it would be enough to do it at (say) line
rate for a 400Gbps NIC or Gen5 NVMe device.
I suspect the 400 Gbps NIC needs a rather BIG core to handle the
traffic loads.
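As a rough sanity check on that suspicion, the per-frame time budget at line rate can be worked out directly. This is only a sketch: it assumes Ethernet framing (20 bytes of preamble, SFD and inter-frame gap per frame), a single core touching every frame, and a hypothetical 3 GHz core clock. None of these assumptions come from the thread, and the names are illustrative.

```python
WIRE_OVERHEAD_BYTES = 20  # Ethernet preamble + SFD (8) and inter-frame gap (12)

def ns_per_frame(link_gbps, frame_bytes):
    """Time one frame occupies the wire, in ns (bits / Gbps = ns)."""
    return (frame_bytes + WIRE_OVERHEAD_BYTES) * 8 / link_gbps

def cycles_per_frame(link_gbps, frame_bytes, core_ghz):
    """Core cycles available per frame if one core must touch every frame."""
    return ns_per_frame(link_gbps, frame_bytes) * core_ghz

LINK_GBPS = 400  # NIC speed discussed above
CORE_GHZ = 3     # ASSUMPTION: clock of the core doing the work

for frame_bytes in (64, 1518):  # minimum and maximum standard frame sizes
    t = ns_per_frame(LINK_GBPS, frame_bytes)
    c = cycles_per_frame(LINK_GBPS, frame_bytes, CORE_GHZ)
    print(f"{frame_bytes:5d}-byte frames: {t:6.2f} ns/frame, "
          f"{1000 / t:7.1f} Mframes/s, {c:6.1f} cycles/frame")
```

For minimum-size frames this comes to about 1.68 ns per frame, roughly five cycles of a 3 GHz core, so any policy applied per frame would have to be spread over several cores or left to hardware.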
....but why x86_64? It strikes me that as long as the _data_
formats vis the software-visible ABI are the same, it doesn't
need to use the same ISA. In fact, I can see advantages to not
doing so.
Having the remote core run the same OS code as every other core
means the OS developers have fewer hoops to jump through. Bug-for-bug
compatibility means that clearing of those CRs just leaves
the core out in the periphery idling and bothering no one.
On the other hand, you buy a motherboard with said ASIC core,
and you can boot the MB without putting a big chip in the
socket--but you may have to deal with scant DRAM since the
big centralized chip contains the memory controller.
- Dan C.