For something like a bytecode interpreter, the prediction accuracy
Hmm... but in order not to have bubbles, your prediction structure still
needs to give you a predicted target address (rather than a predicted
index number), right?
Yes, but you use the predicted index number to find the predicted
target IP.
Hmm... but that would require fetching that info from memory.
Can you do that without introducing bubbles?
In many/most (dynamic) cases, they have already been fetched and all
that is needed is muxing the indexed field out of the Instruction Buffer.
I guess for a small jump table that would work well, indeed, but for
something like a bytecode interpreter, even if you can compact it to
have only 16 bits per entry, that still spans 512B. Is your IB large
enough for that?
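To make the 512B figure concrete, here is a minimal sketch of such a compacted table, assuming 256 byte-sized opcodes and 16-bit offsets from a common base. The names `jump_table`, `target_ip` and `NUM_OPCODES` are made up for illustration and do not come from any actual design discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NUM_OPCODES = 256 };        /* assumption: one entry per byte-sized opcode */

/* Compacted jump table: 16-bit offsets from a common base instead of
   full-width target addresses. */
typedef struct {
    uint64_t base;                 /* hypothetical address of the first handler */
    uint16_t offset[NUM_OPCODES];  /* 256 entries * 2 bytes = 512 bytes */
} jump_table;

/* Two steps: select the indexed entry, then compute the target address.
   Both have to finish before fetch can be redirected. */
static uint64_t target_ip(const jump_table *t, uint8_t opcode)
{
    assert(t != NULL);
    return t->base + (uint64_t)t->offset[opcode];
}
```

With `base = 0x1000` and `offset[3] = 0x40`, `target_ip(&t, 3)` yields `0x1040`, and `sizeof t.offset` is the 512 bytes mentioned above.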
a) line size is 512-bits or 64-bytes.
If you're lucky it's in the L1 Icache, but that still takes a couple
cycles to get, doesn't it?
My 1-wide machine fetches 4-words per cycle.
My 6-wide machine fetches 3 ½-cache-lines per cycle.
Even with a 256B cache line width, it would take 2 cycles to get a 512B
jump table into your IB, after which you still have to select (and
compute, if the table is compacted) the corresponding target address,
and only after that can you start fetching (which itself will suffer
the L1 latency), so we're up to a 5-6 cycle bubble, no?
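The arithmetic behind that estimate can be sketched as follows. The 512B table and 256B line width are the figures from the paragraph above; the 1-cycle select/compute step and the 2 to 3 cycle L1 latency are my assumptions to fill in "a couple cycles", and the function names are invented for this sketch.

```c
#include <assert.h>

/* Cycles needed to stream table_bytes at bytes_per_cycle, rounded up. */
static unsigned fetch_cycles(unsigned table_bytes, unsigned bytes_per_cycle)
{
    assert(bytes_per_cycle > 0);
    return (table_bytes + bytes_per_cycle - 1) / bytes_per_cycle;
}

/* Total bubble: load the table into the IB, select (and compute) the
   target, then pay the L1 latency for the fetch at that target. */
static unsigned dispatch_bubble(unsigned table_bytes,
                                unsigned bytes_per_cycle,
                                unsigned select_cycles, /* assumed value */
                                unsigned l1_latency)    /* assumed value */
{
    return fetch_cycles(table_bytes, bytes_per_cycle)
         + select_cycles
         + l1_latency;
}
```

`fetch_cycles(512, 256)` gives the 2 cycles for the table; `dispatch_bubble(512, 256, 1, 2)` and `dispatch_bubble(512, 256, 1, 3)` give 5 and 6 under the assumed select and L1 costs.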
Stefan