MitchAlsup1 wrote:
EricP wrote:
>It is ok to *try* decoding a length from a token that might be an
instruction as long as you toss it away when you later find that it wasn't.

You use the tail of the first instruction to select the start of the second.
You use the tail of the first pair to select the start of the second pair.
You use the tail of the first quad to select the start of the second quad.

For example, if instructions can be 1..4 tokens long
then the next instruction comes from one of 4 following tokens,
the next instruction pair comes from one of 7 following instruction pairs,
the next instruction quad comes from one of 13 following instruction quads.

Decode0 Decode1 Decode2 Decode3 Decode4 Decode5...
| | | | | |
v v v v v v
Length0->[--------4:1 Select Mux----------][----------...
| | | | | |
v v | | | |
Inst0 Inst1 v v v v
Length1->[----------7:1 Select Mux---------------------]
| | | |
v v v v
Inst2 Inst3 [----------13:1 Select Mux-----------]
| | | |
v v v v
Inst4 Inst5 Inst6 Inst7
<---first pair---><--second pair--><--third pair---><---fourth pair--->
<-----------first quad------------><--------second quad--------------->

Treeifying::
Decode0 Decode1 Decode2 Decode3 Decode4 Decode5...
| | | | | |
| | | Pinst3->[--------4:1 Select Mux-
| | | | | |
| | Pinst2->[--------4:1 Select Mux----------]
| | | | | |
| Pinst1->[--------4:1 Select Mux----------]
| Length1 | | | |
v v v v v v
Length0->[--------4:1 Select Mux----------]
| | | | | |
v v | | | |
Inst0 Inst1 v v v v
Length1->[----------2:1×4 Select Mux----------------]
| | | |
v v v v
Inst2 Inst3 [----------2:1×4 Select Mux-----------]
| | | |
v v v v
Inst4 Inst5 Inst6 Inst7
<---first pair---><--second pair--><--third pair---><---fourth pair--->
<-----------first quad------------><--------second quad--------------->
Where Pinsti (Pinst1, Pinst2, Pinst3 above) is a purported instruction decode which may or may not
be selected as an instruction starting point. This gets rid of the
wide multiplexers at the cost of additional 4:1 multiplexers.
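
The pair/quad selection described above can be sketched in software. This is only a rough model under stated assumptions, not either poster's actual design: instructions are 1..4 tokens long (as in the example), the buffer holds 32 tokens, and `lengths[i]` is a hypothetical array holding the length speculatively decoded at every token offset, whether or not an instruction really starts there. The names `candidate_count`, `sequential_starts` and `tree_starts` are made up for illustration.

```python
MAX_LEN = 4  # assumption taken from the example: instructions are 1..4 tokens


def candidate_count(group_size, max_len=MAX_LEN):
    """Number of offsets at which the next group can start.

    A group of `group_size` instructions spans between group_size and
    group_size * max_len tokens, so the next group starts at one of
    group_size * (max_len - 1) + 1 positions.
    """
    return group_size * max_len - group_size + 1


def sequential_starts(lengths, count=8):
    """Reference model: walk the buffer one instruction at a time."""
    starts, pos = [], 0
    for _ in range(count):
        starts.append(pos)
        pos += lengths[pos]
    return starts


def tree_starts(lengths):
    """Find the first 8 instruction starts by doubling: 1 -> pair -> quad.

    Every offset gets a speculative "next start"; entries for offsets that
    turn out not to be instruction starts are simply never selected.
    """
    n = len(lengths)
    # skip1[i]: where the next instruction starts if one starts at offset i.
    # Index n is a sentinel meaning "past the end of the buffer".
    skip1 = [min(i + lengths[i], n) for i in range(n)] + [n]
    # skip2[i]: where the next pair starts if a pair starts at offset i.
    skip2 = [skip1[skip1[i]] for i in range(n + 1)]
    # skip4[i]: where the next quad starts if a quad starts at offset i.
    skip4 = [skip2[skip2[i]] for i in range(n + 1)]

    inst2 = skip2[0]      # second pair starts at the tail of the first pair
    inst4 = skip4[0]      # second quad starts at the tail of the first quad
    inst6 = skip2[inst4]  # fourth pair starts at the tail of the third pair
    return [0, skip1[0], inst2, skip1[inst2],
            inst4, skip1[inst4], inst6, skip1[inst6]]
```

`candidate_count` gives 4, 7 and 13 for a single instruction, a pair and a quad, which is where the 4:1, 7:1 and 13:1 mux widths in the first figure come from. The treeified figure replaces the wide selects with more narrow ones; the model above does not capture that gate-level difference, only which offsets end up selected.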
And thanks for taking the time to ASCII-art the figure.
I should have mentioned those muxes are replicated horizontally across
the input token buffer for each offset a pair or quad could start at.
In the above case, the input buffer has space for 8 instructions * 4 tokens.
The first token is offset 0, the first possible pair starts at offset 1,
the last possible pair starts at offset 28, so that's 28 sets of 4:1 muxes
* 4 tokens per instruction * bits-per-token (plus sundry housekeeping bits).
Also I used one-hot select muxes,
that is the 4:1 mux has a 4-bit
one-hot select control and the 7:1 mux has a 7-bit select control,
as it is easier to shift a one-hot enable out to the next position,
and it eliminates the mux binary decoder and length adders for
figuring out where the next pair or quad starts from.

To a logic designer, the difference between a 1-hot mux and a binary
A 4:1 mux is 1 gate of delay (and one logic inversion)
99% of selection logic anywhere in a pipeline is 1-hot.
Exactly.

So those wide muxes are really just a layer of AND gates enabled by
one of the select control bits, and a 4 or 7 or 13 input OR.

Basically, you let each word determine its output and you decode the LOBs of IP to get your starting point.

There are no length adders inside the selection routing tree,
just at the end to sum up the total length of valid instruction bytes
so we know what to increment the fetch RIP by.
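
As a rough illustration of the one-hot scheme described above, here is a small Python model. It is a sketch under assumptions, not the actual logic: lengths are 1..4 tokens and arrive already one-hot encoded (bit k set means length k+1), and the helper names `one_hot_mux`, `to_one_hot` and `advance` are invented for this example.

```python
def one_hot_mux(inputs, select):
    """One-hot select mux: AND each input with its enable bit, then OR."""
    assert sum(select) == 1, "select must be one-hot"
    out = 0
    for value, enable in zip(inputs, select):
        mask = -enable       # 0 stays 0, 1 becomes all ones in Python
        out |= value & mask  # layer of AND gates feeding one wide OR
    return out


def to_one_hot(length, max_len=4):
    """Encode a length of 1..max_len as a one-hot list (bit k = length k+1)."""
    return [1 if k == length - 1 else 0 for k in range(max_len)]


def advance(start_hot, length_hot):
    """Shift a one-hot start marker to the next instruction start.

    start_hot[i] is 1 at the current start offset. length_hot[i] is the
    one-hot length decoded at offset i. Each output bit is an OR of at
    most four AND terms; the index arithmetic below stands for fixed
    wiring, not for an adder in the data path.
    """
    n = len(start_hot)
    nxt = [0] * n
    for i in range(n):
        for k in range(len(length_hot[i])):
            j = i + k + 1  # offset reached from i with length k+1
            if j < n:
                nxt[j] |= start_hot[i] & length_hot[i][k]
    return nxt
```

For example, with decoded lengths `[2, 1, 3, 1, 1, 4, 1, 1]` and a start marker at offset 0, one call to `advance` moves the marker to offset 2 and a second call moves it to offset 5. Summing the lengths of the instructions that turn out valid, to get the fetch RIP increment, would still be done once at the end and is left out of this sketch.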