Misc: Ongoing status...

Subject: Misc: Ongoing status...
From: cr88192 (at) *nospam* gmail.com (BGB)
Newsgroups: comp.arch
Date: 30 Jan 2025, 21:00:22
Organization: A noiseless patient Spider
Message-ID: <vnglop$33lk0$1@dont-email.me>
User-Agent: Mozilla Thunderbird
So, recent features added to my core ISA: None.
Reason: Not a whole lot that brings much benefit.
Have recently ended up working more on the RISC-V side of things, because there are still gains to be made there (stuff is still more buggy, less complete, and slower than XG2).
On the RISC-V side, did experiment with Branch-compare-Immediate instructions, but unclear if I will carry them over:
   Adds a non-zero cost to the decoder;
     Cost primarily associated with dealing with a second immed.
   Effect on performance is very small (< 1%).
In my case, I added them as jumbo-prefixed forms, so:
   BEQI Imm17s, Rs, Disp12s
Also added Store-with-Immediate, with a similar mechanism:
   MOV.L  Imm17s, (Rm, Disp12s*1)
As it basically dropped out for free.
Also unclear if it will be carried over. Also gains little, as in most of the store-with-immediate scenarios, the immediate is 0.
Instructions with less than a 1% gain and no compelling edge case are essentially clutter.
I can note that some of the niche ops I did add, like special-case RGB555 to Index8 or RGBI, were because at least they had a significant effect in one use-case (such as, speeding up how quickly the GUI can do redraw operations).
My usual preference in these cases is to assign 64-bit encodings, as the instructions might only be used in a few edge cases, so it becomes a waste to assign them spots in the more valuable 32-bit encoding space.
The more popular option, seemingly, was another person's proposal to define them as 32-bit encodings.
   Their proposal was effectively:
     Bcc Imm5, Rs1', Disp12
       (IOW: a 3-bit register field, in a 32-bit instruction)
   I don't like this; it is very off-balance.
     Better IMO: Bcc Imm6s, Rs1, Disp9s (+/- 512B)
The 3-bit register field also makes it nearly useless with my compiler, as my compiler (in its RV mode) primarily uses X18..X27 for variables (IOW: the callee save registers). But, maybe moot, as either way it would still save less than 1%.
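A rough sketch of the mismatch and of the Disp9s range. It assumes the 3-bit Rs1' field follows the RVC convention (field values 0..7 select X8..X15) and that Disp9s is a signed 9-bit field scaled by 2 bytes, which is what would give roughly +/- 512B. Both are assumptions here, and the function names are made up for illustration:

```python
# Assumption: the 3-bit Rs1' field uses the RVC-style mapping 0..7 -> X8..X15.
def rvc_style_regs():
    return {8 + field for field in range(8)}

# X18..X27, the callee-save registers the compiler mainly uses for variables.
COMPILER_VAR_REGS = set(range(18, 28))

def reachable_var_regs():
    """Variable registers that a 3-bit RVC-style field could name."""
    return sorted(COMPILER_VAR_REGS & rvc_style_regs())

def signed_range_bytes(bits, scale):
    """Reach (lowest, highest) in bytes of a signed field with a byte scale."""
    lowest = -(1 << (bits - 1)) * scale
    highest = ((1 << (bits - 1)) - 1) * scale
    return lowest, highest

print(reachable_var_regs())       # empty list: no overlap with X8..X15
print(signed_range_bytes(9, 2))   # assumed 2-byte scale for Disp9s
```

Under these assumptions, none of X18..X27 can be named by the 3-bit field, so the compare operand would need to be moved into X8..X15 first.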
Also, as for any ops with 3-bit registers:
   Would make superscalar harder and more expensive;
   Would add ugly edge cases and cost to the instruction decoder;
   ...
I would prefer it if people did not go that route (and tried to keep things at least mostly consistent, trying to avoid making a dog-chewed mess of the 32-bit ISA).
If you really feel the need for 3-bit register fields... Maybe, go to a larger encoding?...
When I defined my own version of BccI (with a 64-bit encoding), how many new instructions did I need to define in the 32-bit base ISA: Zero.
And, I can also have things like:
   SEQI  Rs, Imm17s, Rd  // Rd = (Rs==Imm17s)
   Etc.
Also without adding anything new to the 32-bit encoding space.
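As a minimal sketch of what these ops compute (not the actual decoder or pipeline logic): the 17-bit immediate is sign-extended and compared against Rs. The helper names, the register-file-as-dict model, the branch base being the instruction's own PC, and the 8-byte fall-through for the 64-bit encoding are all assumptions made for illustration:

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

def seqi(regs, rd, rs, imm17):
    """SEQI Rs, Imm17s, Rd:  Rd = (Rs == Imm17s)."""
    regs[rd] = 1 if regs[rs] == sign_extend(imm17, 17) else 0

def beqi(regs, pc, rs, imm17, disp_bytes):
    """BEQI Imm17s, Rs, Disp: return the next PC.

    Assumes disp_bytes is the already-decoded byte displacement, relative
    to this instruction, and that the 64-bit form falls through to pc + 8.
    """
    if regs[rs] == sign_extend(imm17, 17):
        return pc + disp_bytes
    return pc + 8

regs = {10: -1, 11: 0}
seqi(regs, 11, 10, 0x1FFFF)                    # 0x1FFFF sign-extends to -1
print(regs[11])
print(hex(beqi(regs, 0x1000, 10, 0x1FFFF, -16)))
```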
I am personally feeling kinda useless right now, as there is nothing new or compelling in this; mostly just trying to hunt down bugs in my Verilog code.
The RISC-V mode is still not entirely stable if jumbo prefixes are used, but I can't seem to find the issue.
Comparably, XG3 is even less stable, but I have put it at a lower priority than stabilizing RV+Jx (as XG3 depends on RV+Jx being stable).
Seemingly, people are not very accepting of either jumbo prefixes or register-indexed load/store, but in my testing, these are the two biggest performance improvements.
Did recently change the encoding space for RV jumbo prefixes from 0000401B to 0000403F, namely, putting them in the 64-bit opcode space, because this is (more or less) what they do (this takes up 1/8 of the 64-bit encoding space).
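To illustrate why the move changes which size class the prefix lands in, here is a small sketch of the conventional RISC-V length-encoding rule (instruction length determined from the low bits of the first 16-bit parcel). The function name is made up, and reading "1/8 of the 64-bit space" as a fixed 3-bit field at bits 14:12 is an assumption on my part:

```python
def rv_insn_length_bits(low16):
    """Instruction length implied by the low bits of the first parcel."""
    if (low16 & 0b11) != 0b11:
        return 16
    if (low16 & 0b11100) != 0b11100:
        return 32
    if (low16 & 0b111111) == 0b011111:
        return 48
    if (low16 & 0b1111111) == 0b0111111:
        return 64
    return None  # longer or reserved encodings, not handled in this sketch

print(rv_insn_length_bits(0x401B))   # old prefix pattern: 32-bit space
print(rv_insn_length_bits(0x403F))   # new prefix pattern: 64-bit space
print((0x403F >> 12) & 0b111)        # assumed 3-bit selector at bits 14:12
```

If the prefix is matched on the 7-bit length marker plus that one 3-bit value, it occupies one of eight equal slices of the 64-bit space.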
Did recently experiment with a compacted 48-bit encoding (Jumbo-Mini-48, or JM48). It effectively hacks apart the 32-bit instruction and some bits from the prefix, and crams them into a 48-bit encoding (using 1/4 of the 48-bit encoding space).
These basically give Imm22/Disp22 encodings for the existing Imm12 encodings. A Disp30 case for JAL and AUIPC was also considered, but these would need to be special-cased (they would have a range of +/- 1GB).
They essentially mirror the existing 32-bit encoding space.
   Just with some special-case decoding tweaks depending on the block.
The mechanism was basically that, alongside the XG3 decoder, I added some logic to dynamically repack 48-bit encodings into the 64-bit jumbo-prefixed forms (which are then what the decoder proper sees).
Where, the existing/known 48-bit ops had taken the form (partly taking a guess, existing table was not super clear):
   iii* - yyy0-nnnnn-0011111
     000:   L.LI.48 Imm32, Rn
     001: ? L.ADD.48 Imm32, Rn
     010: ? L.JAL.48 Disp32, X0
     011: ? L.JAL.48 Disp32, X1
     100:   L.SHORI.48 Imm32, Rn
     101: ? L.AND.48 Imm32, Rn
     110: ? L.OR.48 Imm32, Rn
     111: ? L.XOR.48 Imm32, Rn
I had then made a claim of:
   zzz* - yyy1-nnnnn-0011111
For the JM48 encodings.
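A sketch of how the two groups could be told apart, assuming the patterns above read from most to least significant bit: a 32-bit upper field at bits 47:16, yyy at bits 15:13, the 0/1 selector at bit 12, the register at bits 11:7, and the 0011111 opcode at bits 6:0. That bit placement is my reading of the patterns, not a confirmed layout, and the names below are only for illustration:

```python
# Names taken from the table above; entries marked "?" there are guesses.
KNOWN_48BIT_OPS = {
    0b000: "L.LI.48",    0b001: "L.ADD.48",
    0b010: "L.JAL.48",   0b011: "L.JAL.48",
    0b100: "L.SHORI.48", 0b101: "L.AND.48",
    0b110: "L.OR.48",    0b111: "L.XOR.48",
}

def classify48(word):
    """Split a 48-bit word into fields under the assumed layout."""
    if (word & 0x7F) != 0b0011111:
        return {"group": "other"}
    fields = {
        "rn": (word >> 7) & 0x1F,
        "yyy": (word >> 13) & 0b111,
        "upper32": (word >> 16) & 0xFFFFFFFF,
    }
    if (word >> 12) & 1:
        fields["group"] = "JM48"          # yyy1 block
    else:
        fields["group"] = KNOWN_48BIT_OPS[fields["yyy"]]   # yyy0 block
    return fields

word = (0x12345678 << 16) | (0b100 << 13) | (5 << 7) | 0b0011111
print(classify48(word))
print(classify48(word | (1 << 12))["group"])
```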
Seemingly, they had used 1011111 mostly for Disp26 branch-ops.
   IMHO, kind of a waste...
My JM48 encodings, while not giving as big of an immediate, do at least also give things like Load/Store instructions. Which, you can't add, if you have already burned the entire 48-bit encoding space on 2RI ops and Bcc variants...
Had they used Disp24, then one could have both Bcc and basic Load/Store ops. Though, realistically, one usually only needs to Bcc within a single function, so even 24 bits is overkill (I would be more inclined towards, say, 16 or 17 bits).
Or, in my JM48 scheme, for the cost of (only) 22 bit displacements, one gets: All of the existing Disp12 ops... (Just with 10 bits glued on).
And, more so, in its present form, in under 200 lines of Verilog...
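The "10 bits glued on" idea can be sketched as below. This assumes the extra 10 bits become the high-order bits above the original 12-bit field and that the combined 22-bit value is sign-extended; the actual bit placement in JM48 is not spelled out here, so treat the function and parameter names as hypothetical:

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

def make_disp22(ext10, disp12):
    """Glue 10 extension bits on top of a 12-bit field, then sign-extend."""
    raw = ((ext10 & 0x3FF) << 12) | (disp12 & 0xFFF)
    return sign_extend(raw, 22)

print(make_disp22(0x000, 0x7FF))   # largest value a plain Disp12 could reach
print(make_disp22(0x1FF, 0xFFF))   # largest Disp22 value
print(make_disp22(0x200, 0x000))   # smallest Disp22 value
```

With byte-scaled displacements this would put the reach at about +/- 2MB, versus +/- 2KB for a plain Disp12.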
Currently, JM48 encodings can't encode references to R32..R63 (XGPRs), but this may not be a huge loss:
   I currently intend it for probable use with RVC;
   Code in RV64GC mode or similar will be very unlikely to also be using XGPRs;
   The relative cost increase from a 48-bit to a 64-bit encoding to use XGPRs will probably be smaller.
I may consider allowing JM48 with 3R ops to potentially access XGPRs, say (with a JV bit):
   JV=0:
     Gives XGPRs and some extended opcode (need ~ 3b + 4b here);
     Still leaves 2 bits remaining, probably MBZ for now.
   JV=1:
     Gives a 14-bit synthesized immediate (with no XGPRs).
       Or, maybe 12-bit (with XGPRs).
       Still, TBD.
It is possible, I could have used a combined JV/JT bit for the Imm12 ops, allowing a choice between Disp21 or Disp17 with XGPRs, but decided to just go with the simpler option of Disp22 with no option of XGPRs (if they need XGPRs, next option up being a 64-bit encoding, where it is a choice between Disp33 or Disp17 with XGPRs).
Or, if the 3R blocks are given XGPRs with synthesized immediate values, could use synthesized 12-bit immediate encodings for cases where XGPR is needed. Need to decide between these options.
For maximizing performance, I would just as soon stick mostly to 32 and 64 bit encodings. But, many people are judging the "goodness" more in terms of minimizing the number of bytes in ".text" rather than the number of logical instructions in the execution path (seemingly using ".text" size as their sole metric, rather than as a guide).
Where, 48-bit will throw off the 32-bit alignment, which in my core's design will hinder the use of superscalar (it is superscalar only for 32-bit aligned 32-bit ops).
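A small sketch of the alignment effect, modelling only the stated rule (an op is a superscalar candidate only if it is a 32-bit op at a 32-bit aligned address). The function name and the example instruction stream are made up, and real pairing rules in the core are certainly more involved:

```python
def superscalar_candidates(lengths_bytes, start=0):
    """Return (offset, length, eligible) for each op in a straight-line run."""
    offset = start
    out = []
    for length in lengths_bytes:
        eligible = (length == 4) and (offset % 4 == 0)
        out.append((offset, length, eligible))
        offset += length
    return out

# A run of 32-bit ops with two 48-bit (6-byte) ops mixed in.
for entry in superscalar_candidates([4, 6, 4, 4, 6, 4]):
    print(entry)
```

In this made-up stream, the 32-bit ops between the two 48-bit ops sit at offsets 10 and 14, so they lose eligibility until the second 48-bit op restores alignment.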
But, my overall goal still being:
   Try to make it not suck.
But, it still kinda sucks.
   And, people don't want to admit that it kinda sucks;
   Or, that going some directions will make things worse.
Seems like a mostly pointless uphill battle trying to convince anyone of things that (at least to me) seem kinda obvious.
...

