On 4/27/2024 8:45 PM, MitchAlsup1 wrote:
BGB wrote:
On 4/27/2024 3:37 PM, MitchAlsup1 wrote:
BGB wrote:
>
On 4/26/2024 1:59 PM, EricP wrote:
MitchAlsup1 wrote:
BGB wrote:
>
If one had 16-bit displacements, then unscaled displacements would make sense; otherwise scaled displacements seem like a win (misaligned displacements being much less common than aligned displacements).
>
What we need is ~16-bit displacements where 82½%-91¼% are positive.
>
How does one use a frame pointer without negative displacements ??
>
[FP+disp] accesses callee save registers
[FP-disp] accesses local stack variables and descriptors
>
[SP+disp] accesses argument and result values
>
Sign-extended 16-bit offsets would cover almost all such access needs
so I really don't see the need for funny business.
>
But if you really want a skewed range offset it could use something like
excess-256 encoding, which zero-extends the immediate and then subtracts 256
(or whatever) from it, to give offsets in the range -256..+65535-256.
So an immediate value of 0 equals an offset of -256.
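For illustration, the excess-256 mapping described above can be sketched as follows. This is only a model of the arithmetic; the names BIAS, IMM_BITS, decode_offset and encode_offset are made up for the example and do not come from any of the ISAs discussed.

```python
BIAS = 256        # the "256 (or whatever)" bias; assumed value
IMM_BITS = 16     # assumed width of the immediate field


def decode_offset(imm):
    """Zero-extend the immediate field, then subtract the bias."""
    if not 0 <= imm < (1 << IMM_BITS):
        raise ValueError("immediate does not fit in the field")
    return imm - BIAS


def encode_offset(offset):
    """Inverse mapping: add the bias and check that the result fits."""
    imm = offset + BIAS
    if not 0 <= imm < (1 << IMM_BITS):
        raise ValueError("offset outside the encodable range")
    return imm


print(decode_offset(0))        # -256
print(decode_offset(0xFFFF))   # 65279
```

So the encodable range is -256..+65279, with most of the space spent on positive offsets.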
>
>
Yeah, my thinking was that by the time one has 16 bits for Load/Store displacements, they could almost just go +/- 32K and call it done.
>
But, much smaller than this, there is an advantage to scaling the displacements.
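A rough sketch of why scaling helps for small fields: the reachable byte range grows by the scale factor, at the cost of only being able to encode aligned offsets. The 9-bit signed field below is an assumed example width, not a figure taken from the thread, and the function names are hypothetical.

```python
def disp_range(bits, scale=1):
    """Byte range reachable by a signed `bits`-wide displacement field
    whose value is multiplied by `scale` (scale=1 means unscaled)."""
    lo = -(1 << (bits - 1)) * scale
    hi = ((1 << (bits - 1)) - 1) * scale
    return lo, hi


def encodable(offset, bits, scale=1):
    """True if `offset` is a multiple of `scale` and fits in the field."""
    lo, hi = disp_range(bits, scale)
    return offset % scale == 0 and lo <= offset <= hi


print(disp_range(9))       # (-256, 255)       unscaled, assumed 9-bit field
print(disp_range(9, 8))    # (-2048, 2040)     scaled by 8 (64-bit accesses)
print(disp_range(16))      # (-32768, 32767)   the "+/- 32K" case
```

With 16 bits the unscaled range is already large enough that the extra reach from scaling matters much less.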
>
>
>
>
In other news, got around to getting the RISC-V code to build in PIE mode for Doom (by using "riscv64-unknown-linux-gnu-*").
>
Can note that RV64 code density takes a hit in this case:
RV64: 299K (.text)
XG2 : 284K (.text)
>
Is this indicative that your ISA and RISC-V are within spitting distance of each other in terms of the number of instructions in .text ?? or not ??
>
It would appear that, with my current compiler output, both BJX2-XG2 and RISC-V RV64G are within a few percent of each other...
If adjusting for Jumbo prefixes (with the version that omits GBR reloads):
XG2: 270K (-10K of Jumbo Prefixes)
Implying RISC-V now has around 11% more instructions in this scenario.
Based on Brian's LLVM compiler, RISC-V has about 40% more instructions
than My 66000; equivalently, My 66000 has about 70% of the number of instructions that RISC-V has (same compilation flags, same source code).
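The percentages quoted above can be cross-checked with a little arithmetic. Note the assumption: treating the ".text" sizes as instruction counts only works if both sides use fixed-size 32-bit instructions. The helper names are made up for the example.

```python
def pct_more(a, b):
    """How many percent larger a is than b."""
    return (a / b - 1.0) * 100.0


def as_fraction_of(pct_more_value):
    """If A is pct_more_value percent larger than B,
    return B expressed as a percentage of A."""
    return 100.0 / (1.0 + pct_more_value / 100.0)


# RV64 .text (299K) vs XG2 .text minus Jumbo prefixes (270K)
print(round(pct_more(299, 270), 1))      # 10.7

# "40% more instructions" expressed the other way around
print(round(as_fraction_of(40.0), 1))    # 71.4
```

So "around 11% more" and "about 70%" are both consistent with the figures given, to within rounding.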
I have made some progress here recently, but it is still a case of (in my case):
Stronger ISA, but with a compiler with a weak optimizer;
Vs:
Weaker ISA, but with a compiler with a stronger optimizer.
GCC is very clever at figuring out what to optimize...
Meanwhile, BGBCC may fail to optimize away constant sub-expressions if operator precedence doesn't fall in a preferable direction.
Say:
y=x*3*4;
Ends up as two multiply instructions in a row, because the folded form:
y=x*12;
Didn't happen to map to the AST as it was written (parsing was left-associative in this case, so the tree is (x*3)*4 and neither multiply has two constant operands).
Yeah, actually ran into this recently; the only solution at present is to put parentheses around the constant parts.
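A toy sketch of the issue, using nested tuples as a stand-in AST. This is not how BGBCC or GCC actually implement folding; the function names and the tuple representation are assumptions made purely for illustration. A folder that only looks at one node at a time misses the left-associative case, while one that flattens the multiply chain first catches it.

```python
def fold_naive(node):
    """Fold a '*' node only when both of its operands are constants."""
    if not isinstance(node, tuple):
        return node
    op, lhs, rhs = node
    lhs, rhs = fold_naive(lhs), fold_naive(rhs)
    if isinstance(lhs, int) and isinstance(rhs, int):
        return lhs * rhs
    return (op, lhs, rhs)


def flatten(node):
    """Collect the leaves of a chain of '*' nodes, left to right."""
    if isinstance(node, tuple):
        _, lhs, rhs = node
        return flatten(lhs) + flatten(rhs)
    return [node]


def fold_reassoc(node):
    """Flatten a multiply chain, combine all constants, then rebuild it."""
    const = 1
    result = None
    for leaf in flatten(node):
        if isinstance(leaf, int):
            const *= leaf
        else:
            result = leaf if result is None else ('*', result, leaf)
    if result is None:
        return const
    return result if const == 1 else ('*', result, const)


as_written = ('*', ('*', 'x', 3), 4)       # y = x*3*4, parsed left-associative
with_parens = ('*', 'x', ('*', 3, 4))      # y = x*(3*4)

print(fold_naive(as_written))     # ('*', ('*', 'x', 3), 4)  nothing folded
print(fold_naive(with_parens))    # ('*', 'x', 12)
print(fold_reassoc(as_written))   # ('*', 'x', 12)
```

The reassociation is fine for integer multiplies, but is not generally safe for floating-point, where regrouping can change the result.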
But, yeah, seemingly GCC isn't fooled by things like precedence order.
Seemingly, it may even chase constants across basic blocks or across memory loads and stores, causing chunks of code to disappear, etc...
But, still not enough to make up for RV64G's weaknesses it seems.
Well, and Doom isn't full of a lot of cases for it to leverage its seemingly aggressive constant-folding might...
It also has an additional 20K of ".rodata" that is likely constants, which likely overlap significantly with the jumbo prefixes.
My 66000 has vastly smaller .rodata because constants are part of .text
Similar, though in my case they exist as Jumbo prefixes.
Except well, if values are declared as "const double x0=...;", where BGBCC ends up treating it like a normal variable that does not allow assignment (so will generate different code than had one used #define or similar).
Also noted cases of this recently when diffing through my compiler output.
Does seem to be context-dependent to some extent though...
So, apparently using this version of GCC and using "-fPIE" works in my favor regarding code density...
>
>
I guess a question is what FDPIC would do if GCC supported it, since this would be the closest direct analog to my own ABI.
>
What is FDPIC ?? Federal Deposit Processor Insurance Corporation ??
Final Dopey Position Independent Code ??
>
Required a little digging: "Function Descriptor Position Independent Code".
But, I think the main difference is that normal PIC does calls like:
LD Rt, [GOT+Disp]
BSR Rt
CALX [IP,,#GOT+#disp-.]
It is unlikely that %GOT can be represented with 16-bit offset from IP
so the 32-bit displacement form (,,) is used.
Whereas, FDPIC was typically more like (pseudo ASM):
MOV SavedGOT, GOT
LEA Rt, [GOT+Disp]
MOV GOT, [Rt+8]
MOV Rt, [Rt+0]
BSR Rt
MOV GOT, SavedGOT
Since GOT is not in a register but is an address constant this is also::
CALX [IP,,#GOT+#disp-.]
So... Would this also cause GOT to point to a new address on the callee side (that is dependent on the GOT on the caller side, and *not* on the PC address at the destination) ?...
In effect, the context dependent GOT daisy-chaining is a fundamental aspect of FDPIC that is different from conventional PIC.
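To make the daisy-chaining concrete, here is a small model of the caller-side FDPIC sequence from the pseudo ASM above. It is a sketch only: the GOT is modelled as a dict, and Descriptor, Cpu, call_fdpic and lib_counter are hypothetical names, not anything from a real ABI document.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Descriptor:
    entry: Callable    # [Rt+0]: code address of the function
    got: dict          # [Rt+8]: GOT the callee must run with


class Cpu:
    def __init__(self, got):
        self.got = got                # models the GOT register


def call_fdpic(cpu, name):
    saved = cpu.got                   # MOV SavedGOT, GOT
    desc = cpu.got[name]              # LEA Rt, [GOT+Disp]
    cpu.got = desc.got                # MOV GOT, [Rt+8]
    result = desc.entry(cpu)          # MOV Rt, [Rt+0] ; BSR Rt
    cpu.got = saved                   # MOV GOT, SavedGOT
    return result


def lib_counter(cpu):
    """Shared code; which 'count' it touches depends only on the GOT."""
    cpu.got["count"] += 1
    return cpu.got["count"]


# One copy of the code, two program instances with separate data.
lib_a, lib_b = {"count": 0}, {"count": 100}
app_a = {"bump": Descriptor(lib_counter, lib_a)}
app_b = {"bump": Descriptor(lib_counter, lib_b)}

cpu = Cpu(app_a)
print(call_fdpic(cpu, "bump"))    # 1
cpu.got = app_b
print(call_fdpic(cpu, "bump"))    # 101
```

The same code address gets a different GOT depending on which GOT the caller was running with, which is the part that a PC-relative scheme cannot express.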
But, in my case, noting that function calls tend to be more common than the functions themselves, and functions will know whether or not they need to access global variables or call other functions, ... it made more sense to move this logic into the callee.
No official RISC-V FDPIC ABI that I am aware of, though some proposals did seem vaguely similar in some areas to what I was doing with PBO.
Where, they were accessing globals like:
LUI Xt, DispHi
ADD Xt, Xt, DispLo
ADD Xt, Xt, GP
LD Xd, Xt, 0
Granted, this is less efficient than, say:
MOV.Q (GBR, Disp33s), Rd
LDD Rd,[IP,,#GOT+#disp-.]
As noted, BJX2 can handle this in a single 64-bit instruction, vs 4 instructions.
Though, people didn't really detail the call sequence or prolog/epilog sequences, so less sure how this would work.
Likely guess, something like:
MV Xs, GP
LUI Xt, DispHi
ADD Xt, Xt, DispLo
ADD Xt, Xt, GP
LD GP, Xt, 8
LD Xt, Xt, 0
JALR LR, Xt, 0
MV GP, Xs
Well, unless they have a better way to pull this off...
CALX [IP,,#GOT+#disp-.]
Well, can you explain the semantics of this one...
But, yeah, as far as I saw it, my "better solution" was to put this part into the callee.
Main tradeoff with my design is:
From any GBR, one needs to be able to get to every other GBR;
We need to have a way to know which table entry to reload (not statically known at compile time).
Resolved by linker or accessed through GOT in mine. Each dynamic
module gets its own GOT.
The important thing is not associating a GOT with an ELF module, but with an instance of said module.
So, say, one copy of an ELF image, can have N separate GOTs and data sections (each associated with a program instance).
In my PBO ABI, this was accomplished by using base relocs (but, this is N/A for ELF, where PE/COFF style base relocs are not a thing).
One other option might be to use a PC-relative load to load the index.
Say:
AUIPC Xs, DispHi //"__global_pbo_offset$" ?
LD Xs, Xs, DispLo
LD Xt, GP, 0 //get table of offsets
ADD Xt, Xt, Xs
LD GP, Xt, 0
In this case, "__global_pbo_offset$" would be a magic constant variable that gets fixed up by the ELF loader.
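A small model of the callee-side reload, as I understand the scheme described above: slot 0 of every data section leads to a per-instance table, and each image has a loader-assigned index into that table. This is a sketch under those assumptions; DataSection, make_instance, reload_gp and LIBC_INDEX are hypothetical names, and the index here stands in for the value that "__global_pbo_offset$" would hold after fixup.

```python
class DataSection:
    """One image's data section for one program instance."""

    def __init__(self, name):
        self.name = name
        self.table = None     # slot 0: reference to the per-instance table


def make_instance(image_names):
    """Build one program instance: a data section per image,
    all sharing one table so any section can reach any other."""
    sections = [DataSection(n) for n in image_names]
    for s in sections:
        s.table = sections
    return sections


def reload_gp(gp, image_index):
    """Callee-side reload: LD Xt, GP, 0 ; ADD Xt, Xt, Xs ; LD GP, Xt, 0."""
    return gp.table[image_index]


# Two instances of the same pair of images.
inst1 = make_instance(["app", "libc"])
inst2 = make_instance(["app", "libc"])
LIBC_INDEX = 1    # assumed loader-assigned index for the "libc" image

print(reload_gp(inst1[0], LIBC_INDEX) is inst1[1])    # True
print(reload_gp(inst2[0], LIBC_INDEX) is inst2[1])    # True
```

The callee ends up with the data section belonging to its own image but the caller's program instance, without the caller doing anything at the call site.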
LDD Rd,[IP,,#GOT+#disp-.]
Still going to need to explain the semantics here...
Based on previous examples, the above would presumably be a normal variable load.
This was not the purpose of the "__global_pbo_offset$" trick, but more how to perform the GP reload in RV64 in a way that does not require base relocs (and was compatible with the ELF way of doing things).
I guess some people are dragging their feet on FDPIC, as there is some debate as to whether or not NOMMU makes sense for RISC-V, along with its associated performance impact if used.
>
In my case, if I wanted to go over to simple base-relocatable images, this would technically eliminate the need for GBR reloading.
>
Checks:
Simple base-relocatable case actually currently generates bigger binaries, I suspect because in this case it is less space-efficient to use PC-rel vs GBR-rel.
>
Went and added a "pbostatic" option, which sidesteps saving and restoring GBR (making the simplifying assumption that functions will never be called from outside the current binary).
>
This saves roughly 4K (Doom's ".text" shrinks to 280K).
>
Would you be willing to compile DOOM with Brian's LLVM compiler and
show the results ??
>
Will need to download and build this compiler...
Might need to look into this.
Please do.
Extracting the ZIP file and "git clone llvm-project" etc, have thus far taken hours...
Well, and then the commands to CMake were not working, tried invoking cmake more minimally, and it gives a message complaining about the version being too old, ...
Seems I have to build it with a different / newer WSL instance (well, I guess it was either this or try to rebuild CMake from source).
Checks, download for compiler (+ git cloned LLVM) is a little over 6GB.
Well, OK, now LLVM is building... I guess, will see if it compiles and doesn't explode in the process. Probably going to be a while it seems.
But, yeah, current standing for this is:
XG2 : 280K (static linked, Modified PDPCLIB + TestKern)
RV64G : 299K (static linked, Modified PDPCLIB + TestKern)
X86-64: 288K ("gcc -O3", dynamically linked GLIBC)
X64 : 1083K (VS2022, static linked MSVCRT)
But, MSVC is an outlier here for just how bad it is on this front.
To get more reference points, would need to install more compilers.
Could have provided an ARM reference point, except that the compiler isn't compiling stuff at the moment (would need to beat on stuff a bit more to try to get it to build; appears to be trying to build with static-linked Newlib but is missing symbols, ...).
But, yeah, for good comparison, one needs to have everything build with the same C library, etc.
I am thinking it may be possible to save a little more space by folding some of the stuff for "va_start()" into an ASM blob (currently, a lot of stuff is folded off into the function prolog, but probably doesn't need to be done inline for every varargs function).
Mostly this would be the logic for spilling all of the argument registers to a location on the stack and similar.
Part of ENTER already does this: A typical subroutine will use::
ENTER R27,R0,#local_stack_size
Where the varargs subroutine will use::
ENTER R27,R8,#local_stack_size
ADD Rva_ptr,SP,#local_stack_size+64
notice all we had to do was to specify 8 more registers to be stored;
and exit with::
EXIT R27,R0,#local_stack_size+64
Here we skip over the 8 register variable arguments without reloading
them.
It is mostly a chunk of code for storing the argument registers to memory, either 8 or 16 depending on the ABI variant. Need to save them off to memory mostly so "va_arg()" can see them.
Previously, this part has been done inline, but is a fairly repetitive code sequence...
Though, folding it off doesn't really seem to have saved all that much...
....