Re: Why I've Dropped In

Subject: Re: Why I've Dropped In
From: cr88192 (at) *nospam* gmail.com (BGB)
Newsgroups: comp.arch
Date: 15 Jun 2025, 19:21:48
Organization: A noiseless patient Spider
Message-ID: <102n2vt$12enl$1@dont-email.me>
References: 1 2 3 4 5 6 7 8 9 10 11
User-Agent: Mozilla Thunderbird
On 6/12/2025 7:00 PM, MitchAlsup1 wrote:
On Thu, 12 Jun 2025 21:30:39 +0000, BGB wrote:
 
On 6/12/2025 2:13 PM, MitchAlsup1 wrote:
------------------------------
>
GP: Similar situation to LR, as it mostly looks like a GPR.
In my CPU core and JX2VM, the high bits of GP were aliased to FPSR, so
saving/restoring GP will also implicitly save/restore the dynamic
rounding mode and similar (as opposed to proper RISC-V which has this
stuff in a CSR).
>
With universal constants, you get this register back.
>
>
Well, if using an ABI that either allows absolute addressing or PC-rel
access to globals.
 It is ISA that directly supports access to Globals.
 
>
The ABI designs I am using in BGBCC and TestKern use a global pointer
for accessing globals, and allocate the storage for ".data"/".bss"
separately from ".text". In this ABI design, the pointer is unavoidable.
 I want a system where .data and .bss can be > 1TB away from each other.
So, that .data grows when ld.so loads another dynamic library, and .bss
grows for the same reasons.
 
At least with PE / PEL4, each EXE or DLL gets its own index/value for the global pointer, then there is a mechanism for reloading it as needed (usually needed for any DLL exports or non-leaf function pointers).
Relocating each image in a GP-relative manner was considered at one point, but didn't go that way.
I guess, could be done for ELF, though with my current ISA designs, doing it this way would have a limit of ~ 16 or 32 GB.
For PIE binaries, usual option is to load an all-new copy of each image, but this is not ideal.
The more traditional solution for this sort of thing in ELF land is FDPIC, though RV64+FDPIC is apparently not a thing (seemingly FDPIC is only supported for 32-bit targets).
Also the way FDPIC was usually done carries a fairly high performance overhead (callee side GP reloads for every function call, rather than merely across DLL/SO boundaries).
An intermediate option would be to use local calls for local functions, and then go through a thunk (which reloads GP) for non-local calls, though ELF often does not distinguish (and it would also leave the responsibility mostly to "ld.so" to deal with it).
More annoyingly, in Unix land, there is generally no distinction between local functions and exports (and it isn't really possible to distinguish these cases within the scope of a single translation unit; which is a problem for traditional separate compilation).
One option could be to create separate internal/external entry points for each function (the external version effectively having the reload stuff). Then "ld.so" could fill the GOTs with whichever type of entry point is appropriate for each function.
Still would have the overhead of using GOT calls though (and additional overhead whenever using an external call, though for some cases it may make sense to do external-only and always perform the reload, but then one is back to the compiler needing to know the difference, ...).
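
As a rough illustration of the internal/external entry point idea, here is a small C model. It is only a sketch: the global pointer is simulated with an ordinary variable, and all of the names (current_gp, func_desc, external_entry, internal_entry, image_data, lib_add_base) are made up for the example rather than taken from BGBCC, TestKern, or any real ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the "global pointer" is just a variable here. */
uintptr_t current_gp;

/* Per-image data section, reached only through the global pointer. */
typedef struct { int base; } image_data;
image_data exe_data = { 5 };    /* ".data" of the calling image  */
image_data lib_data = { 100 };  /* ".data" of the library image  */

typedef int (*entry_fn)(int);

/* Descriptor: code address plus the GP value of the owning image. */
typedef struct {
    entry_fn  internal;  /* entry that assumes GP is already correct */
    uintptr_t image_gp;  /* GP value for the defining image          */
} func_desc;

/* A library function that reads its globals relative to GP. */
int lib_add_base(int x) {
    const image_data *d = (const image_data *)current_gp;
    return d->base + x;
}

/* External entry: reload GP for the callee's image, call, restore. */
int external_entry(const func_desc *fd, int arg) {
    uintptr_t saved = current_gp;
    current_gp = fd->image_gp;
    int r = fd->internal(arg);
    current_gp = saved;
    return r;
}

/* Internal entry: caller and callee share an image, GP is untouched. */
int internal_entry(const func_desc *fd, int arg) {
    assert(fd->internal != 0);
    return fd->internal(arg);
}
```

With current_gp pointing at exe_data, calling lib_add_base through external_entry sees the library's data, whereas going through internal_entry across an image boundary would silently read the caller's data instead. That is the case the loader (or compiler) would need to tell apart.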

Does allow multiple process instances in a single address space with
non-duplicated ".text" though (and is more friendly towards NOMMU
operation).
>
>
>
>
Though, this isn't practically too much different from using the HOB's
of captured LR values to hold the CPU ISA mode and similar (which my
newer X3VM retains, though I am still on the fence about the "put FPSR
bits into HOBs of GP" thing).
>
Does mean that either dynamic rounding mode is lost every time a GP
reload is done (though, only for the callee), or that setting the
rounding mode also needs to update the corresponding PBO GP pointer
(which would effectively make it semi-global but tied to each PE image).
>
The traditional assumption though was that dynamic rounding mode is
fully global, and I had been trying to make it dynamically scoped.
>
The modern interpretation is that the dynamic rounding mode can be set
prior to any FP instruction. So, you better be able to set it rapidly
and without pipeline drain, and you need to mark the downstream FP
instructions as dependent on this.
>
Errm, there is likely to be a delay here, otherwise one will get a stale
rounding mode.
 RM is "just 3-bits" that get read from control register and piped
through instruction queue to function unit. Think of the problem
one would have if a hyperthreaded core had to stutter step through
changing RM ...
 
To do it more quickly, one would likely need special logic in the pipeline for getting the updated RM to the FPU in a more timely manner.
If done (as-is) in a lax way: Held in the HOBs of GP/GBR or similar, which is handled as an SPR that gets broadcast out of the regfile.
Then one has the latency issue:
  The new value needs to reach the regfile (WB stage);
  The value then needs to make its way to the relevant ID2/RF stage (next cycle after WB).
A lazy option would be to add an interlock so that any dynamic rounding mode instruction would generate pipeline stalls for any in-flight modifications to GBR (as opposed to using a branch or a series of NOPs). This was not done in my existing implementation.
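
The interlock idea can be put as simple arithmetic. The following is a hypothetical model, not a description of the actual core: it assumes a GBR write issued at some cycle becomes visible to ID2/RF a fixed number of cycles later (visible_latency is a free parameter here, not a measured figure), and computes how many stall cycles an interlock would need to insert.

```c
#include <assert.h>

/* Hypothetical model: a GBR write issued at cycle write_cycle is visible
 * to instructions entering ID2/RF from cycle write_cycle + visible_latency.
 * Returns the stall cycles an interlock would insert before a
 * dynamic-rounding FP op issued at fp_cycle. */
int rm_stall_cycles(int write_cycle, int fp_cycle, int visible_latency) {
    assert(fp_cycle >= write_cycle && visible_latency >= 0);
    int ready = write_cycle + visible_latency;
    return (fp_cycle >= ready) ? 0 : ready - fp_cycle;
}

/* Without an interlock, any nonzero stall count means the FP op would
 * have read the stale rounding mode. */
int rm_would_be_stale(int write_cycle, int fp_cycle, int visible_latency) {
    return rm_stall_cycles(write_cycle, fp_cycle, visible_latency) > 0;
}
```

A branch or a run of NOPs achieves the same effect in software by pushing fp_cycle out past the ready point.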
But, IME, the "fenv.h" stuff, and FENV_ACCESS, is rarely used.
So, making "fesetround()" or similar faster doesn't seem like a high priority.
If "fesetround()" is a function call, the needed delay can also be ensured as-is by returning through a non-default register (mostly to hinder the branch predictor).

>
So, setting the rounding mode might be something like:
   MOV .L0, R14
   MOVTT GP, 0x8001, GP  //Set to rounding mode 1, clear flag bits
   JMP R14         //use branch to flush pipeline
   .L0:            //updated FPSR now ready
   FADDG R11, R12, R10  //FADD, dynamic mode
 Setting RM to a constant (known) value::
      HRW  rd,RM,#imm3    // rd gets old value
 
It is possible.
Could almost alias the bits to part of SR, where SR does generally have a more timely update process (could reduce latency to 2 cycles).
At present, the RM field is held in GBR(51:48), with the fast update options being either a MOVTT (can replace the high 16 bits, *1) or a BITMOV.
*1: There is also a MOVTT Imm5/Imm6 variant, though currently it can only modify (63:60).
Though, this strategy is only directly usable in XG3 (where GBR is mapped to R3/X3), N/A in XG1 or XG2, where GBR is in CR space and so would require 3 instructions.
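
To make the bit layout concrete, here is a C sketch of the two update styles, assuming only what is stated above: RM lives in bits 51:48 and a MOVTT-style operation replaces bits 63:48. The function names are invented for illustration, and the BITMOV-like helper only models "replace one bit field"; it is not meant to reflect the actual instruction semantics.

```c
#include <assert.h>
#include <stdint.h>

#define RM_SHIFT  48
#define RM_MASK   (UINT64_C(0xF) << RM_SHIFT)       /* bits 51:48 */
#define ADDR_MASK UINT64_C(0x0000FFFFFFFFFFFF)      /* bits 47:0  */

/* MOVTT-like: replace bits 63:48 with imm16, keep the low 48 bits. */
uint64_t movtt_hi16(uint64_t gbr, uint16_t imm16) {
    return (gbr & ADDR_MASK) | ((uint64_t)imm16 << 48);
}

/* BITMOV-like (hypothetical): replace only the RM field. */
uint64_t set_rm_field(uint64_t gbr, unsigned rm) {
    assert(rm <= 0xF);
    return (gbr & ~RM_MASK) | ((uint64_t)rm << RM_SHIFT);
}

/* Extract the rounding mode from bits 51:48. */
unsigned get_rm_field(uint64_t gbr) {
    return (unsigned)((gbr & RM_MASK) >> RM_SHIFT);
}

/* Address part of the pointer, with the high tag bits stripped. */
uint64_t gbr_address(uint64_t gbr) {
    return gbr & ADDR_MASK;
}
```

For example, movtt_hi16(gbr, 0x8001) corresponds to the "MOVTT GP, 0x8001, GP" line in the earlier fragment: the low 48 bits survive and the RM field reads back as 1.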
Implicitly, the fragment assumed XG3, but then this leaves open the issue of whether to use my former ASM syntax or RISC-V style ASM syntax (BGBCC can sorta accept either, with my newer X3VM experiment defaulting to RISC-V syntax).
Can note that the RISC-V F/D instructions carry a rounding mode field in the instruction itself, which either gives a fixed (static) rounding mode or selects the dynamic rounding mode (though, IIRC, there is no way to update the dynamic RM within the scope of the base ISA; so one needs Zicsr and similar to pull it off).
Where, say, the relevant bits in FPSCR would be aliased with the bits in GBR, which is potentially crap, but, ... Yeah. Though, it could be possible to make the bit-aliasing behavior depend on the ISA (say, if CPU is set to RV64GC mode or similar, it becomes CSR only; and the HOB parts of GP are ignored).
Would need to evaluate whether FPSCR updates immediately followed by dynamic-rounding instructions are actually a thing (vs, say, "fesetround()" and similar always existing as function calls).
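
For reference, the RISC-V side of this can be sketched in C. The rm field sits in bits 14:12 of the F/D arithmetic instructions, with 0b111 meaning "use the dynamic mode", and the dynamic mode itself is the frm field in bits 7:5 of fcsr. The helper names below are invented for the example; only the encodings come from the RISC-V spec.

```c
#include <assert.h>
#include <stdint.h>

/* RISC-V rm field encodings (instruction bits 14:12). */
enum rv_rm {
    RV_RNE = 0,  /* round to nearest, ties to even     */
    RV_RTZ = 1,  /* round towards zero                 */
    RV_RDN = 2,  /* round down (towards -inf)          */
    RV_RUP = 3,  /* round up (towards +inf)            */
    RV_RMM = 4,  /* round to nearest, ties to max mag. */
    RV_DYN = 7   /* use the dynamic mode from frm      */
};

/* Extract the rm field from an F/D arithmetic instruction. */
unsigned rv_fp_rm(uint32_t insn) {
    return (insn >> 12) & 0x7u;
}

/* Encode "fadd.s rd, rs1, rs2" with the given rm (opcode OP-FP = 0x53). */
uint32_t rv_encode_fadd_s(unsigned rd, unsigned rs1, unsigned rs2, unsigned rm) {
    assert(rd < 32 && rs1 < 32 && rs2 < 32 && rm < 8);
    return ((uint32_t)rs2 << 20) | ((uint32_t)rs1 << 15) |
           ((uint32_t)rm << 12)  | ((uint32_t)rd << 7)   | 0x53u;
}

/* Effective mode: static unless rm == DYN, else frm = fcsr[7:5]. */
unsigned rv_effective_rm(uint32_t insn, uint32_t fcsr) {
    unsigned rm = rv_fp_rm(insn);
    return (rm == RV_DYN) ? ((fcsr >> 5) & 0x7u) : rm;
}
```

An implementation that aliased the rounding bits into GP would, in this picture, only be changing where the value feeding the DYN case comes from; the static encodings are unaffected.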

Or, use an encoding with an explicit (static) rounding mode:
   FADD R11, R12, 1, R10
>
