On 8/29/2024 11:23 AM, MitchAlsup1 wrote:
On Thu, 29 Aug 2024 3:36:44 +0000, BGB wrote:
On 8/28/2024 11:40 AM, MitchAlsup1 wrote:
On Wed, 28 Aug 2024 3:33:40 +0000, BGB wrote:
>
And what kind of code compatibility would you have between different
designs...
>
>
If people can agree as to the encodings, then implementations are more
free to pick which extensions they want or don't want.
>
If the encodings conflict with each other, no such free choice is
possible.
With differing instructions, how does a software vendor write software
such that it can run near optimally on any implementation ??
They presumably target whatever is common, or the least common denominator (such as RV64G or RV64GC), and settle for "probably good enough"...
But, probably not too much different from other ISAs, just with a lot more parties involved.
The alternative is that one expects that all the software be rebuilt for the specific configuration being used, or recompiled from source or some other distribution format on the local machine on which it is to be run (with binaries distributed as some form of "portable IR").
Though, the latter is hindered to some extent by lack of a "good" portable IR:
1, Non proprietary;
2, Sensible design;
3, Works acceptably for various languages;
4, Works well for C and C++;
...
But, sadly, there is no really "good" option.
CIL / MSIL (.NET): Fails on 1 and 4;
WASM: Fails on 2;
...
While C can be compiled to .NET, it fails to pass a reasonable definition of "good" (and the generated binaries tend to fail on non-MS implementations, such as Mono).
Elsewhere, the "accepted" solution is that people release the source, and a "./configure" script or similar will figure it out.
Though, sadly, the situation with getting non-trivial programs built for my stuff is a bit more involved than passing a "--target" option to configure.
But, this is part of why I am still messing with RISC-V Mode:
Theoretically, if I can get the RISC-V stuff working better, and various common Linux libraries ported, it should get easier.
Though, while one can get it to target "riscv64-unknown-linux-gnu", there is not really an obvious way to convince "configure" that it needs to use "-fPIC" and "-fPIE" for *everything* that gets built (rather than just for shared objects).
Also sorta annoying that RV64 binaries need to reload a full copy of the ELF for every program instance.
If not for these issues (and the 'C' extension), could almost try to rip userland binaries from the RISC-V Ubuntu build.
Or, maybe start working towards a direction where I "can" just start ripping binaries off of Ubuntu or similar (and running them on top of TestKern, sorta like WSL).
Major steps needed for this:
Implement support for Linux syscalls;
Would allow binaries to use GLIBC;
Implement support for separate per-process address spaces;
Would eliminate the need for mandatory "-fPIE";
Implement 'C' extension;
Would allow directly ripping existing userland binaries.
...
Though, does raise the question of the point of having a custom ISA if it is only being used by the kernel and maybe a few programs. But, potentially, still more useful than not having any real userland.
Prolog/Epilog happens once per function, and often may be skipped for
small leaf functions, so seems like a lower priority. More so, if one
lacks a good way to optimize it much beyond the sequence of load/store
ops which it would be replacing (and maybe not a way to do it much
faster than however many can be moved in a single clock cycle with the
available register ports).
>
My 1-wide machine does ENTER and EXIT at 4 registers per cycle.
Try doing 4 LDs or 4 STs per cycle on a 1-wide machine.
>
>
It likely isn't going to happen because a 1-wide machine isn't going to
have the needed register ports.
3R1W most of the time converts to 4R or 4W for the *logues.
Having a register port "change direction" seems like an issue.
Only real way I can imagine this happening is if one has "inout" ports or similar, but, ... the tools don't like this.
For things like bidirectional pins, I usually needed to split them internally into:
pins_i, pins_o, pins_d
Or, in/out/direction.
With logic in the toplevel like:
inout[15:0] pins;
assign pins[0] = pins_d[0] ? pins_o[0] : 1'bZ;
assign pins[1] = ...
...
assign pins_i[0] = pins_d[0] ? 1'b1 : pins[0];
...
Trying to do this in internal modules generally resulted in synthesis warnings, or Verilator rejecting it entirely.
As far as logic goes, such a register file may as well be 4R1W.
For 2 wide profiles, I had used a 4R2W design.
This mostly works, but disallows a few semi-common cases, and (sadly) with WEX there isn't a way to make code that runs on both at the same time and gives best-case performance. It is now almost tempting to give in and go over to superscalar (well, since I already did it and it seems to work OK for RV mode).
But, if one doesn't have the register ports, there is likely no viable
way to move 4 registers/cycle to/from memory (and it wouldn't make sense
for the register file to have a path to memory that is wider than what
the pipeline has).
---------------
This is likely the fate of nearly every hobby class ISA.
>
Time to up your game to an industrial quality ISA.
>
Open question of what an "industrial quality" ISA has that BJX2 lacks...
Limiting the scope to things that RISC-V and ARM have.
Proper handling of exceptions (ignoring them is not proper)
If you mean FPU exceptions, maybe.
As far as general interrupt handling, mechanism isn't too far off from what SH-4 had used, and apparently also RISC-V's CLINT and MIPS work in a similar way.
Though, with differences as to how they divide up exceptions.
In my case:
Reset;
General Fault;
External Interrupt;
TLB/MMU;
Syscall.
RV CLINT apparently uses either a single entry point for everything, or a table of more specialized interrupt entry points. All 3 ISA's apparently agree on the idea of interrupt entry points being an array of branch instructions to the respective handlers.
I had thought RV was using a more complex mechanism, but when I looked back into it, it was different.
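As a rough illustration of the "array of entry points" idea, here is a minimal C model that dispatches through a table of handlers indexed by exception class. The class names follow the list above, but the numeric values, the handler names, and the `dispatch` helper are all assumptions made for this sketch; real hardware would branch through the table directly rather than call a C function.

```c
#include <stddef.h>

/* Exception classes from the list above; the numbering is an assumption. */
enum exc_class {
    EXC_RESET,
    EXC_FAULT,
    EXC_IRQ,
    EXC_TLB,
    EXC_SYSCALL,
    EXC_COUNT
};

typedef void (*exc_handler)(void);

/* Records which handler ran last (illustration only). */
static int last_handled = -1;

/* Hypothetical handlers; each one just notes that it was entered. */
static void on_reset(void)   { last_handled = EXC_RESET; }
static void on_fault(void)   { last_handled = EXC_FAULT; }
static void on_irq(void)     { last_handled = EXC_IRQ; }
static void on_tlb(void)     { last_handled = EXC_TLB; }
static void on_syscall(void) { last_handled = EXC_SYSCALL; }

/* Software stand-in for the table of branch instructions. */
static const exc_handler vector_table[EXC_COUNT] = {
    [EXC_RESET]   = on_reset,
    [EXC_FAULT]   = on_fault,
    [EXC_IRQ]     = on_irq,
    [EXC_TLB]     = on_tlb,
    [EXC_SYSCALL] = on_syscall,
};

/* Returns 0 after running the handler, or -1 if the class is out of range. */
static int dispatch(unsigned cls)
{
    if (cls >= (unsigned)EXC_COUNT)
        return -1;
    vector_table[cls]();
    return 0;
}
```

Calling `dispatch(EXC_SYSCALL)` runs `on_syscall`; an out-of-range class is rejected rather than indexing past the table.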
Proper IEEE 754-2018 handling of FMAC (compute all the bits)
Possibly true.
My FPU can more-or-less pass the 1985 spec, but not the 2018 spec.
Floating Point Transcendentals
Not present in many/most ISA's I have looked at.
HyperVisors/Secure Monitors
Possible. I had considered doing it essentially with emulators, but granted, this is not quite the same thing.
Seems many of the extant RV implementations don't have this either.
Write Interrupt service routines entirely in HLL
If you mean C... I do have this.
#ifdef TK_REGSAVE_TBR
__interrupt_tbrsave void __isr_syscall(void)
#else
__interrupt void __isr_syscall(void)
#endif
{
...
}
AKA: What exactly is the '__interrupt' for?...
However, the ISR's can't access virtual memory apart from manually translating the pointers.
The various architectural CR's can be accessed from C as well, such as "__arch_tbr" to access TBR, etc.
proper Privileges and Priorities
?...
If you mean a User/Supervisor mode split in the ISA, this does exist.
Not yet properly working in TestKern, but this is more a software thing in theory.
Proper rollout of usermode was delayed partly by needing to get virtual memory working more reliably (seems with the VM + RV issue, there is still something amiss here). And, also, eliminating any raw hardware access from the ported software (should be mostly done; programs have been moved to 'TKGDI' which basically wraps graphics/audio/MIDI stuff).
Multi-location ATOMIC events
Possibly true.
Maybe the "volatile" mechanism is weak.
Did recently end up improving things, so now (in theory) there is a Volatile XCHG.VL instruction that can be used to implement spinlocks.
Though, proper mutex locking will still require cache flushes to avoid programs seeing stale data. May need to work on this.
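For reference, an exchange-based spinlock can be sketched in portable C11, with `atomic_exchange` standing in for an instruction like XCHG.VL. This is only a sketch under assumptions: the `tk_spinlock` type and the function names are hypothetical, and the code relies on the C11 memory model with coherent caches, so it does not cover the explicit cache flushes mentioned above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: 0 = free, 1 = held. */
typedef struct {
    atomic_int held;
} tk_spinlock;

static void tk_spin_init(tk_spinlock *l)
{
    atomic_init(&l->held, 0);
}

/* Swap in 1; the caller owns the lock only if the old value was 0. */
static bool tk_spin_trylock(tk_spinlock *l)
{
    return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

/* Keep exchanging until the lock is observed free. */
static void tk_spin_lock(tk_spinlock *l)
{
    while (!tk_spin_trylock(l)) {
        /* busy-wait */
    }
}

/* Release by storing 0 so the next exchange sees the lock as free. */
static void tk_spin_unlock(tk_spinlock *l)
{
    atomic_store_explicit(&l->held, 0, memory_order_release);
}
```

The exchange is the only atomic read-modify-write needed: whoever reads back 0 has taken the lock, and everyone else reads back 1 and retries.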
..
>