On 8/20/2024 8:40 AM, Stefan Monnier wrote:
>> I can understand the reluctance to go to 6 bit register specifiers, it
>> burns up your opcode space and makes encoding everything more difficult.
>> But today that is an unserviced market which will get customers to give you
>> a look. Put out some vapor ware and see what customers say.
> If the issue is only the encoding, then presumably, Mitch could go the
> route of a prefix instruction (like his PRED instruction or the
> instruction he uses to do wide shifts/adds/...).
Ironically, in my case, this was one of the original uses of the FF prefix word.
FFwZ-Zvii
w: {WW, WN, WM, WO}/{WW, Wnmo} (Register Extensions, Bit 5)
WW: Typically Opcode
Z: Opcode Extension
v: {VV, Sc/I, Sc/I, I} (V Bit, Scale|Immed, Immed)
ii: Immed
Except for 4R ops:
FFwZ-Zvpp
v: V, Sc/Z, Sc/Z, Z
pp: ZZ, pppppp (Rp)
Interpretation partly depends on the op that follows.
Typical Encoding for base instruction:
PYnm-ZeoZ
P: 111P in Baseline, ~P=Predicated, {~Wnmo, p} in XG2
Y: {Z, Wx, Z, Z}
Wx: WEX Bit if not Predicated, False Bit Predicated
Z: Opcode
e: {EQ, EN, EM, EO} (Register Extensions, Bit 4)
EQ: Usually Opcode
n: nnnn (low 4 bits of Rn/Rd)
m: mmmm (low 4 bits of Rm/Rs)
o: oooo (low 4 bits of Ro/Rt)
And, for Immed Form instructions:
PYnm-Zeii (3RI, Imm9 / Imm10)
PYnZ-Zeii (2RI, Imm10 / Imm12)
PYZn-iiii (2RI, Imm16)
Breaks conventions with rest of encoding scheme.
Within the notation:
0..9, A..F: Literal Hexadecimal Values.
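As a rough sketch of how the field letters above could map onto bits (Python used purely for illustration; this is not the actual decoder). Assumptions, none of which are spelled out above: each pattern letter is one hex digit, read most-significant-first within each 16-bit word; brace lists such as {EQ, EN, EM, EO} and {WW, WN, WM, WO} are listed MSB first; `decode_3r` and `prefix_w` are made-up names.

```python
def decode_3r(w0, w1, prefix_w=0):
    """Split a PYnm-ZeoZ pattern (two 16-bit words) into its fields.

    prefix_w is the 'w' digit of an optional FFwZ-Zvii prefix (0 if absent).
    Assumed layout: leftmost pattern letter = bits 15:12 of each word.
    """
    p = (w0 >> 12) & 0xF          # P digit
    y = (w0 >> 8) & 0xF           # Y digit
    n = (w0 >> 4) & 0xF           # low 4 bits of Rn/Rd
    m = w0 & 0xF                  # low 4 bits of Rm/Rs
    e = (w1 >> 8) & 0xF           # {EQ, EN, EM, EO}, assumed MSB first
    o = (w1 >> 4) & 0xF           # low 4 bits of Ro/Rt

    # Bit 4 of each register number comes from the e digit.
    en, em, eo = (e >> 2) & 1, (e >> 1) & 1, e & 1
    # Bit 5 comes from the prefix w digit {WW, WN, WM, WO}.
    wn, wm, wo = (prefix_w >> 2) & 1, (prefix_w >> 1) & 1, prefix_w & 1

    return {
        "P": p,
        "Y": y,
        "Rn": (wn << 5) | (en << 4) | n,
        "Rm": (wm << 5) | (em << 4) | m,
        "Ro": (wo << 5) | (eo << 4) | o,
    }
```

Without a prefix this only reaches R0..R31; the extra prefix bit is what gets to R32..R63 in this reading.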
For reasons, the instruction stream was partly defined in terms of 16-bit words. There are 16-bit ops in Baseline, but these are N/E in XG2.
Typically:
ZZnm //2R
ZZni //2RI (Imm4)
ZZnZ //1R
Znii //2RI (Imm8)
ZZii //1I (Imm8)
(Only 2R ops, usually limited to R0..R15)
Some (logically) 3R ops exist, with Ro/Rt hard wired to R0.
Where R0 was a designated "stomp" register.
In my case, I usually identify instruction blocks as FZ (F0, F1, F2, ...), which basically gives the P and Y fields as they would appear in a Base instruction.
In cases where there are no register fields, the corresponding E and W bits are typically assigned either to the Immediate Field (if present) or to the Opcode.
If an FF prefix is used, in an Immediate-form instruction, it will glue its immediate bits onto the base immediate form:
Imm5: 5+11+1 = 17
Can also turn some 3R ops into 3RI Imm17 forms.
Can note development path was like:
Initial (direct predecessor) ISA: 16-bit Ops, R0..R15
This was a 32-bit "BSR1 ISA"
Was sort of a redesigned SH based design,
trying to be made more competitive with MSP430 in code density.
Original BJX2 ISA: Only R0..R31
Where, BJX2 was initially a 64-bit BSR1 with 32-bit ops.
Also had 48-bit ops initially, but these went away.
Also borrowed a lot from BJX1:
Which was my highly mutilated version of SH-4
Where, SH-4 was the ISA used in the Sega Dreamcast, etc.
Later, dropped 48 bit ops and added FE and FF Jumbo Prefixes.
This change ate the encoding space the 48-bit ops had used.
The FE prefix allowed bigger immediate fields.
With FF, this allowed extending GPRs to 64.
But, with only a smaller extension to Immed fields.
Later, added the 32-bit XGPR encodings:
7wnm-ZeoZ //F0 (w={Wx, Wnmo})
9wnm-Zeii //F1 and F2 (w={Wx, Wnm, Z})
These allowed R0..R63 for F0, F1, and F2 blocks;
These reused a few "holes" in the 16-bit map.
Initially, had "won" over what would later become XG2:
Mostly in that this approach was backwards compatible.
Any other ops using R32..R63 needing to fall back to 64-bit ops.
Then added "RISC-V Mode", or allowing the CPU to run RISC-V code.
But, didn't get it semi-usable until much later.
Then added XG2, which used the same mechanism as RV Mode.
Revived the original idea for how to add R32..R63;
But, makes 16-bit ops N/E in this mode.
There isn't yet a clear winner of the Baseline+XGPR or XG2 fight:
Baseline+XGPR:
Better average code density (mostly due to 16-bit ops);
More complicated to decode;
Disp9 / Imm9 typically positive-only;
Direct Branch +/- 1MB;
Largest 3RI encoding: Imm57s
XG2:
Worse average code density;
Slightly better average performance;
Simpler and more consistent encoding scheme;
Slightly larger immediate values;
Signed Load/Store displacements;
Direct Branch +/- 8MB;
Largest 3RI encoding: Imm64
In the original form of the ISA, branches always had the LSB clear.
Now, LSB clear indicates a branch within the same ISA Mode.
Branching to an address with the LSB set encodes a branch potentially into a different ISA (the high 16 bits of the address describing the target mode and some captured status flags).
At present, the Link Register always has the LSB set (as do function pointers with "Function Pointer Tagging" enabled in the compiler).
This doesn't play cleanly with the use of TagRefs, but mixing function pointers with TagRefs isn't really a thing (except for lambdas or similar, but it is generally possible to tell them apart, because a tagged function pointer will have the LSB set and the high 8 bits clear, whereas a normal TagRef Object Pointer will have the LSB clear). Doesn't matter for C; would mostly matter for JS-like languages.
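A minimal sketch of the "tell them apart" check, in Python for illustration. It assumes 64-bit pointer values and takes "high 8 bits" to mean bits 63:56; `classify_pointer` and the returned labels are made up for the example, and the real TagRef rules have more cases than this.

```python
def classify_pointer(ptr):
    """Rough classification of a 64-bit value by LSB and high 8 bits.

    Assumption: 'high 8 bits' are bits 63:56 of a 64-bit pointer.
    """
    ptr &= 0xFFFFFFFFFFFFFFFF
    lsb_set = (ptr & 1) == 1
    high8 = ptr >> 56
    if lsb_set and high8 == 0:
        return "function"   # tagged function pointer
    if not lsb_set:
        return "object"     # normal TagRef object pointer
    return "other"          # LSB set with tag bits: not covered by this sketch
```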
...
But, yeah, otherwise, had recently been distracted mostly by working on a new font system for TestKern.
In effect, it is similar to TrueType fonts, but:
Custom format as the TTF format seemed needlessly complicated;
Mostly expresses glyphs as points encoded as 32-bit words;
Cases where font is "TTF only" are non-free, so doesn't matter.
Generally generating fonts from the UFO / GLIF system (XML based)
Around a 30x size reduction vs the ASCII XML format;
Much less memory requirements vs parsing XML into DOM trees;
File size is comparable with the TTF format.
Generates better looking glyphs when scaled up vs SDF fonts.
Also uses less memory than SDF fonts
    SDFs stored as 8bpp or 16bpp BMP images at 16x16 pixels/glyph.
Size reduction:
Not so much because of "amazing compression" (by storing each vertex as 32 bits), but more because storing each vertex as textual XML is bulky (and parsing XML eats memory). Was able to reuse an existing XML parser of mine for the converter tool, so it wasn't too much of an issue.
Basically, glyphs in this format are stored as tagged vertices, with a "NULL Command" marker signaling the end of each glyph. A 2-level page-table like structure is used to give the starting offset of each glyph (keyed using the UTF-16 codepoint). The whole file is basically a blob of 32-bit words, as this seemed like the "bare minimum" way to approach the format (the tagged vertex format also encodes bounding-boxes and a few other things).
Generally, vertex format:
(31:28): Major Tag (0=Command, 1=Vertex, 2=Start Vert, 3=End Vert)
(27:24): Tag Specific Bits (Eg: Line/Curve/Offcurve/...)
(23:12): Y (signed 12 bits)
(11: 0): X (signed 12 bits)
If the word is 0x00000000, this is the end of the glyph.
The other commands thus far are mostly just used to give the bounding-box and similar (same general format as the vertex).
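For illustration, a small Python sketch of packing and unpacking the vertex word laid out above, plus reading one glyph up to its 0x00000000 terminator. The function names are made up here, and the meaning of the tag-specific bits is left opaque, since only the bit positions are given above.

```python
def sign_extend12(v):
    """Interpret the low 12 bits of v as a signed value (-2048..2047)."""
    v &= 0xFFF
    return v - 0x1000 if v & 0x800 else v

def pack_vertex(major, sub, x, y):
    """Build one 32-bit word: major(31:28), sub(27:24), Y(23:12), X(11:0)."""
    if not (-2048 <= x <= 2047 and -2048 <= y <= 2047):
        raise ValueError("coordinate does not fit in signed 12 bits")
    return (((major & 0xF) << 28) | ((sub & 0xF) << 24) |
            ((y & 0xFFF) << 12) | (x & 0xFFF))

def unpack_vertex(word):
    """Return (major, sub, x, y) for one 32-bit word."""
    return ((word >> 28) & 0xF, (word >> 24) & 0xF,
            sign_extend12(word), sign_extend12(word >> 12))

def read_glyph(words, start):
    """Collect unpacked words from 'start' until the all-zero end marker."""
    out = []
    i = start
    while words[i] != 0:
        out.append(unpack_vertex(words[i]))
        i += 1
    return out
```

Usage would be along the lines of `read_glyph(words, offset)`, with `offset` coming from the lookup table.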
Temporarily considered using something like RIFF, but would have added a lot of complexity, so didn't bother. Used a 2-level lookup table, as 1-level would have burned 256K of almost entirely zeroes, and most other options would have been more effort. Current table is ~ 24K of mostly zeroes (for ~ 613 glyphs).
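A sketch of what such a 2-level lookup could look like, in Python for illustration. The 8/8 split of the 16-bit codepoint, the use of 0 as "not present", and all names are assumptions made for the example; the actual table layout in the file is not spelled out above.

```python
def build_two_level(glyph_offsets):
    """glyph_offsets: dict of codepoint -> nonzero word offset of the glyph.

    Returns (top, pages). top has 256 entries holding a 1-based page
    number (0 = no page); each page has 256 glyph offsets (0 = no glyph).
    """
    top = [0] * 256
    pages = []
    for cp in sorted(glyph_offsets):
        hi, lo = (cp >> 8) & 0xFF, cp & 0xFF
        if top[hi] == 0:
            pages.append([0] * 256)
            top[hi] = len(pages)
        pages[top[hi] - 1][lo] = glyph_offsets[cp]
    return top, pages

def lookup(top, pages, cp):
    """Return the glyph offset for codepoint cp, or None if absent."""
    page = top[(cp >> 8) & 0xFF]
    if page == 0:
        return None
    return pages[page - 1][cp & 0xFF] or None

def table_bytes(pages):
    """Size of the top table plus all pages, at 4 bytes per entry."""
    return 4 * 256 * (1 + len(pages))
```

For comparison, a flat table over all 16-bit codepoints is 65536 * 4 = 262144 bytes, which is the 256K figure above; under the assumed 8/8 split each table is 1K, so only the 256-codepoint ranges that actually contain glyphs cost anything.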
Possible space saving could be to compact each table, say:
(31:24): Current Table Index
(23: 0): Offset
With 0x00000000 signaling the end of each table (table lookup via linear search; TBD if "worth it"; Lookup table being ~ 14% of the total file size with the font used).
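The compacted table idea could be sketched like this (Python, illustration only). It assumes "Current Table Index" is the 8-bit index the entry would have had in the uncompacted table, and that offset 0 is never a valid target, since an index-0 entry with offset 0 would be indistinguishable from the terminator; `compact_lookup` is a made-up name.

```python
def compact_lookup(words, table_start, index):
    """Linear search of a compacted table starting at words[table_start].

    Each entry: (31:24) = table index, (23:0) = offset.
    An all-zero word ends the table. Returns the offset or None.
    """
    i = table_start
    while words[i] != 0:
        if (words[i] >> 24) & 0xFF == index:
            return words[i] & 0xFFFFFF
        i += 1
    return None
```

Worst case is a scan over every populated entry in the table, which is the "TBD if worth it" tradeoff against the space saved.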
Still unresolved issues:
  Have not yet gotten B-Splines working correctly, so the font looks off;
At present, the options are either "angular" or "potatoes".
Drawing glyphs is at present computationally expensive.
Current glyph drawing algorithm:
For each pixel in the glyph being drawn:
Trace a line in one of 4 directions;
Check line against contour edges,
for intersecting edges,
counting which side the starting point is on.
If positive>=negative, outside
Else, inside.
Use the majority conclusion for pixel color.
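A rough Python sketch of the per-pixel test described above, for straight edges only. Several details are assumptions, not taken from the description: the 4 directions are taken as +X, -X, +Y, -Y; "side" is the sign of a 2D cross product; the contour is wound so that an interior point lands on the negative side; and a 2-2 tie between directions counts as outside. Names are made up.

```python
DIRECTIONS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def side_counts(px, py, edges, dx, dy):
    """Count edges hit by an axis-aligned ray from (px, py), split by
    which side of the edge the starting point lies on."""
    pos = neg = 0
    for (x0, y0), (x1, y1) in edges:
        if dx != 0:
            # Horizontal ray: edge must straddle y = py (half-open rule).
            if (y0 <= py) == (y1 <= py):
                continue
            hit = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
            if (hit - px) * dx <= 0:
                continue  # intersection is behind the start point
        else:
            # Vertical ray: edge must straddle x = px.
            if (x0 <= px) == (x1 <= px):
                continue
            hit = y0 + (px - x0) * (y1 - y0) / (x1 - x0)
            if (hit - py) * dy <= 0:
                continue
        cross = (x1 - x0) * (py - y0) - (y1 - y0) * (px - x0)
        if cross > 0:
            pos += 1
        elif cross < 0:
            neg += 1
    return pos, neg

def pixel_inside(px, py, edges):
    """Majority vote over the 4 ray directions (ties count as outside)."""
    votes = 0
    for dx, dy in DIRECTIONS:
        pos, neg = side_counts(px, py, edges, dx, dy)
        if neg > pos:          # positive >= negative means outside
            votes += 1
    return votes * 2 > len(DIRECTIONS)
```

This makes the cost fairly obvious: every pixel does 4 passes over every edge in the glyph.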
Possible faster versions:
Build a BSP and try to make point lookups faster;
Draw outline edges in a bitmap,
then determine points and flood-fill for the rest.
Could significantly reduce the number of point-checks needed.
Or, flag "inside" edges and potentially rely entirely on flood-fill.
I guess, potentially, something like Bresenham's followed by flood-fill would make more sense for "80s and 90s tech" versus using a whole lot of geometric line-line checks for each pixel to draw each glyph (which is painfully slow).
Note that the glyph outlines tend to be non-convex and often self-intersecting, which does complicate matters (can't be handled like with normal polygon rasterization).
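A sketch of the "draw edges, then flood-fill" variant, in Python for illustration. Outline pixels are drawn with a common integer Bresenham formulation; each remaining connected region then gets a single point-check on its seed pixel, and the result is flood-filled across the region. The `classify` callback stands in for whatever inside/outside test is used, and all names here are made up. Whether one check per region is actually safe for self-intersecting outlines is not something this sketch settles.

```python
from collections import deque

EDGE, UNKNOWN = 2, -1

def bresenham(x0, y0, x1, y1):
    """Yield the integer pixels on the line from (x0, y0) to (x1, y1)."""
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        yield x0, y0
        if x0 == x1 and y0 == y1:
            return
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy

def rasterize(edges, width, height, classify):
    """Return (grid, checks): grid cells are EDGE, 1 (inside) or 0 (outside);
    checks is how many times classify() had to be called."""
    grid = [[UNKNOWN] * width for _ in range(height)]
    for (x0, y0), (x1, y1) in edges:
        for x, y in bresenham(x0, y0, x1, y1):
            if 0 <= x < width and 0 <= y < height:
                grid[y][x] = EDGE
    checks = 0
    for seed_y in range(height):
        for seed_x in range(width):
            if grid[seed_y][seed_x] != UNKNOWN:
                continue
            # One point-check per region, then spread the answer.
            value = 1 if classify(seed_x, seed_y) else 0
            checks += 1
            grid[seed_y][seed_x] = value
            queue = deque([(seed_x, seed_y)])
            while queue:
                x, y = queue.popleft()
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and grid[ny][nx] == UNKNOWN):
                        grid[ny][nx] = value
                        queue.append((nx, ny))
    return grid, checks
```

For a simple closed outline this drops the number of point-checks from one per pixel to one per region (two for a plain box: outside and inside).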
For now, have been testing with the "Cantarell" font. Would have considered "Noto Sans" except that it is using a different format (an ASCII ".glyphs" format). Both have the same (semi annoying) license terms. For the latter, I would have needed to figure out the format and write a custom parser (format vaguely resembles JSON, but looks more annoying; and by this point, almost better off writing something to decode TTF files and using this).
Not yet found a clearly better option, wanting:
Covers at least CP-1252 range;
Very permissive terms;
Generic and readable, preferably mostly sans-serif;
Except in cases where it leads to visual ambiguity.
Does not have any 1Il| ambiguity;
Preferably in UFO / GLIF format.
Though, one possibility could be to use a "marching squares" type approach on my existing bitmap font (as an alternative to the use of SDF). A higher-resolution version already needed to be made for SDF generation, and this could potentially be auto-traced to generate a geometric version.
Note that most likely, the 6x8 and 8x8 text rendering will continue using a bitmap font:
6x7, 6x8, 6x9, 7x7, 7x8, 7x9: Can pad a 5x6 pixel cel;
8x8, 8x9, 9x10, ...: Can pad an 8x8 pixel cel;
Bigger, can use SDF or geometric fonts;
Mostly relevant to possible larger text.
Geometric fonts matter if one wants to turn it into 2.5D or 3D, but 3D is likely to need CSG and similar to turn it into a usable form (triangles). Didn't want to bother with CSG and triangles for plain text drawing though.
Say, if one wants to do the whole "spinning 3D text logo" thing, ideally one needs a geometric font.
...
Stefan