On 3/7/2025 5:40 PM, Robert Finch wrote:
But, in Verilog, it is not uncommon to go well outside of the usual 64-bit limit that C typically imposes
Yes, all kinds of hardware structures use bit vectors >64. They also tend to have a reasonable upper limit as muxing from a bit vector can be expensive.
In my own Verilog code, I rarely go over 128 bit, but a few 512 bit vectors are used.
This leaves 512 or 1024 as the lower-end of a vector-size limit.
Something like 32kb is well over this.
The way types are encoded has an implicit limit of 4kb before it needs to go to an overflow type.
Say:
  (11:0): Base Type
    000..0FF: Primitive Types
    100..FFF: Complex/Structure Types
  (15:12): Pointer Indirection Level or Array/Ref Type
  (27:16): Array Size (also used for _BitInt size)
  (31:28): Type-Type Tag
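As a rough sketch (hypothetical names, not the actual BGBCC code), packing and unpacking such a 32-bit type word could look like:

uint32_t ty, base, ind, asz, tag;
...
//pack: base (11:0), indirection (15:12), array size (27:16), tag (31:28)
ty=(base&0xFFF)|((ind&15)<<12)|((asz&0xFFF)<<16)|((tag&15)<<28);
//unpack
base=ty&0xFFF;      ind=(ty>>12)&15;
asz=(ty>>16)&0xFFF; tag=(ty>>28)&15;

The 4kb _BitInt limit then falls out of the 12-bit array-size field (2^12 = 4096).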
There is a type that has larger arrays (up to 1M elements), but only supports a subset of the primitive types.
Some primitive types may appear multiple times, differing in terms of things like alignment or volatile status.
Another variant has larger type fields, but only supports small arrays (this case mostly comes up once the compiler gets beyond 3840 struct/union/etc types; this can actually happen in practice, though one big offender here is "GL/gl.h" defining a crapton of function-pointer types...).
In other cases, the type is expressed as an index into an array of types whose type could not be described in one of the 32 bit formats.
But, "32kb" in a single bit-vector should probably be enough for anyone...
I would be tempted to allow an 'int' (size_t?) to be used for the size.
I had a 64-bit value field intended to hold an immediate value for a 3AC operator, so I divided it into 3 parts for this operator.
3x 16b made sense as it was easy to unpack into 3 signed values (for "technical reasons" also needs to be able to express negative values).
If needed, could switch to 1M as a limit (64/3 is 21 bits).
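Say, something like (just a sketch of the general idea, not the actual encoding code):

uint64_t imm;
int sa, sb, sc;
...
//pack three signed 16-bit fields into the low 48 bits
imm=((uint64_t)(uint16_t)sa)|((uint64_t)(uint16_t)sb<<16)|((uint64_t)(uint16_t)sc<<32);
//unpack, with the casts restoring the sign
sa=(int16_t)(imm    );
sb=(int16_t)(imm>>16);
sc=(int16_t)(imm>>32);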
Can note that BGBCC can express larger literals (such as 128-bit integers) but generally does so by encoding them as lookup table indices (broken up into 64-bit chunks).
To go much bigger, would likely need to use multi-level indirection (or switch to a different strategy for expressing large constants).
The variable fields can also express values:
  temp/argument/local/global:
    24-bit ID
      (12b index + 12b version for locals;
       24-bit global-ID for globals)
    32-bit type
    8-bit tag
  string literals:
    similar format, just encoding an index into a string table
  integer literals:
    32-bit payload, 16-bit type (12b base, 4b level)
  long/double literals:
    56-bit value
    (values that don't fit here are encoded as lookup table indices)
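As a sketch, the temp/argument/local/global case packs into 64 bits roughly like (hypothetical field order; the actual layout may differ):

uint64_t vfi;
uint32_t vid, vty, vtag;
...
vfi=((uint64_t)(vid&0xFFFFFF))|((uint64_t)vty<<24)|((uint64_t)(vtag&0xFF)<<56);
vid =(uint32_t)(vfi&0xFFFFFF);  //24-bit ID
vty =(uint32_t)(vfi>>24);       //32-bit type
vtag=(uint32_t)(vfi>>56);       //8-bit tag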
...
But, as such, memory use is still an issue.
Say, don't want to use lookup tables when this can be avoided, as they require memory.
But, can't really go much bigger than 32-bits for types, or 64-bits for variable IDs, as this would eat a significant amount of memory in a part of the compilers that already eats lots of RAM.
Similarly, don't want to increase the size of the 3AC operation structure, ...
For software, bit vectors could be larger than for hardware. For example, I used a bit-pair vector to hold a ternary value representing memory pages for something called a page-allocation-map (PAM), which tracks which pages are allocated or free, and the end of an allocation. There could be a lot of pages in a large memory system. Scanning the bit-pair vector could be done mostly a word at a time. The same thing could be done for disk pages.
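For illustration, a minimal sketch of that sort of bit-pair lookup (hypothetical names; 32 page entries per 64-bit word):

uint64_t *pam;
size_t i;
int st;
...
st=(int)((pam[i>>5]>>((i&31)*2))&3);  //read 2-bit state of page i
//set page i to state st (0..3)
pam[i>>5]=(pam[i>>5]&~(3ULL<<((i&31)*2)))|((uint64_t)(st&3)<<((i&31)*2));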
Possibly, though I am not necessarily imagining that Verilog or similar would make sense as a generic software programming language.
And, the bitfield stuff as-is wouldn't make sense for large bit arrays (it only allows constant bit offsets, whereas most bitmap use-cases tend to use a variable bit index).
I guess, while a person could do something like (in C):
_BitInt(1048576) bmp;
_Bool b;
int i;
...
b=(bmp>>i)&1; //*blarg* (shift here would be absurdly expensive)
This is likely to be rare vs more traditional strategies, say:
uint64_t *bmp;
int b, i;
...
b=(bmp[i>>6]>>(i&63))&1;
As well as the traditional strategy being a whole lot more efficient in this case...
I guess the case could be made for a generic dense bit array.
Though, an open question is how one would define it in a way that is consistent with the existing semantic rules.
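At the library level it is easy enough, say (a sketch, not a proposal for core-language semantics):

typedef struct { size_t n; uint64_t *w; } DenseBitArray;
DenseBitArray *arr;
size_t i; int b;
...
b=(arr->w[i>>6]>>(i&63))&1;      //get bit i
arr->w[i>>6]|= 1ULL<<(i&63);     //set bit i
arr->w[i>>6]&=~(1ULL<<(i&63));   //clear bit i

The harder part is what a first-class version of this would look like at the language level.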
Note that a traditional array in Verilog is more like:
reg bitArray[1023:0];
But, this represents something conceptually different from a large bit vector:
reg [1023:0] bitVec;
Even if it still allows:
b=bitVec[idx];
Hmm...