Re: "The provenance memory model for C", by Jens Gustedt

Subject : Re: "The provenance memory model for C", by Jens Gustedt
From : cr88192 (at) *nospam* gmail.com (BGB)
Newsgroups : comp.lang.c
Date : 13 Jul 2025, 21:10:36
Organization : A noiseless patient Spider
Message-ID : <10513rs$2uqg3$1@dont-email.me>
References : 1 2 3 4 5 6 7 8 9
User-Agent : Mozilla Thunderbird
On 7/12/2025 7:30 PM, Chris M. Thomasson wrote:
On 7/11/2025 1:48 AM, David Brown wrote:
On 11/07/2025 04:09, BGB wrote:
On 7/10/2025 4:34 AM, David Brown wrote:
On 10/07/2025 04:28, BGB wrote:
On 7/9/2025 4:41 AM, David Brown wrote:
On 09/07/2025 04:39, BGB wrote:
On 7/2/2025 8:10 AM, Kaz Kylheku wrote:
On 2025-07-02, Alexis <flexibeast@gmail.com> wrote:
>
...
>
>
[...]
It is worth remembering that game code (especially commercial game code) is seldom written with a view to portability, standards correctness, or future maintainability.  It is written to be as fast as possible using the compiler chosen at the time, to be built and released as a binary in the shortest possible time-to-market.
 Love this one:
 https://forums.parallax.com/discussion/147522/dog-leg-hypotenuse-approximation
 [...]
I looked at the post and initially didn't recognize it as I didn't use the same term (I usually called it "distance approximation").
But, yeah, variations of the first algorithm were used frequently in a lot of this code. For example, it was the main way of calculating distance in Wolf3D and ROTT.
Not sure about Doom, would need to find the relevant code, but wouldn't be surprised.
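For reference, the usual form of this "dog-leg" approximation (a generic octagonal-distance sketch, not the exact code from Wolf3D or ROTT) looks something like:

```c
#include <stdlib.h>

/* Common "dog-leg" hypotenuse approximation:
   dist ~= max + min/2 - min/4, where max/min are the larger and
   smaller of |dx| and |dy|. Avoids sqrt() at the cost of a few
   percent of error; variants drop the -min/4 refinement. */
int approx_dist(int dx, int dy)
{
    dx = abs(dx);
    dy = abs(dy);
    int mn = dx < dy ? dx : dy;
    int mx = dx < dy ? dy : dx;
    return mx + (mn >> 1) - (mn >> 2);
}
```

For a 3-4-5 triangle this gives exactly 5; the worst-case error stays within a few percent, which was plenty for enemy-activation and audibility checks.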
In terms of coding style, ROTT had considerable "hair".
In my initial porting effort, spent quite a long time trying to deal with all of the places where memory access was going out of bounds (though, in many cases, just ended up masking the access by the size of the array or similar; or in some cases rewriting the logic to not depend on particular out-of-bounds access behaviors).
Also had to deal with the mess that a lot of the drawing code depended on the specific way they were using the VGA hardware:
320x200 on screen, but 384x200 in memory, but divided into 4 planes of 96x200 (and doing tricks with the plane mask register).
Kinda needed to write wrappers to fake a lot of this crap.
But, it seemed like Apogee / 3D Realms liked this mode, vs id mostly staying with the standard 320x200 256-color VGA mode (which can be mapped directly to a 64K array).
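A sketch of the kind of wrapper needed for that planar layout (function name hypothetical): mapping an (x,y) pixel in the 384-wide virtual framebuffer to a VGA plane number (for the plane-mask register) and a byte offset within the 96x200 plane:

```c
/* Map (x,y) in a 384x200 virtual screen to VGA plane + byte offset,
   assuming 4 planes of 96x200, one byte per pixel per plane.
   This is a sketch of the general Mode-X-style addressing, not
   ROTT's actual code. */
void vga_addr(unsigned x, unsigned y, unsigned *plane, unsigned *offs)
{
    *plane = x & 3;              /* which of the 4 planes */
    *offs  = (x >> 2) + y * 96;  /* offset within that plane */
}
```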
FWIW, my most recent 3D engine also used this strategy for distance math in many places.
Can note that in this 3D engine, world coordinates are primarily expressed in a packed fixed-point form:
   (23: 0): X coord, 16.8
   (47:24): Y coord, 16.8
   (63:48): Z coord, 8.8
With the base unit roughly 1 meter, albeit informally subdivided into 32 "inches", with the exact handling of the 25% error on the size of the inch relative to the "standard inch" being subject to interpretation.
Though, for sake of hand waving, in 3D engines one can also say that meters and yards are "basically the same thing" as well (if one redefines the base unit as "yard", the inch discrepancy drops to 12.5%...).
Though, in my case, I just sorta instead used "decimal inches", where there is only around a 4% discrepancy between 1/32 of a meter, and the decimal inch.
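Packing and unpacking that 64-bit layout might look like this (helper names are mine, and whether the fields are treated as signed is an assumption; the masking on pack also gives modulo wrap-around in X and Y for free):

```c
#include <stdint.h>

/* X and Y: 16.8 fixed point in 24 bits; Z: 8.8 in 16 bits. */
static uint64_t pack_xyz(int32_t x, int32_t y, int32_t z)
{
    return ((uint64_t)((uint32_t)x & 0xFFFFFF)      ) |
           ((uint64_t)((uint32_t)y & 0xFFFFFF) << 24) |
           ((uint64_t)((uint32_t)z & 0xFFFF  ) << 48);
}

/* Sign-extend the 24-bit fields back to 32 bits. */
static int32_t unpack_x(uint64_t c)
{
    return (int32_t)((uint32_t)(c & 0xFFFFFF) << 8) >> 8;
}
static int32_t unpack_y(uint64_t c)
{
    return (int32_t)((uint32_t)((c >> 24) & 0xFFFFFF) << 8) >> 8;
}
static int32_t unpack_z(uint64_t c)
{
    return (int32_t)(int16_t)(c >> 48);  /* sign-extend 16 bits */
}
```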
Another factor is the size of the player:
   In Minecraft, IIRC "Steve" is 1.85m tall, this is atypical (~ 6'2").
   In my 3D engine, decided to adjust it so the player is 1.60m tall (5'4").
This small difference has a notable effect on how big the blocks look.
Note that this engine (like my previous 3D engine) has a world that is modulo in X and Y.
The previous engine had used 20.12 fixed-point coords for X/Y/Z, with a wrap-around in all 3 axes; though the terrain generation was planar (the sky limit was more part of the terrain generator, only generating a world 1 region tall).
Actually, outdoor spaces were partly "faked", as the 3D engine actually treated the world like an infinite cave; the terrain generation just put a "sky" block at Z=255, which created an overworld-like situation.
Initially, the sky handling was similar to the Quake engines (the skybox was drawn onto geometry), but then switched to a "generic" skybox keyed off the "sky at Z=255" block (and there was support for multiple skies, possibly to allow an effect similar to the "dimensions" in Minecraft).
Actually, this isn't too far from how world parameters were defined in the ROTT engine, where the blocks near X=0,Y=0 were used to define the parameters for the rest of the map (if nothing was there, it gave a world similar to that in Wolfenstein 3D).
In my own Minecraft playing experience, I can say a 64km world limit is still plenty large for practical gameplay. Nevermind if the current engine has a world height limit of 128, which is a little less than Minecraft.
Granted, 64km is still a lot less than ~ 1048km.
Partly the reason for the design choices was to try to make the engine need less RAM, so that it was more practical to fit in the RAM limits of the FPGA boards I had (eg, a 128MB RAM chip).
To fit a Minecraft like experience in under, say, 80MB of RAM, requires compromises.
One was region size, where reducing the region size from a 256m cube to a 128m cube was an 8x reduction of the storage size of each region.
And generally for shorter draw distances (~ 64 .. 96), roughly 9 regions will be loaded at a time.
At ~ 128m to 192m, the number of loaded regions increases to 29, and at 256m, may increase to around 51 regions.
I ended up staying with the same chunk size (16x16x16).
In both engines, blocks exist as a table of unique blocks with a 4 or 8 bit index table (2K or 4K for each loaded chunk).
Currently each chunk stores its own table of blocks (32 bits in the newer engine), but in theory could merge the blocks per region (with a 16-bit number), with an assumption of fewer than 64K unique blocks per region (would mostly hold).
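The palette-indexed chunk scheme can be sketched like this (structure and names are hypothetical, not the engine's actual code; shown with the 8-bit index case):

```c
#include <stdint.h>

/* A 16x16x16 chunk storing a table of unique block values plus a
   per-cell index. With <=256 unique blocks, 8-bit indices suffice
   (4K of index data per chunk); <=16 unique blocks would allow
   4-bit indices (2K). */
typedef struct {
    uint32_t blocks[256];      /* table of unique blocks */
    uint16_t nblocks;          /* number of used table entries */
    uint8_t  index[16*16*16];  /* per-cell index into blocks[] */
} Chunk;

static uint32_t chunk_get(const Chunk *c, int x, int y, int z)
{
    return c->blocks[c->index[(z*16 + y)*16 + x]];
}
```

The win is that most cells repeat a handful of block values, so the 4-byte block representation is stored once per unique block rather than once per cell.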
In both engines, each chunk is LZ compressed:
   Current engine uses RP2, but this is in a similar category to LZ4.
   Prior engine had used an LZ+AdRice entropy coder.
The main reason for the change is that AdRice adds a lot of additional CPU performance cost on a 50MHz CPU, vs a more modest effect on compression.
Can note that most of the "variation" between blocks is typically the same few block types but usually at different lighting levels. In the absence of light sources, many chunks fit into a limit of 16 or fewer unique blocks.
Where, block lighting:
   4 bits: Sky light intensity;
     At 15, travels downward at full strength;
       This is understood as direct view of the sky.
     Drops off by 1 for each meter.
     At 14 or less, it drops off by 1 in all directions.
     Sunlight is always white in open air.
   4 bits: Block light intensity;
     Say, 15 at a light source, dropping off by 1 each meter.
     Block lighting is the max of the contribution from adjacent blocks
   4 bits: Block light color
     Color applied to the block lighting.
     Derived from a 16 color palette.
This is a similar scheme to that in Minecraft, just with the addition of an explicit block light color. Note that these fields only apply to transparent blocks (but may instead be metadata for opaque blocks).
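Packed together, the three 4-bit light fields fit in 12 bits; a sketch (the bit order here is my assumption, not the engine's actual layout), along with the "max of neighbor minus 1" propagation rule:

```c
#include <stdint.h>

/* Pack/unpack the 4-bit sky, block-light, and light-color fields. */
static uint16_t light_pack(int sky, int blk, int col)
{
    return (uint16_t)((sky & 15) | ((blk & 15) << 4) | ((col & 15) << 8));
}
static int light_sky(uint16_t l) { return l & 15; }
static int light_blk(uint16_t l) { return (l >> 4) & 15; }
static int light_col(uint16_t l) { return (l >> 8) & 15; }

/* Block light at a cell: the max of the cell's own source level and
   (neighbor level - 1) over the 6 adjacent blocks. */
static int light_propagate(int self,
                           int n0, int n1, int n2,
                           int n3, int n4, int n5)
{
    int n[6] = { n0, n1, n2, n3, n4, n5 };
    int m = self;
    for (int i = 0; i < 6; i++)
        if (n[i] - 1 > m)
            m = n[i] - 1;
    return m;
}
```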
Contrast:
Minecraft had used Deflate, but for chunk compression, Deflate has a high overhead. Minecraft also stores full block data for every block, rather than using a "block palette".
My first 3D engine had used the same RLE compression scheme as Wolf3D and ROTT, but this had left something to be desired.
Where, say, Wolf3D and ROTT both used flat words containing multiple byte planes (IIRC):
   Plane 0: Block type at each location.
     0: Empty Space
     1-239: Various types of wall blocks.
       Typically, index number mapped 1:1 to wall texture.
       Rule breaks in ROTT for multi-state or animated walls.
   Plane 1: Entity Type
     1-239: Various types of entity (such as enemy or item).
   Plane 2: Entity Attribute
     Mostly encoded facing direction and spawn difficulty and similar.
Wolf3D world size was 64x64, whereas ROTT was 128x128.
   In both cases, block size was ~ 2 meters.
The worlds were stored in an RLE compressed form, where:
   A magic tag byte was defined externally
     (IIRC, usually 0xFF or similar for Wolf3D).
     In ROTT, it was defined per map plane IIRC (vs hard coded).
   Most bytes were passed through as-is;
   If the tag byte is seen:
     Tag Length Value
   An RLE run of 1 byte could be used to escape-code the tag byte.
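A decoder for that Tag/Length/Value scheme might look like this (a sketch of the scheme as described, not the actual Wolf3D/ROTT code):

```c
#include <stddef.h>
#include <stdint.h>

/* Decode Tag/Length/Value RLE: literal bytes pass through; the tag
   byte introduces a (length, value) run. A run of length 1 serves to
   escape-code the tag byte itself. Returns bytes written to dst. */
size_t rle_decode(const uint8_t *src, size_t srclen,
                  uint8_t *dst, size_t dstlen, uint8_t tag)
{
    size_t i = 0, j = 0;
    while (i < srclen && j < dstlen) {
        uint8_t b = src[i++];
        if (b == tag) {
            if (i + 2 > srclen)
                break;              /* truncated run */
            uint8_t len = src[i++];
            uint8_t val = src[i++];
            while (len-- && j < dstlen)
                dst[j++] = val;
        } else {
            dst[j++] = b;           /* literal byte */
        }
    }
    return j;
}
```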
Can note that an LZ compression scheme does notably better here vs a simple RLE scheme.
Can note also:
Wolf3D and ROTT had used ray-casting to determine visibility, where they would raycast along the map until they hit something (with a raycast for each horizontal screen pixel).
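That per-column raycast amounts to a grid walk; a generic 2D grid-DDA sketch (my own illustration of the technique, not the original fixed-point code):

```c
#include <math.h>

/* Walk a ray through a w x h grid of cells until a nonzero cell is
   hit, stepping one cell boundary at a time (grid DDA). Returns the
   hit cell, or (-1,-1) if the ray leaves the map. */
typedef struct { int x, y; } Cell;

Cell raycast(const unsigned char *map, int w, int h,
             double px, double py, double dx, double dy)
{
    int cx = (int)px, cy = (int)py;
    int sx = dx > 0 ? 1 : -1, sy = dy > 0 ? 1 : -1;
    double tdx = dx != 0 ? fabs(1.0 / dx) : 1e30;  /* t per x-step */
    double tdy = dy != 0 ? fabs(1.0 / dy) : 1e30;  /* t per y-step */
    double tx = dx > 0 ? (cx + 1 - px) * tdx : (px - cx) * tdx;
    double ty = dy > 0 ? (cy + 1 - py) * tdy : (py - cy) * tdy;

    while (cx >= 0 && cx < w && cy >= 0 && cy < h) {
        if (map[cy*w + cx])
            return (Cell){cx, cy};
        if (tx < ty) { tx += tdx; cx += sx; }  /* cross x boundary */
        else         { ty += tdy; cy += sy; }  /* cross y boundary */
    }
    return (Cell){-1, -1};
}
```

Wolf3D-style renderers run one such walk per screen column, then use the hit distance to scale the wall slice.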
My newer 3D engine had used raycasts for visibility determination.
   Whenever a raycast hit a block, it could take note of what block was hit and where. So, it would build a visible shell for the currently visible parts of the world. However, couldn't do this per-pixel in full 3D, so it mostly raycasts in a spherical region and jittering the ray vectors to try to improve coverage (something drops off the list if it wasn't hit recently enough).
This doesn't scale well with draw distance though; for use on PCs (with a larger draw distance), I have ended up with a modified approach:
Per block ray-casting is only used for shorter distances (64 meters);
For more distant stuff (medium distance), it is per-chunk and using cached chunk vertex arrays (loading the chunks). This is more like the approach Minecraft uses, just in my case, used for things further than 64 meters.
For higher distances (say, past 192 meters), it builds an "outer shell" representation of the region. Here, the region is represented as 6x 128x128 faces, each encoding both the block type at the location, and the distance of the block from the edge of the region (16-bit block type index, 8 bit distance). The limitations of this representation are less obvious from a distance, but it does save on RAM vs actually loading the chunks (it can only represent things that are line-of-sight along an axis from the edge of the region).
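As a rough sketch of the storage such a shell implies (names and exact packing are my assumption):

```c
#include <stdint.h>

/* Outer-shell region: 6 faces of 128x128 cells, each cell holding a
   16-bit block type index and an 8-bit depth (distance of the first
   visible block from the region edge along that face's axis).
   Split into parallel arrays to avoid struct padding. */
typedef struct {
    uint16_t block[6][128][128];  /* block type at each face cell */
    uint8_t  depth[6][128][128];  /* distance from edge, in blocks */
} RegionShell;

/* Total: 6 * 128 * 128 * (2 + 1) = 294912 bytes (~288 KB) per region,
   versus multiple MB for the fully loaded chunk data. */
```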
One possibility (for greater performance) could be if this were turned into textures with rendering done via a parallax shader. Currently this engine doesn't use shaders though (it is operating roughly within the limits of the OpenGL 1.3 feature-set; so no shaders or similar).
I can push it up to around 384 meters before performance goes to crap. But, this is acceptable to me, as Minecraft with a 24 chunk draw distance also performs like crap...
Have noted that my 3D engine does still seem to use less RAM relative to draw distance if compared with Minecraft.
Most of the RAM use goes into the vertex arrays though.
   Sadly, the steep cost of vertex arrays is unavoidable here.
Well, unless each face were drawn one-at-a-time using immediate mode (glBegin/glEnd for each polygon), but this would be unacceptably slow (as it scales very poorly).
Note, not using VBOs partly as these are also outside the feature range of OpenGL 1.3, ...
Can note that it seems like OpenGL somewhat prefers if vertex array contents don't change. If one has large vertex arrays with rapidly changing contents, then the OpenGL backend seems to allocate absurd amounts of RAM. Works better if one has a larger number of more moderate size vertex arrays with static contents.
Seemingly OpenGL likely caches stuff, and checks whether or not a repeat draw is the same vertex array with the same contents.
...
