Re: Constant Stack Canaries

Subject: Re: Constant Stack Canaries
From: cr88192 (at) *nospam* gmail.com (BGB)
Newsgroups: comp.arch
Date: 31 Mar 2025, 07:34:14
Organization: A noiseless patient Spider
Message-ID: <vsdd20$3jc0i$1@dont-email.me>
References: 1 2 3
User-Agent: Mozilla Thunderbird
On 3/30/2025 3:14 PM, MitchAlsup1 wrote:
> On Sun, 30 Mar 2025 17:47:59 +0000, BGB wrote:
>
>> On 3/30/2025 7:16 AM, Robert Finch wrote:
>>> Just got to thinking about stack canaries. I was going to have a special
>>> purpose register holding the canary value for testing while the program
>>> was running. But I just realized today that it may not be needed. Canary
>>> values could be handled by the program loader as constants, eliminating
>>> the need for a register. Since the value is not changing while the
>>> program is running, it could easily be a constant. This may require a
>>> fixup record handled by the assembler / linker to indicate to the loader
>>> to place a canary value.
>>>
>>> Prolog code would just store an immediate to the stack. On return a TRAP
>>> instruction could check for the immediate value and trap if not present.
>>> But the process seems to require assembler / linker support.
>>>
>>
>> They are mostly just a normal compiler feature IME:
>>    Prolog stores the value;
>>    Epilog loads it and verifies that the value is intact.
>
> Agreed.
>
>> Using a magic number
>
> Remove excess words.
>

It is possible that the magic number could have been generated by the CPU itself, or specified on the command-line by the user, or, ...
Rather than, say, the compiler coming up with a magic number for each function (say, based on a hash function or "rand()" or something).

>> Nothing fancy needed in the assemble or link stages.
>
> They remain blissfully ignorant--at most they generate the magic
> number, possibly at random, possibly per link-module.
>

Yes.

>> In my case, canary behavior is one of:
>>    Use them in functions with arrays or similar (default);
>>    Use them everywhere (optional);
>>    Disable them entirely (also optional).
>>
>> In my case, it is only checking 16-bit magic numbers, but mostly because
>> a 16-bit constant is cheaper to load into a register in this case
>> (single 32-bit instruction, vs a larger encoding needed for larger
>> values).
>>
....
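As a rough sketch of the prolog/epilog pattern discussed above (this is not BGBCC's actual output; the frame layout, the CANARY_MAGIC value, and the function name are made up for illustration), the check can be modeled in plain C with an explicit frame struct. A real compiler does this in generated code; the struct is used here because actually overrunning a local array would be undefined behavior.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-bit magic; a compiler could pick one per build or per function. */
#define CANARY_MAGIC 0x5AC3u

/* Explicit model of a stack frame: a local array followed by the canary slot. */
struct frame {
    unsigned char buf[16];
    uint16_t      canary;
};

/* Copies n bytes into the frame starting at buf; n > 16 spills into the canary.
 * Returns 1 if the canary survived, 0 if the epilog check would trap. */
int run_with_canary(const unsigned char *src, size_t n)
{
    struct frame f;

    f.canary = CANARY_MAGIC;                /* prolog: store the magic value */

    if (n > sizeof f)                       /* keep the model itself in bounds */
        n = sizeof f;
    memcpy((unsigned char *)&f, src, n);    /* "buggy" body: ignores sizeof f.buf */

    return f.canary == CANARY_MAGIC;        /* epilog: reload and compare */
}
```

Nothing here needs help from the assembler or linker; the magic is just an immediate in the function body.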
( Well, anyways, going off on a tangent here... )
Meanwhile, in my own goings on... It took way too much effort to figure out the specific quirks in the RIFF/WAVE headers to get Audacity to accept IMA-ADPCM output from BGBCC's resource converter.
It was like:
   Media Player Classic: Yeah, fine.
   VLC Media Player: Yeah, fine.
   Audacity: "I have no idea what this is...".
Turns out Audacity is not happy unless:
   The size of the 'fmt ' chunk is 20 bytes, cbSize is 2,
     with an additional 16 bit member specifying the samples per block;
   There is a 'fact' chunk, specifying the overall length of the WAV in samples.
Pretty much everything else accepted the 16-byte PCMWAVEFORMAT with no 'fact' chunk (and calculating the samples per block based on nBlockAlign).
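For reference, a minimal sketch of the header layout Audacity seems to want, assuming the usual WAVEFORMATEX field order and format tag 0x0011 (IMA/DVI ADPCM). The function name and parameter list are made up for this example, and samples-per-block is passed in by the caller rather than derived here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian store helpers; each returns the advanced write pointer. */
static unsigned char *put_u16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)(v >> 8);
    return p + 2;
}

static unsigned char *put_u32(unsigned char *p, uint32_t v)
{
    p = put_u16(p, (uint16_t)(v & 0xFFFF));
    return put_u16(p, (uint16_t)(v >> 16));
}

static unsigned char *put_tag(unsigned char *p, const char *tag)
{
    memcpy(p, tag, 4);
    return p + 4;
}

/* Writes RIFF + 20-byte 'fmt ' + 'fact' + 'data' chunk headers (60 bytes).
 * samples_per_block must be nonzero. Returns the number of bytes written. */
size_t wav_adpcm_header(unsigned char out[60],
                        uint16_t channels, uint32_t rate, uint16_t bits,
                        uint16_t block_align, uint16_t samples_per_block,
                        uint32_t total_samples, uint32_t data_bytes)
{
    unsigned char *p = out;
    uint32_t avg = (uint32_t)(((uint64_t)rate * block_align) / samples_per_block);

    p = put_tag(p, "RIFF");
    p = put_u32(p, 52 + data_bytes);    /* file size minus the first 8 bytes */
    p = put_tag(p, "WAVE");

    p = put_tag(p, "fmt ");
    p = put_u32(p, 20);                 /* 18-byte WAVEFORMATEX + 2 extra bytes */
    p = put_u16(p, 0x0011);             /* wFormatTag: IMA/DVI ADPCM */
    p = put_u16(p, channels);           /* nChannels */
    p = put_u32(p, rate);               /* nSamplesPerSec */
    p = put_u32(p, avg);                /* nAvgBytesPerSec */
    p = put_u16(p, block_align);        /* nBlockAlign */
    p = put_u16(p, bits);               /* wBitsPerSample */
    p = put_u16(p, 2);                  /* cbSize */
    p = put_u16(p, samples_per_block);  /* wSamplesPerBlock */

    p = put_tag(p, "fact");
    p = put_u32(p, 4);
    p = put_u32(p, total_samples);      /* length of the WAV in samples */

    p = put_tag(p, "data");
    p = put_u32(p, data_bytes);

    return (size_t)(p - out);
}
```

The sketch ignores the pad byte that RIFF requires after an odd-sized data chunk.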
...
Though, in this case, I am mostly poking at stuff for "Resource WADs", typically images/etc that are intended to be hidden inside EXE or DLL files (where size matters more than quality, and any sound effects are likely to be limited to under 1 second).
Say, one has a sound effect that is, say:
   0.5 seconds;
   8kHz
   2 bits/sample
This is roughly 1kB of audio data.
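The arithmetic, as a small helper (the function name is made up; it counts only the sample payload, not block or file headers):

```c
#include <stdint.h>

/* Payload size in bytes for a clip of the given length (milliseconds),
 * sample rate (Hz) and bits per sample, rounded up to whole bytes. */
uint32_t adpcm_payload_bytes(uint32_t ms, uint32_t rate_hz, uint32_t bits)
{
    uint64_t total_bits = (uint64_t)ms * rate_hz * bits / 1000;
    return (uint32_t)((total_bits + 7) / 8);
}
```

So adpcm_payload_bytes(500, 8000, 2) gives 1000 bytes, which is where the "roughly 1kB" comes from.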
I also defined a 2-bit ADPCM variant (ADLQ), and ended up using a customized simplified header for it (using a similar structure to the BMP format; where the full RIFF format adds unnecessary overhead; though the savings here are debatable).
Say:
   Full RIFF in this case:
     60 bytes of header.
   Simplified format:
     32 bytes of header.
   So, saving roughly 28 bytes of overhead vs RIFF/WAVE.
     Though, drops to 12 bytes in the absence of 'fact',
       and using the 16-byte PCMWAVEFORMAT structure vs WAVEFORMATEX.
While theoretically 2-bit IMA ADPCM already exists for WAV, seemingly not much supports it. I also implemented support for this, as it does at least "exist in the wild".
As for the 2-bit version of IMA ADPCM:
   Media Player Classic: Opens it and shows correct length,
     but sounds broken.
     Sounds like it is trying to play it with the 4 bit decoder.
   VLC Media Player:
     Basically works, though progress bar and time display are wonky.
     Does figure out mostly the correct length at least.
   Audacity: Claims to not understand it.
I had discovered the "adpcm-xq" library, and looked at this as a reference for the 2-bit IMA format. Since VLC plays it, I will assume my code is probably generating "mostly correct" output (at least WRT the 2b ADPCM part; possible wonk may remain in the WAVEFORMATEX header, and/or VLC is just a little buggy here).
So, thus far:
   ADLQ:
     Slightly higher quality;
     Needs a slightly more complicated encoder for good results;
       Decoder needs to ensure values don't go out of range.
     Software support: Basically nonexistent.
     Could in theory allow a cheap-ish hardware decoder.
   2-bit IMA ADPCM:
     Slightly simpler encoder;
     More is needed on the decoder side;
       Requires using multiply and range clamping.
     Slightly worse audio quality ATM.
       Around 0.8% bigger for mono due to header differences.
Block Headers:
   ADLQ:
     ( 7: 0): Initial Sample, A-Law
     (11: 8): Initial Step Index
     (   12): Interpolation Hint
     (15:13): Block Size (Log2)
   IMA, 2b:
     (15: 0): Initial Sample, PCM16
     (23:16): Step Index
     (31:24): Zero
   ADLQ is 1016 samples in 256 bytes, IMA is 1008.
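A sketch of pulling these fields apart, going only by the bit positions listed above. The struct and function names are invented for the example, and the 3-bit block size field is returned raw since its exact encoding is not spelled out here.

```c
#include <stdint.h>

struct adlq_hdr {
    uint8_t alaw_sample;   /* bits  7:0  initial sample, A-Law */
    uint8_t step_index;    /* bits 11:8  initial step index */
    uint8_t interp_hint;   /* bit  12    interpolation hint */
    uint8_t block_log2;    /* bits 15:13 block size field (raw value) */
};

struct adlq_hdr adlq_unpack(uint16_t h)
{
    struct adlq_hdr r;
    r.alaw_sample = (uint8_t)(h & 0xFF);
    r.step_index  = (uint8_t)((h >> 8) & 0xF);
    r.interp_hint = (uint8_t)((h >> 12) & 1);
    r.block_log2  = (uint8_t)((h >> 13) & 7);
    return r;
}

uint16_t adlq_pack(struct adlq_hdr r)
{
    return (uint16_t)(r.alaw_sample
                      | ((r.step_index  & 0xF) << 8)
                      | ((r.interp_hint & 1)   << 12)
                      | ((r.block_log2  & 7)   << 13));
}

struct ima2_hdr {
    int16_t sample;        /* bits 15:0  initial sample, PCM16 */
    uint8_t step_index;    /* bits 23:16 step index */
};

struct ima2_hdr ima2_unpack(uint32_t h)
{
    struct ima2_hdr r;
    uint32_t v = h & 0xFFFF;
    /* Sign-extend by hand to avoid relying on implementation-defined casts. */
    r.sample = (int16_t)(v >= 0x8000 ? (int32_t)v - 0x10000 : (int32_t)v);
    r.step_index = (uint8_t)((h >> 16) & 0xFF);
    return r;
}

/* 2-bit samples pack four to a byte after the block header. */
uint32_t samples_per_block_2bit(uint32_t block_bytes, uint32_t header_bytes)
{
    return (block_bytes - header_bytes) * 4;
}
```

With a 2-byte header this gives 1016 samples per 256-byte block, and with a 4-byte header 1008, matching the counts above.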
Sample Format is common:
   00: Small Positive
   01: Large Positive
   10: Small Negative
   11: Large Negative
Both have a scale-ratio of 1 or 3 (if normalized).
   ADLQ has a narrower range of steps, with stepping of -1/+1.
     Each step in ADLQ is 1/2 bit, so each 2 steps is a power of 2.
     So, curve of around 1.414214**n
   IMA has more steps, with a per-sample step of -1/+2.
     Doesn't map cleanly to power of 2,
       but around 8 steps per power of 2.
        Seems to be built around a curve of 1.1**n.
But, more aggressive stepping makes sense with 2-bit samples IMO...
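To make the stepping concrete, here is a generic 2-bit decode step. It is not the actual ADLQ or IMA decoder: the base step of 4, the 181/128 approximation of 1.414, the halving of the delta, and all names are assumptions for illustration. The 'up' parameter selects -1/+1 or -1/+2 index stepping.

```c
#include <stdint.h>

/* Step size on a ~1.414**n curve: two indices per power of 2.
 * Base step 4 and the 181/128 factor are illustrative choices. */
static int32_t step_for_index(int idx)
{
    int32_t s = (int32_t)4 << (idx >> 1);
    if (idx & 1)
        s = (s * 181) >> 7;
    return s;
}

struct adpcm2_state {
    int32_t pred;   /* current predicted sample */
    int     idx;    /* current step index */
};

/* Decode one 2-bit code: bit 1 is the sign, bit 0 picks small or large
 * (ratio 1 or 3). 'up' is the index bump after a large code, 1 or 2;
 * a small code always steps the index down by 1. */
int32_t adpcm2_decode(struct adpcm2_state *st, unsigned code, int up, int max_idx)
{
    int32_t step  = step_for_index(st->idx);
    int32_t delta = (step * ((code & 1) ? 3 : 1)) >> 1;

    if (code & 2)
        delta = -delta;
    st->pred += delta;

    /* Range clamp, as an IMA-style decoder would do. */
    if (st->pred >  32767) st->pred =  32767;
    if (st->pred < -32768) st->pred = -32768;

    st->idx += (code & 1) ? up : -1;
    if (st->idx < 0)       st->idx = 0;
    if (st->idx > max_idx) st->idx = max_idx;

    return st->pred;
}
```

Dropping the range clamp from a decoder like this is what shifts the burden onto the encoder.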
I went with not doing any range clamping in the decoder, so the encoder is responsible for ensuring that values don't go out of range. This does increase encoder complexity some (it needs to evaluate possible paths multiple samples in advance to make sure the path doesn't go out of range).
Potentially, 1/4-bit step with -1/+2 could have made sense. Would need a 5-bit index though to have enough dynamic range.
Both use a different strategy for stereo:
   ADLQ:
     Splits center and side, encoding side at 1/4 sample rate;
     So, stereo increases bitrate by 25%.
   2b IMA:
     Encodes both the left and right channel independently.
       So, stereo doubles the bitrate.
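A sketch of the center/side split with the side channel at 1/4 rate. The downsampling here is a plain average over each group of 4, which is an assumption (the actual encoder may filter differently), and the function name is made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Splits n stereo frames into a full-rate center channel and a side channel
 * at 1/4 the sample rate. 'side' needs room for (n + 3) / 4 entries.
 * Returns the number of side samples written. */
size_t split_center_side(const int16_t *left, const int16_t *right, size_t n,
                         int16_t *center, int16_t *side)
{
    size_t i, j, ns = 0;

    for (i = 0; i < n; i++)
        center[i] = (int16_t)((left[i] + right[i]) / 2);

    for (i = 0; i < n; i += 4) {
        int32_t acc = 0;
        int32_t cnt = 0;
        for (j = i; j < n && j < i + 4; j++) {
            acc += left[j] - right[j];
            cnt++;
        }
        side[ns++] = (int16_t)(acc / (2 * cnt));  /* mean of (L - R) / 2 */
    }
    return ns;
}
```

The side channel then carries 1/4 as many samples as the center, hence the 25% bitrate increase.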
As for why 2b:
   Where one cares more about size than audio quality...
     8kHz : 16 kbps
      11kHz: 22 kbps
     16kHz: 32 kbps
   Also IMHO, 16kHz at 2 bits/sample sounds better than 8kHz at 4 bits/sample.
     At least speech is mostly still intelligible at 16 kHz.
     Basic sound effects still mostly work at 8kHz though.
       Like, if one needs a ding or chime or similar.
Not really any good/obvious way here to reach or go below 1 bit/sample while still preserving passable quality (2 bit/sample is the lower limit for ADPCM, only real way to go lower would be to match blocks of 4 or 8 samples to a pattern table).
Had previously been making some use of A-Law, but as can be noted, A-Law requires 8 bits per sample.
Though, ending up back at poking around with ADPCM is similar territory to my projects from a decade ago...
But, OTOH: ADPCM is/was an effective format for sound effects; even if not given much credit (and seemingly most people see it as obsolescent).
As for image formats, I have a few options for low bpp, while also being cheap to decode:
   BMP+CRAM: 4x4x1, limited variant of CRAM ("MS Video 1")
     Roughly 2 bpp (and repurposed as a graphics format...).
   BMP+CQ: 8x8x1, similar design to CRAM.
     Roughly 1.25 bpp
Where, these can work well for images with no more than 2 colors per 4x4 or 8x8 pixel block (otherwise, YMMV). As it so happens, lots of UI graphics fit this pattern, and/or are essentially monochrome. CQ can deal well with monochrome or almost-monochrome graphics without too much space overhead.
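For illustration, a two-color block of this kind can be expanded as below. The bit order (bit 0 is the top-left pixel, row-major), the "set bit selects color_b" rule, and the function names are assumptions; real MS Video 1 streams have their own conventions. The bpp helper assumes each block is a 1-bit-per-pixel mask plus two 8-bit palette indices.

```c
#include <stdint.h>

/* Expand one 4x4 block: 16-bit mask + two palette indices into 8bpp pixels. */
void block4x4_decode(uint16_t mask, uint8_t color_a, uint8_t color_b,
                     uint8_t *dst, int stride)
{
    int x, y;
    for (y = 0; y < 4; y++)
        for (x = 0; x < 4; x++)
            dst[y * stride + x] =
                ((mask >> (y * 4 + x)) & 1) ? color_b : color_a;
}

/* Bits per pixel times 100 for a w*w block: mask bits plus 16 bits of color. */
unsigned block_bpp_x100(unsigned w)
{
    unsigned pixels = w * w;
    return (pixels + 16) * 100 / pixels;
}
```

Under those assumptions, block_bpp_x100(4) is 200 and block_bpp_x100(8) is 125, i.e. the 2 bpp and 1.25 bpp figures above.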
Though, in some other cases, monochrome or 4-color images could be a better fit. These default to black/white or black/white/cyan/magenta, but don't necessarily need to be limited to this (but, may need to add options in BGBCC for 2/4/16 color dynamic-palette).
Say, for example, if an image is only black/white/red/blue or similar, 4-color could make sense (vs using CRAM or CQ and picking from the 256 color palette; but not being able to have different sets of colors in close proximity). Often, 16-color works, but 16-color is rather bulky if compared with CRAM or CQ.
For the CRAM and CQ formats, I ended up adding an option by which the color palette can be skipped (it is replaced by a palette hash value; OS can use the color palette associated with the corresponding hash number).
Mostly this was because, say, for 32x32 or 64x64 CRAM images, the 256 color palette was bigger than the image itself.
Note that much below 32x32, it is more compact to use hi-color BMP images than 256-color due to the color palette issue (making the optional omission for small image formats desirable).
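Rough size comparison, counting only pixel data and a 256-entry palette of 4 bytes each (as in a standard BMP color table), and ignoring the BMP file/info headers and row padding. Function names are made up for the example.

```c
#include <stdint.h>

#define PALETTE_BYTES (256u * 4u)   /* 256 entries, 4 bytes each */

/* 8bpp indexed image with its palette. */
uint32_t size_indexed8(uint32_t w, uint32_t h)
{
    return w * h + PALETTE_BYTES;
}

/* 16bpp hi-color image, no palette needed. */
uint32_t size_hicolor16(uint32_t w, uint32_t h)
{
    return w * h * 2;
}

/* 2bpp CRAM-style image, with or without an embedded palette. */
uint32_t size_cram(uint32_t w, uint32_t h, int with_palette)
{
    return w * h / 4 + (with_palette ? PALETTE_BYTES : 0u);
}
```

By this count a 32x32 CRAM image is 256 bytes of pixel data against a 1024-byte palette, and 8bpp indexed vs hi-color breaks even at 32x32 (2048 bytes each).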
Though, generally, these are generated with BGBCC, which can include the palette in the generated resource WAD, though TBD the best format. For the kernel, it is stored as a 256x256 indexed color bitmap (which also encodes a set of dither-aware RGB555 lookup tables).
For normal EXE/DLL files, could either store a dummy 16x16 256-color image, or more compactly, as a 16x16 hi-color image (with no dither table). Since, it is possible that it could make sense that EXEs/DLLs use a different default color palette from the OS kernel.
Note that neither PNG, JPEG, nor even QOI, are a good fit for these use cases. Wonky BMP variants are a better fit.
For SDF font images, had also used BMP, say a 256x256 8bpp image covering CP-1252, with a specialized color palette (X/Y distances are encoded in the pixels). Needed a full 8bpp here as CRAM doesn't work for this.
PNG compresses them, but overhead is too high; and QOI is not so effective for this scenario. Though, as 8bpp images, they do LZ compress pretty OK.
But, would not be reasonable to specially address every scenario.
...
