On 2025-06-12, Mateusz Viste <mateusz@x.invalid> wrote:
> Thank you all for your thoughtful responses. You rightly identified
> that the problem is essentially an out-of-bounds access - a symptom of
> deeper code quality issues. The bug in question managed to pass unit
> tests, peer review, functional tests, and it didn’t trigger any
> warnings from GCC or clang, even with the strict -Weverything flag I
> enforce across my teams. This underscores a fundamental truth: every
> software has bugs, and some, like this one, are notoriously difficult
> to locate. The bug caused a segfault about once every 10 days,
> manifesting in an unrelated part of the code and sometimes days after
> the out-of-bounds write occurred.
>
> This led me to wonder how I could accelerate such crashes to simplify
> debugging.
Below is a proof-of-concept program that works in GNU/Linux. For
rapidity of prototyping, I have assumed a page size of 4096; this is not
right for all systems.
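Because the alignment attribute needs a compile-time constant, the page size cannot simply be looked up at run time and fed into the declarations. What can be done is to check the assumption at startup. The following is a minimal sketch of such a check; ASSUMED_PAGE_SIZE and page_size_matches are names invented for this illustration, not part of the program in this post.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical name; mirrors the hard-coded 4096 used in this post. */
#define ASSUMED_PAGE_SIZE 4096L

/* Returns true if the running system's page size equals the assumed one.
   sysconf returns -1 on error, which also yields false here. */
static bool page_size_matches(void)
{
    long actual = sysconf(_SC_PAGESIZE);
    return actual == ASSUMED_PAGE_SIZE;
}
```

A program could call page_size_matches() first thing in main and refuse to continue if it returns false, since the guard arrays would then not line up with real pages.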
The my_array[] array is declared between two page-sized and page-aligned
guard arrays, guard_0 and guard_1.
The program write-protects the two guard arrays with mprotect.
The output demonstrates that the egregious overrun of my_array[],
namely a write to my_array[5000], triggers a segfault:
$ ./prog
Address of guard_0: 0x4c4000
Address of my_array: 0x4c5000
Address of guard_1: 0x4c6000
guard_1 is now write-protected (read-only).
writing my_array[0] succeeded
Segmentation fault (core dumped)
With a little additional effort, we can manipulate the declarations
such that the high element of my_array[] will be placed just before
the guard_1 page. Then we will have byte-accurate overrun detection,
at the loss of accurate underrun detection.
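One possible way to get that placement is sketched below. It is a variation, not the program from this post: instead of three separate statics, it puts everything in one struct so that member order is fixed by the language rather than left to the linker, and it pads my_array up to the end of its page. The names FENCE_PAGE, ARRAY_LEN, fenced, protect_guards and write_faults are all made up for the example, and the 4096-byte page is the same assumption as before.

```c
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define FENCE_PAGE 4096   /* assumed page size, as elsewhere in this post */
#define ARRAY_LEN  42

/* Single object: guard page, padding, array, guard page.
   The padding pushes my_array so that it ends where guard_1 begins. */
static struct {
    char guard_0[FENCE_PAGE];
    char pad[FENCE_PAGE - ARRAY_LEN];
    char my_array[ARRAY_LEN];
    char guard_1[FENCE_PAGE];
} __attribute__((aligned(FENCE_PAGE))) fenced = { .guard_0 = { 1 } };

/* Makes both guard pages read-only. Returns 0 on success, -1 on failure. */
static int protect_guards(void)
{
    if (mprotect(fenced.guard_0, FENCE_PAGE, PROT_READ) == -1)
        return -1;
    if (mprotect(fenced.guard_1, FENCE_PAGE, PROT_READ) == -1)
        return -1;
    return 0;
}

/* Writes my_array[index] in a forked child so the caller survives.
   Returns 1 if the child was killed by SIGSEGV, 0 if it was not,
   and -1 if fork or waitpid failed. */
static int write_faults(long index)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Deliberately out-of-bounds for some indexes; volatile keeps
           the compiler from dropping the store. */
        volatile char *p = fenced.my_array;
        p[index] = 2;
        _exit(0);
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}
```

With this layout a write to my_array[42] lands on the first byte of guard_1, while my_array[-1] lands in the padding and goes unnoticed, which is the trade-off described above.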
A bunch of decades ago, hacker Bruce Perens developed a malloc
debugging library called Electric Fence which implemented exactly
this technique, but for malloced objects. We can think of this
as "Electric Fence, but for static".
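For comparison, the general idea for heap objects can be sketched as follows. This is not Electric Fence's actual code or API; guarded_alloc is a hypothetical helper written for this illustration. It maps enough pages for the object plus one extra, makes the extra page inaccessible, and returns a pointer positioned so that the object ends exactly at the inaccessible page.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns size bytes whose last byte sits
   immediately before an inaccessible page, or NULL on failure.
   Sketch only: no matching free, no overflow check on size, and
   the result is not aligned for types wider than char. */
static void *guarded_alloc(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t pg, data_pages, total;
    char *base, *guard;

    if (page <= 0 || size == 0)
        return NULL;
    pg = (size_t)page;
    data_pages = (size + pg - 1) / pg;      /* round up to whole pages */
    total = (data_pages + 1) * pg;          /* plus one guard page */

    base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    guard = base + data_pages * pg;
    if (mprotect(guard, pg, PROT_NONE) == -1) {
        munmap(base, total);
        return NULL;
    }
    return guard - size;                    /* object ends at the guard */
}
```

Any read or write one byte past the returned object then touches the PROT_NONE page and faults immediately, rather than silently corrupting a neighbor.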
Note that all three static arrays have non-zero initializers. This is
so that they all fall into the same category of non-zero-initialized data.
I suspect that this will work fine if all three arrays are
zero-initialized or all three are non-zero-initialized, but not
for mixtures. The reason is that zero-initialized and
non-zero-initialized statics are separated and put into different
sections.
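Since the placement of separate statics is up to the toolchain, it may be worth checking the layout at startup rather than trusting it. The helper below is a sketch of such a check; is_fenced is a hypothetical name, and the test it performs (object starts exactly where the low guard ends, and the high guard begins less than one guard length after the object ends) is just one reasonable definition of "properly fenced".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if the object begins exactly at the end of
   the low guard and the high guard begins at or after the end of the
   object, less than guard_size bytes later. */
static bool is_fenced(const void *guard_lo, size_t guard_size,
                      const void *obj, size_t obj_size,
                      const void *guard_hi)
{
    uintptr_t lo_end = (uintptr_t)guard_lo + guard_size;
    uintptr_t start  = (uintptr_t)obj;
    uintptr_t end    = start + obj_size;
    uintptr_t hi     = (uintptr_t)guard_hi;

    return lo_end == start && end <= hi && hi - end < guard_size;
}
```

For the arrays in this post the call would look like is_fenced(guard_0, PAGE_SIZE, my_array, sizeof my_array, guard_1), and the program could bail out before calling mprotect if it returns false.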
Try it: verify that my_array[-1] = 0 segfaults, showing that
there is accurate underrun protection. Then try manipulating the
declarations to get my_array to butt up against guard_1.
Code follows ...
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

#define PAGE_SIZE 4096  /* assumed; not right for all systems */

/* All three arrays have non-zero initializers so that they land in the
   same section, with my_array between the two page-aligned guards. */
static char __attribute__((aligned(PAGE_SIZE))) guard_0[PAGE_SIZE] = { 1 };
static char my_array[42] = { 1 };
static char __attribute__((aligned(PAGE_SIZE))) guard_1[PAGE_SIZE] = { 1 };

int main(void)
{
    printf("Address of guard_0: %p\n", (void*)guard_0);
    printf("Address of my_array: %p\n", (void*)my_array);
    printf("Address of guard_1: %p\n", (void*)guard_1);

    /* Make both guard pages read-only so that stray writes fault. */
    if (mprotect(guard_0, PAGE_SIZE, PROT_READ) == -1) {
        perror("mprotect guard_0 failed");
        return EXIT_FAILURE;
    }
    if (mprotect(guard_1, PAGE_SIZE, PROT_READ) == -1) {
        perror("mprotect guard_1 failed");
        return EXIT_FAILURE;
    }
    printf("guard_1 is now write-protected (read-only).\n");

    /* In-bounds write: fine. */
    my_array[0] = 2;
    printf("writing my_array[0] succeeded\n");

    /* Deliberate overrun: lands inside guard_1 and should segfault. */
    my_array[5000] = 2;
    printf("writing my_array[5000] should not have succeeded\n");

    return EXIT_SUCCESS;
}