On Fri, May 2, 2025 at 2:22 PM Peter Geoghegan <p...@bowt.ie> wrote: > A slight variant of my fuzzing Python script did in fact go on to > detect a couple of bugs. > > I'm attaching a compressed SQL file with repros for 2 different bugs. > The first bug was independently detected by some kind of fuzzing > performed by Mark Dilger, reported elsewhere [1].
Picking up from the email with the big attachment... Both bugs are from commit 8a510275, "Further optimize nbtree search scan key comparisons" (not the main skip scan commit). I actually wrote code very much like the code from these patches that appeared in certain versions of the skip scan patch series -- it was originally supposed to be defensive hardening. This so-called hardening wasn't kept in the final committed version because I incorrectly believed that it wasn't necessary. I would like to commit the first patch later today, ahead of shipping beta1. But the second patch won't make it into beta1. In practice the bug fixed by the first patch is more likely to affect users, and (unlike the bug fixed by the second patch), it involves a hard crash. The first patch prevents us from dereferencing a NULL pointer (pstate) within _bt_advance_array_keys (unless on an assert-enabled build, where we get an assertion failure first). It would also be possible to fix the issue by testing if pstate itself is not a NULL pointer in the usual defensive style, but I find the approach taken in the first patch slightly more natural. The second patch is more complicated, and seems like something that I'll need to spend more time thinking about before proceeding with commit. It has subtle behavioral implications, in that it causes the pstate.forcenonrequired mechanism to influence when and how _bt_advance_array_keys schedules primitive index scans in a tiny minority of forward scan cases. I know of only 3 queries where this happens, 2 of which are from my repro -- it's actually really hard to find an example of this, even if you go out of your way to. Allowing pstate.forcenonrequired to affect primscan scheduling like this is something that I have been avoiding up until now, since that makes things cleaner -- but I'm starting to think that that goal isn't important enough to force the second patch to be significantly more complicated than what I came up with here. It's not like the behavioral differences represent a clear regression; they're just slightly different to what we see in cases where pstate.forcenonrequired/pstate.ikey is forcibly not applied (e.g., by commenting-out the calls to _bt_set_startikey made by _bt_readpage). My approach in the second patch is to simply call _bt_start_array_keys ahead of the finaltup call to _bt_checkkeys when pstate.forcenonrequired, which has the merit of being relatively simple (it's likely the simplest possible approach). I'm unwilling to pay too much of a cost in implementation complexity just to avoid side-effects in _bt_advance_array_keys/primscan scheduling, but maybe I'll find that the cost isn't too high. -- Peter Geoghegan
v1-0001-Avoid-treating-nonrequired-nbtree-keys-as-require.patch
Description: Binary data
v1-0002-Prevent-prematurely-nbtree-array-advancement.patch
Description: Binary data