Re: Andres Freund
> How confident are we that this isn't actually because we passed a bogus
> address to the kernel or such? With this patch, are *any* pages recognized as
> valid on the machines that triggered the error?

See upthread - the first 35 pages were ok, then a lot of -14.

> I wonder if we ought to report the failures as a separate "numa node"
> (e.g. NULL as node id) instead ...

Did that now, using N+1 (== 1 here) for errors in this Debian i386
environment (chroot on an amd64 host):

select * from pg_shmem_allocations_numa \crosstabview

                      name                      │    0     │    1
────────────────────────────────────────────────┼──────────┼──────────
 multixact_offset                               │    69632 │    65536
 subtransaction                                 │   139264 │   131072
 notify                                         │   139264 │        0
 Shared Memory Stats                            │   188416 │   131072
 serializable                                   │   188416 │    86016
 PROCLOCK hash                                  │     4096 │        0
 FinishedSerializableTransactions               │     4096 │        0
 XLOG Ctl                                       │  2117632 │  2097152
 Shared MultiXact State                         │     4096 │        0
 Proc Header                                    │     4096 │        0
 Archiver Data                                  │     4096 │        0
.... more 0s in the last column ...
 AioHandleData                                  │  1429504 │        0
 Buffer Blocks                                  │ 67117056 │ 67108864
 Buffer IO Condition Variables                  │   266240 │        0
 Proc Array                                     │     4096 │        0
.... more 0s
(73 rows)

There is something fishy with pg_buffercache. If I restart PG, I'm getting
"Bad address" (errno 14), this time as return value of move_pages().

postgres =# select * from pg_buffercache_numa;
DEBUG: 00000: NUMA: NBuffers=16384 os_page_count=32768 os_page_size=4096
LOCATION: pg_buffercache_numa_pages, pg_buffercache_pages.c:383
2025-06-23 19:41:41.315 UTC [1331894] ERROR: failed NUMA pages inquiry: Bad address
2025-06-23 19:41:41.315 UTC [1331894] STATEMENT: select * from pg_buffercache_numa;
ERROR: XX000: failed NUMA pages inquiry: Bad address
LOCATION: pg_buffercache_numa_pages, pg_buffercache_pages.c:394

Repeated calls are fine.

Maybe NUMA is just not supported on 32-bit archs, but I'd rather be sure
about that before playing that card.

Christoph
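
PS: For illustration, the "N+1 for errors" accounting described above could look roughly like the sketch below. It is not the actual patch. The function name tally_pages_by_node and its parameters are hypothetical, and it assumes N is the highest NUMA node id and that failed pages carry a negative errno (such as -14) in the per-page status array, as move_pages() reports them.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: sum up bytes per NUMA node
 * from a per-page status array. Pages whose status is negative (e.g. -14,
 * i.e. -EFAULT) or above max_node are counted in the extra bucket
 * max_node + 1. "totals" must have room for max_node + 2 entries, and
 * max_node is assumed to be >= 0.
 */
static void
tally_pages_by_node(const int *status, size_t npages, size_t page_size,
                    int max_node, uint64_t *totals)
{
    size_t      i;

    /* Zero all buckets, including the error bucket at max_node + 1. */
    for (i = 0; i <= (size_t) max_node + 1; i++)
        totals[i] = 0;

    for (i = 0; i < npages; i++)
    {
        int         s = status[i];

        if (s >= 0 && s <= max_node)
            totals[s] += page_size;             /* page resides on node s */
        else
            totals[max_node + 1] += page_size;  /* failed or unexpected status */
    }
}
```

With a single node (max_node = 0), two good pages and three -14 pages of 4096 bytes each would end up as 8192 bytes in column 0 and 12288 bytes in column 1.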