On Thu, 8 Sept 2022 at 01:22, David Rowley <dgrowle...@gmail.com> wrote:
>
> On Thu, 8 Sept 2022 at 01:05, Julien Rouhaud <rjuju...@gmail.com> wrote:
> > FYI lapwing isn't happy with this patch:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lapwing&dt=2022-09-07%2012%3A40%3A16.
>
> I'll look into it further.

Looks like my analysis wasn't that good in nodeWindowAgg.c. The reason
it's crashing is that int2int4_sum() returns
Int64GetDatumFast(transdata->sum). On 64-bit machines,
Int64GetDatumFast() translates to Int64GetDatum(), which is byval, so
the MemoryContextContains() call is not triggered. On 32-bit machines
it's PointerGetDatum() and a byref type, so we're returning a pointer to
transdata->sum, which is part way into an allocation.

Funnily, the struct looks like:

typedef struct Int8TransTypeData
{
    int64       count;
    int64       sum;
} Int8TransTypeData;

so the previous version of MemoryContextContains() would have
subtracted sizeof(void *) from &transdata->sum which, on this 32-bit
machine, would have pointed halfway into the "count" field. That count
field seems like a good candidate for the "false positive" that the
previous comment in MemoryContextContains() mentioned. So it looks
like it had roughly 1 in 2^32 odds of doing the wrong thing before.

Had the fields in that struct happened to be in the opposite order,
then I don't think it would have crashed, but that's certainly no fix.

I'll need to think about how best to fix this. In the meantime, I
think the other 32-bit animals are probably not going to like this
either :-(

David