I wrote: > I still haven't figured out why this wasn't a problem in the old > version of functions.c.
Oh, got it: I was misremembering the API contract for tuplestore_gettupleslot. There are two possible ways for it to load the slot with a tuple: * tuple points into the tuplestore's memory, and the slot's should_free flag is false. In this case, tuplestore_clear leaves the slot pointing to garbage, but it doesn't matter because we won't try to access nor free the garbage. (This is the case that comment was talking about.) * tuple points to a tuple palloc'd in *the caller's memory context* --- not the tuplestore's memory as I was thinking. In this case the slot's should_free flag is true. Thus the bug arises because the caller's memory context is now the relatively short-lived per-tuple context that fmgr_sql is called in, where in prior versions it was the SQL function's long-lived fcontext. The slot we're putting it in is also long-lived. So what happens is that on the next fmgr_sql call, after the per-tuple context has been reset and thus discarded the palloc'd tuple, the slot still remembers that it's supposed to free the tuple and does so. Kaboom. But in released branches, the slot and the palloc'd tuple are in the same context and there's no such hazard. The quick-hack patch I posted upthread fixes this by ensuring we clear the slot (and thus let it pfree the tuple) before context reset of the per-tuple context wipes the tuple from underneath it. I'm still inclined to fix this by using the patch from [1] though. Aside from the argument that it'd be better if v18 were like later branches here, I now realize that there's at least one false statement in my argument at [1]: >> Given the lack of field complaints over the many years it's been >> like this, there doesn't seem to be a live problem. I think the >> explanation for that is >> (1) those mechanisms are never used to call set-returning functions, >> (2) therefore, the tuplestore will not be called on to hold more >> than one result row, >> (3) therefore, it won't get large enough that it wants to spill >> to disk, >> (4) therefore, its potentially dangling resowner pointer is never >> used. >> However, this is an uncomfortably shaky chain of assumptions. This example has shown that tuplestore.c is capable of spilling to disk even if it is storing only a single tuple, which means (3) is wrong, which means the tuplestore *can* touch its resowner pointer if the function result is wide enough. I wonder if this means we have a reachable bug in released branches. regards, tom lane [1] https://www.postgresql.org/message-id/2443532.1744919968%40sss.pgh.pa.us