On Fri, Feb 8, 2019 at 8:00 AM Thomas Munro <thomas.mu...@enterprisedb.com> wrote:
> Sometimes FreePageManagerPutInternal() returns a
> number-of-contiguous-pages-created-by-this-insertion that is too large
> by one.  If this happens to be a new max-number-of-contiguous-pages,
> it causes trouble some arbitrary time later because the max is wrong
> and this FPM cannot satisfy a request that large, and it may not be
> recomputed for some time because the incorrect value prevents
> recomputation.  Not sure yet if this is due to the lazy computation
> logic or a plain old fence-post error in the btree consolidation code
> or something else.
I spent a long time thinking about this and staring at the code this afternoon, but I didn't really come up with anything useful.  It seems like a strange failure mode, because FreePageManagerPutInternal() normally just returns its third argument unmodified.  The only cases where anything else happens are the ones where we're able to consolidate the returned span with a preceding or following span, and I'm scratching my head as to how that logic could be wrong, especially since it also has some Assert() statements that seem like they would detect the kinds of inconsistencies that would lead to trouble.

For example, if we somehow ended up with two spans that (improperly) overlapped, we'd trip an Assert().  And if that didn't happen -- because we're not in an Assert-enabled build -- the code is written so that it only relies on the npages value of the last of the consolidated spans, so an error in the npages value of one of the earlier spans would just get fixed up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company