Antonin Houska <a...@cybertec.at> writes: > When doing my experiments with bucket split ([1]), I noticed a comment that > _hash_getnewbuf should not be called concurrently. However, there's no > synchronization of calls from _hash_splitbucket in HEAD. I could reproduce > such concurrent calls using gdb (multiple bucket splits in progress at a > time).
Ugh. > [ locking the metapage around the _hash_getnewbuf call ] > I think it'd also be the easiest fix for _hash_splitbucket. Or should a > separate ("regular") lock be introduced and used and used in both cases? That doesn't really fix the problem: once we release lock on the metapage, if a new split starts then it would try to create a bucket at the next higher page number, so that the _hash_getnewbuf() calls might occur "out of order", leading to failures of the crosschecks therein. I think the only simple fix here is to rearrange _hash_splitbucket's API so that the _hash_getnewbuf() call is done before we drop the metapage lock at all. Not sure offhand if the best way is to move that call into _hash_expandtable(), or to change the API definition so that _hash_splitbucket is responsible for dropping the metapage lock at the right time. Obviously, this whole business of needing the metapage lock to add pages is nasty. I wonder if we could use the relation extension lock instead to serialize those operations, rather than requiring a metapage buffer lock. But that would be a much more invasive thing. In any case, the current state of affairs is that the relation physical length is tightly tied to the bucket mapping defined by the metapage contents, and we have to change both of those together. Actually ... one of the other things that's not being accounted for here is the possibility of ENOSPC while adding the new page. Really, we want to attempt the extension *before* we modify the bucket mapping. So that seems to argue for moving the _hash_getnewbuf call into _hash_expandtable, where it could be done just before we scribble on the metapage. OTOH, we're not any less screwed if we get ENOSPC while trying to add an overflow page later on :-(, since future readers would assume the bucket split's been performed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers