On Fri, Sep 21, 2018 at 3:21 PM Andres Freund <and...@anarazel.de> wrote: > I'm quite suspicious of the logic around: > > /* > * If we received a query cancel or termination signal, we > will have > * EINTR set here. If the caller said that errors are OK > here, check > * for interrupts immediately. > */ > if (errno == EINTR && elevel >= ERROR) > CHECK_FOR_INTERRUPTS(); > > because it seems far from guaranteed to do anything meaningful as I > don't see a guarantee that interrupts are active at that point (e.g. it > seems quite reasonable to hold an lwlock while resizing).
Right, in that case CFI does nothing and you get the following ereport() instead, so the user sees an unsightly "Interrupt system call" message (or however your strerror() spells it). That would actually happen in the dsa.c case for example (something that should be improved especially now that dsm_create() is so slow on Linux, independently of all this). That's probably not a great user experience, but I'm not sure if the fact that it "works around" the suppression of interrupts while LWLocks are held by converting them into errors is a more serious problem or not. The caller has to be ready for errors to be raised here in any case. > Afaict that might cause problems at a later stage, because at that point > we've not adjusted the actual mapping, but *have* ftruncate()ed it. If > there's actual data in the mapping, that certainly could cause trouble. > > In fact, while this commit has expanded the size of the problem, I fail > to see how the error handling for resizing is correct. It's fine to fail > in the ftruncate() itself - at that point no changes have been made -, > but I don't think it's currently ok for posix_fallocate() to ever error > out. Right, independently of this commit, dsm_resize() might have done the actual truncation when the error is reported. That's not good if the caller plans to do anything other than abandon ship. None of this applies to dsm_create() though, which uses the same code path but knows how to clean up. There may be ways to fix the dsm_resize() path based on the observation that you don't need to fallocate() if you made the mapping smaller, and if you made it bigger then you could always undo that on error (or not) and you haven't thrown away any data. Hmm... I note that there are actually no callers of dsm_resize(), and it's not implemented on Windows or SystemV. -- Thomas Munro http://www.enterprisedb.com