On Sat, Dec 2, 2017 at 4:46 PM, Thomas Munro <thomas.mu...@enterprisedb.com> wrote: > On Sat, Dec 2, 2017 at 3:54 PM, Thomas Munro > <thomas.mu...@enterprisedb.com> wrote: >> On Sat, Dec 2, 2017 at 1:55 PM, Andres Freund <and...@anarazel.de> wrote: >>> - As we don't clean temp files after crash-restarts it isn't impossible >>> to have a bunch of crash-restarts and end up with pids *and* per-pid >>> shared file set counters reused. Which'd lead to conflicts. Do we care? >> >> We don't care. PathNameCreateTemporaryDir() on a path that already >> exists will return silently. PathNameCreateTemporaryFile() on a path >> that already exists, as mentioned in a comment and following an >> existing convention, will open and truncate it. So either it was >> really an orphan and that is a continuation of our traditional >> recycling behaviour, or the calling code has a bug and used the same >> path twice and it's not going to end well. > > On further reflection, there are problems with that higher up. (1) > Even though PathNameCreateTemporaryFile() will happily truncate and > reuse an orphaned file when BufFileCreateShared() calls it, > BufFileOpenShared() could get confused by the orphaned files. It > could believe that XXX.1 is a continuation of XXX.0, when in fact it > is junk left over from a crash restart. Perhaps BufFileCreateShared() > needs to delete XXX.{N+1} if it exists, whenever it creates XXX.{N}. > That will create a gap in the series of existing files that will cause > BufFileOpenShared()'s search to terminate. (2) BufFileOpenShared() > thinks that the absence of an XXX.0 file means there is no BufFile by > this name, when it could mistakenly open pre-crash junk due to a > colliding name. I use that condition as information, but I think I > can fix that easily by using SharedTuplestoreParticipant::npage == 0 > to detect that there should be no file, rather than trying to open it, > and then I can define this problem away by declaring that > BufFileOpenShared() on a name that you didn't call > BufFileCreateShared() on invokes undefined behaviour. I will make it > so.
Here is a patch to deal with that problem. Thoughts? I suppose if we wanted to make this type of problem go away, but still keep files for forensic purposes, we could introduce a "restart number", and stick it into the top level temporary directory's name. That way you'd never be able to collide with files created before a crash-restart, and we could add O_EXCL so it'd become an error to try to create the same filename again. I'll post a new SharedTuplestore and Parallel Hash patch set shortly. -- Thomas Munro http://www.enterprisedb.com
0001-Add-defenses-against-pre-crash-files-to-BufFileOpenS.patch
Description: Binary data