On 2014-08-20 18:58:05 -0400, Bruce Momjian wrote:
> On Wed, Aug 20, 2014 at 10:36:40AM -0400, Tom Lane wrote:
> > Andres Freund <and...@2ndquadrant.com> writes:
> > > On 2014-08-20 10:19:33 -0400, Tom Lane wrote:
> > >> Alternatively, you could use the process PID as part of the temp file
> > >> name; which is probably a good idea anyway.
> > >
> > > I think that's actually worse, because nothing will clean up those
> > > unless you explicitly scan for all <whatever>.$pid files, and somehow
> > > kill them.
> >
> > True.  As long as the copy command is prepared to get rid of a
> > pre-existing target file, using a fixed .tmp extension should be fine.
>
> Well, then we are back to this comment by MauMau:
>
> > With that said, copying to a temporary file like <dest>.tmp and
> > renaming it to <dest> sounds worthwhile even as a basic copy utility.
> > I want to avoid copying to a temporary file with a fixed name like
> > _copy.tmp, because some advanced utility may want to run multiple
> > instances of pg_copy to copy several files into the same directory
> > simultaneously.  However, I'm afraid multiple <dest>.tmp files might
> > continue to occupy disk space after canceling copy or power failure in
> > some use cases, where the copy of the same file won't be retried.
> > That's also the reason why I chose to not use a temporary file like
> > cp/copy.
>
> Do we want cases where the same directory is used by multiple pg_copy
> processes?  I can't imagine how that setup would make sense.

I don't think anybody is proposing the _copy.tmp approach. We've just
argued about the risk of <dest>.tmp. And I argued - and others seem to
agree - the space usage problem isn't really relevant because archive
commands and such are rerun after failure and can then clean up the temp
file again.

> I am thinking pg_copy should emit a warning message when it removes an
> old temp file.  This might alert people that something odd is happening
> if they see the message often.

Don't really see a point in this. If the archive command or such failed,
that will already have been logged. I'd expect this to be implemented by
passing O_CREAT | O_TRUNC to open(), nothing else.

> The pid-extension idea would work as pg_copy can test all pid extension
> files to see if the pid is still active.  However, that assumes that the
> pid is running on the local machine and not on another machine that has
> NFS-mounted this directory, so maybe this is a bad idea, but again, we
> are back to the idea that only one process should be writing into this
> directory.

I don't actually think we should assume that.
There very well could be one process running an archive command, using
differently prefixed file names or such.

Greetings,

Andres Freund

-- 
 Andres Freund	                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
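
As a footnote to the O_CREAT | O_TRUNC remark above, here is a minimal
sketch of the <dest>.tmp-then-rename approach. It is not pg_copy's
implementation: the function name copy_via_tmp is made up for
illustration, Python is used only because its os.open() flags map directly
onto the C open() flags, and both the fsync before the rename and the
choice to overwrite an existing <dest> are assumptions of mine, not things
settled in this thread.

```python
import os


def copy_via_tmp(src: str, dest: str, bufsize: int = 64 * 1024) -> None:
    """Copy src to dest through a fixed-name temp file, <dest>.tmp.

    Hypothetical helper for illustration only.
    """
    tmp = dest + ".tmp"
    # O_CREAT | O_TRUNC reuses and empties any <dest>.tmp left behind by an
    # earlier failed run, so no separate cleanup or warning step is needed.
    out_fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        with open(src, "rb") as infile:
            while True:
                chunk = infile.read(bufsize)
                if not chunk:
                    break
                # os.write() may write fewer bytes than requested.
                offset = 0
                while offset < len(chunk):
                    offset += os.write(out_fd, chunk[offset:])
        # Assumption: flush to disk before the file becomes visible as <dest>.
        os.fsync(out_fd)
    finally:
        os.close(out_fd)
    # Assumption: an existing <dest> is replaced. On POSIX the rename is
    # atomic, so readers never see a partially written <dest>.
    os.replace(tmp, dest)
```

Because the temp name is derived from <dest>, two invocations copying
different files into the same directory do not collide; only two
concurrent copies of the same <dest> would share a temp file.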