On 2017-05-01 08:41, Cameron Simpson wrote:
> On 30Apr2017 06:52, Tim Chase <python.l...@tim.thechases.com> wrote:
> >> > - use a GUID-named temp-file instead for less chance of
> >> >   collision?
>
> You could, but mktemp is supposed to robustly perform that task,
> versus "very very probably".

Though with the potential of its race condition, mktemp() isn't a
much stronger guarantee.  A GUID seems like the best route.

> >> > - I happen to already have a hash of the file contents, so use
> >> >   the .hexdigest() string as the temp-file name?
>
> Hashes collide. (Yes, I know that for your purposes we consider
> that they don't; I have a very similar situation of my own). And
> what if your process is running twice, or leaves around a previous
> temp file by accident (or interruption) _or_ the file tree contains
> filenames named after the hash of their content (not actually
> unheard of)?

In both case #1 (a *file* happens to have the name of the SHA256
hash, but has different file contents) and case #2 (another process
running generates a *link* with the SHA256 of the matching content),
the os.link() should fail with EEXIST, which I'm okay with.
Likewise, if there's an interruption, I'd rather have the stray
SHA-named link floating around than lose an existing file-name.

> What about some variation on:
>
>   from tempfile import NamedTemporaryFile
>   ...
>   with NamedTemporaryFile(dir=your_target_directory) as T:
>     use T.name, and do your rename/unlink in here

As mentioned in my follow-up (which strangely your reply came in
with a References header referencing), NamedTemporaryFile creates
the file on-disk, which means os.link(source, T.name) fails with
EEXIST.

-tkc

-- 
https://mail.python.org/mailman/listinfo/python-list
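
For reference, a minimal sketch of the GUID-named hard-link idea
discussed above, assuming Python 3.3+ (where a failed os.link() onto an
existing name raises FileExistsError, i.e. errno EEXIST). The helper
name link_with_unique_name, the ".tmp" suffix and the retry count are
made up for illustration and do not come from the thread.

```python
import os
import uuid


def link_with_unique_name(source, directory, attempts=10):
    """Hard-link `source` under a fresh UUID-based name in `directory`.

    Hypothetical helper: returns the path of the new link.
    """
    for _ in range(attempts):
        # uuid4() is random, so a collision is very unlikely but not impossible.
        candidate = os.path.join(directory, uuid.uuid4().hex + ".tmp")
        try:
            # os.link() refuses to overwrite: if the name already exists
            # it raises FileExistsError instead of clobbering anything.
            os.link(source, candidate)
        except FileExistsError:
            continue  # name was taken; try another UUID
        return candidate
    raise FileExistsError("no free temporary name found in %r" % directory)
```

Since os.link() either creates the name or fails, there should be no
gap between choosing a name and creating it, which is where the
mktemp() race lives. The retry loop only covers the unlikely case of a
UUID collision.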