Working on some deduplication code, I want to do my best at performing an atomic re-hard-linking atop an existing file, akin to "ln -f source.txt dest.txt".

However, when I issue

    os.link("source.txt", "dest.txt")

it fails with an OSError (EEXIST). This isn't surprising, as it's documented. Unfortunately, os.link doesn't support something like

    os.link("source.txt", "dest.txt", force=True)

However, I don't want to do

    os.unlink("dest.txt")
    os.link("source.txt", "dest.txt")

because if the power goes out between the unlink() and the link(), I'm left in a state where dest.txt is deleted but the link hasn't yet happened.

So my plan was to do something like

    import os
    import tempfile

    temp_name = tempfile.mktemp(dir=DIRECTORY_CONTAINING_SOURCE_TXT)
    os.link("source.txt", temp_name)
    try:
        os.rename(temp_name, "dest.txt")  # docs guarantee this is atomic
    except OSError:
        os.unlink(temp_name)

There's still potential leakage if a crash occurs, but I'd rather have an extra hard-link floating around than lose an original file-name.

Unfortunately, tempfile.mktemp() is described as deprecated since 2.3 (though it appears to still exist in the 3.4.2 that is the default Py3 on Debian Stable). The deprecation notice says

    "In version 2.3 of Python, this module was overhauled for
    enhanced security. It now provides three new functions,
    NamedTemporaryFile(), mkstemp(), and mkdtemp(), which should
    eliminate all remaining need to use the insecure mktemp()
    function"

but as best I can tell, all of the other functions/objects in the tempfile module return a file object, not a string suitable for passing to link().

So which route should I pursue?

- go ahead and use tempfile.mktemp(), ignoring the deprecation?

- use a GUID-named temp-file instead for less chance of collision?

- I happen to already have a hash of the file contents, so use the .hexdigest() string as the temp-file name?

- some other solution I've missed?

Thanks,

-tkc

-- 
https://mail.python.org/mailman/listinfo/python-list
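
For concreteness, here is a minimal sketch of the link-then-rename plan above, taking the GUID-named route rather than tempfile.mktemp(). The helper name force_link, the ".tmp-" prefix, the retry count, and the choice to put the temporary name in the destination's directory are all made up for illustration, not anything the os or tempfile modules provide. It relies on the behaviour already noted above: os.link() refuses to overwrite an existing name, so a collision shows up as EEXIST and can simply be retried. It also assumes a POSIX system, where rename() quietly does nothing if both names already refer to the same file, which is why the temporary name gets cleaned up afterwards in every case.

```python
import errno
import os
import uuid


def force_link(source, dest, max_attempts=10):
    """Hard-link source onto dest, replacing dest via link + rename.

    Hypothetical helper: a sketch, not a vetted implementation.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest))
    for _ in range(max_attempts):
        # Random name in dest's directory (assumption: writable, and on
        # the same filesystem as source, which hard links need anyway).
        temp_name = os.path.join(dest_dir, ".tmp-" + uuid.uuid4().hex)
        try:
            # link() fails rather than overwriting, so an existing
            # temp_name is reported as EEXIST instead of being clobbered.
            os.link(source, temp_name)
        except OSError as exc:
            if exc.errno == errno.EEXIST:
                continue  # name collision: pick another random name
            raise
        try:
            os.rename(temp_name, dest)
        finally:
            # temp_name is normally gone after the rename. It lingers if
            # the rename failed, or if dest was already a link to the
            # same file (POSIX rename() is then a no-op).
            try:
                os.unlink(temp_name)
            except OSError as exc:
                if exc.errno != errno.ENOENT:
                    raise
        return
    raise OSError(errno.EEXIST, "no free temporary name found", dest_dir)
```

Calling force_link("source.txt", "dest.txt") would then stand in for "ln -f source.txt dest.txt". A crash between the link and the rename can still leave a stray ".tmp-..." hard link behind, which is the same leakage trade-off described above.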