On Thursday, December 1, 2016 at 7:26:18 PM UTC-5, DFS wrote:
> On 12/01/2016 06:48 PM, Ned Batchelder wrote:
> > On Thursday, December 1, 2016 at 2:31:11 PM UTC-5, DFS wrote:
> >> After a simple test below, I submit that the above scenario would never
> >> occur.  Ever.  The time gap between checking for the file's existence
> >> and then trying to open it is far too short for another process to sneak
> >> in and delete the file.
> >
> > It doesn't matter how quickly the first operation is (usually) followed
> > by the second.  Your process could be swapped out between the two
> > operations.  On a heavily loaded machine, there could be a very long
> > time between them.
>
> How is it possible that the 'if' portion runs, then 44/100,000ths of a
> second later my process yields to another process which deletes the
> file, then my process continues?
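[The gap DFS is asking about can be made visible deterministically in a single process: the `os.remove()` below stands in for the other process that runs between the check and the open. This is a hypothetical simulation, not code from the thread.]

```python
import os
import tempfile

# Create a real temporary file to check against.
fd, filename = tempfile.mkstemp()
os.close(fd)

check_passed = os.path.isfile(filename)  # step 1: the existence check succeeds
os.remove(filename)                      # "another process" deletes it here
try:
    with open(filename) as f:            # step 2: the open now fails anyway
        f.read()
    open_failed = False
except FileNotFoundError:
    open_failed = True

print(check_passed, open_failed)  # True True
```

However short the real window is, the check's result is stale by the time the open runs; only the scheduler decides whether anything happens in between.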
A modern computer is running dozens or hundreds (or thousands!) of
processes "all at once".  How they are actually interleaved on the small
number of actual processors is completely unpredictable.  There can be
an arbitrary amount of time passing between any two processor
instructions.

I'm assuming you've measured this program on your own computer, which
was relatively idle at the moment.  This is hardly a good stress test of
how the program might execute under more burdened conditions.

> Is that governed by the dreaded GIL?
>
> "The mechanism used by the CPython interpreter to assure that only one
> thread executes Python bytecode at a time."
>
> But I see you posted a Stack Overflow answer:
>
> "In the case of CPython's GIL, the granularity is a bytecode
> instruction, so execution can switch between threads at any bytecode."
>
> Does that mean "chars=f.read().lower()" could get interrupted between
> the read() and the lower()?

Yes.  But even more importantly, the Python interpreter is itself a C
program, and it can be interrupted between any two instructions, and
another program on the computer could run instead.  That other program
can fiddle with files on the disk.

> I read something interesting last night:
> https://www.jeffknupp.com/blog/2012/03/31/pythons-hardest-problem/
>
> "In the new GIL, a hard timeout is used to instruct the current thread
> to give up the lock.  When a second thread requests the lock, the
> thread currently holding it is compelled to release it after 5ms (that
> is, it checks if it needs to release it every 5ms)."
>
> With a 5ms window, it seems the following code would always protect the
> file from being deleted between lines 4 and 5.
>
> --------------------------------
> 1  import os,threading
> 2  f_lock=threading.Lock()
> 3  with f_lock:
> 4      if os.path.isfile(filename):
> 5          with open(filename,'w') as f:
> 6              process(f)
> --------------------------------
>

You seem to be assuming that the program that might delete the file is
the same program trying to read the file.  I'm not assuming that.  My
Python program might be trying to read the file at the same time that a
cron job is running a shell script that is trying to delete the file.

> Also, this is just theoretical (I hope).  It would be terrible system
> design if all those dozens of processes were reading and writing and
> deleting the same file.

If you can design your system so that you know for sure no one else is
interested in fiddling with your file, then you have an easier problem.
So far, that has not been shown to be the case.  I'm talking more
generally about a program that can't assume those constraints.

--Ned.
--
https://mail.python.org/mailman/listinfo/python-list
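[Ned's "Yes" above, that execution can switch between the read() and the lower(), can be checked by disassembling the statement with the standard `dis` module: each attribute lookup and each call is a separate bytecode instruction, and CPython may switch threads between any two of them. Exact opcode names vary by Python version.]

```python
import dis

# Compile the exact statement from the thread and disassemble it.
code = compile("chars = f.read().lower()", "<example>", "exec")
dis.dis(code)

# read() and lower() are invoked by distinct call instructions, so a
# thread switch can land between them.  (Opcode names differ across
# versions: CALL_METHOD, PRECALL/CALL, CALL, ...)
calls = [ins for ins in dis.get_instructions(code) if "CALL" in ins.opname]
print(len(calls))  # at least two call instructions
```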
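[Since no in-process lock can stop a cron job in another process from deleting the file, the usual alternative is the standard Python idiom EAFP, "easier to ask forgiveness than permission": attempt the open and handle the failure, leaving no check/open gap to exploit. A minimal sketch; the helper name `process_if_present` is hypothetical, not from the thread.]

```python
import os
import tempfile

def process_if_present(filename):
    """Read the file if it exists; return None if another process
    deleted it first.  No separate existence check, so no race window."""
    try:
        with open(filename) as f:
            return f.read()
    except FileNotFoundError:
        return None

# Usage: one file that exists, one that does not.
fd, present = tempfile.mkstemp()
os.write(fd, b"data")
os.close(fd)
try:
    result = process_if_present(present)
    missing = process_if_present(present + ".gone")
finally:
    os.remove(present)

print(result, missing)  # data None
```

Even this only narrows the problem to the single `open()` call; on POSIX systems a file deleted after a successful open remains readable through the open handle, which is why EAFP is robust where check-then-open is not.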