[issue10948] Trouble with dir_util created dir cache
New submission from Diego Queiroz:

There is a problem with the dir_util cache (the "_path_created" global variable). It looks useful, but it isn't. To see the problem I'm facing, repeat these steps:

1) Use mkpath to create any path (e.g. /home/user/a/b/c)
2) Open a terminal and manually delete the directory "/home/user/a" and its contents
3) Try to create "/home/user/a/b/c" again using mkpath

Expected behavior: mkpath should create the folder tree again.

What happens: nothing. mkpath "thinks" the folder already exists because its creation was cached. Moreover, if you try to create one more folder level (e.g. /home/user/a/b/c/d), it raises an exception, because it thinks that part of the tree was already created and so fails to create the last folder.

I'm working with parallel applications that deal with files asynchronously, and this problem gave me a headache. Anyway, the solution is easy: remove the cache.

--
assignee: tarek
components: Distutils
messages: 126540
nosy: diegoqueiroz, eric.araujo, tarek
priority: normal
severity: normal
status: open
title: Trouble with dir_util created dir cache
type: behavior
versions: Python 2.5, Python 2.6, Python 2.7, Python 3.1

___
Python tracker <http://bugs.python.org/issue10948>
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
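The three steps in the report can be reproduced without distutils itself. What follows is only a simplified model of a cache-trusting mkpath, not the actual distutils source: `cached_mkpath` and `demo` are hypothetical names, and only the `_path_created` name is taken from the report. It assumes the cache is a module-level mapping that is never checked against the disk.

```python
import os
import shutil
import tempfile

# Simplified stand-in for the module-level cache: absolute path -> True.
_path_created = {}


def cached_mkpath(name):
    """Create `name` and any missing ancestors, trusting the cache.

    Returns the list of directories this call actually created.
    """
    name = os.path.normpath(name)
    created = []
    if os.path.isdir(name) or _path_created.get(os.path.abspath(name)):
        return created

    # Walk up until an existing ancestor is found.
    head, tail = os.path.split(name)
    tails = [tail]
    while head and tail and not os.path.isdir(head):
        head, tail = os.path.split(head)
        tails.insert(0, tail)

    # Walk back down, creating whatever the cache does not claim exists.
    for part in tails:
        head = os.path.join(head, part)
        abs_head = os.path.abspath(head)
        if _path_created.get(abs_head):
            continue  # stale entries are trusted without checking the disk
        os.mkdir(head)
        created.append(head)
        _path_created[abs_head] = True
    return created


def demo():
    root = tempfile.mkdtemp()
    try:
        target = os.path.join(root, "a", "b", "c")
        first = cached_mkpath(target)           # creates a, a/b and a/b/c
        shutil.rmtree(os.path.join(root, "a"))  # the "manual" deletion
        second = cached_mkpath(target)          # cache hit: nothing is created
        exists_after = os.path.isdir(target)
        try:
            cached_mkpath(os.path.join(target, "d"))
            deeper_error = None
        except OSError as exc:                  # parent a/b/c is missing on disk
            deeper_error = exc
        return len(first), second, exists_after, deeper_error
    finally:
        shutil.rmtree(root, ignore_errors=True)
```

In this model the second call returns an empty list and leaves the tree missing, and the call for the extra level fails in `os.mkdir` because the cached parents are skipped rather than recreated.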
[issue10948] Trouble with dir_util created dir cache
Diego Queiroz added the comment:

Well. My application does not actually remove the folders at random; it just can't guarantee, for a given process, how the folder it created will be deleted.

I have many tasks running on a cluster using the same disk. Some tasks create the folders/files and some of them remove them after processing. What each task does depends on the availability of computational resources.

The application is also aware of possible user interaction. That is, I need to be able to manipulate folders manually (adding or removing them) without crashing the application or corrupting data.
[issue10948] Trouble with dir_util created dir cache
Diego Queiroz added the comment:

Suppose the application creates one folder and adds some data to it:

- /scratch/a/b/c

While the application is still running (but no longer using the folder), you see the data, copy it somewhere else and delete everything manually from the terminal.

After some time (maybe a week or a month later, it doesn't really matter), the application wants to write to that folder again, but oops, the folder was removed. As the application is very well coded :-), it checks for that folder, notes that it doesn't exist anymore and decides it needs to be recreated. But when the application tries to do so, nothing happens, because the cache is not updated. ;/

Maybe the distutils package was not designed for the purpose I am using it for (I am not using it to install Python modules or anything), but this behavior is not well documented anyway.

If you really think the cache is important, two things need to be done:

1) Implement a way to update/clear the cache
2) Include details about the cache and its implications in the distutils documentation
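One possible shape for the first suggestion, sketched under assumptions: `DirCache`, its `mkpath` and `clear` methods, and `demo` are hypothetical names for illustration, not anything distutils provides. The idea is simply that a caller who knows the disk may have changed can drop the cached entries.

```python
import os
import shutil
import tempfile


class DirCache:
    """Hypothetical cache of created directories that can be invalidated."""

    def __init__(self):
        self._created = set()

    def mkpath(self, name):
        """Return False on a cache hit, True if the disk was consulted."""
        path = os.path.abspath(name)
        if path in self._created:
            return False  # trusted without looking at the disk
        os.makedirs(path, exist_ok=True)
        self._created.add(path)
        return True

    def clear(self):
        """Forget everything, so the next mkpath goes back to the disk."""
        self._created.clear()


def demo():
    root = tempfile.mkdtemp()
    try:
        target = os.path.join(root, "a", "b", "c")
        cache = DirCache()
        first = cache.mkpath(target)
        shutil.rmtree(os.path.join(root, "a"))  # external deletion
        stale = cache.mkpath(target)            # cache hit, nothing recreated
        missing = not os.path.isdir(target)
        cache.clear()
        fresh = cache.mkpath(target)            # recreated after clearing
        restored = os.path.isdir(target)
        return first, stale, missing, fresh, restored
    finally:
        shutil.rmtree(root, ignore_errors=True)
```

The caller still has to know when to call `clear()`, which is why the second suggestion, documenting the cache, matters just as much.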
[issue10948] Trouble with dir_util created dir cache
Diego Queiroz added the comment:

You were right, "os.makedirs" fits my needs. :-)

Anyway, I still think the change in the documentation is needed. This is not an implementation detail; it is part of the way the function works. The user should be aware of the behavior when he calls this function twice.

In my opinion, the documentation should be clear about everything. We could call this an implementation detail iff it did not affect anything externally, but this is not the case (it affects subsequent calls).

This function does exactly the same as "os.makedirs", but the why is described only in a comment inside the code. We know this is poor programming style. This information needs to be available in the documentation too.
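For reference, a minimal sketch of the "os.makedirs" approach. `ensure_dir` and `demo` are hypothetical helper names, and the `exist_ok` argument assumes Python 3.2 or later, which is newer than the versions listed on this issue.

```python
import os
import shutil
import tempfile


def ensure_dir(path):
    """Create `path` and missing parents; do nothing if it already exists."""
    # exist_ok=True requires Python 3.2 or later.
    os.makedirs(path, exist_ok=True)
    return os.path.isdir(path)


def demo():
    root = tempfile.mkdtemp()
    try:
        target = os.path.join(root, "a", "b", "c")
        created = ensure_dir(target)
        shutil.rmtree(os.path.join(root, "a"))  # simulate the external deletion
        recreated = ensure_dir(target)          # no cache, so the tree comes back
        repeated = ensure_dir(target)           # already present: no error
        return created, recreated, repeated
    finally:
        shutil.rmtree(root, ignore_errors=True)
```

On older interpreters without `exist_ok`, the usual substitute is to call `os.makedirs(path)` and ignore an `OSError` whose `errno` is `errno.EEXIST`.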
[issue10948] Trouble with dir_util created dir cache
Diego Queiroz added the comment:

"I would agree if mkpath were a public function."

So it is better to define what a "public function" is. Any function in any module of any project is public by definition if it is intended to be used by other modules.

If new people get involved in distutils development, they will need to read all the code, line by line and every comment, because the old developers decided not to document the inner workings of its functions.

"Considering that dir_util is gone in distutils2, I see no benefit in editing the doc."

Well, I know nothing about this. However, if you tell me that distutils2 will replace distutils, I may agree with you, and distutils just needs to be deprecated. Otherwise, I keep my opinion.