Ben Hoyt added the comment:

To Victor and Serhiy:

1) Serhiy's point about not needing to build the symlinks set when followlinks 
is True is a good one, because it'll never get used. Just a "if not 
followlinks: ..." around that try/except would do it. Though this is a small 
optimization, as I expect 95% of people use os.walk() with followlinks=False, 
which is the default.

2) Much as I don't want to admit it :-), I think Serhiy has a point regarding 
the change in behaviour. Can we accept this tiny change? This happens because 
the previous implementation of os.walk() calls path.islink(), which does a real 
syscall, after the yield returns.

So if the caller modifies "dirnames" and adds a symlink to a directory it won't 
be in the symlinks set. Or if (as Serhiy's example shows) they change a 
symlink-to-a-directory to a directory the new implementation doesn't do another 
syscall to check, so differs from the old one -- in Serhiy's example, the old 
os.walk() will recurse into the changed 
symlink-to-a-directory-that's-now-a-directory, whereas the new os.walk() won't 
recurse.

Arguably the new behaviour is just as good here, but it is different. And 
Serhiy's function unsymlink() is a reasonable example of how this might play 
out.

The os.walk() docs explicitly allow modifying "dirnames" -- not just removing 
and reordering, but also adding entries, which could surface the difference in 
behaviour: "the caller can modify the dirnames list in-place ... even to inform 
walk() about directories the caller creates or renames before it resumes walk() 
again". See here:

https://docs.python.org/3.4/library/os.html#os.walk

So I do think it's a real issue. Will this really be an issue in practice? I 
don't know; I'm tempted to think not.

Obviously if we have to call stat/islink again, it defeats half of the purpose 
of scandir. But I don't see a way around it if we want to keep the exact old 
os.walk() behaviour. We could fix the behaviour if a directory/symlink was 
added to "dirnames" fairly easily by testing whether it changed, but I don't 
see a way to fix the unsymlinks() example without a syscall.

Thoughts?

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23605>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to