Gregory P. Smith <g...@krypto.org> added the comment:

For some context here, the pathlib scandir code was written by Serhiy in 
https://bugs.python.org/issue26032 and related commit 
https://github.com/python/cpython/commit/680cb152c5d220a74321fa905d4fc91bdec40fbb.

> Any optimization can be accepted only when we have any prove that the change 
> actually has measurable effect, and that it is large enough.

While often good advice, there is no such blanket policy.  Saying that comes 
across as dismissive.  A better way to phrase such a statement is as a question:

In what practical scenarios does this change provide a demonstrable benefit?

The improvement in this case could be avoiding an explosion of memory use in 
some circumstances.  Depending on the particular filesystem.  (the original 
reason based on real world production experience that I pushed to have our 
os.scandir API created in the first place)

> Avoiding calling list() on the output of scandir() may be not harmless.

Keeping it may not be harmless either. :)

One key point of os.scandir() is to be iterated over instead of turned into a 
list to avoid using too much memory.  Code that requires a list could've just 
called listdir() (as the previous code did in both of these places) - if that 
is intentional, we should include an explicit comment stating why a preloaded 
list was intentional.

Either these list calls go away as is natural with scandir, or comments should 
be added explaining why they are important (related: unittests exercising the 
situations they are important for would be ideal)

As you're the original author of the pathlib scandir code, do you remember what 
the intent of these list(scandir) calls was?

One potential difference without them: If someone is selecting via pathlib and 
modifying directory file contents while iterating over the results, the lists 
may be important.  (what happens in this case presumably depends on the 
underlying os fs implementation)

It sounds like we don't have test cases covering such scenarios.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37935>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to