New submission from Elijah Rippeth <elijah.ripp...@gmail.com>:

I have a directory with hundreds of thousands of text files. I wanted to 
explore one file, so I wrote the following code expecting it to happen 
basically instantaneously because of how generators work:

```python
from pathlib import Path

base_dir = Path("/path/to/lotta/files/")
files = base_dir.glob("*.txt")            # return immediately
first_file = next(files)                  # doesn't return immediately
```

to my surprise, this took a long time to finish since `next` on a generator 
should be O(1).

A colleague pointed me to the following code: 
https://github.com/python/cpython/blob/adcd2205565f91c6719f4141ab4e1da6d7086126/Lib/pathlib.py#L431

I assume calling this list is to "freeze" a potentially changing directory 
since `scandir` relies on `os.stat`, but this causes a huge penalty and makes 
the generator return-type a bit disingenuous. In any case, I think this is bug 
worthy in someo sense.

----------
components: IO
messages: 393190
nosy: Elijah Rippeth
priority: normal
severity: normal
status: open
title: pathlib.Path.glob's generator is not a real generator
type: performance
versions: Python 3.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue44069>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to