Chris Angelico <ros...@gmail.com> writes:

> On Tue, Dec 18, 2012 at 2:28 AM, Gilles Lenfant
> <gilles.lenf...@gmail.com> wrote:
>> Hi,
>>
>> I have googled but did not find an efficient solution to my
>> problem. My customer provides a directory with a huuuuge list of
>> files (flat, potentially 100000+) and I cannot reasonably use
>> os.listdir(this_path) unless creating a big memory footprint.
>>
>> So I'm looking for an iterator that yields the file names of a
>> directory and does not make a giant list of what's in.
>
> Sounds like you want os.walk.

But doesn't os.walk call listdir(), which builds a list of the
directory's contents? That is exactly the initial problem.

> But... a hundred thousand files? I know the Zen of Python says that
> flat is better than nested, but surely there's some kind of directory
> structure that would make this marginally manageable?
>

Sometimes you have to deal with things other people have designed, so
the directory structure is not something you can control. I've run up
against exactly the same problem and wrote something in C that
implemented an iterator. It would probably be better if listdir()
returned an iterator rather than a list.
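
For what it's worth, later Python versions (3.5 and up) added
os.scandir(), which returns an iterator over directory entries instead
of building a list. Below is a minimal sketch of the kind of iterator
asked for above, not the C implementation mentioned earlier. It assumes
Python 3.6+ for the "with" form, and iter_filenames is a hypothetical
helper name, not something from this thread.

```python
import os


def iter_filenames(path):
    """Yield the names of regular files in `path` one at a time.

    Hypothetical helper: relies on os.scandir() (Python 3.5+), which
    reads directory entries lazily rather than returning a full list.
    """
    # The context manager (Python 3.6+) closes the underlying directory
    # handle even if the caller stops iterating early.
    with os.scandir(path) as entries:
        for entry in entries:
            # Skip subdirectories and other non-file entries.
            if entry.is_file():
                yield entry.name
```

It would be used as "for name in iter_filenames(this_path): ...", so
only one entry is held by the caller at a time.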