On 17/12/2012 15:41, Chris Angelico wrote:
> On Tue, Dec 18, 2012 at 2:28 AM, Gilles Lenfant
> <gilles.lenf...@gmail.com> wrote:
>> Hi,
>>
>> I have googled but did not find an efficient solution to my
>> problem. My customer provides a directory with a huuuuge list of
>> files (flat, potentially 100000+) and I cannot reasonably use
>> os.listdir(this_path) unless creating a big memory footprint.
>>
>> So I'm looking for an iterator that yields the file names of a
>> directory and does not make a giant list of what's in.
>
> Sounds like you want os.walk. But... a hundred thousand files? I
> know the Zen of Python says that flat is better than nested, but
> surely there's some kind of directory structure that would make this
> marginally manageable?
>
> http://docs.python.org/3.3/library/os.html#os.walk

Unfortunately, all of the built-in functions (os.walk, glob.glob,
os.listdir) rely on the os.listdir functionality, which produces a
complete list first even if (as in glob.iglob) the caller later
iterates over it.

There are external functions to iterate over large directories in
both Windows & Linux. I *think* the OP is on *nix from his previous
posts, in which case someone else will have to produce the
Linux-speak for this. If it's Windows, you can use the
FindFilesIterator in the pywin32 package.

TJG
-- 
http://mail.python.org/mailman/listinfo/python-list
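
For readers on a newer Python than the 3.3 discussed in this thread: os.scandir (added in Python 3.5, usable as a context manager from 3.6) yields directory entries lazily instead of building a list, and it works on both Windows and *nix. The following is only a minimal sketch of the kind of iterator the OP asked for; the function name iter_file_names is made up for illustration and is not part of any library.

```python
import os
from typing import Iterator


def iter_file_names(path: str) -> Iterator[str]:
    """Yield the names of regular files in `path`, one at a time.

    Hypothetical helper: assumes Python 3.6+ so that os.scandir
    can be used in a `with` block.
    """
    # os.scandir returns a lazy iterator of os.DirEntry objects,
    # so the full directory listing is never held in memory at once.
    with os.scandir(path) as entries:
        for entry in entries:
            # Skip subdirectories and other non-file entries.
            if entry.is_file():
                yield entry.name
```

It would be used as `for name in iter_file_names(this_path): ...`, handling each name as it arrives. The order of the names is whatever the operating system returns, so it should not be relied on.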