waylan wrote:
> Bruce wrote:
> > Hi all,
> > I have a question about traversing file systems, and could use some
> > help. Because of directories with many files in them, os.walk appears
> > to be rather slow. I'm thinking there is a potential for speed-up since
> > I don't need os.walk to report filenames of all the files in every
> > directory it visits. Is there some clever way to use os.walk or another
> > tool that would provide functionality like os.walk except for the
> > listing of the filenames?
>
> You might want to check out the path module [1] (not os.path). The
> following is from the docs:
>
> > The method path.walk() returns an iterator which steps recursively
> > through a whole directory tree. path.walkdirs() and path.walkfiles()
> > are the same, but they yield only the directories and only the files,
> > respectively.
>
> Oh, and you can thank Paul Bissex for pointing me to path [2].
>
> [1]: http://www.jorendorff.com/articles/python/path/
> [2]: http://e-scribe.com/news/289

A little late, but thanks for the replies; they were very useful.
Here's what I do in this case:

    import os

    def search(a_dir):
        # Collect every directory under a_dir that passes dirtest().
        valid_dirs = []
        for dirpath, dirnames, filenames in os.walk(a_dir):
            if dirtest(dirpath):
                valid_dirs.append(dirpath)
        return valid_dirs

    def dirtest(a_dir):
        # A directory is valid only if it contains all of the test files.
        testfiles = ['a', 'b', 'c']
        for f in testfiles:
            if not os.path.exists(os.path.join(a_dir, f)):
                return False
        return True

I think you're right: it's not os.walk that makes this slow, it's the
dirtest function that takes so much more time when there are many files
in a directory.

Also, thanks for pointing me to the path module; it was interesting.

--
http://mail.python.org/mailman/listinfo/python-list
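
For anyone landing on this thread with the original question (walking a tree without collecting filenames), here is a rough sketch of a directories-only walk. It assumes Python 3.6 or later and uses os.scandir, which did not exist when this thread was written. The names walk_dirs and has_all are made up for illustration and do not come from the path module or from the code above. Unlike os.walk, this sketch does not report symlinked directories.

```python
import os

def walk_dirs(top):
    """Yield top and every directory below it, never building file lists.

    Hypothetical helper: a directories-only stand-in for os.walk.
    """
    stack = [top]
    while stack:
        current = stack.pop()
        yield current
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    # Symlinked directories are skipped entirely here.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            # Unreadable directory: skip it and keep going.
            continue

def has_all(a_dir, names=('a', 'b', 'c')):
    """Return True if every name in names exists inside a_dir."""
    return all(os.path.exists(os.path.join(a_dir, n)) for n in names)

def search(a_dir):
    """Return the directories under a_dir that contain all test files."""
    return [d for d in walk_dirs(a_dir) if has_all(d)]
```

Whether this is actually faster than os.walk depends on the file system and the Python version, so it is worth timing on the real tree before switching.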