du is faster than my code that does the same thing in Python; it is highly optimized at the OS level.
That said, I profiled spawning an external process to call du, and over the large number of times I need to do this it is actually slower to execute du externally than my os.walk() implementation. du does not return the value I need anyway: I need the sizes of the files only, not the raw blocks consumed, which is what du returns. I also need to filter out some files and dirs.

After extensive profiling I found that, the way os.walk() is implemented, os.stat() gets called on the dirs and files multiple times, and that is where all the time is going. I guess I need something like the old statcache module, but that is deprecated and probably wouldn't fix my problem. I only walk the dir once and then cache all the byte counts; the cost is the repeated os.stat() calls triggered inside os.walk() and by all the isdir() and getsize() calls and whatnot.

Just wanted to check and see if anyone had already solved this problem.
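
One possible way to cut the repeated stat calls, offered as a sketch rather than a measured fix: on Python 3.6 or later (an assumption, since the post does not say which version is in use), os.scandir() yields DirEntry objects that can usually answer is_dir() and is_file() from the directory listing itself and that cache the result of stat(). The function name tree_size and the skip callback below are hypothetical names made up for illustration.

```python
import os


def tree_size(root, skip=lambda entry: False):
    """Sum st_size (apparent bytes, not disk blocks) of regular files under root.

    `skip` is a hypothetical filter hook: it receives each os.DirEntry and
    returns True for files or directories that should be left out.
    """
    total = 0
    stack = [root]  # explicit stack instead of recursion
    while stack:
        path = stack.pop()
        with os.scandir(path) as entries:
            for entry in entries:
                if skip(entry):
                    continue
                # is_dir()/is_file() normally use the type information that
                # came back with the directory listing, so no extra stat.
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    # DirEntry.stat() caches its result on the entry object.
                    total += entry.stat(follow_symlinks=False).st_size
    return total
```

A call might look like tree_size("/some/dir", skip=lambda e: e.name.startswith(".")), where the path and the filter rule are placeholders. Symlinks are neither followed nor counted here, which is a design choice of this sketch and not something the post specifies. Whether this beats the existing os.walk() code would still need profiling on the real data.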