Ben Hoyt added the comment:

> I find iterdir_stat() ugly :-) I like the scandir name, which has some
> precedent with POSIX.

Fair enough. I'm cool with scandir().

> scandir() cannot return (name, stat), because on POSIX, readdir() only
> returns d_name and d_type (the type of the entry): to return a stat, we would
> have to call stat() on each entry, which would defeat the performance gain.

Yes, you're right. I "solved" this in BetterWalk with the solution you propose of returning a stat_result object with the fields it could get "for free" set, and the others set to None.

So on Linux, you'd get a stat_result with only st_mode set (or None for DT_UNKNOWN), and all the other fields None. However -- st_mode is the one you're most likely to use, usually looking just for whether it's a file or directory. So calling code would look something like this:

    files = []
    dirs = []
    for name, st in scandir(path):
        if st.st_mode is None:
            st = os.stat(os.path.join(path, name))
        if stat.S_ISDIR(st.st_mode):
            dirs.append(name)
        else:
            files.append(name)

Meaning you'd get the speed improvements 99% of the time (when st_mode was set), but if st_mode is None, you can call stat and handle errors and whatnot yourself.

> That's why scandir would be a rather low-level call, whose main user would be
> walkdir, which only needs to know the entry time and not the whole stat
> result.

Agreed. This is in the OS module after all, and there's tons of stuff that's OS-dependent in there. However, I think that by doing something like the above, we can make it usable and performant on both Linux and Windows for use cases like walking directory trees.

> Also, I don't know which information is returned by the readdir equivalent on
> Windows, but if we want a consistent API, we have to somehow map d_type and
> Windows's returned type to a common type, like DT_FILE, DT_DIRECTORY, etc
> (which could be an enum).

The Windows scan directory functions (FindFirstFile/FindNextFile) return a *full* stat (or at least, as much info as you get from a stat in Windows).
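
To make the two cases concrete (a Linux-style entry where at most st_mode is known, and a Windows-style entry where most fields are filled in), here is a minimal sketch of the caller's side. `PartialStat` and `split_entries` are hypothetical names used only for illustration; nothing like them exists in the os module, and a real scandir() would fill the fields from readdir() or FindFirstFile/FindNextFile rather than receive them as input.

```python
import os
import stat
from collections import namedtuple

# Hypothetical stand-in for "stat_result with None meaning not present".
# Only three fields are modelled here; every field defaults to None.
# (The `defaults` argument needs Python 3.7 or later.)
PartialStat = namedtuple(
    "PartialStat",
    ["st_mode", "st_size", "st_mtime"],
    defaults=(None, None, None),
)


def split_entries(path, entries):
    """Split (name, st) pairs into (dirs, files).

    os.stat() is called only for entries whose st_mode is None,
    i.e. the DT_UNKNOWN case described above.
    """
    dirs, files = [], []
    for name, st in entries:
        mode = st.st_mode
        if mode is None:
            # Fallback: the caller decides how to handle errors here.
            mode = os.stat(os.path.join(path, name)).st_mode
        if stat.S_ISDIR(mode):
            dirs.append(name)
        else:
            files.append(name)
    return dirs, files
```

An entry built as `PartialStat(st_mode=stat.S_IFDIR | 0o755)` is classified without touching the filesystem, while `PartialStat()` triggers the os.stat() fallback.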

We *could* map them to a common type -- but I'm suggesting that common type might as well be "stat_result with None meaning not present". That way users don't have to learn a completely new type.

> The other approach would be to return a dummy stat object with only st_mode
> set, but that would be kind of a hack to return a dummy stat result with only
> part of the attributes set (some people will get bitten by this).

We could document any platform-specific stuff, and places where users could get bitten. But can you give me an example of where the stat_result-with-st_mode-or-None approach falls over completely?

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11406>
_______________________________________