> BTW, I really wonder how much pain could be avoided if updatedb recorded > mtime of directories and checked it. I.e. instead of just doing blind > find(1), walk the stored directory tree comparing timestamps with those > in filesystem. If directory mtime has not changed, don't bother rereading > it and just go for (stored) subdirectories. If it has changed - reread the > sucker. If we have a match for stored subdirectory of changed directory, > check inumber; if it doesn't match, consider the entire subtree as new > one. AFAICS, that could eliminate quite a bit of IO...
That would just save reading the directories. Not sure it helps that much. Much better would be actually if it didn't stat the individual files (and force their dentries/inodes in). I bet it does that to find out if they are directories or not. But in a modern system it could just check the type in the dirent on file systems that support that and not do a stat. Then you would get much less dentries/inodes. Also I expect in general the new slub dcache freeing that is pending will improve things a lot. But even if updatedb was fixed to be more efficient we probably still need a general solution for other tree walking programs that cannot be optimized this way. -Andi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/