On Jul 9, 7:08 pm, Terry Reedy <[EMAIL PROTECTED]> wrote:
> Dan Stromberg wrote:
> > On Tue, 08 Jul 2008 15:18:23 -0700, [EMAIL PROTECTED] wrote:
> >
> >> I need to maintain a filesystem where I'll keep only the most recently
> >> used (MRU) files; least recently used ones (LRU) have to be removed to
> >> leave space for newer ones. The filesystem in question is a clustered fs
> >> (glusterfs) which is very slow on "find" operations. To add complexity
> >> there are more than 10^6 files in 2 levels: 16³ dirs with equally
> >> distributed number of files inside.
> >
> >> Any suggestions of how to do it effectively?
> >
> > os.walk once.
> >
> > Build a list of all files in memory.
> >
> > Sort them by whatever time you prefer - you can get times from os.stat.
>
> Since you do not need all 10**6 files sorted, you might also try the
> heapq module. The entries into the heap would be (time, fileid)

I'll look into it. Sorting the dirs by atime and adding the files inside
them to the heapq until I can remove enough of them would probably work
very efficiently.

Thanks,
Pau
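
A rough sketch of the heapq idea, in case it helps anyone reading the archive. It is only one way to do it, under a few assumptions of mine: the heap entries are (atime, size, path) rather than the (time, fileid) suggested above, "enough" means a byte count passed in as bytes_to_free, and the names oldest_files and pick_victims are made up for illustration. It walks the tree once with os.walk, heapifies the entries, and pops the least recently accessed files until their sizes add up to the requested amount. It only selects files; nothing is deleted.

```python
import heapq
import os


def oldest_files(root):
    """Yield (atime, size, path) for files under root, oldest atime first."""
    heap = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or cannot be read; skip it
            heap.append((st.st_atime, st.st_size, path))
    heapq.heapify(heap)  # linear time, cheaper than sorting everything
    while heap:
        # Each pop returns the entry with the smallest atime still in the heap.
        yield heapq.heappop(heap)


def pick_victims(root, bytes_to_free):
    """Return paths of the least recently accessed files, stopping once
    their combined size reaches bytes_to_free."""
    victims = []
    freed = 0
    for _atime, size, path in oldest_files(root):
        if freed >= bytes_to_free:
            break
        victims.append(path)
        freed += size
    return victims
```

The caller would then pass each returned path to os.remove. Because the generator stops as soon as enough space is accounted for, only the files that will actually be removed get popped from the heap, which is the point of using heapq instead of a full sort. Note that this relies on the filesystem recording atime at all, which depends on the mount options.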