On Jan 5, 2010, at 2:34 AM, Mikko Lammi wrote:

Hello,

As a result of one badly designed application running loose for some time,
we now seem to have over 60 million files in one directory. The good thing
about ZFS is that it allows this without any issues. Unfortunately, now that
we need to get rid of them (because they eat 80% of disk space), it seems
to be quite challenging.

Traditional approaches like "find ./ -exec rm {} \;" seem to take forever -
after running for several days, the directory size still stays the same. The
only way I've been able to remove anything has been by running "rm -rf"
on the problematic directory from the parent level. Running this command
shows the directory size decreasing by 10,000 files/hour, but at that rate
it would still take more than eight months (over 250 days) to delete
everything!

This is, in part, due to stat() slowness. Fixed in later OpenSolaris builds.
I have no idea if or when the fix will be backported to Solaris 10.

I also tried to use the "unlink" command on the directory - as root, as the
user who created the directory, after changing the directory's owner to root,
and so forth - but all attempts gave a "Not owner" error.

Any command like "ls -f" or "find" will run for hours (or days) without
actually listing anything from the directory, so I'm beginning to suspect
that the directory's data structures may be damaged. Are there any
diagnostics that I can run, e.g. with "zdb", to investigate and
hopefully fix a single directory within a ZFS dataset?

To make things even more difficult, this directory is located in rootfs,
so dropping the ZFS filesystem would basically mean reinstalling the
entire system, which is something we really wouldn't want to do.

How are the files named?  If you know something about the filename
pattern, then you could create subdirs and mv large numbers of files
to reduce the overall size of a single directory.  Something like:

        mkdir .A
        mv A* .A
        mkdir .B
        mv B* .B
        ...
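
With 60 million files, the glob in "mv A* .A" is likely to exceed the
shell's argument-list limit, so you may need to script it. A rough sketch
along those lines (assuming single-character name prefixes; the prefix list
is only an example, extend it to cover your files, and run it from inside
the huge directory):

        for p in A B C D; do
                mkdir ".$p"
                # Solaris find has no -maxdepth, so prune subdirectories
                # (including the new .X targets); moving one file per exec
                # stays under the shell's argument-list limit
                find . ! -name . -prune -type f -name "${p}*" \
                        -exec mv {} ".$p/" \;
        done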

Also, as previously noted, set atime=off.
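Something like this, if it isn't set already (the dataset name here is
only illustrative - use whichever dataset holds the directory):

        zfs set atime=off rpool/ROOT/s10be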

If you can handle a reboot, you can bump the size of the DNLC, which
might help also.  OTOH, if you can reboot you can also run the latest
b130 livecd which has faster stat().
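
The DNLC size is controlled by the ncsize tunable in /etc/system and is
picked up at the next boot; the value below is only an example, not a
recommendation:

        * /etc/system -- example only, size ncsize to taste
        set ncsize = 0x100000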
 -- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
