Note that: 6501037 want user/group quotas on ZFS
Is already committed to be fixed in build 113 (i.e. in the next month).

- Eric

On Thu, Mar 12, 2009 at 12:04:04PM +0900, Jorgen Lundman wrote:
> 
> In the style of a discussion over a beverage, and talking about
> user quotas on ZFS, I recently pondered a design for implementing user
> quotas on ZFS after having far too little sleep.
> 
> It is probably nothing new, but I would be curious what you experts
> think of the feasibility of implementing such a system and/or whether or
> not it would even realistically work.
> 
> I'm not suggesting that someone should do the work, or even that I will,
> but rather in the interest of chatting about it.
> 
> Feel free to ridicule me as required! :)
> 
> Thoughts:
> 
> Here at work we would like to have user quotas based on uid (and
> presumably gid) to be able to fully replace the NetApps we run. Current
> ZFS quotas are not good enough for our situation. We simply cannot mount
> 500,000 file systems on all the NFS clients. Nor do all the servers we
> run support mirror mounts. Nor does auto-mount see newly created
> directories without a full remount.
> 
> Current UFS-style user quotas are very exact. To the byte, even. We do
> not need this precision. If a user has 50MB of quota and they are able
> to reach 51MB of usage, that is acceptable to us, especially since they
> have to go under 50MB to be able to write new data anyway.
> 
> Instead of having complicated code in the kernel layer, slowing down the
> file system with locking and semaphores (and perhaps avoiding learning
> in-depth ZFS code?), I was wondering if a more simplistic setup could be
> designed that would still be acceptable. I will use the word
> 'acceptable' a lot. Sorry.
> 
> My thoughts are that the ZFS file system would simply write a
> 'transaction log' to a pipe. By transaction log I mean uid, gid and
> 'byte count changed'. And by pipe I don't necessarily mean pipe(2); it
> could be a fifo, pipe or socket. But currently I'm thinking
> '/dev/quota' style.
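[Editor's note: the transaction-log record Jorgen describes (uid, gid, byte-count delta) could be sketched as a fixed-size binary record. The layout below is purely a hypothetical illustration, not any real ZFS or /dev/quota interface:]

```python
import struct

# Hypothetical wire format for one /dev/quota record: uid, gid, and a
# signed byte-count delta (negative on unlink/truncate). The field layout
# is an assumption for illustration only.
QUOTA_RECORD = struct.Struct("=IIq")  # uid (u32), gid (u32), delta (s64)

def encode_record(uid: int, gid: int, delta: int) -> bytes:
    """Pack one quota transaction-log entry."""
    return QUOTA_RECORD.pack(uid, gid, delta)

def decode_records(buf: bytes):
    """Yield (uid, gid, delta) tuples from a buffer of whole records."""
    for off in range(0, len(buf), QUOTA_RECORD.size):
        yield QUOTA_RECORD.unpack_from(buf, off)
```

A fixed 16-byte record keeps the kernel-side hook trivial and lets the userland daemon drain the pipe in bulk without any framing logic.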
> 
> Userland will then have a daemon; whether it is one daemon per
> file system or just one daemon overall does not matter. This process
> will open '/dev/quota' and drain the transaction-log entries constantly,
> take the uid,gid entries and update the byte counts in its database. How
> we store this database is up to us, but since it is in userland it
> should have more flexibility, and it is not as critical for it to be
> fast as it would be in the kernel.
> 
> The daemon process can also grow in number of threads as demand
> increases.
> 
> Once a user's quota reaches the limit (note here that /the/ call to
> write() that goes over the limit will succeed, and probably a couple
> more after it. This is acceptable), the process will "blacklist" the uid
> in the kernel. Future calls to creat/open(O_CREAT)/write/(insert list of
> calls) will be denied. Naturally, calls to unlink/read etc. should still
> succeed. If the uid goes under the limit, the blacklisting of the uid
> will be removed.
> 
> If the userland process crashes or dies, for whatever reason, the
> buffer of the pipe will grow in the kernel. If the daemon is restarted
> sufficiently quickly, all is well; it merely needs to catch up. If the
> pipe does ever get full and items have to be discarded, a full scan of
> the file system will be required. Since even with UFS quotas we need to
> occasionally run 'quotacheck', it would seem this too is acceptable (if
> undesirable).
> 
> If you have no daemon process running at all, you have no quotas at all.
> But the same can be said about quite a few daemons. The administrators
> need to adjust their usage.
> 
> I can see a complication with doing a rescan. How could this be done
> efficiently? I don't know if there is a neat way to make this happen
> internally to ZFS, but from a userland-only point of view, perhaps a
> snapshot could be created (synchronised with the /dev/quota pipe
> reading?) and a scan started on the snapshot, while still processing the
> kernel log.
> Once the scan is complete, merge the two sets.
> 
> Advantages are that only small hooks are required in ZFS: the byte-count
> updates, and the blacklist with checks for being blacklisted.
> 
> Disadvantages are the loss of precision, and possibly slower
> rescans? Sanity?
> 
> But I do not really know the internals of ZFS, so I might be completely
> wrong, and everyone is laughing already.
> 
> Discuss?
> 
> Lund
> 
> -- 
> Jorgen Lundman       | <lund...@lundman.net>
> Unix Administrator   | +81 (0)3-5456-2687 ext 1017 (work)
> Shibuya-ku, Tokyo    | +81 (0)90-5578-8500 (cell)
> Japan                | +81 (0)3-3375-1767 (home)
> 
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 
Eric Schrock, Fishworks        http://blogs.sun.com/eschrock