If you are just building a cache, why not just make a file system and
put a reservation on it? Turn off auto snapshots and set other features
per best practices for your workload. In other words, treat it like we
treat dump space.
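For example, something like this (a sketch; the pool name, dataset
name, and 2G figure are made up, and com.sun:auto-snapshot is the user
property the auto-snapshot service checks):

    zfs create rpool/afscache
    zfs set reservation=2G rpool/afscache
    zfs set quota=2G rpool/afscache
    zfs set com.sun:auto-snapshot=false rpool/afscache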
I think that we are getting caught up in trying to answer the question
you ask rather than solving the problem you have... perhaps because
we don't understand the problem.
-- richard
On Sep 20, 2009, at 2:17 PM, Andrew Deason wrote:
On Fri, 18 Sep 2009 17:54:41 -0400
Robert Milkowski <mi...@task.gda.pl> wrote:
There will be a delay of up to 30s currently.
But how much data do you expect to be pushed within 30s?
Let's say it were even 10 GB across lots of small files, and you
calculated the total size by summing only the logical size of the data.
Would you really expect the error to be greater than 5%, which would
be 500 MB? Does it matter in practice?
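Something like this would be enough (a minimal sketch; charge_for and
the recordsize parameter are hypothetical names, not OpenAFS code):

    #include <sys/stat.h>

    /* Hypothetical estimate of a cache file's disk usage: logical size
     * rounded up to the dataset recordsize (128K by default).  This
     * ignores compression, snapshots, and metadata, and over-charges
     * files smaller than the recordsize, which ZFS actually stores in
     * a single smaller block. */
    long long charge_for(const char *path, long long recordsize)
    {
        struct stat st;
        if (stat(path, &st) != 0)
            return -1;
        return ((st.st_size + recordsize - 1) / recordsize) * recordsize;
    }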
Well, that wasn't the problem I was thinking of. I meant: if we have to
wait 30 seconds after the write to measure the disk usage, what do I
do, just sleep 30s after the write before polling for disk usage?
We could just ask for disk usage when we write, knowing that it doesn't
take into account the write we are performing... but then we are
changing what we are measuring. If we are removing things from the
cache in order to free up space, how do we know when to stop?
To illustrate: normally when the cache is 98% full, we remove items
until we are 95% full before we allow a write to happen again. If we
relied on statvfs for our disk usage information, we would start
removing items at 98%, and have no idea when we hit 95% unless we
wait 30 seconds.
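To make the stop condition concrete, here is a sketch of that eviction
loop in C (make_room and evict_one_item are hypothetical names). On
ZFS, statvfs() may not reflect the frees we just did for up to ~30s,
so the loop cannot observe its own progress:

    #include <sys/statvfs.h>

    extern void evict_one_item(void);   /* hypothetical helper */

    /* Fraction of the file system's blocks currently in use. */
    static double fs_usage(const char *path)
    {
        struct statvfs vfs;
        if (statvfs(path, &vfs) != 0)
            return -1.0;
        return 1.0 - (double)vfs.f_bavail / (double)vfs.f_blocks;
    }

    /* Evict until we drop back below the 95% low-water mark.  With
     * statvfs() results lagging actual frees by up to ~30s, this loop
     * either spins or evicts far more than it needs to. */
    void make_room(const char *cachedir)
    {
        while (fs_usage(cachedir) > 0.95)
            evict_one_item();
    }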
If you are simply saying that logical size and used disk blocks on ZFS
are similar enough not to make a difference... well, that's what I've
been asking. I have asked what the maximum difference is between
"logical size rounded up to recordsize" and "size taken up on disk",
and haven't received an answer yet. If the answer is "small enough
that you don't care", then fantastic.
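For reference, the "size taken up on disk" side can at least be
measured per file via stat(2)'s st_blocks (a sketch, not OpenAFS
code; on ZFS this is subject to the same ~30s settling delay):

    #include <sys/stat.h>

    /* Bytes a file actually occupies on disk: POSIX defines st_blocks
     * in 512-byte units, so this reflects compression and real
     * allocation.  Caveat: on ZFS this too is only settled once the
     * transaction group syncs. */
    long long ondisk_bytes(const char *path)
    {
        struct stat st;
        if (stat(path, &st) != 0)
            return -1;
        return (long long)st.st_blocks * 512LL;
    }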
What if the user enables compression like lzjb or even gzip?
How would you take that into account before doing writes?
What if the user creates a snapshot? How would you take that into account?
Then it will be wrong; we do not take them into account. I do not care
about those cases. It is already impossible to guarantee that the cache
tracking data is 100% correct all of the time.
Imagine we somehow had a way to account for all of the cases you
listed, one that would make me happy. Say the directory the user uses
for the cache data is /usr/vice/cache (one standard path to put it).
The OpenAFS client will put cache data in e.g. /usr/vice/cache/D0/V1
and a bunch of other files. If the user puts their own file in
/usr/vice/cache/reallybigfile, our cache tracking information will
always be off, in all current implementations. We have no control over
that, and we do not try to solve that problem.
I am treating cases like "what if the user creates a snapshot" as a
similar situation. If someone does that and runs out of space, it is
pretty easy to troubleshoot their system and say "you have a snapshot
of the cache dataset; do not do that". Right now, if someone runs an
OpenAFS client cache on ZFS and runs out of space, the only thing I
can tell them is "don't use ZFS", which I don't want to do.
If it works for _a_ configuration -- the default one -- that is all
I am asking for.
I suspect that you are looking too closely for no real benefit.
Especially if you don't want to dedicate a dataset to the cache, you
should expect other applications on the system to write to the same
file system in other locations, over which you have no control and no
way to predict how much data will be written. Be it Linux, Solaris,
BSD, ... the issue will be there.
It is certainly possible for other applications to fill up the disk.
We just need to ensure that we don't fill up the disk and block other
applications. You may think this is fruitless, and just from that
description alone, it may be. But you must understand that without an
accurate bound on the cache, well... we can eat up the disk a lot
faster than other applications can, without the user realizing it.
--
Andrew Deason
adea...@sinenomine.net
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss