Erik Trimble wrote:
Jose Luis Barquín Guerola wrote:
Hello.
I have a question about how ZFS works with "Dynamic Stripe".
Let's start with the following situation:
- 4 disks of 100MB each, striped under ZFS.
- The stripe is 75% used, so we have 100MB free. (easy)
Now we add a new 100MB disk to the pool. So we have 200MB free,
but only 100MB will have the speed of 4 disks, and the remaining
100MB will have the speed of 1 disk.
The questions are:
- Does ZFS have any kind of reorganization of the data in the stripe
that changes this situation, so that the 200MB of free space gets
the speed of 5 disks?
- If the answer is yes, how is it done? In the background?
Yes, new writes are biased towards the more-empty vdev.
Thanks for your time (and sorry for my English).
JLBG
When you add more vdevs to the zpool, NEW data is written at the new
stripe width. That is, data written to the original pool went
across 4 drives; new data will now be written across 5 drives.
Existing data WILL NOT be changed.
So, for a zpool 75% full, you will NOT get to immediately use the
first 75% of the new vdevs added.
Thus, in your case, you started with a 400MB zpool (with 300MB of
data). You added another 100MB vdev, resulting in a 500MB zpool.
300MB is written across 4 drives, and will have the appropriate
speed. 75% of the new vdev isn't immediately usable (it corresponds
to the 75% already in use on the other 4 vdevs), so you have
effectively added only 25MB of immediately usable space. You end
up with:
300MB across 4 vdevs
125MB across 5 vdevs
75MB "wasted" space on 1 vdev
To correct this - that is, to recover the 75MB of "wasted" space and
to move the 300MB from spanning 4 vdevs to spanning 5 vdevs - you
need to re-write the entire existing data space. Right now, there is
no background or other automatic method to do this. 'cp -rp' or
'rsync' is a good idea.
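If you go that route, here is a rough sketch of the idea in Python (the
'/tank/data' path is made up, and you would want a snapshot or backup
first, since this rewrites files in place):

  # Copy each file to a temporary name and rename it back, so its blocks
  # are reallocated across all vdevs now in the pool. Same idea as the
  # 'cp -rp' / 'rsync' suggestion above; sketch only, not production code.
  import os, shutil

  def rewrite_tree(root):
      for dirpath, _dirs, files in os.walk(root):
          for name in files:
              src = os.path.join(dirpath, name)
              tmp = src + ".rewrite"
              shutil.copy2(src, tmp)   # new copy lands on the wider stripe
              os.replace(tmp, src)     # swap the rewritten copy into place

  rewrite_tree("/tank/data")           # hypothetical mountpoint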
We really should have something like 'zpool scrub' do this automatically.
No. Dynamic striping is not RAID-0, which is what you are describing.
In a dynamic stripe, the data written is not divided up amongst the current
devices in the stripe. Rather, data is chunked and written to the vdevs.
When about 500 kBytes has been written to a vdev, the next chunk is
written to another vdev. The choice of which vdev to go to next is based,
in part, on the amount of free space available on the vdev. So you get
your cake (stochastic spreading of data across vdevs) and you get to
eat it (use all available space), too.
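To make that concrete, here is a toy Python model (emphatically not the
real ZFS allocator; only the ~500 kByte chunking and the free-space bias
come from the description above, the rest is invented for illustration):

  # Toy model of free-space-biased chunk placement across vdevs.
  import random

  CHUNK = 500 * 1024                   # ~500 kBytes per chunk

  class Vdev:
      def __init__(self, name, size):
          self.name, self.size, self.used = name, size, 0
      @property
      def free(self):
          return self.size - self.used

  def write(vdevs, nbytes):
      while nbytes > 0:
          # Next vdev chosen with probability proportional to free space,
          # so the emptier vdev is favoured but fuller ones still get chunks.
          v = random.choices(vdevs, weights=[d.free for d in vdevs], k=1)[0]
          chunk = min(CHUNK, nbytes, v.free)
          v.used += chunk
          nbytes -= chunk

  # Four vdevs at 75% full plus one empty vdev, then 50MB of new writes:
  pool = [Vdev(n, 100 << 20) for n in "abcd"]
  for v in pool:
      v.used = 75 << 20
  pool.append(Vdev("e", 100 << 20))
  write(pool, 50 << 20)
  for v in pool:
      print(v.name, v.used >> 20, "MB used")

Run it a few times: the new vdev soaks up most of the new chunks, but
the leftover space on the old vdevs is still used, which is the
have-your-cake-and-eat-it point.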
-- richard