Erik Trimble wrote:
Jose Luis Barquín Guerola wrote:
Hello.
I have a question about how ZFS works with "Dynamic Striping".

Well, let's start with the following situation:
  - 4 disks of 100MB each, striped under ZFS.
  - The stripe is 75% used, so we have 100MB free. (easy)

Now we add a new 100MB disk to the pool. So we have 200MB free, but only 100MB will have the speed of 4 disks, and the remaining 100MB will have the speed of 1 disk.

The questions are:
- Does ZFS have any kind of reorganization of the data in the stripe that changes this situation, so that we end up with 200MB free at the speed of 5 disks?
   - If the answer is yes, how is it done? In the background?

Yes, new writes are biased towards the emptier vdev.


Thanks for your time (and sorry for my English).

JLBG

When you add more vdevs to the zpool, NEW data is written at the new stripe width. That is, data written to the original pool was written across 4 drives; it will now be written across 5 drives. Existing data WILL NOT be changed.

So, for a zpool that is 75% full, you will NOT immediately be able to use the first 75% of any new vdevs you add.

Thus, in your case, you started with a 400MB zpool (with 300MB of data). You added another 100MB vdev, resulting in a 500MB zpool. The 300MB is written across 4 drives, and will have the appropriate speed. 75% of the new vdev isn't immediately usable (as it corresponds to the 75% in-use on the other 4 vdevs), so you have effectively only added 25MB of immediately usable space. So you end up with:

300MB across 4 vdevs
125MB across 5 vdevs
75MB "wasted" space on 1 vdev

To correct this - that is, to recover the 75MB of "wasted" space and to move the 300MB from spanning 4 vdevs to spanning 5 vdevs - you need to re-write the entire existing data space. Right now, there is no background or other automatic method to do this. 'cp -rp' or 'rsync' is a good idea. We really should have something like 'zpool scrub' do this automatically.
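
In the absence of a built-in rebalancer, the 'cp -rp' / 'rsync' idea boils down to rewriting every file so its blocks get reallocated at the current stripe width. Below is a minimal illustration in Python; the /tank/data path is made up, and this is a blunt userland sketch rather than a ZFS feature (note also that any snapshots would keep the old blocks referenced).

    # Illustrative only: rewrite every file under a directory so ZFS
    # allocates fresh blocks across the current set of vdevs.
    import os
    import shutil

    def rewrite_in_place(root):
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                src = os.path.join(dirpath, name)
                tmp = src + ".rebalance"
                shutil.copy2(src, tmp)   # new copy -> new block allocations
                os.rename(tmp, src)      # replace the original with the new copy

    if __name__ == "__main__":
        rewrite_in_place("/tank/data")   # hypothetical mount point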


No.  Dynamic striping is not RAID-0, which is what you are describing.
In a dynamic stripe, the data written is not divided up amongst the current
devices in the stripe.  Rather, data is chunked and written to the vdevs.
When about 500 kBytes has been written to a vdev, the next chunk is
written to another vdev.  The choice of which vdev to go to next is based,
in part, on the amount of free space available on the vdev.  So you get
your cake (stochastic spreading of data across vdevs) and you get to
eat it (use all available space), too.
-- richard
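
To make Richard's description concrete, here is a toy model (not ZFS source code, just a sketch of the behaviour he describes): writes go out in roughly 500 kB chunks, and the choice of the next vdev is weighted towards the ones with more free space. The chunk size, vdev sizes and weighting scheme here are simplifications.

    # A toy model of dynamic striping as described above: ~500 kB chunks,
    # next vdev chosen with a bias towards more free space. Not ZFS code.
    import random

    CHUNK_KB = 500                        # approximate per-vdev switch point

    def pick_vdev(free_kb):
        # Weight the choice by free space; emptier vdevs are picked more often.
        if sum(free_kb) == 0:
            return None
        return random.choices(range(len(free_kb)), weights=free_kb)[0]

    def write(free_kb, size_kb):
        # Spread one logical write across vdevs in ~CHUNK_KB pieces.
        while size_kb > 0:
            v = pick_vdev(free_kb)
            if v is None:
                raise RuntimeError("pool is full")
            chunk = min(CHUNK_KB, size_kb, free_kb[v])
            free_kb[v] -= chunk
            size_kb -= chunk

    # Four old vdevs with 25 MB free each, one new vdev with 100 MB free.
    free = [25_000] * 4 + [100_000]
    write(free, 150_000)                  # a 150 MB write, biased to the new vdev
    print(free)                           # the new vdev typically absorbs the largest share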
