Thanks for the review and suggestions! I'll send a v2 later; two replies are inline.

On 11/5/19 10:14 AM, Aaron Lauterer wrote:
Nicely written.

I have some suggestions inline:
* splitting long sentences
* adding more info as to what is valid for the size in special_small_blocks (taken from the zfs man page)
* rewrote the last paragraph a bit

On 10/22/19 12:33 PM, Fabian Ebner wrote:
 > Signed-off-by: Fabian Ebner <f.eb...@proxmox.com>
 > ---
 >   local-zfs.adoc | 44 ++++++++++++++++++++++++++++++++++++++++++++
 >   1 file changed, 44 insertions(+)
 >
 > diff --git a/local-zfs.adoc b/local-zfs.adoc
 > index b4fb7db..378cbee 100644
 > --- a/local-zfs.adoc
 > +++ b/local-zfs.adoc
 > @@ -431,3 +431,47 @@ See the `encryptionroot`, `encryption`, `keylocation`, `keyformat` and
 >   `keystatus` properties, the `zfs load-key`, `zfs unload-key` and `zfs
 >   change-key` commands and the `Encryption` section from `man zfs` for more
 >   details and advanced usage.
 > +
 > +
 > +ZFS Special Device
 > +~~~~~~~~~~~~~~~~~~
 > +
 > +Since version 0.8.0 ZFS allows adding a `special` device to a pool, which is
 > +then used to store metadata, deduplication tables and optionally small file
 > +blocks.

Since version 0.8.0, ZFS supports `special` devices. A `special` device in a pool is used to store metadata, deduplication tables, and optionally small file blocks.

 > +
 > +IMPORTANT: The redundancy of the `special` device should match the one of the
 > +pool, since the `special` device is a point of failure for the whole pool.
 > +
 > +WARNING: Adding a `special` device to a pool cannot be undone!
 > +
 > +.Create a pool with `special` device and RAID-1:
 > +
 > + zpool create -f -o ashift=12 <pool> mirror <device1> <device2> special mirror <device3> <device4>
 > +
 > +.Add a `special` device to an existing pool with RAID-1:
 > +
 > + zpool add <pool> special mirror <device1> <device2>
 > +
 > +For ZFS datasets where the `special_small_blocks` property is set to a non-zero
 > +value, the `special` device is used to store small file blocks up to that size.
 > +Setting the `special_small_blocks` property on the pool will change the default
 > +value of that property for all child ZFS datasets (for example all containers
 > +in the pool will opt in for small file blocks).
 > +
 > +.Opt in for small file blocks pool-wide:
 > +
 > + zfs set special_small_blocks=<size> <pool>
 > +
 > +.Opt in for small file blocks for a single dataset:
 > +
 > + zfs set special_small_blocks=<size> <pool>/<filesystem>
 > +
 > +.Opt out from small file blocks for a single dataset:
 > +
 > + zfs set special_small_blocks=0 <pool>/<filesystem>

INFO: The value for <size> can be `0`, which disables storing small file blocks on the `special` device, or a power of two in the range from 512B to 128K.
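
To make the rule for valid values concrete, here is a minimal Python sketch of the check described above. The function name `is_valid_special_small_blocks` is hypothetical and not part of any ZFS tooling; it only encodes the "zero, or a power of two from 512B to 128K" rule and assumes the size is given in bytes.

```python
def is_valid_special_small_blocks(size: int) -> bool:
    """Return True if size (in bytes) is 0 or a power of two in [512, 128K].

    Hypothetical helper for illustration only; ZFS performs its own
    validation when the property is set.
    """
    if size == 0:
        # 0 disables storing small file blocks on the special device.
        return True
    # A positive integer is a power of two if exactly one bit is set.
    is_power_of_two = size > 0 and (size & (size - 1)) == 0
    return is_power_of_two and 512 <= size <= 128 * 1024
```

For example, `is_valid_special_small_blocks(4096)` accepts 4K, while `is_valid_special_small_blocks(1000)` rejects a value that is not a power of two.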


Another thing I'll add here is the (non-intuitive) relation with the recordsize. Setting special_small_blocks to a value greater than or equal to the recordsize of the ZFS file system will cause *all* data to be written to the special device [0].
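
The relation with the recordsize can be illustrated with a small Python sketch. This is a simplified model under stated assumptions, not ZFS code: the helper names are hypothetical, sizes are in bytes, and effects such as compression or a full special device are ignored. It assumes a file block is placed on the special device when its size is at most the threshold, and that no file block is larger than the recordsize.

```python
def block_goes_to_special(block_size: int, special_small_blocks: int) -> bool:
    """Simplified placement rule for a single file block (assumption)."""
    # A threshold of 0 means small file blocks are not stored on the
    # special device at all.
    return special_small_blocks != 0 and block_size <= special_small_blocks


def all_data_on_special(recordsize: int, special_small_blocks: int) -> bool:
    """True if every file block would qualify for the special device."""
    # File blocks are at most recordsize bytes, so checking the largest
    # possible block covers all smaller ones as well.
    return block_goes_to_special(recordsize, special_small_blocks)
```

With a recordsize of 128K and a threshold of 128K, `all_data_on_special(128 * 1024, 128 * 1024)` is true, so the special device would receive all data; with a threshold of 4K it is false and only blocks up to 4K qualify.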

 > +
 > +Using a `special` device makes sense for pools with lots and lots of changing
 > +metadata respectively small files. If you also have other, larger I/O on the
 > +same pool then the benefit from using a `special` device might be even more
 > +noticeable. It is recommended to use SSDs or NVMes for the `special` device.
 >

A `special` device can improve the speed of small I/O operations if the pool consists of slow spinning hard disks. Enabling `special_small_blocks` can further increase the performance if a lot of small files are used. Use fast (NVMe) SSDs for the `special` device.


It's really about metadata and not small I/O operations in general. For example, I/O operations with a block size of 4K on large files will not benefit from a special device (even with special_small_blocks enabled). And I think that the benefit does not depend so much on the speed of the SSD. It should come from the fact that the I/O on the HDDs doesn't get disturbed as much by the metadata/small file operations.

What about the following?

A `special` device can improve the speed of a pool consisting of slow spinning hard disks with a lot of changing metadata, for example if the pool has many short-lived files. Enabling `special_small_blocks` can further increase the performance when those files are small. Use SSDs for the `special` device.

_______________________________________________
pve-devel mailing list
pve-devel@pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


[0]: https://github.com/zfsonlinux/zfs/issues/9131#issuecomment-523680936
