On 11/11/22 00:43, hw wrote:
On Thu, 2022-11-10 at 21:14 -0800, David Christensen wrote:
On 11/10/22 07:44, hw wrote:
On Wed, 2022-11-09 at 21:36 -0800, David Christensen wrote:
On 11/9/22 00:24, hw wrote:
On Tue, 2022-11-08 at 17:30 -0800, David Christensen wrote:
Taking snapshots is fast and easy. The challenge is deciding when to
destroy them.
That seems like an easy decision: just keep as many as you can and
destroy the ones you can't keep.
As with most filesystems, performance of ZFS drops dramatically as you
approach 100% usage. So, you need a data destruction policy that keeps
storage usage and performance at acceptable levels.
Lots of snapshots slow down commands that involve snapshots (e.g. 'zfs
list -r -t snapshot ...'). This means sysadmin tasks take longer when
the pool has more snapshots.
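The destruction side can be scripted. Here is a minimal sketch
(untested; the dataset name is an argument, and the retention count of
10 is just an example):

    #!/bin/sh
    # Sketch: destroy all but the newest $KEEP snapshots of one dataset.
    # $1 is the dataset name; KEEP=10 is an example retention count.
    DATASET="$1"
    KEEP=10
    # list this dataset's snapshots, oldest first
    SNAPS=$(zfs list -H -t snapshot -o name -s creation -d 1 "$DATASET")
    [ -z "$SNAPS" ] && exit 0
    TOTAL=$(echo "$SNAPS" | wc -l)
    if [ "$TOTAL" -gt "$KEEP" ]; then
        echo "$SNAPS" | head -n $((TOTAL - KEEP)) |
        while read -r SNAP; do
            zfs destroy "$SNAP"
        done
    fi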
I have considered switching to one Intel Optane Memory
Series and a PCIe 4x adapter card in each server [for a ZFS cache].
Isn't that very expensive, and doesn't it wear out just as well?
The Intel Optane Memory Series products are designed to be cache devices
-- when using compatible hardware, Windows, and Intel software. My
hardware should be compatible (Dell PowerEdge T30), but I am unsure if
FreeBSD 12.3-R will see the motherboard NVMe slot or an installed Optane
Memory Series product.
Intel Optane Memory M10 16 GB PCIe M.2 80mm are US $18.25 on Amazon.
Intel Optane Memory M.2 2280 32GB PCIe NVMe 3.0 x2 are US $69.95 on Amazon.
Wouldn't it be better to have the cache in RAM?
Adding memory should help in more ways than one. Doing so might reduce
ZFS cache device usage, but I am not certain. However, more RAM will
not address the excessive wear problem of using a desktop SSD as a ZFS
cache device.
8 GB ECC memory modules to match the existing modules in my SOHO server
are $24.95 each on eBay. I have two free memory slots.
Please run and post the relevant command for LVM, btrfs, whatever.
Well, what would that tell you?
That would provide accurate information about the storage configuration
of your backup server.
Here is the pool in my backup server. mirror-0 and mirror-1 each use
two Seagate 3 TB HDD's. dedup and cache each use partitions on two
Intel SSD 520 Series 180 GB SSD's:
2022-11-11 20:41:09 toor@f1 ~
# zpool status p1
  pool: p1
 state: ONLINE
  scan: scrub repaired 0 in 7 days 22:18:11 with 0 errors on Sun Sep 4 14:18:21 2022
config:

        NAME                              STATE     READ WRITE CKSUM
        p1                                ONLINE       0     0     0
          mirror-0                        ONLINE       0     0     0
            gpt/p1a.eli                   ONLINE       0     0     0
            gpt/p1b.eli                   ONLINE       0     0     0
          mirror-1                        ONLINE       0     0     0
            gpt/p1c.eli                   ONLINE       0     0     0
            gpt/p1d.eli                   ONLINE       0     0     0
        dedup
          mirror-2                        ONLINE       0     0     0
            gpt/CVCV******D0180EGN-2.eli  ONLINE       0     0     0
            gpt/CVCV******7K180EGN-2.eli  ONLINE       0     0     0
        cache
          gpt/CVCV******D0180EGN-1.eli    ONLINE       0     0     0
          gpt/CVCV******7K180EGN-1.eli    ONLINE       0     0     0

errors: No known data errors
I suggest creating a ZFS pool with a mirror vdev of two HDD's.
If you
can get past your dislike of SSD's,
add a mirror of two SSD's as a
dedicated dedup vdev. (These will not see the hard usage that cache
devices get.)
Create a filesystem 'backup'. Create child filesystems,
one for each host. Create grandchild filesystems, one for the root
filesystem on each host.
Huh? What's with these relationships?
ZFS datasets can be organized into hierarchies. Child dataset
properties can be inherited from the parent dataset. Commands can be
applied to an entire hierarchy by specifying the top dataset and using a
"recursive" option. Etc..
When a host is decommissioned and you no longer need the backups, you
can destroy the backups for just that host. When you add a new host,
you can create filesystems for just that host. You can use different
backup procedures for different hosts. Etc.
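In command terms, the layout I am suggesting would look something like
this (a sketch only; the pool name 'p1', the device names ada1-ada4,
and the host names are placeholders, and I have omitted the GELI layer
shown in my pool above):

    # one pool, one mirror vdev of two HDD's
    zpool create p1 mirror ada1 ada2
    # optional: a mirror of two SSD's as a dedicated dedup vdev
    zpool add p1 dedup mirror ada3 ada4
    # parent filesystem, one child per host, one grandchild per
    # source root filesystem
    zfs create p1/backup
    zfs create p1/backup/host1
    zfs create p1/backup/host1/root
    zfs create p1/backup/host2
    zfs create p1/backup/host2/root

Properties set on p1/backup (compression, dedup, etc.) are then
inherited by every descendant, and a command such as 'zfs snapshot -r
p1/backup@...' operates on the whole hierarchy at once.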
Set up daily rsync backups of the root
filesystems on the various hosts to the ZFS grandchild filesystems. Set
up zfs-auto-snapshot to take daily snapshots of everything, and retain
10 snapshots. Then watch what happens.
What do you expect to happen?
I expect the first full backup and snapshot will use an amount of
storage that is something less than the sum of the sizes of the source
filesystems (due to compression). The second through tenth backups and
snapshots will each increase the storage usage by something less than
the sum of the daily churn of the source filesystems. On day 11, and
every day thereafter, the oldest snapshot will be destroyed, daily churn
will be added, and usage will stabilize. Any source system upgrades and
software installs will cause an immediate backup storage usage increase.
Any source system cleanings and software removals will cause a backup
storage usage decrease after 10 days.
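Concretely, the daily job for each host could look something like this
(a sketch; the host name, pool name, and rsync options are examples,
and zfs-auto-snapshot would normally handle the snapshot and retention
from cron rather than the explicit zfs command shown here):

    #!/bin/sh
    # Sketch: pull host1's root filesystem into its grandchild dataset.
    # -aHAX preserves permissions, hard links, ACLs, and xattrs; for
    # Linux sources you would also exclude /proc, /sys, /dev, etc.
    rsync -aHAX --delete --numeric-ids root@host1:/ /p1/backup/host1/root/
    # Recursive daily snapshot of the whole backup hierarchy; with
    # zfs-auto-snapshot, its label and keep count (10) replace this.
    zfs snapshot -r "p1/backup@daily-$(date +%Y-%m-%d)"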
I'm thinking about changing my backup server ...
In any case, I need to do more homework first.
Keep your existing backup server and procedures operational. If you do
not have offline copies of your backups (e.g. drives in removable racks,
external drives), implement that now.
Then work on ZFS. ZFS looks simple enough going in, but you soon
realize that ZFS has a large feature set, new concepts, and a
non-trivial learning curve. Incantations get long and repetitive; you
will want to script common tasks. Expect to make mistakes. It would be
wise to do your ZFS evaluation in a VM. Using a VM would also allow you
to use any OS supported by the hypervisor (which may work around the
problem of FreeBSD not having drivers for the HP Smart Array P410).
David