Steve <steve.jack...@norman.com> writes:

> I would like to ask a question regarding ZFS performance overhead when
> having hundreds of millions of files
>
> We have a storage solution, where one of the datasets has a folder
> containing about 400 million files and folders (very small 1K files)
>
> What kind of overhead do we get from this kind of thing?

at least 50%.  I don't think this is obvious, so I'll state it: RAID-Z
will not gain you any additional capacity over mirroring in this
scenario.

remember each individual file gets its own stripe.  if the file is 512
bytes or less, you'll need another 512 byte block for the parity
(as a special case this is effectively a copy rather than parity: the
XOR of a single data block is the block itself, so nothing needs to be
computed.)  what's more, RAID-Z rounds every allocation up to a multiple
of (parity + 1) blocks to reduce the chance of unusable single disk
blocks, so a file of 513 to 1024 bytes takes two data blocks, one
parity block and one padding block, 2048 bytes in all.  a 1536 byte file
also consumes 2048 bytes of physical disk, but needs no padding.  the
reasoning for RAID-Z2 is similar, except that allocations are rounded up
to a multiple of three blocks, so it will add a padding block even for
the 1536 byte file.  to summarise:

    net   raid-z1    raid-z2
  ---------------------------
    512   1024 2x    1536 3x
   1024   2048 2x    3072 3x
   1536   2048 1⅓x   3072 2x
   2048   3072 1½x   3072 1½x
   2560   3072 1⅕x   4608 1⅘x

the above assumes 512 byte disk blocks and at least 8 disks in a RAID-Z1
vdev (9 in a RAID-Z2 vdev), otherwise you'll get a little more overhead
for the "larger" filesizes, since they will need more than one parity
block.
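
the arithmetic behind the table can be sketched in a few lines of
Python.  this is only my reading of the allocation rule (one set of
parity blocks per row of data blocks, then round the total up to a
multiple of parity + 1), not code taken from ZFS.  the function name and
parameters are made up for illustration, and it ignores metadata,
compression and larger block sizes:

```python
def raidz_allocated_bytes(file_bytes, nparity, ndisks, sector=512):
    """Estimate the bytes allocated on a RAID-Z vdev for one small file.

    Hypothetical helper: assumes 512-byte disk blocks by default and
    ignores metadata, compression and everything but the file data.
    """
    # Data blocks needed, rounded up to whole disk blocks.
    data = -(-file_bytes // sector)
    # Each row spans the data disks and needs nparity parity blocks.
    data_disks = ndisks - nparity
    rows = -(-data // data_disks)
    total = data + nparity * rows
    # Pad the allocation up to a multiple of (nparity + 1) blocks.
    pad_to = nparity + 1
    total = -(-total // pad_to) * pad_to
    return total * sector


# Print one line per file size: RAID-Z1 on 8 disks, RAID-Z2 on 9 disks.
for size in (512, 1024, 1536, 2048, 2560):
    z1 = raidz_allocated_bytes(size, nparity=1, ndisks=8)
    z2 = raidz_allocated_bytes(size, nparity=2, ndisks=9)
    print(f"{size:5d}  {z1:5d} {z1 / size:.2f}x  {z2:5d} {z2 / size:.2f}x")
```

passing a smaller ndisks shows the extra overhead for narrow vdevs: once
a file needs more than one row, each row carries its own parity blocks.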

> Our storage performance has degraded over time, and we have been
> looking in different places for cause of problems, but now I am
> wondering if it's simply a file pointer issue?

adding new files will fragment directories, which might cause
performance degradation depending on access patterns.

I don't think having many files will in itself cause problems, but since
you get a lot more ZFS records in your dataset (128x as many as with
full 128 KiB records!), more of the disk space is "wasted" on block
pointers, and you may get more block pointer writes since more levels of
indirection are needed.
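
to put a rough number on the block pointer cost, here is a small sketch.
it assumes one 128 byte block pointer per record (the size of a ZFS
block pointer) and compares 1 KiB records with the default 128 KiB
record size.  the helper name is hypothetical, and indirect blocks,
dnodes and directory entries are left out:

```python
BLKPTR_BYTES = 128  # size of one ZFS block pointer


def blkptr_overhead(record_bytes):
    """Fraction of extra space spent on one block pointer per record."""
    return BLKPTR_BYTES / record_bytes


small = blkptr_overhead(1024)        # 1 KiB files, one record each
large = blkptr_overhead(128 * 1024)  # full 128 KiB records

print(f"1 KiB records:   {small:.2%} pointer overhead")
print(f"128 KiB records: {large:.2%} pointer overhead")
print(f"ratio: {small / large:.0f}x")
```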

-- 
Kjetil T. Homme
Redpill Linpro AS - Changing the game

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss