Brandorr wrote:
  > Is ZFS efficient at handling huge populations of tiny-to-small files -
  > for example, 20 million TIFF images in a collection, each between 5
  > and 500k in size?
  >
  > I am asking because I could have sworn that I read somewhere that it
  > isn't, but I can't find the reference.
  >   
  If you're worried about I/O throughput, you should avoid RAIDZ1/2
  configurations. Random read performance will be disastrous if you do;

A raid-z group can do one random read per I/O latency. So 8 disks
(each capable of 200 IOPS) in a zpool split into 2 raid-z groups
should be able to serve 400 files per second. If you need to serve
more files, then you need more disks or need to use mirroring. With
mirroring, I'd expect to serve 1600 files per second (8*200). This
model only applies to random reads, not to sequential access, nor to
any type of write load.
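
To make that arithmetic concrete, here is a minimal sketch of the
model in Python (the function name and the 200-IOPS figure are
illustrative assumptions, not part of any ZFS tool):

def random_read_files_per_sec(disks, iops_per_disk, layout, raidz_groups=1):
    """Rough estimate of small-file random reads per second.

    Model: a raid-z group serves about one random read per I/O
    latency, so each group counts as a single spindle; a pool of
    mirrored disks can serve roughly one read per disk.
    """
    if layout == "raidz":
        return raidz_groups * iops_per_disk
    if layout == "mirror":
        return disks * iops_per_disk
    raise ValueError("unknown layout")

# 8 disks at ~200 random IOPS each:
print(random_read_files_per_sec(8, 200, "raidz", raidz_groups=2))  # ~400 files/s
print(random_read_files_per_sec(8, 200, "mirror"))                 # ~1600 files/s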

For small-file creation, ZFS can be extremely efficient in that it
can create more than one file per I/O. It should also approach disk
streaming performance for write loads.

  I've seen random read rates of less than 1 MB/s on an X4500 with 40
  dedicated disks for data storage.

It would be nice to see if the above model matches your data. So if
you have all 40 disks in a single raid-z group (an anti best
practice), I'd expect <200 files served per second, and if the files
average 5K in size, I'd expect roughly that 1 MB/sec.
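
As a quick sanity check of that figure (using the same assumed
numbers as above, not measurements):

# One raid-z group ~ one random read per I/O latency, so ~200 files/s
# at ~200 IOPS per disk; at 5 KB per file that is about 1 MB/s.
files_per_sec = 200
avg_file_kb = 5
print(files_per_sec * avg_file_kb / 1024.0, "MB/s")  # ~0.98 MB/s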

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

  If you don't have to worry about disk 
  space, use mirrors;  

right on !

  I got my best results during my extensive X4500 benchmarking
  sessions when I mirrored single slices instead of complete disks
  (resulting in 40 2-way mirrors on 40 physical disks, mirroring
  c0t0d0s0 -> c0t1d0s1 and c0t1d0s0 -> c0t0d0s1, and so on). If you're
  worried about disk space, you should consider striping several
  RAIDZ1 arrays, each one consisting of three disks or slices.
  Sequential access will fall off a cliff, but random reads will be
  boosted.
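
For illustration, the slice-mirror layout described above could be
generated along these lines (a sketch only: the pool name "tank" and
the c0tXd0sY device numbering are assumptions taken from the quote,
not a verified X4500 device map):

# Pair up disks and cross-mirror their slices:
# c0t0d0s0 <-> c0t1d0s1 and c0t1d0s0 <-> c0t0d0s1, and so on,
# giving two 2-way mirrors per pair of physical disks.
def slice_mirror_vdevs(n_disks):
    vdevs = []
    for i in range(0, n_disks, 2):
        a, b = i, i + 1
        vdevs += ["mirror", f"c0t{a}d0s0", f"c0t{b}d0s1"]
        vdevs += ["mirror", f"c0t{b}d0s0", f"c0t{a}d0s1"]
    return vdevs

print("zpool create tank " + " ".join(slice_mirror_vdevs(40)))  # 40 mirrors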

Writes should be good if not great, no matter what the
workload is. I'm interested in data that shows otherwise.

  You should also adjust the recordsize. 

For small files I certainly would not. Files smaller than the
recordsize are stored as a single record, and a single record is
good in my book. I'm not sure when one would want otherwise for
small files.


  Try to measure the average I/O transaction size. There's a good
  chance that your I/O performance will be best if you set your
  recordsize to a smaller value. For instance, if your average file
  size is 12 KB, try using an 8K or even 4K recordsize; stay away
  from 16K or higher.

Tuning the recordsize is currently only recommended for databases
(large files with fixed-size record access). Again, it would be
interesting input if tuning the recordsize helped another type of
workload.

-r

  -- 

  Ralf Ramge
  Senior Solaris Administrator, SCNA, SCSA

  Tel. +49-721-91374-3963 
  [EMAIL PROTECTED] - http://web.de/

  1&1 Internet AG
  Brauerstraße 48
  76135 Karlsruhe

  Amtsgericht Montabaur HRB 6484

  Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger,
  Matthias Greve, Robert Hoffmann, Norbert Lang, Achim Weiss
  Aufsichtsratsvorsitzender: Michael Scheeren

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss