On 09/11/16 14:17, Ralf Brinkmann wrote:
> I just checked the use of the multicore compression program "gzip" on
> one file-daemon side. It did use 32 cores out of 48 possible.
>
> I think even for LTO tape drives the use of a multicore compression tool
> could be a win on time and storage space.
Be sure that your "48 cores" really are physical cores - 
hyperthreading is of no use when running gzip processes, and using more 
pigz threads than there are physical cores will result in performance loss.
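
A rough way to check this on Linux is to count the unique (physical id, 
core id) pairs in /proc/cpuinfo rather than trusting the logical CPU 
count. The sketch below is only an illustration: the function name and 
the sample text are made up here, and it assumes the x86 /proc/cpuinfo 
layout where "physical id" appears before "core id" in each block.

```python
def count_physical_cores(cpuinfo_text):
    """Count unique (physical id, core id) pairs in /proc/cpuinfo-style text."""
    cores = set()
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        value = value.strip()
        if key == "physical id":
            # Socket number for the logical CPU block being read.
            physical_id = value
        elif key == "core id":
            # Hyperthread siblings share the same (socket, core) pair.
            cores.add((physical_id, value))
    return len(cores)


# Hypothetical sample: 4 logical CPUs on 1 socket, 2 cores with hyperthreading.
SAMPLE = """\
processor   : 0
physical id : 0
core id     : 0

processor   : 1
physical id : 0
core id     : 1

processor   : 2
physical id : 0
core id     : 0

processor   : 3
physical id : 0
core id     : 1
"""

physical = count_physical_cores(SAMPLE)
print(f"physical cores: {physical}")
```

On a real machine you would feed it the contents of /proc/cpuinfo and 
pass a value no higher than the result to pigz -p.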

Pigz -9 will normally get you 5% extra over the onboard hardware 
compression on an LTO drive, at a massive computational cost - and you 
have the extra cost of getting the data in from a PCIe bus, processed 
through the CPU(s), and out through another PCIe bus. The overall 
penalties mean it's simply not worth doing.

Gzip normally has a maximum output speed of about 30-35MB/s on most x86 
CPUs. What I've seen when compressing database dumps (an almost-ideal 
compression case, getting 20-30:1 compression) is that the best write 
speed, using pigz on 16 cores to another (SSD) array, is about 
50-60MB/s - far lower than the 140-180MB/s needed to keep an LTO6 or 7 
drive happy with incompressible data. Even less-compressible (text) 
datasets only ever end up outputting at significantly under 100MB/s.
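
To put those figures side by side, here is a small back-of-envelope 
sketch. It uses only the ranges quoted above (50-60MB/s out of pigz, 
140-180MB/s needed by the drive); the function names are invented for 
illustration and this is not a benchmark.

```python
def tape_fill_fraction(compressed_out_mb_s, tape_native_mb_s):
    """Fraction of the drive's native streaming rate the compressor supplies."""
    return compressed_out_mb_s / tape_native_mb_s


def can_stream(compressed_out_mb_s, tape_native_mb_s):
    """True if compressed output arrives at least as fast as the drive writes."""
    return compressed_out_mb_s >= tape_native_mb_s


# Ranges taken from the figures in this mail.
best_case = tape_fill_fraction(60, 140)   # fastest pigz output, slowest drive
worst_case = tape_fill_fraction(50, 180)  # slowest pigz output, fastest drive

print(f"pigz supplies {worst_case:.0%} to {best_case:.0%} of what the drive needs")
print("drive can stream:", can_stream(60, 140))
```

Even in the most favourable combination the compressor delivers well 
under half of what the drive wants.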


The bigger problem is that backing up any array of mechanical drives is 
not going to keep up with the speed of an LTO drive unless the array is 
utterly quiescent AND the data is sequentially laid down on the drive. 
As soon as the heads start seeking, throughput will fall away so badly 
that the tape will have to slow to lower speeds or start shoeshining.

Running the input through pigz is going to make things substantially 
worse and the effect can easily be that your backup time is 
substantially extended, whilst the amount of physical tape used is only 
slightly reduced.

This effect is significantly worse when you're making differential or 
incremental backups. It takes about 3 seconds for the tape to stop or 
start, so you really want to throw as much data at it as it can handle 
and only stop when you run out of data.
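
As a rough illustration of why stalls hurt so much, the sketch below 
charges a fixed penalty for every time the drive runs dry. The 3 second 
default is the stop/start figure above; the function name, the job size 
and the stall count are made-up example values, and the model ignores 
drive speed matching entirely.

```python
def effective_rate_mb_s(total_mb, tape_mb_s, stalls, stall_penalty_s=3.0):
    """Average write rate when each stall costs a fixed stop/start penalty."""
    streaming_s = total_mb / tape_mb_s   # time spent actually writing
    stalled_s = stalls * stall_penalty_s  # time lost to stopping and restarting
    return total_mb / (streaming_s + stalled_s)


# Hypothetical job: 10000MB to a 160MB/s drive.
no_stalls = effective_rate_mb_s(10000, 160, stalls=0)
with_stalls = effective_rate_mb_s(10000, 160, stalls=20)

print(f"no stalls:  {no_stalls:.1f} MB/s")
print(f"20 stalls:  {with_stalls:.1f} MB/s")
```

A small incremental that trickles in can spend more time stopping and 
starting than writing.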

Even with a single LTO drive in play, this is bad enough that an SSD 
spool drive is advisable.

When you start backing up multiple simultaneous filesets and/or have 
multiple tape drives in operation, a decent SSD (PCIe NVMe) or ramdrive 
spool area is essential.

Pigz might be useful when your save area is a HDD (but a compressing 
filesystem like ZFS is probably better) but NOT when feeding tape.



>
> Output of top while pigz was running:
>>    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
>> 35961 root      20   0 2399932  25088    664 S  2767  0.0   4:53.11 pigz -9 -p 32





_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
