On 09/11/16 14:17, Ralf Brinkmann wrote:
> I just checked the use of the multicore compression program "gzip" on
> one file-daemon side. It did use 32 cores out of 48 possible.
>
> I think even for LTO tape drives the use of a multicore compression tool
> could be a win on time and storage space.

Be sure that your "48 cores" really are physical cores - hyperthreading is of no use when running gzip processes, and using more pigz threads than there are physical CPUs will result in a performance loss.
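A quick way to check is to compare the physical core count against what the OS reports as CPUs (a sketch only - the path and the "-p 24" below are examples, not a recommendation for your box):

  # physical cores = Socket(s) x Core(s) per socket; if Thread(s) per core
  # is greater than 1, hyperthreading is inflating the CPU count
  lscpu | grep -E '^(Socket|Core|Thread)'
  # then cap pigz at the physical core count, e.g. 2 sockets x 12 cores:
  pigz -9 -p 24 /path/to/backup.tar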
Pigz -9 will normally get you about 5% extra over the onboard hardware compression on an LTO drive, at a massive computational cost - and you have the extra cost of getting the data in from a PCIe bus, processing it through the CPU(s), and pushing it out through another PCIe bus. The overall penalties mean it's simply not worth doing.

Gzip normally has a maximum output speed of about 30-35MB/s on most x86 CPUs. What I've seen when compressing database dumps (an almost-ideal compression case, getting 20-30:1 compression) is that the best write speed - using pigz with 16 cores, writing to another (SSD) array - is about 50-60MB/s, far lower than the 140-180MB/s needed to keep an LTO6 or LTO7 drive happy with uncompressible data. Even less compressible (text) datasets only ever end up outputting at significantly under 100MB/s.

The bigger problem is that backing up any array of mechanical drives is not going to keep up with the speed of an LTO drive unless the array is utterly quiescent AND the data is laid down sequentially on the drive. As soon as the heads start seeking, throughput will fall away so badly that the tape has to slow to lower speeds or start shoeshining. Running the input through pigz is going to make things substantially worse, and the effect can easily be that your backup time is substantially extended whilst the amount of physical tape used is only slightly reduced. This effect is significantly worse when you're making differential or incremental backups.

It takes about 3 seconds for the tape to stop or start, so you really want to throw as much data at it as it can handle and only stop when you run out of data. It's bad enough with LTO when only a single tape drive is in play that an SSD spool drive is advisable. When you start backing up multiple simultaneous filesets and/or have multiple tape drives in operation, a decent SSD (PCIe NVMe) or ramdrive spool area is essential.

Pigz might be useful when your save area is a HDD (but a compressing filesystem like ZFS is probably better), but NOT when feeding tape.

> Output of top while pigz was running:
>> PID   USER  PR NI    VIRT   RES SHR S %CPU %MEM   TIME+ COMMAND
>> 35961 root  20  0 2399932 25088 664 S 2767  0.0 4:53.11 pigz -9
>> -p 32
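If you want to see what a 32-thread pigz run like that actually delivers in throughput terms on your own data, a rough benchmark is straightforward (the path is hypothetical; pv is only there to display the pipe throughput):

  # measure how fast pigz can emit compressed data from a sample of the
  # fileset, discarding the output
  pigz -9 -p 32 -c /srv/sample-fileset.tar | pv > /dev/null
  # compare the reported rate against the ~140-180MB/s an LTO6/7 drive
  # needs to stay streaming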