While I'm not able to contribute patches, I'd like to voice my support for the concept of having multiple spool files to enable concurrent spooling & de-spooling that Ralph Gross brought up in 2007[1] and which Jesper Krogh submitted as a feature request in 2009[2].
Problem synopsis: Data spooling is a necessity to get good throughput to modern, high-speed tape drives, but Bacula pauses spooling as soon as it starts writing to tape, which drastically reduces throughput.

Here are some numbers from our environment:

[A] Throughput w/o spooling: ~22MB/s
    This represents the aggregate of the speed to read data from disk and write to tape, with shoe-shining, network congestion, disk contention, etc.

[B] Throughput to spool file: ~55MB/s
    This represents the aggregate of the speed to read data from disk (a 9TB logical volume made up from multiple RAID5 and RAID6 LUNs) and write to the RAID-10 spool partition. This includes any network congestion, disk contention, etc.

[C] Throughput from disk spool file to LTO-4 tape: ~108MB/s
    This is the raw despooling speed.

[D] End-to-end throughput with spooling: ~27MB/s
    This is very disappointing. This is the overall throughput of steps [B] and [C] combined.

While eliminating shoe-shining is much better for the tape media and tape drive, the overall performance with spooling [D] is almost identical to [A], when it should be close to [B]. The reason for the decrease in performance is that Bacula stops all spooling as soon as it starts de-spooling.

In an ideal configuration, there could be multiple spool directories defined, and Bacula would open a new spool file in the next directory as soon as it begins despooling. An example bacula-sd configuration might contain:

    Spool Directory=/raid0-A/spool
    Spool Directory=/raid0-B/spool
    Spool Directory=/raid0-C/spool
    Concurrent Spool=yes

where each "/raid0-*" mount point is a separate RAID-0 array, so as to minimize contention.

The "Concurrent Spool" option would determine whether spooling follows the existing behavior, or whether multiple spool files (possibly in different directories) are used concurrently for spooling and despooling.
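
As a rough sanity check on the figures in [B], [C] and [D], the two modes can be compared with a back-of-the-envelope model. This is only a sketch under simplifying assumptions that are not in the measurements: every byte is spooled once and despooled once, the two rates stay constant, and there is no other overhead. The function names are made up for illustration and are not part of Bacula.

```python
# Back-of-the-envelope model; the rates are the measured figures quoted above.
SPOOL_MBPS = 55.0     # [B] source disks -> spool partition
DESPOOL_MBPS = 108.0  # [C] spool file -> LTO-4 tape


def serialized_throughput(spool: float, despool: float) -> float:
    """Spooling pauses during despooling, so each MB costs
    1/spool + 1/despool seconds in total."""
    return 1.0 / (1.0 / spool + 1.0 / despool)


def concurrent_throughput(spool: float, despool: float) -> float:
    """Spooling and despooling overlap, so the slower stage sets the pace."""
    return min(spool, despool)


if __name__ == "__main__":
    print(f"serialized: {serialized_throughput(SPOOL_MBPS, DESPOOL_MBPS):.1f} MB/s")
    print(f"concurrent: {concurrent_throughput(SPOOL_MBPS, DESPOOL_MBPS):.1f} MB/s")
```

Under these assumptions the serialized case works out to about 36 MB/s and the overlapped case to 55 MB/s, i.e. the rate in [B]. The measured ~27MB/s in [D] is lower than even the serialized estimate, which suggests there is additional overhead that this simple model ignores.
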
If the user defines a single spool directory (as in the current configuration) and does not define "Concurrent Spool = yes", the existing behavior would occur.

[1] http://copilotco.com/mail-archives/bacula-devel.2007/msg02642.html
[2] http://www.bacula.org/git/cgit.cgi/bacula/plain/bacula/projects?h=Branch-5.1

Thanks,

Mark