On 7/26/2011 5:04 AM, Konstantin Khomoutov wrote:
> On Tue, 26 Jul 2011 00:18:05 -0700
> Steve Ellis<el...@brouhaha.com> wrote:
>
> [...]
>> Another point, even with your current config, if you
>> aren't doing data spooling you are probably slowing things down
>> further, as well as wearing out both the tapes and heads on the drive
>> with lots of shoeshining.
> (I'm asking as a person having almost zero prior experience with tape
> drives for backup purposes.)
>
> Among other things, I'm doing full backups of a set of machines to a
> single tape--yes, full backup each time, no incremental/differential
> which means I supposedly have just straightforward data flows from
> FDs to the SD. At present time I have max concurrent jobs set to 1
> on my tape drive resource and no data spooling turned on.
> Would I benefit from enabling data spooling in this scenario?
>
> To present some numbers, each machine's data is about 50-80G and I can
> use about 200G for the spool directory which means I could do spooling
> for 3-4 jobs in parallel (as described in [1]).
> Would that improve tape usage pattern?
>
> 1. http://www.bacula.org/en/dev-manual/main/main/Data_Spooling.html
>

OK, perhaps I'm not the best person to ask, but here's what I do know:
Even with only 1 job at a time, if you aren't able to deliver data to the drive at its minimum streaming data rate (for LTO4, probably at least 40MB/sec--possibly varies by manufacturer), then the tape mechanism will have to stop, go back a bit, wait for more data, then start up again--all of this takes time, and increases wear on the tapes and drive heads.

If you enable data spooling when you can't keep up with the drive anyway, even with a fairly modest spool size of 10-20G per job, I believe you will find that your backups will at least not be slower, and may well proceed faster, even with the overhead of spooling (assuming that your spool disk(s) are able to send data to the drive fast enough to hit near the maximum rate the drive can accept).

If you are using concurrent jobs, there is a further benefit: the data for all jobs won't be completely shuffled on the tape. If I recall, data spooling in Bacula implicitly turns on attribute spooling, which can also help, I believe, if there are lots of small files in your backup.

You don't have to spool an entire job in order to take advantage of spooling--and with multiple concurrent jobs, while one is despooling others can be spooling (have to watch out for whether your spool area can keep up with all the writes and reads, though).

I'm still on LTO3, but I believe that some people advocate RAID0 for spool disks for LTO4. I'm using an otherwise completely idle single drive for spooling 3 concurrent jobs and as far as I've noticed, I'm able to stream data to the drive at a rate it is happy with (again to LTO3).

I hope this helps,
-se
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users