On 4/28/2020 5:05 AM, Mark Dixon wrote:
Hi David,

Running two jobs on a client takes the FD's CPU utilisation from 100% to 200%, so it does look multi-threaded.


Yes. I believe the client is multi-threaded in the sense that multiple commands can be issued and each is handled in its own spawned thread. However, each thread is itself sequential, so a single thread will not work on multiple files at the same time. If you run two backup jobs in parallel, bacula-fd may work on two files at once, depending on CPU core availability, etc.
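
If you want to try two jobs in parallel, concurrency has to be allowed in more than one place. A minimal sketch, showing only the concurrency-related directives (the resource names here are hypothetical, and the Storage and Device resources need the same treatment):

# bacula-dir.conf -- Director and Client must both allow more than one job
Director {
  Name = backup-dir                  # hypothetical
  Maximum Concurrent Jobs = 2
}

Client {
  Name = myclient-fd                 # hypothetical
  Address = myclient.example.org
  Maximum Concurrent Jobs = 2
}

# bacula-fd.conf on the client itself
FileDaemon {
  Name = myclient-fd
  Maximum Concurrent Jobs = 2
}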

Whether or not this helps depends on whether single-core CPU performance is actually the bottleneck. Is the single-job approach CPU-bound due to compression, or is it I/O-bound anyway? For example, on a 1 Gbit/s network the client can transfer at most 125 MB/s to the SD. And if many files on the same disk are worked on at once, average disk access times will rise, so the client's disk subsystem may become the bottleneck instead.

A good test would be to run the single job with compression disabled. If the throughput is much higher without compression, then splitting into multiple jobs may well help by spreading the compression across more cores. If the throughput isn't much different, splitting likely won't help. The same reasoning applies to encryption.
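
One way to set that test up is to clone the FileSet and comment out the compression option, then run the job once against each version. A sketch, with hypothetical names and paths:

FileSet {
  Name = "BigDir-NoComp"             # hypothetical clone of the real FileSet
  Include {
    Options {
      signature = MD5
      # compression = GZIP           # disabled for the throughput test
    }
    File = /srv/bigdir               # hypothetical path
  }
}

Comparing the rate reported in the two job reports should then show whether compression is the limiting factor.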



Virtual full backups sound like a useful alternative - thanks for that. But I am a little nervous about it effectively meaning "incremental forever", as far as the client is concerned.

On a side note, my configuration is in principle pretty verbose: I find myself writing programs whose output is in-lined into my configuration files via @|"" directives, either to simplify it or to abstract out passwords so that they don't end up in my version control system. Is this unusual?
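
For illustration, a minimal sketch of the sort of thing I mean - the script path and resource names here are made up, and the helper just has to print a complete Password = "..." line on stdout:

# bacula-dir.conf -- the program's output is spliced in at parse time,
# so the secret never sits in the file (or in version control)
Client {
  Name = myclient-fd                           # hypothetical
  Address = myclient.example.org
  @|"/usr/local/bin/bacula-secret myclient-fd" # hypothetical helper script
}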

Cheers,

Mark

On Mon, 27 Apr 2020, David Brodbeck wrote:

I'm not sure two jobs concurrently will work -- I think the FD is
still single-threaded, although someone can correct me if I'm wrong.

My solution was to go to virtual full backups, so that full backups on the client became a rare event. The heavy job then becomes the virtual full consolidation, which is strictly an SD and Director issue. My chokepoint for consolidation jobs is currently attribute despooling, which thrashes the database pretty hard, but it's still a lot faster than a full backup from the client.
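
For reference, roughly what the plumbing looks like - pool, storage, and job names here are hypothetical; the essential pieces are the Next Pool directive on the pool being consolidated and running the job at level VirtualFull:

# bacula-dir.conf -- the pool being consolidated must name a Next Pool
Pool {
  Name = File-Inc                    # hypothetical
  Pool Type = Backup
  Storage = File1                    # hypothetical SD storage
  Next Pool = File-Full              # consolidated full volumes land here
}

Job {
  Name = "BigDirBackup"              # hypothetical
  Type = Backup
  Accurate = yes                     # so the consolidated full is exact
  # (other standard Job directives omitted)
}

The consolidation itself is then triggered from bconsole with something like:

run job=BigDirBackup level=VirtualFull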

On Mon, Apr 27, 2020 at 9:34 AM Mark Dixon <mark.c.di...@durham.ac.uk>
wrote:

Hi all,

Am I right in thinking that a single Bacula job can only back up the files in its fileset sequentially - that there's no multithreading available to back up multiple files at the same time in order to leverage the client CPU?

I'm a relatively long-term user of Bacula (thanks!) who has been happy backing up relatively small data volumes to disk, but I am now faced with a fairly large directory. "Large" is defined as "takes too long to do a full dump", and the limiting factor at the moment might be down to software compression on the client's CPU.

Playing with the compression settings is the obvious approach, but I was wondering about other options - particularly as I may have a use case for
client-side encryption as well.

If the job stubbornly remains too long to back up, I suspect I'm looking at splitting the directory across multiple jobs and running them concurrently.

Is that right?

Thanks,

Mark


--
David Brodbeck
System Administrator, Department of Mathematics
University of California, Santa Barbara



_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
