On 4/28/2020 5:05 AM, Mark Dixon wrote:
Hi David,
Running two jobs on a client takes the FD's CPU utilisation from 100%
to 200%, so it does look multi-threaded.
Yes. I believe the client is multi-threaded in that multiple commands
can be issued and they will each be handled in a separately spawned
thread. However, each thread will itself be sequential, so a single
thread will not work on multiple files at the same time. If you run two
backup jobs in parallel, then bacula-fd may work on two files at the
same time, depending on CPU core availability, etc.
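For what it's worth, a minimal sketch of what has to be in place for two jobs to run in parallel at all - concurrency must be allowed at every layer, and all resource names below are invented:

   # bacula-dir.conf (fragment)
   Director {
     Name = backup-dir
     Maximum Concurrent Jobs = 2
     # ... other required directives ...
   }
   Client {
     Name = bigbox-fd
     Maximum Concurrent Jobs = 2
     # ...
   }
   Storage {
     Name = file-sd
     Maximum Concurrent Jobs = 2
     # ...
   }

The FileDaemon resource in bacula-fd.conf has its own Maximum Concurrent Jobs directive as well, so check that it isn't set lower than you want.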
Whether that helps depends on whether single-core CPU performance is
indeed the bottleneck. Is the single job CPU bound due to compression,
or is it I/O bound anyway? For example, on a 1 Gbit/s network the client
can transfer at most 125 MB/s to the SD. And if many files on the same
disk are worked on at once, average disk access times can suffer and the
disk subsystem on the client may become the bottleneck.
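(Spelling the arithmetic out: 1 Gbit/s divided by 8 bits per byte = 125 MB/s, and that is before protocol overhead, so sustained throughput to the SD will be somewhat lower in practice.)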
A good test would be to run the single job with compression disabled. If
the throughput is much greater without compression, then perhaps
splitting into multiple jobs will help by utilizing more cores for the
compression. If the throughput isn't much different, then splitting into
multiple jobs likely won't help. Likewise for encryption.
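As a concrete sketch, the only thing that should need toggling for the test is the compression line in the FileSet Options (names and paths below are invented):

   # bacula-dir.conf (fragment)
   FileSet {
     Name = "BigDir"
     Include {
       Options {
         signature = MD5
         compression = GZIP   # comment this out for the test run
       }
       File = /srv/bigdir     # hypothetical path
     }
   }

Then re-run the full backup and compare the job's bytes/second rate with and without it.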
Virtual full backups sound like a useful alternative - thanks for that.
But I am a little nervous about it effectively meaning "incremental
forever", as far as the client is concerned.
On a side note, my configuration is in theory pretty verbose: I find
myself writing programs and inlining their output into my configuration
files with various @|"" directives, either to simplify things or to
abstract out passwords so that they don't end up in my version control
system. Is this unusual?
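For anyone unfamiliar with the pattern, it looks something like this - a line of the form @|"command" runs the quoted command at parse time and splices its stdout into the configuration (paths here are invented):

   # bacula-dir.conf (fragment)
   Client {
     Name = bigbox-fd
     Address = bigbox.example.com
     # the included file holds one line, e.g.  Password = "..."
     @|"cat /etc/bacula/private/bigbox-passwd.conf"
     # ...
   }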
Cheers,
Mark
On Mon, 27 Apr 2020, David Brodbeck wrote:
I'm not sure two jobs concurrently will work -- I think the FD is still
single-threaded, although someone can correct me if I'm wrong.
My solution was to go to virtual full backups, so that full backups on
the client became a rare event. The heavy job then becomes the virtual
full consolidation, which is strictly an SD and director issue. My
chokepoint for consolidation jobs is currently attribute despooling,
which thrashes the database pretty hard, but it's still a lot faster
than a full backup from the client.
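In case it helps, a rough sketch of the moving parts (pool and job names invented): the pool holding the incrementals needs a Next Pool pointing at where the consolidated full should land, and the consolidation is just the ordinary job run at level VirtualFull:

   # bacula-dir.conf (fragment)
   Pool {
     Name = Incrementals
     Next Pool = Consolidated   # destination for the virtual full
     Storage = file-sd
     # ...
   }

   # then, from bconsole:
   run job=bigbox-backup level=VirtualFull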
On Mon, Apr 27, 2020 at 9:34 AM Mark Dixon <mark.c.di...@durham.ac.uk>
wrote:
Hi all,
Am I right in thinking that a single bacula job can only back up each
file in its fileset sequentially - there's no multithreading available
to back up multiple files at the same time in order to leverage the
client CPU?
I'm a relatively long-term user of bacula (thanks!) who has been happy
backing up relatively small data volumes to disk, but am now faced with
a fairly large directory. "Large" is defined as "takes too long to do a
full dump", and the limiting factor at the moment might be down to
software compression on the client's CPU.
Playing with the compression settings is the obvious approach, but I was
wondering about other options - particularly as I may have a use case
for client-side encryption as well.
If the job stubbornly takes too long to back up, I suspect I'm looking
at splitting the directory across multiple jobs and running them
concurrently. Is that right?
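Something along these lines is what I have in mind - two filesets over disjoint subtrees, with one job per fileset run concurrently (names and paths invented):

   # bacula-dir.conf (fragment)
   FileSet {
     Name = "BigDir-A"
     Include {
       Options {
         compression = GZIP
         signature = MD5
       }
       File = /srv/bigdir/a   # hypothetical split
     }
   }
   FileSet {
     Name = "BigDir-B"
     Include {
       Options {
         compression = GZIP
         signature = MD5
       }
       File = /srv/bigdir/b
     }
   }
   # plus one Job resource per FileSet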
Thanks,
Mark
--
David Brodbeck
System Administrator, Department of Mathematics
University of California, Santa Barbara
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users