Hi Stefan,

> When the backup run the bareos-fd client has a cpu utilisation of around
15%, iowait and also the load is around 0-1(%).

How many cpu cores/threads does this server have ?
Theese numbers you have posted , are they a global load (sum of all cores)
or one distinct core utilized by bareos ?

I suspect you are running into simillar problem as me. My bareos-fd (18.2)
is able of utilizing only one core (nmon is great for spotting that) so my
backups are slow:
https://groups.google.com/forum/#!topic/bareos-users/GqYR8kKWD1Y

> try strace -c <PID>

I was stracing bareos-fd some time ago (19.01.2019 bareos 18.2 unstable/rc):

[root@lucyfer ~]# strace -c -f -p 3698
strace: Process 3698 attached with 4 threads
strace: [ Process PID=24636 runs in x32 mode. ]
strace: [ Process PID=24636 runs in 64 bit mode. ]
^Cstrace: Process 3698 detached
strace: Process 3704 detached
strace: Process 24636 detached
strace: Process 24637 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 82,55    2,893997      723499         4           poll
  7,35    0,257837           6     37905           write
  3,69    0,129244          17      7580           read
  2,83    0,099115       99115         1           restart_syscall
  2,19    0,076782           2     37904           getpid
  1,34    0,046803           2     23367           fcntl
  0,02    0,000815          26        31           lstat
  0,01    0,000335          10        33        33 getxattr
  0,01    0,000286           9        31           openat
  0,00    0,000139           4        33           stat
  0,00    0,000138          34         4           getdents64
  0,00    0,000114           3        31           close
  0,00    0,000010           5         2           fstat
------ ----------- ----------- --------- --------- ----------------
100.00    3,505615                106926        33 total

There is more details on some of pool() calls:

[root@lucyfer ~]# grep poll strace.out
[pid 24637] restart_syscall(<... resuming interrupted poll ...> <unfinished ...>
[pid 24637] poll([{fd=7,
events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 5000 <unfinished
...>
[pid 24637] <... poll resumed> )        = 0 (Timeout)
[pid 24637] poll([{fd=7,
events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 5000 <unfinished
...>
[pid 24637] <... poll resumed> )        = 0 (Timeout)
[pid 24637] poll([{fd=7,
events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 5000 <unfinished
...>
[pid 24637] <... poll resumed> )        = 0 (Timeout)
[pid 24637] poll([{fd=7,
events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 5000 <unfinished
...>
[pid 24637] <... poll resumed> )        = 0 (Timeout)
[pid 24637] poll([{fd=7,
events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 5000 <unfinished
...>
[pid 24637] <... poll resumed> )        = 0 (Timeout)
[pid 24637] poll([{fd=7,
events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 5000 <unfinished
...>


Looking at /proc/PID/fd:

dr-x------. 2 root root    0 01-01 22:13 .
dr-xr-xr-x. 9 root bareos  0 01-01 22:13 ..
lr-x------. 1 root root   64 01-01 22:13 0 -> /dev/null
lr-x------. 1 root root   64 01-01 22:13 1 -> /dev/null
...
lr-x------. 1 root root   64 01-01 22:13 13 -> /home/0001_FLAC_klasyka
lr-x------. 1 root root   64 01-01 22:13 14 ->
'/home/0001_FLAC_klasyka/Tchaikovsky 24 Bit Vinyl Pack'
lr-x------. 1 root root   64 01-01 22:13 15 ->
'/home/0001_FLAC_klasyka/Tchaikovsky 24 Bit Vinyl Pack/1959 Swan Lake
Ballet Suite - Sinfonia of London'
lr-x------. 1 root root   64 01-01 22:13 16 ->
'/home/0001_FLAC_klasyka/Tchaikovsky 24 Bit Vinyl Pack/1959 Swan Lake
Ballet Suite - Sinfonia of London/A6 - Act II - Dance of the
Cygnets.flac'
lr-x------. 1 root root   64 01-01 22:13 2 -> /dev/null
lrwx------. 1 root root   64 01-01 22:13 3 -> 'socket:[3762303]'
lrwx------. 1 root root   64 01-01 22:13 4 -> 'socket:[3792059]'
lrwx------. 1 root root   64 01-01 22:13 5 -> 'socket:[3791320]'
lr-x------. 1 root root   64 01-01 22:13 6 -> /
lrwx------. 1 root root   64 01-01 22:13 7 -> 'socket:[3791320]'
lrwx------. 1 root root   64 01-01 22:13 8 -> 'socket:[3792059]'
lr-x------. 1 root root   64 01-01 22:13 9 -> /home


I have looked to find if fd7 socket corresponds to any active network
connection but i haven't found it. For me it looks that this socket is used
for bareos interprocess communication (fd was always launching a couple of
threads running on the same core). My feeling is that one thread is waiting
for data but there are no data available. Maybe it is a net sender thread
waiting for a data from reader thread (perhaps compression/encription is
implemented there)? Hope someone more skilled can follow this path and
validate this findings.

Ps. Did you inspected individual disk performance during the backup
(iostat, nmon, etc ...) It is quite unlikely on SSD but maybe you are
hitting iops/bandwidth limit somewhere ?

Regards,
Bart

pt., 22 mar 2019 o 16:31 Udo Lembke <[email protected]> napisał(a):

> Hi Stefan,
> if you don't see an postgresql or mysql process with an longer high
> cpu-usages (and perhaps iowait), it's not the database.
>
> Udo
>
> Am 22.03.19 um 10:37 schrieb 'Stefan Krüger' via bareos-users:
> > @ulembke: I don't think so, but how can I check this? load and iowait
> are also on the server side quite low
> >
> > @Douglas Rand: I'm running zfs on linux, so I dont have drace, I can try
> strace -c <PID>
> >
>
> --
> You received this message because you are subscribed to the Google Groups
> "bareos-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"bareos-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to