Hi Stefan, > When the backup run the bareos-fd client has a cpu utilisation of around 15%, iowait and also the load is around 0-1(%).
How many cpu cores/threads does this server have ? Theese numbers you have posted , are they a global load (sum of all cores) or one distinct core utilized by bareos ? I suspect you are running into simillar problem as me. My bareos-fd (18.2) is able of utilizing only one core (nmon is great for spotting that) so my backups are slow: https://groups.google.com/forum/#!topic/bareos-users/GqYR8kKWD1Y > try strace -c <PID> I was stracing bareos-fd some time ago (19.01.2019 bareos 18.2 unstable/rc): [root@lucyfer ~]# strace -c -f -p 3698 strace: Process 3698 attached with 4 threads strace: [ Process PID=24636 runs in x32 mode. ] strace: [ Process PID=24636 runs in 64 bit mode. ] ^Cstrace: Process 3698 detached strace: Process 3704 detached strace: Process 24636 detached strace: Process 24637 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 82,55 2,893997 723499 4 poll 7,35 0,257837 6 37905 write 3,69 0,129244 17 7580 read 2,83 0,099115 99115 1 restart_syscall 2,19 0,076782 2 37904 getpid 1,34 0,046803 2 23367 fcntl 0,02 0,000815 26 31 lstat 0,01 0,000335 10 33 33 getxattr 0,01 0,000286 9 31 openat 0,00 0,000139 4 33 stat 0,00 0,000138 34 4 getdents64 0,00 0,000114 3 31 close 0,00 0,000010 5 2 fstat ------ ----------- ----------- --------- --------- ---------------- 100.00 3,505615 106926 33 total There is more details on some of pool() calls: [root@lucyfer ~]# grep poll strace.out [pid 24637] restart_syscall(<... resuming interrupted poll ...> <unfinished ...> [pid 24637] poll([{fd=7, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 5000 <unfinished ...> [pid 24637] <... poll resumed> ) = 0 (Timeout) [pid 24637] poll([{fd=7, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 5000 <unfinished ...> [pid 24637] <... poll resumed> ) = 0 (Timeout) [pid 24637] poll([{fd=7, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 5000 <unfinished ...> [pid 24637] <... poll resumed> ) = 0 (Timeout) [pid 24637] poll([{fd=7, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 5000 <unfinished ...> [pid 24637] <... poll resumed> ) = 0 (Timeout) [pid 24637] poll([{fd=7, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 5000 <unfinished ...> [pid 24637] <... poll resumed> ) = 0 (Timeout) [pid 24637] poll([{fd=7, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 5000 <unfinished ...> Looking at /proc/PID/fd: dr-x------. 2 root root 0 01-01 22:13 . dr-xr-xr-x. 9 root bareos 0 01-01 22:13 .. lr-x------. 1 root root 64 01-01 22:13 0 -> /dev/null lr-x------. 1 root root 64 01-01 22:13 1 -> /dev/null ... lr-x------. 1 root root 64 01-01 22:13 13 -> /home/0001_FLAC_klasyka lr-x------. 1 root root 64 01-01 22:13 14 -> '/home/0001_FLAC_klasyka/Tchaikovsky 24 Bit Vinyl Pack' lr-x------. 1 root root 64 01-01 22:13 15 -> '/home/0001_FLAC_klasyka/Tchaikovsky 24 Bit Vinyl Pack/1959 Swan Lake Ballet Suite - Sinfonia of London' lr-x------. 1 root root 64 01-01 22:13 16 -> '/home/0001_FLAC_klasyka/Tchaikovsky 24 Bit Vinyl Pack/1959 Swan Lake Ballet Suite - Sinfonia of London/A6 - Act II - Dance of the Cygnets.flac' lr-x------. 1 root root 64 01-01 22:13 2 -> /dev/null lrwx------. 1 root root 64 01-01 22:13 3 -> 'socket:[3762303]' lrwx------. 1 root root 64 01-01 22:13 4 -> 'socket:[3792059]' lrwx------. 1 root root 64 01-01 22:13 5 -> 'socket:[3791320]' lr-x------. 1 root root 64 01-01 22:13 6 -> / lrwx------. 1 root root 64 01-01 22:13 7 -> 'socket:[3791320]' lrwx------. 1 root root 64 01-01 22:13 8 -> 'socket:[3792059]' lr-x------. 1 root root 64 01-01 22:13 9 -> /home I have looked to find if fd7 socket corresponds to any active network connection but i haven't found it. For me it looks that this socket is used for bareos interprocess communication (fd was always launching a couple of threads running on the same core). My feeling is that one thread is waiting for data but there are no data available. Maybe it is a net sender thread waiting for a data from reader thread (perhaps compression/encription is implemented there)? Hope someone more skilled can follow this path and validate this findings. Ps. Did you inspected individual disk performance during the backup (iostat, nmon, etc ...) It is quite unlikely on SSD but maybe you are hitting iops/bandwidth limit somewhere ? Regards, Bart pt., 22 mar 2019 o 16:31 Udo Lembke <[email protected]> napisał(a): > Hi Stefan, > if you don't see an postgresql or mysql process with an longer high > cpu-usages (and perhaps iowait), it's not the database. > > Udo > > Am 22.03.19 um 10:37 schrieb 'Stefan Krüger' via bareos-users: > > @ulembke: I don't think so, but how can I check this? load and iowait > are also on the server side quite low > > > > @Douglas Rand: I'm running zfs on linux, so I dont have drace, I can try > strace -c <PID> > > > > -- > You received this message because you are subscribed to the Google Groups > "bareos-users" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To post to this group, send email to [email protected]. > For more options, visit https://groups.google.com/d/optout. > -- You received this message because you are subscribed to the Google Groups "bareos-users" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. For more options, visit https://groups.google.com/d/optout.
