On Tue, Mar 4, 2025, at 11:03 PM, Rob Gerber wrote:
> I don't think that the problem is in bacula, for sure. I suspect other
> traffic over the link might be similarly impacted. My searching indicated
> that 0a000119 is a generic openssl error. Could be many things. I might be
> suspicious of the openssl version or implementation installed on your new
> router / firewall. The router or firewall may have flawed firmware.

Is that theory contingent upon some kind of hardware acceleration on said firewall? If so, I should be able to verify whether that is occurring and perhaps disable that acceleration so it's all done in software, removing the firmware from the equation.

>
> Consider running wireshark to analyze failed ssl transactions.
>
> I think I maybe got lucky when I searched this in duckduckgo. The top result
> contained something sort of relevant, with further breadcrumbs to chase. The
> next million results didn't even contain the 0a000119 keyword and look
> unrelated.
> Check out this, and follow the links therein, and the links inside those
> links. I have about 10 tabs open now and I see some interesting stuff. Some
> people turned off segmentation offloading on their nic, others made new
> certs, others got rid of their netgear router. 0a000119 is a vague error.
> https://forum.proxmox.com/threads/decryption-failed-or-bad-record-on-remote-sync.145131/

Thank you for that research. It is appreciated.

> Have you verified that data can be sent over the network link? I assume yes,
> so what about data larger than a single packet size (ie, if a packet is
> fragmented, then what happens?)?

The network link of the firewall? Yes. I think that is fine and working as expected. To test, I ran "wget https://download.freebsd.org/releases/ISO-IMAGES/14.2/FreeBSD-14.2-RELEASE-amd64-memstick.img". It completed in about 42s without errors. I verified the checksum is correct.

Does that do the test you wanted? This test does not involve the VPN.

However, your suggestion made me try another test:

[8:42 pro02 dan ~/tmp] % time scp -r foo.example:~bar/backups/Bacula .

That grabs all the .bsr files I've backed up to that host. The copy involves about 2.6M and 221 files.

Let's try that same copy over the VPN:

[8:43 pro02 dan ~/tmp] % time scp -r foo.vpn.example.org:~rsyncer/backups/Bacula Bacula-vpn
.. about five files are copied
  0%    0     0.0KB/s   --:-- ETA
ssh_dispatch_run_fatal: Connection to 10.14.0.217 port 22: message authentication code incorrect
scp: Connection closed
scp -r foo.vpn.example.org:~rsyncer/backups/Bacula Bacula-vpn  0.21s user 0.02s system 25% cpu 0.938 total

To me, that says something is very wonky with the VPN. Which also means this is not a Bacula issue but a transport issue: solve that first, and the Bacula issue should resolve.

Does that make sense?

Thank you

>
>
> Regards,
> Robert Gerber
> 402-237-8692
> r...@craeon.net
>
>
> On Sun, Mar 2, 2025 at 2:17 PM Dan Langille <d...@langille.org> wrote:
>> Hello,
>>
>> I have several clients which have recently started failing with:
>>
>> SD says - Error: openssl.c:108 TLS read/write failure.:
>> ERR=error:0A000119:SSL routines::decryption failed or bad record mac
>> FD says - Error: bsock.c:397 Wrote 43011 bytes to Storage
>> daemon:bacula-sd-04.int.unixathome.org:9103, but only 0 accepted.
>> SD says - Fatal error: append.c:327 Network error reading from FD.
>> ERR=Unknown error: 9919
>>
>> My search for these errors isn't finding anything to try.
>>
>> Full job output appears below.
>>
>> Relevant background:
>>
>> * Bacula 15.0.2 installed on clients and servers
>> * FreeBSD 14.x
>> * the failing clients have been around for years, with successful backups
>> * the gateway in my basement was recently replaced, new firewall rules and
>> OpenVPN configuration
>> * the OpenVPN topology went from net30 to subnet
>> * this affects all the VPN hosts; local hosts are unaffected
>>
>> Given the Bacula configuration hasn't changed and these jobs had been
>> running successfully for years, the problem must be with OpenVPN and/or the
>> firewall rules. However, I cannot find the cause.
>>
>> Here is a failed job:
>>
>> 02-Mar 19:06 bacula-dir JobId 373736: Start Backup JobId 373736,
>> Job=r720-02_basic.2025-03-02_19.06.22_22
>> 02-Mar 19:06 bacula-dir JobId 373736: Connected to Storage
>> "bacula-sd-04-FullFile" at bacula-sd-04.int.unixathome.org:9103 with TLS
>> 02-Mar 19:06 bacula-dir JobId 373736: There are no more Jobs associated with
>> Volume "FullAuto-04-15375". Marking it purged.
>> 02-Mar 19:06 bacula-dir JobId 373736: All records pruned from Volume
>> "FullAuto-04-15375"; marking it "Purged"
>> 02-Mar 19:06 bacula-dir JobId 373736: Recycled volume "FullAuto-04-15375"
>> 02-Mar 19:06 bacula-dir JobId 373736: Using Device "vDrive-FullFile-4" to
>> write.
>> 02-Mar 19:06 bacula-dir JobId 373736: Connected to Client "r720-02-fd" at
>> r720-02.vpn.unixathome.org:9102 with TLS
>> 02-Mar 19:06 r720-02-fd JobId 373736: Connected to Storage at
>> bacula-sd-04.int.unixathome.org:9103 with TLS
>> 02-Mar 19:06 bacula-sd-04 JobId 373736: Recycled volume "FullAuto-04-15375"
>> on File device "vDrive-FullFile-4" (/usr/local/bacula/volumes/FullFile), all
>> previous data lost.
>> 02-Mar 19:06 bacula-dir JobId 373736: Max Volume jobs=1 exceeded. Marking
>> Volume "FullAuto-04-15375" as Used.
>> 02-Mar 19:06 bacula-sd-04 JobId 373736: Error: openssl.c:108 TLS read/write
>> failure.: ERR=error:0A000119:SSL routines::decryption failed or bad record
>> mac
>> 02-Mar 19:06 r720-02-fd JobId 373736: Error: bsock.c:397 Wrote 43011 bytes
>> to Storage daemon:bacula-sd-04.int.unixathome.org:9103, but only 0 accepted.
>> 02-Mar 19:06 bacula-sd-04 JobId 373736: Fatal error: append.c:327 Network
>> error reading from FD. ERR=Unknown error: 9919
>> 02-Mar 19:06 r720-02-fd JobId 373736: Fatal error: backup.c:1057 Network
>> send error to SD. ERR=Input/output error
>> 02-Mar 19:06 bacula-sd-04 JobId 373736: Elapsed time=00:00:01, Transfer
>> rate=16.71 M Bytes/second
>> 02-Mar 19:06 r720-02-fd JobId 373736: Error: bsock.c:276 Socket has errors=1
>> on call to Storage daemon:bacula-sd-04.int.unixathome.org:9103
>> 02-Mar 19:06 bacula-dir JobId 373736: Error: Director's connection to SD for
>> this Job was lost.
>> 02-Mar 19:06 bacula-dir JobId 373736: Error: Bacula bacula-dir 15.0.2
>> (21Mar24):
>>   Build OS:               amd64-portbld-freebsd14.1 freebsd 14.1-RELEASE
>>   JobId:                  373736
>>   Job:                    r720-02_basic.2025-03-02_19.06.22_22
>>   Backup Level:           Full
>>   Client:                 "r720-02-fd" 15.0.2 (21Mar24)
>> amd64-portbld-freebsd14.1,freebsd,14.1-RELEASE
>>   FileSet:                "basic backup" 2019-10-28 03:05:00
>>   Pool:                   "FullFile-04" (From Job FullPool override)
>>   Catalog:                "MyCatalog" (From Client resource)
>>   Storage:                "bacula-sd-04-FullFile" (From Pool resource)
>>   Scheduled time:         02-Mar-2025 19:06:19
>>   Start time:             02-Mar-2025 19:06:25
>>   End time:               02-Mar-2025 19:06:26
>>   Elapsed time:           1 sec
>>   Priority:               10
>>   FD Files Written:       91
>>   SD Files Written:       0
>>   FD Bytes Written:       17,352,576 (17.35 MB)
>>   SD Bytes Written:       0 (0 B)
>>   Rate:                   17352.6 KB/s
>>   Software Compression:   100.0% 1.0:1
>>   Comm Line Compression:  45.9% 1.8:1
>>   Snapshot/VSS:           no
>>   Encryption:             no
>>   Accurate:               no
>>   Volume name(s):         FullAuto-04-15375
>>   Volume Session Id:      56
>>   Volume Session Time:    1740841409
>>   Last Volume Bytes:      16,733,601 (16.73 MB)
>>   Non-fatal FD errors:    3
>>   SD Errors:              0
>>   FD termination status:  Error
>>   SD termination status:  Error
>>   Termination:            *** Backup Error ***
>>
>>
>> --
>> Dan Langille
>> d...@langille.org
>>
>>
>> _______________________________________________
>> Bacula-users mailing list
>> Bacula-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/bacula-users

--
Dan Langille
d...@langille.org
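
P.S. For anyone searching on the code later: the 0A000119 in the SD error can be unpacked into its library and reason fields, which may make searching a little easier. This is only a sketch. It assumes the daemons are linked against OpenSSL 3.x (the error:0A000119 form suggests that, but I have not confirmed it here) and it uses the packing from that version's <openssl/err.h>. The function name decode_openssl_error is my own, not part of any library.

```python
# Sketch: split an OpenSSL 3.x packed error code into (library, reason).
# Assumption: OpenSSL 3.x layout, where the library number sits in bits
# 23..30 and the reason code occupies the low 23 bits.
ERR_LIB_OFFSET = 23
ERR_LIB_MASK = 0xFF
ERR_REASON_MASK = 0x7FFFFF

# Only the two values relevant to this thread; not a complete table.
KNOWN_LIBS = {20: "ERR_LIB_SSL"}
KNOWN_REASONS = {281: "SSL_R_DECRYPTION_FAILED_OR_BAD_RECORD_MAC"}


def decode_openssl_error(code_hex):
    """Return (library_number, reason_number) for a hex error string."""
    code = int(code_hex, 16)
    lib = (code >> ERR_LIB_OFFSET) & ERR_LIB_MASK
    reason = code & ERR_REASON_MASK
    return lib, reason


if __name__ == "__main__":
    lib, reason = decode_openssl_error("0A000119")
    print(lib, KNOWN_LIBS.get(lib, "unknown library"))
    print(reason, KNOWN_REASONS.get(reason, "unknown reason"))
```

So the code itself only says "the SSL library rejected a record's MAC". It does not say where the bytes were damaged, which fits with the scp failure over the VPN pointing at the transport rather than at Bacula.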