Here is a short update. I have created additional devices in the sd for parallel job execution, each of them with "Maximum Concurrent Jobs = 1". Previously we had a single device with 20 concurrent jobs. The new setup solved the problem for us, at least for new backups.
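For anyone who wants to try the same change, the layout looks roughly like the sketch below. This is only an illustration under assumptions: the device names, the archive path, the address and the password are placeholders and not our real values. Only the storage name LocalStorage and the "Maximum Concurrent Jobs" settings come from this thread.

```
# bareos-sd: one Device resource per parallel job (names and path are hypothetical)
Device {
  Name = FileStorage1
  Media Type = File
  Device Type = File
  Archive Device = /var/lib/bareos/storage
  Label Media = yes
  Random Access = yes
  Automatic Mount = yes
  Removable Media = no
  Always Open = no
  Maximum Concurrent Jobs = 1    # one job per device, so jobs are not interleaved on a volume
}

Device {
  Name = FileStorage2
  Media Type = File
  Device Type = File
  Archive Device = /var/lib/bareos/storage
  Label Media = yes
  Random Access = yes
  Automatic Mount = yes
  Removable Media = no
  Always Open = no
  Maximum Concurrent Jobs = 1
}

# bareos-dir: the storage lists every device (address and password are placeholders)
Storage {
  Name = LocalStorage
  Address = sd.example.com
  Password = "secret"
  Device = FileStorage1
  Device = FileStorage2
  Media Type = File
  Maximum Concurrent Jobs = 2    # assumed to match the number of devices
}
```

With a layout like this, each running job gets its own device and writes to its own volume, instead of 20 jobs sharing one device.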
Still, something seems to be wrong with copy jobs and the crashing fd. Let me know if I can provide any more information to get this sorted out. I will be on vacation until the end of next week.

Best wishes,
Andreas

Andreas R wrote on Tuesday, 3 June 2025 at 14:53:05 UTC+2:

> Hi Sebastian,
>
> the bscan output with the modified bsr was uploaded to the shared folder.
>
> I did some more debugging.
>
> First I created a new storage and a new disk pool.
> Then I copied the initial full job to the new disk pool (disk to disk).
> Selection Pattern = "SELECT 212964 AS jobid;"
>
> The restore from that pool also failed. So it seems the problem is not
> related to tape.
>
> With the debug traces I was able to identify affected files. There is some
> kind of pattern:
> host1:
> - /var/adm/backup/rpmdb/Packages-20250517.gz
> - /var/adm/backup/rpmdb/Packages-20250520.gz
> - /var/lib/ca-certificates/openssl/OISTE_WISeKey_Global_Root_GC_CA.pem
> host2:
> - /var/adm/backup/rpmdb/Packages-20250517.gz
> - /etc/vmware-tools/vgauth/schemas/XMLSchema.xsd
> host3:
> - /etc/vmware-tools/vgauth/schemas/XMLSchema.xsd
> host4:
> - /var/lib/ca-certificates/openssl/DIGITALSIGN_GLOBAL_ROOT_ECDSA_CA.pem
> - /var/lib/sss/mc/initgroups
> etc.
> All these jobs run simultaneously to a single pool.
>
> Have a nice vacation,
> Andreas
>
> Sebastian Sura wrote on Tuesday, 3 June 2025 at 09:45:07 UTC+2:
>
>> Hi Andreas,
>>
>> thanks for the help! Diffing those files yielded:
>>
>> -bscan: stored/bscan.cc:496-0 Record: ... Stream=20 len=262144
>> +bscan: stored/bscan.cc:496-0 Record: ... Stream=20 len=209312
>> This is very weird. It looks like some of the data was not copied
>> correctly. I will come back to this after my vacation. It definitely
>> looks weird.
>> Could you modify the copy.bsr by deleting the
>> VolSessionId=,VolSessionTime=,FileIndex=,Count= lines and running bscan
>> again like before?
>> I am wondering if some other job somehow cut off that part.
>>
>> Kind Regards
>> Sebastian Sura
>>
>> On 02.06.25 at 07:48, Sebastian Sura wrote:
>>
>> Hi Andreas,
>>
>> I want to check why the copy is not restorable. Could you do the
>> following for me?
>> 1) Grab the bsr of the (working) full and the (not working) copy. You
>> can do this via
>>
>> * restore jobid=<full/copy id> bsr=/path/to/the/file.bsr all done
>>
>> bareos then writes the bsr to the given file. Let's say the bsrs are now
>> in /tmp/full.bsr and /tmp/copy.bsr.
>>
>> 2) We now want to use bscan to see what data is getting sent to the fd:
>>
>> $ bscan -b /path/to/the/file.bsr --list-records -c path/to/config ...
>> <your device>
>>
>> This should output a list like the following:
>>
>> bscan: stored/butil.cc:327-0 Using device: "FileStorage2" for reading.
>> 02-Jun 07:37 bscan JobId 0: Ready to read from volume "Copy-0002" on device "FileStorage2" (storage).
>> 02-Jun 07:37 bscan JobId 0: Forward spacing Volume "Copy-0002" to file:block 0:216.
>> bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=-4 Stream=5 len=164
>> bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=1 Stream=1 len=184
>> bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=1 Stream=22 len=640
>> bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=1 Stream=20 len=8624
>> bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=1 Stream=20 len=16
>> bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=1 Stream=1998 len=81
>> bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=1 Stream=19 len=322
>> bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=1 Stream=40 len=16
>> bscan: stored/bscan.cc:501-0 Record: SessId=1 SessTim=1748841876 FileIndex=2 Stream=1 len=185
>> ...
>>
>> Could you send the two bsrs and the two lists to me?
>>
>> Kind Regards
>> Sebastian Sura
>> On 30.05.25 at 13:08, 'Andreas R' via bareos-users wrote:
>>
>> I have sent you the debug trace. Let me know if I can provide further
>> information.
>> Kind Regards
>> Andreas
>> Sebastian Sura wrote on Wednesday, 28 May 2025 at 09:45:20 UTC+2:
>>
>>> Thanks for that traceback. Something really weird is happening. It
>>> looks like the fd tries to decrypt your encrypted backup, and it thinks it
>>> succeeds, but it actually went wrong.
>>>
>>> Could you redo the restore, but with debug tracing enabled? I.e. do
>>>
>>> setdebug client=<clientname> level=500 trace=1
>>> before the restore.
>>> This command should print a filename where the debug messages will be
>>> stored. It would be great if you could send this file to me (after the
>>> filedaemon crashed).
>>>
>>> I created an internal issue to track this as there is clearly something
>>> going wrong here.
>>>
>>> Kind Regards
>>> Sebastian Sura
>>>
>>> On 27.05.25 at 13:23, 'Andreas R' via bareos-users wrote:
>>>
>>> Thank you for looking into this matter.
>>> Here is the debug report.
>>>
>>> Best Regards,
>>> Andreas
>>>
>>> Sebastian Sura wrote on Tuesday, 27 May 2025 at 10:07:07 UTC+2:
>>>
>>>> Thanks for the crash report. This looks very weird. I have not seen
>>>> this kind of crash before.
>>>> Would it be possible for you to install the debug packages and recreate
>>>> the crash?
>>>>
>>>> See here on how to install the debug symbol packages:
>>>> https://docs.bareos.org/Appendix/Debugging.html#installing-debug-symbols-packages
>>>>
>>>> Kind Regards
>>>> Sebastian Sura
>>>> On 26.05.25 at 16:42, 'Andreas R' via bareos-users wrote:
>>>>
>>>> Hi Sebastian,
>>>>
>>>> thank you for your reply.
>>>> I have attached both files.
>>>>
>>>> Kind Regards,
>>>> Andreas
>>>>
>>>> Sebastian Sura wrote on Monday, 26 May 2025 at 14:38:07 UTC+2:
>>>>
>>>>> Hi Andreas,
>>>>>
>>>>> you attached the `.bactrace` file that the fd created.
>>>>> It would be very helpful if you could also send us the `.traceback`
>>>>> file that was created during the crash, as that file contains the
>>>>> stack trace. Without it we would have to guess where the problem occurred.
>>>>>
>>>>> As this problem occurred on a restore, could you
>>>>>
>>>>> 1) check if this is reproducible, and if so,
>>>>> 2) send us the bootstrap record file of that restore job?
>>>>>
>>>>> If you give the restore command the option `bootstrap=<path>`, then
>>>>> bareos will write the bsr file to that path and will not delete it.
>>>>>
>>>>> Kind Regards
>>>>> Sebastian Sura
>>>>> On 26.05.25 at 12:23, 'Andreas R' via bareos-users wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I have trouble restoring from tape. Jobs start as expected, but at
>>>>> some point during the restore, the filedaemon is killed with signal 11.
>>>>>
>>>>> *restore jobid=213438 client=prestore01-fd all done yes
>>>>>
>>>>> May 23 05:16:57 prestore01 bareos-fd[30717]: bareos-fd, prestore01-fd got signal 11 - Segmentation violation. Attempting traceback.
>>>>> May 23 05:16:57 prestore01 bareos-fd[30717]: exepath=/usr/sbin/
>>>>> May 23 05:16:57 prestore01 bareos-fd[30717]: BAREOS interrupted by signal 11: Segmentation violation
>>>>> May 23 05:16:57 prestore01 bareos-fd[30917]: Calling: /usr/sbin/btraceback /usr/sbin/bareos-fd 30717 /var/lib/bareos
>>>>> May 23 05:16:57 prestore01 bareos-fd[30924]: bsmtp: tools/bsmtp.cc:455-0 Failed to connect to mailhost localhost
>>>>> May 23 05:16:57 prestore01 bareos-fd[30717]: The btraceback call returned 1
>>>>> May 23 05:16:57 prestore01 bareos-fd[30717]: Dumping: /var/lib/bareos/prestore01-fd.30717.bactrace
>>>>>
>>>>> cat /var/lib/bareos/prestore01-fd.30717.bactrace
>>>>> Attempt to dump current JCRs. njcrs=1
>>>>> threadid=0x00007f399fdfe6c0 JobId=213439 JobStatus=R
>>>>> jcr=0x7f3998047ec0 name=RestoreFiles.2025-05-23_10.16.37_28
>>>>> threadid=0x00007f399fdfe6c0 killable=1 JobId=213439 JobStatus=R
>>>>> jcr=0x7f3998047ec0 name=RestoreFiles.2025-05-23_10.16.37_28
>>>>> UseCount=1
>>>>> JobType=R JobLevel=
>>>>> sched_time=23-May-2025 05:16 start_time=23-May-2025 05:16
>>>>> end_time=31-Dec-1969 18:00 wait_time=31-Dec-1969 18:00
>>>>> db=(nil) db_batch=(nil) batch_started=0
>>>>>
>>>>> Steps to reproduce:
>>>>> 1. Full backup to disk
>>>>> 2. Copy to tape via next pool
>>>>> 3. Restore from disk is ok
>>>>> 4. Restore from tape is not ok
>>>>>
>>>>> What I tried without success so far:
>>>>> - Deleted the jobs from tape and copied them again
>>>>>   The error occurred after the same number of restored files
>>>>> - Tried a different tape
>>>>> - Tried other fd versions: 22 (debian), 23 (suse) and 24 (suse)
>>>>> - Changed the blocksize to 512 in the sd
>>>>> - Disabled compression and reran everything
>>>>>
>>>>> Client {
>>>>>   Name = prestore01-fd
>>>>>   #Maximum Concurrent Jobs = 20
>>>>>   FDport = 9102
>>>>>   PKI Signatures = Yes
>>>>>   PKI Encryption = Yes
>>>>>   PKI Keypair = "/etc/bareos/master.pem"
>>>>>   PKI Master Key = "/etc/bareos/prestore01.cert"
>>>>>   PkiCipher = AES256
>>>>> }
>>>>>
>>>>> Pool {
>>>>>   Name = Full
>>>>>   Pool Type = Backup
>>>>>   Recycle = Yes
>>>>>   Volume Retention = 12 months
>>>>>   Maximum Volumes = 125
>>>>>   Maximum Volume Bytes = 125G
>>>>>   Next Pool = "TapeFull"
>>>>>   Label Format = "Full-"
>>>>>   Storage = LocalStorage
>>>>> }
>>>>>
>>>>> Pool {
>>>>>   Name = TapeFull
>>>>>   Pool Type = Backup
>>>>>   Recycle = Yes
>>>>>   Volume Retention = 13 month
>>>>>   Storage = TL1000
>>>>>   Cleaning Prefix = CLN
>>>>> }
>>>>>
>>>>> Job {
>>>>>   Name = CopyFull2Tape
>>>>>   JobDefs = "CycleJob"
>>>>>   Type = Copy
>>>>>   Selection Type = PoolUncopiedJobs
>>>>>   Level = Full
>>>>>   Pool = Full
>>>>>   Messages = Standard
>>>>>   Client = pbackup01-fd
>>>>>   FileSet = "SuseBase"
>>>>>   Storage = "LocalStorage"
>>>>>   Schedule = "CopyFull2Tape"
>>>>> }
>>>>>
>>>>> System Info:
>>>>> Bareos: 24.0.4~pre0.1014be830-74
>>>>> OS: openSUSE Leap 15.6
>>>>> Catalog: Postgresql
>>>>> Tape: LTO8
>>>>>
>>>>> Thanks in advance
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "bareos-users" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To view this discussion visit
>>>>> https://groups.google.com/d/msgid/bareos-users/08776ca6-2a98-4901-a228-524922713a9en%40googlegroups.com
>>>>>
>>>>> --
>>>>> Sebastian Sura [email protected]
>>>>> Bareos GmbH & Co. KG Phone: +49 221 630693-0
>>>>> https://www.bareos.com
>>>>> Registered office: Köln | Local court (Amtsgericht) Köln: HRA 29646
>>>>> General partner: Bareos Verwaltungs-GmbH
>>>>> Managing directors: Stephan Dühr, Jörg Steffens, Philipp Storz

--
You received this message because you are subscribed to the Google Groups "bareos-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion visit https://groups.google.com/d/msgid/bareos-users/8b8c0fcc-586d-4f59-a365-47cc78d4202dn%40googlegroups.com.
