Hi Samuel,

i think i managed to reproduce your Problem.  I created a fix here if you want to check it out: https://github.com/bareos/bareos/pull/1745

Kind Regards

Sebastian Sura

Am 04.03.24 um 11:43 schrieb 'Samuel' via bareos-users:
I finally had the chance to restart the storage daemon this morning in order to disable autoxflate. Today's copy jobs that ran afterwards finished without any warning, so I think it's probably an issue with autoxflate (on replication?). I can also successfully restore those new copies.

Best regards,
Samuel
On Monday, February 26, 2024 at 12:19:50 PM UTC+1 Samuel wrote:

    Yes, that's right. In this case though it's a consolidated
    incremental which was copied. However, the warnings also occur
    during migration jobs of consolidated fulls. The warning has never
    occurred while copying/migrating a non-consolidated incremental.

    This is all working just fine except for a couple of jobs which
    all have the following in common: filedaemon is on another host
    than director, they use LZ4 compression and autoxflate inflates on
    replication (though it's also enabled for all the other jobs but
    doesn't have any effect since they don't use compression).

    I use the SQL Query selection type:
    Selection Type = SQL Query
    Selection Pattern = "
      SELECT DISTINCT Job.JobId, Job.StartTime
      FROM Job, Pool
      WHERE Pool.Name IN ('AI-Consolidated', 'AI-Incremental') AND
            Pool.PoolId = Job.PoolId AND
            Job.Type = 'B' AND
            Job.JobStatus IN ('T', 'W') AND
            Job.Level IN ('I') AND
            Job.JobBytes > 0 AND
            Job.JobId NOT IN (
                SELECT PriorJobId
                FROM Job AS copy
                WHERE copy.Type IN ('B', 'C', 'A') AND
                      copy.JobStatus IN ('T', 'W') AND
                      copy.PriorJobId != 0
            )
      ORDER by Job.StartTime;"

    No, there's no concurrent writing going on currently. Before
    updating to Bareos 23 I did have devices with concurrent writing
    enabled, but that was before these warnings happened.

    Best regards,
    Samuel

    On Monday, February 26, 2024 at 11:51:59 AM UTC+1 Sebastian Sura
    wrote:

        Hi Samuel

        i just wanted to make sure i completely understand the
        situation.  You have an always incremental job with compression
        turned on (which algorithm do you use ?) and then you have a
        copy job that copies the consolidated full job (on the same sd)
        to tape with a device that auto inflates on write.

        Do i have that right ?

        Does the device where the job gets consolidated onto allow
        concurrent writes from multiple jobs ?

        What selection type has your copy job ?

        Kind Regards
        Sebastian Sura

        Am 26.02.24 um 11:32 schrieb 'Samuel' via bareos-users:
        Hi again.

        So, as expected, the warning occurred again when job 94511
        was copied (but also two new warnings for this specific job):
        Warning: dird/catreq.cc:608 MD5 digest not same File=3 as
        attributes=2
        Warning: dird/catreq.cc:608 MD5 digest not same File=18 as
        attributes=17
        Warning: dird/catreq.cc:608 MD5 digest not same File=23 as
        attributes=22

        The restore of the copy also errored in the same manner as
        before:
         2024-02-26 10:43:50 bareos-fd JobId 95784: Error:
        findlib/attribs.cc:381 File size of restored file
        
/tmp/bareos-restores/var/backup/old/2024-02-07_23:30/backup_www_1100CC.tar.gz
        not correct. Original 738212473, restored 442105856.
         2024-02-26 10:44:10 bareos-fd JobId 95784: Error:
        findlib/attribs.cc:381 File size of restored file
        
/tmp/bareos-restores/var/backup/old/2024-02-12_23:30/backup_www_1100CC.tar.gz
        not correct. Original 856739265, restored 783548416.
         2024-02-26 10:44:17 bareos-fd JobId 95784: Error:
        findlib/attribs.cc:381 File size of restored file
        
/tmp/bareos-restores/var/backup/old/2024-02-14_23:30/backup_www_1100CC.tar.gz
        not correct. Original 856739984, restored 31064064.

        For now I'll try disabling autoxflate and see if that solves
        the issue.

        Best regards,
        Samuel


        On Friday, February 23, 2024 at 2:02:28 PM UTC+1 Samuel wrote:

            Thanks for looking into it!

            Hm, there seems to be no file with fileindex=3 in the
            database for the copy:
            Enter SQL query: select * from file where jobid=94146 and
            fileindex=3;
            No results to list.

            All files that are being backed up are just regular
            files, nothing special.

            The copy runs locally on the director's host from its SD
            to the same SD.

            Here's file 2 and 3 of today's consolidated incremental
            backup which will be copied tomorrow and most likely lead
            to the same warning again as it has for the last few days
            (for this specific fileset it always warns about `File=3
            attributes=2`):

            Enter SQL query: select *, decode_lstat(lstat) from file
            where jobid=94511 and fileindex=2;
            
+-------------+-----------+-------+---------+----------+--------+--------+--------+----------------------------------------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------+
            | fileid      | fileindex | jobid | pathid  | deltaseq |
            markid | fhinfo | fhnode | lstat                    | md5
             | name                     | decode_lstat              
                                |
            
+-------------+-----------+-------+---------+----------+--------+--------+--------+----------------------------------------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------+
            | 484,575,603 |         2 | 94511 | 910,901 |        0 |
                 0 | 0      | 0      | P0D CgA4 IGk B A i A sADp5 BAA
            FgAo BlxAt1 BlxASF BlxVYH A A d | D/VouduY5TF4KFacjRE7Hw
            | backup_www_1100CC.tar.gz |
            
(64771,655416,33188,1,0,34,0,738212473,4096,1441832,1707346805,1707345029,1707431431,0,0,29)
            |
            
+-------------+-----------+-------+---------+----------+--------+--------+--------+----------------------------------------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------+

            Enter SQL query: select *, decode_lstat(lstat) from file
            where jobid=94511 and fileindex=3;
            
+-------------+-----------+-------+---------+----------+--------+--------+--------+---------------------------------------------------------------+------------------------+-----------------------------+-------------------------------------------------------------------------------------------+
            | fileid      | fileindex | jobid | pathid  | deltaseq |
            markid | fhinfo | fhnode | lstat                   | md5
                               | name                        |
            decode_lstat                                |
            
+-------------+-----------+-------+---------+----------+--------+--------+--------+---------------------------------------------------------------+------------------------+-----------------------------+-------------------------------------------------------------------------------------------+
            | 484,575,604 |         3 | 94511 | 910,901 |        0 |
                 0 | 0      | 0      | P0D CgA5 IGk B A i A BA9gA BAA
            IHw BlxAt8 BlxASH BlxVYH A A d | J9h+OdR/6XCqhqpJdJvYtw |
            backup_databases_1100CC.tar |
            
(64771,655417,33188,1,0,34,0,17029120,4096,33264,1707346812,1707345031,1707431431,0,0,29)
            |
            
+-------------+-----------+-------+---------+----------+--------+--------+--------+---------------------------------------------------------------+------------------------+-----------------------------+-------------------------------------------------------------------------------------------+


            Restore of this job (jobid=94511) works just fine:

            23-Feb 13:44 bareos-sd JobId 94559: Releasing device
            "FileDevice-ReadOnly-0005" (/backup_1/bareos).
            23-Feb 13:44 bareos-dir JobId 94559: Max configured use
            duration=82,800 sec. exceeded. Marking Volume
            "AI-Consolidated-1771" as Used.
            23-Feb 13:44 bareos-dir JobId 94559: Bareos bareos-dir
            23.0.2~pre32.0a0e55739 (31Jan24):
              Build OS:               Ubuntu 20.04.5 LTS
              JobId:                  94559
              Job:  Restore-Files.2024-02-23_13.43.47_38
              Restore Client:         "bareos-fd"
            23.0.2~pre32.0a0e55739 (31Jan24) Ubuntu 20.04.5 LTS,ubuntu
              Start time:             23-Feb-2024 13:43:49
              End time:               23-Feb-2024 13:44:20
              Elapsed time:           31 secs
              Files Expected:         33
              Files Restored:         33
              Bytes Restored:         9,143,514,060
              Rate:                   294952.1 KB/s
              FD Errors:              0
              FD termination status:  OK
              SD termination status:  OK
              Bareos binary info:     Bareos community build
            (UNSUPPORTED): Get professional support from
            https://www.bareos.com
              Job triggered by:       User
              Termination:            Restore OK

            I'll also try restoring its copy from tape once it has
            been copied.

            Best regards,
            Samuel

            On Friday, February 23, 2024 at 1:16:32 PM UTC+1
            Sebastian Sura wrote:

                I meant to say File 3 has no stream 1, sorry for the
                confusion!

                Am 23.02.24 um 13:15 schrieb Sebastian Sura:

                Hi Samuel

                thanks for the gathering this info. Ill look into
                the bscan issue as well though i think this might be
                a known issue.
                Regardless, regarding your actual issue: The bscan
                output shows that File 3 is missing its attributes
                (it has no stream 3).
                This confuses the director as it never got told that
                file 3 now started getting backed up and this is why
                you get the warning
                message from the director.

                The same is happening during the restore: the
                filedaemon never got told that now a new file was
                started (because it never got the attribute stream)
                so it basically merged both File 2 and File 3 into
                one file.  Even with this in mind, i think the
                restore should have caught that and issues a warning.
                Ill look into why this did not happen.

                If you add up all the data records (stream=2) with
                fileid=2 and fileid=3, you will get 738212473 bytes,
                which is exactly what the filedaemon reported
                as size for file 2.  Since the log contains the size
                it expected i imagine that you could manually
                restore file 3 by splitting file 2 into two.

                Can you check which file fileid 3 corresponds to ?
                Is it a special kind of file or just a normal one ?

                Ill try to see if i can reproduce your issue in the
                copy system test.  Do you do a local copy (so copy
                to the same sd) or a remote one?

                Kind Regards

                Sebastian Sura

                Am 23.02.24 um 12:56 schrieb 'Samuel' via bareos-users:
                Hi Sebastian,

                After looking at the logs some more I noticed that
                all jobs with this warning have in common that
                they're using compression (LZ4) and autoxflate on
                replication.
                Perhaps this combination is still not entirely
                fixed yet. I'll try disabling autoxflate in the
                coming days.

                The entire result of bscan is in the attachment.
                However here's the last part of it where bscan
                seems to abort (?):
                $ sudo -u bareos bscan -b
                md5_digest_error_copy_only.bsr --list-records
                TapeDevice2 2>&1 | tee records.txt
                ...
                bscan: stored/bscan.cc:494-0 Record: SessId=3480
                SessTim=1707222936 FileIndex=23 Stream=2 len=65536
                bscan: stored/bscan.cc:494-0 Record: SessId=3480
                SessTim=1707222936 FileIndex=23 Stream=2 len=65536
                bscan: stored/bscan.cc:494-0 Record: SessId=3480
                SessTim=1707222936 FileIndex=23 Stream=2 len=65536
                bscan: stored/bscan.cc:494-0 Record: SessId=3480
                SessTim=1707222936 FileIndex=23 Stream=2 len=22528
                bscan: stored/bscan.cc:494-0 Record: SessId=3480
                SessTim=1707222936 FileIndex=23 Stream=3 len=16
                bscan: stored/bscan.cc:494-0 Record: SessId=3480
                SessTim=1707222936 FileIndex=24 Stream=1 len=100
                bscan: stored/bscan.cc:494-0 Record: SessId=3480
                SessTim=1707222936 FileIndex=25 Stream=1 len=83
                bscan: stored/bscan.cc:494-0 Record: SessId=3480
                SessTim=1707222936 FileIndex=-5 Stream=94145 len=193
                bscan: stored/bscan.cc:681-0 Could not find
                SessId=3480 SessTime=1707222936 for EOS record.
                Records would have been added or updated in the
                catalog:
                      0 Media
                      1 Pool
                      0 Job
                      0 File
                      0 RestoreObject
                23-Feb 12:05 bscan JobId 0: Releasing device
                "TapeDevice2"
                (/dev/tape/by-id/scsi-35000e111c71ac0bf-nst).

                I also tried restoring the copy which ends in error.

                Most files are restored successfully except for one
                file which happens to have FileIndex=2
                (corresponding to the `attributes=2` in the warning?):
                Enter SQL query: select path,name from file,path
                where jobid=94146 and fileindex=2 and
                file.pathid=path.pathid;
                +-----------------------------------+--------------------------+
                | path                              | name        
                            |
                +-----------------------------------+--------------------------+
                | /var/backup/old/2024-02-07_23:30/ |
                backup_www_1100CC.tar.gz |
                +-----------------------------------+--------------------------+

                A full restore of the original non-copy backup jobs
                works fine, as do restores of other copy jobs on
                tape of filesets that don't use compression.
                The original job that was copied (93682) doesn't
                exist anymore; I think it was an always-incremental
                consolidated incremental backup which got
                consolidated again the following day.

                Best regards,
                Samuel
-- You received this message because you are
                subscribed to the Google Groups "bareos-users" group.
                To unsubscribe from this group and stop receiving
                emails from it, send an email to
                bareos-users...@googlegroups.com.
                To view this discussion on the web visit
                
https://groups.google.com/d/msgid/bareos-users/e766e8c6-5b58-424f-816f-f7c5d7252180n%40googlegroups.com
                
<https://groups.google.com/d/msgid/bareos-users/e766e8c6-5b58-424f-816f-f7c5d7252180n%40googlegroups.com?utm_medium=email&utm_source=footer>.
-- Sebastian surasebasti...@bareos.com
                  Bareos GmbH & Co. KG            Phone: +49 221 630693-0
                  https://www.bareos.com
                  Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
                  Komplementär: Bareos Verwaltungs-GmbH
                  Geschäftsführer: Stephan Dühr, Jörg Steffens, Philipp Storz
-- You received this message because you are subscribed
                to the Google Groups "bareos-users" group.
                To unsubscribe from this group and stop receiving
                emails from it, send an email to
                bareos-users...@googlegroups.com.
                To view this discussion on the web visit
                
https://groups.google.com/d/msgid/bareos-users/111cecfe-d3e1-4eb9-8a0e-f053a2777d0f%40bareos.com
                
<https://groups.google.com/d/msgid/bareos-users/111cecfe-d3e1-4eb9-8a0e-f053a2777d0f%40bareos.com?utm_medium=email&utm_source=footer>.

-- Sebastian surasebasti...@bareos.com
                  Bareos GmbH & Co. KG            Phone: +49 221 630693-0
                  https://www.bareos.com
                  Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
                  Komplementär: Bareos Verwaltungs-GmbH
                  Geschäftsführer: Stephan Dühr, Jörg Steffens, Philipp Storz

-- You received this message because you are subscribed to the
        Google Groups "bareos-users" group.
        To unsubscribe from this group and stop receiving emails from
        it, send an email to bareos-users...@googlegroups.com.
        To view this discussion on the web visit
        
https://groups.google.com/d/msgid/bareos-users/3823a22d-b49e-4892-8b45-6c932ce7e699n%40googlegroups.com
        
<https://groups.google.com/d/msgid/bareos-users/3823a22d-b49e-4892-8b45-6c932ce7e699n%40googlegroups.com?utm_medium=email&utm_source=footer>.

-- Sebastian surasebasti...@bareos.com
          Bareos GmbH & Co. KG            Phone: +49 221 630693-0
          https://www.bareos.com
          Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
          Komplementär: Bareos Verwaltungs-GmbH
          Geschäftsführer: Stephan Dühr, Jörg Steffens, Philipp Storz

--
You received this message because you are subscribed to the Google Groups "bareos-users" group. To unsubscribe from this group and stop receiving emails from it, send an email to bareos-users+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/bareos-users/67455f46-6182-4fb6-9be1-e8969ae0fb3dn%40googlegroups.com <https://groups.google.com/d/msgid/bareos-users/67455f46-6182-4fb6-9be1-e8969ae0fb3dn%40googlegroups.com?utm_medium=email&utm_source=footer>.

--
 Sebastian surasebastian.s...@bareos.com
 Bareos GmbH & Co. KG            Phone: +49 221 630693-0
 https://www.bareos.com
 Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
 Komplementär: Bareos Verwaltungs-GmbH
 Geschäftsführer: Stephan Dühr, Jörg Steffens, Philipp Storz

--
You received this message because you are subscribed to the Google Groups 
"bareos-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to bareos-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/bareos-users/63072587-75b3-40bf-aafa-aa95a60b7c9c%40bareos.com.

Reply via email to