Hi Martin, Kern

Confirmed that recompiling on the target machine (and possibly the upgrade to 
5.2.13 done at the same time) fixed this: I was able to restore 10TB over the 
weekend from backups previously indicated as faulty. It looks like the root 
cause was a binary incompatibility between the system libraries Bacula was 
built against and the ones it was linking to at runtime, which manifested only 
as a read error during restores.

Thanks again for your help, very much appreciated!

Ben Roberts

IT Infrastructure

GSA Capital Partners LLP

Stratton House
5 Stratton Street

London W1J 8LA

D +44 (0)20 7959 7661

T +44 (0)20 7959 8800


www.gsacapital.com



From: Kern Sibbald [mailto:k...@sibbald.com]
Sent: 17 January 2014 19:35
To: Roberts, Ben; bacula-users@lists.sourceforge.net
Subject: Re: [Bacula-users] Errors restoring from disk backup: Volume data 
error Wanted ID: "BB02", got ""

Hello Ben,

Great!  Thanks for the feedback.

Good luck,

Kern

On 01/17/2014 06:34 PM, Roberts, Ben wrote:
Hi Kern,

I verified that the failures were happening with a 5.0.x FD as well as a 5.2.x 
FD. At the time, I hadn't realised the mixed-version setup was unsupported, or 
even that I was running one. Following Martin's observation earlier that the 
corruption was happening suspiciously close to the 2^32 overflow boundary, I've 
recompiled the director/SD (and took the opportunity to upgrade to 5.2.13) and 
am just trying a full restore of one of the failing backups now. So far 460GB 
has been restored, which is a record for this server. It looks like the problem 
was entirely my fault: I was using a copy of the DIR/SD compiled for one OS on a 
newer version of that OS, and the corruption was happening when reading the data 
stream back in rather than while it was being written to the backup volumes.
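
As a rough illustration of the 2^32 observation, here is a minimal Python 
sketch that takes the address from the original error (24:4294944994) and 
works out how far it sits from the point where a 32-bit counter wraps. Two 
things here are assumptions rather than anything confirmed in this thread: 
that the "file:block" pair is the high and low 32-bit halves of a 64-bit byte 
offset into the disk volume, and that the SD uses a block size of 64512 bytes 
(no block size is set in the config excerpts).

```python
# Sketch only: how close is the failing address to a 2^32 boundary?
# ASSUMPTION: "file:block" is (high 32 bits):(low 32 bits) of a 64-bit
# byte offset into the disk volume.

BOUNDARY = 2 ** 32
ASSUMED_BLOCK_SIZE = 64512  # assumed block size; not set in the config shown


def parse_address(addr):
    """Split a 'file:block' string into two integers."""
    file_part, block_part = addr.split(":")
    return int(file_part), int(block_part)


def bytes_to_boundary(block):
    """Bytes remaining before the low 32-bit word wraps to zero."""
    return BOUNDARY - block


def absolute_offset(file_index, block):
    """Byte offset in the volume, under the high/low-word assumption."""
    return file_index * BOUNDARY + block


file_index, block = parse_address("24:4294944994")
gap = bytes_to_boundary(block)
print("bytes before 2^32 wrap:", gap)
print("absolute offset in volume:", absolute_offset(file_index, block))
print("block would straddle the boundary:", gap < ASSUMED_BLOCK_SIZE)
```

The gap comes out at 22302 bytes, which is less than one block of the assumed 
size, so a block starting at that address would cross the boundary. That fits 
an offset being truncated or mishandled on the read side, though it does not 
prove it.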

I'll do a few more TB of restores to confirm the upgraded director is doing the 
correct thing and that this doesn't need any further investigation from the 
Bacula side.

Noted regarding keeping the DIR and SD at the same version. I have not 
attempted, and will not attempt, to mix versions here.

Regards,

Ben Roberts


From: Kern Sibbald [mailto:k...@sibbald.com]
Sent: 17 January 2014 17:23
To: Roberts, Ben; bacula-users@lists.sourceforge.net
Subject: Re: [Bacula-users] Errors restoring from disk backup: Volume data 
error Wanted ID: "BB02", got ""

Hello,

Every case of this particular error message that I have seen has been
due to data corruption outside of Bacula.  Typically this happens when
a disk drive is bad, but since you are running ZFS and its checksums
are good, I can see only a few other possibilities:

1. The ZFS code is messed up.  Running a current distribution with the
ZFS kernel module should not have this problem, but if you are running
something a bit older or using a user file system rather than the kernel
module you could have problems.

2. You have bad cables or a bad disk controller.

3. You seem to be using 5.2.x FDs with a 5.0.x Director/SD,
which is not supported.  Your FDs should never be a higher
version than the DIR/SD, but may be lower.  In addition, your
DIR and SD must always be the same version.
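
The rule in point 3 can be written down as a small check. This is only a
sketch in Python, not anything Bacula ships: the function names are made up
for illustration, and it compares full version numbers, which is an
assumption since the rule above does not say whether only the release
series matters.

```python
# Sketch only: hypothetical helper, not part of Bacula.
# ASSUMPTION: versions are compared as full dotted numbers.


def parse_version(text):
    """Turn '5.2.13' into (5, 2, 13) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check_versions(dir_version, sd_version, fd_version):
    """Return a list of problems; an empty list means the rule is met."""
    director = parse_version(dir_version)
    storage = parse_version(sd_version)
    client = parse_version(fd_version)
    problems = []
    if director != storage:
        problems.append("DIR and SD versions differ")
    if client > director:
        problems.append("FD is newer than DIR/SD")
    return problems


# The combination reported in this thread: 5.0.2 DIR/SD with a 5.2.13 FD.
print(check_versions("5.0.2", "5.0.2", "5.2.13"))
# An older FD against a newer, matching DIR/SD is allowed.
print(check_versions("5.2.13", "5.2.13", "5.0.2"))
```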

Oops, I just re-read your email and probably point 1 does not apply
since you seem to be running ZFS on Solaris so there is little or no
possibility that the code is bad.

Best regards,
Kern

On 01/14/2014 06:04 PM, Roberts, Ben wrote:
Hi all,

I've recently set up a new Bacula director/storage daemon in preparation to move 
our existing backups to newer hardware. During testing, I've run into problems 
doing restores of backups taken to disk, failing with the messages:



Error: block.c:275 Volume data error at 24:4294944994! Wanted ID: "BB02", got "". Buffer discarded.

Fatal error: fd_cmds.c:169 Command error with FD, hanging up.

Similar errors are reported for both file-level backups and block-level 
backups made using bpipe. I've seen the instructions in 
http://www.bacula.org/en/dev-manual/main/main/Restore_Command.html#SECTION0021100000000000000000
but these only seem to apply to tape backups rather than disk ones. 
Regardless, I've tried stripping the positional information from the bootstrap 
file, with no effect.

Some relevant notes from my testing:

- The issue does not affect every backup made, but does affect a significant 
  proportion of those tested.

- A single job can be affected at multiple locations, i.e. skipping one 
  affected file might see the job fail again at a subsequent file.

- Attempting to restore the same job multiple times elicits failures at the 
  same block each time. Re-running the backup job may produce a restorable 
  backup, or one that fails at a different location. Other jobs fail at 
  different locations.

- All data is stored on ZFS, which reports no checksum errors at the 
  filesystem level.

- The server is not reporting any hardware issues, e.g. corrected or 
  uncorrectable memory reads, disk accesses etc.

- The backup jobs are multiple TB in size, and restores frequently fail 
  within the first couple of hundred GB.

- The storage daemon is configured with a disk-changer backed autochanger, 
  writing to 100GB volumes, all residing within the same ZFS filesystem 
  (sitting atop a large RAID-Z2 disk array).

The director is running "Version: 5.0.2 (28 April 2010) i386-pc-solaris2.10 
solaris 5.10" (compiled on solaris 5.10, running on 5.11). Storage daemon runs 
on the same machine as the director.  (I'm loosely tied to this version so the 
director can interact with a storage daemon on another machine connected to a 
tape changer).
A sample client is running "Version: 5.2.13 (19 February 2013)  
i386-pc-solaris2.11 solaris 5.11".

From my understanding of how the Bacula components fit together, I suspect the 
corruption must be happening in the Storage daemon (since this is the only 
component that would be interested in the BB02 block header?) before the data 
is written to disk (otherwise ZFS would be reporting read/write errors).

Is this an issue that's been seen before on other disk backups? Can anyone 
provide any assistance in locating and fixing the cause of the corruption? Any 
help would be greatly appreciated.

Regards,

Ben Roberts

IT Infrastructure


--- Relevant config excerpts:

Autochanger {
  Name = backup3-autochanger
  Device = drive-restore-backup3, drive-1-backup3
  Device = drive-2-backup3, drive-3-backup3
  Device = drive-4-backup3, drive-5-backup3
  Changer Device = /data2/bacula/storage/backup3-autochanger.conf
  Changer Command = "/opt/bacula/etc/disk-changer %c %o %S %a %d"
}

Device {
  Name = drive-1-backup3
  Archive Device = /data2/bacula/storage/backup3-autochanger/drive1
  Device Type = File
  Media Type = File-backup3
  AutoChanger = yes
  Removable media = no
  Random access = yes
  Requires Mount = no
  Always Open = no
  Label Media = yes
  Maximum Changer Wait = 180
  Drive Index = 1
  Maximum Spool Size = 100G
}
...

Storage {
  Name = backup3-sd
  Address = backup3.local
  Device = backup3-autochanger
  Media Type = File-backup3
  Autochanger = yes
}

Pool {
    Name = Disk-45Day-backup3
    Pool Type = Backup
    Recycle = yes
    AutoPrune = yes
    Job Retention = 45 days
    Volume Retention = 45 days
    Label Format = Disk-45Day-backup3-
    Storage = backup3-sd
    Maximum Volume Bytes = 100G
}

________________________________
This email and any files transmitted with it contain confidential and 
proprietary information and is solely for the use of the intended recipient. If 
you are not the intended recipient please return the email to the sender and 
delete it from your computer and you must not use, disclose, distribute, copy, 
print or rely on this email or its contents. This communication is for 
informational purposes only. It is not intended as an offer or solicitation for 
the purchase or sale of any financial instrument or as an official confirmation 
of any transaction. Any comments or statements made herein do not necessarily 
reflect those of GSA Capital. GSA Capital Partners LLP is authorised and 
regulated by the Financial Conduct Authority and is registered in England and 
Wales at Stratton House, 5 Stratton Street, London W1J 8LA, number OC309261. 
GSA Capital Services Limited is registered in England and Wales at the same 
address, number 5320529.







_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

