Hello Ben,

Great!  Thanks for the feedback.

Good luck,

Kern

On 01/17/2014 06:34 PM, Roberts, Ben wrote:
>
> Hi Kern,
>
>  
>
> I verified that the failures were happening with a 5.0.x FD as well as a
> 5.2.x FD. At the time, I hadn't realised this mix was unsupported, or that
> it was even happening. Following Martin's observation earlier that the
> corruption was happening suspiciously close to the 2^32 overflow
> boundary, I've re-compiled the director/sd (and took the opportunity
> to upgrade to 5.2.13) and am just trying a full restore of one of the
> failing backups now -- so far 460GB restored, which is a record for
> this server. It looks like the problem was entirely my fault: I was using
> a copy of the DIR/SD compiled for one OS release on a newer release, and
> the corruption was happening on reading the data stream back in
> rather than while it was being written to the backup volumes.
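
For anyone reading this in the archive, the arithmetic behind the 2^32 observation can be checked with a small Python sketch. It takes the error line from the original report quoted further down, pulls out the position after the colon, and subtracts it from 2^32. The function name and the regular expression are made up for illustration and are not part of Bacula, and reading the number after the colon as a 32-bit quantity is an assumption.

```python
import re

UINT32_LIMIT = 2 ** 32  # 4294967296, the first value that no longer fits in 32 bits


def gap_to_uint32_limit(message):
    """Return how far the position after the colon lies below 2^32.

    Hypothetical helper: it only understands messages of the form
    'Volume data error at N:M'.
    """
    match = re.search(r"Volume data error at (\d+):(\d+)", message)
    if match is None:
        raise ValueError("no 'Volume data error at N:M' position found")
    position = int(match.group(2))
    return UINT32_LIMIT - position


# Error line copied from the original report.
msg = ('Error: block.c:275 Volume data error at 24:4294944994! '
       'Wanted ID: "BB02", got "". Buffer discarded.')
print(gap_to_uint32_limit(msg))  # 22302
```

So the reported position is 22,302 bytes short of 2^32, which fits Martin's observation; it does not by itself say which component mishandled the value.
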
>
>  
>
> I'll do a few more TB of restores to confirm the upgraded director is
> doing the correct thing and that this doesn't need any further
> investigation from the Bacula side.
>
>  
>
> Noted regarding keeping the DIR and SD at the same version. I have not
> been and will not be mixing versions here.
>
>  
>
> Regards,
>
> Ben Roberts
> IT Infrastructure
> GSA Capital Partners LLP
> Stratton House
> 5 Stratton Street
> London W1J 8LA
> D +44 (0)20 7959 7661
> T +44 (0)20 7959 8800
> www.gsacapital.com <http://www.gsacapital.com/>
>
> From: Kern Sibbald [mailto:k...@sibbald.com]
> Sent: 17 January 2014 17:23
> To: Roberts, Ben; bacula-users@lists.sourceforge.net
> Subject: Re: [Bacula-users] Errors restoring from disk backup:
> Volume data error Wanted ID: "BB02", got ""
>
>  
>
> Hello,
>
> Every case of this particular error message that I have seen has been
> due to data corruption outside of Bacula.  Typically this happens when
> a disk drive is bad, but since you are running ZFS and its checksums
> are good, I can see only a few other possibilities:
>
> 1. The ZFS code is messed up.  Running a current distribution with the
> ZFS kernel module should not have this problem, but if you are running
> something a bit older or using a user file system rather than the kernel
> module you could have problems.
>
> 2. You have bad cables or a bad disk controller. 
>
> 3. You seem to be using 5.2.x FDs with 5.0.x Director/SD,
> which is not supported.  Your FDs should never be a higher
> version than the DIR/SD, but may be lower.  In addition your
> DIR and SD must always be the same version.
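
The version rules in point 3 can be written down as a quick check. This is only a sketch under stated assumptions: the function names and the client name are hypothetical, this is not a Bacula tool, and comparing full dotted version strings numerically is an assumption about what "higher" means here.

```python
def parse_version(text):
    """Turn a dotted version string such as '5.2.13' into (5, 2, 13)."""
    return tuple(int(part) for part in text.split("."))


def check_versions(dir_version, sd_version, fd_versions):
    """Return a list of rule violations; an empty list means the mix is fine.

    fd_versions maps a (hypothetical) client name to its FD version string.
    """
    problems = []
    director = parse_version(dir_version)
    # Rule: DIR and SD must always be the same version.
    if director != parse_version(sd_version):
        problems.append("DIR %s and SD %s differ" % (dir_version, sd_version))
    # Rule: an FD may be older than the DIR/SD, but never newer.
    for client, fd_version in sorted(fd_versions.items()):
        if parse_version(fd_version) > director:
            problems.append("FD %s on %s is newer than DIR %s"
                            % (fd_version, client, dir_version))
    return problems


# Versions from the original report: 5.0.2 DIR/SD with a 5.2.13 client.
print(check_versions("5.0.2", "5.0.2", {"sample-client": "5.2.13"}))
```

With the versions from the original report this flags the 5.2.13 client as newer than the 5.0.2 Director; after the upgrade to 5.2.13 everywhere it would return an empty list.
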
>
> Oops, I just re-read your email and probably point 1 does not apply
> since you seem to be running ZFS on Solaris so there is little or no
> possibility that the code is bad.
>
> Best regards,
> Kern
>
> On 01/14/2014 06:04 PM, Roberts, Ben wrote:
>
>     Hi all,
>
>      
>
>     I've recently set up a new Bacula director/storage daemon in
>     preparation for moving our existing backups to newer hardware. During
>     testing, I've run into problems doing restores of backups taken to
>     disk, failing with the messages:
>
>      
>
>     Error: block.c:275 Volume data error at 24:4294944994! Wanted ID:
>     "BB02", got "". Buffer discarded.
>
>     Fatal error: fd_cmds.c:169 Command error with FD, hanging up.
>
>      
>
>     Similar errors are reported for both file-level backups and
>     block-level backups made using bpipe. I've seen the instructions
>     in
> http://www.bacula.org/en/dev-manual/main/main/Restore_Command.html#SECTION0021100000000000000000,
>     but these only seem to apply to tape backups rather than disk
>     ones. Regardless, I've tried stripping the positional information
>     from the bootstrap file, with no effect.
>
>      
>
>     Some relevant notes from my testing:
>
>     - The issue does not affect every backup made, but does
>       affect a significant proportion of those tested.
>     - A single job can be affected at multiple locations,
>       i.e. skipping one affected file might see the job fail again at a
>       subsequent file.
>     - Attempting to restore the same job multiple times
>       elicits failures at the same block each time. Re-running the backup
>       job may produce a restorable backup, or else one that fails at
>       a different location. Other jobs fail at different locations.
>     - All data is stored on ZFS, which reports completely
>       clean of any checksum errors at the filesystem level.
>     - The server is not reporting any hardware issues, e.g.
>       corrected or uncorrectable memory reads, disk accesses etc.
>     - The backup jobs are multiple TB in size, and restores
>       frequently fail within the first couple of hundred GB.
>     - The storage daemon is configured with a disk-changer
>       backed autochanger, writing to 100GB volumes, all residing within
>       the same ZFS filesystem (sitting atop a large RAID-Z2 disk array).
>
>      
>
>     The director is running "Version: 5.0.2 (28 April 2010)
>     i386-pc-solaris2.10 solaris 5.10" (compiled on solaris 5.10,
>     running on 5.11). Storage daemon runs on the same machine as the
>     director.  (I'm loosely tied to this version so the director can
>     interact with a storage daemon on another machine connected to a
>     tape changer).
>
>     A sample client is running "Version: 5.2.13 (19 February 2013) 
>     i386-pc-solaris2.11 solaris 5.11".
>
>      
>
>     From my understanding of how the Bacula components fit together, I
>     suspect the corruption must be happening in the Storage daemon
>     (since this is the only component that would be interested in the
>     BB02 block header?) before the data is written to disk (otherwise
>     ZFS would be reporting read/write errors).
>
>      
>
>     Is this an issue that's been seen before on other disk backups?
>     Can anyone provide any assistance in locating and fixing the cause
>     of the corruption? Any help would be greatly appreciated.
>
>      
>
>     Regards,
>
>      
>
>     *Ben Roberts*
>
>     IT Infrastructure
>
>      
>
>     --- Relevant config excerpts:
>
>      
>
>     Autochanger {
>       Name = backup3-autochanger
>       Device = drive-restore-backup3, drive-1-backup3
>       Device = drive-2-backup3, drive-3-backup3
>       Device = drive-4-backup3, drive-5-backup3
>       Changer Device = /data2/bacula/storage/backup3-autochanger.conf
>       Changer Command = "/opt/bacula/etc/disk-changer %c %o %S %a %d"
>     }
>
>     Device {
>       Name = drive-1-backup3
>       Archive Device = /data2/bacula/storage/backup3-autochanger/drive1
>       Device Type = File
>       Media Type = File-backup3
>       AutoChanger = yes
>       Removable media = no
>       Random access = yes
>       Requires Mount = no
>       Always Open = no
>       Label Media = yes
>       Maximum Changer Wait = 180
>       Drive Index = 1
>       Maximum Spool Size = 100G
>     }
>     ...
>
>     Storage {
>       Name = backup3-sd
>       Address = backup3.local
>       Device = backup3-autochanger
>       Media Type = File-backup3
>       Autochanger = yes
>     }
>
>     Pool {
>       Name = Disk-45Day-backup3
>       Pool Type = Backup
>       Recycle = yes
>       AutoPrune = yes
>       Job Retention = 45 days
>       Volume Retention = 45 days
>       Label Format = Disk-45Day-backup3-
>       Storage = backup3-sd
>       Maximum Volume Bytes = 100G
>     }
>
>      
>
>     ------------------------------------------------------------------------
>
>     This email and any files transmitted with it contain confidential
>     and proprietary information and is solely for the use of the
>     intended recipient. If you are not the intended recipient please
>     return the email to the sender and delete it from your computer
>     and you must not use, disclose, distribute, copy, print or rely on
>     this email or its contents. This communication is for
>     informational purposes only. It is not intended as an offer or
>     solicitation for the purchase or sale of any financial instrument
>     or as an official confirmation of any transaction. Any comments or
>     statements made herein do not necessarily reflect those of GSA
>     Capital. GSA Capital Partners LLP is authorised and regulated by
>     the Financial Conduct Authority and is registered in England and
>     Wales at Stratton House, 5 Stratton Street, London W1J 8LA, number
>     OC309261. GSA Capital Services Limited is registered in England
>     and Wales at the same address, number 5320529.
>
>     _______________________________________________
>
>     Bacula-users mailing list
>
>     Bacula-users@lists.sourceforge.net 
> <mailto:Bacula-users@lists.sourceforge.net>
>
>     https://lists.sourceforge.net/lists/listinfo/bacula-users
