On Friday 15 September 2006 21:36, William Baker wrote:
>
> I found the documentation on the heartbeat, configured it for the FD and
> SD for 5 sec, restarted the daemons, and ran the test again. On the
> primary test machine, the backup is still dying in the same place. I
> did notice (a little late) that I was probably focusing on the wrong
> message.
>
> The clients and server are separated by a couple of switches, but they
> are on the same subnets, so routers should not be an issue. Most
> devices are gigabit on managed switches. Some devices are 100 Mbit. In
> particular, the server is gigabit and the primary test client is 100 Mbit.
> I plan to trace the route and check the errors on the ports -- starting
> with the server.
>
> For my primary test machine, the point of failure is consistently around
> 5 mins into the backup with 2.460 to 2.464 G transferred.
If it happens that quickly and at 2.xx G, then it is most likely a Windows
problem (see the Win32 chapter of the manual for a weird case), or a bad
ethernet card (probably bad firmware).
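
As a rough sanity check on those figures, the sketch below converts "about
2.46 G in about 5 minutes" into an average link rate. It assumes that G means
10^9 bytes and that the job ran for exactly 300 seconds; neither is stated
precisely in the report, and the function name is only illustrative.

```python
def average_rate_mbit(bytes_sent: float, seconds: float) -> float:
    """Average throughput in megabits per second (1 Mbit = 10**6 bits)."""
    return bytes_sent * 8 / seconds / 1e6

# Assumed figures: 2.46 G taken as 2.46e9 bytes, "around 5 mins" as 300 s.
rate = average_rate_mbit(2.46e9, 300)
print(f"{rate:.1f} Mbit/s")  # prints "65.6 Mbit/s"
```

Under these assumptions the average is roughly two thirds of what a 100 Mbit
client link can carry, so the link was busy but not saturated when the job
died.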
>
> bbaker
>
> >On Friday 15 September 2006 18:07, William Baker wrote:
> >
> >
> >>(Thanks for kindly pointing me in the right direction, Kern.)
> >>
> >>I have a little bit more info to add to the mix -- and a little more
> >>confusion. Unix clients are behaving the same way. So, the only thing
> >>all these items appear to have in common is the server -- though it
> >>would seem strange to me to have such a problem in a production server
> >>that has been in use in other places for months.
> >>
> >>So, I upgraded the server to the latest beta. Surprise: same thing
> >>still happened -- "packet size too big". Well. The server is Fedora
> >>Core 4 with up-to-date patches. gcc version 4.0.2. I also failed to
> >>mention the server is built from source due to a strict MySQL version
> >>4.1.10 requirement. The clients are RPMs and EXEs.
> >>
> >>I guess now is the time to dig into the code. At least I have a few
> >>verbose error messages to point the way.
> >>
> >>
> >
> >The problem you are having doesn't appear to be packet size too big because
> >that was not the first error message, and is likely spurious due to the
> >disconnection.
> >
> >I suspect that you are seeing network problems -- either a bad switch, a
> >bad ethernet card, or simply Windows software that doesn't follow Internet
> >rules and times out the line during large transfers. The manual discusses
> >several reasons for this, including in some cases a Bacula workaround
> >called Heartbeat Interval.
> >
> >
> >
> >>bbaker
> >>
> >>
> >>
> >>>You will probably have better luck getting your question answered on the
> >>>bacula-users list, which I have copied for you.
> >>>
> >>>On Friday 15 September 2006 15:36, William Baker wrote:
> >>>
> >>>
> >>>
> >>>
> >>>>I know "packet too long" is in the FAQ. I think this is a new but
> >>>>related issue. The error is consistent and repeatable.
> >>>>
> >>>>The server is a production version bacula 1.38.11 running on Linux with
> >>>>MySQL database. Two versions of the Windows client have been tested:
> >>>>1.38.10 and 1.39.22. Several configurations of the client have been
> >>>>tested, both with and without VSS enabled. I have a TODO list that
> >>>>includes backing up other (non-windows) clients, but those tests haven't
> >>>>been done yet. The traces included below are for 1.39.22.
> >>>>
> >>>>The client data to back up is approximately 21 GB. For v1.38.10, only
> >>>>about 2 GB were actually backed up. For 1.39.22, about 20 GB were
> >>>>retrieved from the client before the following messages appeared:
> >>>>
> >>>>15-Sep 07:36 scott2-sd: mcleod-job.2006-09-15_07.17.39 Fatal error:
> >>>>append.c:144 Error reading data header from FD. ERR=No data available
> >>>>15-Sep 07:36 scott2-sd: mcleod-job.2006-09-15_07.17.39 Fatal error:
> >>>>bnet.c:228 Packet size too big from "client:192.168.4.20:36643.
> >>>>Terminating connection.
> >>>>15-Sep 07:36 mcleod-fd: mcleod-job.2006-09-15_07.17.39 Fatal error:
> >>>>../../filed/backup.c:787 Network send error to SD. ERR=Input/output error
> >>>>15-Sep 07:36 mcleod-fd: mcleod-job.2006-09-15_07.17.39 Error:
> >>>>../../lib/bnet.c:393 Write error sending len to Storage
> >>>>daemon:proe.priefert.com:9103: ERR=Input/output error
> >>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "System Writer",
> >>>>State: 0x1 (VSS_WS_STABLE)
> >>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "MSDEWriter",
> >>>>State: 0x1 (VSS_WS_STABLE)
> >>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "IIS Metabase
> >>>>Writer", State: 0x1 (VSS_WS_STABLE)
> >>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "Removable Storage
> >>>>Manager", State: 0x1 (VSS_WS_STABLE)
> >>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "WMI Writer",
> >>>>State: 0x1 (VSS_WS_STABLE)
> >>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "Event Log Writer",
> >>>>State: 0x1 (VSS_WS_STABLE)
> >>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "Registry Writer",
> >>>>State: 0x1 (VSS_WS_STABLE)
> >>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "COM+ REGDB
> >>>>Writer", State: 0x1 (VSS_WS_STABLE)
> >>>>15-Sep 07:38 scott2-dir: mcleod-job.2006-09-15_07.17.39 Error: Bacula
> >>>>1.38.11 (28Jun06): 15-Sep-2006 07:38:08
> >>>>
> >>>>On the client, the last few lines of the bacula.trace file tell a
> >>>>similar story:
> >>>>
> >>>>mcleod-fd: ../compat/compat.cpp:150 Leave cvt_u_to_win32_path
> >>>>path=\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy10\Program Files\ALK
> >>>>Technologies\PMW190\Connect\PCMSRV.HLP
> >>>>mcleod-fd: ../compat/compat.cpp:90 Enter convert_unix_to_win32_path
> >>>>mcleod-fd: ../compat/compat.cpp:141 path=D:\Program Files\ALK
> >>>>Technologies\PMW190\Connect\PCMSRV.HLP
> >>>>mcleod-fd: ../compat/compat.cpp:150 Leave cvt_u_to_win32_path
> >>>>path=\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy10\Program Files\ALK
> >>>>Technologies\PMW190\Connect\PCMSRV.HLP
> >>>>mcleod-fd: ../compat/compat.cpp:1107 readdir_r(b64960, {
> >>>>d_name="pcmsrv.pdf", d_reclen=10, d_off=66
> >>>>mcleod-fd: ../compat/compat.cpp:177 Enter wchar_win32_path
> >>>>mcleod-fd: ../compat/compat.cpp:351 Leave wchar_win32_path=\
> >>>>mcleod-fd: ../compat/compat.cpp:90 Enter convert_unix_to_win32_path
> >>>>mcleod-fd: ../compat/compat.cpp:141 path=D:\Program Files\ALK
> >>>>Technologies\PMW190\Connect\pcmsrv.pdf
> >>>>mcleod-fd: ../compat/compat.cpp:150 Leave cvt_u_to_win32_path
> >>>>path=\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy10\Program Files\ALK
> >>>>Technologies\PMW190\Connect\pcmsrv.pdf
> >>>>mcleod-fd: ../compat/compat.cpp:90 Enter convert_unix_to_win32_path
> >>>>mcleod-fd: ../compat/compat.cpp:141 path=D:\Program Files\ALK
> >>>>Technologies\PMW190\Connect\pcmsrv.pdf
> >>>>mcleod-fd: ../compat/compat.cpp:150 Leave cvt_u_to_win32_path
> >>>>path=\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy10\Program Files\ALK
> >>>>Technologies\PMW190\Connect\pcmsrv.pdf
> >>>>mcleod-fd: ../../filed/heartbeat.c:77 Got BNET_SIG 0 from SD
> >>>>mcleod-fd: ../../filed/heartbeat.c:82 wait_intr=1 stop=1
> >>>>mcleod-fd: ../../filed/backup.c:184 end blast_data ok=0
> >>>>mcleod-fd: ../../filed/job.c:221 Quit command loop. Canceled=1
> >>>>mcleod-fd: ../../filed/job.c:303 Calling term_find_files
> >>>>mcleod-fd: ../../filed/job.c:306 Done with term_find_files
> >>>>mcleod-fd: ../../filed/job.c:308 Done with free_jcr
> >>>>
> >>>>Actually, on the Windows box, I'm trying to back up most of C: and D:.
> >>>>Here is what cygwin df says about the data:
> >>>>
> >>>>C:\WINDOWS\system32>df
> >>>>Filesystem 1K-blocks Used Available Use% Mounted on
> >>>>C:\cygwin\bin 20482843 9327624 11155219 46% /usr/bin
> >>>>C:\cygwin\lib 20482843 9327624 11155219 46% /usr/lib
> >>>>C:\cygwin 20482843 9327624 11155219 46% /
> >>>>c: 20482843 9327624 11155219 46% /cygdrive/c
> >>>>d: 123170320 12428172 110742148 11% /cygdrive/d
> >>>>
> >>>>While the backup statistics give the following:
> >>>>
> >>>> JobId: 8
> >>>> Job: mcleod-job.2006-09-15_07.17.39
> >>>> Backup Level: Full
> >>>> Client: "mcleod-fd" Linux,Cross-compile,Win32
> >>>> FileSet: "BasicWindowsFileSet" 2006-09-14 21:26:55
> >>>> Pool: "Default"
> >>>> Storage: "LTO2"
> >>>> Scheduled time: 15-Sep-2006 07:17:36
> >>>> Start time: 15-Sep-2006 07:17:44
> >>>> End time: 15-Sep-2006 07:38:08
> >>>> Elapsed time: 20 mins 24 secs
> >>>> Priority: 1
> >>>> FD Files Written: 73,064
> >>>> SD Files Written: 72,744
> >>>> FD Bytes Written: 20,028,562,878 (20.02 GB)
> >>>> SD Bytes Written: 20,037,565,140 (20.03 GB)
> >>>> Rate: 16363.2 KB/s
> >>>> Software Compression: None
> >>>> Volume name(s): bacula-1
> >>>> Volume Session Id: 7
> >>>>
> >>>>So, the vast majority of the data was sent. I don't yet know enough
> >>>>about bacula to know if the difference between the FD Files Written and
> >>>>SD Files Written is any kind of clue.
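
To put numbers on the gap mentioned in that paragraph, here is a small
sketch that works only from the job report quoted above. The variable names
are illustrative, and the 1 KB = 1000 bytes convention is an assumption; it
says nothing about why the FD and SD counters differ.

```python
# Figures copied from the job report above.
fd_files, sd_files = 73_064, 72_744
fd_bytes, sd_bytes = 20_028_562_878, 20_037_565_140
elapsed_s = 20 * 60 + 24  # "20 mins 24 secs"

missing_files = fd_files - sd_files      # files the FD counted but the SD did not
extra_sd_bytes = sd_bytes - fd_bytes     # amount by which the SD total exceeds the FD total
rate_kb_s = fd_bytes / elapsed_s / 1000  # assumes 1 KB = 1000 bytes

print(missing_files, extra_sd_bytes, round(rate_kb_s, 1))
# prints: 320 9002262 16363.2
```

The computed rate agrees with the reported 16363.2 KB/s, which is consistent
with the Rate line being FD bytes divided by elapsed time. The 320-file
shortfall is simply the difference between the two file counters at the point
where the connection dropped.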
> >>>>
> >>>>By the way, the beta on Windows looks very promising. I liked what I
> >>>>saw. I worked a little with BartPE to build a bootable recovery CD. I
> >>>>know the issues associated with that. Can you post off-topic in your
> >>>>own post?
> >>>>
> >>>>bbaker
> >>>>
> >>>>
> >>>>-------------------------------------------------------------------------
> >>>>Using Tomcat but need to do more? Need to support web services, security?
> >>>>Get stuff done quickly with pre-integrated technology to make your job easier
> >>>>Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
> >>>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
> >>>>_______________________________________________
> >>>>Bacula-devel mailing list
> >>>>[EMAIL PROTECTED]
> >>>>https://lists.sourceforge.net/lists/listinfo/bacula-devel
> >>>>
-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
Bacula-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-users