I seem to be answering my own questions quite a bit. Sorry for the undigested flow of information; I'll try to keep my debugging reports less verbose.
With each version change of the client or server, the problem moves to a different spot. With the latest beta on the server, the Windows machine completes the entire backup. It appears that some of the other Linux machines will also complete the backup, but not all of them. I have at least one Fedora Core 4 machine that is failing. The nice thing about the failures is that they have been consistent and repeatable. With data changing on the disks continually, my luck may not hold out forever.

I suppose I have all I need to debug the issue completely: source code, a compiler, and ethereal. I'm certainly open to other suggestions.

bbaker

>(Thanks for kindly pointing me in the right direction, Kern.)
>
>I have a little bit more info to add to the mix -- and a little more
>confusion. Unix clients are behaving the same way. So, the only thing
>all these items appear to have in common is the server -- though it
>would seem strange to me to have such a problem in a production server
>that has been in use in other places for months.
>
>So, I upgraded the server to the latest beta. Surprise: same thing
>still happened -- "packet size too big". Well. The server is Fedora
>Core 4 with up-to-date patches. gcc version 4.0.2. I also failed to
>mention the server is built from source due to a strict mysql version
>4.1.10 requirement. The clients are RPMs and EXEs.
>
>I guess now is the time to dig into the code. At least I have a few
>verbose error messages to point the way.
>
>bbaker
>
>>You will probably have better luck getting your question answered on the
>>bacula-users list, which I have copied for you.
>>
>>On Friday 15 September 2006 15:36, William Baker wrote:
>>
>>>I know "packet too long" is in the FAQ. I think this is a new but
>>>related issue. The error is consistent and repeatable.
>>>
>>>The server is a production version bacula 1.38.11 running on Linux with
>>>MySQL database.
>>>Two versions of the Windows client have been tested:
>>>1.38.10 and 1.39.22. Several configurations of the client have been
>>>tested, both with and without VSS enabled. I have a TODO list that
>>>includes backing up other (non-Windows) clients, but those tests haven't
>>>been done yet. The traces included below are for 1.39.22.
>>>
>>>The client data to back up is approximately 21 GB. For v1.38.10, only
>>>about 2 GB were actually backed up. For 1.39.22, about 20 GB were
>>>retrieved from the client before the following message appeared:
>>>
>>>15-Sep 07:36 scott2-sd: mcleod-job.2006-09-15_07.17.39 Fatal error:
>>>append.c:144 Error reading data header from FD. ERR=No data available
>>>15-Sep 07:36 scott2-sd: mcleod-job.2006-09-15_07.17.39 Fatal error:
>>>bnet.c:228 Packet size too big from "client:192.168.4.20:36643.
>>>Terminating connection.
>>>15-Sep 07:36 mcleod-fd: mcleod-job.2006-09-15_07.17.39 Fatal error:
>>>../../filed/backup.c:787 Network send error to SD. ERR=Input/output error
>>>15-Sep 07:36 mcleod-fd: mcleod-job.2006-09-15_07.17.39 Error:
>>>../../lib/bnet.c:393 Write error sending len to Storage
>>>daemon:proe.priefert.com:9103: ERR=Input/output error
>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "System Writer",
>>>State: 0x1 (VSS_WS_STABLE)
>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "MSDEWriter",
>>>State: 0x1 (VSS_WS_STABLE)
>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "IIS Metabase
>>>Writer", State: 0x1 (VSS_WS_STABLE)
>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "Removable Storage
>>>Manager", State: 0x1 (VSS_WS_STABLE)
>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "WMI Writer",
>>>State: 0x1 (VSS_WS_STABLE)
>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "Event Log Writer",
>>>State: 0x1 (VSS_WS_STABLE)
>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "Registry Writer",
>>>State: 0x1 (VSS_WS_STABLE)
>>>15-Sep 07:38 mcleod-fd: VSS Writer (BackupComplete): "COM+ REGDB
>>>Writer", State: 0x1 (VSS_WS_STABLE)
>>>15-Sep 07:38 scott2-dir: mcleod-job.2006-09-15_07.17.39 Error: Bacula
>>>1.38.11 (28Jun06): 15-Sep-2006 07:38:08
>>>
>>>On the client, the last few lines of the bacula.trace file tell a
>>>similar story:
>>>
>>>mcleod-fd: ../compat/compat.cpp:150 Leave cvt_u_to_win32_path
>>>path=\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy10\Program Files\ALK
>>>Technologies\PMW190\Connect\PCMSRV.HLP
>>>mcleod-fd: ../compat/compat.cpp:90 Enter convert_unix_to_win32_path
>>>mcleod-fd: ../compat/compat.cpp:141 path=D:\Program Files\ALK
>>>Technologies\PMW190\Connect\PCMSRV.HLP
>>>mcleod-fd: ../compat/compat.cpp:150 Leave cvt_u_to_win32_path
>>>path=\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy10\Program Files\ALK
>>>Technologies\PMW190\Connect\PCMSRV.HLP
>>>mcleod-fd: ../compat/compat.cpp:1107 readdir_r(b64960, {
>>>d_name="pcmsrv.pdf", d_reclen=10, d_off=66
>>>mcleod-fd: ../compat/compat.cpp:177 Enter wchar_win32_path
>>>mcleod-fd: ../compat/compat.cpp:351 Leave wchar_win32_path=\
>>>mcleod-fd: ../compat/compat.cpp:90 Enter convert_unix_to_win32_path
>>>mcleod-fd: ../compat/compat.cpp:141 path=D:\Program Files\ALK
>>>Technologies\PMW190\Connect\pcmsrv.pdf
>>>mcleod-fd: ../compat/compat.cpp:150 Leave cvt_u_to_win32_path
>>>path=\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy10\Program Files\ALK
>>>Technologies\PMW190\Connect\pcmsrv.pdf
>>>mcleod-fd: ../compat/compat.cpp:90 Enter convert_unix_to_win32_path
>>>mcleod-fd: ../compat/compat.cpp:141 path=D:\Program Files\ALK
>>>Technologies\PMW190\Connect\pcmsrv.pdf
>>>mcleod-fd: ../compat/compat.cpp:150 Leave cvt_u_to_win32_path
>>>path=\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy10\Program Files\ALK
>>>Technologies\PMW190\Connect\pcmsrv.pdf
>>>mcleod-fd: ../../filed/heartbeat.c:77 Got BNET_SIG 0 from SD
>>>mcleod-fd: ../../filed/heartbeat.c:82 wait_intr=1 stop=1
>>>mcleod-fd: ../../filed/backup.c:184 end blast_data ok=0
>>>mcleod-fd: ../../filed/job.c:221 Quit command loop. Canceled=1
>>>mcleod-fd: ../../filed/job.c:303 Calling term_find_files
>>>mcleod-fd: ../../filed/job.c:306 Done with term_find_files
>>>mcleod-fd: ../../filed/job.c:308 Done with free_jcr
>>>
>>>Actually, on the Windows box, I'm trying to back up most of C: and D:.
>>>Here is what cygwin df says about the data:
>>>
>>>C:\WINDOWS\system32>df
>>>Filesystem     1K-blocks      Used  Available  Use%  Mounted on
>>>C:\cygwin\bin   20482843   9327624   11155219   46%  /usr/bin
>>>C:\cygwin\lib   20482843   9327624   11155219   46%  /usr/lib
>>>C:\cygwin       20482843   9327624   11155219   46%  /
>>>c:              20482843   9327624   11155219   46%  /cygdrive/c
>>>d:             123170320  12428172  110742148   11%  /cygdrive/d
>>>
>>>While the backup statistics give the following:
>>>
>>>  JobId:                  8
>>>  Job:                    mcleod-job.2006-09-15_07.17.39
>>>  Backup Level:           Full
>>>  Client:                 "mcleod-fd" Linux,Cross-compile,Win32
>>>  FileSet:                "BasicWindowsFileSet" 2006-09-14 21:26:55
>>>  Pool:                   "Default"
>>>  Storage:                "LTO2"
>>>  Scheduled time:         15-Sep-2006 07:17:36
>>>  Start time:             15-Sep-2006 07:17:44
>>>  End time:               15-Sep-2006 07:38:08
>>>  Elapsed time:           20 mins 24 secs
>>>  Priority:               1
>>>  FD Files Written:       73,064
>>>  SD Files Written:       72,744
>>>  FD Bytes Written:       20,028,562,878 (20.02 GB)
>>>  SD Bytes Written:       20,037,565,140 (20.03 GB)
>>>  Rate:                   16363.2 KB/s
>>>  Software Compression:   None
>>>  Volume name(s):         bacula-1
>>>  Volume Session Id:      7
>>>
>>>So, the vast majority of the data was sent. I don't yet know enough
>>>about bacula to know if the difference between the FD Files Written and
>>>SD Files Written is any kind of clue.
>>>
>>>By the way, the beta on Windows looks very promising. I liked what I
>>>saw. I worked a little with BartPE to build a bootable recovery CD. I
>>>know the issues associated with that. Can you post off-topic in your
>>>own post?
>>>
>>>bbaker
>>>
>>>-------------------------------------------------------------------------
>>>Using Tomcat but need to do more? Need to support web services, security?
>>>Get stuff done quickly with pre-integrated technology to make your job
>>>easier
>>>Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
>>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
>>>_______________________________________________
>>>Bacula-devel mailing list
>>>[EMAIL PROTECTED]
>>>https://lists.sourceforge.net/lists/listinfo/bacula-devel

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
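
For anyone who wants to sanity-check the job report quoted above, here is a rough sketch that just redoes the arithmetic. The figures are copied from that report; the variable names, the `summarize` helper, and the assumption that "KB" means 1000 bytes are mine, not anything taken from the Bacula source.

```python
# Figures copied from the quoted job report (JobId 8).
FD_FILES = 73_064
SD_FILES = 72_744
FD_BYTES = 20_028_562_878
SD_BYTES = 20_037_565_140
ELAPSED_SECONDS = 20 * 60 + 24  # "20 mins 24 secs"


def summarize():
    """Return the gaps between the FD and SD counters and the FD-side rate.

    Assumption (not confirmed by the source): 1 KB = 1000 bytes.
    """
    return {
        "files_fd_minus_sd": FD_FILES - SD_FILES,
        "bytes_sd_minus_fd": SD_BYTES - FD_BYTES,
        "fd_rate_kb_per_s": round(FD_BYTES / ELAPSED_SECONDS / 1000, 1),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

The FD counted 320 more files than the SD, while the SD counted about 9 MB more bytes than the FD. Dividing the FD byte count by the elapsed time gives 16363.2 KB/s under the 1000-byte assumption, which matches the reported rate, so the rate figure appears to be derived from the FD counter. None of this says where the stream broke; it only quantifies the gap.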