Hi all, I have read some mails about this problem, and it looks like
everyone who reported it was running some 5.X branch. I have been using
FreeBSD 6.1 for some months; yesterday I ran buildworld, and the box is
now at FreeBSD 6.1-RELEASE-p10.
This box runs a Bacula server with this NIC:
vr0: <VIA VT6102 Rhine II 10/100BaseTX> port 0xe400-0xe4ff mem 0xee022000-0xee0220ff at device 18.0 on pci0
vr0: Reserved 0x100 bytes for rid 0x10 type 4 at 0xe400
miibus0: <MII bus> on vr0
vr0: bpf attached
vr0: Ethernet address: 00:01:6c:2c:09:90
vr0: [MPSAFE]
This NIC is integrated on the motherboard. I ran this box on
FreeBSD 5.4-pX for almost a year with Bacula 1.38.5 without a problem.
One full backup takes almost 140 GB of data.
Last week I lost one Full Backup job from one of my biggest servers, which
runs RH9 and holds approx. 80 GB of data. Bacula backed up only 35 GB and
marked the job as Error:
26-Sep 00:28 bacula-dir: MBXBDCB.2006-09-25_21.30.00 Fatal error: Network
error with FD during Backup: ERR=Operation timed out
26-Sep 00:28 bacula-dir: MBXBDCB.2006-09-25_21.30.00 Fatal error: No Job
status returned from FD.
26-Sep 00:28 bacula-dir: MBXBDCB.2006-09-25_21.30.00 Error: Bacula
1.38.11(28Jun06): 26-Sep-2006 00:28:48
FD termination status: Error
SD termination status: Error
Termination: *** Backup Error ***
I have no problem with the client; it runs our ERP software and there is
nothing to report there.
On my FreeBSD console this appeared:
vr0: watchdog timeout
I reset the server, and all the Differential backups have been working
fine since. Yesterday I did the buildworld and left my Bacula server ready
to do a full backup of all my clients, and whoops...
I lost two client jobs:
Client 1:
02-Oct 18:30 bacula-dir: Start Backup JobId 176, Job=
PDC.2006-10-02_18.30.00
02-Oct 20:40 bacula-dir: PDC.2006-10-02_18.30.00 Fatal error: Network
error with FD during Backup: ERR=Operation timed out
02-Oct 20:40 bacula-dir: PDC.2006-10-02_18.30.00 Fatal error: No Job
status returned from FD.
02-Oct 20:40 bacula-dir: PDC.2006-10-02_18.30.00 Error: Bacula
1.38.11(28Jun06): 02-Oct-2006 20:40:11
JobId: 176
Job: PDC.2006-10-02_18.30.00
Backup Level: Full
Client: "PDC" Windows NT 4.0,MVS,NT 4.0.1381
FileSet: "PDC-FS" 2006-08-21 18:04:12
Pool: "FullTape"
Storage: "LTO-1"
Scheduled time: 02-Oct-2006 18:30:00
Start time: 02-Oct-2006 18:30:06
End time: 02-Oct-2006 20:40:11
Elapsed time: 2 hours 10 mins 5 secs
Priority: 11
FD Files Written: 0
SD Files Written: 0
FD Bytes Written: 0 (0 B)
SD Bytes Written: 0 (0 B)
Rate: 0.0 KB/s
Software Compression: None
Volume name(s): FullTape-0004
Volume Session Id: 2
Volume Session Time: 1159832414
Last Volume Bytes: 38,857,830,949 ( 38.85 GB)
Non-fatal FD errors: 0
SD Errors: 0
FD termination status: Error
SD termination status: Error
Termination: *** Backup Error ***
Client 2:
02-Oct 21:30 bacula-dir: Start Backup JobId 178, Job=
MBXBDCB.2006-10-02_21.30.00
02-Oct 21:31 bacula-dir: MBXBDCB.2006-10-02_21.30.00 Warning: bnet.c:853
Could not connect to File daemon on 192.168.2.9:9102. ERR=Host is down
Retrying ...
02-Oct 21:37 bacula-dir: MBXBDCB.2006-10-02_21.30.00 Warning: bnet.c:853
Could not connect to File daemon on 192.168.2.9:9102. ERR=Host is down
Retrying ...
02-Oct 21:44 bacula-dir: MBXBDCB.2006-10-02_21.30.00 Warning: bnet.c:853
Could not connect to File daemon on 192.168.2.9:9102. ERR=Host is down
Retrying ...
02-Oct 21:51 bacula-dir: MBXBDCB.2006-10-02_21.30.00 Warning: bnet.c:853
Could not connect to File daemon on 192.168.2.9:9102. ERR=Host is down
Retrying ...
02-Oct 21:58 bacula-dir: MBXBDCB.2006-10-02_21.30.00 Warning: bnet.c:853
Could not connect to File daemon on 192.168.2.9:9102. ERR=Host is down
Retrying ...
02-Oct 22:04 bacula-dir: MBXBDCB.2006-10-02_21.30.00 Warning: bnet.c:853
Could not connect to File daemon on 192.168.2.9:9102. ERR=Host is down
Retrying ...
02-Oct 22:10 bacula-dir: MBXBDCB.2006-10-02_21.30.00 Fatal error: bnet.c:859
Unable to connect to File daemon on 192.168.2.9:9102 . ERR=Host is down
02-Oct 22:10 bacula-dir: MBXBDCB.2006-10-02_21.30.00 Error: Bacula
1.38.11(28Jun06): 02-Oct-2006 22:10:03
JobId: 178
Job: MBXBDCB.2006-10-02_21.30.00
Backup Level: Full
Client: "MBXBDCB" i686-pc-linux-gnu,redhat,9
FileSet: "MBXBDCB-FS" 2006-08-21 23:00:02
Pool: "FullTape"
Storage: "LTO-1"
Scheduled time: 02-Oct-2006 21:30:00
Start time: 02-Oct-2006 21:30:02
End time: 02-Oct-2006 22:10:03
Elapsed time: 40 mins 1 sec
Priority: 13
FD Files Written: 0
SD Files Written: 0
FD Bytes Written: 0 (0 B)
SD Bytes Written: 0 (0 B)
Rate: 0.0 KB/s
Software Compression: None
Volume name(s):
Volume Session Id: 4
Volume Session Time: 1159832414
Last Volume Bytes: 38,857,830,949 (38.85 GB)
Non-fatal FD errors: 0
SD Errors: 0
FD termination status:
SD termination status: Waiting on FD
Termination: *** Backup Error ***
My console again:
vr0: watchdog timeout
But my catalog backup completed successfully:
03-Oct 03:00 bacula-dir: Start Backup JobId 179, Job=
BackupCatalog.2006-10-03_03.00.00
03-Oct 03:03 bacula-dir: Bacula 1.38.11 (28Jun06): 03-Oct-2006 03:03:00
JobId: 179
Job: BackupCatalog.2006-10-03_03.00.00
Backup Level: Full
Client: "BACULA" i386-portbld-freebsd6.1,freebsd,
6.1-RELEASE-p3
FileSet: "CATALOG-FS" 2006-08-22 05:00:02
Pool: "FullTape"
Storage: "LTO-1"
Scheduled time: 03-Oct-2006 03:00:00
Start time: 03-Oct-2006 03:00:50
End time: 03-Oct-2006 03:03:00
Elapsed time: 2 mins 10 secs
Priority: 14
FD Files Written: 7,646
SD Files Written: 7,646
FD Bytes Written: 360,432,688 (360.4 MB)
SD Bytes Written: 361,320,457 (361.3 MB)
Rate: 2772.6 KB/s
Software Compression: None
Volume name(s): FullTape-0004
Volume Session Id: 5
Volume Session Time: 1159832414
Last Volume Bytes: 39,219,629,264 (39.21 GB)
Non-fatal FD errors: 0
SD Errors: 0
FD termination status: OK
SD termination status: OK
Termination: Backup OK
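For anyone comparing several of these reports, here is a minimal Python
sketch of how the "Key: value" lines of one job report could be collected
and checked. The function names parse_job_report and job_failed are my own
for illustration and are not part of Bacula; the sketch assumes only the
summary lines are passed in, not the timestamped bacula-dir messages above
them, and it treats anything other than "Backup OK" as a failure.

```python
def parse_job_report(text):
    """Collect the 'Key: value' lines of one job report into a dict.

    Hypothetical helper, not a Bacula API. Splits each line on the
    first colon, so values that contain colons (times) stay intact.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def job_failed(fields):
    """Assumption: a job is healthy only if Termination is 'Backup OK'."""
    return fields.get("Termination", "") != "Backup OK"


# A few summary lines copied from the JobId 176 report.
sample = """\
JobId: 176
Job: PDC.2006-10-02_18.30.00
Backup Level: Full
FD Bytes Written: 0 (0 B)
FD termination status: Error
SD termination status: Error
Termination: *** Backup Error ***
"""

report = parse_job_report(sample)
print(report["JobId"], "failed" if job_failed(report) else "ok")
```

Run against the three reports in this mail, this would flag JobId 176 and
178 and pass JobId 179.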
I wasn't at that office. I noticed this in the morning, because when I
tried to access that server from the other building with PuTTY I couldn't
connect at first, and then my mind said "it's happened again :-("... I
called my friend there to unplug and re-plug the cable, and with just that
I was able to connect to the server.
It looks like this NIC has problems under a high workload. There are two
things I can do here:
1. Change the cable and try again.
2. Change the NIC and try again.
What else can I do?
I really hope someone can fix this problem. Thanks, all, for your time.
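One more thing that might help narrow it down is to line up the times of
the watchdog messages with the times the jobs failed. Below is a rough
Python sketch of that idea. It assumes the kernel messages are also written
by syslog to a file such as /var/log/messages; the sample lines, their
timestamps and the hostname "bacula" are invented for illustration and are
not taken from my real logs. Only the "vr0: watchdog timeout" text is what
I actually saw on the console.

```python
import re


def watchdog_events(lines, ifname="vr0"):
    """Return the log lines that report a watchdog timeout on ifname.

    Hypothetical helper: it only matches the literal message text
    '<ifname>: watchdog timeout' anywhere in a line.
    """
    pattern = re.compile(r"\b" + re.escape(ifname) + r": watchdog timeout")
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


# Invented sample lines in the usual syslog layout (assumption).
sample_log = [
    "Oct  2 20:38:51 bacula kernel: vr0: watchdog timeout",
    "Oct  2 20:40:11 bacula bacula-dir: job finished",
    "Oct  2 21:31:02 bacula kernel: vr0: watchdog timeout",
]

events = watchdog_events(sample_log)
for line in events:
    print(line)
```

On the real box the list could be replaced by an open file object, for
example open("/var/log/messages"), if that is where the kernel messages end
up. If the timeouts always fall a few minutes before a job dies, that would
point at the NIC or driver rather than at Bacula.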
Part of my dmesg output:
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RELEASE-p10 #5: Mon Oct 2 13:26:52 PDT 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/BACULA
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0a14000.
Preloaded elf module "/boot/kernel/acpi.ko" at 0xc0a14188.
Table 'FACP' at 0x1bff3040
Table 'APIC' at 0x1bff7dc0
MADT: Found table at 0x1bff7dc0
MP Configuration Table version 1.1 found at 0xc00f1400
APIC: Using the MADT enumerator.
MADT: Found CPU APIC ID 0 ACPI ID 0: enabled
ACPI APIC Table: <KM400 AWRDACPI>
Calibrating clock(s) ... i8254 clock: 1193181 Hz
CLK_USE_I8254_CALIBRATION not specified - using default frequency
Timecounter "i8254" frequency 1193182 Hz quality 0
Calibrating TSC clock ... TSC clock: 1600072446 Hz
CPU: AMD Duron(tm) processor (1600.07-MHz 686-class CPU)
Origin = "AuthenticAMD" Id = 0x681 Stepping = 1
Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
AMD Features=0xc0400800<SYSCALL,MMX+,3DNow+,3DNow>
Data TLB: 32 entries, fully associative
Instruction TLB: 16 entries, fully associative
L1 data cache: 64 kbytes, 64 bytes/line, 1 lines/tag, 2-way associative
L1 instruction cache: 64 kbytes, 64 bytes/line, 1 lines/tag, 2-way associative
L2 internal cache: 64 kbytes, 64 bytes/line, 1 lines/tag, 8-way associative
real memory = 469696512 (447 MB)
Physical memory chunk(s):
0x0000000000001000 - 0x000000000009efff, 647168 bytes (158 pages)
0x0000000000100000 - 0x00000000003fffff, 3145728 bytes (768 pages)
0x0000000000c25000 - 0x000000001b7d7fff, 448475136 bytes (109491 pages)
avail memory = 450490368 (429 MB)
bios32: Found BIOS32 Service Directory header at 0xc00fac70
bios32: Entry = 0xfb0f0 (c00fb0f0) Rev = 0 Len = 1
pcibios: PCI BIOS entry at 0xf0000+0xb160
pnpbios: Found PnP BIOS data at 0xc00fbc20
pnpbios: Entry = f0000:bc50 Rev = 1.0
Other BIOS signatures found:
APIC: CPU 0 has ACPI ID 0
MADT: Found IO APIC ID 2, Interrupt 0 at 0xfec00000
ioapic0: Routing external 8259A's -> intpin 0
Greetings.