On Feb 10, 2006, at 1:28 PM, Michael Reifenberger wrote:
On Fri, 10 Feb 2006, Markus Trippelsdorf wrote:
...
...
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=58914495
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=123039679
ad1: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=54591167
It looks like bad cabling to me. Try new cables and also run
smartctl -a /dev/ad0 (and ad1) to check if the hardware is OK.
smartctl doesn't report any errors, and accessing only one disk at a
time doesn't give errors either. So cabling probably isn't the issue here.
More likely a timing/locking interaction between gmirror/ata...
Bye/2
---
Michael Reifenberger, Business Development Manager SAP-Basis, Plaut Consulting
Comp: [EMAIL PROTECTED] | Priv: [EMAIL PROTECTED]
http://www.plaut.de | http://www.Reifenberger.com
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-[EMAIL PROTECTED]"
I have the same problem with 6.0 and gmirror. It's not a cabling or
hardware problem, as it works fine with FreeBSD 5.4 and earlier,
OpenBSD 3.8, and Linux 2.6.x, with and without mirroring.
I have a 2xPIII 700 MHz SMP system with Maxtor PCI SATA controller
and two 250 GB Maxtor disks, plus SCSI disks for the OS.
atapci0: <Promise PDC20375 SATA150 controller> port 0x3080-0x30bf,0x30c0-0x30cf,0x3000-0x307f mem 0xf4220000-0xf4220fff,0xf4200000-0xf421ffff irq 20 at device 4.0 on pci2
ad4: 239372MB <Maxtor 7L250S0 BANC1E00> at ata2-master SATA150
ad6: 239372MB <Maxtor 7L250S0 BANC1E00> at ata2-master SATA150
The disks are gmirror'ed and never complete a synchronization. When
the gmirror resync reaches around 40%-45%, the system crashes. No log
is written and the screen shows some garbage. The system also
crashes occasionally under heavy load, after logging some TIMEOUT -
READ_DMA errors.
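
To see which device and command the timeouts hit, and at which LBAs, the ata(4) console messages can be tallied with a short script. The following is only a minimal sketch, assuming the messages follow the format of the WRITE_DMA lines quoted at the top of this thread; the function name `tally_timeouts` and the sample list are hypothetical, and other message types (such as the ICRC warning) are deliberately skipped.

```python
import re
from collections import Counter

# Matches lines such as:
#   ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=58914495
# The pattern is an assumption based on the lines quoted in this thread.
ATA_TIMEOUT = re.compile(
    r"^(?P<dev>ad\d+): TIMEOUT - (?P<cmd>\w+) retrying "
    r"\((?P<left>\d+) retr(?:y|ies) left\) LBA=(?P<lba>\d+)"
)


def tally_timeouts(lines):
    """Count TIMEOUT messages per (device, command) and collect their LBAs."""
    counts = Counter()
    lbas = {}
    for line in lines:
        match = ATA_TIMEOUT.match(line.strip())
        if match is None:
            continue  # not a TIMEOUT line (e.g. an ICRC warning)
        key = (match.group("dev"), match.group("cmd"))
        counts[key] += 1
        lbas.setdefault(key, []).append(int(match.group("lba")))
    return counts, lbas


# Sample input: the three console lines quoted earlier in the thread.
sample = [
    "ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=58914495",
    "ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=123039679",
    "ad1: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=54591167",
]

counts, lbas = tally_timeouts(sample)
for (dev, cmd), n in sorted(counts.items()):
    print(dev, cmd, n, lbas[(dev, cmd)])
```

Feeding it the saved output of dmesg or /var/log/messages, one line per list element, would show whether the timeouts cluster on one disk or are spread across both.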
The system is then unable to boot again. It crashes during the boot
sequence when the mirror is reestablished and the resync is started
again. It also crashes if the resync is prevented (through
NOAUTOSYNC). I need to boot the installation CD and clear the gmirror
metadata on one disk to be able to boot.
I have 6.0-RELEASE-p4 (but it happens on any 6.0) and this is my kernel:
# include the standard distribution's SMP kernel build file, which in
# turn includes the generic kernel build file (named GENERIC)
include SMP
# set custom kernel ident name
ident ZOE_020
# additional/overridden settings start here
nooptions PREEMPTION # Disable kernel thread preemption
# standard system settings
maxusers 64 # 64 users is a lot, but we should have plenty of memory!
options INCLUDE_CONFIG_FILE # Include this file in kernel for reference
# memory settings - 2 GBytes for data or stack maximum size, 1 GByte
# as default initial size
options MAXDSIZ=(2048UL*1024*1024)
options MAXSSIZ=(2048UL*1024*1024)
options DFLDSIZ=(1024UL*1024*1024)
# SYSV options (shared memory, semaphores, message queues)
options SEMMAP=63 # Maximum number of entries in a semaphore map.
options SEMMNI=512 # Maximum number of System V semaphores that can be used on the system at one time.
options SEMMNS=512 # Total number of semaphores system wide
options SEMMNU=512 # Total number of undo structures in system
options SEMMSL=64 # Maximum number of System V semaphores that can be used by a single process at one time.
options SEMOPM=128 # Maximum number of operations that can be outstanding on a single System V semaphore at one time.
options SEMUME=48 # Maximum number of undo operations that can be outstanding on a single System V semaphore at one time.
options SHMALL=262144 # Maximum number of shared memory pages system wide.
options SHMMAX=(SHMMAXPGS*PAGE_SIZE+1) # Maximum size, in bytes, of a single System V shared memory region.
options SHMMAXPGS=262144 # Maximum size, in pages, of a single System V shared memory region.
options SHMMIN=2 # Minimum size, in bytes, of a single System V shared memory region.
options SHMMNI=128 # Maximum number of shared memory regions that can be used on the system at one time.
options SHMSEG=32 # Maximum number of System V shared memory regions that can be attached to a single process at one time.
options MSGMNB=2049 # Max number of chars in queue
options MSGMNI=41 # Max number of message queue identifiers
options MSGSEG=2049 # Max number of message segments
options MSGSSZ=16 # Size of a message segment (must be a power of 2 between 8 and 1024)
options MSGTQL=41 # Max number of messages in system
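
For readers checking the memory-related settings in this configuration, the expressions can be evaluated by hand. The sketch below does that arithmetic in Python, assuming a 4096-byte PAGE_SIZE (the usual i386 value; the configuration itself does not state it). The variable names simply mirror the kernel option names.

```python
# Assumption: 4 KiB pages, as on i386. Not stated in the configuration.
PAGE_SIZE = 4096

# Values copied from the kernel configuration.
SHMMAXPGS = 262144
SHMALL = 262144
MAXDSIZ = 2048 * 1024 * 1024
MAXSSIZ = 2048 * 1024 * 1024
DFLDSIZ = 1024 * 1024 * 1024

# SHMMAX is defined in the configuration as SHMMAXPGS*PAGE_SIZE+1.
SHMMAX = SHMMAXPGS * PAGE_SIZE + 1

# Total shared memory allowed system wide, in bytes (SHMALL is in pages).
SHMALL_BYTES = SHMALL * PAGE_SIZE

GIB = 1024 ** 3
print("SHMMAX       =", SHMMAX, "bytes")
print("SHMALL       =", SHMALL_BYTES // GIB, "GiB")
print("MAXDSIZ      =", MAXDSIZ // GIB, "GiB")
print("MAXSSIZ      =", MAXSSIZ // GIB, "GiB")
print("DFLDSIZ      =", DFLDSIZ // GIB, "GiB")
```

Under that page-size assumption, a single shared memory segment may be up to 1 GiB plus one byte, which equals the system-wide page limit set by SHMALL.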
I need to go to production soon and I want FreeBSD 6.0, as Linux and
the other BSDs don't fit my requirements (e.g. jail, GEOM...), so I
changed the controller to a Promise TX2300. It still has problems with
6.0: again, at 40%-45% of the mirror rebuild, I get this error:
ad4: req=0xc1c43d48 SETFEATURES SET TRANSFER MODE semaphore timeout !! DANGER Will Robinson !!
... I am quite worried, as I need to be able to trust my storage! Any
idea what the problem is?
If needed I can provide more data or run tests. I have a photo of
the screen showing the panic during the boot sequence (1 MB).
Thanks, and excuse me for the bad English!
Paolo