Re: [zfs-discuss] Heap corruption, possibly hotswap related (snv_134 with imr_sas, nvdisk drivers)

Kaya Bekiroğlu Thu, 18 Mar 2010 13:50:19 -0700

2010/3/18 Kaya Bekiroğlu <k...@bekiroglu.com>:
> I first noticed this panic when conducting hot-swap tests.  However,
> now I see it every hour or so, even when all drives are attached and
> no ZFS resilvering is in progress.


These panics appear to recur on my system whenever the
zfs-auto-snapshot service runs; disabling the hourly
zfs-auto-snapshot instance prevents them.  The panic therefore looks
load-related, which would explain why it can also occur around hot
swap, so perhaps the drivers are not to blame.
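
One rough way to check the correlation with the hourly service is to
see how close each panic timestamp falls to the top of the hour.  The
sketch below is only illustrative: it assumes the hourly snapshot job
fires at minute 0 (not confirmed here), and the function names and the
120-second window are hypothetical choices, not part of any tool.

```python
from datetime import datetime


def seconds_past_hour(syslog_line: str) -> int:
    """Return how many seconds past the top of the hour a syslog line was logged."""
    # Drop any mail-quote prefix ("> "), then take the HH:MM:SS field.
    # Syslog lines look like "Mar 17 16:00:10 host ..."; only the time is used.
    fields = syslog_line.lstrip("> ").split()
    t = datetime.strptime(fields[2], "%H:%M:%S")
    return t.minute * 60 + t.second


def near_hourly_job(syslog_line: str, window_s: int = 120) -> bool:
    """True if the line was logged within window_s seconds after the hour.

    Assumption: the hourly job starts at minute 0 of each hour.
    """
    return seconds_past_hour(syslog_line) <= window_s


# Panic line from the report quoted below (unwrapped onto one line).
panic_line = (
    "Mar 17 16:00:10 storage genunix: [ID 812275 kern.notice] "
    "kernel heap corruption detected"
)
print(seconds_past_hour(panic_line), near_hourly_job(panic_line))
```

Running this over every panic in /var/adm/messages would show whether
they all cluster just after the hour or only some of them do.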

> Repro:
> - Pull a drive
> - Wait for drive absence to be acknowledged by fm
> - Physically re-add the drive
>
> This machine contains two LSI 9240-8i SAS controllers running imr_sas
> (the driver from LSI's website) and a umem NVRAM card running the
> nvdisk driver.  It also contains an SSD L2ARC.
>
> Mar 17 16:00:10 storage genunix: [ID 478202 kern.notice] kernel memory
> allocator:
> Mar 17 16:00:10 storage genunix: [ID 432124 kern.notice] buffer freed
> to wrong cache
> Mar 17 16:00:10 storage genunix: [ID 815666 kern.notice] buffer was
> allocated from kmem_alloc_160,
> Mar 17 16:00:10 storage genunix: [ID 530907 kern.notice] caller
> attempting free to kmem_alloc_48.
> Mar 17 16:00:10 storage genunix: [ID 563406 kern.notice]
> buffer=ffffff0715c74510  bufctl=0  cache: kmem_alloc_48
> Mar 17 16:00:10 storage unix: [ID 836849 kern.notice]
> Mar 17 16:00:10 storage ^Mpanic[cpu7]/thread=ffffff002de17c60:
> Mar 17 16:00:10 storage genunix: [ID 812275 kern.notice] kernel heap
> corruption detected
> Mar 17 16:00:10 storage unix: [ID 100000 kern.notice]
> Mar 17 16:00:10 storage genunix: [ID 655072 kern.notice]
> ffffff002de17a70 genunix:kmem_error+501 ()
> Mar 17 16:00:10 storage genunix: [ID 655072 kern.notice]
> ffffff002de17ac0 genunix:kmem_slab_free+2d5 ()
> Mar 17 16:00:10 storage genunix: [ID 655072 kern.notice]
> ffffff002de17b20 genunix:kmem_magazine_destroy+fe ()
> Mar 17 16:00:10 storage genunix: [ID 655072 kern.notice]
> ffffff002de17b70 genunix:kmem_cache_magazine_purge+a0 ()
> Mar 17 16:00:10 storage genunix: [ID 655072 kern.notice]
> ffffff002de17ba0 genunix:kmem_cache_magazine_resize+32 ()
> Mar 17 16:00:10 storage genunix: [ID 655072 kern.notice]
> ffffff002de17c40 genunix:taskq_thread+248 ()
> Mar 17 16:00:10 storage genunix: [ID 655072 kern.notice]
> ffffff002de17c50 unix:thread_start+8 ()
> Mar 17 16:00:10 storage unix: [ID 100000 kern.notice]
> Mar 17 16:00:10 storage genunix: [ID 672855 kern.notice] syncing file
> systems...
> Mar 17 16:00:10 storage genunix: [ID 904073 kern.notice]  done
> Mar 17 16:00:11 storage genunix: [ID 111219 kern.notice] dumping to
> /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
> Mar 17 16:00:11 storage ahci: [ID 405573 kern.info] NOTICE: ahci0:
> ahci_tran_reset_dport port 0 reset port
>
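
The key detail in the trace above is the cache mismatch: the buffer came
from kmem_alloc_160 but was being freed to kmem_alloc_48.  Since
kmem_free() takes the buffer size as an argument, one plausible reading
(my assumption, not something established in this thread) is that some
caller freed the buffer with a size that maps to the 48-byte cache.  A
small sketch for pulling that pair out of mail-wrapped log text, with
hypothetical helper names:

```python
import re

# Matches "... allocated from <cache>, ... free to <cache>" across joined lines.
KMEM_MISMATCH = re.compile(
    r"allocated from (?P<alloc>\w+),.*?free to (?P<free>\w+)", re.DOTALL
)


def parse_cache_mismatch(text: str):
    """Return (alloc_cache, free_cache) from a 'freed to wrong cache' report, or None."""
    # Undo mail quoting and wrapping: drop leading "> " and join with spaces.
    joined = " ".join(line.lstrip("> ").strip() for line in text.splitlines())
    m = KMEM_MISMATCH.search(joined)
    return (m.group("alloc"), m.group("free")) if m else None


def size_class(cache_name: str) -> int:
    """Extract N from a cache named kmem_alloc_<N> (naming as seen in the log)."""
    return int(cache_name.rsplit("_", 1)[1])


# Lines copied from the quoted panic report, still wrapped and quoted.
report = """\
> Mar 17 16:00:10 storage genunix: [ID 815666 kern.notice] buffer was
> allocated from kmem_alloc_160,
> Mar 17 16:00:10 storage genunix: [ID 530907 kern.notice] caller
> attempting free to kmem_alloc_48.
"""

alloc_cache, free_cache = parse_cache_mismatch(report)
print(alloc_cache, size_class(alloc_cache), free_cache, size_class(free_cache))
```

Collecting these pairs across several panics would show whether the
mismatch is always 160 versus 48, which would point at a single call
site rather than random corruption.
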
> I'd file this directly to the bug database but I'm waiting for my
> account to be reactivated.
>
> zpool status:
>  pool: tank
>  state: ONLINE
>  scrub: resilver completed after 0h0m with 0 errors on Thu Mar 18 10:07:12 2010
> config:
>
>        NAME         STATE     READ WRITE CKSUM
>        tank         ONLINE       0     0     0
>          raidz1-0   ONLINE       0     0     0
>            c6t15d1  ONLINE       0     0     0
>            c6t14d1  ONLINE       0     0     0
>            c6t13d1  ONLINE       0     0     0
>          raidz1-1   ONLINE       0     0     0
>            c6t12d1  ONLINE       0     0     0
>            c6t11d1  ONLINE       0     0     0
>            c6t10d1  ONLINE       0     0     0
>          raidz1-2   ONLINE       0     0     0
>            c6t9d1   ONLINE       0     0     0
>            c6t8d1   ONLINE       0     0     0
>            c5t9d1   ONLINE       0     0     0
>        logs
>          c7d1p0     ONLINE       0     0     0
>        cache
>          c4t0d0p2   ONLINE       0     0     0
>        spares
>          c5t8d1     AVAIL
>
> --
> Kaya
>



-- 
Kaya
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss