Forwarding here, as suggested by chaps on storage-discuss.

Just to clarify, I was running filebench directly on the x4500, not from
an initiator, so this is probably not a COMSTAR thing.

Ceri
-- 
That must be wonderful!  I don't understand it at all.
                                                  -- Moliere
--- Begin Message ---
We've got an x4500 running SXCE build 91 with stmf configured to share
out a (currently) small number (9) of LUs to a (currently) small number
of hosts (4).

The x4500 is configured with a ZFS root mirror, 6 RAIDZ sets across all
six controllers, some hot spares in the gaps and a RAID10 set to use
up everything else.

Since this is an investigative setup, I have been running filebench
locally on the x4500 to get some stats before moving on to do the same
on the initiators against the x4500 and our current storage.

While running the filebench OLTP workload with $filesize=5g on one of
the RAIDZ pools, the x4500 seemed to hang while creating the fileset.
On further investigation, a lot of things actually still worked: logging
in via SSH was fine and /usr/bin/ps worked OK, but /usr/ucb/ps, any of
the /usr/proc ptools, man, and so on just hung.  "savecore -L" managed
to write a dump but couldn't seem to exit.

So I did a hard reset, the system came up fine and I actually do have
the dump from "savecore -L".  I'm kind of out of my depth with mdb, but
it looks pretty clear to me that all of the "hung" processes were
somewhere in ZFS:

# mdb -k unix.0 vmcore.0 
mdb: failed to read panicbuf and panic_reg -- current register set will
be unavailable
Loading modules: [ unix genunix specfs dtrace cpu.generic
cpu_ms.AuthenticAMD.15 uppc pcplusmp scsi_vhci zfs sd ip hook neti sctp
arp usba fctl nca lofs md cpc random crypto nfs fcip logindmux ptm nsctl
ufs sppp ipc ]
> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                    3085149             12051   74%
Anon                        20123                78    0%
Exec and libs                3565                13    0%
Page cache                 200779               784    5%
Free (cachelist)           193955               757    5%
Free (freelist)            663990              2593   16%

Total                     4167561             16279
Physical                  4167560             16279
> ::pgrep ptree
S    PID   PPID   PGID    SID    UID      FLAGS             ADDR NAME
R   1825   1820   1825   1803      0 0x4a004000 ffffff04f5096c80 ptree
R   1798   1607   1798   1607  15000 0x4a004900 ffffff04f7b72930 ptree
R   1795   1302   1795   1294      0 0x4a004900 ffffff05179f7de0 ptree
> ::pgrep ptree | ::walk thread | ::findstack
stack pointer for thread ffffff04ea2ca440: ffffff00201777d0
[ ffffff00201777d0 _resume_from_idle+0xf1() ]
  ffffff0020177810 swtch+0x17f()
  ffffff00201778b0 turnstile_block+0x752()
  ffffff0020177920 rw_enter_sleep+0x1b0()
  ffffff00201779f0 zfs_getpage+0x10e()
  ffffff0020177aa0 fop_getpage+0x9f()
  ffffff0020177c60 segvn_fault+0x9ef()
  ffffff0020177d70 as_fault+0x5ae()
  ffffff0020177df0 pagefault+0x95()
  ffffff0020177f00 trap+0xbd3()
  ffffff0020177f10 0xfffffffffb8001d9()
stack pointer for thread ffffff04e8752400: ffffff001f9307d0
[ ffffff001f9307d0 _resume_from_idle+0xf1() ]
  ffffff001f930810 swtch+0x17f()
  ffffff001f9308b0 turnstile_block+0x752()
  ffffff001f930920 rw_enter_sleep+0x1b0()
  ffffff001f9309f0 zfs_getpage+0x10e()
  ffffff001f930aa0 fop_getpage+0x9f()
  ffffff001f930c60 segvn_fault+0x9ef()
  ffffff001f930d70 as_fault+0x5ae()
  ffffff001f930df0 pagefault+0x95()
  ffffff001f930f00 trap+0xbd3()
  ffffff001f930f10 0xfffffffffb8001d9()
stack pointer for thread ffffff066fbc6a80: ffffff001f27de90
[ ffffff001f27de90 _resume_from_idle+0xf1() ]
  ffffff001f27ded0 swtch+0x17f()
  ffffff001f27df00 cv_wait+0x61()
  ffffff001f27e040 vmem_xalloc+0x602()
  ffffff001f27e0b0 vmem_alloc+0x159()
  ffffff001f27e140 segkmem_xalloc+0x8c()
  ffffff001f27e1a0 segkmem_alloc_vn+0xcd()
  ffffff001f27e1d0 segkmem_zio_alloc+0x20()
  ffffff001f27e310 vmem_xalloc+0x4fc()
  ffffff001f27e380 vmem_alloc+0x159()
  ffffff001f27e410 kmem_slab_create+0x7d()
  ffffff001f27e450 kmem_slab_alloc+0x57()
  ffffff001f27e4b0 kmem_cache_alloc+0x136()
  ffffff001f27e4d0 zio_data_buf_alloc+0x28()
  ffffff001f27e510 arc_get_data_buf+0x175()
  ffffff001f27e560 arc_buf_alloc+0x9a()
  ffffff001f27e610 arc_read+0x122()
  ffffff001f27e6b0 dbuf_read_impl+0x129()
  ffffff001f27e710 dbuf_read+0xc5()
  ffffff001f27e7c0 dmu_buf_hold_array_by_dnode+0x1c4()
  ffffff001f27e860 dmu_read+0xd4()
  ffffff001f27e910 zfs_fillpage+0x15e()
  ffffff001f27e9f0 zfs_getpage+0x187()
  ffffff001f27eaa0 fop_getpage+0x9f()
  ffffff001f27ec60 segvn_fault+0x9ef()
  ffffff001f27ed70 as_fault+0x5ae()
  ffffff001f27edf0 pagefault+0x95()
  ffffff001f27ef00 trap+0xbd3()
  ffffff001f27ef10 0xfffffffffb8001d9()
> ::pgrep go_filebench | ::walk thread | ::findstack
stack pointer for thread ffffff055ee097e0: ffffff001f2394f0
[ ffffff001f2394f0 _resume_from_idle+0xf1() ]
  ffffff001f239530 swtch+0x17f()
  ffffff001f239560 cv_wait+0x61()
  ffffff001f2396a0 vmem_xalloc+0x602()
  ffffff001f239710 vmem_alloc+0x159()
  ffffff001f2397a0 segkmem_xalloc+0x8c()
  ffffff001f239800 segkmem_alloc_vn+0xcd()
  ffffff001f239830 segkmem_zio_alloc+0x20()
  ffffff001f239970 vmem_xalloc+0x4fc()
  ffffff001f2399e0 vmem_alloc+0x159()
  ffffff001f239a70 kmem_slab_create+0x7d()
  ffffff001f239ab0 kmem_slab_alloc+0x57()
  ffffff001f239b10 kmem_cache_alloc+0x136()
  ffffff001f239b30 zio_data_buf_alloc+0x28()
  ffffff001f239b70 arc_get_data_buf+0x175()
  ffffff001f239bc0 arc_buf_alloc+0x9a()
  ffffff001f239c00 dbuf_noread+0x9b()
  ffffff001f239c30 dmu_buf_will_fill+0x1f()
  ffffff001f239cd0 dmu_write_uio+0xd3()
  ffffff001f239dd0 zfs_write+0x468()
  ffffff001f239e40 fop_write+0x69()
  ffffff001f239f00 write+0x2af()
  ffffff001f239f10 sys_syscall+0x17b()
stack pointer for thread ffffff052eeebea0: ffffff001fbdcd00
[ ffffff001fbdcd00 _resume_from_idle+0xf1() ]
  ffffff001fbdcd40 swtch+0x17f()
  ffffff001fbdcd70 cv_wait+0x61()
  ffffff001fbdcdb0 exitlwps+0x1cb()
  ffffff001fbdce30 psig+0x4b1()
  ffffff001fbdcf00 post_syscall+0x446()
  ffffff001fbdcf10 0xfffffffffb800cad()
>
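
To check the reading that the hung threads are sitting in ZFS, the
::findstack output above can be summarized mechanically.  The following
is only a minimal sketch in Python and was not part of the original
report: the names parse_stacks and summarize are hypothetical, and the
prefix list used to recognise "ZFS" frames is a rough heuristic chosen
for illustration, not an official list.

```python
import re

# Shortened excerpt of the ::findstack output above (two of the five stacks).
SAMPLE = """\
stack pointer for thread ffffff04ea2ca440: ffffff00201777d0
[ ffffff00201777d0 _resume_from_idle+0xf1() ]
  ffffff0020177810 swtch+0x17f()
  ffffff00201778b0 turnstile_block+0x752()
  ffffff0020177920 rw_enter_sleep+0x1b0()
  ffffff00201779f0 zfs_getpage+0x10e()
  ffffff0020177aa0 fop_getpage+0x9f()
stack pointer for thread ffffff052eeebea0: ffffff001fbdcd00
[ ffffff001fbdcd00 _resume_from_idle+0xf1() ]
  ffffff001fbdcd40 swtch+0x17f()
  ffffff001fbdcd70 cv_wait+0x61()
  ffffff001fbdcdb0 exitlwps+0x1cb()
  ffffff001fbdcf10 0xfffffffffb800cad()
"""

# "stack pointer for thread <addr>: <sp>" starts a new stack.
THREAD_RE = re.compile(r"^stack pointer for thread ([0-9a-f]+):")
# "<frame addr> <symbol>+0x<offset>()", optionally wrapped in "[ ... ]".
# Frames without a symbol (bare 0x... addresses) are skipped on purpose.
FRAME_RE = re.compile(r"^\[?\s*[0-9a-f]+\s+([A-Za-z_]\w*)\+0x[0-9a-f]+\(\)")

# Frames every sleeping thread shows; ignored when looking for the wait point.
SCHEDULER_FRAMES = {"_resume_from_idle", "swtch"}
# Assumption: a heuristic set of prefixes treated as "ZFS code".
ZFS_PREFIXES = ("zfs_", "arc_", "dbuf_", "dmu_", "zio_")


def parse_stacks(text):
    """Return {thread address: [function names, innermost frame first]}."""
    stacks = {}
    frames = None
    for line in text.splitlines():
        line = line.strip()
        header = THREAD_RE.match(line)
        if header:
            frames = stacks.setdefault(header.group(1), [])
            continue
        frame = FRAME_RE.match(line)
        if frame and frames is not None:
            frames.append(frame.group(1))
    return stacks


def summarize(text):
    """Return (thread, first non-scheduler frame, has ZFS frame) per stack."""
    rows = []
    for thread, frames in parse_stacks(text).items():
        blocked_in = next((f for f in frames if f not in SCHEDULER_FRAMES), None)
        touches_zfs = any(f.startswith(ZFS_PREFIXES) for f in frames)
        rows.append((thread, blocked_in, touches_zfs))
    return rows


if __name__ == "__main__":
    for thread, blocked_in, touches_zfs in summarize(SAMPLE):
        print(thread, blocked_in, "ZFS" if touches_zfs else "not ZFS")
```

On the excerpt, the first thread is reported as blocked in
turnstile_block with zfs_getpage on its stack, while the second sits in
cv_wait under exitlwps with no ZFS frames at all.  Feeding the complete
output through summarize() instead of SAMPLE would give the same
breakdown for every thread.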

If I can put this dump to use somehow, please just let me know.

Ceri
-- 
That must be wonderful!  I don't understand it at all.
                                                  -- Moliere

_______________________________________________
storage-discuss mailing list
[EMAIL PROTECTED]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss

--- End Message ---
