I can't help but be curious about something, which perhaps you verified but did not post.
What the data here shows is:

- CPU 31 is buried in the kernel (100% sys).
- CPU 31 is handling a moderate-to-high rate of xcalls.

What the data does not prove empirically is that the 100% sys time on CPU 31 is spent in xcall handling. What's the hot stack when this occurs and you run this:

dtrace -n 'profile-997hz /cpu == 31/ { @[stack()] = count(); }'

On Jun 6, 2012, at 3:48 AM, Sašo Kiselkov wrote:

> So I have this dual 16-core Opteron Dell R715 with 128G of RAM attached
> to a SuperMicro disk enclosure with 45 2TB Toshiba SAS drives (via two
> LSI 9200 controllers and MPxIO) running OpenIndiana 151a4, and I'm
> occasionally seeing a storm of xcalls on one of the 32 VCPUs (>100000
> xcalls a second). The machine is pretty much idle, only receiving a
> bunch of multicast video streams and dumping them to the drives (at a
> rate of ~40MB/s). At an interval of roughly 1-2 minutes I get a storm of
> xcalls that completely eats one of the CPUs, so the mpstat line for the
> CPU looks like:
>
> CPU minf mjf   xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
>  31    0   0 102191    1    0   0    0    0    0   0     0   0 100  0   0
>
> 100% busy in the system processing cross-calls. When I tried dtracing
> this issue, I found that this is the most likely culprit:
>
> dtrace -n 'sysinfo:::xcalls {@[stack()]=count();}'
>               unix`xc_call+0x46
>               unix`hat_tlb_inval+0x283
>               unix`x86pte_inval+0xaa
>               unix`hat_pte_unmap+0xed
>               unix`hat_unload_callback+0x193
>               unix`hat_unload+0x41
>               unix`segkmem_free_vn+0x6f
>               unix`segkmem_zio_free+0x27
>               genunix`vmem_xfree+0x104
>               genunix`vmem_free+0x29
>               genunix`kmem_slab_destroy+0x87
>               genunix`kmem_slab_free+0x2bb
>               genunix`kmem_magazine_destroy+0x39a
>               genunix`kmem_depot_ws_reap+0x66
>               genunix`taskq_thread+0x285
>               unix`thread_start+0x8
>               3221701
>
> This happens in the sched (pid 0) process. My fsstat output looks like this:
>
> # fsstat /content 1
>  new  name   name   attr  attr lookup rddir  read  read write write
> file remov   chng    get   set    ops   ops   ops bytes   ops bytes
>    0     0      0    664     0    952     0     0     0   664 38.0M /content
>    0     0      0    658     0    935     0     0     0   656 38.6M /content
>    0     0      0    660     0    946     0     0     0   659 37.8M /content
>    0     0      0    677     0    969     0     0     0   676 38.5M /content
>
> What's even more puzzling is that this apparently happens entirely
> because of some factor other than userland, since I see no change in the
> CPU usage of processes in prstat(1M) when this xcall storm happens, only
> an increase in loadavg of +1.00 (the busy CPU).
>
> I Googled and found that
> http://mail.opensolaris.org/pipermail/dtrace-discuss/2009-September/008107.html
> seems to have been an issue identical to mine; however, it remained
> unresolved at that time, and it worries me about putting this kind of
> machine into production use.
>
> Could some ZFS guru please tell me what's going on in segkmem_zio_free?
> When I disable the writers to the /content filesystem, this issue goes
> away, so it obviously has something to do with disk IO. Thanks!
>
> Cheers,
> --
> Saso
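
If the profile does show the hot stack in the xcall/TLB-invalidation path, a rough next step would be to check how tightly the storms line up with the kmem depot reap that appears in your sysinfo:::xcalls stack. A minimal, untested sketch along those lines (it assumes fbt entry/return probes are available for kmem_depot_ws_reap on your OpenIndiana build):

#!/usr/sbin/dtrace -s

/* Count sysinfo:::xcalls firings on CPU 31 each second, and time each
 * run of the kmem depot reaper from the stack above, to see whether
 * the xcall storms coincide with the reap. */

sysinfo:::xcalls
/cpu == 31/
{
        @xc["xcalls on CPU 31 (this second)"] = count();
}

fbt::kmem_depot_ws_reap:entry
{
        self->ts = timestamp;
}

fbt::kmem_depot_ws_reap:return
/self->ts/
{
        @reap["kmem_depot_ws_reap duration (ns)"] = quantize(timestamp - self->ts);
        self->ts = 0;
}

tick-1sec
{
        printa(@xc);
        clear(@xc);
}

If the per-second xcall spikes only show up while the reaper is running, that points at kernel memory reaping (and the TLB shootdowns it triggers) rather than anything your userland writers are doing directly.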