I can't help but be curious about something, which perhaps you verified but
did not post.

What the data here shows is:
- CPU 31 is buried in the kernel (100% sys).
- CPU 31 is handling a moderate-to-high rate of xcalls.

What the data does not prove empirically is that the 100% sys time of
CPU 31 is in xcall handling.

What's the hot stack when this occurs and you run this:

dtrace -n 'profile-997hz /cpu == 31/ { @[stack()] = count(); }'
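
If the storm is short-lived, a slightly longer variant of the same idea
may make it easier to catch. This is only a sketch on my part; the
10-second window, the 20-stack cutoff and the kernel-only predicate are
arbitrary choices:

#!/usr/sbin/dtrace -s
/* Sample kernel stacks on CPU 31 at 997 Hz; keep only the 20 hottest
 * stacks and stop after 10 seconds. In the profile provider arg0 is
 * the kernel PC, so arg0 != 0 restricts the samples to kernel context,
 * which is where mpstat says all of the time is going. */
profile-997hz
/cpu == 31 && arg0 != 0/
{
        @[stack()] = count();
}

tick-10s
{
        trunc(@, 20);
        exit(0);
}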


On Jun 6, 2012, at 3:48 AM, Sašo Kiselkov wrote:

> So I have this dual 16-core Opteron Dell R715 with 128G of RAM attached
> to a SuperMicro disk enclosure with 45 2TB Toshiba SAS drives (via two
> LSI 9200 controllers and MPxIO) running OpenIndiana 151a4 and I'm
> occasionally seeing a storm of xcalls on one of the 32 VCPUs (>100000
> xcalls a second). The machine is pretty much idle, only receiving a
> bunch of multicast video streams and dumping them to the drives (at a
> rate of ~40MB/s). At an interval of roughly 1-2 minutes I get a storm of
> xcalls that completely eat one of the CPUs, so the mpstat line for the
> CPU looks like:
> 
> CPU minf mjf   xcal intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
>  31    0   0 102191    1    0    0    0    0    0    0     0    0 100   0   0
> 
> 100% busy in the system processing cross-calls. When I tried dtracing
> this issue, I found that this is the most likely culprit:
> 
> dtrace -n 'sysinfo:::xcalls {@[stack()]=count();}'
>   unix`xc_call+0x46
>   unix`hat_tlb_inval+0x283
>   unix`x86pte_inval+0xaa
>   unix`hat_pte_unmap+0xed
>   unix`hat_unload_callback+0x193
>   unix`hat_unload+0x41
>   unix`segkmem_free_vn+0x6f
>   unix`segkmem_zio_free+0x27
>   genunix`vmem_xfree+0x104
>   genunix`vmem_free+0x29
>   genunix`kmem_slab_destroy+0x87
>   genunix`kmem_slab_free+0x2bb
>   genunix`kmem_magazine_destroy+0x39a
>   genunix`kmem_depot_ws_reap+0x66
>   genunix`taskq_thread+0x285
>   unix`thread_start+0x8
> 3221701
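> 
> The per-second rate on that CPU can also be cross-checked against the
> xcal column in mpstat with something like:
> 
> dtrace -n 'sysinfo:::xcalls /cpu == 31/ { @ = count(); } tick-1s { printa("%@d xcalls/s\n", @); clear(@); }'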
> 
> This happens in the sched (pid 0) process. My fsstat output looks like this:
> 
> # fsstat /content 1
> new  name   name  attr  attr lookup rddir  read read  write write
> file remov  chng   get   set    ops   ops   ops bytes   ops bytes
>    0     0     0   664     0    952     0     0     0   664 38.0M /content
>    0     0     0   658     0    935     0     0     0   656 38.6M /content
>    0     0     0   660     0    946     0     0     0   659 37.8M /content
>    0     0     0   677     0    969     0     0     0   676 38.5M /content
> 
> What's even more puzzling is that this apparently happens entirely
> outside of userland: I see no change in the CPU usage of any process in
> prstat(1M) when the xcall storm hits, only an increase in loadavg of
> +1.00 (the busy CPU).
> 
> I Googled and found that
> http://mail.opensolaris.org/pipermail/dtrace-discuss/2009-September/008107.html
> describes what seems to be an issue identical to mine; however, it was
> never resolved at the time, which worries me about putting this kind of
> machine into production use.
> 
> Could some ZFS guru please tell me what's going on in segkmem_zio_free?
> When I disable the writers to the /content filesystem, this issue goes
> away, so it obviously has something to do with disk I/O. Thanks!
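> 
> For anyone who wants to poke at this further, how often that free path
> fires, and from where, can be counted directly with something like:
> 
> dtrace -n 'fbt::segkmem_zio_free:entry { @[stack(5)] = count(); }'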
> 
> Cheers,
> --
> Saso

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
