On 5/12/11 6:55 AM, Stanislav Sedov wrote:
On Thu, 12 May 2011 13:43:58 +0300
Andriy Gapon <a...@freebsd.org> mentioned:


Theory:
- smp_rv_waiters[2] becomes equal to smp_rv_ncpus
- [at least] one slave CPU is still in the last call to cpu_spinwait() in
smp_rendezvous_action()
- master CPU notices that the condition is true, exits smp_rendezvous_cpus() and
calls it again
- the slave CPU is still in spinwait
- the master CPU resets smp_rv_waiters[2] to zero
- the slave CPU exits spinwait, sees smp_rv_waiters[2] with zero value
- endless loop
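
The interleaving in the theory above can be replayed step by step in a small single-threaded sketch. It assumes the slave's final loop spins until smp_rv_waiters[2] reaches smp_rv_ncpus, which is what the theory implies; the variables here are plain stand-ins for the kernel globals, and poll_exit_condition() and replay_race() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel globals named in the theory (assumption). */
static int smp_rv_ncpus = 2;
static int smp_rv_waiters[3];

/* One evaluation of the assumed exit test of the final spin loop. */
bool poll_exit_condition(void)
{
    return smp_rv_waiters[2] >= smp_rv_ncpus;
}

/*
 * Replays the suspected interleaving in program order.
 * Returns true if the master got out but the slave missed the exit.
 */
bool replay_race(void)
{
    assert(smp_rv_ncpus > 0);

    /* Every CPU has checked in; the slave is still inside cpu_spinwait(). */
    smp_rv_waiters[2] = smp_rv_ncpus;

    /* The master notices the condition and leaves the rendezvous. */
    bool master_saw_exit = poll_exit_condition();

    /* The master starts the next rendezvous and resets the counter. */
    smp_rv_waiters[2] = 0;

    /* Only now does the slave return from cpu_spinwait() and re-check. */
    bool slave_saw_exit = poll_exit_condition();

    return master_saw_exit && !slave_saw_exit;
}
```

In this replay the slave never observes the counter at smp_rv_ncpus. Since it is still stuck in the old loop, it presumably cannot check in to the new rendezvous either, which would match the endless loop.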


That might explain it.
Do you have a patch for me to try?

Thanks!


The NetApp folks working on BHyVe also ran into this. They have a fix that sounds reasonable to me: add a generation count to the smp rendezvous "structure" and have waiting CPUs stop waiting if the generation count changes.
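
A minimal sketch of the generation-count idea, to make the exit rule concrete. This is not the NetApp patch: struct rv_state, rv_begin() and rv_may_exit() are hypothetical names, and the sketch is single-threaded, so it leaves out the atomic operations and memory barriers that real kernel code would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical rendezvous state; the kernel keeps these as separate globals. */
struct rv_state {
    int ncpus;            /* CPUs taking part in this rendezvous */
    int waiters;          /* CPUs that have reached the final barrier */
    unsigned generation;  /* bumped every time a new rendezvous starts */
};

/* Master side: set up a new rendezvous and advance the generation. */
void rv_begin(struct rv_state *rv, int ncpus)
{
    assert(ncpus > 0);
    rv->ncpus = ncpus;
    rv->waiters = 0;
    rv->generation++;
}

/*
 * Slave side: one evaluation of the spin-loop exit test.
 * my_gen is the generation the slave sampled when it entered the rendezvous.
 * The slave may leave once everyone has arrived, or once the generation has
 * moved on, which means the rendezvous it was waiting for is already over.
 */
bool rv_may_exit(const struct rv_state *rv, unsigned my_gen)
{
    return rv->waiters >= rv->ncpus || rv->generation != my_gen;
}
```

With this rule, a slave that wakes up late and finds the counter reset to zero still leaves the loop, because the generation it sampled no longer matches the current one.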

--
John Baldwin
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
