On Wed, May 20, 2015 at 10:41 AM, Alexander Kabaev <kab...@gmail.com> wrote:
> On Wed, 20 May 2015 09:54:45 -0700 > Matthew Ahrens <m...@mahrens.org> wrote: > > > On Wed, May 20, 2015 at 9:00 AM, Alexander Kabaev <kab...@gmail.com> > > wrote: > > > > > On Fri, 15 May 2015 13:50:38 +0000 (UTC) > > > John Baldwin <j...@freebsd.org> wrote: > > > > > > > Author: jhb > > > > Date: Fri May 15 13:50:37 2015 > > > > New Revision: 282971 > > > > URL: https://svnweb.freebsd.org/changeset/base/282971 > > > > > > > > Log: > > > > Previously, cv_waiters was only updated by cv_signal or > > > > cv_wait. If a thread awakened due to a time out, then cv_waiters > > > > was not decremented. If INT_MAX threads timed out on a cv without > > > > an intervening cv_broadcast, then cv_waiters could overflow. To > > > > fix this, have each sleeping thread decrement cv_waiters when it > > > > resumes. > > > > > > > > Note that previously cv_waiters was protected by the sleepq > > > > chain lock. However, that lock is not held when threads resume > > > > from sleep. In addition, the interlock is also not always > > > > reacquired after resuming (cv_wait_unlock), nor is it always held > > > > by callers of cv_signal() or cv_broadcast(). Instead, use atomic > > > > ops to update cv_waiters. Since the sleepq chain lock is still > > > > held on every increment, it should still be safe to compare > > > > cv_waiters against zero while holding the lock in the wakeup > > > > routines as the only way the race should be lost would result in > > > > extra calls to sleepq_signal() or sleepq_broadcast(). > > > > Differential Revision: https://reviews.freebsd.org/D2427 > > > > Reviewed by: benno > > > > Reported by: benno (wrap of cv_waiters in the field) > > > > MFC after: 2 weeks > > > > > > > > Modified: > > > > head/sys/kern/kern_condvar.c > > > > head/sys/sys/condvar.h > > > > > > > > > > This breaks ZFS range locking code, which expects to be able to > > > wakeup everyone on the condition variable and then free the > > > structure that contains it. Having woken up threads modify > > > cv_waiters results in a race that leads to already freed memory to > > > be accessed. > > > > > > It is debatable just how correct ZFS code in its expectations, but I > > > think this commit should probably be reverted until either ZFS is > > > changed not to expect cv modifiable by waking threads or until > > > alternative solution is found to the cv_waiters overflow issue > > > fixed by this commit. > > > > > > > > It isn't clear to me how the zfs_range_unlock() code could know when > > all the waiters have woken up and updated the CV, and thus it's safe > > to destroy/free the CV. Would the woken threads ask, "was I the last > > thread to be woken by this CV" and if so free the struct containing > > the CV? Obviously such a check would need to ensure that the other > > threads have completed their updates to the CV. > > > > --matt > > Assuming other threads _need_ to update cv after they have been woken > up. Clearly Solaris implementation managed to do without and our code > changed that breaking range locks implementation we took directly from > OpenSolaris (or illumos). What was previously possible now isn't. As I > wrote before, while merits of this expectations are debatable and it is > not hard to see where Solaris way is advantageous, that is really > besides the point. Are you arguing that we should leave kernel in known > broken state until 'proper' fix makes its way through possible upstream > detour? > Not at all. Breaking ZFS is not OK. I was just trying to understand if it's even possible to fix the breakage within ZFS. If it's not possible/reasonable, then the CV semantics would clearly have to be reverted. > > Also, we have large code base taken from Solaris and chances are this is > not the only place that might be affected. I think we are better off > with this commit temporarily reverted until necessary repairs and > auditing are complete for it to be safely re-enabled. > > Agreed that the risk is large (a huge amount of code is potentially impacted, probably not only Solaris-derived code), and does not seem to have been analyzed before this change was landed. --matt _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"