On Feb 13, 2015, at 4:21 PM, Gleb Smirnoff <gleb...@freebsd.org> wrote:
> On Mon, Feb 09, 2015 at 03:11:21PM -0500, John Baldwin wrote:
> J> On Monday, February 09, 2015 07:28:12 PM Randall Stewart wrote:
> J> > Author: rrs
> J> > Date: Mon Feb 9 19:28:11 2015
> J> > New Revision: 278472
> J> > URL: https://svnweb.freebsd.org/changeset/base/278472
> J> >
> J> > Log:
> J> >   This fixes a bug in the way that the LLE timers for nd6
> J> >   and arp were being used. They basically would pass in the
> J> >   mutex to the callout_init. Because they used this method
> J> >   to the callout system, it was possible to "stop" the callout.
> J> >   When flushing the table and you stopped the running callout, the
> J> >   callout_stop code would return 1 indicating that it was going
> J> >   to stop the callout (that was about to run on the callout_wheel blocked
> J> >   by the function calling the stop). Now when 1 was returned, it would
> J> >   lower the reference count one extra time for the stopped timer, then
> J> >   a few lines later delete the memory. Of course the callout_wheel was
> J> >   stuck in the lock code and would then crash since it was accessing
> J> >   freed memory. By using callout_init(c, 1) we always get a 0 back
> J> >   and the reference counting bug does not rear its head. We do have
> J> >   to make a few adjustments to the callouts themselves though to make
> J> >   sure it does the proper thing if rescheduled as well as gets the lock.
> J> >
> J> >   Commented upon by hiren and sbruno
> J> >   See Phabricator D1777 for more details.
> J> >
> J> >   Commented upon by hiren and sbruno
> J> >   Reviewed by: adrian, jhb and bz
> J> >   Sponsored by: Netflix Inc.
> J>
> J> Eh, I looked at it, but I really, really don't like it. I think
> J> callout_init_*() should be preferred to CALLOUT_MPSAFE whenever possible as it
> J> is less race-prone. I think this should probably be fixed by adding Hans'
> J> callout_drain_async() instead, though this is fine as a temporary workaround.
>
> I second concerns. Please look at kern/165863 and r238990 that fixed it.
> Transition from CALLOUT_MPSAFE to callout_init_rw() was intentional
> and fixed races.
>
> I added to Cc guys who helped to track down that races. May be someone still
> has test scripts at hand. AFAIR, there were some that allowed to put a box
> down quite quickly.

Well without it we can also put a box down quickly.. at least Sbruno and Hiren
seem to be able to.. you end up with deleted memory being accessed by the
callout code.

I can look at kern/165863 and 238990.. let me go see what I can see ;0

>
> --
> Totus tuus, Glebius.

--------
Randall Stewart
r...@netflix.com
803-317-4952