My patch addresses the following:
On Nov 13, 2015, at 4:13 PM, Alexander V. Chernikov <melif...@freebsd.org> wrote:

>
>
> 13.11.2015, 23:59, "Randall Stewart" <r...@netflix.com>:
>> Strange
>>
>> I went looking through all calls to callout stop with cscope and saw
>> no one paying attention to the return value… (which I thought was not good).
>
> 23:49 [0] m@fhead5 grep -R callout_stop sys | egrep '(=|\<if\>)'
> sys/netgraph/ng_base.c: rval = callout_stop(c);

This one does not need changing.

> sys/netpfil/pf/if_pfsync.c: if (callout_stop(&pd->pd_tmo)) {
> sys/netpfil/pf/if_pfsync.c: if (callout_stop(&pd->pd_tmo))

The above two I changed to > 0

> sys/dev/isci/isci_timer.c: /* callout_stop() will *not* keep the time

None of the ones in isci_timer.c check the return code or do anything different.

>
> sys/netinet6/nd6.c: canceled = callout_stop(&ln->lle_timer);
> sys/netinet6/in6.c: if (callout_stop(&lle->lle_timer))
> sys/net/if_llatbl.c: if (callout_stop(&lle->lle_timer))

The above needed the same

> sys/kern/subr_taskqueue.c: pending = !!callout_stop(&timeout_task->c);

same as above.. only I think the !! is strange :-)

> sys/kern/kern_exit.c: callout_stop(&p->p_itcallout) == 0) {

Hmm I may have missed that one let me check

Ok looking at that one it does not need to be changed.. in fact it is more correct.
Since the 0 return on an already expired callout is now -1 which this if code
is looking for.

> sys/kern/subr_sleepqueue.c: else if (callout_stop(&td->td_slpcallout) ==
> 0) {

This one again was causing extra work when the callout was already stopped and it
returned 0.. it would do a synchronize on the other CPU.. but if -1 comes back it
says the callout is already stopped.. so no synchronization is needed..
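To illustrate why the plain truthiness checks needed to become "> 0", here is a minimal userspace sketch (not kernel code). Everything in it is hypothetical: struct entry, model_callout_stop() and the two stop_*() helpers are stand-ins I made up for illustration, and it assumes a caller that drops a timer reference whenever the stop "succeeds". Only the three return values (1, 0, -1) come from the timeout.9 text in the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an object that holds one reference for its timer. */
struct entry {
	int refcnt;
	bool timer_pending;	/* callout is queued and has not fired */
	bool timer_running;	/* callout is being serviced right now */
};

/*
 * Hypothetical stand-in for callout_stop() using the return values
 * described in the r290664 man page text:
 *   1  pending and successfully stopped
 *   0  currently being serviced, cannot be stopped
 *  -1  not set, or already serviced
 */
static int
model_callout_stop(struct entry *e)
{
	if (e->timer_pending) {
		e->timer_pending = false;
		return (1);
	}
	if (e->timer_running)
		return (0);
	return (-1);
}

/* Old-style caller: any non-zero result drops the timer's reference. */
static void
stop_truthy(struct entry *e)
{
	if (model_callout_stop(e))
		e->refcnt--;
}

/* Updated caller: only a positive result drops the reference. */
static void
stop_positive(struct entry *e)
{
	if (model_callout_stop(e) > 0)
		e->refcnt--;
}

/* Walk the interesting cases for an entry whose timer was never armed. */
static void
check_examples(void)
{
	struct entry a = { .refcnt = 1 };
	struct entry b = { .refcnt = 1 };
	struct entry c = { .refcnt = 2, .timer_pending = true };

	stop_truthy(&a);	/* -1 is "true", so the only reference is dropped */
	assert(a.refcnt == 0);

	stop_positive(&b);	/* -1 is ignored, the reference survives */
	assert(b.refcnt == 1);

	stop_positive(&c);	/* a real cancel still drops the timer reference */
	assert(c.refcnt == 1);
}
```

Calling check_examples() from a small main() exercises all three cases. In this model the truthiness check drops a reference nobody took for the timer, which is the kind of accounting error that a "bogus refcnt" assertion would catch.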
> sys/netinet/in.c: if (callout_stop(&lle->lle_timer))
> sys/netinet/tcp_timer.c: if (callout_stop(t_callout) &&

These two I made > 0 though the TCP one needs to change to use the new async_drain

>
> (not counting callout_drain() here)

drain is different since it is done safe it should wait for the completion of the timeout.
I don’t know if you could ever get a 0 return from it..

R

>
>>
>> And yes I am running this in a lot of systems.
> Try this:
> 0:11 [0] fhead0# ifconfig vtnet0 alias 10.10.10.10/32
> 0:11 [0] fhead0# ifconfig vtnet0 -alias 10.10.10.10
> callout_stop() for lle 10.10.10.10 on vtnet0, lle_refcnt=1
> panic: bogus refcnt 0 on lle 0xfffff8001996c400
>>
>> R
>>
>>
>> On Nov 13, 2015, at 6:16 AM, Alexander V. Chernikov <melif...@freebsd.org>
>> wrote:
>>
>>>
>>> 10.11.2015, 17:49, "Randall Stewart" <r...@freebsd.org>:
>>>>
>>>> Author: rrs
>>>> Date: Tue Nov 10 14:49:32 2015
>>>> New Revision: 290664
>>>> URL: https://svnweb.freebsd.org/changeset/base/290664
>>>>
>>>> Log:
>>>> Add new async_drain to the callout system. This is so-far not used but
>>>> should be used by TCP for sure in its cleanup of the IN-PCB (will be
>>>> coming shortly).
>>>
>>> Randall, this commit introduced change in callout_stop() which was not
>>> mentioned in commit message.
>>> This change has broken lltable arp/nd handling: deleting interface address
>>> causes immediate panic.
>>> I also see other other code/subsystems relying on callout_stop() return
>>> value (netgraph, pfsync, iscsi).
>>> I was not able to find any discussion/analysis/testing for these in D4076
>>> so this change does not look like being properly tested prior commiting..
>>>
>>>
>>>
>>>>
>>>> Sponsored by: Netflix Inc.
>>>> Differential Revision: https://reviews.freebsd.org/D4076
>>>>
>>>> Modified:
>>>>   head/share/man/man9/timeout.9
>>>>   head/sys/kern/kern_timeout.c
>>>>   head/sys/sys/callout.h
>>>>
>>>> Modified: head/share/man/man9/timeout.9
>>>> ==============================================================================
>>>> --- head/share/man/man9/timeout.9 Tue Nov 10 14:14:41 2015 (r290663)
>>>> +++ head/share/man/man9/timeout.9 Tue Nov 10 14:49:32 2015 (r290664)
>>>> @@ -35,6 +35,7 @@
>>>>  .Sh NAME
>>>>  .Nm callout_active ,
>>>>  .Nm callout_deactivate ,
>>>> +.Nm callout_async_drain ,
>>>>  .Nm callout_drain ,
>>>>  .Nm callout_handle_init ,
>>>>  .Nm callout_init ,
>>>> @@ -69,6 +70,8 @@ typedef void timeout_t (void *);
>>>>  .Ft void
>>>>  .Fn callout_deactivate "struct callout *c"
>>>>  .Ft int
>>>> +.Fn callout_async_drain "struct callout *c" "timeout_t *drain"
>>>> +.Ft int
>>>>  .Fn callout_drain "struct callout *c"
>>>>  .Ft void
>>>>  .Fn callout_handle_init "struct callout_handle *handle"
>>>> @@ -236,17 +239,42 @@ The function
>>>>  cancels a callout
>>>>  .Fa c
>>>>  if it is currently pending.
>>>> -If the callout is pending, then
>>>> +If the callout is pending and successfuly stopped, then
>>>>  .Fn callout_stop
>>>> -returns a non-zero value.
>>>> -If the callout is not set,
>>>> -has already been serviced,
>>>> -or is currently being serviced,
>>>> +returns a value of one.
>>>> +If the callout is not set, or
>>>> +has already been serviced, then
>>>> +negative one is returned.
>>>> +If the callout is currently being serviced and cannot be stopped,
>>>>  then zero will be returned.
>>>>  If the callout has an associated lock,
>>>>  then that lock must be held when this function is called.
>>>>  .Pp
>>>>  The function
>>>> +.Fn callout_async_drain
>>>> +is identical to
>>>> +.Fn callout_stop
>>>> +with one difference.
>>>> +When
>>>> +.Fn callout_async_drain
>>>> +returns zero it will arrange for the function
>>>> +.Fa drain
>>>> +to be called using the same argument given to the
>>>> +.Fn callout_reset
>>>> +function.
>>>> +.Fn callout_async_drain
>>>> +If the callout has an associated lock,
>>>> +then that lock must be held when this function is called.
>>>> +Note that when stopping multiple callouts that use the same lock it is
>>>> possible
>>>> +to get multiple return's of zero and multiple calls to the
>>>> +.Fa drain
>>>> +function, depending upon which CPU's the callouts are running. The
>>>> +.Fa drain
>>>> +function itself is called from the context of the completing callout
>>>> +i.e. softclock or hardclock, just like a callout itself.
>>>> +p
>>>> +.Pp
>>>> +The function
>>>>  .Fn callout_drain
>>>>  is identical to
>>>>  .Fn callout_stop
>>>>
>>>> Modified: head/sys/kern/kern_timeout.c
>>>> ==============================================================================
>>>> --- head/sys/kern/kern_timeout.c Tue Nov 10 14:14:41 2015 (r290663)
>>>> +++ head/sys/kern/kern_timeout.c Tue Nov 10 14:49:32 2015 (r290664)
>>>> @@ -136,6 +136,7 @@ u_int callwheelsize, callwheelmask;
>>>>   */
>>>>  struct cc_exec {
>>>>  	struct callout *cc_curr;
>>>> +	void (*cc_drain)(void *);
>>>>  #ifdef SMP
>>>>  	void (*ce_migration_func)(void *);
>>>>  	void *ce_migration_arg;
>>>> @@ -170,6 +171,7 @@ struct callout_cpu {
>>>>  #define callout_migrating(c) ((c)->c_iflags & CALLOUT_DFRMIGRATION)
>>>>
>>>>  #define cc_exec_curr(cc, dir) cc->cc_exec_entity[dir].cc_curr
>>>> +#define cc_exec_drain(cc, dir) cc->cc_exec_entity[dir].cc_drain
>>>>  #define cc_exec_next(cc) cc->cc_next
>>>>  #define cc_exec_cancel(cc, dir) cc->cc_exec_entity[dir].cc_cancel
>>>>  #define cc_exec_waiting(cc, dir) cc->cc_exec_entity[dir].cc_waiting
>>>> @@ -679,6 +681,7 @@ softclock_call_cc(struct callout *c, str
>>>>
>>>>  	cc_exec_curr(cc, direct) = c;
>>>>  	cc_exec_cancel(cc, direct) = false;
>>>> +	cc_exec_drain(cc, direct) = NULL;
>>>>  	CC_UNLOCK(cc);
>>>>  	if (c_lock != NULL) {
>>>>  		class->lc_lock(c_lock, lock_status);
>>>> @@ -744,6 +747,15 @@ skip:
>>>>  	CC_LOCK(cc);
>>>>  	KASSERT(cc_exec_curr(cc, direct) == c, ("mishandled cc_curr"));
>>>>  	cc_exec_curr(cc, direct) = NULL;
>>>> +	if (cc_exec_drain(cc, direct)) {
>>>> +		void (*drain)(void *);
>>>> +
>>>> +		drain = cc_exec_drain(cc, direct);
>>>> +		cc_exec_drain(cc, direct) = NULL;
>>>> +		CC_UNLOCK(cc);
>>>> +		drain(c_arg);
>>>> +		CC_LOCK(cc);
>>>> +	}
>>>>  	if (cc_exec_waiting(cc, direct)) {
>>>>  		/*
>>>>  		 * There is someone waiting for the
>>>> @@ -1145,7 +1157,7 @@ callout_schedule(struct callout *c, int
>>>>  }
>>>>
>>>>  int
>>>> -_callout_stop_safe(struct callout *c, int safe)
>>>> +_callout_stop_safe(struct callout *c, int safe, void (*drain)(void *))
>>>>  {
>>>>  	struct callout_cpu *cc, *old_cc;
>>>>  	struct lock_class *class;
>>>> @@ -1225,19 +1237,22 @@ again:
>>>>  	 * stop it by other means however.
>>>>  	 */
>>>>  	if (!(c->c_iflags & CALLOUT_PENDING)) {
>>>> -		c->c_flags &= ~CALLOUT_ACTIVE;
>>>> -
>>>>  		/*
>>>>  		 * If it wasn't on the queue and it isn't the current
>>>>  		 * callout, then we can't stop it, so just bail.
>>>> +		 * It probably has already been run (if locking
>>>> +		 * is properly done). You could get here if the caller
>>>> +		 * calls stop twice in a row for example. The second
>>>> +		 * call would fall here without CALLOUT_ACTIVE set.
>>>>  		 */
>>>> +		c->c_flags &= ~CALLOUT_ACTIVE;
>>>>  		if (cc_exec_curr(cc, direct) != c) {
>>>>  			CTR3(KTR_CALLOUT, "failed to stop %p func %p arg
>>>> %p",
>>>>  			    c, c->c_func, c->c_arg);
>>>>  			CC_UNLOCK(cc);
>>>>  			if (sq_locked)
>>>>  				sleepq_release(&cc_exec_waiting(cc,
>>>> direct));
>>>> -			return (0);
>>>> +			return (-1);
>>>>  		}
>>>>
>>>>  		if (safe) {
>>>> @@ -1298,14 +1313,16 @@ again:
>>>>  				CC_LOCK(cc);
>>>>  			}
>>>>  		} else if (use_lock &&
>>>> -		    !cc_exec_cancel(cc, direct)) {
>>>> +		    !cc_exec_cancel(cc, direct) && (drain == NULL)) {
>>>>
>>>>  			/*
>>>>  			 * The current callout is waiting for its
>>>>  			 * lock which we hold. Cancel the callout
>>>>  			 * and return. After our caller drops the
>>>>  			 * lock, the callout will be skipped in
>>>> -			 * softclock().
>>>> +			 * softclock(). This *only* works with a
>>>> +			 * callout_stop() *not* callout_drain() or
>>>> +			 * callout_async_drain().
>>>>  			 */
>>>>  			cc_exec_cancel(cc, direct) = true;
>>>>  			CTR3(KTR_CALLOUT, "cancelled %p func %p arg %p",
>>>> @@ -1351,11 +1368,17 @@ again:
>>>>  #endif
>>>>  			CTR3(KTR_CALLOUT, "postponing stop %p func %p arg
>>>> %p",
>>>>  			    c, c->c_func, c->c_arg);
>>>> +			if (drain) {
>>>> +				cc_exec_drain(cc, direct) = drain;
>>>> +			}
>>>>  			CC_UNLOCK(cc);
>>>>  			return (0);
>>>>  		}
>>>>  		CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p",
>>>>  		    c, c->c_func, c->c_arg);
>>>> +		if (drain) {
>>>> +			cc_exec_drain(cc, direct) = drain;
>>>> +		}
>>>>  		CC_UNLOCK(cc);
>>>>  		KASSERT(!sq_locked, ("sleepqueue chain still locked"));
>>>>  		return (0);
>>>>
>>>> Modified: head/sys/sys/callout.h
>>>> ==============================================================================
>>>> --- head/sys/sys/callout.h Tue Nov 10 14:14:41 2015 (r290663)
>>>> +++ head/sys/sys/callout.h Tue Nov 10 14:49:32 2015 (r290664)
>>>> @@ -81,7 +81,7 @@ struct callout_handle {
>>>>   */
>>>>  #define callout_active(c) ((c)->c_flags & CALLOUT_ACTIVE)
>>>>  #define callout_deactivate(c) ((c)->c_flags &= ~CALLOUT_ACTIVE)
>>>> -#define callout_drain(c) _callout_stop_safe(c, 1)
>>>> +#define callout_drain(c) _callout_stop_safe(c, 1, NULL)
>>>>  void callout_init(struct callout *, int);
>>>>  void _callout_init_lock(struct callout *, struct lock_object *, int);
>>>>  #define callout_init_mtx(c, mtx, flags) \
>>>> @@ -119,10 +119,11 @@ int callout_schedule(struct callout *, i
>>>>  int callout_schedule_on(struct callout *, int, int);
>>>>  #define callout_schedule_curcpu(c, on_tick) \
>>>>  	callout_schedule_on((c), (on_tick), PCPU_GET(cpuid))
>>>> -#define callout_stop(c) _callout_stop_safe(c, 0)
>>>> -int _callout_stop_safe(struct callout *, int);
>>>> +#define callout_stop(c) _callout_stop_safe(c, 0, NULL)
>>>> +int _callout_stop_safe(struct callout *, int, void (*)(void *));
>>>>  void callout_process(sbintime_t now);
>>>> -
>>>> +#define callout_async_drain(c, d) \
>>>> +	_callout_stop_safe(c, 0, d)
>>>>  #endif
>>>>
>>>>  #endif /* _SYS_CALLOUT_H_ */
>>
>> --------
>> Randall Stewart
>> r...@netflix.com
>> 803-317-4952
>>
>>
>>

--------
Randall Stewart
r...@netflix.com
803-317-4952