On Thu, May 06, 2021 at 11:43:55AM -0500, Scott Cheloha wrote:
> On Wed, May 05, 2021 at 11:05:06AM +1000, David Gwynne wrote:
> > On Tue, May 04, 2021 at 11:54:31AM -0500, Scott Cheloha wrote:
> > >
> > > [...]
> > >
> > > Here is where I get confused.
> > >
> > > Why do I want to wait for *all* timeouts in the queue to finish
> > > running?
> >
> > You don't, you just want to know that whatever timeout was running
> > has finished cos you don't know if that currently running timeout is
> > yours or not.
> >
> > > My understanding of the timeout_del_barrier(9) use case was as
> > > follows:
> > >
> > > 1. I have a dynamically allocated timeout struct. The timeout is
> > > scheduled to execute at some point in the future.
> > >
> > > 2. I want to free the memory allocated to for the timeout.
> > >
> > > 3. To safely free the memory I need to ensure the timeout
> > > is not executing. Until the timeout function, i.e.
> > > to->to_func(), has returned it isn't necessarily safe to
> > > free that memory.
> > >
> > > 4. If I call timeout_del_barrier(9), it will only return if the
> > > timeout in question is not running. Assuming the timeout itself
> > > is not a periodic timeout that reschedules itself we can then
> > > safely free the memory.
> >
> > Barriers aren't about references to timeouts, they're about references
> > to the work associated with a timeout.
> >
> > If you only cared about the timeout struct itself, then you can get
> > all the information about whether it's referenced by the timeout
> > queues from the return values from timeout_add and timeout_del.
> > timeout_add() returns 1 if the timeout subsystem takes the reference
> > to the timeout. If the timeout is already scheduled it returns 0
> > because it's already got a reference. timeout_del returns 1 if the
> > timeout itself was scheduled and it removed it, therefore giving
> > up it's reference. If you timeout_del and it returns 0, then it
> > wasn't on a queue inside timeouts. Easy.
> >
> > What timeout_add and timeout_del don't tell you is whether the work
> > referenced by the timeout is currently running. The timeout runners
> > copy the function and argument onto the stack when they dequeue a
> > timeout to run, and give up the reference to the timeout when the
> > mutex around the timeout wheels/cirqs is released. However, the argument
> > is still referenced on the stack and the function to process it may be
> > about to run or is running. You can't tell that from the timeout_add/del
> > return values though.
> >
> > We provide two ways to deal with that. One is you have reference
> > counters (or similar) on the thing you're deferring to a timeout.
> > The other is you use a barrier so you know the work you deferred
> > isn't on the timeout runners stack anymore because it's moved on
> > to run the barrier work.
> >
> > This is consistent with tasks and interrupts.
> >
> > > Given this understanding, my approach was to focus on the timeout in
> > > question. So my code just spins until it the timeout is no longer
> > > running.
> > >
> > > But now I think I am mistaken.
> > >
> > > IIUC you're telling me (and showing me, in code) that the goal of
> > > timeout_barrier() is to wait for the *whole* softclock() to return,
> > > not just the one timeout.
> >
> > No, just the currently running timeout.
> >
> > Putting the barrier timeout on the end of the proc and todo cirqs
> > is because there's no CIRCQ_INSERT_HEAD I can use right now.
> >
> > > Why do I want to wait for the whole softclock() or softclock_thread()?
> > > Why not just wait for the one timeout to finish?
> >
> > Cos I was too lazy to write CIRCQ_INSERT_HEAD last night :D
>
> Okay. After thinking this over I'm pretty sure we are skinning the
> same cat in two different ways.
>
> Your cond_wait(9) approach is fine for both timeout types because
> timeout_del_barrier(9) is only safe to call from process context.
> When I wrote my first diff I was under the impression it was safe to
> call timeout_del_barrier(9) from interrupt context. But I reread the
> manpage and now I see that this is not the case, my bad.
>
> > [...]
> >
> > This discussion has some relevance to taskqs too. I was also lazy when I
> > implemented taskq_barrier and used task_add to put the barrier onto the
> > list of work, which meant it got added to the end. Now DRM relies on
> > this behaviour. Maybe I should have pushed to commit the diff below.
> >
> > This diff lets taskq barriers run as soon as possible, but adds compat
> > to the drm workqueue stuff so they can wait for all pending work to
> > finish as they expect.
> >
> > P.S. don't call timeout_del from inside a timeout handler. I still think
> > timeout_set_proc was a mistake.
> >
> > [...]
>
> Let's pick this whole train of thought (taskq, etc.) up in a
> different thread. One thing at a time.
>
> --
>
> Updated patch:
>
> - Keep timeout_barrier(9).
>
> - In timeout_barrier(9) use the barrier timeout for both normal and
> process-context timeouts. All callers use cond_wait(9) now.
>
> - Remove the kernel lock from the non-process path. I assume I don't
> need to splx(splschedclock()) anymore?
>
> - Rename "timeout_proc_barrier()" to "timeout_barrier_timeout()"
> because now we use it for both timeout types.
>
> - Update timeout.9: timeout_barrier(9) no longer takes the kernel
> lock.
>
> Am I on the right track? Aside from dropping the kernel lock there is
> no behavior change here.
>
> We can talk about inserting the barrier timeouts at the head of the
> list to reduce latency in a different thread after we're done removing
> the kernel lock here. Don't want to combine a behavior change with a
> locking change in the same revision.
yep.
ok by me, but be warned that i'm very distracted at the moment.
> Index: sys/kern/kern_timeout.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/kern_timeout.c,v
> retrieving revision 1.83
> diff -u -p -r1.83 kern_timeout.c
> --- sys/kern/kern_timeout.c 8 Feb 2021 08:18:45 -0000 1.83
> +++ sys/kern/kern_timeout.c 6 May 2021 16:39:56 -0000
> @@ -172,10 +172,10 @@ void softclock_create_thread(void *);
> void softclock_process_kclock_timeout(struct timeout *, int);
> void softclock_process_tick_timeout(struct timeout *, int);
> void softclock_thread(void *);
> +void timeout_barrier_timeout(void *);
> uint32_t timeout_bucket(const struct timeout *);
> uint32_t timeout_maskwheel(uint32_t, const struct timespec *);
> void timeout_run(struct timeout *);
> -void timeout_proc_barrier(void *);
>
> /*
> * The first thing in a struct timeout is its struct circq, so we
> @@ -490,34 +490,38 @@ timeout_del_barrier(struct timeout *to)
> void
> timeout_barrier(struct timeout *to)
> {
> - int needsproc = ISSET(to->to_flags, TIMEOUT_PROC);
> + struct timeout barrier;
> + struct cond c;
> + int procflag;
> +
> + procflag = (to->to_flags & TIMEOUT_PROC);
> + timeout_sync_order(procflag);
> +
> + timeout_set_flags(&barrier, timeout_barrier_timeout, &c, procflag);
> + barrier.to_process = curproc->p_p;
> + cond_init(&c);
>
> - timeout_sync_order(needsproc);
> -
> - if (!needsproc) {
> - KERNEL_LOCK();
> - splx(splsoftclock());
> - KERNEL_UNLOCK();
> - } else {
> - struct cond c = COND_INITIALIZER();
> - struct timeout barrier;
> -
> - timeout_set_proc(&barrier, timeout_proc_barrier, &c);
> - barrier.to_process = curproc->p_p;
> + mtx_enter(&timeout_mutex);
>
> - mtx_enter(&timeout_mutex);
> - SET(barrier.to_flags, TIMEOUT_ONQUEUE);
> + barrier.to_time = ticks;
> + SET(barrier.to_flags, TIMEOUT_ONQUEUE);
> + if (procflag)
> CIRCQ_INSERT_TAIL(&timeout_proc, &barrier.to_list);
> - mtx_leave(&timeout_mutex);
> + else
> + CIRCQ_INSERT_TAIL(&timeout_todo, &barrier.to_list);
> +
> + mtx_leave(&timeout_mutex);
>
> + if (procflag)
> wakeup_one(&timeout_proc);
> + else
> + softintr_schedule(softclock_si);
>
> - cond_wait(&c, "tmobar");
> - }
> + cond_wait(&c, "tmobar");
> }
>
> void
> -timeout_proc_barrier(void *arg)
> +timeout_barrier_timeout(void *arg)
> {
> struct cond *c = arg;
>
> Index: share/man/man9/timeout.9
> ===================================================================
> RCS file: /cvs/src/share/man/man9/timeout.9,v
> retrieving revision 1.52
> diff -u -p -r1.52 timeout.9
> --- share/man/man9/timeout.9 26 Apr 2021 20:32:30 -0000 1.52
> +++ share/man/man9/timeout.9 6 May 2021 16:39:56 -0000
> @@ -194,11 +194,6 @@ but it will wait until any current execu
> ensures that any current execution of the timeout in the argument
> .Fa to
> has completed before returning.
> -If the timeout
> -.Fa to
> -has been initialised with
> -.Fn timeout_set
> -it will take the kernel lock.
> .Pp
> The
> .Fn timeout_pending