From longer-term TSers I've heard comments about seeing profiling results that show that waiting on mutexes is a significant performance issue in TS, but I'm not aware of any write-ups of such results. Unfortunately, I'm relatively new to TS and Linux, so I'm not currently familiar with the best approaches to profiling TS.
For better performance, I think the main thing is having a single to-run Continuation queue, or one per core, with each queue feeding multiple event threads. That arrangement is more resilient to Continuations that block, and there doesn't seem to be much enthusiasm for getting hard-core about preventing blocking Continuations (see https://github.com/apache/trafficserver/pull/5412 ). I'm not sure that changing to queue-based mutexes would have a significant performance impact, but it seems like a cleaner design, since it ensures that Continuations in the to-run list(s) are actually ready to run (rough sketches follow at the end of this message, after the quoted text). A different mutex implementation is not strictly necessary, though, in order to consolidate the to-run Continuation queues.

On Mon, Sep 30, 2019 at 2:39 PM Kees Spoelstra <kspoels...@we-amp.com> wrote:

> Sounds very interesting.
> But what is the problem we're trying to solve here? I like the thread
> affinity because it gives us headache-free concurrency in some cases, and
> I'll bet that there is some code which doesn't have the proper
> continuation mutexes because we know it runs on the same thread.
>
> Are we seeing a lot of imbalanced threads (too much processing causing
> long queues of continuations, which I can imagine in some cases)? And
> shouldn't we balance based on transactions or connections, move those
> around when we see imbalance, and aim for embarrassingly parallel
> processing :) Come to think of it, this might introduce another set of
> problems: how to know which continuations are part of the life cycle of
> a connection :/
>
> Jumping threads in one transaction is not always ideal either; this can
> really hurt performance. But your proposed model seems to handle that
> somewhat better than the current implementation.
>
> Very interested, and wondering what this would mean for plugin developers.
>
> On Mon, 30 Sep 2019, 19:20 Walt Karas, <wka...@verizonmedia.com.invalid>
> wrote:
>
> > If a Continuation is scheduled, but its mutex is locked, it's put in a
> > queue specific to that mutex. The release function for the mutex
> > (called when a Continuation holding the mutex exits) would move the
> > Continuation at the front of the mutex's queue (if not empty) into the
> > ready-to-run queue (transferring the lock to that Continuation). A
> > drawback is that the queue would itself need a mutex (spinlock?), but
> > the critical section would be very short.
> >
> > There would be a function to lock a mutex directly. It would create a
> > Continuation that had two condition variables. It would assign the
> > mutex to this Continuation and schedule it. (In this case, it might
> > make sense to put this Continuation at the front of the mutex's queue,
> > since it would be blocking an entire event thread.) The direct-lock
> > function would then block on the first condition variable. When the
> > Continuation ran, it would trigger the first condition variable, and
> > then block on the second condition variable. The direct-lock function
> > would then exit, allowing the calling code to enter its critical
> > section. At the end of the critical section, another function to
> > release the direct lock would be called. It would trigger the second
> > condition variable, which would cause the function of the Continuation
> > created for the direct lock to exit (thus releasing the mutex).
> >
> > With this approach, I'm not sure thread affinities would be of any
> > value. I think perhaps each core should have its own list of
> > ready-to-run Continuations, and a pool of event threads with affinity
> > to that core. Not having per-event-thread ready-to-run lists means
> > that a Continuation function that blocks is less likely to block other
> > ready-to-run Continuations. If Continuations had core affinities to
> > some degree, this might reduce evictions in the per-core memory cache.
> > (Multiple Continuations having the same function should have the same
> > core affinity.)
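
For concreteness, here is a rough sketch of the queue-based mutex idea quoted above. This is not actual ATS code: the Continuation stand-in, the QueueMutex class, and the names try_acquire_or_park, enqueue_ready_to_run, and release are all hypothetical, and the short-critical-section spinlock is approximated with std::mutex.

// Hypothetical sketch of a queue-based Continuation mutex (not ATS code):
// a Continuation scheduled while the mutex is held gets parked on the
// mutex, and release() hands the lock straight to the front waiter and
// moves it to the ready-to-run list.
#include <deque>
#include <iostream>
#include <mutex>
#include <string>

struct Continuation {               // stand-in for the ATS Continuation
  std::string name;
};

// Stand-in for the global (or per-core) ready-to-run list.
std::deque<Continuation *> ready_to_run;
void enqueue_ready_to_run(Continuation *c) { ready_to_run.push_back(c); }

class QueueMutex {
public:
  // Called when a Continuation is scheduled.  Returns true if the lock was
  // free (the Continuation can go straight to the ready-to-run list);
  // otherwise the Continuation is parked on this mutex's wait queue.
  bool try_acquire_or_park(Continuation *c) {
    std::lock_guard<std::mutex> g(q_lock_);  // very short critical section
    if (owner_ == nullptr) {
      owner_ = c;
      return true;
    }
    waiters_.push_back(c);
    return false;
  }

  // Called when the owning Continuation's handler returns.  If anyone is
  // waiting, ownership is transferred to the front waiter, which is then
  // guaranteed to be ready to run.
  void release() {
    Continuation *next = nullptr;
    {
      std::lock_guard<std::mutex> g(q_lock_);
      if (waiters_.empty()) {
        owner_ = nullptr;
        return;
      }
      next = waiters_.front();
      waiters_.pop_front();
      owner_ = next;                 // lock is never observably free here
    }
    enqueue_ready_to_run(next);
  }

private:
  std::mutex q_lock_;                // guards owner_ and waiters_ only
  Continuation *owner_ = nullptr;
  std::deque<Continuation *> waiters_;
};

int main() {
  Continuation a{"a"}, b{"b"};
  QueueMutex m;

  if (m.try_acquire_or_park(&a))     // lock was free, so "a" is ready to run
    enqueue_ready_to_run(&a);
  m.try_acquire_or_park(&b);         // lock held by "a", so "b" is parked

  // An event thread would now pop and run "a"; when its handler returns:
  ready_to_run.pop_front();
  m.release();                       // lock handed to "b"; "b" becomes ready

  std::cout << ready_to_run.front()->name << '\n';  // prints: b
}

The point of the sketch is that release() never leaves the mutex observably unlocked while waiters exist; ownership is handed directly to the next Continuation, so anything on the ready-to-run list really is ready to run.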
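And a similarly hypothetical sketch of the direct-lock idea, using two condition variables as described in the quoted message. The std::thread here stands in for the event thread that would run the lock-holding Continuation; none of these names exist in ATS.

// Hypothetical sketch of the "direct lock" idea (not ATS code): a blocking
// lock for non-event threads built on a scheduled Continuation plus two
// condition variables.
#include <condition_variable>
#include <functional>
#include <iostream>
#include <mutex>
#include <thread>

struct DirectLockState {
  std::mutex m;
  std::condition_variable cv_locked;   // signaled when the Continuation runs (lock held)
  std::condition_variable cv_release;  // signaled when the caller ends its critical section
  bool locked   = false;
  bool released = false;
};

// Handler the scheduler would run once the mutex has been acquired for the
// Continuation (being run implies the lock is held).
void direct_lock_continuation(DirectLockState &s) {
  std::unique_lock<std::mutex> g(s.m);
  s.locked = true;
  s.cv_locked.notify_one();                          // wake the blocked direct_lock() caller
  s.cv_release.wait(g, [&] { return s.released; });  // hold the event thread until release
  // Returning here is what would release the Continuation mutex in the real design.
}

// Blocks the calling (non-event) thread until the Continuation above has run.
void direct_lock(DirectLockState &s) {
  std::unique_lock<std::mutex> g(s.m);
  s.cv_locked.wait(g, [&] { return s.locked; });
}

// Ends the critical section: lets the Continuation's handler return.
void direct_unlock(DirectLockState &s) {
  {
    std::lock_guard<std::mutex> g(s.m);
    s.released = true;
  }
  s.cv_release.notify_one();
}

int main() {
  DirectLockState s;
  // Stand-in for scheduling the Continuation on an event thread.
  std::thread event_thread(direct_lock_continuation, std::ref(s));

  direct_lock(s);
  std::cout << "critical section: mutex effectively held\n";
  direct_unlock(s);

  event_thread.join();
}

Here direct_lock() returning corresponds to the caller holding the Continuation mutex, and direct_unlock() lets the Continuation's handler return, which is what would actually release the mutex in the real scheduler. It also shows the cost noted above: the event thread is tied up for the whole critical section, which is why the quoted message suggests putting such a Continuation at the front of the mutex's queue.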