Generally speaking, we schedule Continuations for 2 main use cases. 1) Make async sideways calls 2) Some offloading of heavy CPU work (such as encrypting/decrypting a token etc) For (1) we don't use separate thread pools and just schedule them on the Net threads. As these involve i/o, it's more predictable (and IMO recommended) to stick to Net threads. For (2) it's totally dependent on your use case. We pretty much run with the default 2 task threads and that seems to work reasonably for us. But, given that this is clearly use case specific, I wonder if it may be more appropriate to be put in Best practices/FAQ sections of the docs than the API docs itself. Thanks, Sudheer
On Friday, March 27, 2020, 10:45:46 AM PDT, Alan Carroll <solidwallofc...@verizonmedia.com.invalid> wrote: Yes, that's a problem. One of the many things on my list. On Fri, Mar 27, 2020 at 12:42 PM David Calavera <david.calav...@gmail.com> wrote: > Something I find challenging about TSContScheduleOnPool and others is that > there is no much documentation about good practices to configure thread > pools. For example the only documentation I found about task threads only > says that you must have at least 1 task thread: > > > https://docs.trafficserver.apache.org/en/8.0.x/admin-guide/files/records.config.en.html?highlight=records%20config#proxy-config-task-threads > > This option obviously depends on how much work you're doing in task > threads, but it'd be very good to have some suggestions from people's > experience changing that setting in real production environments. > > On Fri, Mar 27, 2020 at 9:03 AM Alan Carroll > <solidwallofc...@verizonmedia.com.invalid> wrote: > > > I put most of my discussion on the PR 6577, but discussing this with > > Sudheer and Fei, I think the consensus is > > > > 1) Accept the PR for ATS 9. > > 2) For ATS 10 (or maybe 9.1?), deprecate TSContSchedule. Make it explicit > > to schedule on a pool or thread. > > > > On Thu, Mar 26, 2020 at 2:51 PM Walt Karas <wka...@verizonmedia.com > > .invalid> > > wrote: > > > > > I was having problems with this too in a new plugin I was writing. I > had > > > to make this small change to get it to work: > > > > > > > > > https://github.com/apache/trafficserver/pull/6489/files#diff-8c09c39a3b13e183ba1bfd1e590cfb21 > > > > > > I found that I had to use TSContScheduleOnPool() instead of > > > TSContSchedule(). > > > > > > On Thu, Mar 26, 2020 at 12:55 PM Sudheer Vinukonda > > > <sudheervinuko...@yahoo.com.invalid> wrote: > > > > > > > While investigating a core dump that we ran into during our ATS9 > > > > migration, we discovered a couple of problems with the thread > affinity > > > > changes and the TSContSchedule API. > > > > 1) It looks like the API TSContSchedule() and TSContScheduleEvery() > now > > > > don't work unless TSContSetThreadAffinity() was called prior to > calling > > > > them. This sounds unreasonable, so, I'm proposing to change the API > to > > > set > > > > the Continuation's thread affinity to the calling thread of the API, > if > > > not > > > > already set. > > > > 2) We have a use case where one of our plugins calls TSContSchedule* > > API > > > > from a non-ATS thread (created using pthread_create). This borks in a > > > weird > > > > fashion because the FORCE_PLUGIN_SCOPED_MUTEX ends up not acquiring > the > > > > mutex (because the calling thread isn't an EThread, so, > this_ethread() > > is > > > > NULL and the holding thread for the cont's mutex is also NULL, which > > > means > > > > in Mutex_Lock(), the if (m->thread_holding != t) { evaluates to false > > > > bypassing the mutex acquire call. However, the subsequent mutex > release > > > > does get called via Mutex_unlock() and this makes bad things to > happen > > > (as > > > > the __nusers of the mutex object goes below 0 (it's an unsigned > int)). > > > > > > > > > > > > > > https://github.com/apache/trafficserver/blob/4a503c817766fcad3b9eab3ff3a8e9e73d483ae7/iocore/eventsystem/I_Lock.h#L320 > > > > > > > > So, we could either > > > > - prevent calling TSContSchedule*() API from non-ATS threads by > > asserting > > > > if this_ethread() is NULL before trying to acquire the mutex. ie. > > > > basically, fail hard and fast and in more obvious way (than an > obscure > > > > random side-effect that happens much later because a release() was > > called > > > > on a lock without calling an acquire()) > > > > OR > > > > - With the thread affinity changes, we could check if the > continuation > > > has > > > > a thread affinity set to an EThread and still allow calling > > > > TSContSchedule*() API from non-ATS Threads, as long as the thread > > > affinity > > > > is set to an ATS thread. This requires to acquire the mutex on the > > > affinity > > > > thread than the calling thread. > > > > > > > > After discussing with Bryan, we prefer option (1) which is consistent > > > with > > > > the expectation that Continuation Schedule API should only be called > > from > > > > ATS threads. > > > > > > > > Note: Both the proposed changes here should not be breaking any > > > > compatibility or expectations, but rather retain compatibility with > (1) > > > > above and make the API a bit more predictable/robust with (2) > > > > > > > > Let me know if there are any comments/concerns or questions. > > > > > > > > Thanks, > > > > Sudheer > > > > > > > > > > > > > >