While investigating a core dump that we ran into during our ATS9 migration, we discovered a couple of problems with the thread affinity changes and the TSContSchedule API. 1) It looks like the API TSContSchedule() and TSContScheduleEvery() now don't work unless TSContSetThreadAffinity() was called prior to calling them. This sounds unreasonable, so, I'm proposing to change the API to set the Continuation's thread affinity to the calling thread of the API, if not already set. 2) We have a use case where one of our plugins calls TSContSchedule* API from a non-ATS thread (created using pthread_create). This borks in a weird fashion because the FORCE_PLUGIN_SCOPED_MUTEX ends up not acquiring the mutex (because the calling thread isn't an EThread, so, this_ethread() is NULL and the holding thread for the cont's mutex is also NULL, which means in Mutex_Lock(), the if (m->thread_holding != t) { evaluates to false bypassing the mutex acquire call. However, the subsequent mutex release does get called via Mutex_unlock() and this makes bad things to happen (as the __nusers of the mutex object goes below 0 (it's an unsigned int)). https://github.com/apache/trafficserver/blob/4a503c817766fcad3b9eab3ff3a8e9e73d483ae7/iocore/eventsystem/I_Lock.h#L320
So, we could either - prevent calling TSContSchedule*() API from non-ATS threads by asserting if this_ethread() is NULL before trying to acquire the mutex. ie. basically, fail hard and fast and in more obvious way (than an obscure random side-effect that happens much later because a release() was called on a lock without calling an acquire()) OR - With the thread affinity changes, we could check if the continuation has a thread affinity set to an EThread and still allow calling TSContSchedule*() API from non-ATS Threads, as long as the thread affinity is set to an ATS thread. This requires to acquire the mutex on the affinity thread than the calling thread. After discussing with Bryan, we prefer option (1) which is consistent with the expectation that Continuation Schedule API should only be called from ATS threads. Note: Both the proposed changes here should not be breaking any compatibility or expectations, but rather retain compatibility with (1) above and make the API a bit more predictable/robust with (2) Let me know if there are any comments/concerns or questions. Thanks, Sudheer