> On Oct 10, 2018, at 11:31 AM, Walt Karas <wka...@oath.com.INVALID> wrote:
> 
> It seems like ATS does not have a coherent strategy for threading and mutexes.
> 
> One pure strategy is event driven, with one thread per core (each processing
> an event queue).  Shared data structures are owned by a particular core
> (which also helps with NUMA), and events that want to access a data
> structure are queued on the thread for that core.  Few if any mutexes
> are needed.


This should be the preferred strategy whenever possible, IMO. This is why we 
had that discussion, where I mentioned that the intent, at least, is for the 
HostDB/DNS lookups to reschedule back to the originating ET_NET thread.
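
Concretely, the pattern is for the continuation to remember this_ethread()
when it is created and, if it gets called back on some other thread (ET_DNS,
ET_TASK, etc.), to reschedule itself onto that remembered thread instead of
grabbing a lock. A rough sketch against the core event system (the class name
is made up, and the Continuation / schedule_imm() signatures vary a bit
between versions):

// Sketch only, not the actual HostDB/DNS code.
class OriginAffineCont : public Continuation
{
public:
  OriginAffineCont(ProxyMutex *m) : Continuation(m), origin_thread(this_ethread())
  {
    SET_HANDLER(&OriginAffineCont::mainEvent);
  }

  int
  mainEvent(int /* event */, void * /* data */)
  {
    if (this_ethread() != origin_thread) {
      // Called on some other thread; hop back to the ET_NET thread that
      // owns our data rather than contending on a mutex.
      origin_thread->schedule_imm(this);
      return EVENT_DONE;
    }
    // Now running on the originating thread; its data can be touched freely.
    return EVENT_DONE;
  }

private:
  EThread *origin_thread;
};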

> 
> The other strategy (non-event-driven) is to figure that threads are
> low-overhead compared to processes, so don't worry about creating gobs
> of them, or about the mutex overhead.  Most programmers find this approach
> more straightforward, and it jibes better with the APIs of Unix
> and most other OSes.

That’s just not true. Look at Varnish: it takes this approach, and it suffers, 
with tens of thousands of threads and a cache that’s essentially unusable for 
serious use. Yes, threads are low overhead, but far from zero. Assuming that 
the OS can handle it was a good idea in (research) papers, but it falls apart 
in reality.

> 
> ATS has static threads, but many more than one per core.  The
> purposes of the large number and various types of threads are
> unclear.  Our discussion of mutexes quickly turned into blah blah blah
> mumble mumble.


This is “legacy”. If we can figure out where the bottlenecks are, striving 
toward one ET_NET thread per core should be a goal.
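
For experimenting in that direction, the ET_NET count is already tunable; a
rough records.config sketch (key names as I recall them from recent releases,
so double-check against your version's docs):

# Size ET_NET from the core count, one thread per core.
CONFIG proxy.config.exec_thread.autoconfig INT 1
CONFIG proxy.config.exec_thread.autoconfig.scale FLOAT 1.0
# Or pin an explicit count instead of autoconfig:
# CONFIG proxy.config.exec_thread.autoconfig INT 0
# CONFIG proxy.config.exec_thread.limit INT 16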

Cheers,

— Leif

> On Wed, Oct 10, 2018 at 11:38 AM Pushkar Pradhan
> <pprad...@oath.com.invalid> wrote:
>> 
>> I think Alan is referring to the code below.
>> It's a try-lock, so if it doesn't succeed, the event is just rescheduled for later.
>> 
>> void
>> EThread::process_event(Event *e, int calling_code)
>> {
>>  ink_assert((!e->in_the_prot_queue && !e->in_the_priority_queue));
>>  MUTEX_TRY_LOCK_FOR(lock, e->mutex, this, e->continuation);
>>  if (!lock.is_locked()) {
>>    e->timeout_at = cur_time + DELAY_FOR_RETRY;
>>    EventQueueExternal.enqueue_local(e);
>>  } else {
>>    if (e->cancelled) {
>>      free_event(e);
>>      return;
>>    }
>>    Continuation *c_temp = e->continuation;
>>    // Make sure that the continuation is locked before calling the handler
>>    //set_cont_flags(e->continuation->control_flags);
>>    e->continuation->handleEvent(calling_code, e);
>> 
>> On Tue, Oct 9, 2018 at 2:04 PM Walt Karas <wka...@oath.com.invalid> wrote:
>> 
>>> To what "explicit continuation locking" do you refer?
>>> 
>>> How does this address the issue that using TSMutexLock() or MutexLock
>>> in a currently running continuation function (unnecessarily) blocks
>>> all other events waiting in a thread's event queue?  By contrast, the
>>> inability to lock a continuation mutex causes the continuation to be
>>> requeued at the end of the thread's event queue, thus allowing the
>>> succeeding events in that queue to be handled.
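
For comparison, the non-blocking pattern being described looks roughly like
this on the plugin side, using TSMutexLockTry() and the TSContSchedule()
signature used elsewhere in this thread (shared_mutex, critical_section(),
after(), and the 10 ms retry delay are illustrative placeholders):

// Sketch: instead of blocking in TSMutexLock(), try the lock and, on
// failure, reschedule this continuation so other queued events can run.
static TSMutex shared_mutex; // created elsewhere with TSMutexCreate()

static int
contf_try_critical_section(TSCont cont, TSEvent event, void *edata)
{
  if (TSMutexLockTry(shared_mutex) != TS_SUCCESS) {
    // Couldn't get the lock; retry later rather than stalling the event
    // thread (10 ms is an arbitrary delay chosen for illustration).
    TSContSchedule(cont, 10, TS_THREAD_POOL_DEFAULT);
    return 0;
  }
  critical_section();
  TSMutexUnlock(shared_mutex);
  after();
  return 0;
}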
>>> On Tue, Oct 9, 2018 at 3:44 PM Alan Carroll
>>> <solidwallofc...@oath.com.invalid> wrote:
>>>> 
>>>> It's a bit more complex than that. One key thing is that if you schedule
>>>> an event for a continuation, the continuation's mutex will be locked when
>>>> the event handler is called. Therefore it's rarely the case that a plugin
>>>> needs to lock its continuations explicitly. For that reason, simply
>>>> scheduling handles lock contention without blocking the thread.
>>>> 
>>>> In the core, there is a class, MutexLock, which does RAII-style locking.
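
As a point of reference, a rough sketch of that RAII usage in core code
(macro name from I_Lock.h; the non-blocking counterpart is the
MUTEX_TRY_LOCK_FOR shown in process_event() earlier in this thread):

// Sketch only: a blocking, scope-bound lock on a continuation's mutex.
void
do_work(Continuation *c)
{
  SCOPED_MUTEX_LOCK(lock, c->mutex, this_ethread());
  // ... critical section on data guarded by c->mutex; the lock is
  // released when 'lock' goes out of scope ...
}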
>>>> 
>>>> On Tue, Oct 9, 2018 at 2:26 PM Walt Karas <wka...@oath.com.invalid> wrote:
>>>> 
>>>>> In TS, is it important to favor use of continuation mutexes to avoid
>>>>> blocking threads?  For example, should code like this:
>>>>> 
>>>>> before();
>>>>> TSMutexLock(mutex);
>>>>> critical_section();
>>>>> TSMutexUnlock(mutex);
>>>>> after();
>>>>> 
>>>>> be replaced with code like:
>>>>> 
>>>>> int contf_after(TSCont, TSEvent, void *)
>>>>> {
>>>>>  after();
>>>>> 
>>>>>  return 0;
>>>>> }
>>>>> 
>>>>> int contf_critical_section(TSCont, TSEvent, void *)
>>>>> {
>>>>>  critical_section();
>>>>> 
>>>>>  static TSCont cont_after = TSContCreate(contf_after, nullptr);
>>>>> 
>>>>>  TSContSchedule(cont_after, 0, TS_THREAD_POOL_DEFAULT);
>>>>> 
>>>>>  return 0;
>>>>> }
>>>>> 
>>>>> // ...
>>>>> 
>>>>> before();
>>>>> 
>>>>> static TSCont cont_critical_section =
>>>>> TSContCreate(contf_critical_section, mutex);
>>>>> 
>>>>> TSContSchedule(cont_critical_section, 0, TS_THREAD_POOL_DEFAULT);
>>>>> 
>>>>> // ...
>>>>> 
>>>>> (This is plugin code, but I assume the same principle would apply to
>>>>> core code.)
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> *Beware the fisherman who's casting out his line in to a dried up riverbed.*
>>>> *Oh don't try to tell him 'cause he won't believe. Throw some bread to the
>>>> ducks instead.*
>>>> *It's easier that way. *- Genesis : Duke : VI 25-28
>>> 
>> 
>> 
>> --
>> pushkar
