I agree with Leif as well - the "cost" of constantly reshuffling VCs across threads may outweigh the extra latency of using a per-thread pool. If we do not want cross-thread communication, a per-thread pool seems like the cleaner solution. Would it make sense to spend some time investigating why the per-thread pool makes things worse (in terms of latency)? Other users of the per-thread pool (Brian Geffon) have claimed significant gains with it, so it is probably important/useful to understand why the behavior is not similar here.

Also, without changing the design too much, I wonder if we could enhance the global pool slightly by adding the thread_id to the key and first trying to "pick" a session on the same net thread as the client VC. If there is no match on the same thread, we simply fall back to the current design of picking from the rest of the threads. My initial thought is that there might be more "misses" for the client VC's thread at first, but after a few hours (minutes?) of reaching a steady state, there is a good chance of reducing the cross-thread "picks"; a rough sketch of that lookup follows below. Unless I'm missing something and there are serious flaws in this hybrid approach, I can try a POC patch for this and come back with results.

Thanks,
Sudheer
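A minimal sketch of the hybrid lookup described above, assuming a simplified pool keyed by server address. The names (HybridSessionPool, ServerSession, acquire/release) and the single-integer address are illustrative only, not the actual ATS ServerSessionPool API, and the global pool's locking is omitted:

    #include <cstdint>
    #include <unordered_map>

    // Illustrative stand-in for a pooled origin-server session (socket, SSL
    // object, buffers, etc. omitted).
    struct ServerSession {
      uint64_t server_ip;    // origin server address, simplified to one integer
      int      owner_thread; // ET_NET thread this session currently lives on
    };

    class HybridSessionPool {
    public:
      // Two-stage lookup: prefer a session already on the caller's net thread,
      // otherwise fall back to any pooled session for that server.
      ServerSession *acquire(uint64_t server_ip, int current_thread) {
        auto range = pool_.equal_range(server_ip);

        // 1. Same-thread match: no cross-thread hand-off needed.
        for (auto it = range.first; it != range.second; ++it) {
          if (it->second->owner_thread == current_thread) {
            ServerSession *s = it->second;
            pool_.erase(it);
            return s;
          }
        }

        // 2. Fall back to today's behavior: take any session for this server,
        //    accepting a cross-thread "pick".
        if (range.first != range.second) {
          ServerSession *s = range.first->second;
          pool_.erase(range.first);
          return s;
        }
        return nullptr; // nothing pooled; caller opens a new origin connection
      }

      void release(ServerSession *s) { pool_.emplace(s->server_ip, s); }

    private:
      // Real code would need the global pool's lock around acquire/release.
      std::unordered_multimap<uint64_t, ServerSession *> pool_;
    };

The only property that matters here is the two-stage lookup: same-thread sessions are preferred, and a cross-thread pick only happens when the local lookup misses.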
On Friday, July 24, 2015 6:27 AM, Susan Hinrichs <shinr...@network-geographics.com> wrote:

On 7/24/2015 2:11 AM, Leif Hedstrom wrote:
>> On Jul 24, 2015, at 3:16 AM, Susan Hinrichs
>> <shinr...@network-geographics.com> wrote:
>>
>> Hello,
>>
>> Another latent cross-thread race condition has become very active in our
>> environment (TS-3797). Given that we just spent time within the last month
>> squashing another cross-thread race condition (TS-3486) that was active in
>> several environments, Alan and I would like to step back and try to reduce
>> the cross-thread impact of the global session pools.
>>
>> I wrote up our thoughts and plan for implementation. Given that threading
>> and race conditions are always tricky, I'd appreciate more eyes looking for
>> flaws in our approach or suggestions for alternatives.
>>
>> https://cwiki.apache.org/confluence/display/TS/Threading+Issues+And+NetVC+Migration
>
> My gut reaction to this is that it makes our efforts for NUMA / thread
> affinity very, very difficult to achieve. The goal is to avoid memory
> migrating across NUMA sockets, to avoid QPI traffic. This would encourage the
> opposite, unless I misread it? It also obviously violates the original design
> goals, where VCs do *not* migrate.
>
> It’d be very interesting to hear from John Plevyak and what their initial
> design had considered for these issues?
>
> Cheers,
>
> — Leif

Leif,

Thanks for the pointers to the historical precedents. I'll look over them this morning, and I will think more about the NUMA issues as well. I had been focusing on reducing thread cross-talk and the associated possibility for errors.

I thought this solution was fine since a new net VC is created in the target thread and the socket and SSL object are copied over. But upon reflection, I realize that there are buffers associated with the socket and the SSL object.

Susan
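For context on Susan's last paragraph, here is a rough sketch of the hand-off she describes, using placeholder types (NetVC, IOBuffer) rather than the real ATS classes. The fd and SSL pointer are easy to move; the buffered state they drag along is what complicates cross-thread migration:

    #include <openssl/ssl.h>

    // Placeholder for ATS's buffered-IO type (stands in for MIOBuffer).
    struct IOBuffer;

    // Simplified view of a net VC for the purpose of this sketch.
    struct NetVC {
      int       fd       = -1;      // kernel socket; can be handed to another
                                    // thread once the old thread stops polling it
      SSL      *ssl      = nullptr; // OpenSSL object, which keeps its own
                                    // internal read/write buffers
      IOBuffer *read_buf = nullptr; // ATS-side data read but not yet consumed
    };

    // Hand the connection from a VC on one thread to a fresh VC on the target
    // thread. The buffered state is the part that must not be forgotten.
    void migrate_vc(NetVC &from, NetVC &to) {
      to.fd       = from.fd;       // transfer socket ownership
      to.ssl      = from.ssl;      // transfer SSL object, including any bytes
                                   // already sitting in its internal buffers
      to.read_buf = from.read_buf; // buffered-but-unconsumed data must move
                                   // too, or it is lost with the old VC
      from.fd       = -1;
      from.ssl      = nullptr;
      from.read_buf = nullptr;
    }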