Hi,

I took a quick look over the patch. I don't understand why it is still named "native_tls", because it doesn't use "__thread" variables anymore. The patch now gets rid of the additional TSRMLS_ arguments, but it performs nearly the same thing as TSRMLS_FETCH() on each module global access. This leads to a huge slowdown.
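To illustrate the difference, here is a minimal sketch. It is not code from the patch. It assumes GCC or Clang on a POSIX system, and my_globals, bind_globals(), G_NATIVE and G_LOOKUP are hypothetical names made up for the example. The first accessor reads a "__thread" pointer directly; the second pays a function call on every access, in the spirit of a TSRMLS_FETCH()-style lookup.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a module's globals struct. */
typedef struct {
    long counter;
} my_globals;

/* Variant A: the per-thread pointer lives in a native TLS variable
 * ("__thread" is a GCC/Clang extension). */
static __thread my_globals *globals_cache;

/* Variant B: the per-thread pointer is stored under a pthread key and
 * must be fetched with a function call on every access. */
static pthread_key_t globals_key;
static pthread_once_t globals_key_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    pthread_key_create(&globals_key, NULL);
}

/* Register the calling thread's globals for both variants. */
void bind_globals(my_globals *g)
{
    pthread_once(&globals_key_once, make_key);
    pthread_setspecific(globals_key, g);
    globals_cache = g;
}

/* Direct TLS read: no function call is needed to find the struct. */
#define G_NATIVE(field) (globals_cache->field)

/* Lookup on each access: one pthread_getspecific() call per use. */
#define G_LOOKUP(field) \
    (((my_globals *)pthread_getspecific(globals_key))->field)

long bump_native(void) { return ++G_NATIVE(counter); }
long bump_lookup(void) { return ++G_LOOKUP(counter); }
```

Both accessors reach the same struct; only the cost of finding it differs, and that cost is paid on every module global access in the second variant.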
bench.php:

  non-zts:     1.222 sec
  zts:         1.362 sec
  native_tls:  1.785 sec

I think the patch makes no sense in this state.

It looks like on Windows we can't use __declspec(thread) at all in DLLs loaded using LoadLibrary(), so we won't be able to build mod_php for Apache or use __declspec(thread) for module globals of extensions built as DLLs.

On Linux it must be possible, but it depends on the TLS model (gcc -ftls-model=...). The "global-dynamic" model must work in all cases, but I'm not sure about performance, because it will lead to an additional function call for each "__thread" variable access (maybe I'm wrong). Better models (like "initial-exec") have some limitations. I don't have enough experience to say whether they could work for us.

Thanks. Dmitry.

On Mon, Sep 29, 2014 at 10:54 AM, Dmitry Stogov <dmi...@zend.com> wrote:

> Hi Anatol.
>
> I'll take a look on Tuesday or Wednesday.
>
> Thanks. Dmitry.
>
> On Sat, Sep 27, 2014 at 12:59 AM, Anatol Belski <anatol....@belski.net>
> wrote:
>
>> Hi Dmitry,
>>
>> On Mon, September 22, 2014 08:43, Dmitry Stogov wrote:
>> > Hi Anatol,
>> >
>> > I didn't completely get your ideas, but if tsrm_ls_cache can't be
>> > exported on Windows directly, can we have a copy of tsrm_ls_cache in
>> > each DLL/EXE and initialize it once?
>> >
>> > Thanks. Dmitry.
>> >
>> Joe and me was working on this and there is a worky version now. Generally
>> it suffers from some issues already present in master, but in all things
>> together it's a worky crossplatform approach. Please look up the
>> native-tls branch.
>>
>> For the current variant I used the idea from the original RFC, but removed
>> exporting the TSRM cache through a __thread variable as it's not portable.
>> I've also removed the offset logic from the RFC patch, as that brought
>> additional hard to find bugs especially into the current unstable version.
>>
>> I don't think it's necessary to copy the arbitrary globals structs in
>> every ext, further more i think it's not easy possible without some big
>> overhead. However even with the current native-tls branch I'm able to run
>> wordpress, symfony, ab -c 8 -n 2048 pass also with multiple calls. Still,
>> some Apache bugs are already reported against master, I also repro some
>> others, mostly arbitrary shutdown crashes in Apache (so TS version). So as
>> they're in master, they're for sure in native-tls.
>>
>> PHP happens to always have used TLS, however the pointer was passed
>> directly to the functions. In TSRM.c, that's tsrm_tls_get/tsrm_tls_set.
>> Now, a function wrapper is used to fetch the TLS cache directly in the
>> TSRMG macro. This makes the whole slowlier, but allows to get rid of the
>> TSRMLS_* macros. The big question is to optimize the function call to
>> speedup the whole. Maybe one can speedup it saving a tsrm ls cache pointer
>> locally per extension or code area. ATM we're checking the functional
>> part, then one can proceed further with removing the TSRMLS_* macros. Any
>> speedup or improvement thoughts are welcome.
>>
>> Possible directions of the further work after known bugs are fixed (in
>> master or in native-tls), some are mutually exclusive
>>
>> - reimplement the offset logic instead of arrays for the globals structs
>> - share the tsrm cache pointer globally to some scope, like extension or
>>   sapi
>> - remove the linked lists logic and use TLS explicitly
>> - improve locking
>>
>> Thanks
>>
>> Anatol