Hi Anatol,

I know, TSRM uses TLS APIs internally.
In my opinion, the simplest (and probably efficient) way to get rid of the TSRMLS_DC arguments and TSRMLS_FETCH calls would be to introduce a global thread-specific variable:

__thread void ***tsrm_ls;

As I understand it, this won't work on Windows anyway, because the Windows linker is not smart enough to use TLS variables across different DLLs. Maybe it's possible to have a local thread-specific copy of tsrm_ls for each DLL, but then we would have to keep them consistent...

Sorry, I can't give you any advice, and I can't spend a lot of time on this topic. Maybe the description of TLS internals on ELF systems will give you some ideas:

http://www.akkadia.org/drepper/tls.pdf

Thanks. Dmitry.

On Wed, Oct 1, 2014 at 3:50 AM, Anatol Belski <anatol....@belski.net> wrote:

> Hi Dmitry,
>
> thanks for taking a look at this.
>
> On Wed, October 1, 2014 00:09, Dmitry Stogov wrote:
> > Hi,
> >
> > I took a quick look over the patch.
> > I didn't get why it's named "native_tls" now, because it doesn't use
> > "__thread" variables anymore.
>
> I was wondering myself but now I see (intentionally taking the 5.2 source)
>
> http://lxr.php.net/xref/PHP_5_2/TSRM/TSRM.c#282
> http://lxr.php.net/xref/PHP_5_2/TSRM/TSRM.c#329
>
> We already use TLS :) It took quite some time to understand this.
>
> > Actually, now the patch gets rid of the additional TSRMLS_ arguments, but
> > performs nearly the same thing as TSRMLS_FETCH() on each module global
> > access. It leads to a huge slowdown.
> >
> > bench.php:
> >
> > non-zts: 1.222 sec
> > zts: 1.362 sec
> > native_tls: 1.785 sec
> >
> > I think the patch makes no sense in this state.
>
> Absolutely, this state is just to show we can drop the TSRMLS_* things
> without hurting the functional part. At least I'm glad you have noticed no
> regression on the functionality, but just the slowdown.
>
> > It looks like on Windows we can't use __declspec(thread) in DLLs loaded
> > using LoadLibrary() at all, so we won't be able to build mod_php for
> > Apache or use __declspec(thread) for module globals of extensions built
> > as DLLs.
> >
> > On Linux it must be possible, but it depends on the TLS model (gcc
> > -ftls-model=...). The "global-dynamic" model must work in all cases, but
> > I'm not sure about performance, because it'll lead to an additional
> > function call for each "__thread" variable access (maybe I'm wrong).
> > Better models (like "initial-exec") have some limitations. I don't have
> > enough experience to say if they could work for us.
>
> With the Linux part - yeah, the gcc linker does the great magic which
> makes __thread variables work between shared objects. The function call is
> needed specifically because on Windows it is not possible to do
> __declspec(dllexport) __declspec(thread). As for __declspec(thread) being
> unusable when explicitly loaded - not true since Vista anymore. Please
> read here
>
> http://msdn.microsoft.com/en-us/library/windows/desktop/ms684175%28v=vs.85%29.aspx
>
> "Windows Server 2003 and Windows XP: The Visual C++ compiler supports a
> syntax that enables you to declare thread-local variables:
> _declspec(thread). If you use this syntax in a DLL, you will not be able
> to load the DLL explicitly using LoadLibrary on versions of Windows prior
> to Windows Vista. If your DLL will be loaded explicitly, you must use the
> thread local storage functions instead of _declspec(thread). For an
> example, see Using Thread Local Storage in a Dynamic Link Library."
>
> So this is not an issue as we won't support XP in PHP7 anyway. But the
> issue is that it cannot export a thread specific variable, but it can
> perfectly gain the access to it through an explicit tls storage query.
> MSDN also provides a snippet on how it works
>
> http://msdn.microsoft.com/en-us/library/windows/desktop/ms686997%28v=vs.85%29.aspx
>
> While investigating it, I've added one more DLL which loads through
> LoadLibrary and it worked the same way as the DLL linked implicitly using
> a .lib file. Let me know if you'd like to look at my investigation code on
> this. Just to summarize:
>
> - TLS is used all the way since at least 5.2
> - the portable way to share the tls storage is by accessing it through
>   tsrm_tls_get/tsrm_tls_set macros
> - function calls slow down everything
>
> That is what I mean - the patch is functionally doing the same as the
> mainstream does, but allows removing the TSRMLS_* macros.
>
> Btw. the thread keys get allocated by Apache already. Maybe we have to
> care about that in some other SAPI but I'm not sure there is another one
> except mpm_worker/mpm_winnt which can exhaust all the TS potential. In
> Apache it's spread over several sources, however here is the essential
> part
>
> http://svn.apache.org/viewvc/apr/apr/trunk/threadproc/win32/threadpriv.c?view=markup
>
> My current idea on how to speed it up - the __thread or __declspec(thread)
> variables can be used and are both portable within the same binary unit
> (say .so, .dll, .exe, etc.). Once we have a resource pointer, it can be
> cached in a local variable. In some header, it would be declared like
>
> TSRM_TLS extern void *tsrm_cache;
>
> And in one .c file it would be properly defined. The header would make it
> accessible from all the .c files in the same binary unit (all object files
> linked together). Of course, this variable has to be updated once per
> thread before any globals could be read. What is needed is to look up the
> correct places to update that variable. Such a variable will
> unfortunately have to be defined in every so/dll/exe and then updated
> accordingly. But getting the tsrm cache per function call will still work.
> Maybe something comes to your mind where such correct places should be?
> zend_startup() and zend_activate()? When it works, it might even solve
> some perf issues on Linux too, if I understood it correctly.
>
> Regards
>
> Anatol
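
For illustration, the cached-pointer idea from the quoted message could be sketched roughly as below. This is only a sketch under stated assumptions, not TSRM code: it assumes POSIX threads and the GCC/Clang __thread keyword, and every name in it (demo_globals, demo_key, demo_cache, demo_fetch_slow, demo_cache_update, DEMO_G) is hypothetical and made up for this example. The slow path stands in for an explicit TLS query such as tsrm_tls_get; the fast path reads a thread-local pointer that is only visible inside one binary unit.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread globals block; stands in for the real resource table. */
typedef struct {
    int request_count;
} demo_globals;

/* Slow path state: one process-wide key, created once. */
static pthread_key_t demo_key;
static pthread_once_t demo_key_once = PTHREAD_ONCE_INIT;

/* Fast path state: a thread-local cache private to this binary unit.
 * __thread is a GCC/Clang extension (assumption); MSVC would need
 * __declspec(thread) instead. */
static __thread demo_globals *demo_cache;

static void demo_key_init(void)
{
    /* free() releases the per-thread block when a thread exits. */
    int rc = pthread_key_create(&demo_key, free);
    assert(rc == 0);
    (void)rc;
}

/* Explicit TLS query: works across binary units, but costs function calls. */
static demo_globals *demo_fetch_slow(void)
{
    demo_globals *g;

    pthread_once(&demo_key_once, demo_key_init);
    g = pthread_getspecific(demo_key);
    if (g == NULL) {
        g = calloc(1, sizeof(*g));
        assert(g != NULL);
        pthread_setspecific(demo_key, g);
    }
    return g;
}

/* Must run once per thread, in each binary unit, before DEMO_G is used. */
static void demo_cache_update(void)
{
    demo_cache = demo_fetch_slow();
}

/* Globals access without any extra argument or per-access fetch call. */
#define DEMO_G(field) (demo_cache->field)
```

A thread would call demo_cache_update() once at a suitable startup point and then use DEMO_G(request_count) directly. Where that update call belongs in the real engine is exactly the open question raised above.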