Hi Anatol,

I know, TSRM uses TLS APIs internally.

In my opinion, the simplest (and probably an efficient) way to get rid of
TSRMLS_DC arguments and TSRMLS_FETCH calls would be to introduce a global
thread-specific variable.

__thread void ***tsrm_ls;

As I understand it, this won't work on Windows anyway, because the Windows
linker is not smart enough to share TLS variables across different DLLs.
Maybe it's possible to have a local thread-specific copy of tsrm_ls in
each DLL, but then we would have to keep them consistent...
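
To make the per-DLL copy idea concrete, here is a minimal sketch, not the
actual TSRM code. It assumes pthreads as the explicit TLS API and the
GCC/Clang __thread keyword; demo_globals, demo_key, demo_cache, demo_fetch,
demo_thread_startup and DEMO_G are hypothetical names made up for
illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread globals block (stands in for what tsrm_ls
 * would point to). */
typedef struct {
    int request_count;
} demo_globals;

/* Process-wide TLS key, created once (assumption: pthreads as the TLS API). */
static pthread_key_t demo_key;
static pthread_once_t demo_key_once = PTHREAD_ONCE_INIT;

/* Module-local cache: every .so/.dll/.exe would define its own copy. */
static __thread demo_globals *demo_cache;

static void demo_make_key(void)
{
    pthread_key_create(&demo_key, NULL);
}

/* Slow path: an explicit TLS query, i.e. a function call on each access. */
demo_globals *demo_fetch(void)
{
    pthread_once(&demo_key_once, demo_make_key);
    return pthread_getspecific(demo_key);
}

/* Must run once per thread, before any globals are read in this module.
 * It publishes the pointer through the TLS API and refreshes the cache,
 * which is what keeps the two views consistent. */
void demo_thread_startup(demo_globals *g)
{
    pthread_once(&demo_key_once, demo_make_key);
    pthread_setspecific(demo_key, g);
    demo_cache = g;
}

/* Fast path: plain access through the cached thread-local pointer. */
#define DEMO_G(field) (demo_cache->field)
```

Another module would not see demo_cache directly; it would call
demo_fetch() once per thread and store the result in its own __thread
copy. The open question remains where each module should refresh its copy.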

Sorry, I can't give you any advice, and can't spend a lot of time on this
topic.

Maybe a description of TLS internals on ELF systems would give you some
ideas.

http://www.akkadia.org/drepper/tls.pdf

Thanks. Dmitry.

On Wed, Oct 1, 2014 at 3:50 AM, Anatol Belski <anatol....@belski.net> wrote:

> Hi Dmitry,
>
> thanks for taking a look at this.
>
> On Wed, October 1, 2014 00:09, Dmitry Stogov wrote:
> > Hi,
> >
> >
> > I took a quick look over the patch.
> > I didn't get why it's named "native_tls" now, because it doesn't use
> > "__thread" variables anymore.
> I was wondering myself but now I see (intentionally taking the 5.2 source)
>
> http://lxr.php.net/xref/PHP_5_2/TSRM/TSRM.c#282
> http://lxr.php.net/xref/PHP_5_2/TSRM/TSRM.c#329
>
> We already use TLS :) It took quite some time to understand this.
>
> > Actually, the patch now gets rid of the additional TSRMLS_ arguments, but
> > performs nearly the same thing as TSRMLS_FETCH() on each module global
> > access. It leads to a huge slowdown.
> >
> > bench.php.
> >
> > non-zts:     1.222 sec
> > zts:         1.362 sec
> > native_tls:  1.785 sec
> >
> >
> > I think, the patch makes no sense in this state.
> >
> Absolutely, this state is just to show we can drop the TSRMLS_* things
> without hurting the functional part. At least I'm glad you have noticed no
> regression on the functionality, but just the slowdown.
>
> > It looks like on Windows we can't use __declspec(thread) in DLLs loaded
> > using LoadLibrary() at all, so we won't be able to build mod_php for
> > Apache or use __declspec(thread) for module globals of extensions built
> > as DLLs.
> >
> > On Linux it must be possible, but it depends on the TLS model (gcc
> > -ftls-model=...). The "global-dynamic" model must work in all cases, but
> > I'm not sure about performance, because it leads to an additional
> > function call for each "__thread" variable access (maybe I'm wrong).
> > Better models (like "initial-exec") have some limitations. I don't have
> > enough experience to say whether they could work for us.
> >
> As for the Linux part - yeah, the gcc toolchain does the great magic that
> makes __thread variables work between shared objects. The function call is
> needed specifically because on Windows it is not possible to combine
> __declspec(dllexport) with __declspec(thread). As for __declspec(thread)
> being unusable in explicitly loaded DLLs - that is no longer true since
> Vista. Please read here
>
>
> http://msdn.microsoft.com/en-us/library/windows/desktop/ms684175%28v=vs.85%29.aspx
>
> "Windows Server 2003 and Windows XP:  The Visual C++ compiler supports a
> syntax that enables you to declare thread-local variables:
> _declspec(thread). If you use this syntax in a DLL, you will not be able
> to load the DLL explicitly using LoadLibrary on versions of Windows prior
> to Windows Vista. If your DLL will be loaded explicitly, you must use the
> thread local storage functions instead of _declspec(thread). For an
> example, see Using Thread Local Storage in a Dynamic Link Library."
>
> So this is not an issue, as we won't support XP in PHP7 anyway. The
> remaining issue is that a DLL cannot export a thread-specific variable,
> though it can perfectly well access one through an explicit TLS storage
> query. MSDN also provides a snippet showing how it works
>
>
> http://msdn.microsoft.com/en-us/library/windows/desktop/ms686997%28v=vs.85%29.aspx
>
> While investigating this, I added one more DLL which is loaded through
> LoadLibrary, and it worked the same way as the DLL linked implicitly using
> a .lib file. Let me know if you'd like to look at my investigation code for
> this. Just to summarize:
>
> - TLS is used all the way since at least 5.2
> - the portable way to share the tls storage is by accessing it through
> tsrm_tls_get/tsrm_tls_set macros
> - function calls slow down everything
>
> That is what I mean - the patch functionally does the same as what the
> mainstream does, but allows removing the TSRMLS_* macros.
>
> Btw. the thread keys get allocated by Apache already. Maybe we have to
> care about that in some other SAPI but I'm not sure there is another one
> except mpm_worker/mpm_winnt which can exhaust all the TS potential. In
> Apache it's spread over several sources, however here is the essential
> part
>
>
> http://svn.apache.org/viewvc/apr/apr/trunk/threadproc/win32/threadpriv.c?view=markup
>
>
> My current idea on how to speed it up - __thread or __declspec(thread)
> variables can be used, and both are portable within the same binary unit
> (say .so, .dll, .exe, etc.). Once we have a resource pointer, it can be
> cached in a module-local variable. In some header, it would be declared like
>
> extern TSRM_TLS void *tsrm_cache;
>
> And in one .c file it would be properly defined. The header would make it
> accessible from all the .c files in the same binary unit (all object files
> linked together). Of course, this variable has to be updated once per
> thread before any globals can be read. What is needed is to find the
> correct places to update that variable. Unfortunately, such a variable
> will have to be defined in every so/dll/exe and then updated
> accordingly. But getting the tsrm cache per function call will still work.
> Maybe something comes to your mind as to where such correct places would
> be? zend_startup() and zend_activate()? Once it works, it might even solve
> some perf issues on Linux too, if I understood it correctly.
>
> Regards
>
> Anatol
>
>
>
>
>
>
