On Sun, October 5, 2014 21:32, Anatol Belski wrote:
> Hi Dmitry,
>
>
> On Wed, October 1, 2014 08:01, Dmitry Stogov wrote:
>
>> Hi Anatol,
>>
>>
>>
>> I know, TSRM uses TLS APIs internally.
>>
>>
>>
>> In my opinion, the simplest (and probably an efficient) way to get rid of
>> TSRMLS_DC arguments and TSRMLS_FETCH calls would be to introduce a
>> global thread-specific variable.
>>
>> __thread void ***tsrm_ls;
>>
>>
>>
>> As I understand it, this won't work on Windows anyway, because the Windows
>> linker is not smart enough to use TLS variables across different DLLs.
>> Maybe it's possible to have a local thread-specific copy of tsrm_ls for
>> each DLL, but then we would have to keep them consistent...
>>
>> Sorry, I can't give you any advice, and can't spend a lot of time on
>> this topic.
>>
>> Maybe a description of TLS internals on ELF systems would give you some
>> ideas.
>>
>> http://www.akkadia.org/drepper/tls.pdf
>>
>>
>>
>> Thanks. Dmitry.
>>
>>
>>
> I've reworked this patch to use one pointer per shared unit. Please
> see here
> http://git.php.net/?p=php-src.git;a=commitdiff;h=76081df168829a5cc0409fac47c217d4927ec6f6
> (though this was just the first in the series). Afterwards I adapted
> ext/standard and also converted ext/sockets as an example, because
> it's usually compiled shared.
>
> With this change I see much better performance: the difference is in the
> 50-100 ms range compared to the master TS build. Some individual tests in
> bench.php even show a better result.
>
> However, this is not a global __thread variable, but one local to every
> shared unit. That is, tsrm_ls has to be declared in every .so, .dll or .exe
> and updated on each request. For now I've put the update code in MINIT and
> into the first ctor called (zmm is the one in php7ts.dll). The ctor
> seems to be the only reliable place (but maybe I'm wrong); although it
> will be called for every request instead of once per thread, that won't
> be very bad.
>
>
> I'd suggest going this way so that we have the same flow everywhere.
>
>
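As a rough illustration of the per-shared-unit pointer described in the
quoted mail above, here is a minimal sketch. It is not the code from the
patch: every demo_* name is hypothetical, the globals struct is a stand-in
for the real TSRM resource table, and __thread assumes GCC or Clang (MSVC
would need __declspec(thread) instead).

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-thread globals that TSRM manages. */
typedef struct {
    int request_count;
} demo_globals;

/* One thread-local pointer per shared unit (.so, .dll or .exe).
 * "static" keeps it private to this unit, so the linker never has to
 * resolve a TLS symbol across module boundaries. */
static __thread void *demo_tsrm_ls_cache = NULL;

/* Hypothetical update hook: in the scheme described above, something
 * like this would run from MINIT and from the first globals ctor. */
static void demo_tsrm_ls_update(void *ls)
{
    demo_tsrm_ls_cache = ls;
}

/* Accessor that replaces passing the pointer as an extra argument
 * (the role TSRMLS_DC / TSRMLS_FETCH play today). */
static demo_globals *demo_get_globals(void)
{
    return (demo_globals *)demo_tsrm_ls_cache;
}

/* Example consumer: returns -1 if the cache was never updated on this
 * thread, otherwise the incremented per-thread request counter. */
static int demo_handle_request(void)
{
    demo_globals *g = demo_get_globals();
    if (g == NULL) {
        return -1;
    }
    return ++g->request_count;
}
```

The point of the sketch is only that each module reads its own cached
pointer, which is why the cache has to be refreshed somewhere reliable in
every module.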
Here are the results from Zend/bench.php, run on 64-bit Linux and
Windows (times in seconds):


master ts linux

simple             0.112
simplecall         0.036
simpleucall        0.129
simpleudcall       0.135
mandel             0.317
mandel2            0.340
ackermann(7)       0.086
ary(50000)         0.010
ary2(50000)        0.009
ary3(2000)         0.173
fibo(30)           0.291
hash1(50000)       0.027
hash2(500)         0.023
heapsort(20000)    0.070
matrix(20)         0.075
nestedloop(12)     0.188
sieve(30)          0.062
strcat(200000)     0.013
------------------------
Total              2.095

native-tls linux

simple             0.072
simplecall         0.048
simpleucall        0.180
simpleudcall       0.161
mandel             0.311
mandel2            0.322
ackermann(7)       0.128
ary(50000)         0.010
ary2(50000)        0.009
ary3(2000)         0.159
fibo(30)           0.394
hash1(50000)       0.029
hash2(500)         0.024
heapsort(20000)    0.067
matrix(20)         0.070
nestedloop(12)     0.129
sieve(30)          0.063
strcat(200000)     0.011
------------------------
Total              2.186


master ts windows

simple             0.096
simplecall         0.046
simpleucall        0.137
simpleudcall       0.124
mandel             0.283
mandel2            0.346
ackermann(7)       0.089
ary(50000)         0.009
ary2(50000)        0.007
ary3(2000)         0.130
fibo(30)           0.231
hash1(50000)       0.023
hash2(500)         0.020
heapsort(20000)    0.078
matrix(20)         0.065
nestedloop(12)     0.162
sieve(30)          0.045
strcat(200000)     0.012
------------------------
Total              1.903


native-tls windows

simple             0.098
simplecall         0.048
simpleucall        0.107
simpleudcall       0.109
mandel             0.285
mandel2            0.338
ackermann(7)       0.093
ary(50000)         0.009
ary2(50000)        0.007
ary3(2000)         0.140
fibo(30)           0.250
hash1(50000)       0.025
hash2(500)         0.020
heapsort(20000)    0.080
matrix(20)         0.070
nestedloop(12)     0.189
sieve(30)          0.047
strcat(200000)     0.010
------------------------
Total              1.925

Made on real hardware, no VMs.
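
To compare the totals above, the absolute and relative differences can be
computed with a small helper. This is only a sketch of the arithmetic; the
function names are made up for illustration, and the totals are copied
from the tables above.

```c
/* Totals in seconds, copied from the bench.php tables above. */
static const double MASTER_TS_LINUX    = 2.095;
static const double NATIVE_TLS_LINUX   = 2.186;
static const double MASTER_TS_WINDOWS  = 1.903;
static const double NATIVE_TLS_WINDOWS = 1.925;

/* Absolute difference in milliseconds (candidate minus baseline). */
static double diff_ms(double baseline_s, double candidate_s)
{
    return (candidate_s - baseline_s) * 1000.0;
}

/* Relative difference in percent of the baseline. */
static double diff_percent(double baseline_s, double candidate_s)
{
    return (candidate_s - baseline_s) / baseline_s * 100.0;
}
```

With these totals, diff_ms gives about 91 ms for Linux and about 22 ms for
Windows, i.e. roughly 4% and 1% of the respective master TS totals.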

Regards

Anatol



-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
