bench.php should be faster as well. It is probably slower because of unlucky code locality or something like that.
Thanks. Dmitry.

On Mon, Oct 13, 2014 at 4:40 PM, <anatol....@belski.net> wrote:

> Hi Dmitry,
>
> thanks for taking a look,
>
> On Mon, October 13, 2014 13:38, Dmitry Stogov wrote:
> > Hi Anatol,
> >
> > At first, I still saw the same big difference on Linux.
> > bench.php ZTS - 1.340 sec, native TLS - 1.785 sec. As I understood, it
> > must be related to incomplete changes in build scripts, related to
> > ZEND_ENABLE_STATIC_TSRMLS_CACHE. Right?
>
> Huh, not sure why the diff you get is so big. I guess it might be so with a
> full build. The bench results I was sending previously were done with
> --disable-all (actually more than that isn't needed for bench.php) on
> opensuse/x64. I have to check this. The ZEND_ENABLE_STATIC_TSRMLS_CACHE
> conversions aren't complete, that's true.
>
> > If I get it properly, main PHP binary should be compiled with
> > -DZEND_ENABLE_STATIC_TSRMLS_CACHE=1 and shared extensions without it. It
> > should lead to quite fast code in main PHP binary and statically linked
> > extensions, but to slow code in shared extensions. Right?
>
> With a small addition it is right. Actually any ext can possibly be
> compiled with -DZEND_ENABLE_STATIC_TSRMLS_CACHE=1, just in accordance with
> the following
>
> - ext is static, then its SOMEEXT_G should just use ZEND_TSRMG instead of
>   TSRMG
> - ext is shared, then the ext should
>   - use ZEND_TSRMG
>   - use ZEND_TSRMLS_CACHE_DEFINE
>   - and ZEND_TSRMLS_CACHE_UPDATE at a proper place (globals ctor is a
>     good one, maybe there's a better one)
>   - and possibly it can ZEND_TSRMLS_CACHE_EXTERN in some header to
>     spread over multiple c files within the same ext
> - ext can be either/or static/shared, then at compile time it needs to
>   catch both cases. I made an exemplary conversion for ext/sockets. It can
>   already switch to ZEND_TSRM which will use static or function call model
>   to fetch TSRMLS, but it has to use COMPILE_DL_SOMEEXT && ZTS to check if
>   it's a separate module (so it has to update its local TSRMLS itself).
> - without -DZEND_ENABLE_STATIC_TSRMLS_CACHE=1 and steps above - an ext
>   will always use TSRMG (so tsrm_get_ls_cache()) to fetch globals, but the
>   good thing about that is - nothing will have to be touched and it'll
>   work even if slower
>
> > I built PHP in this way with all extensions linked statically. Now, I see
> > small slowdown on bench.php (however according to callgrind it executes
> > less instructions and should be faster). Wordpress became 2% faster.
> >
> > So the patch becomes interesting. :)
> > However, many distributions prefer shared extensions, and it would be
> > great to invent some trick to make them fast too.
>
> So you could test it even more, but WP being faster with the current state
> is a bit surprising. But well :) Probably I'm going to continue with some
> more conversions for the remaining exts and SAPIs, will also try to figure
> out why it could be possibly slower with bench.php, and then come back
> again.
>
> > I would also prefer to keep the semantic patch small and don't delete all
> > FETCH_TSRM() in thousand places (at this point).
> > Replacing macro in one place must be easier.
> > It's not a problem to remove them on second step if the PoC would really
> > work.
>
> I think one can apply a partial reverse patch for now, then all those
> places should be back. Probably easier than looking for all the
> corresponding commits. So I'll do it.
>
> Regards
>
> Anatol
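
For readers following the thread, the difference between the two fetch models discussed above (a function call on every globals access versus a module-local cached pointer that is refreshed once) can be sketched in plain C. This is only an illustration under stated assumptions, not the actual Zend/TSRM code: every `demo_*` and `DEMO_*` name is hypothetical. `demo_get_ls_cache()` stands in for `tsrm_get_ls_cache()`, `demo_ls_cache` for what `ZEND_TSRMLS_CACHE_DEFINE` would provide, and `demo_cache_update()` for `ZEND_TSRMLS_CACHE_UPDATE`. The lookup counter exists only to make the difference observable.

```c
#include <stddef.h>

/* Hypothetical stand-in for an extension's per-thread globals. */
typedef struct {
    int counter;
} demo_globals;

/* Stand-in for the storage that TSRM keeps for the current thread. */
static _Thread_local demo_globals demo_thread_storage;

/* Counts slow-path lookups; only here to make the cost visible. */
static int demo_slow_lookups = 0;

/* Function call model: every access pays for a call
 * (models tsrm_get_ls_cache() in the discussion above). */
static void *demo_get_ls_cache(void)
{
    demo_slow_lookups++;
    return &demo_thread_storage;
}

/* Static cache model: a pointer local to the module. */
static _Thread_local void *demo_ls_cache = NULL;

/* Refresh the module-local pointer once, e.g. from a globals ctor. */
static void demo_cache_update(void)
{
    demo_ls_cache = demo_get_ls_cache();
}

/* Two flavours of a SOMEEXT_G-style accessor macro. */
#define DEMO_G_SLOW(v)   (((demo_globals *)demo_get_ls_cache())->v)
#define DEMO_G_CACHED(v) (((demo_globals *)demo_ls_cache)->v)

/* Increment the counter n times through the function call model. */
static void demo_bump_slow(int n)
{
    for (int i = 0; i < n; i++) {
        DEMO_G_SLOW(counter)++;
    }
}

/* Increment the counter n times through the cached pointer.
 * demo_cache_update() must have run on this thread first. */
static void demo_bump_cached(int n)
{
    for (int i = 0; i < n; i++) {
        DEMO_G_CACHED(counter)++;
    }
}
```

In this sketch the cached accessor performs no lookup calls after the one-time update, while the slow accessor performs one per access. That mirrors the trade-off described in the list above: an unconverted extension keeps working but pays for the call on every globals fetch.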