Hi Anatol,
Great progress! I didn't look into the code careful yet, but the explanation looks reasonable. For now, I see one missing part - opcache can't work completely out of the box, because its interned string related functions are wrapped with #ifndef ZTS Thanks. Dmitry. ________________________________ From: Anatol Belski <anatol....@belski.net> Sent: Friday, February 17, 2017 3:17:05 PM To: Dmitry Stogov; 'Nikita Popov'; 'Xinchen Hui'; 'Joe Watkins' Cc: 'PHP internals list' Subject: Thread safe interned strings Hi, I was working on a patch to support interned strings in thread safe builds of PHP. ATM there seems to be a good progress on this. The proposed patch unifies TS/NTS builds in regard to the interned strings handling and fixes issues in TS builds. https://github.com/php/php-src/pull/2390 The basic idea is like this - CG(interned_strings) is removed - there are two tables - one table is persistent only, can be filled only at startup and is maintained per process - second table is allocated per thread in TS builds and per process in NTS builds - second table is filled/emptied at every request, separate for every process/thread - in the startup phase, any interned string is saved into the first table - once a snapshot was made, that table stays unchanged until the process exit - after startup, any call to zend_new_interned_string saves into the second table, which is cleaned up per request There was also another idea we discussed with Dmitry, to have only one interned strings table, initially filled with permanent strings, which would be copied into every new thread. In my early implementation I had to abandon it, as a multiple tables idea seems to have more perspective. Especially in TS builds it means the long living strings reside in memory only once. The second table overhead is neglecting, the API can also be extended to override the storages, thus giving the full control over where and how things are stored. Additionally, I've defined three stages, where the interned string tables init/dtor happens. Those are - SAPI - TSRM - Zend The functions for init/dtor will record the stage, and will call the dtor accordingly. For the TS Apache variant it is required, because the main thread holds all the globals and the final interned strings destruction needs to happen after the main thread globals destruction. In general, it also useful in any build kind, as the permanent storage can be then used even before the engine is actually initialized. To do so, the SAPI can define init/dtor calls at the outermost places. The TSRM stage will be ATM missing in NTS builds, and if nothing else is defined, the old way of init/dtor in Zend is taken. The init function can be called at any stage, but the subsequent calls will be ignored and the dtor will be called only when the matching stage argument is passed. >From what I can tell so far, at least the bugs here are fixed https://bugs.php.net/bug.php?id=74015 https://bugs.php.net/bug.php?id=72451 These are likely to have been fixed, too, but some harder to test https://bugs.php.net/bug.php?id=71115 https://bugs.php.net/bug.php?id=74011 On my side, I tested quite hard with several small snippets from the bug reports and with WP and Symfony and could observe a huge stability increase. Unfortunately it is not really sensible to tell a performance diff, as same snippets were crashing quite quickly without the patch. The test setup I have is Apache worker/winnt penetrated by 8/16 simultaneous clients on a machine with 4 (Linux) or 8 (Windows) real processor cores. Regarding Opcache, this solution seems to work out of the box. As many the strings are overridden, there seems to be no impact. Though this should be double checked. I have to mention, that there might be also other issues regarding TS, fe with resources. I for now was only concentrated on the interned strings, as this is the most urgent part we still miss. Some general refactoring to the TS layer could be targeted later. Possible improvements to strings could be - intern more strings on startup - the constants, the ini directive names, etc. - introduce static interned strings, those that are already hardcoded into the text/data segment - introduce fixed size interned zend_string for short strings - introduce a special zend_string to hold path, thus saving strlen at many places, maybe also fixed size Thanks Anatol