Hi Anatol,

Great progress!

I didn't look into the code careful yet, but the explanation looks reasonable.

For now, I see one missing part - opcache can't work completely out of the box, 
because its interned string related functions are wrapped with #ifndef ZTS


Thanks. Dmitry.

________________________________
From: Anatol Belski <anatol....@belski.net>
Sent: Friday, February 17, 2017 3:17:05 PM
To: Dmitry Stogov; 'Nikita Popov'; 'Xinchen Hui'; 'Joe Watkins'
Cc: 'PHP internals list'
Subject: Thread safe interned strings

Hi,

I was working on a patch to support interned strings in thread safe builds
of PHP. ATM there seems to be a good progress on this. The proposed patch
unifies TS/NTS builds in regard to the interned strings handling and fixes
issues in TS builds.

https://github.com/php/php-src/pull/2390

The basic idea is like this

- CG(interned_strings) is removed
- there are two tables
- one table is persistent only, can be filled only at startup and is
maintained per process
- second table is allocated per thread in TS builds and per process in NTS
builds
- second table is filled/emptied at every request, separate for every
process/thread
- in the startup phase, any interned string is saved into the first table
- once a snapshot was made, that table stays unchanged until the process
exit
- after startup, any call to zend_new_interned_string saves into the second
table, which is cleaned up per request

There was also another idea we discussed with Dmitry, to have only one
interned strings table, initially filled with permanent strings, which would
be copied into every new thread. In my early implementation I had to abandon
it, as a multiple tables idea seems to have more perspective. Especially in
TS builds it means the long living strings reside in memory only once. The
second table overhead is neglecting, the API can also be extended to
override the storages, thus giving the full control over where and how
things are stored.

Additionally, I've defined three stages, where the interned string tables
init/dtor happens. Those are

- SAPI
- TSRM
- Zend

The functions for init/dtor will record the stage, and will call the dtor
accordingly. For the TS Apache variant it is required, because the main
thread holds all the globals and the final interned strings destruction
needs to happen after the main thread globals destruction. In general, it
also useful in any build kind, as the permanent storage can be then used
even before the engine is actually initialized. To do so, the SAPI can
define init/dtor calls at the outermost places. The TSRM stage will be ATM
missing in NTS builds, and if nothing else is defined, the old way of
init/dtor in Zend is taken. The init function can be called at any stage,
but the subsequent calls will be ignored and the dtor will be called only
when the matching stage argument is passed.

>From what I can tell so far, at least the bugs here are fixed
https://bugs.php.net/bug.php?id=74015
https://bugs.php.net/bug.php?id=72451

These are likely to have been fixed, too, but some harder to test
https://bugs.php.net/bug.php?id=71115
https://bugs.php.net/bug.php?id=74011

On my side, I tested quite hard with several small snippets from the bug
reports and with WP and Symfony and could observe a huge stability increase.
Unfortunately it is not really sensible to tell a performance diff, as same
snippets were crashing quite quickly without the patch. The test setup I
have is Apache worker/winnt penetrated by 8/16 simultaneous clients on a
machine with 4 (Linux) or 8 (Windows) real processor cores.

Regarding Opcache, this solution seems to work out of the box. As many the
strings are overridden, there seems to be no impact. Though this should be
double checked.

I have to mention, that there might be also other issues regarding TS, fe
with resources. I for now was only concentrated on the interned strings, as
this is the most urgent part we still miss. Some general refactoring to the
TS layer could be targeted later. Possible improvements to strings could be

- intern more strings on startup -  the constants, the ini directive names,
etc.
- introduce static interned strings, those that are already hardcoded into
the text/data segment
- introduce fixed size interned zend_string for short strings
- introduce a special zend_string to hold path, thus saving strlen at many
places, maybe also fixed size

Thanks

Anatol



Reply via email to