From: filip at hajny dot net
Operating system: SmartOS (possibly SunOS)
PHP version: Irrelevant
Package: FPM related
Bug Type: Bug
Bug description:FPM won't listen to signals when event ports and
catch_workers_output are used
Description:
------------
When "events.mechanism = port" and "catch_workers_output = yes" are in
effect on
SmartOS (and possibly
Illumos/SunOS in general), FPM won't terminate the process stack on a
signal
received (includes attempts to shut
down the stack with TERM, or reload with USR2). There seems to be no other
way to
stop the stack except for
SIGKILL to the master.
Looking at the debug log, we see SIGTERM sent to the children processes,
which
terminate, but stay as zombies.
# ./sapi/fpm/php-fpm -F
[01-Oct-2013 11:59:28.382567] DEBUG: pid 96294, fpm_scoreboard_init_main(),
line
40: got clock tick '100'
[01-Oct-2013 11:59:28.382842] DEBUG: pid 96294, fpm_event_init_main(), line
337:
event module is port and 5 fds
have been reserved
[01-Oct-2013 11:59:28.382863] NOTICE: pid 96294, fpm_init(), line 83: fpm
is
running, pid 96294
[01-Oct-2013 11:59:28.383274] DEBUG: pid 96294, fpm_children_make(), line
421:
[pool www] child 96295 started
[01-Oct-2013 11:59:28.383626] DEBUG: pid 96294, fpm_children_make(), line
421:
[pool www] child 96296 started
[01-Oct-2013 11:59:28.383658] DEBUG: pid 96294, fpm_event_loop(), line 366:
2232
bytes have been reserved in SHM
[01-Oct-2013 11:59:28.383665] NOTICE: pid 96294, fpm_event_loop(), line
367: ready
to handle connections
[01-Oct-2013 11:59:42.417811] DEBUG: pid 96294, fpm_event_loop(), line 419:
event
module triggered 1 events
[01-Oct-2013 11:59:42.417854] DEBUG: pid 96294, fpm_got_signal(), line 85:
received SIGTERM
[01-Oct-2013 11:59:42.417864] NOTICE: pid 96294, fpm_got_signal(), line 86:
Terminating ...
[01-Oct-2013 11:59:42.417873] DEBUG: pid 96294, fpm_pctl(), line 231:
switching to
'terminating' state
[01-Oct-2013 11:59:42.417913] DEBUG: pid 96294, fpm_pctl_kill_all(), line
159:
[pool www] sending signal 15
SIGTERM to child 96296
[01-Oct-2013 11:59:42.417933] DEBUG: pid 96294, fpm_pctl_kill_all(), line
159:
[pool www] sending signal 15
SIGTERM to child 96295
[01-Oct-2013 11:59:42.417942] DEBUG: pid 96294, fpm_pctl_kill_all(), line
168: 2
child(ren) still alive
[01-Oct-2013 11:59:42.417959] DEBUG: pid 96294, fpm_event_loop(), line 419:
event
module triggered 1 events
[01-Oct-2013 11:59:42.418045] DEBUG: pid 96294, fpm_event_loop(), line 419:
event
module triggered 2 events
[01-Oct-2013 11:59:42.418079] DEBUG: pid 96294, fpm_event_loop(), line 419:
event
module triggered 2 events
[01-Oct-2013 11:59:42.418855] ERROR: pid 96294, fpm_stdio_child_said(),
line 120:
unable to read what child say:
Bad file number (9)
[01-Oct-2013 11:59:42.418923] DEBUG: pid 96294, fpm_event_loop(), line 419:
event
module triggered 1 events
Killed
DTracing sig_handler, we see one SIGCLD per child received.
# dtrace -n 'pid$target::sig_handler:entry{ trace(arg0); }' -p $(pgrep
php|sort -
n|head -1)
dtrace: description 'pid$target::sig_handler:entry' matched 1 probe
CPU ID FUNCTION:NAME
2 483 sig_handler:entry 15
2 483 sig_handler:entry 18
2 483 sig_handler:entry 18
Oddly enough, the problem is only manifest as long as at least one child
process
exists from the flock launched at
the start. If we manually terminate all the children processes first and
let the
master re-spawn them all, the
problem goes away entirely.
# ./sapi/fpm/php-fpm -F
[01-Oct-2013 12:20:01.656647] DEBUG: pid 96910, fpm_scoreboard_init_main(),
line
40: got clock tick '100'
[01-Oct-2013 12:20:01.656920] DEBUG: pid 96910, fpm_event_init_main(), line
337:
event module is port and 5 fds
have been reserved
[01-Oct-2013 12:20:01.656940] NOTICE: pid 96910, fpm_init(), line 83: fpm
is
running, pid 96910
[01-Oct-2013 12:20:01.657345] DEBUG: pid 96910, fpm_children_make(), line
421:
[pool www] child 96911 started
[01-Oct-2013 12:20:01.657696] DEBUG: pid 96910, fpm_children_make(), line
421:
[pool www] child 96912 started
[01-Oct-2013 12:20:01.657730] DEBUG: pid 96910, fpm_event_loop(), line 366:
2232
bytes have been reserved in SHM
[01-Oct-2013 12:20:01.657740] NOTICE: pid 96910, fpm_event_loop(), line
367: ready
to handle connections
[01-Oct-2013 12:20:12.049038] DEBUG: pid 96910, fpm_event_loop(), line 419:
event
module triggered 4 events
[01-Oct-2013 12:20:12.049745] ERROR: pid 96910, fpm_stdio_child_said(),
line 120:
unable to read what child say:
Bad file number (9)
[01-Oct-2013 12:20:12.049811] DEBUG: pid 96910, fpm_event_loop(), line 419:
event
module triggered 1 events
[01-Oct-2013 12:20:12.049844] DEBUG: pid 96910, fpm_got_signal(), line 76:
received SIGCHLD
[01-Oct-2013 12:20:12.049880] WARNING: pid 96910, fpm_children_bury(), line
252:
[pool www] child 96911 exited on
signal 15 (SIGTERM) after 10.392237 seconds from start
[01-Oct-2013 12:20:12.050366] NOTICE: pid 96910, fpm_children_make(), line
421:
[pool www] child 96920 started
[01-Oct-2013 12:20:12.050404] WARNING: pid 96910, fpm_children_bury(), line
252:
[pool www] child 96912 exited on
signal 15 (SIGTERM) after 10.392396 seconds from start
[01-Oct-2013 12:20:12.050735] NOTICE: pid 96910, fpm_children_make(), line
421:
[pool www] child 96921 started
[01-Oct-2013 12:20:12.050770] DEBUG: pid 96910, fpm_got_signal(), line 76:
received SIGCHLD
[01-Oct-2013 12:20:12.050790] DEBUG: pid 96910, fpm_event_loop(), line 419:
event
module triggered 1 events
^C[01-Oct-2013 12:20:17.138831] DEBUG: pid 96910, fpm_got_signal(), line
80:
received SIGINT
[01-Oct-2013 12:20:17.138894] NOTICE: pid 96910, fpm_got_signal(), line 81:
Terminating ...
[01-Oct-2013 12:20:17.138927] DEBUG: pid 96910, fpm_pctl(), line 231:
switching to
'terminating' state
[01-Oct-2013 12:20:17.139001] DEBUG: pid 96910, fpm_pctl_kill_all(), line
159:
[pool www] sending signal 15
SIGTERM to child 96921
[01-Oct-2013 12:20:17.139031] DEBUG: pid 96910, fpm_pctl_kill_all(), line
159:
[pool www] sending signal 15
SIGTERM to child 96920
[01-Oct-2013 12:20:17.139043] DEBUG: pid 96910, fpm_pctl_kill_all(), line
168: 2
child(ren) still alive
[01-Oct-2013 12:20:17.139069] DEBUG: pid 96910, fpm_event_loop(), line 419:
event
module triggered 1 events
[01-Oct-2013 12:20:17.139796] DEBUG: pid 96910, fpm_got_signal(), line 76:
received SIGCHLD
[01-Oct-2013 12:20:17.139834] DEBUG: pid 96910, fpm_children_bury(), line
252:
[pool www] child 96920 exited on
signal 2 (SIGINT) after 5.089324 seconds from start
[01-Oct-2013 12:20:17.139876] DEBUG: pid 96910, fpm_children_bury(), line
252:
[pool www] child 96921 exited on
signal 2 (SIGINT) after 5.088997 seconds from start
[01-Oct-2013 12:20:17.139906] NOTICE: pid 96910, fpm_pctl_exit(), line 70:
exiting, bye-bye!
This has been a problem on all FPM-capable PHP branches (including current
trunk)
for at least a year now.
Configure phase seems irrelevant, but for the record:
# ./configure --enable-fpm --with-fpm-user=www --with-fpm-group=www
--with-config-
file-path=/opt/local/etc --with-
config-file-scan-dir=/opt/local/etc/php.d --sysconfdir=/opt/local/etc --
localstatedir=/var --disable-all --
prefix=/opt/local
--
Edit bug report at https://bugs.php.net/bug.php?id=65800&edit=1
--
Try a snapshot (PHP 5.4):
https://bugs.php.net/fix.php?id=65800&r=trysnapshot54
Try a snapshot (PHP 5.5):
https://bugs.php.net/fix.php?id=65800&r=trysnapshot55
Try a snapshot (trunk):
https://bugs.php.net/fix.php?id=65800&r=trysnapshottrunk
Fixed in SVN: https://bugs.php.net/fix.php?id=65800&r=fixed
Fixed in release: https://bugs.php.net/fix.php?id=65800&r=alreadyfixed
Need backtrace: https://bugs.php.net/fix.php?id=65800&r=needtrace
Need Reproduce Script: https://bugs.php.net/fix.php?id=65800&r=needscript
Try newer version: https://bugs.php.net/fix.php?id=65800&r=oldversion
Not developer issue: https://bugs.php.net/fix.php?id=65800&r=support
Expected behavior: https://bugs.php.net/fix.php?id=65800&r=notwrong
Not enough info:
https://bugs.php.net/fix.php?id=65800&r=notenoughinfo
Submitted twice:
https://bugs.php.net/fix.php?id=65800&r=submittedtwice
register_globals: https://bugs.php.net/fix.php?id=65800&r=globals
PHP 4 support discontinued: https://bugs.php.net/fix.php?id=65800&r=php4
Daylight Savings: https://bugs.php.net/fix.php?id=65800&r=dst
IIS Stability: https://bugs.php.net/fix.php?id=65800&r=isapi
Install GNU Sed: https://bugs.php.net/fix.php?id=65800&r=gnused
Floating point limitations: https://bugs.php.net/fix.php?id=65800&r=float
No Zend Extensions: https://bugs.php.net/fix.php?id=65800&r=nozend
MySQL Configuration Error: https://bugs.php.net/fix.php?id=65800&r=mysqlcfg