> On 18 Oct 2021, at 11:27, Patrick Verdon <patrick.ver...@youreko.com> wrote:
> 
> 
> Hi All,
> 
> I'd appreciate some feedback on an issue I'm experiencing. I've spent quite 
> some time researching the problem as it causes a serious outage in our 
> application. I've searched the Web, Stack Overflow, this list's mail 
> archives, the latest Apache bugs, and more, but have not been able to find 
> any reports of a similar issue.
> 
> Background. I'm running the latest Apache 2.4.51 on Amazon Linux with 
> mod_proxy, mod_php and mod_ssl with Varnish in front. Some requests to our 
> application take about 45 seconds to complete, so there is a cache warm-up 
> procedure at regular intervals during the day which primes the Varnish cache. 
> The following steps reliably cause Apache to hang, requiring a manual restart:
> 1. Varnish cache is cleared, causing a spike in load on httpd.
> 2. Warm-up cache process kicks off with 2 long-running requests (45 seconds 
>    each). This is a PHP application running under mod_php - each process grows 
>    up to 700 MB, so the application kills the httpd child process at the end to 
>    release the memory, using posix_kill(PID, 28).
> 3. Apache hangs and does not recover. Varnish serves 503s.

Step 3 seems to be a result of the failed step 2. Have you tried stracing the 
long-running script on CLI?
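
A minimal sketch of what I mean, assuming strace and the PHP CLI are installed. The script path and the trace file name are placeholders, not taken from your setup:

```shell
#!/bin/sh
# Placeholder values (assumptions): point SCRIPT at the real warm-up script.
SCRIPT="${SCRIPT:-/path/to/warmup.php}"
TRACE_OUT="${TRACE_OUT:-/tmp/warmup.strace}"

if command -v strace >/dev/null 2>&1 \
   && command -v php >/dev/null 2>&1 \
   && [ -f "$SCRIPT" ]; then
    # -f  follow child processes
    # -tt prefix each line with a microsecond timestamp
    # -o  write the trace to a file instead of stderr
    strace -f -tt -o "$TRACE_OUT" php "$SCRIPT"
    # The last syscalls before exit are usually the interesting part.
    tail -n 20 "$TRACE_OUT"
else
    echo "strace, php or $SCRIPT not available; nothing traced"
fi
```

If the script signals its own PID at the end, under the CLI that will hit the php process rather than an httpd child, so the run only shows what the script does up to that point.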

> 4. Manual restart required: service httpd restart
> 5. Errors in the log show that 2 children had segmentation faults, presumably 
>    the 2 with long-running processes.
> 
> Albeit ugly, this process has been running for a year and a half without any 
> issues. We traced the date that crashes started to the date Apache was 
> upgraded from version 2.4.46 to 2.4.48 and as you can see it's still an issue 
> in 2.4.51.
> 
> See the error_log below and details about the installation.
> 
> Any feedback on where to report this issue would be much appreciated.
> 
> Thanks.
> 
> Patrick
> 
> --
> 
> # cat /var/log/httpd/error_log
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> *** Error in `/usr/sbin/httpd': corrupted size vs. prev_size: 
> 0x0000557f94567e4f ***
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' 
> failed.
> [Sun Oct 17 15:53:47.990497 2021] [core:notice] [pid 2620] AH00052: child pid 
> 3166 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990531 2021] [core:notice] [pid 2620] AH00052: child pid 
> 3483 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990545 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2657 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990557 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2660 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990568 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2661 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990579 2021] [core:notice] [pid 2620] AH00052: child pid 
> 3172 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990592 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2681 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990603 2021] [core:notice] [pid 2620] AH00052: child pid 
> 3254 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990615 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2685 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990627 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2688 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990639 2021] [core:notice] [pid 2620] AH00052: child pid 
> 3015 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990652 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2696 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990664 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2699 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990680 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2710 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990692 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2713 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990703 2021] [core:notice] [pid 2620] AH00052: child pid 
> 3250 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990716 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2721 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990726 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2724 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990739 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2734 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990750 2021] [core:notice] [pid 2620] AH00052: child pid 
> 3471 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990769 2021] [core:notice] [pid 2620] AH00052: child pid 
> 3109 exit signal Aborted (6)
> [Sun Oct 17 15:53:47.990781 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2741 exit signal Segmentation fault (11)
> *** Error in `/usr/sbin/httpd': corrupted size vs. prev_size: 
> 0x0000557f94567e4f ***
> [Sun Oct 17 15:53:48.056539 2021] [core:notice] [pid 2620] AH00052: child pid 
> 3019 exit signal Aborted (6)
> [Sun Oct 17 15:53:48.056584 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2707 exit signal Segmentation fault (11)
> [Sun Oct 17 15:53:48.056599 2021] [core:notice] [pid 2620] AH00052: child pid 
> 2727 exit signal Aborted (6)
> [Sun Oct 17 15:53:48.056667 2021] [mpm_prefork:notice] [pid 2620] AH00169: 
> caught SIGTERM, shutting down
> [Sun Oct 17 15:53:48.151770 2021] [suexec:notice] [pid 3575] AH01232: suEXEC 
> mechanism enabled (wrapper: /usr/sbin/suexec)
> [Sun Oct 17 15:53:48.180621 2021] [http2:warn] [pid 3581] AH10034: The mpm 
> module (prefork.c) is not supported by mod_http2. The mpm determines how 
> things are processed in your server. HTTP/2 has more demands in this regard 
> and the currently selected mpm will just not do. This is an advisory warning. 
> Your server will continue to work, but the HTTP/2 protocol will be inactive.
> [Sun Oct 17 15:53:48.181146 2021] [lbmethod_heartbeat:notice] [pid 3581] 
> AH02282: No slotmem from mod_heartmonitor
> [Sun Oct 17 15:53:48.243891 2021] [mpm_prefork:notice] [pid 3581] AH00163: 
> Apache/2.4.51 (Amazon) OpenSSL/1.0.2k-fips configured -- resuming normal 
> operations
> [Sun Oct 17 15:53:48.243923 2021] [core:notice] [pid 3581] AH00094: Command 
> line: '/usr/sbin/httpd'
> [Sun Oct 17 15:53:49.244527 2021] [mpm_prefork:error] [pid 3581] AH00161: 
> server reached MaxRequestWorkers setting, consider raising the 
> MaxRequestWorkers setting
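
To see at a glance how the children died, the AH00052 lines can be grouped by exit signal. This is only a sketch: the log path is taken from the cat command above, the function name is made up for illustration, and it relies on grep -o (available in GNU and BSD grep). It also assumes each log entry is on one line in the real file, unlike the wrapped copy in this mail:

```shell
#!/bin/sh
# LOG defaults to the path used in the quoted `cat` command (assumption).
LOG="${LOG:-/var/log/httpd/error_log}"

# summarize_exits is a hypothetical helper name.
# Reads log lines on stdin and prints each distinct "exit signal ..."
# with its count, most frequent first.
summarize_exits() {
    grep -o 'exit signal [A-Za-z ]* ([0-9]*)' | sort | uniq -c | sort -rn
}

if [ -r "$LOG" ]; then
    summarize_exits < "$LOG"
else
    echo "cannot read $LOG"
fi
```

In the excerpt quoted here most children exit with Aborted (6), which matches the reslist_cleanup assertion, and only two with Segmentation fault (11).
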
> 
> # httpd -V
> Server version: Apache/2.4.51 (Amazon)
> Server built:   Oct  8 2021 19:30:47
> Server's Module Magic Number: 20120211:118
> Server loaded:  APR 1.6.3, APR-UTIL 1.5.4
> Compiled using: APR 1.6.3, APR-UTIL 1.5.4
> Architecture:   64-bit
> Server MPM:     prefork
>   threaded:     no
>     forked:     yes (variable process count)
> Server compiled with....
>  -D APR_HAS_SENDFILE
>  -D APR_HAS_MMAP
>  -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
>  -D APR_USE_SYSVSEM_SERIALIZE
>  -D APR_USE_PTHREAD_SERIALIZE
>  -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
>  -D APR_HAS_OTHER_CHILD
>  -D AP_HAVE_RELIABLE_PIPED_LOGS
>  -D DYNAMIC_MODULE_LIMIT=256
>  -D HTTPD_ROOT="/etc/httpd"
>  -D SUEXEC_BIN="/usr/sbin/suexec"
>  -D DEFAULT_PIDLOG="/var/run/httpd/httpd.pid"
>  -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
>  -D DEFAULT_ERRORLOG="logs/error_log"
>  -D AP_TYPES_CONFIG_FILE="conf/mime.types"
>  -D SERVER_CONFIG_FILE="conf/httpd.conf"
> 
> # httpd -M
> Loaded Modules:
>  core_module (static)
>  so_module (static)
>  http_module (static)
>  access_compat_module (shared)
>  actions_module (shared)
>  alias_module (shared)
>  allowmethods_module (shared)
>  auth_basic_module (shared)
>  auth_digest_module (shared)
>  authn_anon_module (shared)
>  authn_core_module (shared)
>  authn_dbd_module (shared)
>  authn_dbm_module (shared)
>  authn_file_module (shared)
>  authn_socache_module (shared)
>  authz_core_module (shared)
>  authz_dbd_module (shared)
>  authz_dbm_module (shared)
>  authz_groupfile_module (shared)
>  authz_host_module (shared)
>  authz_owner_module (shared)
>  authz_user_module (shared)
>  autoindex_module (shared)
>  cache_module (shared)
>  cache_disk_module (shared)
>  cache_socache_module (shared)
>  data_module (shared)
>  dbd_module (shared)
>  deflate_module (shared)
>  dir_module (shared)
>  dumpio_module (shared)
>  echo_module (shared)
>  env_module (shared)
>  expires_module (shared)
>  ext_filter_module (shared)
>  filter_module (shared)
>  headers_module (shared)
>  http2_module (shared)
>  include_module (shared)
>  info_module (shared)
>  log_config_module (shared)
>  logio_module (shared)
>  macro_module (shared)
>  mime_magic_module (shared)
>  mime_module (shared)
>  negotiation_module (shared)
>  remoteip_module (shared)
>  reqtimeout_module (shared)
>  request_module (shared)
>  rewrite_module (shared)
>  setenvif_module (shared)
>  slotmem_plain_module (shared)
>  slotmem_shm_module (shared)
>  socache_dbm_module (shared)
>  socache_memcache_module (shared)
>  socache_shmcb_module (shared)
>  status_module (shared)
>  substitute_module (shared)
>  suexec_module (shared)
>  unixd_module (shared)
>  userdir_module (shared)
>  version_module (shared)
>  vhost_alias_module (shared)
>  watchdog_module (shared)
>  dav_module (shared)
>  dav_fs_module (shared)
>  dav_lock_module (shared)
>  lua_module (shared)
>  mpm_prefork_module (shared)
>  proxy_module (shared)
>  lbmethod_bybusyness_module (shared)
>  lbmethod_byrequests_module (shared)
>  lbmethod_bytraffic_module (shared)
>  lbmethod_heartbeat_module (shared)
>  proxy_ajp_module (shared)
>  proxy_balancer_module (shared)
>  proxy_connect_module (shared)
>  proxy_express_module (shared)
>  proxy_fcgi_module (shared)
>  proxy_fdpass_module (shared)
>  proxy_ftp_module (shared)
>  proxy_http_module (shared)
>  proxy_hcheck_module (shared)
>  proxy_scgi_module (shared)
>  proxy_uwsgi_module (shared)
>  proxy_wstunnel_module (shared)
>  ssl_module (shared)
>  cgi_module (shared)
>  php7_module (shared)
> 
> # yum list | grep mod_
> lighttpd-mod_authn_gssapi.x86_64     1.4.53-1.36.amzn1             amzn-updates
> lighttpd-mod_authn_mysql.x86_64      1.4.53-1.36.amzn1             amzn-updates
> lighttpd-mod_authn_pam.x86_64        1.4.53-1.36.amzn1             amzn-updates
> lighttpd-mod_geoip.x86_64            1.4.53-1.36.amzn1             amzn-updates
> lighttpd-mod_mysql_vhost.x86_64      1.4.53-1.36.amzn1             amzn-updates
> mod_auth_kerb.x86_64                 5.4-10.9.amzn1                amzn-main
> mod_auth_mellon.x86_64               0.13.1-1.6.amzn1              amzn-updates
> mod_auth_mysql.x86_64                1:3.0.0-18.10.amzn1           amzn-main
> mod_auth_pgsql.x86_64                2.0.3-10.1.5.amzn1            amzn-main
> mod_authz_ldap.x86_64                0.26-16.8.amzn1               amzn-main
> mod_dav_svn.x86_64                   1.9.7-1.54.amzn1              amzn-main
> mod_fcgid.x86_64                     2.3.9-1.6.amzn1               amzn-main
> mod_geoip.x86_64                     1.2.7-1.2.amzn1               amzn-main
> mod_nss.x86_64                       1.0.10-1.13.amzn1             amzn-main
> mod_perl.x86_64                      2.0.7-7.28.amzn1              amzn-updates
> mod_perl-devel.x86_64                2.0.7-7.28.amzn1              amzn-updates
> mod_proxy_html.x86_64                3.1.2-7.3.amzn1               amzn-main
> mod_python26.x86_64                  3.3.1-17.20.amzn1             amzn-main
> mod_python27.x86_64                  3.3.1-17.20.amzn1             amzn-main
> mod_security.x86_64                  2.8.0-5.27.amzn1              amzn-main
> mod_security_crs.noarch              2.2.8-2.5.amzn1               amzn-main
> mod_security_crs-extras.noarch       2.2.8-2.5.amzn1               amzn-main
> mod_ssl.x86_64                       1:2.2.34-1.16.amzn1           amzn-main
> mod_wsgi-python26.x86_64             3.2-6.12.amzn1                amzn-updates
> mod_wsgi-python27.x86_64             3.2-6.12.amzn1                amzn-updates
> 
> --
> 
> Patrick Verdon  |  Founder
> Web: www.youreko.com
> Mobile: +44 (0)7809 296438
> Skype: patrick_verdon
> 
> This entire communication is sent on behalf of 
> Youreko Ltd and is strictly confidential to and 
> for the sole use of the intended addressee.
> 
> Registered in England - 7448349  
>  
