Hi Yann,

Quick update: we've enabled core dumps but haven't been able to reproduce the issue. After removing mod_http2 we were able to trigger the crash after 14 attempts the first time, but we've since tried over 100 times with no luck. We'll keep trying, as there's nothing worse than knowing there's a bug lurking that can cause a crash.

@Deepak - thanks for your suggestion, but hitting MaxRequestWorkers is a quirk of our installation: we load the max workers on startup so that the PHP application is primed and ready, rather than have Apache spawn lots of heavy processes. This is the same configuration we've had for years without ever experiencing Apache hanging until the upgrade to 2.4.48.

Thanks.

Patrick

*--*

*Patrick Verdon | Founder*
Web: www.youreko.com
Mobile: +44 (0)7809 296438
Skype: patrick_verdon

This entire communication is sent on behalf of Youreko Ltd and is strictly confidential to and for the sole use of the intended addressee.

Registered in England - 7448349

On Tue, 19 Oct 2021 at 11:00, Deepak Goel <deic...@gmail.com> wrote:

> Hi
>
> Looks like the step 2 in your process is not working in the upgraded
> version of apache.
>
> Therefore it is vomiting out the following:
> server reached MaxRequestWorkers setting, consider raising the
> MaxRequestWorkers setting
>
> Deepak
> "The greatness of a nation can be judged by the way its animals are
> treated - Mahatma Gandhi"
>
> +91 73500 12833
> deic...@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>
> Make In India : http://www.makeinindia.com/home
>
>
> On Mon, Oct 18, 2021 at 2:57 PM Patrick Verdon <patrick.ver...@youreko.com>
> wrote:
>
>> Hi All,
>>
>> I'd appreciate some feedback on an issue I'm experiencing. I've spent
>> quite some time researching the problem as it causes a serious outage in
>> our application. I've searched the Web, Stack Overflow, this list's mail
>> archives, the latest Apache bugs, and more, but have not been able to find
>> any reports of a similar issue.
>>
>> Background. I'm running the latest Apache 2.4.51 on Amazon Linux with
>> mod_proxy, mod_php and mod_ssl with varnish in front. Some requests to our
>> application take about 45 seconds to complete so there is a warm-up cache
>> procedure at regular intervals during the day which primes the varnish
>> cache. The following steps reliably cause Apache to hang, requiring a
>> manual restart:
>>
>> 1. Varnish cache is cleared, causing spike in load on httpd
>> 2. Warm-up cache process kicks off with 2 long running requests (45
>> seconds each). This is a PHP application running under mod_php - each
>> process grows up to 700 MB, so the application kills the httpd child
>> process at the end to release the memory, using posix_kill(PID, 28).
>> 3. Apache hangs and does not recover. Varnish serves 503s.
>> 4. Manual restart required: service httpd restart
>> 5. Errors in the log show that 2 children had segmentation faults,
>> presumably the 2 with long running processes.
>>
>>
>> Albeit ugly, this process has been running for a year and a half without
>> any issues. We traced the date that crashes started to the date Apache was
>> upgraded from version 2.4.46 to 2.4.48 and as you can see it's still an
>> issue in 2.4.51.
>>
>> See the error_log below and details about the installation.
>>
>> Any feedback on where to report this issue would be much appreciated.
>>
>> Thanks.
>>
>> Patrick
>>
>> --
>>
>> # cat /var/log/httpd/error_log
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> *** Error in `/usr/sbin/httpd': corrupted size vs. prev_size: 0x0000557f94567e4f ***
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal == 0' failed.
>> [Sun Oct 17 15:53:47.990497 2021] [core:notice] [pid 2620] AH00052: child pid 3166 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990531 2021] [core:notice] [pid 2620] AH00052: child pid 3483 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990545 2021] [core:notice] [pid 2620] AH00052: child pid 2657 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990557 2021] [core:notice] [pid 2620] AH00052: child pid 2660 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990568 2021] [core:notice] [pid 2620] AH00052: child pid 2661 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990579 2021] [core:notice] [pid 2620] AH00052: child pid 3172 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990592 2021] [core:notice] [pid 2620] AH00052: child pid 2681 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990603 2021] [core:notice] [pid 2620] AH00052: child pid 3254 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990615 2021] [core:notice] [pid 2620] AH00052: child pid 2685 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990627 2021] [core:notice] [pid 2620] AH00052: child pid 2688 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990639 2021] [core:notice] [pid 2620] AH00052: child pid 3015 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990652 2021] [core:notice] [pid 2620] AH00052: child pid 2696 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990664 2021] [core:notice] [pid 2620] AH00052: child pid 2699 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990680 2021] [core:notice] [pid 2620] AH00052: child pid 2710 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990692 2021] [core:notice] [pid 2620] AH00052: child pid 2713 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990703 2021] [core:notice] [pid 2620] AH00052: child pid 3250 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990716 2021] [core:notice] [pid 2620] AH00052: child pid 2721 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990726 2021] [core:notice] [pid 2620] AH00052: child pid 2724 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990739 2021] [core:notice] [pid 2620] AH00052: child pid 2734 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990750 2021] [core:notice] [pid 2620] AH00052: child pid 3471 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990769 2021] [core:notice] [pid 2620] AH00052: child pid 3109 exit signal Aborted (6)
>> [Sun Oct 17 15:53:47.990781 2021] [core:notice] [pid 2620] AH00052: child pid 2741 exit signal Segmentation fault (11)
>> *** Error in `/usr/sbin/httpd': corrupted size vs. prev_size: 0x0000557f94567e4f ***
>> [Sun Oct 17 15:53:48.056539 2021] [core:notice] [pid 2620] AH00052: child pid 3019 exit signal Aborted (6)
>> [Sun Oct 17 15:53:48.056584 2021] [core:notice] [pid 2620] AH00052: child pid 2707 exit signal Segmentation fault (11)
>> [Sun Oct 17 15:53:48.056599 2021] [core:notice] [pid 2620] AH00052: child pid 2727 exit signal Aborted (6)
>> [Sun Oct 17 15:53:48.056667 2021] [mpm_prefork:notice] [pid 2620] AH00169: caught SIGTERM, shutting down
>> [Sun Oct 17 15:53:48.151770 2021] [suexec:notice] [pid 3575] AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
>> [Sun Oct 17 15:53:48.180621 2021] [http2:warn] [pid 3581] AH10034: The mpm module (prefork.c) is not supported by mod_http2. The mpm determines how things are processed in your server. HTTP/2 has more demands in this regard and the currently selected mpm will just not do. This is an advisory warning. Your server will continue to work, but the HTTP/2 protocol will be inactive.
>> [Sun Oct 17 15:53:48.181146 2021] [lbmethod_heartbeat:notice] [pid 3581] AH02282: No slotmem from mod_heartmonitor
>> [Sun Oct 17 15:53:48.243891 2021] [mpm_prefork:notice] [pid 3581] AH00163: Apache/2.4.51 (Amazon) OpenSSL/1.0.2k-fips configured -- resuming normal operations
>> [Sun Oct 17 15:53:48.243923 2021] [core:notice] [pid 3581] AH00094: Command line: '/usr/sbin/httpd'
>> [Sun Oct 17 15:53:49.244527 2021] [mpm_prefork:error] [pid 3581] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
>>
>> # httpd -V
>> Server version: Apache/2.4.51 (Amazon)
>> Server built: Oct 8 2021 19:30:47
>> Server's Module Magic Number: 20120211:118
>> Server loaded: APR 1.6.3, APR-UTIL 1.5.4
>> Compiled using: APR 1.6.3, APR-UTIL 1.5.4
>> Architecture: 64-bit
>> Server MPM: prefork
>> threaded: no
>> forked: yes (variable process count)
>> Server compiled with....
>> -D APR_HAS_SENDFILE
>> -D APR_HAS_MMAP
>> -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
>> -D APR_USE_SYSVSEM_SERIALIZE
>> -D APR_USE_PTHREAD_SERIALIZE
>> -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
>> -D APR_HAS_OTHER_CHILD
>> -D AP_HAVE_RELIABLE_PIPED_LOGS
>> -D DYNAMIC_MODULE_LIMIT=256
>> -D HTTPD_ROOT="/etc/httpd"
>> -D SUEXEC_BIN="/usr/sbin/suexec"
>> -D DEFAULT_PIDLOG="/var/run/httpd/httpd.pid"
>> -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
>> -D DEFAULT_ERRORLOG="logs/error_log"
>> -D AP_TYPES_CONFIG_FILE="conf/mime.types"
>> -D SERVER_CONFIG_FILE="conf/httpd.conf"
>>
>> # httpd -M
>> Loaded Modules:
>> core_module (static)
>> so_module (static)
>> http_module (static)
>> access_compat_module (shared)
>> actions_module (shared)
>> alias_module (shared)
>> allowmethods_module (shared)
>> auth_basic_module (shared)
>> auth_digest_module (shared)
>> authn_anon_module (shared)
>> authn_core_module (shared)
>> authn_dbd_module (shared)
>> authn_dbm_module (shared)
>> authn_file_module (shared)
>> authn_socache_module (shared)
>> authz_core_module (shared)
>> authz_dbd_module (shared)
>> authz_dbm_module (shared)
>> authz_groupfile_module (shared)
>> authz_host_module (shared)
>> authz_owner_module (shared)
>> authz_user_module (shared)
>> autoindex_module (shared)
>> cache_module (shared)
>> cache_disk_module (shared)
>> cache_socache_module (shared)
>> data_module (shared)
>> dbd_module (shared)
>> deflate_module (shared)
>> dir_module (shared)
>> dumpio_module (shared)
>> echo_module (shared)
>> env_module (shared)
>> expires_module (shared)
>> ext_filter_module (shared)
>> filter_module (shared)
>> headers_module (shared)
>> http2_module (shared)
>> include_module (shared)
>> info_module (shared)
>> log_config_module (shared)
>> logio_module (shared)
>> macro_module (shared)
>> mime_magic_module (shared)
>> mime_module (shared)
>> negotiation_module (shared)
>> remoteip_module (shared)
>> reqtimeout_module (shared)
>> request_module (shared)
>> rewrite_module (shared)
>> setenvif_module (shared)
>> slotmem_plain_module (shared)
>> slotmem_shm_module (shared)
>> socache_dbm_module (shared)
>> socache_memcache_module (shared)
>> socache_shmcb_module (shared)
>> status_module (shared)
>> substitute_module (shared)
>> suexec_module (shared)
>> unixd_module (shared)
>> userdir_module (shared)
>> version_module (shared)
>> vhost_alias_module (shared)
>> watchdog_module (shared)
>> dav_module (shared)
>> dav_fs_module (shared)
>> dav_lock_module (shared)
>> lua_module (shared)
>> mpm_prefork_module (shared)
>> proxy_module (shared)
>> lbmethod_bybusyness_module (shared)
>> lbmethod_byrequests_module (shared)
>> lbmethod_bytraffic_module (shared)
>> lbmethod_heartbeat_module (shared)
>> proxy_ajp_module (shared)
>> proxy_balancer_module (shared)
>> proxy_connect_module (shared)
>> proxy_express_module (shared)
>> proxy_fcgi_module (shared)
>> proxy_fdpass_module (shared)
>> proxy_ftp_module (shared)
>> proxy_http_module (shared)
>> proxy_hcheck_module (shared)
>> proxy_scgi_module (shared)
>> proxy_uwsgi_module (shared)
>> proxy_wstunnel_module (shared)
>> ssl_module (shared)
>> cgi_module (shared)
>> php7_module (shared)
>>
>> # yum list | grep mod_
>> lighttpd-mod_authn_gssapi.x86_64 1.4.53-1.36.amzn1 amzn-updates
>> lighttpd-mod_authn_mysql.x86_64 1.4.53-1.36.amzn1 amzn-updates
>> lighttpd-mod_authn_pam.x86_64 1.4.53-1.36.amzn1 amzn-updates
>> lighttpd-mod_geoip.x86_64 1.4.53-1.36.amzn1 amzn-updates
>> lighttpd-mod_mysql_vhost.x86_64 1.4.53-1.36.amzn1 amzn-updates
>> mod_auth_kerb.x86_64 5.4-10.9.amzn1 amzn-main
>> mod_auth_mellon.x86_64 0.13.1-1.6.amzn1 amzn-updates
>> mod_auth_mysql.x86_64 1:3.0.0-18.10.amzn1 amzn-main
>> mod_auth_pgsql.x86_64 2.0.3-10.1.5.amzn1 amzn-main
>> mod_authz_ldap.x86_64 0.26-16.8.amzn1 amzn-main
>> mod_dav_svn.x86_64 1.9.7-1.54.amzn1 amzn-main
>> mod_fcgid.x86_64 2.3.9-1.6.amzn1 amzn-main
>> mod_geoip.x86_64 1.2.7-1.2.amzn1 amzn-main
>> mod_nss.x86_64 1.0.10-1.13.amzn1 amzn-main
>> mod_perl.x86_64 2.0.7-7.28.amzn1 amzn-updates
>> mod_perl-devel.x86_64 2.0.7-7.28.amzn1 amzn-updates
>> mod_proxy_html.x86_64 3.1.2-7.3.amzn1 amzn-main
>> mod_python26.x86_64 3.3.1-17.20.amzn1 amzn-main
>> mod_python27.x86_64 3.3.1-17.20.amzn1 amzn-main
>> mod_security.x86_64 2.8.0-5.27.amzn1 amzn-main
>> mod_security_crs.noarch 2.2.8-2.5.amzn1 amzn-main
>> mod_security_crs-extras.noarch 2.2.8-2.5.amzn1 amzn-main
>> mod_ssl.x86_64 1:2.2.34-1.16.amzn1 amzn-main
>> mod_wsgi-python26.x86_64 3.2-6.12.amzn1 amzn-updates
>> mod_wsgi-python27.x86_64 3.2-6.12.amzn1 amzn-updates
>>
>> *--*
>>
>> *Patrick Verdon | Founder*
>> Web: www.youreko.com
>> Mobile: +44 (0)7809 296438
>> Skype: patrick_verdon
>>
>> This entire communication is sent on behalf of
>> Youreko Ltd and is strictly confidential to and
>> for the sole use of the intended addressee.
>>
>> Registered in England - 7448349
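
For anyone triaging a log like the one quoted above, here is a minimal sketch that tallies how many children exited on each signal, based on the AH00052 notices. It is only an illustration, not part of the original report: the function name `tally_exit_signals` and the regular expression are assumptions, and it expects unwrapped error_log lines (one entry per line, without the `>>` quote prefixes added by email).

```python
import re
from collections import Counter

# Matches the parent's AH00052 notice, for example:
#   "... AH00052: child pid 2741 exit signal Segmentation fault (11)"
# Group 1 is the child pid, group 2 the signal name, group 3 the signal number.
EXIT_RE = re.compile(r"AH00052: child pid (\d+) exit signal (.+?) \((\d+)\)")


def tally_exit_signals(lines):
    """Count child exits per signal name from an iterable of error_log lines."""
    counts = Counter()
    for line in lines:
        match = EXIT_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# Three lines copied from the quoted log, plus one unrelated line that is ignored.
sample = [
    "[Sun Oct 17 15:53:47.990769 2021] [core:notice] [pid 2620] AH00052: child pid 3109 exit signal Aborted (6)",
    "[Sun Oct 17 15:53:47.990781 2021] [core:notice] [pid 2620] AH00052: child pid 2741 exit signal Segmentation fault (11)",
    "[Sun Oct 17 15:53:48.056539 2021] [core:notice] [pid 2620] AH00052: child pid 3019 exit signal Aborted (6)",
    "[Sun Oct 17 15:53:48.056667 2021] [mpm_prefork:notice] [pid 2620] AH00169: caught SIGTERM, shutting down",
]

if __name__ == "__main__":
    for signal_name, count in tally_exit_signals(sample).most_common():
        print(f"{signal_name}: {count}")
```

To run it against a real log, pass an open file instead of the sample list, for example `tally_exit_signals(open("/var/log/httpd/error_log"))`. Separating Aborted from Segmentation fault exits this way makes it easier to see whether the segfaults line up with the two long-running requests.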