We will then have to look into what is happening in step 2 (probably by
adding debugging code):

Warm-up cache process kicks off with 2 long running requests (45 seconds
each). This is a PHP application running under mod_php - each process grows
up to 700 MB, so the application kills the httpd child process at the end
to release the memory, using posix_kill(PID, 28).
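
Before adding debugging code to the application itself, it may help to
summarise which children died and with which signal. Below is a minimal
sketch in Python; the language choice, the function name
summarize_child_exits and the embedded sample are my own assumptions for
illustration, not something from Patrick's setup. It only relies on the
AH00052 "child pid N exit signal ..." line format visible in the error_log
posted in this thread.

```python
import re
from collections import Counter

# Matches entries such as:
#   [...] [core:notice] [pid 2620] AH00052: child pid 3166 exit signal Aborted (6)
CHILD_EXIT = re.compile(r"AH00052:\s+child pid (\d+) exit signal (.+?) \((\d+)\)")


def summarize_child_exits(log_text):
    """Return (Counter of signal names, dict mapping child pid -> signal name)."""
    # Collapse all whitespace so entries that were wrapped across lines
    # (as in a pasted log) still match the pattern.
    flat = " ".join(log_text.split())
    by_pid = {}
    for pid, signal_name, _signal_number in CHILD_EXIT.findall(flat):
        by_pid[int(pid)] = signal_name
    return Counter(by_pid.values()), by_pid


# Hypothetical two-entry sample in the same format as the posted log.
sample = (
    "[Sun Oct 17 15:53:47.990497 2021] [core:notice] [pid 2620] AH00052:\n"
    "child pid 3166 exit signal Aborted (6)\n"
    "[Sun Oct 17 15:53:47.990781 2021] [core:notice] [pid 2620] AH00052:\n"
    "child pid 2741 exit signal Segmentation fault (11)\n"
)
counts, by_pid = summarize_child_exits(sample)
print(counts)
print(by_pid)
```

To use it on the real log, pass the contents of /var/log/httpd/error_log to
summarize_child_exits() and compare the segfaulting pids with the pids that
served the two long running warm-up requests (the application would need to
log its own pid for that comparison, which is an assumption on my part).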


Deepak
"The greatness of a nation can be judged by the way its animals are treated
- Mahatma Gandhi"

+91 73500 12833
deic...@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

Make In India : http://www.makeinindia.com/home


On Fri, Oct 22, 2021 at 3:07 PM Patrick Verdon <patrick.ver...@youreko.com>
wrote:

> Correct.
>
>
> On Fri, 22 Oct 2021 at 10:35, Deepak Goel <deic...@gmail.com> wrote:
>
>> I guess what you are saying is that the following error happens during
>> startup and not during normal operation
>>
>> ( [Sun Oct 17 15:53:49.244527 2021] [mpm_prefork:error] [pid 3581]
>> AH00161: server reached MaxRequestWorkers setting, consider raising the
>> MaxRequestWorkers setting)
>>
>>
>> Deepak
>>
>>
>> On Fri, Oct 22, 2021 at 2:23 PM Patrick Verdon <
>> patrick.ver...@youreko.com> wrote:
>>
>>> Hi Yann,
>>>
>>> Quick update - we've enabled the core dumps but haven't been able to
>>> reproduce the issue. After removing mod_http2 the first time we were able
>>> to trigger the crash after 14 attempts but we've since tried over 100 times
>>> with no luck. We'll keep trying as there's nothing worse than knowing
>>> there's a bug lurking that can cause a crash.
>>>
>>> @Deepak - thanks for your suggestion, but hitting MaxRequestWorkers is a
>>> quirk of our installation: we load the max workers on startup so that the
>>> PHP application is primed and ready, rather than have Apache spawn lots of
>>> heavy processes. This is the same configuration we've had for years without
>>> ever experiencing Apache hanging until the upgrade to 2.4.48.
>>>
>>> Thanks.
>>>
>>> Patrick
>>>
>>> *--*
>>>
>>> *Patrick Verdon  |  Founder*
>>> Web: www.youreko.com
>>> Mobile: +44 (0)7809 296438
>>> Skype: patrick_verdon
>>>
>>> This entire communication is sent on behalf of
>>> Youreko Ltd and is strictly confidential to and
>>> for the sole use of the intended addressee.
>>>
>>> Registered in England - 7448349
>>>
>>>
>>>
>>> On Tue, 19 Oct 2021 at 11:00, Deepak Goel <deic...@gmail.com> wrote:
>>>
>>>> Hi
>>>>
>>>> It looks like step 2 in your process is not working in the upgraded
>>>> version of Apache.
>>>>
>>>> As a result, it is emitting the following:
>>>>  server reached MaxRequestWorkers setting, consider raising the
>>>> MaxRequestWorkers setting
>>>>
>>>> Deepak
>>>>
>>>>
>>>> On Mon, Oct 18, 2021 at 2:57 PM Patrick Verdon <
>>>> patrick.ver...@youreko.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I'd appreciate some feedback on an issue I'm experiencing. I've spent
>>>>> quite some time researching the problem as it causes a serious outage in
>>>>> our application. I've searched the Web, Stack Overflow, this list's mail
>>>>> archives, the latest Apache bugs, and more, but have not been able to find
>>>>> any reports of a similar issue.
>>>>>
>>>>> Background. I'm running the latest Apache 2.4.51 on Amazon Linux with
>>>>> mod_proxy, mod_php and mod_ssl with varnish in front. Some requests to our
>>>>> application take about 45 seconds to complete so there is a warm-up cache
>>>>> procedure at regular intervals during the day which primes the varnish
>>>>> cache. The following steps reliably cause Apache to hang, requiring a
>>>>> manual restart:
>>>>>
>>>>>    1. Varnish cache is cleared, causing spike in load on httpd
>>>>>    2. Warm-up cache process kicks off with 2 long running requests
>>>>>    (45 seconds each). This is a PHP application running under mod_php -
>>>>>    each process grows up to 700 MB, so the application kills the httpd
>>>>>    child process at the end to release the memory, using
>>>>>    posix_kill(PID, 28).
>>>>>    3. Apache hangs and does not recover. Varnish serves 503s.
>>>>>    4. Manual restart required: service httpd restart
>>>>>    5. Errors in the log show that 2 children had segmentation faults,
>>>>>    presumably the 2 with long running processes.
>>>>>
>>>>>
>>>>> Albeit ugly, this process has been running for a year and a half
>>>>> without any issues. We traced the date that crashes started to the date
>>>>> Apache was upgraded from version 2.4.46 to 2.4.48 and as you can see it's
>>>>> still an issue in 2.4.51.
>>>>>
>>>>> See the error_log below and details about the installation.
>>>>>
>>>>> Any feedback on where to report this issue would be much appreciated.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Patrick
>>>>>
>>>>> --
>>>>>
>>>>> # cat /var/log/httpd/error_log
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> *** Error in `/usr/sbin/httpd': corrupted size vs. prev_size:
>>>>> 0x0000557f94567e4f ***
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> httpd: misc/apr_reslist.c:161: reslist_cleanup: Assertion `rl->ntotal
>>>>> == 0' failed.
>>>>> [Sun Oct 17 15:53:47.990497 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 3166 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990531 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 3483 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990545 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2657 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990557 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2660 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990568 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2661 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990579 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 3172 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990592 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2681 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990603 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 3254 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990615 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2685 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990627 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2688 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990639 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 3015 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990652 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2696 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990664 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2699 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990680 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2710 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990692 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2713 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990703 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 3250 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990716 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2721 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990726 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2724 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990739 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2734 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990750 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 3471 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990769 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 3109 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:47.990781 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2741 exit signal Segmentation fault (11)
>>>>> *** Error in `/usr/sbin/httpd': corrupted size vs. prev_size:
>>>>> 0x0000557f94567e4f ***
>>>>> [Sun Oct 17 15:53:48.056539 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 3019 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:48.056584 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2707 exit signal Segmentation fault (11)
>>>>> [Sun Oct 17 15:53:48.056599 2021] [core:notice] [pid 2620] AH00052:
>>>>> child pid 2727 exit signal Aborted (6)
>>>>> [Sun Oct 17 15:53:48.056667 2021] [mpm_prefork:notice] [pid 2620]
>>>>> AH00169: caught SIGTERM, shutting down
>>>>> [Sun Oct 17 15:53:48.151770 2021] [suexec:notice] [pid 3575] AH01232:
>>>>> suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
>>>>> [Sun Oct 17 15:53:48.180621 2021] [http2:warn] [pid 3581] AH10034: The
>>>>> mpm module (prefork.c) is not supported by mod_http2. The mpm determines
>>>>> how things are processed in your server. HTTP/2 has more demands in this
>>>>> regard and the currently selected mpm will just not do. This is an
>>>>> advisory warning. Your server will continue to work, but the HTTP/2
>>>>> protocol will be inactive.
>>>>> [Sun Oct 17 15:53:48.181146 2021] [lbmethod_heartbeat:notice] [pid
>>>>> 3581] AH02282: No slotmem from mod_heartmonitor
>>>>> [Sun Oct 17 15:53:48.243891 2021] [mpm_prefork:notice] [pid 3581]
>>>>> AH00163: Apache/2.4.51 (Amazon) OpenSSL/1.0.2k-fips configured -- resuming
>>>>> normal operations
>>>>> [Sun Oct 17 15:53:48.243923 2021] [core:notice] [pid 3581] AH00094:
>>>>> Command line: '/usr/sbin/httpd'
>>>>> [Sun Oct 17 15:53:49.244527 2021] [mpm_prefork:error] [pid 3581]
>>>>> AH00161: server reached MaxRequestWorkers setting, consider raising the
>>>>> MaxRequestWorkers setting
>>>>>
>>>>> # httpd -V
>>>>> Server version: Apache/2.4.51 (Amazon)
>>>>> Server built:   Oct  8 2021 19:30:47
>>>>> Server's Module Magic Number: 20120211:118
>>>>> Server loaded:  APR 1.6.3, APR-UTIL 1.5.4
>>>>> Compiled using: APR 1.6.3, APR-UTIL 1.5.4
>>>>> Architecture:   64-bit
>>>>> Server MPM:     prefork
>>>>>   threaded:     no
>>>>>     forked:     yes (variable process count)
>>>>> Server compiled with....
>>>>>  -D APR_HAS_SENDFILE
>>>>>  -D APR_HAS_MMAP
>>>>>  -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
>>>>>  -D APR_USE_SYSVSEM_SERIALIZE
>>>>>  -D APR_USE_PTHREAD_SERIALIZE
>>>>>  -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
>>>>>  -D APR_HAS_OTHER_CHILD
>>>>>  -D AP_HAVE_RELIABLE_PIPED_LOGS
>>>>>  -D DYNAMIC_MODULE_LIMIT=256
>>>>>  -D HTTPD_ROOT="/etc/httpd"
>>>>>  -D SUEXEC_BIN="/usr/sbin/suexec"
>>>>>  -D DEFAULT_PIDLOG="/var/run/httpd/httpd.pid"
>>>>>  -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
>>>>>  -D DEFAULT_ERRORLOG="logs/error_log"
>>>>>  -D AP_TYPES_CONFIG_FILE="conf/mime.types"
>>>>>  -D SERVER_CONFIG_FILE="conf/httpd.conf"
>>>>>
>>>>> # httpd -M
>>>>> Loaded Modules:
>>>>>  core_module (static)
>>>>>  so_module (static)
>>>>>  http_module (static)
>>>>>  access_compat_module (shared)
>>>>>  actions_module (shared)
>>>>>  alias_module (shared)
>>>>>  allowmethods_module (shared)
>>>>>  auth_basic_module (shared)
>>>>>  auth_digest_module (shared)
>>>>>  authn_anon_module (shared)
>>>>>  authn_core_module (shared)
>>>>>  authn_dbd_module (shared)
>>>>>  authn_dbm_module (shared)
>>>>>  authn_file_module (shared)
>>>>>  authn_socache_module (shared)
>>>>>  authz_core_module (shared)
>>>>>  authz_dbd_module (shared)
>>>>>  authz_dbm_module (shared)
>>>>>  authz_groupfile_module (shared)
>>>>>  authz_host_module (shared)
>>>>>  authz_owner_module (shared)
>>>>>  authz_user_module (shared)
>>>>>  autoindex_module (shared)
>>>>>  cache_module (shared)
>>>>>  cache_disk_module (shared)
>>>>>  cache_socache_module (shared)
>>>>>  data_module (shared)
>>>>>  dbd_module (shared)
>>>>>  deflate_module (shared)
>>>>>  dir_module (shared)
>>>>>  dumpio_module (shared)
>>>>>  echo_module (shared)
>>>>>  env_module (shared)
>>>>>  expires_module (shared)
>>>>>  ext_filter_module (shared)
>>>>>  filter_module (shared)
>>>>>  headers_module (shared)
>>>>>  http2_module (shared)
>>>>>  include_module (shared)
>>>>>  info_module (shared)
>>>>>  log_config_module (shared)
>>>>>  logio_module (shared)
>>>>>  macro_module (shared)
>>>>>  mime_magic_module (shared)
>>>>>  mime_module (shared)
>>>>>  negotiation_module (shared)
>>>>>  remoteip_module (shared)
>>>>>  reqtimeout_module (shared)
>>>>>  request_module (shared)
>>>>>  rewrite_module (shared)
>>>>>  setenvif_module (shared)
>>>>>  slotmem_plain_module (shared)
>>>>>  slotmem_shm_module (shared)
>>>>>  socache_dbm_module (shared)
>>>>>  socache_memcache_module (shared)
>>>>>  socache_shmcb_module (shared)
>>>>>  status_module (shared)
>>>>>  substitute_module (shared)
>>>>>  suexec_module (shared)
>>>>>  unixd_module (shared)
>>>>>  userdir_module (shared)
>>>>>  version_module (shared)
>>>>>  vhost_alias_module (shared)
>>>>>  watchdog_module (shared)
>>>>>  dav_module (shared)
>>>>>  dav_fs_module (shared)
>>>>>  dav_lock_module (shared)
>>>>>  lua_module (shared)
>>>>>  mpm_prefork_module (shared)
>>>>>  proxy_module (shared)
>>>>>  lbmethod_bybusyness_module (shared)
>>>>>  lbmethod_byrequests_module (shared)
>>>>>  lbmethod_bytraffic_module (shared)
>>>>>  lbmethod_heartbeat_module (shared)
>>>>>  proxy_ajp_module (shared)
>>>>>  proxy_balancer_module (shared)
>>>>>  proxy_connect_module (shared)
>>>>>  proxy_express_module (shared)
>>>>>  proxy_fcgi_module (shared)
>>>>>  proxy_fdpass_module (shared)
>>>>>  proxy_ftp_module (shared)
>>>>>  proxy_http_module (shared)
>>>>>  proxy_hcheck_module (shared)
>>>>>  proxy_scgi_module (shared)
>>>>>  proxy_uwsgi_module (shared)
>>>>>  proxy_wstunnel_module (shared)
>>>>>  ssl_module (shared)
>>>>>  cgi_module (shared)
>>>>>  php7_module (shared)
>>>>>
>>>>> # yum list | grep mod_
>>>>> lighttpd-mod_authn_gssapi.x86_64     1.4.53-1.36.amzn1       amzn-updates
>>>>> lighttpd-mod_authn_mysql.x86_64      1.4.53-1.36.amzn1       amzn-updates
>>>>> lighttpd-mod_authn_pam.x86_64        1.4.53-1.36.amzn1       amzn-updates
>>>>> lighttpd-mod_geoip.x86_64            1.4.53-1.36.amzn1       amzn-updates
>>>>> lighttpd-mod_mysql_vhost.x86_64      1.4.53-1.36.amzn1       amzn-updates
>>>>> mod_auth_kerb.x86_64                 5.4-10.9.amzn1          amzn-main
>>>>> mod_auth_mellon.x86_64               0.13.1-1.6.amzn1        amzn-updates
>>>>> mod_auth_mysql.x86_64                1:3.0.0-18.10.amzn1     amzn-main
>>>>> mod_auth_pgsql.x86_64                2.0.3-10.1.5.amzn1      amzn-main
>>>>> mod_authz_ldap.x86_64                0.26-16.8.amzn1         amzn-main
>>>>> mod_dav_svn.x86_64                   1.9.7-1.54.amzn1        amzn-main
>>>>> mod_fcgid.x86_64                     2.3.9-1.6.amzn1         amzn-main
>>>>> mod_geoip.x86_64                     1.2.7-1.2.amzn1         amzn-main
>>>>> mod_nss.x86_64                       1.0.10-1.13.amzn1       amzn-main
>>>>> mod_perl.x86_64                      2.0.7-7.28.amzn1        amzn-updates
>>>>> mod_perl-devel.x86_64                2.0.7-7.28.amzn1        amzn-updates
>>>>> mod_proxy_html.x86_64                3.1.2-7.3.amzn1         amzn-main
>>>>> mod_python26.x86_64                  3.3.1-17.20.amzn1       amzn-main
>>>>> mod_python27.x86_64                  3.3.1-17.20.amzn1       amzn-main
>>>>> mod_security.x86_64                  2.8.0-5.27.amzn1        amzn-main
>>>>> mod_security_crs.noarch              2.2.8-2.5.amzn1         amzn-main
>>>>> mod_security_crs-extras.noarch       2.2.8-2.5.amzn1         amzn-main
>>>>> mod_ssl.x86_64                       1:2.2.34-1.16.amzn1     amzn-main
>>>>> mod_wsgi-python26.x86_64             3.2-6.12.amzn1          amzn-updates
>>>>> mod_wsgi-python27.x86_64             3.2-6.12.amzn1          amzn-updates
>>>>>
>>>>>
>>>>>
>>>>
