Yeah, a 408 is not easy to troubleshoot. It might be anything from a malformed HTTP header or a slow client (e.g. a genuine DoS attack in which a malicious client sends the headers and then delays sending the request body in order to keep the connection open and exhaust server-side resources) to a legitimate network issue.
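If the slow-client case is a suspect, a rough sketch along these lines might help narrow it down. It is an illustration under assumptions, not a known fix: the timeout and rate values are only examples, and the per-module LogLevel specifiers assume that mod_reqtimeout, mod_proxy and mod_proxy_http are loaded alongside the event MPM.

```apache
# Illustrative values only; tune them for your own traffic.
<IfModule reqtimeout_module>
    # Allow 20s for the headers (extendable up to 40s while the client keeps
    # sending at >= 500 bytes/s) and 20s for the body at the same minimum rate.
    RequestReadTimeout header=20-40,MinRate=500 body=20,MinRate=500
</IfModule>

# Keep the global level at warn and raise it only for the modules of interest
# (assumed here to be the ones involved in this proxy chain).
LogLevel warn reqtimeout:debug mpm_event:debug proxy:debug proxy_http:debug
```

With this in place, requests cut off by mod_reqtimeout should show up in the error log, which makes it easier to tell them apart from 408s generated further down the chain.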
Apart from trying to find a clue in the Apache debug logs, I don't know what else to suggest.

On Mon, Dec 20, 2021 at 11:09 AM Dan Washusen <d...@reactive.org> wrote:

> Also worth noting that Nginx OS does not experience the issues (otherwise
> same setup)...
>
> Dan
>
>
> On Mon, Dec 20, 2021 at 10:58 AM Dan Washusen <d...@reactive.org> wrote:
>
>> Yeah, all those parameters are configured and everything works fine when
>> I switch to the MPM worker module. The 408 errors are returned by Jetty and
>> ProxyPass is configured to not reuse connections (which seems to suggest
>> not a KeepAlive issue). Jetty is erroring because it receives no request
>> data. Kind of seems like the MPM event module is missing events/data (when
>> behind the ALB)...
>>
>> Given that the TCP network load balancer doesn't show the issue, maybe
>> it's something to do with how the ALB sends the data to the instance (e.g.
>> Jumbo frames vs the standard MTU)...?
>>
>> On Mon, Dec 20, 2021 at 10:37 AM Igor Cicimov <icici...@gmail.com> wrote:
>>
>>> I will still say this is a timeout issue. Set the ALB timeout to 3600
>>> which is the max possible in case you do not know where to start from. I
>>> guess you already have ALB logs enabled.
>>>
>>> In Apache check the KeepAliveTimeout, RequestReadTimeout and
>>> ProxyTimeout and make sure they make sense for your use case. Enable debug
>>> logs too for more details.
>>>
>>> It is tough to guess without knowing your relevant configuration but 408
>>> is usually caused by the client connection being closed while keep-alive
>>> is in use.
>>>
>>> On Mon, Dec 20, 2021 at 10:12 AM Dan Washusen <d...@reactive.org> wrote:
>>>
>>>> Thanks for the response. Timeouts are configured appropriately...
>>>>
>>>> To clarify: everything works fine through a TCP Network Load Balancer
>>>> pointing at the same infrastructure. There is something about having an
>>>> HTTP-based Application Load Balancer in front of an MPM event
>>>> configuration that's causing issues...
>>>>
>>>> Dan
>>>>
>>>>
>>>>
>>>> On Mon, Dec 20, 2021 at 10:01 AM Igor Cicimov <icici...@gmail.com>
>>>> wrote:
>>>>
>>>>> In a proxy chain like this, getting the timeouts in sync is the most
>>>>> important thing. Make sure that you have done that.
>>>>>
>>>>> On Mon, 20 Dec 2021, 08:37 Dan Washusen, <d...@reactive.org> wrote:
>>>>>
>>>>>> Hi All,
>>>>>> I've been experimenting with the MPM event module with Apache
>>>>>> instances sitting behind an AWS Application Load Balancer (ALB) and it
>>>>>> really doesn't seem to be working well. Response times shoot up
>>>>>> (compared to MPM worker) and we see a fair few 502 errors returned (by
>>>>>> the AWS ALB).
>>>>>>
>>>>>> The basic layout is: AWS Application Load Balancer -> Apache 2.4.x ->
>>>>>> AWS Internal TCP Load Balancer (NLB) -> Jetty App Servers
>>>>>>
>>>>>> Debugging the issue, I think I traced it down to Jetty returning a 408
>>>>>> error because it can't read the request body in a timely manner. So it
>>>>>> seems like for some reason MPM event isn't sending the request body...?
>>>>>>
>>>>>> We're running Ubuntu 20.04 with Apache 2.4.41-4ubuntu3.8 with the
>>>>>> following worker configuration:
>>>>>>
>>>>>> ServerLimit 250
>>>>>> StartServers 100
>>>>>> MinSpareThreads 75
>>>>>> MaxSpareThreads 250
>>>>>> ThreadLimit 64
>>>>>> ThreadsPerChild 64
>>>>>> MaxRequestWorkers 8000
>>>>>>
>>>>>> I've come across several random posts mentioning that the MPM event
>>>>>> module doesn't work behind an ALB but no one seems to go into any
>>>>>> detail. Anyone have some debugging/configuration suggestions?
>>>>>>
>>>>>> Thanks,
>>>>>> Dan
>>>>>>
>>>>>> p.s. I've created a serverfault post showing graphs etc:
>>>>>> https://serverfault.com/questions/1087747/apache-2-4-mpm-event-module-causing-intermittent-502-errors-and-slow-response-ti
>>>>>>