Chris,

I was load testing using the EC2 load balancer DNS. I have increased the
connector timeout to 6000 and also gave 32 GB to the JVM of Tomcat. I am no
longer seeing connection timeouts in the nginx logs, no errors in kernel.log,
and no errors in Tomcat's catalina.out.

During regular operation, when the request count is between 4k and 6k
requests per minute, the open-files count for the Tomcat process stays
between 200 and 350, and responses from Tomcat come back within 5 seconds.
If the request count goes beyond 6.5k, the open-files count slowly climbs to
2300-3000 and the responses from Tomcat become slow.
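For what it's worth, a quick way to see what those descriptors actually are when the count climbs is to inspect /proc (a sketch, assuming Linux; PID below is a placeholder for the current shell so the snippet runs anywhere -- on the real box substitute the Tomcat pid, e.g. from /var/run/tomcat8.pid):

```shell
# Count a process's open fds and how many of them are sockets. If the growth
# under load is almost all sockets, the "high open files" are just in-flight
# network connections, not leaked file handles.
PID=$$   # placeholder: the current shell, so the sketch runs anywhere
total=$(ls /proc/$PID/fd 2>/dev/null | wc -l)
socks=$(ls -l /proc/$PID/fd 2>/dev/null | grep -c socket || true)
echo "pid=$PID open_fds=$total sockets=$socks"
# The per-process fd ceiling that actually applies, regardless of shell ulimits:
grep 'Max open files' /proc/$PID/limits 2>/dev/null || true
```

Run it in a `watch` loop during the load test to see whether sockets account for the growth.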
I am not concerned about the high open-files count, as I do not see any
errors related to open files. The only side effect of the open files going
above 700 is that the responses from Tomcat become slow. I checked whether
this is caused by Elasticsearch: AWS CloudWatch shows the Elasticsearch
response time is within 5 milliseconds. What might be the reason that, when
the open files go beyond 600, the response time for Tomcat slows down? I
tried with Tomcat 9 and it is the same behavior.

On Tue, Nov 3, 2020 at 9:40 PM Christopher Schultz <
ch...@christopherschultz.net> wrote:

> Ayub,
>
> On 11/3/20 10:56, Ayub Khan wrote:
> > *I'm curious about why you are using all of cloudflare and ALB and
> > nginx. Seems like any one of those could provide what you are getting
> > from all 3 of them.*
> >
> > Cloudflare is doing just the DNS and nginx is doing ssl termination
>
> What do you mean "Cloudflare is doing just the DNS?"
>
> So what is ALB doing, then?
>
> > *What is the maximum number of simultaneous requests that one nginx
> > instance will accept? What is the maximum number of simultaneous proxied
> > requests one nginx instance will make to a back-end Tomcat node? How
> > many nginx nodes do you have? How many Tomcat nodes?*
> >
> > We have 4 VMs, each having nginx and tomcat running on them, and each
> > tomcat has nginx in front of it to proxy the requests. So it's one nginx
> > proxying to a dedicated tomcat on the same VM.
>
> Okay.
>
> > below is the tomcat connector configuration
> >
> > <Connector port="8080"
> >            connectionTimeout="60000" maxThreads="2000"
> >            protocol="org.apache.coyote.http11.Http11NioProtocol"
> >            URIEncoding="UTF-8"
> >            redirectPort="8443" />
>
> 60 seconds is a *long* time for a connection timeout.
>
> Do you actually need 2000 threads? That's a lot, though not insane. 2000
> threads means you expect to handle 2000 concurrent (non-async,
> non-Websocket) requests. Do you need that (per node)? Are you expecting
> 8000 concurrent requests?
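One way to answer the concurrency question empirically is to count connections on the connector port by TCP state during the test (a rough sketch; port 8080 is taken from the connector config quoted above, and the `ss` utility is assumed to be installed):

```shell
# Group TCP connections touching the Tomcat connector port by state.
# Many ESTABLISHED sockets mean requests genuinely in flight; piles of
# CLOSE-WAIT or TIME-WAIT sockets point at keepalive/teardown behavior instead.
PORT=8080
if command -v ss >/dev/null 2>&1; then
  ss -tan "( sport = :$PORT or dport = :$PORT )" | awk 'NR>1 {print $1}' | sort | uniq -c
else
  echo "ss not installed; try netstat -tan instead"
fi
echo "done checking port $PORT"
```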
> Does your load-balancer understand the topography and current load on
> any given node?
>
> > When I am doing a load test of 2000 concurrent users I see the open
> > files increase to 10,320, and when I take a thread dump I see the
> > threads are in a waiting state. Slowly, as the requests are completed,
> > I see the open files come down to normal levels.
>
> Are you performing your load-test against the CF/ALB/nginx/Tomcat stack,
> or just hitting Tomcat (or nginx) directly?
>
> Are you using HTTP keepalive in your load-test (from the client to
> whichever server is being contacted)?
>
> > The output of the below command is
> >
> > sudo cat /proc/sys/kernel/pid_max
> > 131072
> >
> > I am testing this on a c4.8xlarge VM in AWS.
> >
> > below is the config I changed in the nginx.conf file
> >
> > events {
> >     worker_connections 50000;
> >     # multi_accept on;
> > }
>
> This will allow 50k incoming connections, and Tomcat will accept an
> unbounded number of connections (for the NIO connector). So limiting
> your threads to 2000 only means that the work of each request will be
> done in groups of 2000.
>
> > worker_rlimit_nofile 30000;
>
> I'm not sure how many connections are handled by a single nginx worker.
> If you accept 50k connections and only allow 30k file handles, you may
> have a problem if that's all being done by a single worker.
>
> > What would be the ideal config for tomcat and nginx so this setup on a
> > c4.8xlarge VM could serve at least 5k or 10k requests simultaneously
> > without causing the open files to spike to 10k?
>
> You will never be able to serve 10k simultaneous requests without having
> 10k open files on the server. If you mean 10k requests across the whole
> 4-node environment, then I'd expect 10k requests to open (roughly) 2500
> files on each server. And of course, you need all kinds of other files
> open as well, from JAR files to DB connections or other network
> connections.
>
> But each connection needs a file descriptor, full stop.
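The worker-limit mismatch can be made concrete with a little arithmetic (a sketch using the numbers quoted from nginx.conf above; worker_processes=1 is an assumption since it is not shown in the thread, and the halving reflects that each proxied request holds two connections, one to the client and one to the upstream Tomcat):

```shell
# Back-of-the-envelope capacity check for the quoted nginx settings.
worker_processes=1          # assumption: value not shown in the thread
worker_connections=50000    # from the events{} block above
worker_rlimit_nofile=30000  # from nginx.conf above
max_proxied_clients=$((worker_processes * worker_connections / 2))
fd_headroom=$((worker_rlimit_nofile - worker_connections))
echo "max concurrent proxied clients ~ $max_proxied_clients"
echo "fd headroom per worker: $fd_headroom (negative means nofile is too low)"
```

With these numbers the headroom comes out negative (-20000): the worker is permitted more connections than it has file descriptors for, which is exactly the mismatch Chris describes.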
> If you need to handle 10k connections, then you will need to make it
> possible to open 10k file handles /just for incoming network
> connections/ for that process. There is no way around it.
>
> Are you trying to hit a performance target, or are you actively getting
> errors with a particular configuration? Your subject says "Connection
> Timed Out". Is it nginx that is reporting the connection timeout? Have
> you checked on the Tomcat side what is happening with those requests?
>
> -chris
>
> > On Thu, Oct 29, 2020 at 10:29 PM Christopher Schultz <
> > ch...@christopherschultz.net> wrote:
> >
> >> Ayub,
> >>
> >> On 10/28/20 23:28, Ayub Khan wrote:
> >>> During high load of 16k requests per minute, we notice the below
> >>> error in the log:
> >>>
> >>> [error] 2437#2437: *13335389 upstream timed out (110: Connection
> >>> timed out) while reading response header from upstream, server:
> >>> jahez.net, request: "GET /serviceContext/ServiceName?callback=
> >>> HTTP/1.1", upstream: "
> >>> http://127.0.0.1:8080/serviceContext/ServiceName
> >>>
> >>> Below is the flow of requests:
> >>>
> >>> cloudflare --> AWS ALB --> NGINX --> Tomcat --> Elastic-search
> >>
> >> I'm curious about why you are using all of cloudflare and ALB and
> >> nginx. Seems like any one of those could provide what you are getting
> >> from all 3 of them.
> >>
> >>> In NGINX we have the below config
> >>>
> >>> location /serviceContext/ServiceName {
> >>>
> >>>     proxy_pass http://localhost:8080/serviceContext/ServiceName;
> >>>     proxy_http_version 1.1;
> >>>     proxy_set_header Connection $connection_upgrade;
> >>>     proxy_set_header Upgrade $http_upgrade;
> >>>     proxy_set_header Host $host;
> >>>     proxy_set_header X-Real-IP $remote_addr;
> >>>     proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
> >>>
> >>>     proxy_buffers 16 16k;
> >>>     proxy_buffer_size 32k;
> >>> }
> >>
> >> What is the maximum number of simultaneous requests that one nginx
> >> instance will accept?
> >> What is the maximum number of simultaneous proxied requests one
> >> nginx instance will make to a back-end Tomcat node? How many nginx
> >> nodes do you have? How many Tomcat nodes?
> >>
> >>> below is the tomcat connector config
> >>>
> >>> <Connector port="8080"
> >>>            protocol="org.apache.coyote.http11.Http11NioProtocol"
> >>>            connectionTimeout="200" maxThreads="50000"
> >>>            URIEncoding="UTF-8"
> >>>            redirectPort="8443" />
> >>
> >> 50,000 threads is a LOT of threads.
> >>
> >>> We monitor the open files using *watch "sudo ls /proc/`cat
> >>> /var/run/tomcat8.pid`/fd/ | wc -l"*. The number of tomcat open files
> >>> keeps increasing, slowing the responses; the only option to recover
> >>> from this is to restart tomcat.
> >>
> >> So this looks like Linux (/proc filesystem). Linux kernels have a
> >> 16-bit pid space, which means a theoretical max pid of 65535. In
> >> practice, the max pid is actually to be found here:
> >>
> >> $ cat /proc/sys/kernel/pid_max
> >> 32768
> >>
> >> (on my Debian Linux system, 4.9.0-era kernel)
> >>
> >> Each thread takes a pid. 50k threads means more than the maximum
> >> allowed on the OS. So you will eventually hit some kind of serious
> >> problem with that many threads.
> >>
> >> How many fds do you get in the process before Tomcat grinds to a
> >> halt? What does the CPU usage look like? The process I/O? Disk usage?
> >> What does a thread dump look like (if you have the disk space to dump
> >> it!)?
> >>
> >> Why do you need that many threads?
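The pid-space concern can be checked directly on the box (a sketch, assuming Linux; MAX_THREADS=50000 is the value from the connector config quoted above, and the fallback value is only a common kernel default):

```shell
# Every JVM thread consumes a task id on Linux, so the connector's maxThreads
# has to fit comfortably under the kernel's pid space.
MAX_THREADS=50000
pid_max=$(cat /proc/sys/kernel/pid_max 2>/dev/null || echo 32768)  # fallback: common default
echo "pid_max=$pid_max maxThreads=$MAX_THREADS"
if [ "$MAX_THREADS" -ge "$pid_max" ]; then
  echo "maxThreads does not even fit in the pid space"
else
  echo "maxThreads fits in the pid space (but may still be far too high)"
fi
```

On the c4.8xlarge in this thread, pid_max was reported as 131072, so 50k threads fit numerically; the per-thread memory and scheduling cost is the real problem.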
> >>
> >> -chris
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> >> For additional commands, e-mail: users-h...@tomcat.apache.org
> >>
> >>
> >

--
--------------------------------------------------------------------
Sun Certified Enterprise Architect 1.5
Sun Certified Java Programmer 1.4
Microsoft Certified Systems Engineer 2000
http://in.linkedin.com/pub/ayub-khan/a/811/b81
mobile: +966-502674604
----------------------------------------------------------------------
It is proved that hard work and knowledge will get you close, but
attitude will get you there. However, it's the Love of God that will put
you over the top!!