Mark,

The difference between after_start and after_load is the sockets below, which are just a sample from the repeated list; the ports are random. How do I find out what these connections are related to?
java 5021 tomcat8 3162u IPv6 98361 0t0 TCP localhost:http-alt->localhost:51746 (ESTABLISHED)
java 5021 tomcat8 3163u IPv6 98362 0t0 TCP localhost:http-alt->localhost:51748 (ESTABLISHED)
java 5021 tomcat8 3164u IPv6 98363 0t0 TCP localhost:http-alt->localhost:51750 (ESTABLISHED)
java 5021 tomcat8 3165u IPv6 98364 0t0 TCP localhost:http-alt->localhost:51752 (ESTABLISHED)
java 5021 tomcat8 3166u IPv6 25334 0t0 TCP localhost:http-alt->localhost:51754 (ESTABLISHED)
java 5021 tomcat8 3167u IPv6 25335 0t0 TCP localhost:http-alt->localhost:51756 (ESTABLISHED)
java 5021 tomcat8 3168u IPv6 25336 0t0 TCP localhost:http-alt->localhost:51758 (ESTABLISHED)
java 5021 tomcat8 3169u IPv6 25337 0t0 TCP localhost:http-alt->localhost:51760 (ESTABLISHED)
java 5021 tomcat8 3170u IPv6 25338 0t0 TCP localhost:http-alt->localhost:51762 (ESTABLISHED)
java 5021 tomcat8 3171u IPv6 25339 0t0 TCP localhost:http-alt->localhost:51764 (ESTABLISHED)
java 5021 tomcat8 3172u IPv6 25340 0t0 TCP localhost:http-alt->localhost:51766 (ESTABLISHED)
java 5021 tomcat8 3173u IPv6 25341 0t0 TCP localhost:http-alt->localhost:51768 (ESTABLISHED)
java 5021 tomcat8 3174u IPv6 25342 0t0 TCP localhost:http-alt->localhost:51770 (ESTABLISHED)
java 5021 tomcat8 3175u IPv6 25343 0t0 TCP localhost:http-alt->localhost:51772 (ESTABLISHED)
java 5021 tomcat8 3176u IPv6 25344 0t0 TCP localhost:http-alt->localhost:51774 (ESTABLISHED)
java 5021 tomcat8 3177u IPv6 25345 0t0 TCP localhost:http-alt->localhost:51776 (ESTABLISHED)
java 5021 tomcat8 3178u IPv6 25346 0t0 TCP localhost:http-alt->localhost:51778 (ESTABLISHED)
java 5021 tomcat8 3179u IPv6 25347 0t0 TCP localhost:http-alt->localhost:51780 (ESTABLISHED)
java 5021 tomcat8 3180u IPv6 25348 0t0 TCP localhost:http-alt->localhost:51782 (ESTABLISHED)
java 5021 tomcat8 3181u IPv6 25349 0t0 TCP localhost:http-alt->localhost:51784 (ESTABLISHED)
java 5021 tomcat8 3182u IPv6 25350 0t0 TCP localhost:http-alt->localhost:51786 (ESTABLISHED)
java 5021 tomcat8 3183u IPv6 25351 0t0 TCP
localhost:http-alt->localhost:51788 (ESTABLISHED)

On Thu, Nov 12, 2020 at 4:05 PM Martin Grigorov <mgrigo...@apache.org> wrote:

> On Thu, Nov 12, 2020 at 2:40 PM Ayub Khan <ayub...@gmail.com> wrote:
>
> > Martin,
> >
> > Could you provide me a command which you want me to run and provide you
> > the results which might help you to debug this issue ?
>
> 1) start your app and click around to load the usual FDs
> 2) lsof -p `cat /var/run/tomcat8.pid` > after_start.txt
> 3) load your app
> 4) lsof -p `cat /var/run/tomcat8.pid` > after_load.txt
>
> you can analyze the differences in the files yourself before sending them
> to us :-)
>
> > On Thu, Nov 12, 2020 at 1:36 PM Martin Grigorov <mgrigo...@apache.org> wrote:
> >
> > > On Thu, Nov 12, 2020 at 10:37 AM Ayub Khan <ayub...@gmail.com> wrote:
> > >
> > > > Martin,
> > > >
> > > > These are file descriptors, some are related to the jar files which
> > > > are included in the web application and some are related to the
> > > > sockets from nginx to tomcat and some are related to database
> > > > connections. I use the below command to count the open file descriptors
> > >
> > > which type of connections increase ?
> > > the sockets ? the DB ones ?
> > >
> > > > watch "sudo ls /proc/`cat /var/run/tomcat8.pid`/fd/ | wc -l"
> > >
> > > you can also use lsof command
> > >
> > > > On Thu, Nov 12, 2020 at 10:56 AM Martin Grigorov <mgrigo...@apache.org> wrote:
> > > >
> > > > > On Wed, Nov 11, 2020 at 11:17 PM Ayub Khan <ayub...@gmail.com> wrote:
> > > > >
> > > > > > Chris,
> > > > > >
> > > > > > I was load testing using the ec2 load balancer dns. I have
> > > > > > increased the connector timeout to 6000 and also gave 32gig to
> > > > > > the JVM of tomcat. I am not seeing connection timeout in nginx
> > > > > > logs now.
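The two snapshots from steps 2) and 4) can be compared mechanically. A minimal bash sketch (file names as in the steps above; `ss` comes from iproute2; that the peer process turns out to be nginx is only the expectation for this setup, not a given):

```shell
# diff_snapshots <before> <after>: print the lsof lines present only in
# the second snapshot, i.e. the file descriptors opened between the two
# captures. comm(1) needs sorted input; <( ) requires bash.
diff_snapshots() {
  comm -13 <(sort "$1") <(sort "$2")
}

# peer_ports <file>: pull the client-side ephemeral ports out of
# "localhost:http-alt->localhost:NNNNN" lines like the ones above.
peer_ports() {
  sed -n 's/.*->localhost:\([0-9]*\).*/\1/p' "$1"
}

# Usage with the files produced in steps 2) and 4):
# diff_snapshots after_start.txt after_load.txt > new_fds.txt
# awk '{print $5}' new_fds.txt | sort | uniq -c | sort -rn  # tally by lsof TYPE
# peer_ports new_fds.txt | while read -r p; do
#   sudo ss -tnp "sport = :$p"   # which local process owns the other end?
# done
```

The `ss` lookup answers the "what are these connections related to" question directly: each `localhost:http-alt->localhost:NNNNN` line is an accepted loopback connection into Tomcat's port 8080 (`http-alt`), and `ss` names the process holding the `NNNNN` end.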
> > > > > > No errors in kernel.log. I am not seeing any errors in tomcat catalina.out.
> > > > > > During regular operations when the request count is between 4 to 6k
> > > > > > requests per minute, the open files count for the tomcat process is
> > > > > > between 200 to 350. Responses from tomcat are within 5 seconds.
> > > > > > If the request count goes beyond 6.5k, open files slowly move up to
> > > > > > 2300 to 3000 and the request responses from tomcat become slow.
> > > > > >
> > > > > > I am not concerned about high open files as I do not see any errors
> > > > > > related to open files. The only side effect of open files going above
> > > > > > 700 is that the response from tomcat is slow. I checked if this is
> > > > > > caused by elastic search; aws cloud watch shows elastic search
> > > > > > response is within 5 milliseconds.
> > > > > >
> > > > > > What might be the reason that when the open files go beyond 600, it
> > > > > > slows down the response time for tomcat? I tried with tomcat 9 and
> > > > > > it's the same behavior.
> > > > >
> > > > > Do you know what kind of files are being opened ?
> > > > >
> > > > > > On Tue, Nov 3, 2020 at 9:40 PM Christopher Schultz <ch...@christopherschultz.net> wrote:
> > > > > >
> > > > > > > Ayub,
> > > > > > >
> > > > > > > On 11/3/20 10:56, Ayub Khan wrote:
> > > > > > > > *I'm curious about why you are using all of cloudflare and ALB
> > > > > > > > and nginx. Seems like any one of those could provide what you
> > > > > > > > are getting from all 3 of them. *
> > > > > > > >
> > > > > > > > Cloudflare is doing just the DNS and nginx is doing ssl termination
> > > > > > >
> > > > > > > What do you mean "Cloudflare is doing just the DNS?"
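One way to answer Martin's "what kind of files are being opened" without guessing: every entry in /proc/&lt;pid&gt;/fd is a symlink whose target names its kind. A bash sketch (the tomcat8 pidfile path is the one used elsewhere in this thread; adjust if yours differs):

```shell
# classify_fds: read /proc/<pid>/fd symlink targets on stdin and tally
# them by kind. Sockets appear as "socket:[inode]", pipes as
# "pipe:[inode]", eventfds etc. as "anon_inode:...", and plain files
# (jars, logs) as absolute paths.
classify_fds() {
  sed -e 's/^socket:.*/socket/' \
      -e 's/^pipe:.*/pipe/' \
      -e 's/^anon_inode:.*/anon_inode/' \
      -e 's#^/.*#file#' \
    | sort | uniq -c | sort -rn
}

# Against the live process (same pidfile and privileges as the watch
# command quoted above); "NR > 1" skips ls's "total 0" header line:
# sudo ls -l "/proc/$(cat /var/run/tomcat8.pid)/fd" \
#   | awk 'NR > 1 {print $NF}' | classify_fds
```

If the count that climbs from ~350 to ~3000 under load is almost entirely `socket`, the growth is connections (nginx-to-Tomcat and/or outbound), not jars or other files.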
> > > > > > > So what is ALB doing, then?
> > > > > > >
> > > > > > > > *What is the maximum number of simultaneous requests that one
> > > > > > > > nginx instance will accept? What is the maximum number of
> > > > > > > > simultaneous proxied requests one nginx instance will make to a
> > > > > > > > back-end Tomcat node? How many nginx nodes do you have? How many
> > > > > > > > Tomcat nodes? *
> > > > > > > >
> > > > > > > > We have 4 vms each having nginx and tomcat running on them and
> > > > > > > > each tomcat has nginx in front of them to proxy the requests. So
> > > > > > > > it's one Nginx proxying to a dedicated tomcat on the same VM.
> > > > > > >
> > > > > > > Okay.
> > > > > > >
> > > > > > > > below is the tomcat connector configuration
> > > > > > > >
> > > > > > > > <Connector port="8080"
> > > > > > > >     connectionTimeout="60000" maxThreads="2000"
> > > > > > > >     protocol="org.apache.coyote.http11.Http11NioProtocol"
> > > > > > > >     URIEncoding="UTF-8"
> > > > > > > >     redirectPort="8443" />
> > > > > > >
> > > > > > > 60 seconds is a *long* time for a connection timeout.
> > > > > > >
> > > > > > > Do you actually need 2000 threads? That's a lot, though not insane.
> > > > > > > 2000 threads means you expect to handle 2000 concurrent (non-async,
> > > > > > > non-WebSocket) requests. Do you need that (per node)? Are you
> > > > > > > expecting 8000 concurrent requests? Does your load-balancer
> > > > > > > understand the topography and current-load on any given node?
> > > > > > > > When I am doing a load test of 2000 concurrent users I see the
> > > > > > > > open files increase to 10,320 and when I take a thread dump I see
> > > > > > > > the threads are in a waiting state. Slowly, as the requests are
> > > > > > > > completed, I see the open files come down to normal levels.
> > > > > > >
> > > > > > > Are you performing your load-test against the CF/ALB/nginx/Tomcat
> > > > > > > stack, or just hitting Tomcat (or nginx) directly?
> > > > > > >
> > > > > > > Are you using HTTP keepalive in your load-test (from the client to
> > > > > > > whichever server is being contacted)?
> > > > > > >
> > > > > > > > The output of the below command is
> > > > > > > > sudo cat /proc/sys/kernel/pid_max
> > > > > > > > 131072
> > > > > > > >
> > > > > > > > I am testing this on a c4.8xlarge VM in AWS.
> > > > > > > >
> > > > > > > > below is the config I changed in nginx.conf file
> > > > > > > >
> > > > > > > > events {
> > > > > > > >     worker_connections 50000;
> > > > > > > >     # multi_accept on;
> > > > > > > > }
> > > > > > >
> > > > > > > This will allow 50k incoming connections, and Tomcat will accept an
> > > > > > > unbounded number of connections (for NIO connector). So limiting
> > > > > > > your threads to 2000 only means that the work of each request will
> > > > > > > be done in groups of 2000.
> > > > > > >
> > > > > > > > worker_rlimit_nofile 30000;
> > > > > > >
> > > > > > > I'm not sure how many connections are handled by a single nginx
> > > > > > > worker. If you accept 50k connections and only allow 30k file
> > > > > > > handles, you may have a problem if that's all being done by a
> > > > > > > single worker.
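Whether a limit like worker_rlimit_nofile actually binds can be checked on the running processes: the limit that applies is the process's own, readable from /proc, not the login shell's ulimit. A bash sketch (the pidfile paths are assumptions; adjust to the actual setup):

```shell
# fd_headroom <pid>: show how many FDs the process currently has open
# versus its soft/hard "open files" rlimit, both read from /proc --
# the limit that matters is the running process's own, not the current
# shell's `ulimit -n`.
fd_headroom() {
  printf 'open: %s\n' "$(ls "/proc/$1/fd" 2>/dev/null | wc -l)"
  grep 'Max open files' "/proc/$1/limits"
}

# For the nginx worker and Tomcat processes (run as root to read
# another user's /proc entries; pidfile paths are assumptions):
# fd_headroom "$(cat /run/nginx.pid)"
# fd_headroom "$(cat /var/run/tomcat8.pid)"
```

If `open` approaches the soft limit under load, accept() and connect() start failing or stalling before any error appears in the application logs.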
> > > > > > > > What would be the ideal config for tomcat and Nginx so this
> > > > > > > > setup on c4.8xlarge vm could serve at least 5k or 10k requests
> > > > > > > > simultaneously without causing the open files to spike to 10K.
> > > > > > >
> > > > > > > You will never be able to serve 10k simultaneous requests without
> > > > > > > having 10k open files on the server. If you mean 10k requests
> > > > > > > across the whole 4-node environment, then I'd expect 10k requests
> > > > > > > to open (roughly) 2500 open files on each server. And of course,
> > > > > > > you need all kinds of other files open as well, from JAR files to
> > > > > > > DB connections or other network connections.
> > > > > > >
> > > > > > > But each connection needs a file descriptor, full stop. If you
> > > > > > > need to handle 10k connections, then you will need to make it
> > > > > > > possible to open 10k file handles /just for incoming network
> > > > > > > connections/ for that process. There is no way around it.
> > > > > > >
> > > > > > > Are you trying to hit a performance target or are you actively
> > > > > > > getting errors with a particular configuration? Your subject says
> > > > > > > "Connection Timed Out". Is it nginx that is reporting the
> > > > > > > connection timeout? Have you checked on the Tomcat side what is
> > > > > > > happening with those requests?
> > > > > > >
> > > > > > > -chris
> > > > > > >
> > > > > > > > On Thu, Oct 29, 2020 at 10:29 PM Christopher Schultz <ch...@christopherschultz.net> wrote:
> > > > > > > >
> > > > > > > >> Ayub,
> > > > > > > >>
> > > > > > > >> On 10/28/20 23:28, Ayub Khan wrote:
> > > > > > > >>> During high load of 16k requests per minute, we notice below
> > > > > > > >>> error in log.
> > > > > > > >>> [error] 2437#2437: *13335389 upstream timed out (110: Connection timed out) while reading response header from upstream, server: jahez.net, request: "GET /serviceContext/ServiceName?callback= HTTP/1.1", upstream: "http://127.0.0.1:8080/serviceContext/ServiceName
> > > > > > > >>>
> > > > > > > >>> Below is the flow of requests:
> > > > > > > >>>
> > > > > > > >>> cloudflare-->AWS ALB--> NGINX--> Tomcat-->Elastic-search
> > > > > > > >>
> > > > > > > >> I'm curious about why you are using all of cloudflare and ALB
> > > > > > > >> and nginx. Seems like any one of those could provide what you
> > > > > > > >> are getting from all 3 of them.
> > > > > > > >>
> > > > > > > >>> In NGINX we have the below config
> > > > > > > >>>
> > > > > > > >>> location /serviceContext/ServiceName {
> > > > > > > >>>     proxy_pass http://localhost:8080/serviceContext/ServiceName;
> > > > > > > >>>     proxy_http_version 1.1;
> > > > > > > >>>     proxy_set_header Connection $connection_upgrade;
> > > > > > > >>>     proxy_set_header Upgrade $http_upgrade;
> > > > > > > >>>     proxy_set_header Host $host;
> > > > > > > >>>     proxy_set_header X-Real-IP $remote_addr;
> > > > > > > >>>     proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
> > > > > > > >>>
> > > > > > > >>>     proxy_buffers 16 16k;
> > > > > > > >>>     proxy_buffer_size 32k;
> > > > > > > >>> }
> > > > > > > >>
> > > > > > > >> What is the maximum number of simultaneous requests that one
> > > > > > > >> nginx instance will accept? What is the maximum number of
> > > > > > > >> simultaneous proxied requests one nginx instance will make to a
> > > > > > > >> back-end Tomcat node? How many nginx nodes do you have?
> > > > > > > >> How many Tomcat nodes?
> > > > > > > >>
> > > > > > > >>> below is tomcat connector config
> > > > > > > >>>
> > > > > > > >>> <Connector port="8080"
> > > > > > > >>>     protocol="org.apache.coyote.http11.Http11NioProtocol"
> > > > > > > >>>     connectionTimeout="200" maxThreads="50000"
> > > > > > > >>>     URIEncoding="UTF-8"
> > > > > > > >>>     redirectPort="8443" />
> > > > > > > >>
> > > > > > > >> 50,000 threads is a LOT of threads.
> > > > > > > >>
> > > > > > > >>> We monitor the open files using *watch "sudo ls /proc/`cat
> > > > > > > >>> /var/run/tomcat8.pid`/fd/ | wc -l"* the number of tomcat open
> > > > > > > >>> files keeps increasing, slowing the responses. The only option
> > > > > > > >>> to recover from this is to restart tomcat.
> > > > > > > >>
> > > > > > > >> So this looks like Linux (/proc filesystem). Linux kernels have
> > > > > > > >> a 16-bit pid space which means a theoretical max pid of 65535.
> > > > > > > >> In practice, the max pid is actually to be found here:
> > > > > > > >>
> > > > > > > >> $ cat /proc/sys/kernel/pid_max
> > > > > > > >> 32768
> > > > > > > >>
> > > > > > > >> (on my Debian Linux system, 4.9.0-era kernel)
> > > > > > > >>
> > > > > > > >> Each thread takes a pid. 50k threads means more than the maximum
> > > > > > > >> allowed on the OS. So you will eventually hit some kind of
> > > > > > > >> serious problem with that many threads.
> > > > > > > >>
> > > > > > > >> How many fds do you get in the process before Tomcat grinds to a
> > > > > > > >> halt? What does the CPU usage look like? The process I/O? Disk
> > > > > > > >> usage? What does a thread dump look like (if you have the disk
> > > > > > > >> space to dump it!)?
> > > > > > > >>
> > > > > > > >> Why do you need that many threads?
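The thread-versus-pid arithmetic above can be checked on a live process: every Java thread is a Linux task and consumes a pid, and ps exposes the task count as nlwp. A bash sketch (the tomcat8 pidfile path is the one used in this thread):

```shell
# thread_headroom <pid>: print the process's live thread count (ps's
# nlwp, "number of lightweight processes") next to the kernel-wide pid
# ceiling, since every thread consumes a pid from that pool.
thread_headroom() {
  printf 'threads=%s pid_max=%s\n' \
    "$(ps -o nlwp= -p "$1" | tr -d ' ')" \
    "$(cat /proc/sys/kernel/pid_max)"
}

# For the Tomcat process from this thread:
# thread_headroom "$(cat /var/run/tomcat8.pid)"
```

With maxThreads="50000" and pid_max at 131072 the JVM alone could claim well over a third of the machine's task slots, which is one concrete way to see why that setting is oversized.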
> > > > > > > >> -chris
> > > > > > > >>
> > > > > > > >> ---------------------------------------------------------------------
> > > > > > > >> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > > > > > > >> For additional commands, e-mail: users-h...@tomcat.apache.org
--
--------------------------------------------------------------------
Sun Certified Enterprise Architect 1.5
Sun Certified Java Programmer 1.4
Microsoft Certified Systems Engineer 2000
http://in.linkedin.com/pub/ayub-khan/a/811/b81
mobile:+966-502674604
----------------------------------------------------------------------
It is proved that Hard Work and knowledge will get you close but attitude
will get you there. However, it's the Love of God that will put you over the top!!