I just got the info on mod_jk:

sh-3.00# ls -l mod_jk.so
-rwxrwxr-x 1 root root 1004496 May 8 00:29 mod_jk.so
sh-3.00# cksum mod_jk.so
2745861777 1004496 mod_jk.so
sh-3.00# strings mod_jk.so | grep mod_jk/
mod_jk/1.2.31 (1026297)
mod_jk/1.2.31
mod_jk/1.2.31 (1026297)

I have the latest configs, but haven't had time to sanitize them. Will
post later tonight.

Danté

On 08/10/2011 06:23 PM, Mark Eggers wrote:
> ----- Original Message -----
>
>> From: Dante Bell <dantepasqu...@cocoanet.us>
>> To: Tomcat Users List <users@tomcat.apache.org>
>> Cc: Christopher Schultz <ch...@christopherschultz.net>
>> Sent: Wednesday, August 10, 2011 11:26 AM
>> Subject: Re: TC 6.0.20 Cleanup after application crash
>>
>> Hi Chris,
>>
>> I did indeed read and digest Mark's email and talked to the vendor about
>> that issue. The stack trace on the old blog post is from the one Mark
>> was helping out with (man, that was a really bad sentance!).
>>
>> This is a different issue :( I don't have a stack trace and I don't have
>> access to the lab they are running these tests in. I've requested the
>> stack traces when this happens, but haven't received those yet.
>>
>> Your question about 'crash' is valid and the explanation I received was
>> that the load test application crashes. That's all I have at this time
>> from them. I'm helping them from a dark, distant planet and only see the
>> things they want me to see ;) Weirdly, it doesn't sound like TC is dead
>> from what they are telling me, after 15 minutes it starts serving up db
>> responses!
>>
>> Yes, they are using mod_jk.
>>
>>
>>
>> On 08/10/2011 12:55 PM, Christopher Schultz wrote:
>>> Dante,
>>>
>>> On 8/10/2011 11:57 AM, Dante Bell wrote:
>>> > We are seeing that after an application crash (customized load
>>> > tester with minimal error handling so it crashes often)
>>>
>>> When you say "crash", do you mean you get a stack trace in the logs and
>>> Tomcat stays up, or do you mean that you bring-down the JVM?
>>> If you bring-down the JVM, what is the error that is occurring (check
>>> hs_*.txt files laying around in the working directory for that)?
>>>
>>> > that TC isn't releasing the connection for about 15 minutes.
>>>
>>> If TC is truly dead, then it's not holding connections at all. That
>>> would be the OS holding them.
>>>
>>> What makes you think they are not being "released"? What counts as
>>> "released"?
>>>
>>> > I've reviewed some of the worker directives, but I'm really unsure as
>>> > to which one or combination would shorten this interval
>>> > significantly.
>>>
>>> Does that mean you are using mod_jk/mod_proxy_ajp? Good to have that
>>> kind of information.
>>>
>>> > The Apache server still serves up static content, which makes me
>>> > think that there isn't anything at the OS or Apache layer that is
>>> > causing the connection to hang around (granted, this isn't an
>>> > absolute and we are investigating these 2 components also).
>>>
>>> So you're using Apache httpd, too. Also good to know.
>>>
>>> > We've done some minor TCP/IP tuning in the Solaris stack, and that
>>> > has helped with other issues regarding heavy loads.
>>>
>>> On Solaris.
>>>
>>> > If TC is the culprit, would we need to be setting the advanced
>>> > connector directives such as:
>>>
>>> > recovery_options 4: close the connection to Tomcat, if we
>>> > detect an error when writing back the answer to the client (browser)
>>>
>>> That depends upon what the errors actually are. Care to tell us about
>>> them?
>>>
>>> > PS. Configs can be found at: http://bit.ly/pFIzO0
>>>
>>> Sigh. You should look into "template" workers.
>>>
>>> Apache httpd MaxClients setting default is 256. <Connector> MaxThreads
>>> is set to 750, so Tomcat should have almost 3 times more than you need.
>>> Where do you see 750 stuck threads?
>>>
>>> I looked at your thread dump.
>>> You clearly have not read Mark's previous
>>> response on this list where he told you exactly what was happening: your
>>> webapp is killing itself with these SingleThreadModel servlets. This is
>>> not thread starvation due to configuration, this is thread starvation
>>> due to a poorly-implemented web application.
>>>
>>> > Apache: Apache HTTP Server Version 2.2 -- prefork with mpm
>>> > Tomcat: 6.0.20
>>> > JK Connector: Same as whatever is bundled in with Apache 2.2
>>> > (from customer)
>>> > Solaris: Solaris 10 10/09 s10s_u8wos_08a SPARC
>>>
>>> Aah, here's all the configuration information. Description then context.
>>> Not the best term paper I've ever read. :(
>>>
>>> I think you mean "prefork MPM". Apache httpd does not bundle mod_jk.
>>> Check your version.
>
> As is my normal self, this will be horrifically long. I apologize for
> that in advance. Here are the cliff notes first.
>
> 1. Clean up your httpd.conf - it's a mess
>    Notes in the main message
>
> 2. Clean up your workers.properties - it's not a mess, but certainly
>    missing things
>    Notes and an example in the main message
>
> 3. Clean up your AJP Connector in server.xml - it's a mess
>    Notes and an example in the main message
>
> 4. Use JMeter - well-tested, robust, freely available testing tool
>    http://jakarta.apache.org/jmeter/
>
> 5. Fix the application - there really is no other viable solution
>
> And now for the novel . . .
>
> * Introduction
>
> This will be a long and rambling set of comments on the entire
> configuration. I will try to address issues as I see them. I will also
> note missing information as I go.
>
> I don't have any hard and fast solutions to the problems that are
> being posted. However, a first order of business is to clean up the
> existing issues as noted below. Once those issues are addressed, then
> the underlying causes to the problems can be investigated.
>
> In short, it's often very difficult to see the forest for the trees
> when working with problems like this.
>
> * The Platform
>
> OS:     Solaris 10
> JRE:    unknown
> HTTPD:  2.2.17 prefork (the default on UNIX and Linux)
> MOD_JK: unknown
> Tomcat: 6.0.20
>
> First of all, it would be nice to know the versions of those listed as
> "unknown". As has been noted in the mailing list, mod_jk does not
> come with Apache HTTPD. Some of the configuration notes for
> workers.properties depend on which version of mod_jk you are using.
>
> HTTPD 2.2.17 is not horribly out of date. According to the web site,
> 2.2.19 is the latest released version. Issues that are addressed in
> 2.2.19 (actually, 2.2.18 which is abandoned) that may concern you are
> as follows:
>
>   *) Core HTTP: disable keepalive when the Client has sent Expect:
>      100-continue but we respond directly with a non-100 response.
>      Keepalive here led to data from clients continuing being treated
>      as a new request. PR 47087. [Nick Kew]
>
>   *) prefork: Update MPM state in children during a graceful restart.
>      Allow the HTTP connection handling loop to terminate early during
>      a graceful restart. PR 41743. [Andrew Punch <andrew.punch
>      247realmedia.com>]
>
>   *) mod_ssl: Correctly read full lines in input filter when the line
>      is incomplete during first read. PR 50481. [Ruediger Pluem]
>
> Tomcat 6.0.20 is out of date. The current version is 6.0.32, and I
> imagine 6.0.33 will be out soon. I won't post the changelog here, but
> there are many important fixes.
>
> * Configurations
>
> I will be a bit hamstrung in commenting about your
> configurations. This is mainly due to the lack of information
> concerning mod_jk.
> If you don't know the version, you may be able to
> find out by doing the following:
>
> strings mod_jk.so | grep mod_jk/
>
> On my system (Fedora 15, kernel 2.6.40 - which is 3.0) this returns:
>
> mod_jk/1.2.32 ()
> mod_jk/1.2.32
>
> ** HTTPD Configuration
>
> Since this is not the Apache HTTPD mailing list, I won't make a lot of
> comments about the general configuration here. It is pretty much a
> mess, and the maintainers of this need to clean it up before going
> into production.
>
> *** Defaults Used
>
> ServerAdmin y...@example.com
> ServerName mycompany.com:80
>
> These are the defaults and should be changed.
>
> LoadModule proxy_module libexec/mod_proxy.so
> LoadModule proxy_connect_module libexec/mod_proxy_connect.so
> LoadModule proxy_ftp_module libexec/mod_proxy_ftp.so
> LoadModule proxy_http_module libexec/mod_proxy_http.so
> LoadModule proxy_scgi_module libexec/mod_proxy_scgi.so
> LoadModule proxy_ajp_module libexec/mod_proxy_ajp.so
> LoadModule proxy_balancer_module libexec/mod_proxy_balancer.so
>
> If your server is not secured this is a security issue. Since you are
> using mod_jk (see lines later in the configuration file), I can see no
> reason to load proxy_ajp_module. I suspect that there is no reason to
> load any of the proxy modules, but I've not gone through the
> configuration carefully.
>
> Interestingly enough, mod_proxy and mod_proxy_http are both commented
> out later in the configuration file.
>
> LoadModule dav_module libexec/mod_dav.so
> LoadModule dav_fs_module libexec/mod_dav_fs.so
>
> This allows (with proper configuration) remote users to edit files on
> the server via the webdav protocol. I'm not sure you would want this
> on a customer-facing web server. You may, and it seems to be enabled
> here:
>
> # Distributed authoring and versioning (WebDAV)
> Include conf/extra/httpd-dav.conf
>
> You don't have any prefork configuration, so you're using the
> defaults.
> These are:
>
> StartServers          5
> MinSpareServers       5
> MaxSpareServers      10
> ServerLimit         256
> MaxClients          256
> MaxRequestsPerChild 10000
>
> This means that the HTTPD server can handle 256 simultaneous
> requests. You can read in the documentation what the other numbers
> mean, but the names are pretty self-evident.
>
> The 256 number is relevant to Connector element configuration. The
> largest number of simultaneous connections this server can handle is
> 256. This means the largest number of requests that can be forwarded
> to Tomcat at any one time is 256. This has an impact on your
> server.xml file as noted below.
>
> Finally, there is a lot of SSL configuration in httpd.conf, but
> mod_ssl is commented out.
>
> *** mod_jk configuration
>
> I'm only going to comment in detail lines that are uncommented in the
> httpd.conf file. There are a lot of other issues that I'll just
> mention.
>
> 1. There are many lines that perform the same forwarding function
>
>    For example:
>
>    JkMount /MyCfg/servlet/* worker1
>
>    This would include
>
>    JkMount /MyCfg/servlet/Login worker1
>
> 2. If all of your workers go to the same host and port (which means
>    the same Tomcat), why are there multiple workers configured?
>
> The above lines (and others like it) look suspiciously like the
> application is using the Invoker servlet. By default this is disabled
> in Tomcat 6 due to security concerns.
>
> Since the web application was written with NetBeans (I recognize the
> doProcess() method), there is no reason to not map the servlets to
> appropriate URLs in web.xml.
>
> Please post $CATALINA_HOME/conf/web.xml with comments removed.
>
> Stripping down everything, your current mod_jk configuration looks
> like the following.
>
> JkWorkersFile "/mycompany/apps/myfm/fmserver/Tomcat/conf/workers.properties"
> JkLogFile /usr/apache2_cgems/logs/mod_jk.log
> JkLogLevel error
> JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
> JkOptions +ForwardKeySize +ForwardURICompat -ForwardDirectories
> JkRequestLogFormat "%w %V %T"
>
> JkMount /ACT worker2
> JkMount /ACT/* worker2
>
> A couple of quick comments here.
>
> You don't have JkShmFile, jk-status, or jk-manager configured. This is
> useful to see what's going on with mod_jk.
>
> There is no need for quotes around the JkWorkersFile name.
>
> Since workers.properties is a mod_jk configuration (and part of Apache
> HTTPD), I normally put this with all of the other Apache HTTPD
> configuration files (/etc/httpd/conf.d on Fedora 15).
>
> The JkLogStampFormat is the default for mod_jk prior to 1.2.24, so I'm
> going to guess that your mod_jk may actually be 1.2.23 or older. If
> so, time to upgrade. See the notes above on one way to determine this.
>
> -ForwardDirectories is the default.
> +ForwardKeySize is the default.
> +ForwardURICompat was the default until mod_jk 1.2.22
>
> From the documentation at
> http://tomcat.apache.org/connectors-doc/reference/apache.html, this is
> less spec compliant and not safe if you are using prefix
> JkMount. Apparently this means if you don't map to exact URLs, then
> this option results in unsafe operation.
>
> ** workers.properties
>
> Since the only worker you are using in httpd.conf is worker2, then the
> following is sufficient.
>
> # Minimal jk configuration
> worker.list=worker2
> worker.worker2.type=ajp13
> worker.worker2.host=localhost
> worker.worker2.port=8019
>
> However, a more explicit configuration may be desired. This all
> depends on your version of mod_jk. A while back I posted a
> workers.properties file to the list in answer to another question. An
> abbreviated version of that is shown below.
>
> worker.list=worker2
> #
> # template
> #
> # Notes on configuration
> # type                   - ajp13 which is the protocol and the default
> # socket_connect_timeout - in milliseconds (what happens when Tomcat
> #                          is started later?)
> # socket_keepalive       - send keep alive packets when connection is
> #                          idle
> # ping                   - how to do the keep alive (see
> #                          documentation)
> # ping_timeout           - default in milliseconds
> # minsize                - minimum pool size - drops to zero after a
> #                          while
> # timeout                - pool timeout should match AJP connector in
> #                          Tomcat. Note time here is in seconds and
> #                          must match the AJP connector in
> #                          server.xml. Note, there is no timeout by
> #                          default in server.xml
> # reply_timeout          - timeout for a reply. The default is no
> #                          timeout. The value is in milliseconds. Make
> #                          longer than the longest Tomcat will process
> #                          a request, otherwise an error will be
> #                          returned.
> # recovery_options       - a bitmapped flag for recovery when a
> #                          request is successfully sent but no reply
> #                          is received. 0 is the default, 3 says don't
> #                          retry on another backend
>
> worker.template.type=ajp13
> worker.template.host=localhost
> worker.template.socket_connect_timeout=5000
> worker.template.socket_keepalive=true
> worker.template.ping_mode=A
> worker.template.ping_timeout=10000
> worker.template.connection_pool_minsize=0
> worker.template.connection_pool_timeout=600
> worker.template.reply_timeout=300000
> worker.template.recovery_options=3
>
> #
> # now to define the actual workers
> #
> worker.worker2.reference=worker.template
> worker.worker2.port=8019
>
> This is based on the configurations found in
> tomcat-connectors-[version]-src/conf. I think this started appearing
> in version 1.2.31. That's the earliest version I have unpacked on my
> system at any rate.
>
> One thing to note here. The connection_pool_timeout must be the same
> as the timeout value for the AJP connector in server.xml. The value
> here is in seconds. The value in server.xml is in milliseconds.
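
The seconds-versus-milliseconds point above is easy to get wrong, so here is a minimal sketch of a consistency check, assuming Python is available on the box. The function names (pool_timeout_seconds, connector_timeout_ms, timeouts_match) are made up for illustration and are not part of mod_jk or Tomcat. The sketch only reads a connection_pool_timeout set directly on the named worker; it does not follow worker.*.reference inheritance, so point it at the template worker.

```python
import re
import xml.etree.ElementTree as ET


def pool_timeout_seconds(workers_text, worker):
    """Return worker.<worker>.connection_pool_timeout in seconds, or None."""
    pattern = (r"^\s*worker\." + re.escape(worker)
               + r"\.connection_pool_timeout\s*=\s*(\d+)\s*$")
    match = re.search(pattern, workers_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def connector_timeout_ms(server_xml_text, port):
    """Return connectionTimeout (ms) of the Connector on the given port, or None."""
    root = ET.fromstring(server_xml_text)
    # iter() also visits the root, so a bare <Connector/> fragment works too.
    for connector in root.iter("Connector"):
        if connector.get("port") == str(port) and connector.get("connectionTimeout"):
            return int(connector.get("connectionTimeout"))
    return None


def timeouts_match(workers_text, worker, server_xml_text, port):
    """True when the mod_jk pool timeout (s) equals the connector timeout (ms)."""
    seconds = pool_timeout_seconds(workers_text, worker)
    millis = connector_timeout_ms(server_xml_text, port)
    return seconds is not None and millis is not None and seconds * 1000 == millis


if __name__ == "__main__":
    workers = "worker.template.connection_pool_timeout=600\n"
    server_xml = ('<Connector port="8019" protocol="AJP/1.3" '
                  'connectionTimeout="600000" />')
    print(timeouts_match(workers, "template", server_xml, 8019))
```

With the values from this thread, 600 seconds in workers.properties lines up with connectionTimeout="600000" in server.xml, while the currently posted connectionTimeout="10000" would not.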
>
> I do not understand why you have the other workers configured. They
> all go to the same host. Apache HTTPD will only open 256 connections
> (max) by default. I cannot think of a reason why you don't just have
> one worker per Tomcat.
>
> ** server.xml
>
> I will just comment on the portion that has to do with the AJP
> connections. Note that I have a much longer connection pool timeout
> than you do, and will be changing the connectionTimeout value
> accordingly.
>
> <Connector port="8019"
>            connectionTimeout="10000"
>            maxThreads="750"
>            minSpareThreads="20"
>            maxSpareThreads="50"
>            request.TomcatAuthentication="false"
>            protocol="AJP/1.3"
>            redirectPort="8445" />
>
> There are several issues here that need to be addressed.
>
> 1. connectionTimeout="10000"
>
>    This must match the pool_timeout in workers.properties, so in this
>    example it should be 600000.
>
> 2. maxThreads="750"
>
>    In your current HTTPD configuration, you can never have more than 256
>    connections from HTTPD to Tomcat. The default value is 200. Since you
>    said that Apache HTTPD also serves some static content, leaving this
>    at the default is probably a good idea.
>
> 3. minSpareThreads, maxSpareThreads
>
>    I don't see either of these in the Tomcat 6 documentation.
>
> 4. request.TomcatAuthentication="false"
>
>    According to the documentation if you do not want Tomcat to process
>    authentication (and it appears this way from your Apache HTTPD
>    configuration), the directive is tomcatAuthentication="false"
>
> 5. Encoding
>
>    By default, the URIEncoding is set to ISO-8859-1. You might wish to
>    change that to UTF-8.
>
> Applying the above changes to your AJP connector configuration (and
> reflecting the 600 second timeout in workers.properties), the
> following Connector element is arrived at.
>
> <Connector port="8019"
>            connectionTimeout="600000"
>            tomcatAuthentication="false"
>            URIEncoding="UTF-8"
>            protocol="AJP/1.3"
>            redirectPort="8445" />
>
> * Load Test Tool Crash
>
> I really cannot comment on this since it's a custom built tool. Are
> there reasons for not using something like JMeter?
>
> * Other Application Issues [Soapbox below]
>
> Over the weekend I wrote a quick Single Thread Model servlet and poked
> around with JMX. I didn't see any way to tell what was going on
> without doing a thread dump. Once you reach the limit of 20 STM
> threads, I'm not sure what you would do. Would you kill one or more
> threads? How? Which one would you choose? If you could kill a thread
> running the STM servlet, how would you tell Tomcat that there's
> another slot available for another STM thread? What state would Tomcat
> end up in if you could kill off a thread running an STM servlet?
>
> In short, fix the application. STM servlets provide a false sense of
> thread safety at any rate. STM does not protect context attributes
> from modification by other servlets. Session variables are probably
> also not thread safe (one browser, two tabs?).
>
> I suspect that the original authors were trying to get around the
> non-idempotent nature of POSTs. This plus the possible use of the
> Invoker servlet leads me to believe that this is an old application
> ripe for a rewrite.
>
> . . . . just my nickel (since it's a long post)
> /mde/
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
>