I am intrigued by this article, as the problem sounds the same as mine: http://corner.squareup.com/2014/09/logging-can-be-tricky.html
"No significant amount of resources appeared to be in use — disk I/O, network I/O, CPU, and memory all looked fairly tame. Furthermore, the bulk of queries being served were all performing as expected. " So I tried to follow their example regarding strace. But I have never worked with strace before. I used grep to find the PID and then I: sudo strace -c -f -p 19363 and I got: Process 19363 attached with 45 threads Then I ran our "health check" which is like a series of functional tests that ping our actual app (a live environment rather than a test environment). I got nothing out of strace except these 2 lines appeared: Process 20973 attached Process 20974 attached What does this mean? I had the impression that the JVM ran in 1 process? Does strace show me userland threads (like htop does) or are these child processes? On Monday, September 15, 2014 12:15:14 AM UTC-4, larry google groups wrote: > > > I have an embarrassing problem. I convinced my boss that I could use > Clojure to build a RESTful API. I was successful in so far as that went, > but now I face the issue that every once in a while, the program pauses, > for a painfully long time -- sometimes 30 seconds, which causes some > requests to the API to timeout. We are still in testing, so there is no > real load on the app, just the frontenders, writing Javascript and making > Ajax calls to the service. > > The service seems like a basic Clojure web app. I use Jetty as the > webserver, and the libraries in use are: > > Ring > > Compojure > > Liberator > > Monger > > Timbre > > Lamina > > Dire > > When someone complains about the pauses, I will go test the service, and I > can hit with 40 requests in 10 seconds and it has great performance. The > pauses actually seem to come after periods of inactivity, which made me > think that this had something to do with garbage collection, except that > the pauses are so extreme -- like I said, sometimes as much as 30 seconds, > causing requests to timeout. When the app does finally start to respond it > again, it goes very fast, and responds to those pending request very fast. > > But I have to find a way to fix these pauses. > > Right now I packaged the app as an Uberjar and put it on the server, spun > it up on port 24000 and proxied it through Apache. I put a script in > /etc/init.d to start the app using start-stop-daemon. > > Possible things that could be going wrong: > > Maybe Jetty needs more threads, or maybe less threads? How would I test > that? > > Maybe the link to MongoDB sometimes dies? (Mongo is on another server at > Amazon) How would I test that? > > Maybe it is garbage collection? How would I test that? > > Maybe I have some code that somehow blocks the whole app? Seems unlikely > but I'm trying to keep an open mind. > > Maybe the thread pool managed by Lamina sometimes gets overwhelmed? How > would I test that? > > Maybe when Timbre writes to the log file it causes things to pause? (But I > believe Timbre does this in its own thread?) How do I test that? > > This is a small app: only about 1,100 lines of code. > > I don't have much experience debugging problems on the JVM, so I welcome > any suggestions. > > > > > > > > -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups "Clojure" group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.