Daniel Compton, good suggestion. I've increased the memory to see if I can postpone the GCs, and I'll log that more carefully.
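
A minimal sketch of the kind of per-interval GC logging discussed here, assuming it is called from the same place that already reports queue stats every 30 seconds. The helper names `gc-snapshot`, `gc-delta` and `heap-mb` are hypothetical; only the `java.lang.management` and `Runtime` calls are standard JVM APIs.

```clojure
;; Sketch only: gc-snapshot, gc-delta and heap-mb are hypothetical helper
;; names, not part of durable-queue or any library mentioned in this thread.
(import '(java.lang.management ManagementFactory GarbageCollectorMXBean))

(defn gc-snapshot
  "Cumulative collection count and time (ms) per collector since JVM start."
  []
  (mapv (fn [^GarbageCollectorMXBean gc]
          {:name        (.getName gc)
           :collections (.getCollectionCount gc)
           :time-ms     (.getCollectionTime gc)})
        (ManagementFactory/getGarbageCollectorMXBeans)))

(defn gc-delta
  "GC work done between two snapshots. Assumes both snapshots list the
  collectors in the same order."
  [before after]
  (mapv (fn [b a]
          {:name        (:name a)
           :collections (- (:collections a) (:collections b))
           :time-ms     (- (:time-ms a) (:time-ms b))})
        before after))

(defn heap-mb
  "Heap figures in MB as the running JVM sees them."
  []
  (let [rt (Runtime/getRuntime)
        mb (fn [n] (quot n (* 1024 1024)))]
    {:max   (mb (.maxMemory rt))
     :total (mb (.totalMemory rt))
     :free  (mb (.freeMemory rt))}))
```

Taking a `gc-snapshot` at each report and logging the `gc-delta` against the previous one shows how much GC work happened in that window; a `:time-ms` that keeps growing toward the length of the window would be consistent with the GC hunch. If `(:max (heap-mb))` reports far less than 7g, the `-Xmx` setting is probably not reaching the JVM at all.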

On Wednesday, October 11, 2017 at 8:35:44 PM UTC-4, Daniel Compton wrote:
>
> Without more information it's hard to tell, but this looks like it could
> be a garbage collection issue. Can you run your test again and add some
> logging/monitoring to show each garbage collection? If my hunch is right,
> you'll see garbage collections getting more and more frequent until they
> take up nearly all the CPU time, preventing much forward progress writing
> to the queue.
>
> If it's AWS-based throttling, then CloudWatch monitoring
> http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-volume-status.html#using_cloudwatch_ebs
> might show you some hints. You could also test with an NVMe drive attached,
> just to see if disk bandwidth is the issue.
>
> On Thu, Oct 12, 2017 at 11:58 AM Justin Smith <noise...@gmail.com> wrote:
>
>> A small thing here: if memory usage is important, you should be building
>> and running an uberjar instead of using lein on the server (this also has
>> other benefits). If you are doing that, your project.clj jvm-opts are
>> not used; you have to configure your java command line in AWS instead.
>>
>> On Wed, Oct 11, 2017 at 3:52 PM <lawrence...@gmail.com> wrote:
>>
>>> I can't figure out if this is a Clojure question or an AWS question. And
>>> if it is a Clojure question, I can't figure out if it is more of a general
>>> JVM question, or if it is specific to some library such as durable-queue. I
>>> can redirect my question elsewhere, if people think this is an AWS
>>> question.
>>>
>>> In my project.clj, I try to give my app a lot of memory:
>>>
>>> :jvm-opts ["-Xms7g" "-Xmx7g" "-XX:-UseCompressedOops"])
>>>
>>> And the app starts off pulling data from MySQL and writing it to
>>> Durable-Queue at a rapid rate. (
>>> https://github.com/Factual/durable-queue )
>>>
>>> I have some logging set up to report every 30 seconds.
>>>
>>> :enqueued 370137,
>>>
>>> 30 seconds later:
>>>
>>> :enqueued 608967,
>>>
>>> 30 seconds later:
>>>
>>> :enqueued 828950,
>>>
>>> It's a dramatic slowdown. The app is initially writing to the queue at
>>> faster than 10,000 documents a second, but it slows steadily, and after 10
>>> minutes it writes fewer than 1,000 documents per second. Since I have to
>>> write a few million documents, 10,000 a second is the slowest speed I can
>>> live with.
>>>
>>> The queues are in the /tmp folder of an AWS instance that has plenty of
>>> disk space, 4 CPUs, and 16 gigs of RAM.
>>>
>>> Why does the app slow down so much? I had 4 thoughts:
>>>
>>> 1.) the app struggles as it hits a memory limit
>>>
>>> 2.) memory bandwidth is the problem
>>>
>>> 3.) AWS is enforcing some weird IOPS limit
>>>
>>> 4.) durable-queue is misbehaving
>>>
>>> As to possibility #1, I notice the app starts like this:
>>>
>>> Memory in use (percentage/used/max-heap): (\"66%\" \"2373M\" \"3568M\")
>>>
>>> but 60 seconds later I see:
>>>
>>> Memory in use (percentage/used/max-heap): (\"94%\" \"3613M\" \"3819M\")
>>>
>>> So I've run out of allowed memory. But why is that? I thought I gave
>>> this app 7 gigs:
>>>
>>> :jvm-opts ["-Xms7g" "-Xmx7g" "-XX:-UseCompressedOops"])
>>>
>>> As to possibility #2, I found this old post on the Clojure mailing list:
>>>
>>> Andy Fingerhut wrote, "one thing I've found in the past on a 2-core
>>> machine that was achieving much less than 2x speedup was memory bandwidth
>>> being the limiting factor."
>>>
>>> https://groups.google.com/forum/#!searchin/clojure/xmx$20xms$20maximum%7Csort:relevance/clojure/48W2eff3caU/HS6u547gtrAJ
>>>
>>> But I am not sure how to test this.
>>>
>>> As to possibility #3, I'm not sure how AWS enforces its IOPS limits. If
>>> people think this is the most likely possibility, then I will repost my
>>> question in an AWS forum.
>>>
>>> As to possibility #4, durable-queue is well-tested and used in a lot of
>>> projects, and Zach Tellman is smart and makes few mistakes, so I'm doubtful
>>> that it is to blame, but I do notice that it starts off with 4 active slabs
>>> and then after 120 seconds, it is only using 1 slab. Is that expected? If
>>> people think this is the possible problem then I'll ask somewhere specific
>>> to durable-queue.
>>>
>>> Overall, my log information looks like this:
>>>
>>> ("\nStats about from-mysql-to-tables-queue: " {"message" {:num-slabs
>>> 3, :num-active-slabs 2, :enqueued 370137, :retried 0, :completed 369934,
>>> :in-progress 10}})
>>>
>>> ("\nResource usage: " "Memory in use (percentage/used/max-heap):
>>> (\"66%\" \"2373M\" \"3568M\")\n\nCPU usage (how-many-cpu's/load-average):
>>> [4 5.05]\n\nFree memory in jvm: [1171310752]")
>>>
>>> 30 seconds later
>>>
>>> ("\nStats about from-mysql-to-tables-queue: " {"message" {:num-slabs
>>> 4, :num-active-slabs 4, :enqueued 608967, :retried 0, :completed 608511,
>>> :in-progress 10}})
>>>
>>> ("\nResource usage: " "Memory in use (percentage/used/max-heap):
>>> (\"76%\" \"2752M\" \"3611M\")\n\nCPU usage (how-many-cpu's/load-average):
>>> [4 5.87]\n\nFree memory in jvm: [901122456]")
>>>
>>> 30 seconds later
>>>
>>> ("\nStats about from-mysql-to-tables-queue: " {"message" {:num-slabs
>>> 4, :num-active-slabs 3, :enqueued 828950, :retried 0, :completed 828470,
>>> :in-progress 10}})
>>>
>>> ("\nResource usage: " "Memory in use (percentage/used/max-heap):
>>> (\"94%\" \"3613M\" \"3819M\")\n\nCPU usage (how-many-cpu's/load-average):
>>> [4 6.5]\n\nFree memory in jvm: [216459664]")
>>>
>>> 30 seconds later
>>>
>>> ("\nStats about from-mysql-to-tables-queue: " {"message" {:num-slabs
>>> 1, :num-active-slabs 1, :enqueued 1051974, :retried 0, :completed 1051974,
>>> :in-progress 0}})
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Clojure" group.
>>> To post to this group, send email to clo...@googlegroups.com
>>> Note that posts from new members are moderated - please be patient with
>>> your first post.
>>> To unsubscribe from this group, send email to
>>> clojure+u...@googlegroups.com
>>> For more options, visit this group at
>>> http://groups.google.com/group/clojure?hl=en
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "Clojure" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to clojure+u...@googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.