So I have some working/theoretical code that should resolve this. It acquires a lock *per* ActiveMQDestination, so there is no lock contention across destinations.
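The rough shape is a registry of read/write locks keyed on the destination: the purge thread takes the write lock for just the queue it is removing, and normal operations take the read lock for their own destination only. Something like this (only a sketch; the class and method names are illustrative, not what's actually on my branch):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.locks.ReadWriteLock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    import org.apache.activemq.command.ActiveMQDestination;

    // Sketch only: one ReadWriteLock per destination, so the purge thread and
    // producers/consumers contend only when they touch the same destination.
    public class DestinationLockRegistry {

        private final ConcurrentMap<ActiveMQDestination, ReadWriteLock> locks =
            new ConcurrentHashMap<ActiveMQDestination, ReadWriteLock>();

        public ReadWriteLock lockFor(ActiveMQDestination destination) {
            ReadWriteLock lock = locks.get(destination);
            if (lock == null) {
                ReadWriteLock candidate = new ReentrantReadWriteLock();
                ReadWriteLock existing = locks.putIfAbsent(destination, candidate);
                lock = (existing != null) ? existing : candidate;
            }
            return lock;
        }

        // Call once a destination has been fully removed so the map doesn't grow forever.
        public void release(ActiveMQDestination destination) {
            locks.remove(destination);
        }
    }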
But here's where I'm stuck:

    @Override
    public void removeSubscription(ConnectionContext context,
                                   RemoveSubscriptionInfo info) throws Exception {
        inactiveDestinationsPurgeLock.readLock().lock();
        try {
            topicRegion.removeSubscription(context, info);
        } finally {
            inactiveDestinationsPurgeLock.readLock().unlock();
        }
    }

… this is in RegionBroker. There is no ActiveMQDestination involved here, so I'm not sure of the best way to resolve this. Any advice?

On Sun, Feb 22, 2015 at 8:11 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> Yes. That was my thinking too… that just replacing the CopyOnWriteArraySet with something more performant would solve the issue.
> This would also improve queue creation time as well as queue deletion time.
> What I think I'm going to do in the meantime is:
> - implement a granular lock based on queue name… I am going to use an interface so we can replace the implementation later.
> - implement timing for the purge thread so it tracks how long it takes to remove a queue but also how long the entire loop takes.
> I'll do this on a branch so it should be easy to merge.
>
> On Sun, Feb 22, 2015 at 7:40 PM, Tim Bain <tb...@alumni.duke.edu> wrote:
>> A decent amount of the time is being spent calling remove() on various array-backed collections. Those data structures might be inappropriate for the number of destinations you're running, since array-backed collections tend to have add/remove operations that are O(N); some improvement might come from something as simple as moving to a ConcurrentHashSet instead of a CopyOnWriteArraySet, for example. (Or it might make performance worse because of other aspects of how those collections are used; people other than me would be in a better position to evaluate the full range of performance requirements for those collections.)
>> Scheduler.cancel() also takes an alarming amount of time for what looks like a really simple method (http://grepcode.com/file/repo1.maven.org/maven2/org.apache.activemq/activemq-all/5.10.0/org/apache/activemq/thread/Scheduler.java#Scheduler.cancel%28java.lang.Runnable%29).
>>
>> On Sun, Feb 22, 2015 at 7:56 PM, Kevin Burton <bur...@spinn3r.com> wrote:
>> > Here's a jprofiler view with the advisory support enabled if you're curious.
>> > http://i.imgur.com/I1jesZz.jpg
>> > I'm not familiar enough with the internals of ActiveMQ to have any obvious optimization ideas.
>> > One other idea I had (which would require a ton of refactoring I think) would be to potentially bulk delete all the queues at once.
>> >
>> > On Sun, Feb 22, 2015 at 6:42 PM, Kevin Burton <bur...@spinn3r.com> wrote:
>> > > And spending some more time in jprofiler, it looks like 20% of this is due to schedulerSupport and the other 80% is due to advisorySupport.
>> > > If I set both to false, the total runtime of my tests drops in half… and the latencies fall from
>> > >     max create producer latency: 10,176 ms
>> > >     max create message on existing producer and consumer: 2 ms
>> > > … to
>> > >     max create producer latency: 1 ms
>> > >     max create message on existing producer and consumer: 1 ms
>> > > and this is without even fixing the purge background lock.
>> > > So the question now is what the heck the advisory support is doing that can result in such massive performance overhead.
>> > > … and I think advisorySupport is enabled by default, so that's problematic as well.
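(For reference, the flags being flipped in that test are just the standard BrokerService setters, so the test broker setup is roughly the following; this is illustrative, not the exact test code.)

    import org.apache.activemq.broker.BrokerService;

    public class NonPersistentTestBroker {
        public static void main(String[] args) throws Exception {
            BrokerService broker = new BrokerService();
            broker.setPersistent(false);       // production broker runs non-persistent too
            broker.setAdvisorySupport(false);  // the ~80% of the overhead seen in the profile
            broker.setSchedulerSupport(false); // the remaining ~20%
            broker.start();
            // ... run the producer/consumer latency tests against this broker ...
            broker.stop();
        }
    }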
>> > > On Sun, Feb 22, 2015 at 4:45 PM, Kevin Burton <bur...@spinn3r.com> wrote:
>> > >> OK. Loaded up JProfiler and confirmed that it's not LevelDB.
>> > >> This is a non-persistent broker I'm testing on.
>> > >> Looks like it's spending all its time in CopyOnWriteArrayList.remove and Timer.purge…
>> > >> Which is hopeful, because this is ALL due to ActiveMQ internals, and in theory LevelDB should perform well if we improve the performance of ActiveMQ internals and fix this lock bug.
>> > >> Which would rock!
>> > >> It should ALSO make queue creation faster.
>> > >>
>> > >> On Sun, Feb 22, 2015 at 4:10 PM, Kevin Burton <bur...@spinn3r.com> wrote:
>> > >>>
>> > >>> On Sun, Feb 22, 2015 at 3:30 PM, Tim Bain <tb...@alumni.duke.edu> wrote:
>> > >>>> So if LevelDB cleanup during removeDestination() is the presumed culprit, can we spin off the LevelDB cleanup work into a separate thread (better: a task object to be serviced by a ThreadPool so you can avoid a fork bomb if we remove many destinations at once) so the call to removeDestination() can return quickly and LevelDB can do its record-keeping in the background without blocking message-processing?
>> > >>>
>> > >>> Would that be possible? If the delete is pending on ActiveMQ, there is a race where a producer could re-create it unless the lock is held.
>> > >>> Though I guess if you dispatched to the GC thread WITH the lock still held you would be OK, but I think if we use the existing purge thread then we're fine.
>> > >>> OK. I think I'm wrong about LevelDB being the issue. To be fair, I wasn't 100% certain before, but I should have specified.
>> > >>> Our current production broker is running with persistent=false… and I just re-ran the tests without disk persistence and it has the same problem.
>> > >>> So the main issue now is why the heck ActiveMQ is taking SO LONG to GC a queue. It's taking about 100 ms, which is an insane amount of time considering this is done all in memory.
>> > >>>
>> > >>> Kevin
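(To put numbers on that ~100 ms per queue: the purge-loop timing I mentioned earlier would be something along these lines. A sketch only; the class and variable names are mine, not the actual patch.)

    import java.util.List;
    import java.util.concurrent.TimeUnit;

    import org.apache.activemq.broker.Broker;
    import org.apache.activemq.broker.ConnectionContext;
    import org.apache.activemq.command.ActiveMQDestination;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    // Sketch of timing the inactive-destination purge: per-queue removal time
    // plus the duration of the whole pass.
    public class TimedPurge {

        private static final Logger LOG = LoggerFactory.getLogger(TimedPurge.class);

        public void purge(Broker broker, ConnectionContext context,
                          List<ActiveMQDestination> candidates) throws Exception {
            long passStart = System.nanoTime();
            for (ActiveMQDestination destination : candidates) {
                long start = System.nanoTime();
                broker.removeDestination(context, destination, 1000L); // timeout value is arbitrary here
                LOG.info("Removed {} in {} ms", destination,
                         TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start));
            }
            LOG.info("Purge pass of {} destinations took {} ms", candidates.size(),
                     TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - passStart));
        }
    }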
--
Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile <https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>