Kevin,

Go for it.

Yes, our tests need a lot of love. Actually most of the tests are not really unit tests, but more like integration tests. Helping with the spring (as in season) cleanup would be immensely appreciated.

Hadrian


On 02/24/2015 02:18 PM, Kevin Burton wrote:
Ah.  OK… I assume continuous integration isn’t a goal.  I’m upset if my
testing takes more than 20 minutes :-P

I bet that this would be taken down to an hour if they were parallelized.
But of course someone needs to take the time to get that done.

I might be allocating some of our internal dev resources to getting our
build to parallelize across containers. If this is generalizable to other
projects I’ll OSS it… Would be great to get better testing feedback.

Right now I think my patches are ready to be merged but I’d feel more
comfortable suggesting it if I knew the tests worked.

Should I propose it now and if there are any bugs we can fix them later?

Kevin

On Tue, Feb 24, 2015 at 10:54 AM, Timothy Bish <tabish...@gmail.com> wrote:

On 02/24/2015 01:22 PM, Kevin Burton wrote:

oh  geez… seriously though.  can you give me an estimate?

Assuming 30 seconds each, I’m estimating about 11 hours.

That's probably not too far off; I usually just let them run overnight for
the all profile.


http://activemq.apache.org/junit-reports.html
https://hudson.apache.org/hudson/job/ActiveMQ/
Hudson is down for ActiveMQ.

Also, is there a disk space requirement for the tests? I have 10GB free or
so and I’m getting TONS of these:

I don't think anyone has looked into the minimum free disk space required
for the tests to run.



java.lang.RuntimeException: Failed to start provided job scheduler store: JobSchedulerStore:activemq-data
	at java.io.RandomAccessFile.seek(Native Method)
	at org.apache.activemq.util.RecoverableRandomAccessFile.seek(RecoverableRandomAccessFile.java:384)
	at org.apache.activemq.store.kahadb.disk.page.PageFile.readPage(PageFile.java:877)
	at org.apache.activemq.store.kahadb.disk.page.Transaction.load(Transaction.java:427)
	at org.apache.activemq.store.kahadb.disk.page.Transaction.load(Transaction.java:377)
	at org.apache.activemq.store.kahadb.disk.index.BTreeIndex.load(BTreeIndex.java:159)
	at org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl$MetaData.load(JobSchedulerStoreImpl.java:90)
	at org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl$3.execute(JobSchedulerStoreImpl.java:277)
	at org.apache.activemq.store.kahadb.disk.page.Transaction.execute(Transaction.java:779)
	at org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl.doStart(JobSchedulerStoreImpl.java:261)
	at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)
	at org.apache.activemq.broker.BrokerService.setJobSchedulerStore(BrokerService.java:1891)
	at org.apache.activemq.transport.stomp.StompTestSupport.createBroker(StompTestSupport.java:174)
	at org.apache.activemq.transport.stomp.StompTestSupport.startBroker(StompTestSupport.java:112)
	at org.apache.activemq.transport.stomp.StompTestSupport.setUp(StompTestSupport.java:94)
	at org.apache.activemq.transport.stomp.Stomp12Test.setUp(Stomp12Test.java:41)
	at org.apache.activemq.transport.stomp.Stomp12SslAuthTest.setUp(Stomp12SslAuthTest.java:38)

On Tue, Feb 24, 2015 at 10:09 AM, Timothy Bish <tabish...@gmail.com>
wrote:

  On 02/24/2015 01:01 PM, Kevin Burton wrote:
  How long do the tests usually take?
  An eternity + 1hr
I’m looking at 45 minutes right now before I gave up… I think part of it
was that the box I was testing on was virtualized and didn’t have enough
resources.

I tried to parallelize the tests (-T 8 with maven) but I got other errors,
so I assume the ports are singletons.

  They won't be happy if you run them parallel.
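The usual culprit when parallel runs blow up is fixed broker ports baked into the tests, so two forks race for the same socket. A minimal sketch of the standard fix (plain JDK, nothing ActiveMQ-specific; this is not what the current tests do): ask the OS for a free ephemeral port by binding to port 0.

```java
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPort {
    // Ask the OS for any free port instead of hard-coding one; tests
    // that do this can run in parallel forks without clashing.
    static int freePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        int port = freePort();
        // A valid, OS-assigned TCP port is in (0, 65535].
        System.out.println(port > 0 && port <= 65535);
    }
}
```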


  On Tue, Feb 24, 2015 at 8:03 AM, Gary Tully <gary.tu...@gmail.com>
wrote:

If there are any test failures, try to run them individually:
-Dtest=a,b etc. There may be an issue with a full test run, but all of
the tests that are enabled should work. I know there are some issues
with jdbc tests that hang or fail due to previous runs not cleaning up,
but that should be most of it. I got a bunch of full test runs
before the 5.11 release if that is any help.

On 23 February 2015 at 20:38, Kevin Burton <bur...@spinn3r.com> wrote:

OK. This is ready to go and I have a patch branch:
https://issues.apache.org/jira/browse/AMQ-5609

I’m stuck at the moment though because tests don’t pass. But the tests
were failing before my changes, so I don’t think it has anything to do
with them.



On Sun, Feb 22, 2015 at 11:11 PM, Kevin Burton <bur...@spinn3r.com>

  wrote:
Actually, is the lock even needed here? Why would it be? If we’re
*removing* a subscription, why does it care if we possibly ALSO remove a
separate / isolated queue before/after the subscription is removed?

I think this is redundant and can be removed. Maybe I’m wrong though.

I looked at all the callers and none were associated with queues.

On Sun, Feb 22, 2015 at 11:07 PM, Kevin Burton <bur...@spinn3r.com>

  wrote:
So I have some working/theoretical code that should resolve this.

It acquires a lock *per* ActiveMQDestination, so that there is no lock
contention.

But here’s where I’m stuck.

    @Override
    public void removeSubscription(ConnectionContext context, RemoveSubscriptionInfo info) throws Exception {
        inactiveDestinationsPurgeLock.readLock().lock();
        try {
            topicRegion.removeSubscription(context, info);
        } finally {
            inactiveDestinationsPurgeLock.readLock().unlock();
        }
    }

… this is in RegionBroker. There is no ActiveMQDestination involved here,
so I’m not sure of the best way to resolve this.

Any advice?
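For what it's worth, a rough sketch of the per-destination lock idea (DestinationLocks is a made-up helper, not an ActiveMQ type): one ReentrantReadWriteLock per destination name, created lazily, so purging queue A never contends with operations on queue B. The RegionBroker case above would still need a fallback, e.g. the global lock, when no destination is in scope.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical helper, not ActiveMQ code: one lock per destination name.
public class DestinationLocks {
    private final ConcurrentMap<String, ReentrantReadWriteLock> locks =
            new ConcurrentHashMap<>();

    public ReentrantReadWriteLock forDestination(String physicalName) {
        // computeIfAbsent guarantees exactly one lock instance per name,
        // even when callers race on first access.
        return locks.computeIfAbsent(physicalName, k -> new ReentrantReadWriteLock());
    }

    public static void main(String[] args) {
        DestinationLocks dl = new DestinationLocks();
        ReentrantReadWriteLock a1 = dl.forDestination("queue.a");
        ReentrantReadWriteLock a2 = dl.forDestination("queue.a");
        ReentrantReadWriteLock b = dl.forDestination("queue.b");
        // Same destination -> same lock; different destinations -> different locks.
        System.out.println(a1 == a2);
        System.out.println(a1 != b);
    }
}
```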


On Sun, Feb 22, 2015 at 8:11 PM, Kevin Burton <bur...@spinn3r.com>

  wrote:
Yes. That was my thinking too… that just replacing the CopyOnWriteArraySet
with something more performant would solve the issue.

This would also improve queue creation time as well as queue deletion time.

What I think I’m going to do in the meantime is:

- implement a granular lock based on queue name… I am going to use an
interface so we can replace the implementation later.

- implement timing for the purge thread so it tracks how long it takes
to remove a queue but also how long the entire loop takes.

I’ll do this on a branch so it should be easy to merge.
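The timing part is the simpler half; a minimal sketch of the shape it could take (names and the sleep stand-in are made up, not ActiveMQ code): time each individual removal and the whole sweep with System.nanoTime.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;

// Sketch of the purge-thread timing idea: per-queue and whole-sweep timings.
public class PurgeTiming {
    public static void main(String[] args) throws InterruptedException {
        List<String> inactive = List.of("q1", "q2", "q3");
        long sweepStart = System.nanoTime();
        for (String queue : inactive) {
            long start = System.nanoTime();
            TimeUnit.MILLISECONDS.sleep(5); // stand-in for removeDestination()
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            System.out.println("purged " + queue + " in " + elapsedMs + " ms");
        }
        long sweepMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - sweepStart);
        // The sweep must take at least as long as its three 5 ms removals.
        System.out.println("full sweep took " + sweepMs + " ms, ok=" + (sweepMs >= 15));
    }
}
```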

On Sun, Feb 22, 2015 at 7:40 PM, Tim Bain <tb...@alumni.duke.edu>

  wrote:
A decent amount of the time is being spent calling remove() on various
array-backed collections. Those data structures might be inappropriate for
the number of destinations you're running, since array-backed collections
tend to have add/remove operations that are O(N); some improvement might
come from something as simple as moving to a ConcurrentHashSet instead of a
CopyOnWriteArraySet, for example. (Or it might make performance worse
because of other aspects of how those collections are used; people other
than me would be in a better position to evaluate the full range of
performance requirements for those collections.)

Scheduler.cancel() also takes an alarming amount of time for what looks
like a really simple method
(http://grepcode.com/file/repo1.maven.org/maven2/org.apache.activemq/activemq-all/5.10.0/org/apache/activemq/thread/Scheduler.java#Scheduler.cancel%28java.lang.Runnable%29).
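To make the O(N) point concrete: CopyOnWriteArraySet.remove copies the whole backing array on every call, while a hash-based set removes in O(1). The JDK has no ConcurrentHashSet class; ConcurrentHashMap.newKeySet() is the closest equivalent. A quick, unscientific drain comparison:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArraySet;

public class RemoveCost {
    // Remove every element one at a time, timing the whole drain.
    static long drainMillis(Set<Integer> set, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            set.remove(i);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int n = 20_000;
        Set<Integer> cow = new CopyOnWriteArraySet<>();
        Set<Integer> chm = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < n; i++) { cow.add(i); chm.add(i); }

        // O(N) removes (full array copy each time) vs O(1) hash removes.
        System.out.println("CopyOnWriteArraySet drain: " + drainMillis(cow, n) + " ms");
        System.out.println("ConcurrentHashMap keySet drain: " + drainMillis(chm, n) + " ms");
        System.out.println(cow.isEmpty() && chm.isEmpty());
    }
}
```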
On Sun, Feb 22, 2015 at 7:56 PM, Kevin Burton <bur...@spinn3r.com>
wrote:

Here’s a jprofiler view with the advisory support enabled if you’re
curious.

http://i.imgur.com/I1jesZz.jpg

I’m not familiar enough with the internals of ActiveMQ to have any obvious
optimization ideas.

One other idea I had (which would require a ton of refactoring I think)
would be to potentially bulk delete all the queues at once.

On Sun, Feb 22, 2015 at 6:42 PM, Kevin Burton <
bur...@spinn3r.com>

  wrote:
And spending some more time in jprofiler, it looks like 20% of this is due
to schedulerSupport and the other 80% is due to advisorySupport.

If I set both to false the total runtime of my tests drops in half… and
the latencies fall from

max create producer latency: 10,176 ms
max create message on existing producer and consumer: 2 ms

… to

max create producer latency: 1 ms
max create message on existing producer and consumer: 1 ms

and this is without fixing the purge background lock.

So the question now is what the heck the advisory support is doing that
can result in such massive performance overhead.

… and I think advisorySupport is enabled by default, so that’s problematic
as well.
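For anyone following along, both switches are plain setters on BrokerService (setAdvisorySupport and setSchedulerSupport; advisorySupport does default to true). A minimal non-persistent broker configuration with both disabled, as a sketch, not a recommendation for every deployment, since advisory messages are what feed features like dynamic network-of-brokers discovery:

```java
import org.apache.activemq.broker.BrokerService;

public class QuietBroker {
    public static void main(String[] args) throws Exception {
        BrokerService broker = new BrokerService();
        broker.setPersistent(false);       // matches the production setup above
        broker.setAdvisorySupport(false);  // defaults to true
        broker.setSchedulerSupport(false);
        broker.addConnector("tcp://localhost:0"); // port 0 = pick a free port
        broker.start();
        // ... run workload ...
        broker.stop();
    }
}
```

This is a configuration fragment; it needs the activemq-broker jar on the classpath to compile.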

On Sun, Feb 22, 2015 at 4:45 PM, Kevin Burton <

  bur...@spinn3r.com>
wrote:
OK. Loaded up JProfiler and confirmed that it’s not LevelDB. This is a
non-persistent broker I’m testing on.

Looks like it’s spending all its time in CopyOnWriteArrayList.remove and
Timer.purge…

Which is hopeful, because this is ALL due to ActiveMQ internals, and in
theory LevelDB should perform well if we improve the performance of
ActiveMQ internals and fix this lock bug.

Which would rock!

It should ALSO make queue creation faster.


On Sun, Feb 22, 2015 at 4:10 PM, Kevin Burton <

  bur...@spinn3r.com>
wrote:
On Sun, Feb 22, 2015 at 3:30 PM, Tim Bain <tb...@alumni.duke.edu> wrote:

So if LevelDB cleanup during removeDestination() is the presumed culprit,
can we spin off the LevelDB cleanup work into a separate thread (better: a
task object to be serviced by a ThreadPool so you can avoid a fork bomb if
we remove many destinations at once) so the call to removeDestination()
can return quickly and LevelDB can do its record-keeping in the background
without blocking message-processing?
Would that be possible? If the delete is pending on ActiveMQ there is a
race where a producer could re-create it unless the lock is held.

Though I guess if you dispatched to the GC thread WITH the lock still held
you would be ok, but I think if we use the existing purge thread then
we’re fine.

OK. I think I’m wrong about LevelDB being the issue. To be fair I wasn’t
100% certain before, but I should have specified.

Our current production broker is running with persistent=false… and I
just re-ran the tests without disk persistence and it has the same
problem.

So the main issue now is why the heck ActiveMQ is taking SO LONG to GC a
queue. It’s taking about 100ms, which is an insane amount of time
considering this is done all in memory.
Kevin
--

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>






















  --
Tim Bish
Sr Software Engineer | RedHat Inc.
tim.b...@redhat.com | www.redhat.com
skype: tabish121 | twitter: @tabish121
blog: http://timbish.blogspot.com/






