Hi Rob,

I have the trunk code, which I am testing with, but I haven't finished the test runs yet. I was hoping that once I validate the change, I can simply release 6.0.5.
Thanks
Ramayan

On Thu, Oct 27, 2016 at 12:41 PM, Rob Godfrey <[email protected]> wrote:

> Hi Ramayan,
>
> did you verify that the change works for you? You said you were going to
> test with the trunk code...
>
> I'll discuss with the other developers tomorrow about whether we can put
> this change into 6.0.5.
>
> Cheers,
> Rob
>
> On 27 October 2016 at 20:30, Ramayan Tiwari <[email protected]>
> wrote:
>
> > Hi Rob,
> >
> > I looked at the release notes for 6.0.5 and it doesn't include the fix for
> > large consumers issues [1]. The fix is marked for 6.1, which will not have
> > JMX and for us to use this version requires major changes in our monitoring
> > framework. Could you please include the fix in 6.0.5 release?
> >
> > Thanks
> > Ramayan
> >
> > [1]. https://issues.apache.org/jira/browse/QPID-7462
> >
> > On Wed, Oct 19, 2016 at 4:49 PM, Helen Kwong <[email protected]> wrote:
> >
> > > Hi Rob,
> > >
> > > Again, thank you so much for answering our questions and providing a patch
> > > so quickly :) One more question I have: would it be possible to include
> > > test cases involving many queues and listeners (in the order of thousands
> > > of queues) for future Qpid releases, as part of standard perf testing of
> > > the broker?
> > >
> > > Thanks,
> > > Helen
> > >
> > > On Tue, Oct 18, 2016 at 10:40 AM, Ramayan Tiwari <[email protected]>
> > > wrote:
> > >
> > >> Thanks so much Rob, I will test the patch against trunk and will update
> > >> you with the outcome.
> > >>
> > >> - Ramayan
> > >>
> > >> On Tue, Oct 18, 2016 at 2:37 AM, Rob Godfrey <[email protected]>
> > >> wrote:
> > >>
> > >>> On 17 October 2016 at 21:50, Rob Godfrey <[email protected]>
> > >>> wrote:
> > >>>
> > >>> >
> > >>> >
> > >>> > On 17 October 2016 at 21:24, Ramayan Tiwari <[email protected]>
> > >>> > wrote:
> > >>> >
> > >>> >> Hi Rob,
> > >>> >>
> > >>> >> We are certainly interested in testing the "multi queue consumers"
> > >>> >> behavior with your patch in the new broker. We would like to know:
> > >>> >>
> > >>> >> 1. What will the scope of changes be: client, broker, or both? We are
> > >>> >> currently running the 0.16 client, so we would like to make sure that we
> > >>> >> will be able to use these changes with the 0.16 client.
> > >>> >>
> > >>> >>
> > >>> > There's no change to the client. I can't remember what was in the 0.16
> > >>> > client... the only issue would be if there are any bugs in the parsing of
> > >>> > address arguments. I can try to test that out tmr.
> > >>> >
> > >>>
> > >>>
> > >>> OK - with a little bit of care to get round the address parsing issues in
> > >>> the 0.16 client... I think we can get this to work.
> > >>> I've created the following JIRA:
> > >>>
> > >>> https://issues.apache.org/jira/browse/QPID-7462
> > >>>
> > >>> and attached to it are a patch which applies against trunk, and a
> > >>> separate patch which applies against the 6.0.x branch
> > >>> (https://svn.apache.org/repos/asf/qpid/java/branches/6.0.x - this is
> > >>> 6.0.4 plus a few other fixes which we will soon be releasing as 6.0.5)
> > >>>
> > >>> To create a consumer which uses this feature (and multi queue
> > >>> consumption) for the 0.16 client you need to use something like the
> > >>> following as the address:
> > >>>
> > >>> queue_01 ; {node : { type : queue }, link : { x-subscribes : {
> > >>> arguments : { x-multiqueue : [ queue_01, queue_02, queue_03 ],
> > >>> x-pull-only : true }}}}
> > >>>
> > >>>
> > >>> Note that the initial queue_01 has to be a name of an actual queue on
> > >>> the virtual host, but otherwise it is not actually used (if you were
> > >>> using a 0.32 or later client you could just use '' here). The actual
> > >>> queues that are consumed from are in the list value associated with
> > >>> x-multiqueue. For my testing I created a list with 3000 queues here
> > >>> and this worked fine.
> > >>>
> > >>> Let me know if you have any questions / issues,
> > >>>
> > >>> Hope this helps,
> > >>> Rob
> > >>>
> > >>>
> > >>> >
> > >>> >
> > >>> >> 2. My understanding is that the "pull vs push" change is only with
> > >>> >> respect to the broker and it does not change our architecture where we
> > >>> >> use MessageListener to receive messages asynchronously.
> > >>> >>
> > >>> >
> > >>> > Exactly - this is only a change within the internal broker threading
> > >>> > model. The external behaviour of the broker remains essentially
> > >>> > unchanged.
> > >>> >
> > >>> >
> > >>> >>
> > >>> >> 3.
> > >>> >> Once I/O refactoring is complete, we would be able to go back to
> > >>> >> using a standard JMS consumer (Destination). What is the timeline and
> > >>> >> broker release version for the completion of this work?
> > >>> >>
> > >>> >
> > >>> > You might wish to continue to use the "multi queue" model, depending on
> > >>> > your actual use case, but yeah once the I/O work is complete I would hope
> > >>> > that you could use the thousands of consumers model should you wish. We
> > >>> > don't have a schedule for the next phase of I/O rework right now - about
> > >>> > all I can say is that it is unlikely to be complete this year. I'd need to
> > >>> > talk with Keith (who is currently on vacation) as to when we think we may
> > >>> > be able to schedule it.
> > >>> >
> > >>> >
> > >>> >>
> > >>> >> Let me know once you have integrated the patch and I will re-run our
> > >>> >> performance tests to validate it.
> > >>> >>
> > >>> >>
> > >>> > I'll make a patch for 6.0.x presently (I've been working on a change
> > >>> > against trunk - the patch will probably have to change a bit to apply to
> > >>> > 6.0.x).
> > >>> >
> > >>> > Cheers,
> > >>> > Rob
> > >>> >
> > >>> >> Thanks
> > >>> >> Ramayan
> > >>> >>
> > >>> >> On Sun, Oct 16, 2016 at 3:30 PM, Rob Godfrey <[email protected]>
> > >>> >> wrote:
> > >>> >>
> > >>> >> > OK - so having pondered / hacked around a bit this weekend, I think to
> > >>> >> > get decent performance from the IO model in 6.0 for your use case we're
> > >>> >> > going to have to change things around a bit.
> > >>> >> >
> > >>> >> > Basically 6.0 is an intermediate step on our IO / threading model
> > >>> >> > journey.
> > >>> >> > In earlier versions we used 2 threads per connection for IO (one read,
> > >>> >> > one write) and then extra threads from a pool to "push" messages from
> > >>> >> > queues to connections.
> > >>> >> >
> > >>> >> > In 6.0 we moved to using a pool for the IO threads, and also stopped
> > >>> >> > queues from "pushing" to connections while the IO threads were acting on
> > >>> >> > the connection. It's this latter fact which is screwing up performance
> > >>> >> > for your use case here because what happens is that on each network read
> > >>> >> > we tell each consumer to stop accepting pushes from the queue until the
> > >>> >> > IO interaction has completed. This is causing lots of loops over your
> > >>> >> > 3000 consumers on each session, which is eating up a lot of CPU on every
> > >>> >> > network interaction.
> > >>> >> >
> > >>> >> > In the final version of our IO refactoring we want to remove the
> > >>> >> > "pushing" from the queue, and instead have the consumers "pull" - so
> > >>> >> > that the only threads that operate on the queues (outside of
> > >>> >> > housekeeping tasks like expiry) will be the IO threads.
> > >>> >> >
> > >>> >> > So, what we could do (and I have a patch sitting on my laptop for this)
> > >>> >> > is to look at using the "multi queue consumers" work I did for you guys
> > >>> >> > before, but augmenting this so that the consumers work using a "pull"
> > >>> >> > model rather than the push model. This will guarantee strict fairness
> > >>> >> > between the queues associated with the consumer (which was the issue you
> > >>> >> > had with this functionality before, I believe). Using this model you'd
> > >>> >> > only need a small number (one?) of consumers per session.
> > >>> >> > The patch I have is to add this "pull" mode for these consumers
> > >>> >> > (essentially this is a preview of how all consumers will work in the
> > >>> >> > future).
> > >>> >> >
> > >>> >> > Does this seem like something you would be interested in pursuing?
> > >>> >> >
> > >>> >> > Cheers,
> > >>> >> > Rob
> > >>> >> >
> > >>> >> > On 15 October 2016 at 17:30, Ramayan Tiwari <[email protected]>
> > >>> >> > wrote:
> > >>> >> >
> > >>> >> > > Thanks Rob. Apologies for sending this over the weekend :(
> > >>> >> > >
> > >>> >> > > Are there any docs on the new threading model? I found this on
> > >>> >> > > confluence:
> > >>> >> > >
> > >>> >> > > https://cwiki.apache.org/confluence/display/qpid/IO+Transport+Refactoring
> > >>> >> > >
> > >>> >> > > We are also interested in understanding the threading model a little
> > >>> >> > > better to help us figure out its impact for our usage patterns. It would
> > >>> >> > > be very helpful if there are more docs/JIRA/email-threads with some
> > >>> >> > > details.
> > >>> >> > >
> > >>> >> > > Thanks
> > >>> >> > >
> > >>> >> > > On Sat, Oct 15, 2016 at 9:21 AM, Rob Godfrey <[email protected]>
> > >>> >> > > wrote:
> > >>> >> > >
> > >>> >> > > > So I *think* this is an issue because of the extremely large number
> > >>> >> > > > of consumers. The threading model in v6 means that whenever a network
> > >>> >> > > > read occurs for a connection, it iterates over the consumers on that
> > >>> >> > > > connection - obviously where there are a large number of consumers
> > >>> >> > > > this is burdensome. I fear addressing this may not be a trivial
> > >>> >> > > > change... I shall spend the rest of my afternoon pondering this...
> > >>> >> > > >
> > >>> >> > > > - Rob
> > >>> >> > > >
> > >>> >> > > > On 15 October 2016 at 17:14, Ramayan Tiwari <[email protected]>
> > >>> >> > > > wrote:
> > >>> >> > > >
> > >>> >> > > > > Hi Rob,
> > >>> >> > > > >
> > >>> >> > > > > Thanks so much for your response. We use transacted sessions with
> > >>> >> > > > > non-persistent delivery. Prefetch size is 1 and every message is the
> > >>> >> > > > > same size (200 bytes).
> > >>> >> > > > >
> > >>> >> > > > > Thanks
> > >>> >> > > > > Ramayan
> > >>> >> > > > >
> > >>> >> > > > > On Sat, Oct 15, 2016 at 2:59 AM, Rob Godfrey <[email protected]>
> > >>> >> > > > > wrote:
> > >>> >> > > > >
> > >>> >> > > > > > Hi Ramayan,
> > >>> >> > > > > >
> > >>> >> > > > > > this is interesting... in our testing (which admittedly didn't cover
> > >>> >> > > > > > the case of this many queues / listeners) we saw the 6.0.x broker
> > >>> >> > > > > > using less CPU on average than the 0.32 broker. I'll have a look this
> > >>> >> > > > > > weekend as to why creating the listeners is slower. On the dequeuing,
> > >>> >> > > > > > can you give a little more information on the usage pattern - are you
> > >>> >> > > > > > using transactions, auto-ack or client ack? What prefetch size are
> > >>> >> > > > > > you using? How large are your messages?
> > >>> >> > > > > >
> > >>> >> > > > > > Thanks,
> > >>> >> > > > > > Rob
> > >>> >> > > > > >
> > >>> >> > > > > > On 14 October 2016 at 23:46, Ramayan Tiwari <[email protected]>
> > >>> >> > > > > > wrote:
> > >>> >> > > > > >
> > >>> >> > > > > > > Hi All,
> > >>> >> > > > > > >
> > >>> >> > > > > > > We have been validating the new Qpid broker (version 6.0.4) and have
> > >>> >> > > > > > > compared against broker version 0.32 and are seeing major
> > >>> >> > > > > > > regressions. Following is the summary of our test setup and results:
> > >>> >> > > > > > >
> > >>> >> > > > > > > *1. Test Setup*
> > >>> >> > > > > > > *a).* Qpid broker runs on a dedicated host (12 cores, 32 GB RAM).
> > >>> >> > > > > > > *b).* For 0.32, we allocated 16 GB heap. For the 6.0.4 broker, we use
> > >>> >> > > > > > > 8 GB heap and 8 GB direct memory.
> > >>> >> > > > > > > *c).* For 6.0.4, flow to disk has been configured at 60%.
> > >>> >> > > > > > > *d).* Both the brokers use BDB host type.
> > >>> >> > > > > > > *e).* Brokers have around 6000 queues and we create 16 listener
> > >>> >> > > > > > > sessions/threads spread over 3 connections, where each session is
> > >>> >> > > > > > > listening to 3000 queues. However, messages are only enqueued and
> > >>> >> > > > > > > processed from 10 queues.
> > >>> >> > > > > > > *f).* We enqueue 1 million messages across 10 different queues
> > >>> >> > > > > > > (evenly divided), at the start of the test. Dequeue only starts once
> > >>> >> > > > > > > all the messages have been enqueued. We run the test for 2 hours and
> > >>> >> > > > > > > process as many messages as we can.
> > >>> >> > > > > > > Each message runs for around 200 milliseconds.
> > >>> >> > > > > > > *g).* We have used both 0.16 and 6.0.4 clients for these tests
> > >>> >> > > > > > > (6.0.4 client only with 6.0.4 broker)
> > >>> >> > > > > > >
> > >>> >> > > > > > > *2. Test Results*
> > >>> >> > > > > > > *a).* System Load Average (read notes below on how we compute it),
> > >>> >> > > > > > > for 6.0.4 broker is 5x compared to 0.32 broker. During start of the
> > >>> >> > > > > > > test (when we are not doing any dequeue), load average is normal
> > >>> >> > > > > > > (0.05 for 0.32 broker and 0.1 for new broker), however, while we are
> > >>> >> > > > > > > dequeuing messages, the load average is very high (around 0.5
> > >>> >> > > > > > > consistently).
> > >>> >> > > > > > >
> > >>> >> > > > > > > *b).* Time to create listeners in new broker has gone up by 220%
> > >>> >> > > > > > > compared to 0.32 broker (when using 0.16 client). For old broker,
> > >>> >> > > > > > > creating 16 sessions each listening to 3000 queues takes 142 seconds
> > >>> >> > > > > > > and in new broker it took 456 seconds. If we use 6.0.4 client, it
> > >>> >> > > > > > > took even longer at 524% increase (887 seconds).
> > >>> >> > > > > > > *I).* The time to create consumers increases as we create more
> > >>> >> > > > > > > listeners on the same connections.
> > >>> >> > > > > > > We have 20 sessions (but end up using around 5 of them) on each
> > >>> >> > > > > > > connection and we create about 3000 consumers and attach
> > >>> >> > > > > > > MessageListener to it. Each successive session takes longer
> > >>> >> > > > > > > (approximately linear increase) to set up the same number of consumers
> > >>> >> > > > > > > and listeners.
> > >>> >> > > > > > >
> > >>> >> > > > > > > *3). How we compute System Load Average*
> > >>> >> > > > > > > We query the MBean SystemLoadAverage and divide it by the value of
> > >>> >> > > > > > > MBean AvailableProcessors. Both of these MBeans are available under
> > >>> >> > > > > > > java.lang.OperatingSystem.
> > >>> >> > > > > > >
> > >>> >> > > > > > > I am not sure what is causing these regressions and would like your
> > >>> >> > > > > > > help in understanding it. We are aware of the changes with respect
> > >>> >> > > > > > > to the threading model in the new broker. Are there any design docs
> > >>> >> > > > > > > that we can refer to, to understand these changes at a high level?
> > >>> >> > > > > > > Can we tune some parameters to address these issues?
> > >>> >> > > > > > >
> > >>> >> > > > > > > Thanks
> > >>> >> > > > > > > Ramayan
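
For readers who want to reproduce the "System Load Average" figure described in the original report, here is a minimal sketch of that calculation. It is not the code used in the tests: the thread queried the broker's MBeans remotely over JMX, whereas this sketch reads the same two attributes (SystemLoadAverage and AvailableProcessors of java.lang:type=OperatingSystem) from the local JVM. The class name NormalizedLoad and the -1 "unavailable" return value are assumptions made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class NormalizedLoad {

    /**
     * Divides the system load average by the processor count.
     * Returns -1 when no load average is available; the JDK reports
     * that case as a negative SystemLoadAverage (for example on Windows).
     */
    static double normalizedLoad(double systemLoadAverage, int availableProcessors) {
        if (systemLoadAverage < 0 || availableProcessors <= 0) {
            return -1.0;
        }
        return systemLoadAverage / availableProcessors;
    }

    public static void main(String[] args) {
        // Local view of the java.lang:type=OperatingSystem MBean.
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        double value = normalizedLoad(os.getSystemLoadAverage(), os.getAvailableProcessors());
        System.out.println("normalized load average: " + value);
    }
}
```

On the 12-core host described in the test setup, a normalized value of about 0.5 would correspond to a raw load average of roughly 6.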

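The multi-queue address Rob quotes earlier in the thread gets unwieldy once the x-multiqueue list holds thousands of names, so it may be easier to generate it. The sketch below only builds the address string in the shape shown in the thread; it does not touch the Qpid client API. The class name MultiQueueAddress and the queue_NN naming scheme are hypothetical, and whether a given client version parses the result still needs to be checked against that client.

```java
import java.util.ArrayList;
import java.util.List;

public class MultiQueueAddress {

    /**
     * Builds an address of the form quoted in the thread. firstQueue must
     * name a real queue on the virtual host (per Rob's note for the 0.16
     * client); queues is the list that is actually consumed from.
     */
    static String build(String firstQueue, List<String> queues) {
        return firstQueue
                + " ; {node : { type : queue }, link : { x-subscribes : { arguments : { x-multiqueue : [ "
                + String.join(", ", queues)
                + " ], x-pull-only : true }}}}";
    }

    /** Hypothetical naming scheme: queue_01, queue_02, ... */
    static List<String> queueNames(int count) {
        List<String> names = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            names.add(String.format("queue_%02d", i));
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> queues = queueNames(3);
        System.out.println(build(queues.get(0), queues));
    }
}
```

With three queues this prints the same address as the example in the thread, on a single line. Passing queueNames(3000) would give a list of the size Rob mentions testing with.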