Re: Unexplained slow sending

James Green Thu, 29 Aug 2019 10:13:38 -0700

Tim,

The NIC is a potential issue. We're trying to deploy via Fargate
where possible to shift the operational burden away from our developer
staff, so noisy neighbours are entirely possible, as is cross-AZ latency.


Touch wood, I seem to be in a place where throughput is at least as good as
existing production. I've yet to "liven" all the potential traffic patterns
to simulate the more complex loads but it may be good enough for now.


On Thu, 29 Aug 2019 at 13:40, Tim Bain <tb...@alumni.duke.edu> wrote:

> Might the choke point be the NIC on the EC2 instance? If you run the
> consumers for A and B on different EC2s, how does that throughput compare
> to what you're seeing?
>
> Also, I'd recommend you use JVisualVM or similar to capture a CPU sampling
> (not profiling!) snapshot of your producer program to see where it's
> spending its time. If there's a significant amount of time spent anywhere
> except making the network call to send the bytes of the payload, then dig
> into that.
>
> Tim
>
> On Wed, Aug 28, 2019, 12:03 PM alan protasio <alanp...@gmail.com> wrote:
>
> > Hi,
> >
> > I think you can try disabling concurrentStoreAndDispatchQueues and rerun
> > the tests.
> >
> > Alan Diego
> >
> >
> > On Wed, Aug 28, 2019 at 9:42 AM James Green <james.mk.gr...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > Following-up as I've run more tests.
> > >
> > > My minimal producer suffered the same bug as our main application: we had
> > > Spring Boot ActiveMQ thread pooling turned on as a property, but the
> > > library (referenced in the main docs) was not included. Looking back at my
> > > rather sparse notes, when I activated this my sends of 10K messages
> > > went from taking 11m47s to between 2m56s and 3m31s, which is a marked
> > > improvement.
> > >
> > > To my chagrin, this has made little difference to our real-world
> > > application, and so I have modified my minimal producer to be capable of
> > > sending to the main application via its queues.
> > >
> > > Allow me to elaborate at this point as it's important to understand what
> > > I'm looking at...
> > >
> > > The messages follow a small path through a series of queues as they are
> > > processed. Queue A -> B -> C.
> > >
> > > If my minimal producer sends to Queue C (skipping A and B) I'm able to
> > > produce at 49/s which is "quick enough".
> > > If my minimal producer sends to Queue B (skipping A) I'm able to produce
> > > at 28/s - 38/s which is variable but most of the tests reached 38/s.
> > > If my minimal producer sends to Queue A I'm able to produce at 28/s -
> > > 42/s - again variable.
> > >
> > > Now Queues A and B are consumed by separate Camel routes inside the same
> > > application. Queue C is entirely separate.
> > >
> > > Looking at throughput graphs of the consumption of Queue C, when first
> > > going through (A,B) for 10K messages, then going through (B), I can see
> > > (A,B) is twice as slow.
> > >
> > > I'm left wondering if there's contention somehow within the application
> > > consuming from (A,B) that is only showing up during load testing on AWS.
> > > I was not expecting it would be 2x slower unless the producer thread is
> > > shared - you might imagine a thread pool would solve that!
> > >
> > > At this point I have ensured that there are 4 instances of each application
> > > and they can happily deal with about 50 messages per second across the
> > > queues with persistence on. I am uncertain whether I should be expecting
> > > more.
> > >
> > > If anyone has insights on why the two routes within the same application
> > > appear contended and indeed on whether overall throughput should be a lot
> > > higher I'd love to hear it.
> > >
> > > James
> > >
> > >
> > > On Thu, 22 Aug 2019 at 14:02, Tim Bain <tb...@alumni.duke.edu> wrote:
> > >
> > > > Can you create a minimal producer via the OpenWire protocol in Java or
> > > > another language of your choice, to determine if your Camel producer is
> > > > slow because it's OpenWire or because it's Camel? I suspect you'll find
> > > > that OpenWire is the culprit, not Camel, but let's confirm that.
> > > >
> > > > All of these numbers sound tiny compared to what the ActiveMQ product is
> > > > capable of (though I don't have any insight into how Amazon has configured
> > > > the brokers, nor into any code customizations they might have made). If you
> > > > run multiple minimal producers in parallel, does throughput increase
> > > > linearly?
> > > >
> > > > Also, you say you're testing with small payloads; are they small enough
> > > > that you might be running into the Nagle algorithm on your TCP sockets? If
> > > > you use larger (e.g. 1KB) payloads, what does that do to your throughput on
> > > > a single producer?
> > > >
> > > > Tim
> > > >
> > > > On Thu, Aug 22, 2019, 2:54 AM James Green <james.mk.gr...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I've been busy shifting an existing workload into AWS recently, and a load
> > > > > test shows a serious performance drop when sending to ActiveMQ which I
> > > > > could use some advice on.
> > > > >
> > > > > Quick architecture summary: We send requests via a webserver that are
> > > > > forwarded as messages to a queue. A backend receives these messages and
> > > > > forwards them onward to another queue. Spring Boot with Camel powers the
> > > > > show within Docker containers. Messages are persistent.
> > > > >
> > > > > Story so far:
> > > > >
> > > > > Tests show this first queue builds rapidly with pending messages yet
> > > > > monitoring of our existing production environment shows no such backlog.
> > > > >
> > > > > Our existing production environment has everything in a single DC so it's
> > > > > super low latency. Our AWS environment uses Fargate with AmazonMQ. I
> > > > > understand send latency will be higher and AmazonMQ will store the messages
> > > > > across three AZs.
> > > > >
> > > > > So I launched a small EC2 instance to run some comparison tests:
> > > > >
> > > > > Receiving via a Camel route is super quick. This is not a problem.
> > > > > Sending via a minimal Camel route is super slow. 14 messages per second. We
> > > > > appear to be doing at least 20-30 per second in production but it's enough
> > > > > of a difference.
> > > > > Sending via PHP with stomp-php setting both persistence on and receipt
> > > > > headers on is substantially faster than sending via Camel. 55 messages per
> > > > > second.
> > > > > Tests have been with 10K small payloads.
> > > > >
> > > > > At this point I'm thinking that both Camel and PHP should be sending with
> > > > > the same properties - synchronously and with persistence. The messages on
> > > > > the queue are flagged persistent when viewed by the web console.
> > > > >
> > > > > Can anyone provide further suggestions to try?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > James
> > > > >
> > > >
> > >
> >
>
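
For anyone comparing the figures in this thread, the wall-clock timings
quoted for the 10K-message runs (11m47s before pooling was actually active,
then 2m56s to 3m31s) can be converted to message rates with a few lines of
Python. This is only arithmetic on numbers already given above, not a
benchmark; the helper names parse_duration and rate are made up for
illustration.

```python
def parse_duration(text: str) -> int:
    """Convert a duration written like '11m47s' into whole seconds."""
    minutes, _, seconds = text.rstrip("s").partition("m")
    return int(minutes) * 60 + int(seconds)


def rate(messages: int, duration: str) -> float:
    """Average messages per second for a run of the given length."""
    return messages / parse_duration(duration)


MESSAGES = 10_000  # batch size quoted in the thread

for label, duration in [
    ("before pooling", "11m47s"),
    ("after pooling, slowest run", "3m31s"),
    ("after pooling, fastest run", "2m56s"),
]:
    print(f"{label}: {rate(MESSAGES, duration):.1f} msg/s")
```

The first figure works out to roughly 14 msg/s, which lines up with the
Camel send rate reported in the original message; the later runs land at
roughly 47 to 57 msg/s.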
