RE: Performance and production readiness of Pulsar Standalone

Kramer, Andre Wed, 04 Mar 2020 08:29:24 -0800

Yes, I missed the journalSyncData setting when deploying standalone on 
Kubernetes.


Thanks a lot for spotting that!
Andre

-----Original Message-----
From: Sijie Guo <guosi...@gmail.com> 
Sent: 02 March 2020 19:27
To: Dev <dev@pulsar.apache.org>
Subject: Re: Performance and production readiness of Pulsar Standalone

Andre,

Standalone disables fsync by default. If you deploy Pulsar in cluster mode,
fsync is enabled by default. That's probably why you see the performance
difference.

You need to check whether the settings match between standalone and cluster.
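
As an illustration of such a check (a minimal sketch, not part of the Pulsar tooling), the two configuration files can be parsed as key=value properties and compared key by key. The helper names parse_properties and diff_settings are hypothetical, and the inline sample settings are illustrative only; in practice the text would be read from the real files, assumed here to be conf/standalone.conf and conf/bookkeeper.conf in a default layout.

```python
def parse_properties(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(left, right):
    """Return {key: (left_value, right_value)} for keys whose values differ.

    A key missing on one side is reported with None for that side.
    """
    diffs = {}
    for key in sorted(set(left) | set(right)):
        if left.get(key) != right.get(key):
            diffs[key] = (left.get(key), right.get(key))
    return diffs


# Illustrative excerpts only (assumption): real files contain many more keys.
standalone_text = """
# standalone settings
journalSyncData=false
journalMaxGroupWaitMSec=1
"""

cluster_text = """
# bookie settings in cluster mode
journalSyncData=true
journalMaxGroupWaitMSec=1
"""

differences = diff_settings(
    parse_properties(standalone_text),
    parse_properties(cluster_text),
)

for key, (standalone_value, cluster_value) in differences.items():
    print(f"{key}: standalone={standalone_value} cluster={cluster_value}")
```

With real deployments, replacing the inline strings with the contents of the two files (for example via open(path).read()) would list every setting that differs, including journalSyncData if it was left at different values.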

- Sijie


On Mon, Mar 2, 2020 at 4:13 AM Kramer, Andre <andre.kra...@softwareag.com>
wrote:

> Yes, we are using pulsar-perf to do a simple throughput test
> (./bin/pulsar-perf produce --service-url pulsar://localhost
> --num-messages 1000000 -r 1000000 -s 256 perf-test). We're failing to
> understand why we get much higher throughput with standalone than when
> running the components separately (in Kubernetes pods).
>
> Andre
>
> -----Original Message-----
> From: Sijie Guo <guosi...@gmail.com>
> Sent: 28 February 2020 16:38
> To: Dev <dev@pulsar.apache.org>
> Subject: Re: Performance and production readiness of Pulsar Standalone
>
> On Fri, Feb 28, 2020 at 6:49 AM Kramer, Andre
> <andre.kra...@softwareag.com> wrote:
>
> > Hello,
> >
> > We have found that Pulsar standalone (which has zookeeper and bookie 
> > in same Java process as the broker) on a simple throughput test 
> > showed over 3 times the message throughput rate as did 3 separate 
> > processes (Zookeeper, 1 Broker, 1 Bookkeeper) when deployed as pods 
> > in Kubernetes. All pods/containers ran on a single node as our test 
> > cluster just had a single node so only "virtualized" networking is 
> > involved. We found that containerization only had limited overhead 
> > when deploying Pulsar standalone with or without Kubernetes so the 
> > only difference we know about is that broker to bookie communication 
> > has to go via two Pods (Docker containers) on the same VM. We 
> > expected some overheads (network IP stack and extra context 
> > switching) but were really surprised by the > 3x throughput
> > difference. Is there any architectural difference between Pulsar
> > Standalone and separate Broker/Bookkeeper, other than running in
> > different processes, such as direct communication (not over network
> > sockets), sharing of data buffers, or cache configuration, that makes
> > Pulsar standalone inherently faster?
> >
>
> I don't expect there to be a performance difference. It depends on how
> you run the test. Were you using pulsar-perf to do the test?
>
>
> >
> > Our second question is on production readiness of Pulsar Standalone.
> > Is anyone using it in production if no fault tolerance (other than
> > recovery on restart) is required? Or do people deploy a 1 node
> > cluster which can be upgraded / managed more easily and scaled if
> > needed?
> >
>
> Standalone was originally designed for development. You can for sure
> run it in production if there is no fault tolerance requirement.
> However, it is better to start with a 1-node cluster and upgrade/scale
> as needed.
>
>
> >
> > Thanks in advance for your answers,
> > Andre
> >
> > Andre Kramer
> > andre.kra...@softwareag.com
> >
>
