Re: Performance and production readiness of Pulsar Standalone

Sijie Guo Mon, 02 Mar 2020 11:27:51 -0800

Andre,

Standalone mode disables fsync by default, whereas a cluster deployment
enables it by default. That is probably why you see the performance
difference.

Check whether the settings match between your standalone and cluster
deployments.
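
One way to do that check is to diff the two configuration files key by key.
The following is only a minimal sketch, not an official tool. It assumes both
deployments use plain key=value properties files, and the sample contents are
illustrative. The key name journalSyncData is, as far as I know, the
BookKeeper journal fsync setting, but treat it as an assumption and verify it
against the conf files shipped with your version.

```python
def load_properties(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def diff_settings(a, b):
    """Return {key: (value_in_a, value_in_b)} for shared keys whose values differ."""
    return {k: (a[k], b[k]) for k in sorted(a.keys() & b.keys()) if a[k] != b[k]}


# Illustrative file contents only; read your real conf files instead.
standalone_text = """
# standalone settings (sample)
journalSyncData=false
managedLedgerDefaultEnsembleSize=1
"""
cluster_text = """
# bookie settings (sample)
journalSyncData=true
managedLedgerDefaultEnsembleSize=1
"""

differences = diff_settings(load_properties(standalone_text),
                            load_properties(cluster_text))
for key, (standalone_value, cluster_value) in differences.items():
    print(f"{key}: standalone={standalone_value} cluster={cluster_value}")
```

To use it on real deployments, replace the sample strings with the contents
of the configuration files from each environment. Only keys present in both
files are compared, so settings that exist on one side only need a separate
look.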

- Sijie


On Mon, Mar 2, 2020 at 4:13 AM Kramer, Andre <andre.kra...@softwareag.com>
wrote:

> Yes, we are using pulsar-perf to do a simple throughput test
> (./bin/pulsar-perf produce --service-url pulsar://localhost --num-messages
> 1000000 -r 1000000 -s 256 perf-test). We're failing to understand why we
> get much higher throughput with standalone than running separately (in
> Kubernetes pods).
>
> Andre
>
> -----Original Message-----
> From: Sijie Guo <guosi...@gmail.com>
> Sent: 28 February 2020 16:38
> To: Dev <dev@pulsar.apache.org>
> Subject: Re: Performance and production readiness of Pulsar Standalone
>
> On Fri, Feb 28, 2020 at 6:49 AM Kramer, Andre <andre.kra...@softwareag.com>
> wrote:
>
> > Hello,
> >
> > We have found that Pulsar standalone (which has zookeeper and bookie
> > in same Java process as the broker) on a simple throughput test showed
> > over 3 times the message throughput rate as did 3 separate processes
> > (Zookeeper, 1 Broker, 1 Bookkeeper) when deployed as pods in
> > Kubernetes. All pods/containers ran on a single node as our test
> > cluster just had a single node so only "virtualized" networking is
> > involved. We found that containerization only had limited overhead
> > when deploying Pulsar standalone with or without Kubernetes so the
> > only difference we know about is that broker to bookie communication
> > has to go via two Pods (Docker containers) on the same VM. We expected
> > some overheads (network IP stack and extra context switching) but were
> > really surprised by the > 3x throughput difference. Is there any
> > architectural difference between Pulsar Standalone and separate
> > Broker/Bookkeeper except for running in different processes? Such as
> > direct communications (not over network sockets), sharing of data
> > buffers or cache configuration that makes Pulsar standalone inherently
> > faster?
> >
>
> I don't expect there to be a performance difference. It depends on how
> you run the test. Were you using pulsar-perf to do the test?
>
>
> >
> > Our second question is on production readiness of Pulsar Standalone.
> > Is anyone using it in production if no fault tolerance (other than
> > recovery on restart) is required? Or do people deploy a 1 node cluster
> > which can be upgraded / managed more easily and scaled if needed?
> >
>
> Standalone was originally designed for development. You can certainly run
> it in production if there is no fault tolerance requirement. However, it is
> better to start with a 1-node cluster and upgrade/scale as needed.
>
>
> >
> > Thanks in advance for your answers,
> > Andre
> >
> > Andre Kramer
> > andre.kra...@softwareag.com
> >
>
