Yes, we are using pulsar-perf to do a simple throughput test (./bin/pulsar-perf produce --service-url pulsar://localhost --num-messages 1000000 -r 1000000 -s 256 perf-test). We don't understand why we get much higher throughput with standalone than with the components running separately (in Kubernetes pods).
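For reference, here is a rough back-of-the-envelope sketch of what those flags imply. It assumes that -r is a target rate in messages per second and -s is the payload size in bytes, which is my reading of the pulsar-perf help text; the function names are only illustrative and not part of any Pulsar tooling.

```python
# Values copied from the pulsar-perf command line above.
NUM_MESSAGES = 1_000_000   # --num-messages: total messages to publish
TARGET_RATE = 1_000_000    # -r: target publish rate (assumed messages/second)
MESSAGE_SIZE = 256         # -s: payload size (assumed bytes)


def min_duration_seconds(num_messages: int, target_rate: int) -> float:
    """Shortest possible run time if the producer actually hits the target rate."""
    return num_messages / target_rate


def payload_mb_per_second(rate: float, message_size: int) -> float:
    """Payload bandwidth in MB/s (1 MB = 10**6 bytes), ignoring protocol overhead."""
    return rate * message_size / 1_000_000


if __name__ == "__main__":
    duration = min_duration_seconds(NUM_MESSAGES, TARGET_RATE)
    bandwidth = payload_mb_per_second(TARGET_RATE, MESSAGE_SIZE)
    print(f"minimum run time at target rate: {duration} s")
    print(f"payload bandwidth at target rate: {bandwidth} MB/s")
```

If those assumptions hold, the target rate is an upper bound rather than a measurement, and a run that reaches it would finish in about one second, so start-up effects could dominate the reported number in either deployment.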
Andre

-----Original Message-----
From: Sijie Guo <guosi...@gmail.com>
Sent: 28 February 2020 16:38
To: Dev <dev@pulsar.apache.org>
Subject: Re: Performance and production readiness of Pulsar Standalone

On Fri, Feb 28, 2020 at 6:49 AM Kramer, Andre <andre.kra...@softwareag.com> wrote:

> Hello,
>
> We have found that Pulsar standalone (which has zookeeper and bookie
> in same Java process as the broker) on a simple throughput test showed
> over 3 times the message throughput rate as did 3 separate processes
> (Zookeeper, 1 Broker, 1 Bookkeeper) when deployed as pods in
> Kubernetes. All pods/containers ran on a single node as our test
> cluster just had a single node so only "virtualized" networking is
> involved. We found that containerization only had limited overhead
> when deploying Pulsar standalone with or without Kubernetes so the
> only difference we know about is that broker to bookie communication
> has to go via two Pods (Docker containers) on the same VM. We expected
> some overheads (network IP stack and extra context switching) but were
> really surprised by the > 3x throughput. Is there any architectural
> difference between Pulsar Standalone and separate Broker/Bookkeeper
> except for running in different processes? Such as direct
> communications (not over network sockets), sharing of data buffers or cache
> configuration that makes Pulsar standalone inherently faster?
>

I don't expect there will be a performance difference. It depends on how you run the test. Were you using pulsar-perf to do the test?

>
> Our second question is on production readiness of Pulsar Standalone.
> Is anyone using it in production if no fault tolerance (other than
> recovery on restart) is required? Or do people deploy a 1 node cluster
> which can be upgraded / managed more easily and scaled if needed?
>

Standalone was originally designed for development. You can for sure run it in production if there is no fault tolerance requirement.
However, it is better to start with a 1-node cluster and upgrade/scale as needed.

>
> Thanks in advance for your answers,
> Andre
>
> Andre Kramer
> andre.kra...@softwareag.com
>