Yes, we are using pulsar-perf to run a simple throughput test:

  ./bin/pulsar-perf produce --service-url pulsar://localhost --num-messages 1000000 -r 1000000 -s 256 perf-test

We don't understand why we get much higher throughput with standalone than
with the components running separately (in Kubernetes pods).
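
For reference, a rough back-of-the-envelope sketch of what that command asks
for. It assumes -r is the target publish rate in msg/s, -s is the message size
in bytes, and --num-messages is the total number of messages to publish; the
helper name perf_target is hypothetical and only for illustration.

```python
def perf_target(num_messages: int, rate_per_sec: int, size_bytes: int):
    """Return (target payload bytes/s, run time in s if the target rate is met)."""
    target_bytes_per_sec = rate_per_sec * size_bytes
    duration_sec = num_messages / rate_per_sec
    return target_bytes_per_sec, duration_sec


# Values taken from the pulsar-perf command above.
bytes_per_sec, duration = perf_target(1_000_000, 1_000_000, 256)
print(f"target payload rate: {bytes_per_sec / 1e6:.0f} MB/s")
print(f"run time at full target rate: {duration:.1f} s")
```

If a setup cannot sustain the target rate the run takes longer, but it is
still a short burst, so start-up and warm-up effects may weigh heavily in the
reported numbers.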

Andre

-----Original Message-----
From: Sijie Guo <guosi...@gmail.com> 
Sent: 28 February 2020 16:38
To: Dev <dev@pulsar.apache.org>
Subject: Re: Performance and production readiness of Pulsar Standalone

On Fri, Feb 28, 2020 at 6:49 AM Kramer, Andre <andre.kra...@softwareag.com>
wrote:

> Hello,
>
> We have found that Pulsar standalone (which has zookeeper and bookie 
> in same Java process as the broker) on a simple throughput test showed 
> over 3 times the message throughput rate as did 3 separate processes 
> (Zookeeper, 1 Broker, 1 Bookkeeper) when deployed as pods in 
> Kubernetes. All pods/containers ran on a single node as our test 
> cluster just had a single node so only "virtualized" networking is 
> involved. We found that containerization only had limited overhead 
> when deploying Pulsar standalone with or without Kubernetes so the 
> only difference we know about is that broker to bookie communication 
> has to go via two Pods (Docker containers) on the same VM. We expected 
> some overheads (network IP stack and extra context switching) but were 
> really surprised by the > 3x throughput difference. Is there any architectural 
> difference between Pulsar Standalone and separate Broker/Bookkeeper 
> except for running in different processes? Such as direct 
> communications (not over network sockets), sharing of data buffers or cache 
> configuration that makes Pulsar standalone inherently faster?
>

I don't expect there to be a performance difference. It depends on how you 
run the test. Were you using pulsar-perf to do the test?


>
> Our second question is on production readiness of Pulsar Standalone.
> Is anyone using it in production if no fault tolerance (other than
> recovery on restart) is required? Or do people deploy a 1 node cluster
> which can be upgraded / managed more easily and scaled if needed?
>

Standalone was originally designed for development. You can certainly run it in 
production if there is no fault-tolerance requirement. However, it is better to 
start with a 1-node cluster and upgrade/scale as needed.


>
> Thanks in advance for your answers,
> Andre
>
> Andre Kramer
> andre.kra...@softwareag.com
>
