Hi Lari

> It could be useful to throttle based on WriteBufferWaterMark settings,
> but the settings should be much higher than the defaults. Throttling
> should be implemented using ServerCnxThrottleTracker's
> incrementThrottleCount/decrementThrottleCount. However, this solution
> wouldn't limit the overall broker memory usage and therefore wouldn't
> be the primary solution for addressing the presented issue.

Highlight: the purpose of the fix is to solve the high memory usage caused
by a single channel when it is not writable. It is not intended to limit
broker-level memory usage.
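
To make the intent concrete, below is a minimal Netty-style sketch of the idea
(not the actual code in PR 24423): stop reading new commands from a connection
whose outbound buffer already holds too much pending output, and resume once
the backlog drains. The handler name and the threshold constant are
illustrative only; the PR proposes a configurable limit.

    import io.netty.channel.ChannelHandlerContext;
    import io.netty.channel.ChannelInboundHandlerAdapter;
    import io.netty.channel.ChannelOutboundBuffer;

    public class PauseReadsWhenOverloadedHandler extends ChannelInboundHandlerAdapter {

        // Illustrative threshold; the PR proposes making this configurable.
        private static final long MAX_PENDING_WRITE_BYTES = 16 * 1024 * 1024;

        @Override
        public void channelReadComplete(ChannelHandlerContext ctx) throws Exception {
            ChannelOutboundBuffer out = ctx.channel().unsafe().outboundBuffer();
            if (out != null && out.totalPendingWriteBytes() > MAX_PENDING_WRITE_BYTES) {
                // Too many responses queued for this single connection: stop
                // accepting new requests instead of buffering more in memory.
                ctx.channel().config().setAutoRead(false);
            }
            super.channelReadComplete(ctx);
        }

        @Override
        public void channelWritabilityChanged(ChannelHandlerContext ctx) throws Exception {
            if (ctx.channel().isWritable() && !ctx.channel().config().isAutoRead()) {
                // The socket has drained the backlog; resume reading requests.
                ctx.channel().config().setAutoRead(true);
            }
            super.channelWritabilityChanged(ctx);
        }
    }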

> Besides the negative impacts mentioned in the previous
> email, one potential negative impact relates to
> performance. This would potentially cause a negative
> performance impact since Pulsar shares broker connections.
> Netty's writability is constantly changing when
> dispatching to consumers, especially when the connection
> is shared across many consumers from the same client.
> The channel writability is controlled with
> https://netty.io/4.1/api/io/netty/channel/WriteBufferWaterMark.html
> settings. We don't expose the configuration for these
> settings. By default, the high watermark is 64kB. When more
> than 64kB of output is buffered, the writability will change to false.
> The reason why this could impact performance is that
> there would be less "pipelining" in the Pulsar broker after
> this change is made.
> Less pipelining could mean that a single consumer dispatch
> will toggle the autoread to false, and only after the dispatched
> records have been written to the socket successfully would
> the Netty channel resume processing new input commands.
> This would mean that there couldn't be
> many pending operations in progress simultaneously.
> This would most likely impact performance negatively.
> Pipelining is important for performance in many
> distributed systems.

Once the channel is not writable, none of the requests sent on that channel
can receive a reply anyway, because the responses cannot be written out. The
outcome is the same either way: clients see their replies delayed. To improve
performance, users should instead use more channels, i.e. configure a larger
`connectionsPerBroker` or split the load across separate clients.
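
For example, on the client side something like the following (a sketch; the
service URL is a placeholder) spreads consumers and producers over several
TCP connections to the same broker, so one slow or unwritable channel affects
fewer of them:

    import org.apache.pulsar.client.api.PulsarClient;

    public class ClientConnectionsExample {
        public static void main(String[] args) throws Exception {
            PulsarClient client = PulsarClient.builder()
                    .serviceUrl("pulsar://broker.example.com:6650")
                    // The default is 1 connection per broker; raising it lets
                    // requests and dispatch traffic use multiple channels.
                    .connectionsPerBroker(8)
                    .build();
            // ... create producers/consumers as usual ...
            client.close();
        }
    }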

> By the way, the solution in PR 24423 uses an incorrect way for
> backpressuring based on outboundBuffer's totalPendingWriteBytes.
> Instead of having a new setting "connectionMaxPendingWriteBytes",
> the correct way would be to expose and configure WriteBufferWaterMark
> high and low settings for the child channel
> (ChannelOption.WRITE_BUFFER_WATER_MARK) and rely on
> channelWritabilityChanged and use ServerCnxThrottleTracker's
> incrementThrottleCount/decrementThrottleCount there.

Increasing those settings would only increase TCP packet loss and
retransmissions, which is of no benefit to the system. It merely shifts the
memory pressure to the system kernel, which implements the TCP protocol.
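
For reference, a minimal Netty sketch of the watermark-based approach quoted
above would look roughly like the following. The watermark values are
illustrative, and the ServerCnxThrottleTracker calls are only indicated in
comments because the exact integration point in the broker is not shown here.

    import io.netty.bootstrap.ServerBootstrap;
    import io.netty.channel.ChannelHandlerContext;
    import io.netty.channel.ChannelInboundHandlerAdapter;
    import io.netty.channel.ChannelOption;
    import io.netty.channel.WriteBufferWaterMark;

    public class WritabilityBasedThrottling {

        // Much higher than Netty's 32 KiB / 64 KiB defaults, as suggested above.
        static void configure(ServerBootstrap bootstrap) {
            bootstrap.childOption(ChannelOption.WRITE_BUFFER_WATER_MARK,
                    new WriteBufferWaterMark(4 * 1024 * 1024, 8 * 1024 * 1024));
        }

        static class WritabilityHandler extends ChannelInboundHandlerAdapter {
            @Override
            public void channelWritabilityChanged(ChannelHandlerContext ctx) throws Exception {
                if (ctx.channel().isWritable()) {
                    // Dropped below the low watermark: resume this connection,
                    // e.g. throttleTracker.decrementThrottleCount();
                } else {
                    // Exceeded the high watermark: throttle this connection,
                    // e.g. throttleTracker.incrementThrottleCount();
                }
                super.channelWritabilityChanged(ctx);
            }
        }
    }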



On Tue, Jul 8, 2025 at 2:07 PM Yubiao Feng <yubiao.f...@streamnative.io>
wrote:

> Hi all
>
> I want to start a discussion, which relates to PR #24423: Handling
> Overloaded Netty Channels in Apache Pulsar
>
> Problem Statement
> We've encountered a critical issue in our Apache Pulsar clusters where
> brokers experience Out-Of-Memory (OOM) errors and continuous restarts under
> specific load patterns. This occurs when Netty channel write buffers become
> full, leading to a buildup of unacknowledged responses in the broker's
> memory.
>
> Background
> Our clusters are configured with numerous namespaces, each containing
> approximately 8,000 to 10,000 topics. Our consumer applications are quite
> large, with each consumer using a regular expression (regex) pattern to
> subscribe to all topics within a namespace.
>
> The problem manifests particularly during consumer application restarts.
> When a consumer restarts, it issues a getTopicsOfNamespace request. Due to
> the sheer number of topics, the response size is extremely large. This
> massive response overwhelms the socket output buffer, causing it to fill up
> rapidly. Consequently, the broker's responses get backlogged in memory,
> eventually leading to the broker's OOM and subsequent restart loop.
>
> Why "Returning an Error" Is Not a Solution
> A common approach to handling overload is to simply return an error when
> the broker cannot process a request. However, in this specific scenario,
> this solution is ineffective. If a consumer application fails to start due
> to an error, it triggers a user pod restart, which then leads to the same
> getTopicsOfNamespace request being reissued, resulting in a continuous loop
> of errors and restarts. This creates an unrecoverable state for the
> consumer application and puts immense pressure on the brokers.
>
> Proposed Solution and Justification
> We believe the solution proposed in
> https://github.com/apache/pulsar/pull/24423 is highly suitable for
> addressing this issue. The core mechanism introduced in this PR – pausing
> acceptance of new requests when a channel cannot handle more output – is
> exceptionally reasonable and addresses the root cause of the memory
> pressure.
>
> This approach prevents the broker from accepting new requests when its
> write buffers are full, effectively backpressuring the client and
> preventing the memory buildup that leads to OOMs. Furthermore, we
> anticipate that this mechanism will not significantly increase future
> maintenance costs, as it elegantly handles overload scenarios at a
> fundamental network layer.
>
> I invite the community to discuss this solution and its potential benefits
> for the overall stability and resilience of Apache Pulsar.
>
> Thanks
> Yubiao Feng
>
