If you use a broadcast to construct a producer with a set of options, the
producer will simply behave according to that configuration - it has nothing
to do with Spark, save that the foreachPartition is constructing it via the
broadcast.
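For context, the throughput-relevant knobs live entirely in the producer's own config, regardless of how Spark hands it out. A sketch of the kind of properties one might broadcast (the values here are purely illustrative, not recommendations):

```properties
# Illustrative Kafka producer settings that govern async send behavior.
# Tune these for your cluster; the numbers below are examples only.
bootstrap.servers=broker1:9092
acks=all
linger.ms=50
batch.size=65536
buffer.memory=67108864
max.in.flight.requests.per.connection=5
max.block.ms=60000
```

These are honored by the producer itself no matter which Spark task created it.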

A strategy I’ve used in the past is to
* increase the producer's memory pool for asynchronous processing
* make multiple broadcast producers and pick one at random per send, to
balance the asynchronous sending across more I/O thread pools
* implement back pressure via an adapter class that captures errors
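The last two points can be sketched as follows. This is a minimal stand-alone illustration, not a real Kafka API: `Sender` is a stand-in for `KafkaProducer.send(record, callback)`, and `BackPressureSender` is the hypothetical adapter class - a semaphore caps the number of in-flight async sends, a random pick spreads load across a pool of producers, and errors are counted.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

public class ProducerPoolSketch {
    /** Stand-in for KafkaProducer.send(record, callback); illustrative only. */
    interface Sender {
        void send(String record, Runnable onComplete);
    }

    /** Back-pressure adapter: bounds in-flight sends and captures errors. */
    static class BackPressureSender {
        private final Sender[] pool;      // e.g. several broadcast producers
        private final Semaphore inFlight; // caps outstanding async sends
        final AtomicInteger errors = new AtomicInteger();

        BackPressureSender(Sender[] pool, int maxInFlight) {
            this.pool = pool;
            this.inFlight = new Semaphore(maxInFlight);
        }

        void send(String record) throws InterruptedException {
            inFlight.acquire(); // blocks the caller when too many are pending
            // Random pick spreads load across the producers' I/O threads.
            Sender s = pool[ThreadLocalRandom.current().nextInt(pool.length)];
            try {
                s.send(record, inFlight::release);
            } catch (RuntimeException e) {
                errors.incrementAndGet();
                inFlight.release();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger delivered = new AtomicInteger();
        // Two trivial in-memory senders standing in for broadcast producers.
        Sender ok = (rec, done) -> { delivered.incrementAndGet(); done.run(); };
        BackPressureSender bp =
            new BackPressureSender(new Sender[] { ok, ok }, 4);
        for (int i = 0; i < 100; i++) {
            bp.send("record-" + i);
        }
        System.out.println("delivered=" + delivered.get()
            + " errors=" + bp.errors.get());
    }
}
```

In real use the callback passed to the producer would release the semaphore from the producer's callback thread, so a slow broker naturally throttles the Spark task calling send().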

These are the same things you would want to consider when writing any high
volume Kafka-based application.



-dan


On Wed, Apr 16, 2025 at 7:17 AM Abhishek Singla <abhisheksingla...@gmail.com>
wrote:

> Yes, producing via kafka-clients using foreachPartition works as expected.
> The Kafka producer is initialised within the call(Iterator<T> t) method.
>
> The issue is with using the Kafka connector with Spark batch. The configs
> are not honored even when they are set in ProducerConfig. This means the
> Kafka record production rate cannot be controlled via the Kafka connector
> in Spark batch. This can lead to lag in the in-sync replicas if they are
> not able to catch up, and eventually to the Kafka server failing writes
> once the in-sync replica count falls below the required minimum. Is there
> any way to solve this using the Kafka connector?
>
> Regards,
> Abhishek Singla
>
