Thanks for driving this initiative.

I think that it is better to not add a configuration flag for this feature.
Because:
- it is always a good idea to not create a batch message
- everybody will have to turn this feature explicitly on, otherwise
there is no benefit
- it is very  hard to explain why you should not enable this feature
and we will have to tell to everybody to turn it on
- BatchMessageIdImpl is a implementation detail, that is not part of
the public API, applications must never rely on internal Impl classes

If you really would like to have a way to turn off this behaviour I
would at most use a system property, but this is something that we
never did in the Pulsar API.

I know that it will be a pain to fix internal Pulsar tests about
batching, because we won't be guaranteed anymore to create batch
messages.


Enrico

Il giorno ven 15 lug 2022 alle ore 11:21 Anon Hxy
<anonhx...@gmail.com> ha scritto:
>
> Hi Pulsar community:
>
> I open a pip to discuss "No batching if only one message in batch"
>
> Proposal Link: https://github.com/apache/pulsar/issues/16619
>
> ---
>
> ## Motivation
>
> The original discussion mail :
> https://lists.apache.org/thread/5svpl5qp3bfoztf5fvtojh51zbklcht2
>
> linked issue: https://github.com/apache/pulsar/issues/16547
>
> Introduce the ability for producers to publish a non-batched message if
> there is only one message in the batch.
>
> It is useful to save the `SingleMessageMetadata` space in entry  and reduce
> workload of consumers to deserialize the `SingleMessageMetadata`,
>  especially  when sometimes there is an amount of batched messages with
> only one real message.
>
> ## API Changes
>
> When this feature is applied, the returned type of `MessageId` may not be
> `BatchMessageIdImpl`,  even if we have set the `enableBatching` as true. It
> is because the producer will publish a single message  as a non-batched
> message.  Also, the consumer will deserialize the entry as a non-batched
> message,  which will receive a message with normal `MessageIdImpl` but not
> `BatchMessageIdImpl`.
>
> So this may cause  `((BatchMessageIdImpl) messageId)` throw
> `ClassCastException`.  we need to add a switch for the producer to enable
> or disable this feature
>
> ```
> ProducerBuilder<T> batchingSingleMessage(boolean batchingSingleMessage);
>  // default value is true
>
> ```
>
> ## Implementation
>
> For `BatchMessageContainerImpl` :
> ```
> public OpSendMsg createOpSendMsg() throws IOException {
>         if (!producer.conf.isBatchingSingleMessage() && messages.size() ==
> 1) {
>              // If only one message,  create OpSendMsg as non-batched
> publish.
>         }
>
>        // ....
> }
> ```
>
> For `BatchMessageKeyBasedContainer`,  there is no need to change, because
> it uses `BatchMessageContainerImpl` to create `OpSendMsg`
>
>
> ## Reject Alternatives
>
> - Always return `BatchMessageIdImpl` when `enableBatching` is set as true,
> even if publish single message  with a non-batched message.
> Rejection reason: Consumer have to deserialize to check if there is
> `SingleMessageMetadata` from the payload

Reply via email to