Hi Enrico

> it is always a good idea to not create a batch message
I agree this. So maybe I need a little work to fix the internal tests.:)

Thanks
Xiaoyu Hou

Enrico Olivelli <eolive...@gmail.com> 于2022年7月15日周五 17:29写道:

> Thanks for driving this initiative.
>
> I think that it is better to not add a configuration flag for this feature.
> Because:
> - it is always a good idea to not create a batch message
> - everybody will have to turn this feature explicitly on, otherwise
> there is no benefit
> - it is very  hard to explain why you should not enable this feature
> and we will have to tell to everybody to turn it on
> - BatchMessageIdImpl is a implementation detail, that is not part of
> the public API, applications must never rely on internal Impl classes
>
> If you really would like to have a way to turn off this behaviour I
> would at most use a system property, but this is something that we
> never did in the Pulsar API.
>
> I know that it will be a pain to fix internal Pulsar tests about
> batching, because we won't be guaranteed anymore to create batch
> messages.
>
>
> Enrico
>
> Il giorno ven 15 lug 2022 alle ore 11:21 Anon Hxy
> <anonhx...@gmail.com> ha scritto:
> >
> > Hi Pulsar community:
> >
> > I open a pip to discuss "No batching if only one message in batch"
> >
> > Proposal Link: https://github.com/apache/pulsar/issues/16619
> >
> > ---
> >
> > ## Motivation
> >
> > The original discussion mail :
> > https://lists.apache.org/thread/5svpl5qp3bfoztf5fvtojh51zbklcht2
> >
> > linked issue: https://github.com/apache/pulsar/issues/16547
> >
> > Introduce the ability for producers to publish a non-batched message if
> > there is only one message in the batch.
> >
> > It is useful to save the `SingleMessageMetadata` space in entry  and
> reduce
> > workload of consumers to deserialize the `SingleMessageMetadata`,
> >  especially  when sometimes there is an amount of batched messages with
> > only one real message.
> >
> > ## API Changes
> >
> > When this feature is applied, the returned type of `MessageId` may not be
> > `BatchMessageIdImpl`,  even if we have set the `enableBatching` as true.
> It
> > is because the producer will publish a single message  as a non-batched
> > message.  Also, the consumer will deserialize the entry as a non-batched
> > message,  which will receive a message with normal `MessageIdImpl` but
> not
> > `BatchMessageIdImpl`.
> >
> > So this may cause  `((BatchMessageIdImpl) messageId)` throw
> > `ClassCastException`.  we need to add a switch for the producer to enable
> > or disable this feature
> >
> > ```
> > ProducerBuilder<T> batchingSingleMessage(boolean batchingSingleMessage);
> >  // default value is true
> >
> > ```
> >
> > ## Implementation
> >
> > For `BatchMessageContainerImpl` :
> > ```
> > public OpSendMsg createOpSendMsg() throws IOException {
> >         if (!producer.conf.isBatchingSingleMessage() && messages.size()
> ==
> > 1) {
> >              // If only one message,  create OpSendMsg as non-batched
> > publish.
> >         }
> >
> >        // ....
> > }
> > ```
> >
> > For `BatchMessageKeyBasedContainer`,  there is no need to change, because
> > it uses `BatchMessageContainerImpl` to create `OpSendMsg`
> >
> >
> > ## Reject Alternatives
> >
> > - Always return `BatchMessageIdImpl` when `enableBatching` is set as
> true,
> > even if publish single message  with a non-batched message.
> > Rejection reason: Consumer have to deserialize to check if there is
> > `SingleMessageMetadata` from the payload
>

Reply via email to