Hi Enrico > it is always a good idea to not create a batch message I agree this. So maybe I need a little work to fix the internal tests.:)
Thanks Xiaoyu Hou Enrico Olivelli <eolive...@gmail.com> 于2022年7月15日周五 17:29写道: > Thanks for driving this initiative. > > I think that it is better to not add a configuration flag for this feature. > Because: > - it is always a good idea to not create a batch message > - everybody will have to turn this feature explicitly on, otherwise > there is no benefit > - it is very hard to explain why you should not enable this feature > and we will have to tell to everybody to turn it on > - BatchMessageIdImpl is a implementation detail, that is not part of > the public API, applications must never rely on internal Impl classes > > If you really would like to have a way to turn off this behaviour I > would at most use a system property, but this is something that we > never did in the Pulsar API. > > I know that it will be a pain to fix internal Pulsar tests about > batching, because we won't be guaranteed anymore to create batch > messages. > > > Enrico > > Il giorno ven 15 lug 2022 alle ore 11:21 Anon Hxy > <anonhx...@gmail.com> ha scritto: > > > > Hi Pulsar community: > > > > I open a pip to discuss "No batching if only one message in batch" > > > > Proposal Link: https://github.com/apache/pulsar/issues/16619 > > > > --- > > > > ## Motivation > > > > The original discussion mail : > > https://lists.apache.org/thread/5svpl5qp3bfoztf5fvtojh51zbklcht2 > > > > linked issue: https://github.com/apache/pulsar/issues/16547 > > > > Introduce the ability for producers to publish a non-batched message if > > there is only one message in the batch. > > > > It is useful to save the `SingleMessageMetadata` space in entry and > reduce > > workload of consumers to deserialize the `SingleMessageMetadata`, > > especially when sometimes there is an amount of batched messages with > > only one real message. > > > > ## API Changes > > > > When this feature is applied, the returned type of `MessageId` may not be > > `BatchMessageIdImpl`, even if we have set the `enableBatching` as true. > It > > is because the producer will publish a single message as a non-batched > > message. Also, the consumer will deserialize the entry as a non-batched > > message, which will receive a message with normal `MessageIdImpl` but > not > > `BatchMessageIdImpl`. > > > > So this may cause `((BatchMessageIdImpl) messageId)` throw > > `ClassCastException`. we need to add a switch for the producer to enable > > or disable this feature > > > > ``` > > ProducerBuilder<T> batchingSingleMessage(boolean batchingSingleMessage); > > // default value is true > > > > ``` > > > > ## Implementation > > > > For `BatchMessageContainerImpl` : > > ``` > > public OpSendMsg createOpSendMsg() throws IOException { > > if (!producer.conf.isBatchingSingleMessage() && messages.size() > == > > 1) { > > // If only one message, create OpSendMsg as non-batched > > publish. > > } > > > > // .... > > } > > ``` > > > > For `BatchMessageKeyBasedContainer`, there is no need to change, because > > it uses `BatchMessageContainerImpl` to create `OpSendMsg` > > > > > > ## Reject Alternatives > > > > - Always return `BatchMessageIdImpl` when `enableBatching` is set as > true, > > even if publish single message with a non-batched message. > > Rejection reason: Consumer have to deserialize to check if there is > > `SingleMessageMetadata` from the payload >