Hi Pulsar community, I have opened a PIP to discuss "No batching if only one message in batch".
Proposal Link: https://github.com/apache/pulsar/issues/16619

---

## Motivation

The original discussion mail: https://lists.apache.org/thread/5svpl5qp3bfoztf5fvtojh51zbklcht2

Linked issue: https://github.com/apache/pulsar/issues/16547

Introduce the ability for producers to publish a non-batched message when there is only one message in the batch.

This saves the space taken by `SingleMessageMetadata` in the entry and spares consumers the work of deserializing `SingleMessageMetadata`, which matters most when many batches contain only a single message.

## API Changes

When this feature is applied, the returned `MessageId` may not be a `BatchMessageIdImpl`, even if `enableBatching` is set to true, because the producer publishes a single message as a non-batched message. The consumer likewise deserializes the entry as a non-batched message, so it receives a message with a plain `MessageIdImpl` rather than a `BatchMessageIdImpl`. As a result, a cast such as `((BatchMessageIdImpl) messageId)` may throw `ClassCastException`.

We therefore need to add a switch for the producer to enable or disable this feature:

```
ProducerBuilder<T> batchingSingleMessage(boolean batchingSingleMessage);
// default value is true
```

## Implementation

For `BatchMessageContainerImpl`:

```
public OpSendMsg createOpSendMsg() throws IOException {
    if (!producer.conf.isBatchingSingleMessage() && messages.size() == 1) {
        // If only one message, create OpSendMsg as non-batched publish.
    }
    // ....
}
```

For `BatchMessageKeyBasedContainer`, no change is needed, because it uses `BatchMessageContainerImpl` to create `OpSendMsg`.

## Rejected Alternatives

- Always return `BatchMessageIdImpl` when `enableBatching` is set to true, even if a single message is published as a non-batched message.
  Rejection reason: the consumer would have to deserialize the payload to check whether it contains `SingleMessageMetadata`.
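As an illustration of the `ClassCastException` risk described under API Changes, application code can test the runtime type before casting instead of assuming every `MessageId` is batched. The sketch below is only a rough model: the nested `MessageId`, `MessageIdImpl` and `BatchMessageIdImpl` types are simplified stand-ins written for this example (not the real Pulsar client classes), and `describe` is a hypothetical helper.

```java
public class MessageIdHandling {
    // Stand-in types: they only mimic the shape of the Pulsar client
    // classes named in this PIP and are NOT the real implementations.
    interface MessageId {}

    static class MessageIdImpl implements MessageId {
        final long ledgerId;
        final long entryId;

        MessageIdImpl(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }
    }

    static class BatchMessageIdImpl extends MessageIdImpl {
        final int batchIndex;

        BatchMessageIdImpl(long ledgerId, long entryId, int batchIndex) {
            super(ledgerId, entryId);
            this.batchIndex = batchIndex;
        }
    }

    // Hypothetical helper: checks the runtime type before casting, so a
    // plain MessageIdImpl never triggers a ClassCastException.
    static String describe(MessageId id) {
        // Check the subclass first, because a batch id is also a MessageIdImpl.
        if (id instanceof BatchMessageIdImpl) {
            BatchMessageIdImpl batchId = (BatchMessageIdImpl) id;
            return "batched " + batchId.ledgerId + ":" + batchId.entryId
                    + ":" + batchId.batchIndex;
        }
        if (id instanceof MessageIdImpl) {
            MessageIdImpl plainId = (MessageIdImpl) id;
            return "non-batched " + plainId.ledgerId + ":" + plainId.entryId;
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // A single message published as a non-batched entry.
        System.out.println(describe(new MessageIdImpl(1, 2)));
        // A message that was part of a real batch.
        System.out.println(describe(new BatchMessageIdImpl(1, 3, 0)));
    }
}
```

Code written this way behaves the same whether the switch is on or off, since it never assumes that `enableBatching` implies a batched id.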