Hi Rajan,

I think we can make the change on the client side only, probably just in the sendMessages part of the producer.
When the producer sends a chunked message, it can store the message-id of the first chunk temporarily until all chunks have been sent, and return that message-id to the user once all chunks have been sent successfully. I think this solution may be simpler to implement: it will not require any changes on the server side, nor will it change the seek part of the consumer.

What do you think?

Thanks,
Zike

> On Sep 22, 2021, at 3:58 PM, Rajan Dhabalia <rdhaba...@apache.org> wrote:
>
> Hi,
>
> Though chunked messages are sequential for a specific producer, it's not
> guaranteed that they will be contiguous when the broker receives them and
> writes them to a ledger. So, it will be a little tricky to find out the
> first message-id of any chunked-message at any given time unless broker
> tags chunked messages while persisting them at server side. But such
> tagging and updating message metadata might not be straightforward and may
> not scale when the topic has a large number of producers and chunked
> messages are being published from all different producers.
>
> Consumer::seek(messageId) also doesn't work if the user provides an
> incorrect messageId. So, if the user points to an incomplete chunk then
> it's expected that the consumer can't receive the same chunked-message but
> then the consumer should be able to receive and consume the next complete
> chunk and deliver it to the application. This behavior should not require
> any server side change but should expect the client to consume the next
> correct chunked message after the given messageId.
>
> Thanks,
> Rajan
>
> On Tue, Sep 21, 2021 at 8:56 PM Zike Yang <zky...@streamnative.io.invalid>
> wrote:
>
>> Hi Pulsar Community,
>>
>> Currently, when we send chunked messages, the producer returns the
>> message-id of the last chunk. This can cause some problems. For example,
>> when we use this message-id to seek, it will cause the consumer to consume
>> from the position of the last chunk, and the consumer will mistakenly think
>> that the previous chunks are lost and choose to skip the current message.
>> If we use the inclusive seek, the consumer may skip the first message,
>> which brings the wrong behavior.
>>
>> Here is the simple code used to demonstrate the problem.
>>
>> ```
>> var msgId = producer.send(...); // eg. return 0:1:-1
>>
>> var otherMsg = producer.send(...); // return 0:2:-1
>>
>> consumer.seek(msgId); // inclusive seek
>>
>> // it may skip the first message and return like 0:2:-1
>> var receiveMsgId = consumer.receive().getMessageId();
>>
>> Assert.assertEquals(msgId, receiveMsgId); // fail
>> ```
>>
>> To fix this, I think we could return the message ID of the first chunk when
>> sending chunked messages. I would like to know if this solution will bring
>> other problems. Any ideas on this?
>>
>> Thanks
>> --
>> Zike Yang
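
To make the client-side idea at the top of this reply a bit more concrete, here is a rough sketch. It does not use the Pulsar client at all: ChunkedSendSketch, FakeBroker, MsgId and sendChunked are hypothetical names made up for illustration, and the fake broker just hands out sequential entry ids on ledger 0. It also assumes chunks are sent and acknowledged one at a time, which the real producer may not do.

```java
import java.util.Arrays;

// Hypothetical stand-ins for illustration only; none of these are Pulsar classes.
final class ChunkedSendSketch {

    /** Simplified message id (ledgerId:entryId). */
    static final class MsgId {
        final long ledgerId;
        final long entryId;

        MsgId(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }

        @Override
        public String toString() {
            return ledgerId + ":" + entryId;
        }
    }

    /** Fake broker: persists each chunk as one entry with a sequential id. */
    static final class FakeBroker {
        private long nextEntryId = 0;

        MsgId persist(byte[] chunk) {
            return new MsgId(0, nextEntryId++);
        }
    }

    /**
     * Splits the payload into chunks, sends them in order, and returns the
     * id of the FIRST chunk once the last chunk has been acknowledged.
     */
    static MsgId sendChunked(FakeBroker broker, byte[] payload, int chunkSize) {
        if (payload.length == 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("need a non-empty payload and a positive chunk size");
        }
        MsgId firstChunkId = null; // held by the producer until all chunks are sent
        for (int offset = 0; offset < payload.length; offset += chunkSize) {
            int end = Math.min(payload.length, offset + chunkSize);
            MsgId ack = broker.persist(Arrays.copyOfRange(payload, offset, end));
            if (firstChunkId == null) {
                firstChunkId = ack; // remember only the first chunk's id
            }
        }
        return firstChunkId; // returned to the user instead of the last chunk's id
    }
}
```

With this shape, the id handed back to the application points at the start of the chunked message, so an inclusive seek to it would begin at the first chunk rather than the last one. Nothing on the broker side is involved.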