I don't get it - you say msgId is a data structure contained within MessageId implementation, right? I presume msgId is the data structure the client transmit to the server, so that means you are transmitting topic to the server?
On Fri, May 12, 2023 at 7:45 AM Rajan Dhabalia <rdhaba...@apache.org> wrote: > Thank you for sharing your knowledge about the PIP which should be created > before PR and I think everyone in the community knows about it. but you can > check the PR for context which was blocked for sometime and we decided to > create PIP with proto changes. > > This PIP/PR tries to fix the issue where partitioned topic fails while > acking deserialized messageId. topic name will be part of MsgIdData which > is the data-structure used by messageID to store msgID context along with > partition, batching, and other metadata. topic name will be attached only > when the user tries to serialize and deserialize the messageId which will > be purely client side implementation and in other cases it will not be > transmitted to server. Also, partitioned topic's abstract concept for user > and messageID must be also remain abstract for users and users must not > know about different implementation of messageId and our goal is to > maintain that abstraction without telling user about MessageIdImpl or > TopicMessageIdImpl. > > Thanks, > Rajan > > > On Thu, May 11, 2023 at 7:29 AM Asaf Mesika <asaf.mes...@gmail.com> wrote: > > > Hi Rajan, > > > > A few comments on the PIP as I couldn't understand it fully as some > pieces > > of information is missing. > > > > First, I would like to remind about the rules, that exists in the > beginning > > of the PIP template: > > > > <!-- > > RULES > > * Never place a link to an external site like Google Doc. The proposal > > should be in this issue entirely. > > * Use a spelling and grammar checker tools if available for you (there > are > > plenty of free ones) > > > > PROPOSAL HEALTH CHECK > > I can read the design document and understand the problem statement and > > what you plan to change *without* resorting to a couple of hours of code > > reading just to start having a high level understanding of the change. > > --> > > > > > > In this specific case > > 1. I would include explanation and detail the data structures fields of > > objects you mentioned, such as: MessageIdImpl and MessageIdData. > > 2. I would not put a PR as the design section, so I need to read code to > > understand what the exact solution details. > > > > You wrote: > > > > > Pulsar api provides MessageId > > > < > > > https://github.com/apache/pulsar/blob/master/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/MessageId.java > > > > interface > > > which is generally used by producer and consumer applications to manage > > > topic offset. > > > > > > I think it's used to allow consumers to acknowledge (can be per message) > so > > offset if wrong terminology here. > > For producers, not sure exactly its usage. Maybe if they need to refer to > > this message later when reading by Reader interface. > > I would correct this section. > > > > However, right now Pulsar doesn't support correct deserialization of > > > multi-topic or partitioned-topic because of that 1acknowledge` API call > > > fails for those topics with below error > > > > > > You're saying that the acknowledgement API method signature receives > > MessageId, but do not receive TopicMessageId? > > > > I have a few questions on that: > > > > 1. The acknowledgement API is part of Pulsar binary protocol. Is your > plan > > to alter that protocol so it will also include the topic field as part of > > the message ID? > > > > 2. I think your PIP needs to explain the following items which are > missing > > as context: > > - There are two implementation for MessageId interface, one for topic and > > one for partitioned topic. > > - The problem is that the serialization/desrialization method is used > > mainly for translating the ID into the binary protocol, which only > requires > > the ID (ledger ID, entry ID). > > - The reason for that is that once you created a consumer, it has a topic > > attached to it. Transferring the topic for the ack is redundant. > > > > All of this needs to be in the background. > > > > I have several ideas on solving that, which IMO should mainly be in the > > client level, but I must get answers to the questions above before I can > > continue. > > > > Last note > > You have basically placed a link to a pull request as the design solution > > (high-level/detailed design). > > The whole idea of the design is that you describe the solution without > > resorting to code. > > IMO you should amend the design, state the goal shortly, and have a high > > level design section which contains 1-2 short paragraphs describing > exactly > > your solution. > > > > Thanks, > > > > Asaf > > > > > > > > On Tue, May 9, 2023 at 3:24 PM Yunze Xu <x...@apache.org> wrote: > > > > > I'm talking about whether to add a new separate API. I'm concerned > > > about whether existing applications would be affected, no matter if > > > the existing implementation has the limitation. If yes, we should > > > document it in the PIP so that users can know that. > > > > > > > it's a new optional field which would not break the compatibility > > > > > > And yes, I just confirmed it with simple demos in my local env. So I'm > > > +1 to this proposal. > > > > > > Thanks, > > > Yunze > > > > > > On Tue, May 9, 2023 at 3:05 PM Rajan Dhabalia <rdhaba...@apache.org> > > > wrote: > > > > > > > > Weill there are multiple things: it's a new optional field which > would > > > not > > > > break the compatibility , also current messaegId serialization and > > > > deserialization anyway only impact multi-topic consumer which is > > already > > > > broken or has the limitation and, adding a new separate API for > > > partitioned > > > > topic is not only not acceptable but creates too much confusion for > > users > > > > to use separate ack APIs for non-partition and partition topics and > > that > > > > also breaks partitioned topic abstraction which we would like to > avoid. > > > > > > > > Thanks, > > > > Rajan > > > > > > > > On Mon, May 8, 2023 at 11:27 PM Yunze Xu <x...@apache.org> wrote: > > > > > > > > > It seems that `TopicMessageIdImpl#toByteArray` now could serialize > > the > > > > > optional topic field to the bytes. I didn't test it now but I have > a > > > > > concern about whether it would bring a breaking change. > > > > > > > > > > Assuming there are two applications (let's say A and B) based on an > > > > > older Pulsar client, A writes serialized bytes into a file, B reads > > > > > bytes from the file and parses it to a MessageId. If A upgraded its > > > > > Pulsar client to the latest while B did not, what would happen? > Could > > > > > B still get the correct MessageId or the bytes would not be able to > > > > > parsed? > > > > > > > > > > P.S. it's better to add the API changes and potential breaking > > changes > > > > > in the proposal. > > > > > > > > > > Thanks, > > > > > Yunze > > > > > > > > > > On Tue, May 9, 2023 at 1:59 PM Rajan Dhabalia < > rdhaba...@apache.org> > > > > > wrote: > > > > > > > > > > > > Hi, > > > > > > > > > > > > Pulsar api provides MessageId interface which is generally used > by > > > > > producer > > > > > > and consumer applications to manage topic offset. Sometimes, > these > > > > > > applications would like to serialize and deserialize messageIds, > > > > > > specifically consumer app which would like to persist messageId > and > > > ack > > > > > > with those messageIds by deserializing them. However, right now > > > Pulsar > > > > > > doesn't support correct deserialization of multi-topic or > > > > > partitioned-topic > > > > > > because of that 1acknowledge` API call fails for those topics > with > > > below > > > > > > error: > > > > > > "Only TopicMessageId is allowed to acknowledge for a multi-topics > > > > > consumer" > > > > > > > > > > > > MessageIdImpl stores id metadata into MessageIdData which doesn't > > > contain > > > > > > context about topic name to find out which topic belongs to this > > > > > MessageID. > > > > > > Therefore, we need to add topic-name into MessageIdData and allow > > > > > > multi-topic/partitioned topics to deserialize messages correctly > > so, > > > > > > consumer app can perform as expected. > > > > > > > > > > > > Please visit PIP for any suggestions: > > > > > > https://github.com/apache/pulsar/issues/20221 > > > > > > > > > > > > This PIP is created with PR: > > > https://github.com/apache/pulsar/pull/19944 > > > > > > > > > > > > Thanks, > > > > > > Rajan > > > > > > > > > > >