Hi Penghui

On 2022/06/29 04:07:35 PengHui Li wrote:
> Hi Haiting,
>
> Thanks for the explanation. I'm clear for now.
>
> Pulsar functions can also do such things by connecting data from one topic
> to another topic.
> But the difference is that this proposal only copies the data to the cache of
> another topic, and the data not
> in the cache is also available by reading from ledgers.
>
> And this approach also has the following benefits compared with replicating data to
> multiple "real" topics.
>
> - reuse the topic metadata
> - the same message ID, which makes troubleshooting easy
>
> Just one question
>
> >>>>>>>
> ```
> message CommandSend {
>     // ...
>     // message id for shadow topic
>     optional MessageIdData shadow_message_id = 9;
> }
> ```
>
> Can we get the message ID from the replicated data to avoid introducing a
> new command?
> Or use a marker message to avoid direct broker-to-broker protobuf command
> interaction.
>

Sorry for not writing it clearly. CommandSend is not a new command. It's exactly the main command the producer uses to send messages to the broker. The only change is adding a new field to it.
The whole command proto would look like this:

```
message CommandSend {
    required uint64 producer_id = 1;
    required uint64 sequence_id = 2;
    optional int32 num_messages = 3 [default = 1];
    optional uint64 txnid_least_bits = 4 [default = 0];
    optional uint64 txnid_most_bits = 5 [default = 0];
    /// Add highest sequence id to support batch message with external sequence id
    optional uint64 highest_sequence_id = 6 [default = 0];
    optional bool is_chunk = 7 [default = false];

    // Specify if the message being published is a Pulsar marker or not
    optional bool marker = 8 [default = false];

    // message id for shadow topic
    optional MessageIdData shadow_message_id = 9;
}
```

So there won't be any direct broker-to-broker protobuf command interactions.

Thanks,
Haiting

> Thanks,
> Penghui
>
> On Wed, Jun 29, 2022 at 10:31 AM Haiting Jiang <jianghait...@apache.org>
> wrote:
>
> > Hi Penghui & Asaf:
> >
> > Please allow me to provide some more details about **metadata**
> > synchronization
> > between source topic and shadow topic.
> >
> > 1. When the shadow topic initializes, it will read from metadata store path
> > "/managed-ledgers/{source_topic_ledger_name}", which contains all the
> > managed ledger info. We don't
> > need to read the ledger information from the source topic broker.
> >
> > 2. When the shadow topic receives a new message from the replicator, if the ledger
> > id of the message
> > is the same as the last ledger, it just updates the LAC. If not, it will
> > update the ledger list from metadata,
> > and then open the new ledger handle and update the LAC.
> >
> > As for the copy itself and adding the shadow message id in CommandSend, it mostly
> > serves the purpose
> > of filling the EntryCache.
> >
> > Thanks,
> > Haiting
> >
> > On 2022/06/23 02:08:46 PengHui Li wrote:
> > > > One question comes to mind here: Why not simply read the ledger information
> > > > from original topic, without copy?
> > >
> > > I think this is a good idea.
> > >
> > > Penghui
> > > On Jun 22, 2022, 23:57 +0800, dev@pulsar.apache.org, wrote:
> > > >
> > > > One question comes to mind here: Why not simply read the ledger information
> > > > from original topic, without copy?
> > > >
> >
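
For readers following the thread, the two metadata synchronization steps quoted above can be sketched roughly as below. This is only an illustrative simulation, not the actual broker code: the class `ShadowTopicState`, its fields, and the plain dict standing in for the metadata store are all hypothetical names chosen for the example. The `(ledger_id, entry_id)` pair plays the role of the `shadow_message_id` carried in CommandSend.

```python
from dataclasses import dataclass, field


@dataclass
class ShadowTopicState:
    """Hypothetical in-memory model of a shadow topic (not the real broker class)."""
    metadata_store: dict            # stand-in for the metadata store: path -> ledger id list
    source_ledger_name: str         # managed ledger name of the source topic
    ledgers: list = field(default_factory=list)
    last_ledger_id: int = -1
    lac: int = -1                   # last add confirmed entry id of the last ledger
    entry_cache: dict = field(default_factory=dict)

    def _metadata_path(self) -> str:
        return f"/managed-ledgers/{self.source_ledger_name}"

    def initialize(self) -> None:
        # Step 1: read the ledger list from the metadata store,
        # without asking the source topic's broker.
        self.ledgers = list(self.metadata_store[self._metadata_path()])
        if self.ledgers:
            self.last_ledger_id = self.ledgers[-1]

    def on_replicated_message(self, ledger_id: int, entry_id: int, payload: bytes) -> None:
        # Step 2: a message from the replicator carries the source message id.
        if ledger_id != self.last_ledger_id:
            # Different ledger: refresh the ledger list from metadata,
            # then switch to ("open") the new ledger.
            self.ledgers = list(self.metadata_store[self._metadata_path()])
            if ledger_id not in self.ledgers:
                raise ValueError(f"ledger {ledger_id} not found in metadata")
            self.last_ledger_id = ledger_id
        # In both cases the LAC advances to the replicated entry.
        self.lac = entry_id
        # The copied payload only serves to fill the entry cache.
        self.entry_cache[(ledger_id, entry_id)] = payload


# Example usage with a made-up ledger name.
path = "/managed-ledgers/tenant/ns/persistent/source"
store = {path: [1, 2]}
shadow = ShadowTopicState(store, "tenant/ns/persistent/source")
shadow.initialize()
shadow.on_replicated_message(2, 5, b"a")   # same ledger: only the LAC moves
store[path].append(3)                      # source topic rolls over to ledger 3
shadow.on_replicated_message(3, 0, b"b")   # new ledger: ledger list is refreshed
```

The sketch keeps the entry cache as a plain dict keyed by message id, which is enough to show why the copy needs the source message id but says nothing about eviction or how real ledger handles are opened.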