One more thing: when this happens, is the client authorized to write to both topicA and topicB?

Ismael

On Sat, Feb 8, 2025 at 6:19 PM Ismael Juma <m...@ismaeljuma.com> wrote:

> +1 for filing a JIRA ticket and for checking if the issue reproduces with
> newer versions. It would also be helpful to know when it started happening
> and if there were any changes before this was first observed.
>
> Ismael
>
> On Fri, Jan 17, 2025 at 12:14 PM Matthias J. Sax <mj...@apache.org> wrote:
>
>> Thanks for reaching out. Can you file a Jira ticket to report this as a
>> bug? Hard to say if this could be a broker or producer bug...
>>
>> Did you try newer clients / broker version? Can you reproduce this with
>> latest 3.9.0 release?
>>
>> -Matthias
>>
>> On 1/17/25 11:31 AM, Donny Nadolny wrote:
>> > We're experiencing messages very occasionally ending up on a different
>> > topic than what they were published to. That is, we publish a message to
>> > topicA and consumers of topicB see it and fail to parse it because the
>> > message contents are meant for topicA. This has happened for various
>> > topics. Searching existing bug reports hasn't shown anything, has anyone
>> > seen anything like this?
>> >
>> > We've begun adding a header with the intended topic (which we get just by
>> > reading the topic from the record that we're about to pass to the OSS
>> > client) right before we call producer.send, this header shows the correct
>> > topic (which also matches up with the message contents itself). Similarly
>> > we're able to use this header and compare it to the actual topic to prevent
>> > consuming these misrouted messages, but it causes work for us to replay
>> > these messages to the right topic and is also pretty concerning.
>> >
>> > Some details:
>> > - This happens rarely: approximately once per 10 trillion messages
>> > - It often happens in a small burst, eg 2 or 3 messages very close in time
>> >   (but from different hosts) will be misrouted
>> > - It often but not always coincides with some sort of event in the cluster
>> >   (a broker restarting or being replaced, network issues causing errors,
>> >   etc). Also these cluster events happen quite often with no misrouted
>> >   messages
>> > - We run many clusters, it has happened for several of them
>> > - There is no pattern between intended and actual topic, other than the
>> >   intended topic tends to be higher volume ones (but I'd attribute that to
>> >   there being more messages published -> more occurrences affecting it rather
>> >   than it being more likely per-message)
>> > - It only occurs with clients that are using a non-zero linger
>> > - Once it happened with two sequential messages, both were intended for
>> >   topicA but both ended up on topicB, published by the same host (presumably
>> >   within the same linger batch)
>> > - Most of our clients are 3.2.3 and it has only affected those, our
>> >   brokers are 3.2.3 as well (but I suspect a client rather than broker
>> >   problem because of it never happening with clients that use 0 linger)
>> >
>> > Thanks,
>> > Donny
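
For anyone who wants to apply the same safeguard, the header check described in the original report (stamp the intended topic right before producer.send, then compare it with the topic the record was actually read from) might look roughly like the sketch below. It is a minimal Python model and not the poster's code: the Record class, the header name "intended-topic", and the function names are all assumptions made for illustration, and no real Kafka client is involved. Headers are modeled as (key, bytes) pairs.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical header key; the thread does not say which name is used.
INTENDED_TOPIC_HEADER = "intended-topic"


@dataclass
class Record:
    """Stand-in for a client record: topic, payload, and (key, bytes) headers."""
    topic: str
    value: bytes
    headers: List[Tuple[str, bytes]] = field(default_factory=list)


def stamp_intended_topic(record: Record) -> Record:
    """Producer side: copy the record's own topic into a header before sending."""
    record.headers.append((INTENDED_TOPIC_HEADER, record.topic.encode("utf-8")))
    return record


def intended_topic(record: Record) -> Optional[str]:
    """Return the stamped topic, or None if the header is absent."""
    for key, raw in record.headers:
        if key == INTENDED_TOPIC_HEADER:
            return raw.decode("utf-8")
    return None


def is_misrouted(record: Record, consumed_from: str) -> bool:
    """Consumer side: True when the stamped topic differs from the topic read."""
    stamped = intended_topic(record)
    return stamped is not None and stamped != consumed_from


# A record meant for topicA that shows up on topicB is flagged.
rec = stamp_intended_topic(Record(topic="topicA", value=b"payload"))
print(is_misrouted(rec, "topicA"))
print(is_misrouted(rec, "topicB"))
```

Records without the header are treated as not misrouted here, so messages from producers that have not yet adopted the header still get consumed; a stricter consumer could reject them instead.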