One more thing: when this happens, is the client authorized to write to
both topicA and topicB?

Ismael

On Sat, Feb 8, 2025 at 6:19 PM Ismael Juma <m...@ismaeljuma.com> wrote:

> +1 for filing a JIRA ticket and for checking if the issue reproduces with
> newer versions. It would also be helpful to know when it started happening
> and if there were any changes before this was first observed.
>
> Ismael
>
> On Fri, Jan 17, 2025 at 12:14 PM Matthias J. Sax <mj...@apache.org> wrote:
>
>> Thanks for reaching out. Can you file a Jira ticket to report this as a
>> bug? Hard to say if this could be a broker or producer bug...
>>
>> Did you try newer clients / broker version? Can you reproduce this with
>> latest 3.9.0 release?
>>
>> -Matthias
>>
>> On 1/17/25 11:31 AM, Donny Nadolny wrote:
>> > We're experiencing messages very occasionally ending up on a different
>> > topic than what they were published to. That is, we publish a message to
>> > topicA and consumers of topicB see it and fail to parse it because the
>> > message contents are meant for topicA. This has happened for various
>> > topics. Searching existing bug reports hasn't shown anything, has anyone
>> > seen anything like this?
>> >
>> > We've begun adding a header with the intended topic (which we get just by
>> > reading the topic from the record that we're about to pass to the OSS
>> > client) right before we call producer.send. This header shows the correct
>> > topic (which also matches up with the message contents itself). Similarly,
>> > we're able to use this header and compare it to the actual topic to prevent
>> > consuming these misrouted messages, but it causes work for us to replay
>> > these messages to the right topic and is also pretty concerning.
>> >
>> > Some details:
>> >   - This happens rarely: approximately once per 10 trillion messages
>> >   - It often happens in a small burst, eg 2 or 3 messages very close in
>> > time (but from different hosts) will be misrouted
>> >   - It often but not always coincides with some sort of event in the
>> > cluster (a broker restarting or being replaced, network issues causing
>> > errors, etc). Also these cluster events happen quite often with no
>> > misrouted messages
>> >   - We run many clusters, it has happened for several of them
>> >   - There is no pattern between intended and actual topic, other than the
>> > intended topic tends to be higher volume ones (but I'd attribute that to
>> > there being more messages published -> more occurrences affecting it
>> > rather than it being more likely per-message)
>> >   - It only occurs with clients that are using a non-zero linger
>> >   - Once it happened with two sequential messages, both were intended for
>> > topicA but both ended up on topicB, published by the same host (presumably
>> > within the same linger batch)
>> >   - Most of our clients are 3.2.3 and it has only affected those, our
>> > brokers are 3.2.3 as well (but I suspect a client rather than broker
>> > problem because of it never happening with clients that use 0 linger)
>> >
>> > Thanks,
>> > Donny
>> >
>>
>>
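
For reference, the intended-topic header check described in the original
report could be sketched roughly as below. This is a library-independent
model in Python, not the Kafka client API: `Record`, `stamp_intended_topic`,
`is_misrouted`, and the header name `intended-topic` are all hypothetical
names chosen for illustration, and the real header name used in the report is
not given in the thread.

```python
from dataclasses import dataclass, field

# Hypothetical header name; the thread does not say which name is used.
INTENDED_TOPIC_HEADER = "intended-topic"


@dataclass
class Record:
    """Minimal stand-in for a producer/consumer record (assumption)."""
    topic: str
    value: bytes
    headers: dict = field(default_factory=dict)


def stamp_intended_topic(record: Record) -> Record:
    """Producer side: copy the record's own topic into a header
    immediately before handing the record to the client's send call."""
    record.headers[INTENDED_TOPIC_HEADER] = record.topic.encode("utf-8")
    return record


def is_misrouted(actual_topic: str, record: Record) -> bool:
    """Consumer side: compare the header with the topic the record
    was actually read from. Records without the header are treated
    as not misrouted, because there is nothing to compare against."""
    raw = record.headers.get(INTENDED_TOPIC_HEADER)
    if raw is None:
        return False
    return raw.decode("utf-8") != actual_topic
```

A consumer of topicB would call `is_misrouted("topicB", record)` and skip
(and later replay) any record for which it returns true. Note that this only
detects the problem after the fact; it does not explain or prevent it.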
