Re: KRaft Observer node unable to recover after (re-)bootstrapping to Follower node

2025-06-03 Thread José Armando García Sancio
Hi Justin,
Thanks for creating the issue. I left a comment for you. Emailing you here in case you didn't see it.
On Fri, May 30, 2025 at 7:02 PM Justin Chen wrote:
>
> Here is the JIRA issue: https://issues.apache.org/jira/browse/KAFKA-19354
>
> Thank you,
> Justin C
>
> On Fri, May 30, 2025 at

Re: KRaft Observer node unable to recover after (re-)bootstrapping to Follower node

2025-05-30 Thread Justin Chen
Here is the JIRA issue: https://issues.apache.org/jira/browse/KAFKA-19354
Thank you,
Justin C
On Fri, May 30, 2025 at 11:47 AM José Armando García Sancio wrote:
> Thanks for the bug report Justin. Looks like a bug to me. Please file
> a Jira with these details and share it in this thread.
>
> -

Re: KRaft Observer node unable to recover after (re-)bootstrapping to Follower node

2025-05-30 Thread José Armando García Sancio
Thanks for the bug report Justin. Looks like a bug to me. Please file a Jira with these details and share it in this thread. -- -José

Re: KRaft Observer node unable to recover after (re-)bootstrapping to Follower node

2025-05-30 Thread Justin Chen
Hello, Thank you for looking into this so quickly! Unfortunately, we didn't have DEBUG logs enabled on the observer nodes at the time, as this was a production environment. However, I was able to reproduce the behaviour in a sandbox cluster with the following setup (though the root cause may have diff

Re: KRaft Observer node unable to recover after (re-)bootstrapping to Follower node

2025-05-29 Thread Luke Chen
It's awesome you already reproduced the issue, Alyssa! But @Justin, if possible, could you still share the logs and the quorum state store file on the observer like Alyssa requested?
Thank you.
Luke
On Fri, May 30, 2025 at 9:40 AM Alyssa Huang wrote:
> Wanted to correct my wording and resend t

Re: KRaft Observer node unable to recover after (re-)bootstrapping to Follower node

2025-05-29 Thread Alyssa Huang
Wanted to correct my wording and resend the original text of my email because I got a bounce back -
> Thanks Justin,
>
> It's hard to say from the current details if it's simply a network issue
> (e.g. broker never receives the response with the leaderId), bug (broker
> does receive response with

Re: KRaft Observer node unable to recover after (re-)bootstrapping to Follower node

2025-05-29 Thread Alyssa Huang
Thanks Justin, It's hard to say from the current details if it's simply a network issue (e.g. broker never receives the response with the leaderId), bug (broker does receive response with leaderId, never transitions to follower), or something else. Could you potentially send over logs from the mis

Re: KRaft Observer node unable to recover after (re-)bootstrapping to Follower node

2025-05-29 Thread Justin Chen
To correct my original description: We have observed that KRaft observers (process.roles=broker) that typically send FETCH requests to the quorum Leader node can enter a state of indefinitely **sending FETCH requests to a voter (follower) node**, which we believe to be after a re-bootstrap due to

KRaft Observer node unable to recover after (re-)bootstrapping to Follower node

2025-05-29 Thread Justin Chen
Hello, In our Kafka 4.0 cluster (dynamic quorum, 5 controller nodes), we have observed that KRaft observers (process.roles=broker) that typically send FETCH requests to the quorum Leader node can enter a state of indefinitely re-bootstrapping to a voter (follower) node, likely after some sort of re