Commented on the jira.
Thanks,
Jun
On Sat, Jun 29, 2013 at 6:21 AM, Jason Rosenberg wrote:
> I added this scenario to KAFKA-955.
>
> I'm thinking that this scenario could be a problem for ack=0 in general
> (even without controlled shutdown). If we do an "uncontrolled" shutdown,
> it seems that some topics won't ever know there could have been a leader
> change.
I added this scenario to KAFKA-955.
I'm thinking that this scenario could be a problem for ack=0 in general
(even without controlled shutdown). If we do an "uncontrolled" shutdown,
it seems that some topics won't ever know there could have been a leader
change. Would it make sense to force a metadata refresh?
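For concreteness, a minimal sketch of the existing knob for this, assuming the
0.8 Java producer API (kafka.javaapi.producer.Producer); the broker hosts and
topic below are made up. Lowering topic.metadata.refresh.interval.ms makes the
producer re-fetch metadata more often, but as I understand it the refresh is
piggybacked on sends, so it narrows rather than closes the ack=0 window:

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class MetadataRefreshSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical broker hosts.
        props.put("metadata.broker.list", "serverA:9092,serverB:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // ack=0: fire-and-forget, no error feedback from the broker.
        props.put("request.required.acks", "0");
        // Poll metadata every minute instead of the 10-minute default.
        // The refresh only happens after a send, so an idle producer
        // still won't notice a leader change.
        props.put("topic.metadata.refresh.interval.ms", "60000");

        Producer<String, String> producer =
            new Producer<String, String>(new ProducerConfig(props));
        producer.send(new KeyedMessage<String, String>("topicX", "hello"));
        producer.close();
    }
}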
Also, looking back at my logs, I'm wondering if a producer will reuse the
same socket to send data to the same broker, for multiple topics (I'm
guessing yes). In which case, it looks like I'm seeing this scenario:
1. producer1 is happily sending messages for topicX and topicY to serverA
(serverA
Filed https://issues.apache.org/jira/browse/KAFKA-955
On Mon, Jun 24, 2013 at 10:14 PM, Jason Rosenberg wrote:
> Jun,
>
> To be clear, this whole discussion was started, because I am clearly
> seeing "failed due to Leader not local" on the last broker restarted,
> after all the controlled shutting down has completed and all brokers
> restarted.
Jun,
To be clear, this whole discussion was started, because I am clearly seeing
"failed due to Leader not local" on the last broker restarted, after all
the controlled shutting down has completed and all brokers restarted.
This leads me to believe that a client made a metadata request and found
That should be fine since the old socket in the producer will no longer be
usable after a broker is restarted.
Thanks,
Jun
On Mon, Jun 24, 2013 at 9:50 PM, Jason Rosenberg wrote:
> What about a non-controlled shutdown, and a restart, but the producer never
> attempts to send anything during the time the broker was down?
What about a non-controlled shutdown, and a restart, but the producer never
attempts to send anything during the time the broker was down? That could
have caused a leader change, but without the producer knowing to refresh
its metadata, no?
On Mon, Jun 24, 2013 at 9:05 PM, Jun Rao wrote:
> Other than controlled shutdown, the only other case that can cause the
> leader to change when the underlying broker is alive is when the broker
> expires its ZK session (likely due to GC), which should be rare.
Other than controlled shutdown, the only other case that can cause the
leader to change when the underlying broker is alive is when the broker
expires its ZK session (likely due to GC), which should be rare. That being
said, forwarding in the broker may not be a bad idea. Could you file a jira
to track this?
Yeah,
I see that with ack=0, the producer will be in a bad state anytime the
leader for its partition has changed, while the broker that it thinks is
the leader is still up. So this is a problem in general, not only for
controlled shutdown, but even for the case where you've restarted a server
(
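For comparison, a hedged sketch of the same producer with ack=1 (same assumed
0.8 Java API, made-up broker hosts): the stale leader now returns an error
instead of silently dropping the message, and the producer refreshes its
metadata and retries:

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class AckOneSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical broker hosts.
        props.put("metadata.broker.list", "serverA:9092,serverB:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // ack=1: wait for the leader's response, so errors surface.
        props.put("request.required.acks", "1");
        // On failure the producer refreshes metadata and retries.
        props.put("message.send.max.retries", "3");
        props.put("retry.backoff.ms", "100");

        Producer<String, String> producer =
            new Producer<String, String>(new ProducerConfig(props));
        producer.send(new KeyedMessage<String, String>("topicX", "key", "value"));
        producer.close();
    }
}

The cost is that each request now waits for the leader's response; async mode
(producer.type=async) can amortize that if latency is a concern.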
I think Jason was suggesting quiescent time as a possibility only if the
broker did request forwarding when it is not the leader.
On Monday, June 24, 2013, Jun Rao wrote:
> Jason,
>
> The quiescence time that you proposed won't work. The reason is that with
> ack=0, the producer starts losing data
Jason,
The quiescence time that you proposed won't work. The reason is that with
ack=0, the producer starts losing data silently from the moment the leader
is moved (by controlled shutdown) until the broker is shut down. So, the
sooner that you can shut down the broker, the better. What we realize
After we implement non-blocking IO for the producer, there may not be much
incentive left to use ack = 0, but this is an interesting idea - not just
for the controlled shutdown case, but also when leadership moves due to
say, a broker's zk session expiring. Will have to think about it a bit more.
Yeah I am using ack = 0, so that makes sense. I'll need to rethink that,
it would seem. It would be nice, wouldn't it, in this case, for the broker
to realize this and just forward the messages to the correct leader. Would
that be possible?
Also, it would be nice to have a second option to the
Jason,
Are you using ack = 0 in the producer? This mode doesn't work well with
controlled shutdown (this is explained in the FAQ in
https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#)
Thanks,
Jun
On Sun, Jun 23, 2013 at 1:45 AM, Jason Rosenberg wrote:
> I'm working on tryi
Hi Sriram,
I don't see any indication at all on the producer that there's a problem.
Only the above logging on the server (and it repeats continually). I
think what may be happening is that the producer for that topic did not
actually try to send a message between the start of the controlled shutdown
Hey Jason,
The producer on failure initiates a metadata request to refresh its state
and should issue subsequent requests to the new leader. The errors that
you see should only happen once per topic partition per producer. Let me
know if this is not what you see. On the producer end you should see
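To make that concrete, a hedged sketch of what the caller sees with the
assumed 0.8 Java producer (property and exception names from that client,
host made up): each failed attempt triggers a metadata refresh before the
retry, and only after the retries are exhausted does the send throw. With
ack=0 none of this surfaces, which would match seeing nothing on the
producer side:

import java.util.Properties;

import kafka.common.FailedToSendMessageException;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class SendFailureSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "serverA:9092"); // hypothetical host
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("request.required.acks", "1");
        props.put("message.send.max.retries", "3");
        props.put("retry.backoff.ms", "100");

        Producer<String, String> producer =
            new Producer<String, String>(new ProducerConfig(props));
        try {
            producer.send(new KeyedMessage<String, String>("topicX", "payload"));
        } catch (FailedToSendMessageException e) {
            // Thrown only after all retries (and metadata refreshes) failed.
            System.err.println("send failed after retries: " + e.getMessage());
        } finally {
            producer.close();
        }
    }
}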