Got it. Thanks a lot Ewen!

Cheers, Steve

On Thu, Sep 3, 2015, 10:06 AM Ewen Cheslack-Postava <e...@confluent.io>
wrote:

> Steve,
>
> I don't think there is a better solution at the moment. This is an easy
> issue to miss in unit testing because generally connections to localhost
> will be rejected immediately if there isn't anything listening on the port.
> If you're running in an environment where this happens normally, then for
> now you'll need to wait for the long timeout.
>
> https://issues.apache.org/jira/browse/KAFKA-2120 may also alleviate the
> problem by at least reducing the amount of time for the request to fail.
> Depending on how adventurous you are, you could try using a version with
> that patch and maybe adjust the setting lower than its default.
>
> -Ewen
>
> On Wed, Sep 2, 2015 at 10:46 AM, Steve Tian <steve.cs.t...@gmail.com>
> wrote:
>
> > Would the Kafka devs kindly give us some advice on this?
> >
> > Cheers, Steve
> >
> > On Tue, Sep 1, 2015, 11:20 PM Steve Tian <steve.cs.t...@gmail.com>
> > wrote:
> >
> > > Thanks, Rahul! In my environment I need reconnect.backoff.ms to be
> > > longer than the OS default TCP timeout so that NetworkClient can give
> > > the second node a try.
> > >
> > > I believe this is related to
> > > https://issues.apache.org/jira/browse/KAFKA-2459 .
> > >
> > > Cheers, Steve
> > >
> > > On Tue, Sep 1, 2015, 5:24 PM Rahul Jain <rahul...@gmail.com> wrote:
> > >
> > >> We did notice something similar. When a broker node (out of 3) went
> > >> down, metadata calls continued to go to the failed node and the
> > >> producer kept failing. We were able to make it work by increasing
> > >> reconnect.backoff.ms to 1 second.
> > >>
> > >> Something similar was discussed earlier:
> > >>
> > >> http://qnalist.com/questions/6002514/new-producer-metadata-update-problem-on-2-node-cluster
> > >>
> > >>
> > >> On Mon, Aug 31, 2015 at 11:00 PM, Steve Tian <steve.cs.t...@gmail.com>
> > >> wrote:
> > >>
> > >> > Hi everyone,
> > >> >
> > >> > Are there any concerns with having a long reconnect.backoff.ms for
> > >> > the new Java Kafka producer (0.8.2.0/0.8.2.1)?
> > >> >
> > >> > Assume we have bootstrap.servers=host1:port1,host2:port2,host3:port3
> > >> > and host1 is *down* from the very beginning. If a newly created Kafka
> > >> > producer chooses host1 as the first node to connect to for a metadata
> > >> > update, then that producer will keep trying host1 *only*, because the
> > >> > default TCP timeout is surely longer than the default value of
> > >> > reconnect.backoff.ms, which is 10 ms.
> > >> >
> > >> > I am thinking of setting reconnect.backoff.ms longer than N * T, where
> > >> > N is the number of nodes in bootstrap.servers and T is the default TCP
> > >> > timeout. Are there any concerns with a long reconnect.backoff.ms like
> > >> > that? Any better solutions?
> > >> >
> > >> > Cheers, Steve
> > >> >
> > >>
> > >
> >
>
>
>
> --
> Thanks,
> Ewen
>
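
For readers following the thread, the workaround discussed above (setting
reconnect.backoff.ms longer than N * T) could be sketched roughly as below.
This is only an illustration under stated assumptions: the class and method
names are hypothetical, the 20-second TCP timeout and 1-second margin are
placeholder values rather than measured OS defaults, and the code only builds
a java.util.Properties object; it does not create a Kafka producer.

```java
import java.util.Properties;

// Hypothetical helper, not part of Kafka: derives reconnect.backoff.ms
// from the number of bootstrap nodes (N) and an assumed TCP timeout (T).
final class ReconnectBackoffSketch {

    // Returns N * T plus a margin, so the backoff is strictly longer than
    // the time it would take to time out once on every bootstrap node.
    static long reconnectBackoffMs(int nodeCount, long tcpTimeoutMs, long marginMs) {
        if (nodeCount <= 0 || tcpTimeoutMs < 0 || marginMs <= 0) {
            throw new IllegalArgumentException("nodeCount and marginMs must be > 0, tcpTimeoutMs >= 0");
        }
        return Math.addExact(Math.multiplyExact((long) nodeCount, tcpTimeoutMs), marginMs);
    }

    // Builds producer properties with the derived backoff. N is taken from
    // the comma-separated bootstrap.servers list.
    static Properties producerProps(String bootstrapServers, long tcpTimeoutMs, long marginMs) {
        int nodeCount = bootstrapServers.split(",").length;
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("reconnect.backoff.ms",
                Long.toString(reconnectBackoffMs(nodeCount, tcpTimeoutMs, marginMs)));
        return props;
    }

    public static void main(String[] args) {
        // Assumed values: T = 20 s, margin = 1 s. Measure T in your own environment.
        Properties props = producerProps("host1:port1,host2:port2,host3:port3", 20_000L, 1_000L);
        System.out.println("reconnect.backoff.ms=" + props.getProperty("reconnect.backoff.ms"));
    }
}
```

The resulting Properties object would then be passed to the producer
constructor along with the usual serializer settings. As noted in the thread,
a backoff this long also delays reconnects to nodes that recover, so it is a
stopgap rather than a fix.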
