+1 to zk bootstrap + close as an option at least

On Tue, Jan 28, 2014 at 10:09 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:

> >> The producer since 0.8 is actually zookeeper free, so this is not new to
> >> this client it is true for the current client as well. Our experience was
> >> that direct zookeeper connections from zillions of producers wasn't a good
> >> idea for a number of reasons.
>
> The problem with several thousand connections to zookeeper is mainly the
> long lived sessions causing overhead on zookeeper.
> This further degrades zookeeper performance causing it to be flaky and
> expire sessions/disconnect clients and so on. That being said,
> I don't see why we can't use zookeeper *just* for the bootstrap on client
> startup and close the connection right after the bootstrap is done.
> IMO, this is more intuitive and convenient as it will allow users to use the
> same "bootstrap config" across producers, consumers and brokers and
> will not cause any performance/operational issues on zookeeper. This is
> assuming that all the zillion clients don't bootstrap at the same time,
> which is rare in practice.
>
> Thanks,
> Neha
>
>
> On Tue, Jan 28, 2014 at 8:02 AM, Mattijs Ugen (DT) <matt...@holmes.nl> wrote:
>
> > Sorry to tune in a bit late, but here goes.
> >
> > > 1. The producer since 0.8 is actually zookeeper free, so this is not
> > > new to this client it is true for the current client as well. Our
> > > experience was that direct zookeeper connections from zillions of
> > > producers wasn't a good idea for a number of reasons. Our intention is
> > > to remove this dependency from the consumer as well. The configuration
> > > in the producer doesn't need the full set of brokers, though, just one
> > > or two machines to bootstrap the state of the cluster from--in other
> > > words it isn't like you need to reconfigure your clients every time you
> > > add some servers. This is exactly how zookeeper works too--if we used
> > > zookeeper you would need to give a list of zk urls in case a particular
> > > zk server was down. Basically either way you need a few statically
> > > configured nodes to go to discover the full state of the cluster. For
> > > people who don't like hard coding hosts you can use a VIP or dns or
> > > something instead.
> >
> > In our configuration, the zookeeper quorum is actually one of the few
> > stable (in the sense of host names / ip addresses) pillars of the
> > complete ecosystem: every distributed service uses zookeeper to
> > coordinate the hosts that make up the service as a whole. Considering
> > that the kafka cluster will save the information needed for this
> > bootstrap to zookeeper anyhow, having clients (either producers or
> > consumers) retrieve this information at first use makes sense to me.
> >
> > We could create a routine that retrieves a list of brokers from zookeeper
> > before initializing a Producer, but that feels more like a workaround
> > for a feature that in my humble opinion could well be part of the kafka
> > client library. That said, I realise that having two options for
> > connection bootstrapping (assuming that hardcoding a list of brokers is
> > here to stay) could be confusing for new users, but bypassing zookeeper
> > for this was rather confusing for me when I first came across it :)
> >
> > So, in short, I'd love it if the option to bootstrap the broker list
> > from zookeeper was there, rather than requiring us to configure additional
> > (moving) virtual hostnames or fixed ip addresses for producers in our
> > cluster setup. I've been baffled a few times by this option not being
> > available for a distributed service that coordinates itself through
> > zookeeper.
> >
> > Just my two cents :)
> >
> > Mattijs
> >
>
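
For anyone who wants the workaround in the meantime, here is a rough
sketch of the "bootstrap from zookeeper, then close" routine discussed
above. It assumes brokers register themselves as JSON under
/brokers/ids/<id> with "host" and "port" fields, as 0.8 brokers do. The
ZkClient interface, the InMemoryZk stand-in and the function name are
hypothetical, made up for illustration; a real zookeeper client library
will have its own API.

```python
import json
from typing import Dict, List, Protocol


class ZkClient(Protocol):
    """Hypothetical minimal interface; a real zookeeper client will differ."""

    def get_children(self, path: str) -> List[str]: ...
    def get(self, path: str) -> bytes: ...
    def close(self) -> None: ...


def bootstrap_broker_list(zk: ZkClient, root: str = "/brokers/ids") -> str:
    """Read every broker registration once, then close the session.

    Returns a comma-separated "host:port" list suitable for a producer's
    static broker list. The session is closed even if a read fails, so no
    long-lived zookeeper connection is left behind.
    """
    try:
        brokers = []
        for broker_id in sorted(zk.get_children(root)):
            info = json.loads(zk.get(f"{root}/{broker_id}"))
            brokers.append(f"{info['host']}:{info['port']}")
        return ",".join(brokers)
    finally:
        zk.close()


class InMemoryZk:
    """Stand-in for a real client so the sketch runs without a quorum."""

    def __init__(self, nodes: Dict[str, bytes]) -> None:
        self.nodes = nodes
        self.closed = False

    def get_children(self, path: str) -> List[str]:
        prefix = path.rstrip("/") + "/"
        names = [p[len(prefix):] for p in self.nodes if p.startswith(prefix)]
        # Keep direct children only.
        return [n for n in names if "/" not in n]

    def get(self, path: str) -> bytes:
        return self.nodes[path]

    def close(self) -> None:
        self.closed = True
```

The returned string is what you would hand to the 0.8 producer as
metadata.broker.list. A broker that disappears between the listing and
the read is not handled here; a real routine would probably skip it and
carry on.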
