Re: why did Kafka choose pull instead of push for a consumer ?

Gerard Klijs Fri, 23 Sep 2016 00:15:30 -0700

I haven't tried it myself, and most likely won't in the near future, but
since Pulsar is also distributed I guess that with a large enough cluster you
will be able to handle any load. Kafka might be better in having more
connectors available, a better at-least-once guarantee, and better
monitoring options. I really don't know, but if latency is really important
Pulsar might be better. They used Kafka before at Yahoo and maybe still do
for some stuff; recent work on https://github.com/yahoo/kafka-manager seems
to suggest so.
Alternatively you could configure a Kafka topic/producer/consumer to limit
latency, and that may also be enough to get a low enough latency. It would
certainly be interesting to compare the two, with the same hardware, and
with high load.
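
As a rough illustration of that kind of tuning, the sketch below lists a few
settings that commonly affect end-to-end latency. The keys are the property
names used by the Kafka Java clients; the values and the to_properties helper
are assumptions for illustration only, not benchmarked recommendations.

```python
# Latency-oriented settings, keyed by Kafka Java client property names.
# The values are illustrative assumptions, not benchmarked recommendations.
producer_config = {
    "linger.ms": 0,  # do not hold records back to fill a batch
    "acks": "1",     # wait for the partition leader only (weaker durability than "all")
}
consumer_config = {
    "fetch.min.bytes": 1,      # broker may answer a fetch as soon as any data exists
    "fetch.max.wait.ms": 100,  # longest the broker holds a fetch when no data exists
}


def to_properties(config):
    """Render a config dict as .properties lines (hypothetical helper)."""
    return "\n".join(f"{key}={value}" for key, value in sorted(config.items()))


print(to_properties(producer_config))
print(to_properties(consumer_config))
```

Lower values here reduce added delay but usually cost throughput, so a
comparison under load would still be needed.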


On Thu, Sep 22, 2016 at 6:01 PM kant kodali <kanth...@gmail.com> wrote:

> @Gerard Thanks for this. It looks good. Are there any benchmarks on this,
> throughput-wise?
>
> On Thu, Sep 22, 2016 7:45 AM, Gerard Klijs gerard.kl...@dizzit.com
> wrote:
>
> We have a simple application producing 1 msg/sec, and did nothing to
> optimise the performance and have about a 10 msec delay between consumer
> and producer. When low latency is important, maybe pulsar is a better fit,
> https://www.datanami.com/2016/09/07/yahoos-new-pulsar-kafka-competitor/ .
>
> On Tue, Sep 20, 2016 at 2:24 PM Michael Freeman <mikfree...@gmail.com>
> wrote:
>
> > Thanks for sharing Radek, great article.
> >
> > Michael
> >
> > > On 17 Sep 2016, at 21:13, Radoslaw Gruchalski <ra...@gruchalski.com>
> > > wrote:
> > >
> > > Please read this article:
> > >
> > > https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
> > >
> > > –
> > > Best regards,
> > > Radek Gruchalski
> > > ra...@gruchalski.com
> > >
> > > On September 17, 2016 at 9:49:43 PM, kant kodali (kanth...@gmail.com)
> > > wrote:
> > >
> > > Still, it should be possible to implement this using reactive streams,
> > > right? Could you please enlighten me on the major differences you see
> > > between a commit log and a message queue? I see them as differing only in
> > > the implementation, not in functionality, so I would be glad to hear your
> > > thoughts.
> > >
> > > On Sat, Sep 17, 2016 12:39 PM, Radoslaw Gruchalski ra...@gruchalski.com
> > > wrote:
> > >
> > > Kafka is not a queue. It’s a distributed commit log.
> > >
> > > –
> > > Best regards,
> > > Radek Gruchalski
> > > ra...@gruchalski.com
> > >
> > > On September 17, 2016 at 9:23:09 PM, kant kodali (kanth...@gmail.com)
> > > wrote:
> > >
> > > Hmm... Looks like Kafka is written in Scala. There is this thing called
> > > reactive streams, where a slow consumer can apply back pressure if it is
> > > consuming slowly. Even with Java this is possible with a library called
> > > RxJava, and these ideas will be incorporated in Java 9 as well.
> > >
> > > I still don't see why they would pick poll just to solve this one problem
> > > while compromising on others. Poll just doesn't sound realtime. I heard
> > > from some people that they would set poll to 100ms. Well, 1) that is a lot
> > > of time. 2) Financial applications require microsecond latency. From what
> > > I understand, Kafka has a very high latency, and here is the article:
> > > http://bravenewgeek.com/dissecting-message-queues/ I usually don't go by
> > > articles, but I ran my own experiments on different queues and my numbers
> > > are very close to this article, so I would say whoever wrote it has done a
> > > good job. 3) Poll does generate unnecessary traffic when the data isn't
> > > available.
> > >
> > > Finally, I'm still not sure why they would pick poll(). Or do they plan on
> > > introducing reactive streams? Thanks, kant
> > >
> > > On Sat, Sep 17, 2016 5:14 AM, Radoslaw Gruchalski ra...@gruchalski.com
> > > wrote:
> > >
> > > I'm only guessing here regarding if this is the reason:
> > >
> > > Pull is much more sensible when a lot of data is pushed through. It allows
> > > consumers consuming at their own pace, slow consumers do not slow the
> > > complete system down.
> > >
> > > --
> > > Best regards,
> > > Rad
> > >
> > > On Sat, Sep 17, 2016 at 11:18 AM +0200, "kant kodali" <kanth...@gmail.com>
> > > wrote:
> > >
> > > Why did Kafka choose pull instead of push for a consumer? Push sounds
> > > more realtime to me than poll. Also, wouldn't poll just keep polling even
> > > when there are no messages in the broker, causing more traffic? Please
> > > enlighten me.

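On the original question of whether a pull consumer keeps generating traffic
when the broker has nothing new: a fetch can be a long poll that waits on the
broker side until data arrives or a timeout passes. The following is a minimal
in-memory sketch of that idea, with each consumer tracking its own offset.
ToyLog and its methods are hypothetical stand-ins, not Kafka's API.

```python
import threading
import time


class ToyLog:
    """In-memory append-only log; a stand-in for one partition, not Kafka itself."""

    def __init__(self):
        self._records = []
        self._cond = threading.Condition()

    def append(self, record):
        with self._cond:
            self._records.append(record)
            self._cond.notify_all()  # wake any fetch that is waiting for data
            return len(self._records) - 1  # offset of the new record

    def fetch(self, offset, max_wait_s):
        """Return records from `offset` on, blocking up to `max_wait_s` if none exist yet."""
        with self._cond:
            self._cond.wait_for(lambda: len(self._records) > offset, timeout=max_wait_s)
            return list(self._records[offset:])


log = ToyLog()
for i in range(5):
    log.append(f"msg-{i}")

# Each consumer tracks its own offset, so a slow reader never holds back a fast one.
fast_offset, slow_offset = 0, 0
fast_batch = log.fetch(fast_offset, max_wait_s=0.1)
fast_offset += len(fast_batch)
slow_batch = log.fetch(slow_offset, max_wait_s=0.1)[:2]  # slow reader takes only two
slow_offset += len(slow_batch)

# A fetch at the end of the log waits instead of returning empty at once,
# and returns as soon as a producer appends.
threading.Timer(0.05, log.append, args=("msg-5",)).start()
start = time.monotonic()
late_batch = log.fetch(fast_offset, max_wait_s=2.0)
waited = time.monotonic() - start
```

In this sketch the waiting fetch is a single outstanding request rather than a
busy loop, and the timeout is only an upper bound on the wait, not a fixed
polling interval.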