This is very interesting; this is what I see as well. I wish someone could
explain why the behavior differs from what is described here:
http://engineering.gnip.com/kafka-async-producer/


On Wed, Jan 1, 2014 at 2:39 PM, Gerrit Jansen van Vuuren <
gerrit...@gmail.com> wrote:

> I don't know the code well enough to comment on that (maybe someone else on
> the user list can), but from some heavy profiling I only ever see one thread
> per producer instance. It doesn't matter how many brokers or topics you have:
> the number of threads is always 1 per producer. If you create 2 producers,
> you get 2 threads, and so on.
>
> On Wed, Jan 1, 2014 at 1:27 PM, yosi botzer <yosi.bot...@gmail.com> wrote:
>
> > But shouldn't I see a separate thread per broker (I am using the async
> > mode)? Why do I get better performance sending to a topic that has fewer
> > partitions?
> >
> >
> > On Wed, Jan 1, 2014 at 2:22 PM, Gerrit Jansen van Vuuren <
> > gerrit...@gmail.com> wrote:
> >
> > > The producer is heavily synchronized (i.e. all the code in the send
> > > method is encapsulated in one huge synchronized block).
> > > Try creating multiple producers and round-robin sends over them.
> > >
> > > e.g.
> > >
> > > p = producers[ n++ % producers.length ]
> > >
> > > p.send msg
> > > This will give you one thread per producer instance.
> > >
> > > I'm working on an async multi-threaded producer for Kafka, but it's
> > > nowhere near complete yet.
> > > https://github.com/gerritjvv/kafka-fast
> > >
> > >
> > > Regards,
> > >  Gerrit
> > >
> > >
> > > On Wed, Jan 1, 2014 at 1:17 PM, yosi botzer <yosi.bot...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I am using Kafka 0.8. I have 3 machines, each running a Kafka broker.
> > > >
> > > > I am using the async mode of my producer. I expected to see 3 different
> > > > threads with names starting with ProducerSendThread- (according to this
> > > > article: http://engineering.gnip.com/kafka-async-producer/).
> > > >
> > > > However, I can see only one thread with the name *ProducerSendThread-*.
> > > >
> > > > This is my producer configuration:
> > > >
> > > > server=1
> > > > topic=dat7
> > > > metadata.broker.list=ec2-54-245-111-112.us-west-2.compute.amazonaws.com:9092,ec2-54-245-111-69.us-west-2.compute.amazonaws.com:9092,ec2-54-218-183-14.us-west-2.compute.amazonaws.com:9092
> > > > serializer.class=kafka.serializer.DefaultEncoder
> > > > request.required.acks=1
> > > > compression.codec=snappy
> > > > producer.type=async
> > > > queue.buffering.max.ms=2000
> > > > queue.buffering.max.messages=1000
> > > > batch.num.messages=500
> > > >
> > > >
> > > > *What am I missing here?*
> > > >
> > > >
> > > > BTW, I have also seen very strange behavior regarding my producer
> > > > performance (which may or may not be related to the issue above).
> > > >
> > > > When I defined a topic with 1 partition, I got much better throughput
> > > > than with a topic with 3 partitions. A producer sending messages to a
> > > > topic with 3 partitions had much better throughput than one sending to
> > > > a topic with 12 partitions.
> > > >
> > > > I would expect the best performance from the topic with 12 partitions,
> > > > since I have 3 machines, each running a broker with 4 disks (the broker
> > > > is configured to use all 4 disks).
> > > >
> > > > *Is there any logical explanation for this behavior?*
> > > >
> > >
> >
>
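
For anyone finding this thread later: the round-robin suggestion quoted above
(p = producers[ n++ % producers.length ]) could be sketched in Java roughly as
follows. This is only a sketch under assumptions, not code from Kafka or from
kafka-fast. RoundRobinSender is a hypothetical helper name, and each Consumer
stands in for one producer instance; in real use it would presumably wrap that
producer's own send call. The one change from the quoted pseudocode is
Math.floorMod, so the index stays valid if the counter ever overflows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical helper (not part of Kafka): spreads sends over several
// independent senders in round-robin order.
final class RoundRobinSender<T> {
    // Each Consumer is assumed to wrap one producer instance's send call.
    private final List<Consumer<T>> senders;
    // Shared counter so callers on different threads pick different senders.
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSender(List<Consumer<T>> senders) {
        if (senders.isEmpty()) {
            throw new IllegalArgumentException("need at least one sender");
        }
        // Defensive copy so later changes to the caller's list have no effect.
        this.senders = new ArrayList<>(senders);
    }

    void send(T message) {
        // floorMod keeps the index non-negative after the int counter wraps.
        int i = Math.floorMod(next.getAndIncrement(), senders.size());
        senders.get(i).accept(message);
    }
}
```

With three senders, four consecutive messages go to senders 0, 1, 2 and 0, in
that order. Whether this actually improves throughput would need to be
measured on the setup in question.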
