I am using this code (from "org.apache.kafka" % "kafka_2.10" % "0.8.2.0"), but I have no idea whether it is the old producer or the new one:
import kafka.producer.Producer
import kafka.producer.ProducerConfig

val prodConfig : ProducerConfig = new ProducerConfig(properties)
val producer : Producer[Integer,String] = new Producer[Integer,String](prodConfig)

How can I know which producer I am using? And what is the behavior of the new producer?

Thanks,
Sébastien

2015-06-03 20:04 GMT+02:00 Jiangjie Qin <j...@linkedin.com.invalid>:

> Are you using the new producer or the old producer?
> The old producer has 10 min sticky partition behavior while the new
> producer does not.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On 6/2/15, 11:58 PM, "Sebastien Falquier" <sebastien.falqu...@teads.tv>
> wrote:
>
> >Hi Jason,
> >
> >The default partitioner does not do the job since my producers do not
> >have smooth traffic. What I mean is that they can deliver lots of messages
> >during 10 minutes and fewer during the next 10 minutes, that is to say the
> >first partition will have stacked most of the messages of the last 20
> >minutes.
> >
> >By the way, I don't understand your point about breaking a batch into 2
> >separate partitions. With that code, I jump to a new partition on message
> >201, 401, 601, ... with batch size = 200, so where is my mistake?
> >
> >Thanks for your help,
> >Sébastien
> >
> >2015-06-02 16:55 GMT+02:00 Jason Rosenberg <j...@squareup.com>:
> >
> >> Hi Sebastien,
> >>
> >> You might just try using the default partitioner (which is random). It
> >> works by choosing a random partition each time it re-polls the metadata
> >> for the topic. By default, this happens every 10 minutes for each topic
> >> you produce to (so it evenly distributes load at a granularity of 10
> >> minutes). This is based on 'topic.metadata.refresh.interval.ms'.
> >>
> >> I suspect your code is causing double requests for each batch, if your
> >> partitioning is actually breaking up your batches into 2 separate
> >> partitions. Could it be an off-by-one error with your modulo calculation?
> >> Perhaps you need to use '% 0' instead of '% 1' there?
> >>
> >> Jason
> >>
> >> On Tue, Jun 2, 2015 at 3:35 AM, Sebastien Falquier <
> >> sebastien.falqu...@teads.tv> wrote:
> >>
> >> > Hi guys,
> >> >
> >> > I am new to Kafka and I am facing a problem I am not able to sort out.
> >> >
> >> > To smooth traffic over all my brokers' partitions, I have coded a
> >> > custom Partitioner for my producers, using a simple round-robin
> >> > algorithm that jumps from one partition to another on every batch of
> >> > messages (corresponding to the batch.num.messages value). It looks
> >> > like this:
> >> > https://gist.github.com/sfalquier/4c0c7f36dd96d642b416
> >> >
> >> > With that fix, every partition is used equally, but the number of
> >> > requests from the producers to the brokers has been multiplied by 2.
> >> > I do not understand this, since all producers are async with
> >> > batch.num.messages=200 and the number of messages processed is still
> >> > the same as before. Why do producers need more requests to do the job?
> >> > As internal traffic is a bit critical on our platform, I would really
> >> > like to reduce the producers' request volume if possible.
> >> >
> >> > Any idea? Any suggestion?
> >> >
> >> > Regards,
> >> > Sébastien
> >> >
> >>
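
One way to tell the two producers apart is the package name: kafka.producer.Producer (imported in the code at the top) is the old Scala producer, while the new producer is org.apache.kafka.clients.producer.KafkaProducer from the kafka-clients jar. A minimal sketch of the same send path on the new producer is below; the broker address, topic name, key/value strings and the use of String keys are placeholders, and note that the new producer uses different config keys than the old ProducerConfig:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// Placeholder settings; the new producer expects e.g. bootstrap.servers
// rather than the old metadata.broker.list, and explicit serializer classes.
val props = new Properties()
props.put("bootstrap.servers", "broker1:9092")
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

val newProducer = new KafkaProducer[String, String](props)
// send() batches records per partition internally; there is no 10-minute
// sticky-partition behavior as with the old producer's default partitioner.
newProducer.send(new ProducerRecord[String, String]("my-topic", "key", "value"))
newProducer.close()

With no key (or a null key), the new producer's default partitioner spreads records across partitions per record rather than sticking to one partition for 10 minutes, which is the behavior difference Becket points out above.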
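
For reference, a "jump to the next partition every batch" partitioner for the old producer, as discussed in the quoted thread, could look roughly like the sketch below. This is a hypothetical illustration, not the code from the linked gist; the class name, the hard-coded batch size and the counter logic are made up:

import java.util.concurrent.atomic.AtomicLong
import kafka.producer.Partitioner
import kafka.utils.VerifiableProperties

// Hypothetical sketch of a per-batch round-robin partitioner for the old
// (Scala) producer; custom partitioners are constructed with VerifiableProperties.
class BatchRoundRobinPartitioner(props: VerifiableProperties) extends Partitioner {
  private val batchSize = 200L            // would normally mirror batch.num.messages
  private val counter = new AtomicLong(0L)

  // Called once per message; returns the same partition for batchSize
  // consecutive messages, then moves on to the next partition.
  override def partition(key: Any, numPartitions: Int): Int =
    ((counter.getAndIncrement() / batchSize) % numPartitions).toInt
}

Jason's suspicion in the thread is that if a flush of queued messages straddles the point where such a partitioner switches partitions, the batch ends up split across two partitions and can turn into two produce requests instead of one.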