Thanks Jiangjie. The new producer is exactly what I was looking for and it works perfectly in production. It deserves better coverage in the documentation on the official site.
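For reference, here is a minimal sketch of how the new producer can be used from Scala for non-keyed messages (the broker list, topic name and serializers below are placeholders, not our actual production settings):

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092,broker2:9092") // placeholder broker list
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    // No key on the record, so the default partitioner spreads records across
    // partitions in a round-robin-like way, as Jiangjie described.
    producer.send(new ProducerRecord[String, String]("my-topic", "some message"))
    producer.close()

(The new producer classes ship in the kafka-clients jar, which the kafka_2.10 0.8.2.0 dependency should already pull in.)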
Jason: you're right, I missed a point with my AtomicIntegers.

Regards,
Sebastien

2015-06-04 20:02 GMT+02:00 Jason Rosenberg <j...@squareup.com>:

> Sebastien, I think you may have an off-by-one error (e.g. the batch should
> be 0-199, not 1-200). Thus you are sending 2 batches each time (one for 0,
> another for 1-199).
>
> Jason
>
> On Thu, Jun 4, 2015 at 1:33 PM, Jiangjie Qin <j...@linkedin.com.invalid>
> wrote:
>
>> From the code you pasted, that is the old producer.
>> The new producer class is org.apache.kafka.clients.producer.KafkaProducer.
>>
>> The new producer does not have the sticky partition behavior. The default
>> partitioner uses a round-robin-like way to send non-keyed messages to
>> partitions.
>>
>> Jiangjie (Becket) Qin
>>
>> On 6/3/15, 11:35 PM, "Sebastien Falquier" <sebastien.falqu...@teads.tv>
>> wrote:
>>
>>> I am using this code (from "org.apache.kafka" % "kafka_2.10" % "0.8.2.0"),
>>> no idea if it is the old producer or the new one...
>>>
>>> import kafka.producer.Producer
>>> import kafka.producer.ProducerConfig
>>> val prodConfig: ProducerConfig = new ProducerConfig(properties)
>>> val producer: Producer[Integer, String] = new Producer[Integer, String](prodConfig)
>>>
>>> How can I know which producer I am using? And what is the behavior of the
>>> new producer?
>>>
>>> Thanks,
>>> Sébastien
>>>
>>> 2015-06-03 20:04 GMT+02:00 Jiangjie Qin <j...@linkedin.com.invalid>:
>>>
>>>> Are you using the new producer or the old producer?
>>>> The old producer has the 10-minute sticky partition behavior while the
>>>> new producer does not.
>>>>
>>>> Thanks,
>>>>
>>>> Jiangjie (Becket) Qin
>>>>
>>>> On 6/2/15, 11:58 PM, "Sebastien Falquier" <sebastien.falqu...@teads.tv>
>>>> wrote:
>>>>
>>>>> Hi Jason,
>>>>>
>>>>> The default partitioner does not do the job, since my producers don't
>>>>> have smooth traffic. What I mean is that they can deliver lots of
>>>>> messages during 10 minutes and fewer during the next 10 minutes, which
>>>>> is to say the first partition will have stacked most of the messages
>>>>> of the last 20 minutes.
>>>>>
>>>>> By the way, I don't understand your point about breaking a batch into
>>>>> 2 separate partitions. With that code, I jump to a new partition on
>>>>> message 201, 401, 601, ... with batch size = 200, so where is my mistake?
>>>>>
>>>>> Thanks for your help,
>>>>> Sébastien
>>>>>
>>>>> 2015-06-02 16:55 GMT+02:00 Jason Rosenberg <j...@squareup.com>:
>>>>>
>>>>>> Hi Sebastien,
>>>>>>
>>>>>> You might just try using the default partitioner (which is random). It
>>>>>> works by choosing a random partition each time it re-polls the metadata
>>>>>> for the topic. By default, this happens every 10 minutes for each topic
>>>>>> you produce to (so it evenly distributes load at a granularity of 10
>>>>>> minutes). This is based on 'topic.metadata.refresh.interval.ms'.
>>>>>>
>>>>>> I suspect your code is causing double requests for each batch, if your
>>>>>> partitioning is actually breaking up your batches into 2 separate
>>>>>> partitions. Could it be an off-by-one error in your modulo calculation?
>>>>>> Perhaps you need to check for '== 0' instead of '== 1' there?
>>>>>>
>>>>>> Jason
>>>>>>
>>>>>> On Tue, Jun 2, 2015 at 3:35 AM, Sebastien Falquier <
>>>>>> sebastien.falqu...@teads.tv> wrote:
>>>>>>
>>>>>>> Hi guys,
>>>>>>>
>>>>>>> I am new to Kafka and I am facing a problem I am not able to sort out.
>>>>>>>
>>>>>>> To smooth traffic over all my brokers' partitions, I have coded a
>>>>>>> custom Partitioner for my producers, using a simple round-robin
>>>>>>> algorithm that jumps from one partition to another on every batch of
>>>>>>> messages (corresponding to the batch.num.messages value). It looks
>>>>>>> like this:
>>>>>>> https://gist.github.com/sfalquier/4c0c7f36dd96d642b416
>>>>>>>
>>>>>>> With that fix, all partitions are used equally, but the number of
>>>>>>> requests from the producers to the brokers has been multiplied by 2.
>>>>>>> I do not understand, since all producers are async with
>>>>>>> batch.num.messages=200 and the amount of messages processed is still
>>>>>>> the same as before. Why do producers need more requests to do the
>>>>>>> job? As internal traffic is a bit critical on our platform, I would
>>>>>>> really like to reduce the producers' request volume if possible.
>>>>>>>
>>>>>>> Any idea? Any suggestion?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Sébastien
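For later readers: the gist above isn't reproduced in the thread, but here is a purely hypothetical sketch of a counter-based round-robin partitioner for the old (Scala) producer that rotates once per full batch, counting from 0 as Jason suggests (the class name, hard-coded batch size and counter handling are assumptions, not the gist's actual code):

    import java.util.concurrent.atomic.AtomicLong
    import kafka.producer.Partitioner
    import kafka.utils.VerifiableProperties

    // Hypothetical sketch only -- not the code from the gist.
    // Rotates to the next partition every batchSize messages, counting from 0,
    // so each run of 200 messages (0-199, 200-399, ...) maps to exactly one
    // partition and lines up with the async producer's internal batches.
    class BatchRoundRobinPartitioner(props: VerifiableProperties) extends Partitioner {
      private val counter = new AtomicLong(0)
      private val batchSize = 200L // should mirror batch.num.messages

      def partition(key: Any, numPartitions: Int): Int =
        ((counter.getAndIncrement() / batchSize) % numPartitions).toInt
    }

Dividing the running counter by the batch size (rather than switching partitions when a per-message modulo hits 1) keeps each group of 200 consecutive messages on a single partition, so each drained batch should go out in one produce request instead of two.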