Hi Jay,

The fundamental problem is that the batch size is already configured and the producers are running in production with that configuration (the earlier values were just samples). How do we increase the number of partitions for a topic when batch.size times the partition count would exceed the configured buffer limit? Yes, had we planned for a smaller batch size up front we could do this, but we cannot change it while the producers are already running. Have you faced this problem at LinkedIn or anywhere else?
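To make the numbers concrete, here is a rough back-of-the-envelope sketch of the memory math as I understand it (illustrative only, not our production code; the real producer also keeps extra in-flight buffers, so the true limit is somewhat lower):

```java
// Rough sketch of the buffer math for our current production settings.
public class BufferMath {
    public static void main(String[] args) {
        long bufferMemory = 64L * 1024 * 1024; // buffer.memory = 64MiB
        long batchSize    = 1L  * 1024 * 1024; // batch.size    = 1MiB

        long batchesThatFit = bufferMemory / batchSize; // ~64 one-per-partition batches
        System.out.println("Batches that fit in the pool: " + batchesThatFit);

        // 32 partitions  -> ~32MiB in use, works fine
        // 128 partitions -> needs ~128MiB, so with block.on.buffer.full=false
        //                   the producer throws BufferExhaustedException
    }
}
```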
Thanks,

Bhavesh

On Tue, Nov 4, 2014 at 4:25 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

> Hey Bhavesh,
>
> No, there isn't such a setting. But what I am saying is that I don't think
> you really need that feature. I think instead you can use a 32k batch size
> with your 64M memory limit. This should mean you can have up to 2048
> batches in flight. Assuming one batch in flight and one being added to at
> any given time, this should work well for up to ~1000 partitions, rather
> than trying to do anything dynamic. So assuming each producer sends to
> just one topic, you would be fine as long as that topic had fewer than
> 1000 partitions. If you wanted to add more you would need to add memory
> on the producers.
>
> -Jay
>
> On Tue, Nov 4, 2014 at 4:04 PM, Bhavesh Mistry <mistry.p.bhav...@gmail.com>
> wrote:
>
> > Hi Jay,
> >
> > I agree with and understood what you mentioned in your previous email.
> > But when you have 5000+ producers running in the cloud (I am sure
> > LinkedIn has many more and needs to increase partitions for
> > scalability), then none of the running producers will be able to send
> > any data. So is there any feature or setting that would shrink the
> > batch size to fit the increase? I am sure others will face the same
> > issue. Had I configured block.on.buffer.full=true it would be even
> > worse and would block application threads. Our use case is that the
> > *logger.log(msg)* method cannot be blocked, which is why we set that
> > configuration to false.
> >
> > So I am sure others will run into this same issue. I am trying to find
> > the optimal solution and a recommendation from the Kafka dev team for
> > this particular use case (which may become common).
> >
> > Thanks,
> >
> > Bhavesh
> >
> > On Tue, Nov 4, 2014 at 3:12 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >
> > > Hey Bhavesh,
> > >
> > > Here is what your configuration means:
> > >
> > > buffer.memory=64MB # This means don't use more than 64MB of memory
> > > batch.size=1MB # This means allocate a 1MB buffer for each partition
> > > with data
> > > block.on.buffer.full=false # This means immediately throw an exception
> > > if there is not enough memory to create a new buffer
> > >
> > > Not sure what linger time you have set.
> > >
> > > So what you see makes sense. If you have 1MB buffers and 32
> > > partitions, then you will have approximately 32MB of memory in use
> > > (actually a bit more than this, since one buffer will be filling
> > > while another is sending). If you have 128 partitions, then you will
> > > try to use 128MB, and since you have configured the producer to fail
> > > when you reach 64MB (rather than waiting for memory to become
> > > available), that is what happens.
> > >
> > > I suspect you want a smaller batch size. More than 64k is usually not
> > > going to help throughput.
> > >
> > > -Jay
> > >
> > > On Tue, Nov 4, 2014 at 11:39 AM, Bhavesh Mistry <
> > > mistry.p.bhav...@gmail.com> wrote:
> > >
> > > > Hi Kafka Dev,
> > > >
> > > > With the new producer, we have to change the number of partitions
> > > > for a topic, and when we do we hit BufferExhaustedException.
> > > >
> > > > Here is an example: we have set 64MiB of buffer memory, 32
> > > > partitions, and a 1MiB batch size. When we increase the partition
> > > > count to 128, the producer throws BufferExhaustedException right
> > > > away (non-key-based messages), because a buffer is allocated per
> > > > partition based on batch.size. It is a very common need to
> > > > auto-calculate the batch size when partitions increase, because we
> > > > have about ~5000 boxes and it is not practical to redeploy code to
> > > > all machines just to expand partitions for scalability. What
> > > > options are available when the new producer is already running,
> > > > partitions need to increase, and there is not enough buffer to
> > > > allocate a batch for each additional partition?
> > > >
> > > > buffer.memory=64MiB
> > > > batch.size=1MiB
> > > > block.on.buffer.full=false
> > > >
> > > > Thanks,
> > > >
> > > > Bhavesh
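P.S. For anyone else who runs into this: below is a rough sketch of how a non-blocking logging path could guard against BufferExhaustedException instead of blocking. The class, topic name, and broker string are made up for illustration; this is not our production code, and it assumes the 0.8.2-era new producer where block.on.buffer.full=false makes send() throw rather than block (later clients replace that setting with max.block.ms).

```java
import java.util.Properties;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.kafka.clients.producer.BufferExhaustedException;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Hypothetical non-blocking logger wrapper; names are illustrative only.
public class NonBlockingKafkaLogger {
    private final Producer<String, String> producer;
    private final AtomicLong dropped = new AtomicLong();

    public NonBlockingKafkaLogger(String brokers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", brokers);                       // placeholder broker list
        props.put("buffer.memory", String.valueOf(64L * 1024 * 1024)); // 64MiB pool
        props.put("batch.size", String.valueOf(32 * 1024));            // 32KB per-partition batch, per Jay's suggestion
        props.put("block.on.buffer.full", "false");                    // setting discussed in this thread
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        this.producer = new KafkaProducer<>(props);
    }

    /** Fire-and-forget: never blocks the caller; drops and counts on buffer exhaustion. */
    public void log(String msg) {
        try {
            producer.send(new ProducerRecord<>("app-logs", msg)); // "app-logs" is a placeholder topic
        } catch (BufferExhaustedException e) {
            dropped.incrementAndGet(); // don't block logger.log(msg); just count the drop
        }
    }
}
```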