I was wondering if there are any good rules of thumb for determining the optimal batch size for the producer. For example, let's say I have a group of producers that, in aggregate, produce about 40 million messages per minute with an average size of 700 bytes per message. With the default batch size of 16384 bytes, this would mean that only about 23 messages fit in each batch. We have enough memory to accommodate larger batch sizes, but I'm curious what the right trade-off is between batches that are too large and batches that are too small. If anyone has a rule-of-thumb calculation for picking a good batch size, that would be awesome.
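To make the numbers concrete, here is the rough arithmetic I'm working from, as a small Python sketch. The function names are just for illustration, and it assumes every message is exactly the average size and ignores any per-message or per-batch overhead, compression, and how batches are split across partitions, so treat the results as ballpark figures only.

```python
# Back-of-the-envelope numbers from the figures quoted above.
# Assumption: every message is exactly the average size, and no
# per-message/per-batch overhead or compression is accounted for.

MESSAGES_PER_MINUTE = 40_000_000   # aggregate across all producers
AVG_MESSAGE_BYTES = 700
BATCH_SIZE_BYTES = 16_384          # the default batch size mentioned above


def messages_per_batch(batch_bytes, message_bytes):
    """Whole messages that fit in one batch of the given size."""
    return batch_bytes // message_bytes


def aggregate_bytes_per_second(messages_per_minute, message_bytes):
    """Total payload rate across all producers, in bytes per second."""
    return messages_per_minute * message_bytes / 60


def batches_per_second(messages_per_minute, message_bytes, batch_bytes):
    """Full batches per second, in aggregate, if every batch is filled."""
    per_batch = messages_per_batch(batch_bytes, message_bytes)
    return messages_per_minute / 60 / per_batch


if __name__ == "__main__":
    print(messages_per_batch(BATCH_SIZE_BYTES, AVG_MESSAGE_BYTES))  # 23
    print(aggregate_bytes_per_second(MESSAGES_PER_MINUTE, AVG_MESSAGE_BYTES))
    print(batches_per_second(MESSAGES_PER_MINUTE, AVG_MESSAGE_BYTES,
                             BATCH_SIZE_BYTES))
    # Compare a few candidate batch sizes (hypothetical values).
    for size in (16_384, 65_536, 262_144):
        print(size, messages_per_batch(size, AVG_MESSAGE_BYTES))
```

In aggregate that works out to roughly 467 MB/s of payload. The per-producer figures depend on how many producers share the load, which I have left out here.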
Thanks! --- Andrew Jorgensen @ajorgensen