Monitoring and killing the bad consumer is not a good solution for me; if my consumer can't keep up with its partition even while there are idle threads, then I have a bad design.
I can't design a system that in some situations fails to deliver thousands of emails because one thread couldn't keep up, even when I have enough partitions. So I understand that Kafka doesn't provide concurrency in the form that RabbitMQ provides; I just can't understand why any message should be delayed when I have enough machines and threads sitting idle.

On Thursday, September 2015, Helleren, Erik <erik.helle...@cmegroup.com> wrote:

> So, the general scalability approach with Kafka is to add more partitions to scale. If you are using consumer groups and the High Level Consumer API, redistribution of partitions is automatic on a failover of a member of a consumer group. But the High Level Consumer doesn't allow a configuration to break up partitions, as is noted here:
> https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
> There isn't really any way for multiple separate clients on separate JVMs to coordinate their consumption off of a single partition efficiently. So the solution is simply to break up a topic into enough partitions so that a single partition is a reasonable unit to scale a consumer by. If a consumer can only handle a single partition or, worse, is falling behind, your partitions are too large and need to be adjusted.
>
> And if for some reason a process hangs on a partition, kill it and start up a new one. Provided partitions are a reasonable unit of scale, it shouldn't be a problem. There will be a latency spike, but that's better than starvation. You can split processing of a single partition pretty easily within a JVM: the Kafka consuming runnable can just put messages into a concurrent queue of some sort, and then have a large thread pool pulling from that queue to do the processing. That way, if a thread in the pool gets hung, there are many left to consume off the queue, so nothing gets hung up. But this adds some risk on failover based on how Kafka does offset management for the high level consumer. (A short sketch of this queue-and-thread-pool pattern follows at the end of this message.)
>
> So, I don't think that sending backoff messages to a producer to let up on a partition is a good design pattern for Kafka. Again, the solution is more partitions. But offset data is stored in either Kafka or ZooKeeper, depending on your configuration, which can tell you how many messages your consumer is behind by. But since messages being published should be evenly distributed across all partitions for a topic, all partitions should be lagging equally.
>
> If you need a true unified queue, RabbitMQ might be right for your needs. But if order doesn't matter at all, Kafka should give you more throughput with enough partitions. And since order doesn't matter, you have a lot of flexibility here.
>
> Also, another option to doing everything in a native Java client is to use a Spark application. It makes fanning out your data very easy, and has some semantics that make it well suited for some of these concerns.
>
> On 9/10/15, 9:54 AM, "Reza Aliakbari" <raliakb...@gmail.com> wrote:
>
> > Hi Everybody,
> >
> > I have two questions regarding the way consumers consume messages from a partition.
> >
> > - Is it possible to configure Kafka to allow concurrent message consumption from a single partition? The order is not my concern at all.
> >   I couldn't find any way to do that with the consumer group approach. If it is possible, please let me know; if it is impossible, then let me know how to address this problem: for some reason, a consumer that is assigned to a partition could get very slow, and its messages would be processed very slowly. How can I detect this and stop producing to this slow partition?
> >
> > - Suppose I have 5 partitions and 3 consumers, and I am using the consumer group model (I had 5 consumers at the start, but 2 servers crashed). The 3 consumers are busy with their 3 partitions and never finish, since the producer produces to their partitions non-stop and a little faster than they can consume. What happens to the other 2 partitions that are missing consumers? How does the consumer group handle this?
> >
> > Order does not matter to me. I need a simple configuration that addresses my concurrency needs, and I need to make sure no message ends up in a starvation scenario where it is never consumed.
> >
> > Please let us know. We want to choose between Kafka and RabbitMQ, and we prefer Kafka because of its growing community and high throughput, but first we need to address these basic needs.
> >
> > Thanks,
> >
> > Reza Aliakbari
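
For reference, here is a minimal sketch of the in-JVM fan-out Erik describes: a single consumer thread reads the partition and hands each record to a bounded queue, and a pool of worker threads drains that queue in parallel, so one slow message does not stall the rest of the partition. This is only an illustration under stated assumptions, not the High Level Consumer Erik refers to: it uses the newer org.apache.kafka.clients.consumer.KafkaConsumer API (kafka-clients 2.0+ for poll(Duration)), and the broker address, topic name ("emails"), group id, pool size, and sendEmail() step are hypothetical placeholders. As Erik warns, offset management is the weak spot: with auto-commit, offsets can be committed for records that are still sitting in the queue, so a crash may skip them.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class QueueFanOutConsumer {

    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumption: local broker
        props.put("group.id", "email-workers");             // hypothetical group id
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        // Bounded hand-off queue: when every worker is busy, the consumer
        // thread blocks here instead of buffering the whole partition.
        BlockingQueue<ConsumerRecord<String, String>> queue = new ArrayBlockingQueue<>(1000);

        // Worker pool draining the queue; one hung thread leaves the rest free.
        ExecutorService workers = Executors.newFixedThreadPool(16);
        for (int i = 0; i < 16; i++) {
            workers.submit(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    try {
                        ConsumerRecord<String, String> record = queue.take();
                        sendEmail(record.value());           // hypothetical processing step
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();  // stop on shutdown
                    }
                }
            });
        }

        // Single consumer thread: it only reads from Kafka and enqueues.
        // Ordering within the partition is lost once workers run in parallel.
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("emails")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    queue.put(record);  // blocks when full: back-pressure on the poll loop
                }
                // Caveat from the thread above: auto-commit (the default) may commit
                // offsets for records still waiting in the queue; a crash can skip them.
            }
        } finally {
            workers.shutdownNow();
        }
    }

    private static void sendEmail(String payload) {
        System.out.println("processing " + payload);         // placeholder for real work
    }
}

Partition order is gone as soon as the workers run in parallel, which matches the "order does not matter" constraint in the original question; the bounded queue provides back-pressure so a slow pool does not pull the whole partition into memory.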