Need feedback to the consumer when it hangs due to failed __consumer_offsets partitions
Hi, my Kafka cluster has two brokers (0 and 1), and the __consumer_offsets topic it created has 50 partitions. When broker 0 failed, half of those partitions went offline (because the __consumer_offsets topic has ReplicationFactor 1). As a consequence, my KafkaConsumer sometimes works well and sometimes hangs with no output. My consumer generates a new group id on startup, and on startup it requests the group coordinator with that group id. When the group id is routed to a healthy partition, the consumer works well; when it is routed to a failed partition, it hangs with the response GROUP_COORDINATOR_NOT_AVAILABLE. I hope the broker can give feedback to the consumer when it hangs because some __consumer_offsets partitions have failed; that would improve the user experience and help users debug the issue faster. 1095193...@qq.com
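For context on why only *some* group ids hang: the broker picks a group's coordinator by hashing the group id onto a __consumer_offsets partition, roughly `Utils.abs(groupId.hashCode()) % offsets.topic.num.partitions`. The sketch below is my own illustration (not Kafka source code); `java_string_hash` mirrors Java's `String.hashCode()` for ASCII group ids, so you can see which partition a given group lands on.

```python
NUM_OFFSETS_PARTITIONS = 50  # default offsets.topic.num.partitions

def java_string_hash(s: str) -> int:
    """Re-implementation of Java's 32-bit String.hashCode()."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # fold into a signed 32-bit value, as Java would return
    return h - 0x100000000 if h >= 0x80000000 else h

def coordinator_partition(group_id: str) -> int:
    """__consumer_offsets partition that hosts this group's coordinator."""
    # & 0x7FFFFFFF matches Kafka's Utils.abs() on the signed hash
    return (java_string_hash(group_id) & 0x7FFFFFFF) % NUM_OFFSETS_PARTITIONS
```

A randomly generated group id therefore lands on a random partition; with half the partitions offline, roughly half of all startups hang.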
Re: What's the use of timestamp in ProducerRecord?
Kafka does not delete a message when it is consumed; it purges the message once it has expired. I guess this timestamp is used for checking whether a message has expired. 1095193...@qq.com
From: Jake Yoon Date: 2018-01-19 11:46 To: users Subject: What's the use of timestamp in ProducerRecord?
Hi, I am very new to Kafka, and I have a very basic question. The Kafka doc says:

ProducerRecord(String topic, Integer partition, Long timestamp, K key, V value)
<https://kafka.apache.org/0100/javadoc/org/apache/kafka/clients/producer/ProducerRecord.html#ProducerRecord(java.lang.String,%20java.lang.Integer,%20java.lang.Long,%20K,%20V)>

and I know the default timestamp is the current time. But I am not sure what it is used for.
- Is it just to log when the record is added?
- How does Kafka use it?
- Are there any other uses for it?
- Can the consumer retrieve the timestamp of the ProducerRecord?
Thanks,
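To make the retention guess concrete: record timestamps do feed time-based retention (Kafka applies it per log segment, using the largest timestamp in the segment, not per individual record), and a consumer can read the timestamp back via ConsumerRecord.timestamp(). A minimal sketch of the expiry comparison, assuming the default 168-hour retention:

```python
import time

RETENTION_MS = 168 * 60 * 60 * 1000  # default log.retention.hours=168 (7 days)

def is_expired(record_timestamp_ms, now_ms=None):
    """Illustrative only: in real Kafka the retention check runs per log
    segment, comparing the segment's largest timestamp against retention.ms."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - record_timestamp_ms > RETENTION_MS
```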
Visual tool for kafka?
Hi, I need a visual tool for Kafka. For example, a web UI that can display the details of each topic and the offset of each consumer. Are there any recommended visual tools? 1095193...@qq.com
Re: Whether kafka broker will have impact with 2 MB message size
We have a use case where we occasionally want to produce messages to Kafka with a maximum size of 2 MB (message size varies based on user operations). Will producing a 2 MB message have any impact, or do we need to split the message into small chunks, such as 100 KB, before producing? Splitting into small chunks would increase response time for the user. We have also tested producing a 2 MB message to Kafka and did not see much latency. Splitting the data before producing makes no difference to disk usage either way, but will broker performance degrade because of the larger messages? Our broker configuration is: RAM 125.6 GB, Disk Size 2.9 TB, Processors 40. Thanks, Hemnath K B.
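If splitting ever becomes necessary, the chunking itself is straightforward; the sketch below is my own illustration (not a Kafka API) of splitting a payload into 100 KB pieces and reassembling them on the consumer side. Note that the broker-side alternative is usually simpler: raise message.max.bytes on the broker/topic and max.request.size on the producer, since the defaults sit near 1 MB.

```python
CHUNK_SIZE = 100 * 1024  # 100 KB chunks, as discussed above

def split_message(payload: bytes, chunk_size: int = CHUNK_SIZE):
    """Producer side: split a large payload into chunks under the size limit."""
    return [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]

def reassemble(chunks):
    """Consumer side: concatenate chunks back into the original payload."""
    return b"".join(chunks)
```

A 2 MB payload splits into 21 chunks (twenty full 100 KB chunks plus a remainder); in a real system each chunk would also need a shared message id and sequence number so the consumer can group and order them.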
Re: Using kafka with RESTful API
Hi, a request-response structure is more suitable for your scenario; you should stick with the RESTful API rather than Kafka. 1095193...@qq.com
From: Desmond Lim Date: 2019-03-19 09:52 To: users Subject: Using kafka with RESTful API
Hi all, I just started using Kafka yesterday and I have this question. I have a RESTful API that gets JSON data from a user (via POST) and passes it to an app; the app computes on the data and returns it to the RESTful API, and the API displays the results to the user. I'm trying to do this: the RESTful API sends the data to Kafka, and the app gets the data, computes, and returns the results to the user. The sending to Kafka and the app I know how to do; I'm stuck on the second part: how can I get the results back to the RESTful API? There are two scenarios: 1. Return the results via Kafka again, and the RESTful API reads the data and returns it. But I can't understand how this would work with a queue: if I have 5 users using it, how can I ensure that the right result gets returned to the right user? Also, I assume I need a while loop that checks the Kafka topic while waiting for the results. 2. Don't use Kafka at all for returning results, but I don't know how that can be done, and I think it would make this really complex (I might be wrong). Any help would be appreciated. Thanks. Desmond
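For scenario 1, the usual answer is a reply topic plus a correlation id: the REST handler attaches a unique id to each request and blocks until a reply carrying the same id comes back, so concurrent users never receive each other's results. A hypothetical sketch with in-memory queues standing in for the two Kafka topics:

```python
import queue
import threading
import uuid

requests = queue.Queue()  # stands in for the request topic
pending = {}              # correlation id -> queue the REST handler waits on

def worker():
    """Stands in for the consuming app: compute, then reply with the same id."""
    while True:
        item = requests.get()
        if item is None:  # shutdown signal
            break
        corr_id, payload = item
        pending[corr_id].put(payload * 2)  # hypothetical "compute" step

def handle_post(payload, timeout=5.0):
    """REST handler: publish with a correlation id, block for the matching reply."""
    corr_id = str(uuid.uuid4())
    reply_q = queue.Queue(maxsize=1)
    pending[corr_id] = reply_q
    requests.put((corr_id, payload))
    result = reply_q.get(timeout=timeout)
    del pending[corr_id]
    return result
```

With real Kafka, `pending` lives inside the REST process and the worker publishes to a reply topic that the REST process consumes, matching replies to waiting handlers by correlation id.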
Re: Consumer poll stuck on
Hi, as you said, the consumer got stuck after one node failed; you should check whether the partitions of your topics are in the ISR (in-sync replicas) using the kafka-topics command. The two topics to pay attention to are __consumer_offsets and your business topic; check whether all of their partitions are fully in the ISR. For example:

./kafka-topics.sh --zookeeper 56.32.15.98:24002/kafka --describe --topic __consumer_offsets
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:2 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
	Topic: __consumer_offsets	Partition: 0	Leader: 1	Replicas: 1,2	Isr: 1,2
	Topic: __consumer_offsets	Partition: 1	Leader: 2	Replicas: 2,1	Isr: 1,2
	Topic: __consumer_offsets	Partition: 2	Leader: 1	Replicas: 1,2	Isr: 1,2
	Topic: __consumer_offsets	Partition: 3	Leader: 2	Replicas: 2,1	Isr: 1,2
	Topic: __consumer_offsets	Partition: 4	Leader: 1	Replicas: 1,2	Isr: 1,2
	Topic: __consumer_offsets	Partition: 5	Leader: 2	Replicas: 2,1	Isr: 1,2
	Topic: __consumer_offsets	Partition: 6	Leader: 1	Replicas: 1,2	Isr: 1,2
	Topic: __consumer_offsets	Partition: 7	Leader: 2	Replicas: 2,1	Isr: 1,2
	Topic: __consumer_offsets	Partition: 8	Leader: 1	Replicas: 1,2	Isr: 1,2
	Topic: __consumer_offsets	Partition: 9	Leader: 2	Replicas: 2,1	Isr: 1,2
	Topic: __consumer_offsets	Partition: 10	Leader: 1	Replicas: 1,2	Isr: 1,2

If any partition is not in the ISR, you can fix it. 1095193...@qq.com
From: Manu Jacob Date: 2019-03-21 09:39 To: users@kafka.apache.org; d...@kafka.apache.org Subject: Consumer poll stuck on
Hi, we have a Kafka cluster (version 1.1.1) where one node failed unexpectedly. After that, consumers from a couple of consumer groups are stuck in the poll() API call. Looking at the thread dump, it looks like the consumer is stuck in the org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady() call. The heartbeat thread is also blocked waiting for the ConsumerCoordinator. Any idea what the cause is and how to resolve this issue? Thanks.
"BusinessEventRecordsDispatcherThread" #43 prio=5 os_prio=0 tid=0x7f71764fb800 nid=0x241e sleeping[0x7f70d24e9000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
	at java.lang.Thread.sleep(Native Method)
	at org.apache.kafka.common.utils.SystemTime.sleep(SystemTime.java:45)
	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:235)
	- locked <0x0004fc30e628> (a org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:205)
	- locked <0x0004fc30e628> (a org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:351)
	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:290)
	at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1149)
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1115)
	...
	at java.lang.Thread.run(Thread.java:748)

"kafka-coordinator-heartbeat-thread | prod-mkt-datahub-loader" #45 daemon prio=5 os_prio=0 tid=0x7f711a286800 nid=0x2422 in Object.wait() [0x7f7120125000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x0004fc30e628> (a org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
	at java.lang.Object.wait(Object.java:502)
	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatThread.run(AbstractCoordinator.java:937)
	- locked <0x0004fc30e628> (a org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
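The ISR check suggested above can be automated. This is my own parsing sketch (not a Kafka tool) that scans `kafka-topics.sh --describe` output for partitions whose ISR is smaller than their replica list:

```python
def under_replicated(describe_output: str):
    """Return (partition, replicas, isr) tuples for partitions not fully in the ISR."""
    bad = []
    for line in describe_output.splitlines():
        tokens = line.split()
        if "Partition:" not in tokens:
            continue  # skip the summary line and blanks
        # detail lines look like:
        #   Topic: X  Partition: 0  Leader: 1  Replicas: 1,2  Isr: 1,2
        fields = {tokens[i].rstrip(":"): tokens[i + 1]
                  for i in range(len(tokens) - 1) if tokens[i].endswith(":")}
        replicas = set(fields["Replicas"].split(","))
        isr = set(fields["Isr"].split(","))
        if isr != replicas:
            bad.append((int(fields["Partition"]), sorted(replicas), sorted(isr)))
    return bad
```

Running it over the --describe output and getting a non-empty result tells you exactly which __consumer_offsets partitions are missing replicas and therefore which consumer groups can stall in ensureCoordinatorReady().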
Re: Partition Count Dilemma
Hi, the number of partitions drives the parallelism of consumers. In general, the more partitions you have, the more parallel consumers can be added and the more throughput can be provided. In other words, if you have 10 partitions, the maximum number of consumers (in one group) is 10. So assume the throughput a single consumer can provide is C and your target throughput is T; then the minimum number of partitions, which is also the number of consumers, is T/C. 1095193...@qq.com
From: shalom sagges Date: 2019-03-21 06:43 To: users Subject: Partition Count Dilemma
Hi All, I'm really new to Kafka and wanted to know if anyone can help me better understand partition count in relation to the Kafka cluster (apologies in advance for noob questions). I was asked to increase a topic's partition count from 30 to 100 in order to increase workers' parallelism (there are already other topics in this cluster with 100-200 partitions per topic). The cluster is built of 4 physical servers; each server has 132 GB RAM, 40 CPU cores, and 6 SAS disks of 1.1 TB each. Is PartitionCount:100 considered a high number of partitions per topic for a cluster of this size? Is there a good way to predetermine what an optimal partition count might be? Thanks a lot!
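The T/C rule of thumb above reduces to a one-liner; the figures below are made-up examples, not measurements from this cluster:

```python
import math

def min_partitions(target_throughput: float, per_consumer_throughput: float) -> int:
    """Minimum partition count so enough parallel consumers can hit the
    target: ceil(T / C), since partition count caps group parallelism."""
    return math.ceil(target_throughput / per_consumer_throughput)
```

For example, a hypothetical target of 1000 MB/s with consumers that each handle about 50 MB/s needs at least min_partitions(1000, 50) = 20 partitions; producer-side throughput per partition should be checked the same way, and the larger of the two estimates wins.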
How to prevent data loss in "read-process-write" application?
Hi, I have an application that consumes from Kafka, processes the data, and sends it back to Kafka. To prevent data loss, I need to commit the consumer offset only after a batch of messages has been committed to Kafka successfully. I have investigated the Transactions feature, which provides atomic writes to multiple partitions and could solve my problem. Is there any other recommended solution besides enabling transactions (I don't need exactly-once processing)? 1095193...@qq.com
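Without transactions, the usual at-least-once answer is exactly the ordering described above: produce the batch, wait for acknowledgements, and only then commit the consumer offsets, accepting possible duplicates after a crash. A toy sketch of that ordering with in-memory stand-ins (not real client calls; in real code the ack step is a producer flush() or delivery callback, and the commit is commitSync()):

```python
class FakeBroker:
    """Hypothetical stand-in for the downstream cluster and offset store."""
    def __init__(self):
        self.output = []            # records produced downstream
        self.committed_offset = -1  # last committed consumer offset

def read_process_write(broker, batch, base_offset):
    """At-least-once: commit offsets only after every produce is acknowledged."""
    acked = 0
    for record in batch:
        broker.output.append(record.upper())  # "process" + produce
        acked += 1                            # ack arrives (flush/callback)
    if acked == len(batch):  # all produces confirmed -> safe to commit
        broker.committed_offset = base_offset + len(batch) - 1
    return broker.committed_offset
```

If the process dies between producing and committing, the batch is reprocessed on restart, so nothing is lost; deduplication, if needed, is the downstream consumer's job.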