Need feedback to the consumer when it hangs because some __consumer_offsets partitions failed

2018-01-08 Thread 1095193...@qq.com
Hi,
My Kafka cluster has two brokers (0 and 1), and the __consumer_offsets topic 
it created has 50 partitions. When broker 0 failed, half of those partitions 
became unavailable (because the __consumer_offsets topic has a replication 
factor of 1). As a consequence, my KafkaConsumer sometimes works well and 
sometimes hangs with no output. The consumer generates a new group id on 
startup and then asks a broker for the group coordinator of that group id. 
When the group id is routed to a healthy partition, the consumer works well; 
when it is routed to a failed partition, it hangs with a 
GROUP_COORDINATOR_NOT_AVAILABLE response. I hope the broker can give feedback 
to the consumer when it hangs because some __consumer_offsets partitions have 
failed; that would improve the user experience and help users debug the issue 
quickly.
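
For context on why this is group-id dependent: the broker chooses the group 
coordinator by hashing the group id onto a __consumer_offsets partition, and 
the leader of that partition becomes the coordinator. A minimal Java sketch of 
that mapping (mirroring the broker's partitionFor logic; the group id here is 
illustrative):

// Which __consumer_offsets partition owns a given consumer group.
// The leader broker of that partition acts as the group coordinator.
public class CoordinatorPartition {
    static int partitionFor(String groupId, int offsetsTopicPartitions) {
        // Mask keeps the hash non-negative, as Kafka's Utils.abs does.
        return (groupId.hashCode() & 0x7fffffff) % offsetsTopicPartitions;
    }

    public static void main(String[] args) {
        System.out.println(partitionFor("my-group", 50)); // 50 partitions, as above
        // If this partition's only replica is on the failed broker, the
        // FindCoordinator request answers GROUP_COORDINATOR_NOT_AVAILABLE.
    }
}

Setting offsets.topic.replication.factor greater than 1 before the topic is 
first created avoids losing coordinators with a single broker failure.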



1095193...@qq.com


Re: What's the use of timestamp in ProducerRecord?

2018-01-18 Thread 1095193...@qq.com
Kafka does not delete a message when it is consumed; it purges messages once 
they expire under the retention settings. I guess this timestamp is used for 
checking whether a message has expired.
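
To the last question below: yes, the consumer can read it. A minimal sketch of 
reading the timestamp back (bootstrap address and topic name are illustrative; 
poll(Duration) assumes a 2.0+ client, older clients use poll(long)):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class TimestampReader {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative
        props.put("group.id", "timestamp-demo");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic")); // illustrative
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                // timestamp() is CreateTime (set by the producer, defaulting to
                // now) or LogAppendTime, per the topic's message.timestamp.type.
                System.out.printf("offset=%d timestamp=%d type=%s%n",
                        record.offset(), record.timestamp(), record.timestampType());
            }
        }
    }
}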



1095193...@qq.com
 
From: Jake Yoon
Date: 2018-01-19 11:46
To: users
Subject: What's the use of timestamp in ProducerRecord?
Hi, I am very new to Kafka.
And I have a very basic question.
 
Kafka doc says,
 
ProducerRecord(String topic, Integer partition, Long timestamp, K key, V value)
<https://kafka.apache.org/0100/javadoc/org/apache/kafka/clients/producer/ProducerRecord.html#ProducerRecord(java.lang.String,%20java.lang.Integer,%20java.lang.Long,%20K,%20V)>
 
and I know the default timestamp is the current time. But I am not sure
what's the use of it.
 
- Is it just to log when the record was added?
- What does Kafka use it for?
- Are there any other uses for it?
- Can the consumer retrieve the timestamp of the ProducerRecord?
 
Thanks,


Visual tool for kafka?

2018-10-18 Thread 1095193...@qq.com
Hi,
   I need a visual tool for Kafka, for example a web UI that can display the 
details of each topic and the offset of each consumer. Are there any 
recommended visual tools?



1095193...@qq.com


Re: Whether the Kafka broker will be impacted by a 2 MB message size

2019-03-13 Thread 1095193...@qq.com
We have a use case where we occasionally want to produce messages to Kafka
with a max size of 2 MB (the message size varies with user operations).

Will producing 2 MB messages have any impact, or do we need to split each
message into smaller chunks, such as 100 KB, and produce those?

If we produce small chunks, it will increase the response time for the user.
Also, we have tested producing 2 MB messages to Kafka and we don't see much
latency there.

Either way, splitting the data before producing doesn't change the total disk
usage. But will broker performance degrade because of this?

Our broker configuration is:

RAM: 125.6 GB, Disk: 2.9 TB, Processors: 40

Thanks,
Hemnath K B.
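
For reference, 2 MB is above Kafka's default limit of roughly 1 MB per 
message, so the caps have to be raised in several places. A minimal 
producer-side sketch (values, addresses and the topic name are illustrative; 
the matching broker settings are noted in the comments):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class LargeMessageProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("max.request.size", "3145728"); // default ~1 MB; give 2 MB messages headroom
        props.put("compression.type", "lz4");     // shrinks large payloads on the wire and disk

        // The broker must also allow large messages (server.properties):
        //   message.max.bytes=3145728
        //   replica.fetch.max.bytes=3145728
        // and consumers may need max.partition.fetch.bytes >= the message size.

        try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props)) {
            byte[] payload = new byte[2 * 1024 * 1024]; // a 2 MB message
            producer.send(new ProducerRecord<>("user-ops", payload));
            producer.flush();
        }
    }
}

Occasional 2 MB messages mainly cost replication bandwidth and page cache; on 
hardware like that described above the impact is usually small, which matches 
the latency test mentioned in the question.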


Re: Using kafka with RESTful API

2019-03-18 Thread 1095193...@qq.com
Hi,
 A request-response structure is more suitable for your scenario; you should 
stick with the RESTful API for the response path rather than routing it 
through Kafka.
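
That said, if you do want the reply path to go through Kafka, the usual 
approach is a correlation id: the API attaches a unique id to each request, 
and a reply listener matches responses back to the waiting HTTP request. A 
minimal sketch (the topic names "requests"/"replies" and the header key are 
illustrative):

import java.nio.charset.StandardCharsets;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.Header;

// Sketch of the correlation-id pattern for request-reply over Kafka.
public class ReplyCorrelator {
    // HTTP requests still waiting for an answer, keyed by correlation id.
    private final ConcurrentMap<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
    private final KafkaProducer<String, String> producer;

    ReplyCorrelator(KafkaProducer<String, String> producer) {
        this.producer = producer;
    }

    // Called by the REST handler: publish the request and return a future that
    // the handler completes the HTTP response with once the reply arrives.
    CompletableFuture<String> send(String requestJson) {
        String correlationId = UUID.randomUUID().toString();
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(correlationId, future);
        ProducerRecord<String, String> record = new ProducerRecord<>("requests", requestJson);
        record.headers().add("correlation-id", correlationId.getBytes(StandardCharsets.UTF_8));
        producer.send(record);
        return future;
    }

    // Called from the poll loop consuming the "replies" topic: hand each reply
    // to whichever request is waiting on its correlation id.
    void onReply(ConsumerRecord<String, String> reply) {
        Header header = reply.headers().lastHeader("correlation-id");
        if (header == null) {
            return; // not a correlated reply
        }
        CompletableFuture<String> future =
                pending.remove(new String(header.value(), StandardCharsets.UTF_8));
        if (future != null) {
            future.complete(reply.value());
        }
    }
}

This only pairs up correctly when the API instance that consumes the reply is 
the one holding the HTTP connection; with several instances you typically give 
each one its own reply topic so responses land where the request is waiting.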


1095193...@qq.com
 
From: Desmond Lim
Date: 2019-03-19 09:52
To: users
Subject: Using kafka with RESTful API
Hi all,
 
Just started using kafka yesterday and I have this question.
 
I have a RESTful API that gets JSON data from a user (via POST) and passes it
to an app; the app computes the data and returns it to the RESTful API, and
the API displays the results to the user.
 
I'm trying to do this:
 
The RESTful API would send the data to kafka. The app will get the data,
compute and return the results to the user.
 
Sending the data to Kafka and handling it in the app I know how to do; I'm
stuck on the second part. How can I get the results back to the RESTful API?
 
There are 2 scenarios:
 
1. It returns the result via Kafka again, and the RESTful API gets the data
and returns it. But I can't understand how this would work with a queue. If I
have 5 users using it, how can I ensure that the right result gets returned to
the right user?
 
Also, I assume that I need a while loop to check the kafka topic while it
waits for the results.
 
2. The second is not to use Kafka at all for returning the results, but I
don't know how that can be done, and I think it would make this really
complex (I might be wrong).
 
Any help would be appreciated. Thanks.
 
Desmond


Re: Consumer poll stuck on

2019-03-20 Thread 1095193...@qq.com
Hi,
Since, as you said, the consumer got stuck after one node failed, you should 
check whether the partitions of the relevant topics are in the ISR (in-sync 
replica set), using the kafka-topics command. The two topics to pay attention 
to are __consumer_offsets and your business topic; check whether all of their 
partitions are in the ISR. For example,
./kafka-topics.sh --zookeeper 56.32.15.98:24002/kafka --describe --topic 
__consumer_offsets
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:2 
Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
Topic: __consumer_offsets Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic: __consumer_offsets Partition: 1 Leader: 2 Replicas: 2,1 Isr: 1,2
Topic: __consumer_offsets Partition: 2 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic: __consumer_offsets Partition: 3 Leader: 2 Replicas: 2,1 Isr: 1,2
Topic: __consumer_offsets Partition: 4 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic: __consumer_offsets Partition: 5 Leader: 2 Replicas: 2,1 Isr: 1,2
Topic: __consumer_offsets Partition: 6 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic: __consumer_offsets Partition: 7 Leader: 2 Replicas: 2,1 Isr: 1,2
Topic: __consumer_offsets Partition: 8 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic: __consumer_offsets Partition: 9 Leader: 2 Replicas: 2,1 Isr: 1,2
Topic: __consumer_offsets Partition: 10 Leader: 1 Replicas: 1,2 Isr: 1,2
If any partition is not in the ISR, fix that first.
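
The same check can be scripted with the AdminClient API; a minimal sketch 
(bootstrap address is illustrative) that flags any __consumer_offsets 
partition whose ISR is smaller than its replica set:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartitionInfo;

public class IsrCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative
        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription desc = admin
                    .describeTopics(Collections.singleton("__consumer_offsets"))
                    .all().get().get("__consumer_offsets");
            for (TopicPartitionInfo p : desc.partitions()) {
                // Healthy: every replica is in the ISR and a leader exists.
                if (p.isr().size() < p.replicas().size() || p.leader() == null) {
                    System.out.printf("partition %d degraded: leader=%s isr=%s replicas=%s%n",
                            p.partition(), p.leader(), p.isr(), p.replicas());
                }
            }
        }
    }
}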


1095193...@qq.com
 
From: Manu Jacob
Date: 2019-03-21 09:39
To: users@kafka.apache.org; d...@kafka.apache.org
Subject: Consumer poll stuck on
Hi,
 
We have a Kafka cluster (version 1.1.1) where one node unexpectedly failed. 
After that, consumers from a couple of consumer groups are stuck in the poll() 
API call. Looking at the thread dump, it looks like the consumer is stuck in 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady()
 call. The heartbeat thread is also blocked waiting for the 
ConsumerCoordinator. Any idea what is the cause and how to resolve this issue? 
Thanks.
 
 
"BusinessEventRecordsDispatcherThread" #43 prio=5 os_prio=0 
tid=0x7f71764fb800 nid=0x241e sleeping[0x7f70d24e9000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
   at java.lang.Thread.sleep(Native Method)
   at org.apache.kafka.common.utils.SystemTime.sleep(SystemTime.java:45)
   at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:235)
   - locked <0x0004fc30e628> (a 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
   at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:205)
   - locked <0x0004fc30e628> (a 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
   at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:351)
   at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
   at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:290)
   at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1149)
   at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1115)
   ...
   at java.lang.Thread.run(Thread.java:748)
 
 
 
"kafka-coordinator-heartbeat-thread | prod-mkt-datahub-loader" #45 daemon 
prio=5 os_prio=0 tid=0x7f711a286800 nid=0x2422 in Object.wait() 
[0x7f7120125000]
   java.lang.Thread.State: WAITING (on object monitor)
   at java.lang.Object.wait(Native Method)
   - waiting on <0x0004fc30e628> (a 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
   at java.lang.Object.wait(Object.java:502)
   at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatThread.run(AbstractCoordinator.java:937)
   - locked <0x0004fc30e628> (a 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
 


Re: Partition Count Dilemma

2019-03-20 Thread 1095193...@qq.com
Hi,
The number of partitions drives the parallelism of consumers. In general, the 
more partitions there are, the more consumers can run in parallel and the more 
throughput can be provided. In other words, if you have 10 partitions, the 
maximum number of consumers in one group is 10. So estimate the throughput a 
single consumer can provide, C, and your target throughput, T. The minimum 
number of partitions, which is also the number of consumers you need, is then 
T/C.
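
For example, if a single consumer can process C = 20 MB/s and the target is 
T = 100 MB/s (illustrative numbers), you need at least 100 / 20 = 5 
partitions; in practice you provision more than that minimum to leave headroom 
for growth and rebalancing.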



1095193...@qq.com
 
From: shalom sagges
Date: 2019-03-21 06:43
To: users
Subject: Partition Count Dilemma
Hi All,
 
I'm really new to Kafka and wanted to know if anyone can help me better
understand partition count in relation to the Kafka cluster (apologies in
advance for noob questions).
 
I was requested to increase a topic's partition count from 30 to 100 in
order to increase workers' parallelism (there are already other topics in
this cluster with 100-200 partition counts per topic).
The cluster is built of 4 physical servers. Each server has 132 GB RAM, 40
CPU cores, 6 SAS disks 1.1 TB each.
 
Is PartitionCount:100 considered a high number of partitions per topic in
relation to the cluster?
Is there a good way for me to predetermine what an optimal partition count
might be?
 
Thanks a lot!


How to prevent data loss in a "read-process-write" application?

2019-06-02 Thread 1095193...@qq.com
Hi
   I have an application that consumes from Kafka, processes the data, and 
sends it back to Kafka. To prevent data loss, I need to commit the consumer 
offsets only after a batch of messages has been written to Kafka successfully. 
I have investigated the transactions feature, whose atomic writes to multiple 
partitions could solve my problem. Is there any other recommended solution 
besides enabling transactions (I don't need exactly-once processing)?
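
If exactly-once is not needed, the usual at-least-once pattern is what is 
described above: disable auto-commit, produce with acks=all, flush, then 
commit the consumed offsets. A minimal sketch (addresses and topic names are 
illustrative; a crash between flush and commit means the batch is reprocessed, 
i.e. duplicates rather than loss):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ReadProcessWrite {
    public static void main(String[] args) {
        Properties cprops = new Properties();
        cprops.put("bootstrap.servers", "localhost:9092"); // illustrative
        cprops.put("group.id", "rpw-app");
        cprops.put("enable.auto.commit", "false"); // commit manually, after the write succeeds
        cprops.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        cprops.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Properties pprops = new Properties();
        pprops.put("bootstrap.servers", "localhost:9092");
        pprops.put("acks", "all"); // wait for all in-sync replicas
        pprops.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        pprops.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cprops);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pprops)) {
            consumer.subscribe(Collections.singletonList("input"));
            while (true) {
                ConsumerRecords<String, String> batch = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : batch) {
                    String result = record.value().toUpperCase(); // stand-in for real processing
                    // In production, keep the Futures returned by send() and
                    // check them for errors before committing.
                    producer.send(new ProducerRecord<>("output", record.key(), result));
                }
                producer.flush();      // block until every send above is acknowledged
                consumer.commitSync(); // only now mark the inputs as consumed
            }
        }
    }
}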



1095193...@qq.com