Hey
I had an issue in production two days ago.
For some reason 2 brokers in my 5-broker cluster were stuck, meaning their
process was up, but they didn't respond on port 9092. ZooKeeper still saw
them as live brokers.
Producers couldn't produce events to them and consumers couldn't consume.
If you can't see the image, I uploaded it to Dropbox:
https://www.dropbox.com/s/gckn4gt7gv26l9w/graph.png
From: Guy Doulberg [mailto:guy.doulb...@perion.com]
Sent: Monday, August 11, 2014 4:58 PM
To: users@kafka.apache.org
Subject: RE: Consume more than produce
This seems to happen when fetching the metadata. Are you using a VIP as
broker.list?
Thanks,
Jun
On Fri, Aug 8, 2014 at 2:42 PM, S. Zhou wrote:
> Thanks Guozhang. Any ideas on what could be wrong on that machine? We set
> up multiple producers in the same way but only one has this issue.
>
>
Sorry, I should have provided this information. All my topics have a single
partition, meaning there are 2 nodes that still host the topic partition's
replicas. It's just that the node holding the 3rd replica is down. So if one
message fails, there is no reason why another message should succeed, unless
there is a network issue.
I am not sure I understand completely. How many brokers did you shut down
out of the total of 5 brokers? With single-partition topics, if the
replication factor is 3, this partition will be hosted on 3 brokers.
Guozhang
On Mon, Aug 11, 2014 at 9:46 AM, Tanneru, Raj wrote:
> Sorry I shoul
I shut down 3 out of 5. With 2 brokers I start seeing failures after successfully
sending some messages. Not all messages are failing. I wanted to understand the
case(s) in which we log the message below. If you notice, there is a difference
between the actual and requested send/receive buffer sizes. I don't see this
The offset checker does show a lot of lag.
rain-raw-consumers1 rain-raw-listner 47 630138482 32181 rain-raw-consumers1_dm1mad06.echostar.com-1407776221959-74777cd2-3
From: Seshadri, Balaji
Sent: Monday, August 11, 2014 12:11 PM
To: 'jun@gmail.com
Hi Raj,
I have a couple more questions for you:
1. On the server configs, did you set enable.controlled.shutdown to true or
not? (See the config sketch after these questions.)
2. When you shut down just one broker, did you see any errors? Here I am
assuming you are not shutting down brokers too quickly, but shut down one
broker at a time, waiting in between?
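For reference, a minimal server.properties sketch for question 1, assuming
0.8.1.x, where the property is actually spelled controlled.shutdown.enable:

  # server.properties (0.8.1.x): have the broker move leadership off
  # itself before shutting down its socket server
  controlled.shutdown.enable=true
  # optional knobs, shown with their 0.8.1 defaults
  controlled.shutdown.max.retries=3
  controlled.shutdown.retry.backoff.ms=5000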
Thank you all for the comments. Yes, I understand the concern from community
members about the extra burden of carrying the complexity to drop messages, but
the ability to inject an implementation of the Queue would make this
completely transparent to Kafka.
I just need fine-grained control of the applicati
I have a single-broker test Kafka instance that was running fine on Friday
(basically out-of-the-box configuration with 2 partitions); now I come back
on Monday and producers are unable to send messages.
What else can I look at to debug, and to prevent this?
I know how to recover by removing data directo
Any pointers would be helpful.
From: Seshadri, Balaji
Sent: Monday, August 11, 2014 12:19 PM
To: 'jun@gmail.com'; 'neha.narkh...@gmail.com'; 'users@kafka.apache.org'
Subject: RE: Kafka Consumer not consuming in webMethods.
The offset checker does show a lot of lag.
rain-raw-consumers1 rain-raw-li
Hi Kafka Dev Team,
We have to aggregate events (count) per DC and across DCs for one of our topics.
We have the standard LinkedIn data pipeline: producers --> Local Brokers -->
MM --> Center Brokers.
So I would like to know how MM handles messages when custom partitioning
logic is used, as below.
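The snippet referenced above did not come through. As a purely hypothetical
illustration of a custom partitioner under the 0.8 producer API (the class
name DcAwarePartitioner and the hashing scheme are made up, not the poster's
code):

  import kafka.producer.Partitioner;
  import kafka.utils.VerifiableProperties;

  // Hypothetical example only. Wired in on the producer (or on MM's
  // producer config) via partitioner.class=com.example.DcAwarePartitioner
  public class DcAwarePartitioner implements Partitioner {
      // 0.8 instantiates partitioners reflectively through this constructor
      public DcAwarePartitioner(VerifiableProperties props) {}

      @Override
      public int partition(Object key, int numPartitions) {
          // route all events for the same key to the same partition
          return (key.hashCode() & 0x7fffffff) % numPartitions;
      }
  }

Note that MM is just a consumer plus a producer, so the partition an event
lands in on the center cluster is decided by the keys and the partitioner
configured on MM's own producer, not by the partition it occupied on the
local cluster.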
Folks,
Is there any potential issue with creating 240 topics every day? Although
the retention of each topic is set to be 2 days, I am a little concerned
that, since right now there is no delete-topic API, ZooKeeper might be
overloaded.
Thanks,
Chen
Hi,
We are using the following method on ConsumerConnector to get multiple
streams per topic, and we have multiple partitions per topic. It looks like
only one of the runnables is active over a relatively long time period. Is
there anything we could possibly have missed?
public Map<String, List<KafkaStream<byte[], byte[]>>>
createMessageStreams(Map<String, Integer> topicCountMap);
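For comparison, the usual pattern around that method is one thread per
stream, roughly like the following sketch (0.8 high-level consumer;
'consumer' is an existing ConsumerConnector, and topic, numThreads and
process() are placeholders):

  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;
  import java.util.concurrent.ExecutorService;
  import java.util.concurrent.Executors;
  import kafka.consumer.ConsumerIterator;
  import kafka.consumer.KafkaStream;

  Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
  topicCountMap.put(topic, numThreads);   // ask for numThreads streams
  Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap =
      consumer.createMessageStreams(topicCountMap);

  ExecutorService executor = Executors.newFixedThreadPool(numThreads);
  for (final KafkaStream<byte[], byte[]> stream : consumerMap.get(topic)) {
      executor.submit(new Runnable() {
          public void run() {
              ConsumerIterator<byte[], byte[]> it = stream.iterator();
              while (it.hasNext())
                  process(it.next().message());   // placeholder handler
          }
      });
  }

Also note that a stream only ever sees the partitions assigned to it, so
with fewer partitions than streams some threads will sit idle, which can
look like "only one runnable is active."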
Hi Guozhang,
I didn't set enable.controlled.shutdown to true. Yes, I am shutting down 1
broker at a time, slowly. However, I begin the test (2 clients producing
messages) a long time after taking down the brokers. I see the debug message
below on a live broker once in a while.
[2014-08-11 15:09:56,078
Could you try again by setting the config to true?
On Mon, Aug 11, 2014 at 3:14 PM, Tanneru, Raj wrote:
> Hi Guozhang,
>
> I didn't set enable.controlled.shutdown to true. Yes I am shutting down 1
> broker at a time slowly. However I begin the test(2 clients producing
> messages) long time afte
Is it in any way related to the issue?
WARN No previously checkpointed highwatermark value found for topic RAW
partition 0. Returning 0 as the highwatermark
(kafka.server.HighwaterMarkCheckpoint)
Mingtao
You need to consider your total partition count as you do this. After 30
days, assuming 1 partition per topic, you have 7200 partitions. Depending
on how many brokers you have, this can start to be a problem. We just
found an issue on one of our clusters that has over 70k partitions that
there's no
Hi Ryan,
Could you check if all of your brokers are still live and running? Also
could you check the server log in addition to the producer / state-change /
controller logs?
Guozhang
On Mon, Aug 11, 2014 at 12:45 PM, Ryan Williams wrote:
> I have a single broker test Kafka instance that was r
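One quick way to check the "live" part Guozhang mentions is to look at the
ephemeral broker registrations in ZooKeeper, e.g. with the zkclient library
that 0.8 already ships with (the connection string is a placeholder):

  import java.util.List;
  import org.I0Itec.zkclient.ZkClient;

  // Every live broker keeps an ephemeral znode under /brokers/ids;
  // an empty list means no broker is currently registered.
  ZkClient zk = new ZkClient("localhost:2181", 10000, 10000);
  List<String> liveBrokerIds = zk.getChildren("/brokers/ids");
  System.out.println("live broker ids: " + liveBrokerIds);
  zk.close();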
Mingtao,
How many partitions does the consumed topic have? Basically the data is
distributed per-partition, and hence if the number of consumers is larger
than the number of partitions, some consumers will not get any data.
Guozhang
On Mon, Aug 11, 2014 at 3:29 PM, Mingtao Zhang
wrote:
> Is it a
Todd,
I actually only intend to keep each topic valid for 3 days at most. Each of
our topics has 3 partitions, so it's around 3*240*3 = 2160 partitions. Since
there is no API for deleting topics, I guess I could set up a cron job
deleting the outdated topics (folders) from ZooKeeper...
do you know when the
I'd love to know more about what you're trying to do here. It sounds like
you're trying to create topics on a schedule, trying to make it easy to locate
data for a given time range? I'm not sure it makes sense to use Kafka in this
manner.
Can you provide more detail?
Philip
The broker appears to be running
$ telnet kafka-server 9092
Trying...
Connected to kafka-server
Escape character is '^]'.
I've attached today's server.log. There was a manual restart of kafka,
which you'll notice, but that didn't fix the issue.
Thanks for looking!
On Mon, Aug 11, 2014 a
Philip,
That is right. There is huge amount of data flushed into the topic within
each 6 minutes. Then at the end of each 6 min, I only want to read from
that specify topic, and data within that topic has to be processed as fast
as possible. I was originally using redis queue for this purpose, but
It's still not clear to me why you need to create so many topics.
Write the data to a single topic and consume it when it arrives. It doesn't
matter if it arrives in bursts, as long as you can process it all within 6
minutes, right?
And if you can't consume it all within 6 minutes, partition the topic until
you can run enough consumers such that you can keep up.
Hi Guozhang,
I do have another email talking about partitions per topic. I pasted it
into this email.
I am expecting those consumers to work concurrently. The behavior I
observed here is that consumer thread-1 works for a while, then thread-3
works, then thread-0 ... Is that normal?
version is
"And if you can't consume it all within 6 minutes, partition the topic
until you can run enough consumers such that you can keep up.", this is
what I intend to do for each 6-min topic.
What I really need is a partitioned queue: each 6 minutes of data can be put
into a separate partition, so that I can
Why do you need to read it every 6 minutes? Why not just read it as it arrives?
If it naturally arrives in 6 minute bursts, you'll read it in 6 minute bursts,
no?
Perhaps the data does not have timestamps embedded in it, so that is why you
are relying on time-based topic names? In that case I w
The data has a timestamp: it's actually email campaigns with scheduled
send times. But since they can be scheduled ahead (e.g., two days ahead), I
cannot read them when they arrive. They have to wait until their actual
scheduled send time. As you can tell, the sequence within the 6 min does not
matter, but
Your use case requires messages to be pushed out when their time comes
instead of the order in which they arrived, and Kafka may not be the best
fit for this, as within the queue you want some message batches to be sent
out early and some later. There could be another way to solve this with
offset management, as kafka is
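One concrete way to do the offset-management idea in 0.8 is the time-based
lookup in the SimpleConsumer API; a minimal sketch (resolution is only per
log segment, and all names below are placeholders):

  import java.util.HashMap;
  import java.util.Map;
  import kafka.api.PartitionOffsetRequestInfo;
  import kafka.common.TopicAndPartition;
  import kafka.javaapi.OffsetResponse;
  import kafka.javaapi.consumer.SimpleConsumer;

  // Returns the start offset of the newest log segment older than timeMs,
  // i.e. a coarse "seek to timestamp". timeMs can also be the sentinels
  // kafka.api.OffsetRequest.EarliestTime() or LatestTime().
  long offsetBefore(SimpleConsumer consumer, String topic, int partition,
                    long timeMs, String clientId) {
      Map<TopicAndPartition, PartitionOffsetRequestInfo> info =
          new HashMap<TopicAndPartition, PartitionOffsetRequestInfo>();
      info.put(new TopicAndPartition(topic, partition),
               new PartitionOffsetRequestInfo(timeMs, 1));
      OffsetResponse resp = consumer.getOffsetsBefore(
          new kafka.javaapi.OffsetRequest(
              info, kafka.api.OffsetRequest.CurrentVersion(), clientId));
      return resp.offsets(topic, partition)[0];
  }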
Ok, now that is good detail. I understand your issue.
It's somewhat difficult to use Kafka in your situation, as Kafka is a FIFO
queue, but you are trying to use it with data that is not tightly ordered in
that manner.
I don't have any definite solutions, but perhaps this might work.
Assu
Vipul,
The problem is that the producer does not know when it should set the
window start and window end boundaries. The data does not arrive in order. I
also think it's difficult to get the offsets of the boundaries and only pull
messages between those boundaries: I am already trying to avoid using the
Unfortunately, this would not work in our system. It means that every few
minutes I would need to scan the entire queue, which is not possible
in our case. In fact, our old system was designed this way: store the
data in HBase, and use an hourly MapReduce job to scan the entire table and
figure out which
In order to delete topics, you need to shut down the entire cluster (all
brokers), delete the topics from Zookeeper, and delete the log files and
partition directory from the disk on the brokers. Then you can restart the
cluster. Assuming that you can take a periodic outage on your cluster, you
can
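For the "delete the topics from Zookeeper" step above, a minimal sketch with
the bundled zkclient library, assuming all brokers are already stopped and
the 0.8 path layout ('mytopic' and the connection string are placeholders):

  import org.I0Itec.zkclient.ZkClient;

  // Only safe while ALL brokers are down. This removes the topic's
  // metadata; the log directories on each broker's disk still have to
  // be deleted by hand.
  String topic = "mytopic";
  ZkClient zk = new ZkClient("localhost:2181", 10000, 10000);
  zk.deleteRecursive("/brokers/topics/" + topic);
  zk.deleteRecursive("/config/topics/" + topic);  // per-topic overrides (0.8.1+)
  zk.close();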
Todd,
Yes, I actually thought about that. My concern is that even a week's topic
partitions (240*7*3 = 5040) is too many. Does LinkedIn have good experience
using this many topics in your system? :-)
Thanks,
Chen
On Mon, Aug 11, 2014 at 9:02 PM, Todd Palino
wrote:
> In order to delete topics, y
[2014-08-11 19:12:45,321] ERROR Controller 0 epoch 3 initiated state change
for partition [mytopic,0] from OfflinePartition to OnlinePartition failed
(state.change.logger)
kafka.common.NoReplicaOnlineException: No replica for partition [mytopic,0]
is alive. Live brokers are: [Set()], Assigned repl
0.8-beta is really old. Could you try using 0.8.1.1?
Thanks,
Jun
On Mon, Aug 11, 2014 at 11:11 AM, Seshadri, Balaji wrote:
> Hi Jun/Neha/Team,
>
>
>
> We are trying to consume from Kafka using webMethods as our consumer. When
> we start the consumer, fetcher and leader threads went into WAI
As I noted, we have a cluster right now with 70k partitions. It’s running
on over 30 brokers, partly to cover the number of partitions and
partly to cover the amount of data that we push through it. If you can
have at least 4 or 5 brokers, I wouldn’t anticipate any problems with the
number of p
Ryan,
Apache mailing list does not allow attachments exceeding a certain size
limit, so the server log attachment was blocked.
From the controller log it seems this only broker has failed and hence no
partitions will be available. This could be a soft failure (e.g. a long GC)
or a ZK server-side issue. Y
Got it. Thanks for the input, Todd!
Chen
On Mon, Aug 11, 2014 at 9:31 PM, Todd Palino
wrote:
> As I noted, we have a cluster right now with 70k partitions. It’s running
> on over 30 brokers, partly to cover the number of partitions and
> partly to cover the amount of data that we push throug
Hello Mingtao,
The partition will not be reassigned to other consumers unless the current
consumer fails, so the behavior you described is not expected.
Guozhang
On Mon, Aug 11, 2014 at 6:27 PM, Mingtao Zhang
wrote:
> Hi Guozhang,
>
> I do have another Email talking about Partitions per
Thanks Neha,
Indeed, there is only a single replica apparently.
$ bin/kafka-topics.sh --describe --zookeeper localhost:2181
Topic: events   PartitionCount: 2   ReplicationFactor: 1   Configs:
  Topic: events   Partition: 0   Leader: 0   Replicas: 0   Isr: 0
  Topic: events   Partition: 1   Leade
Thanks for the heads up on attachments, here's a gist:
https://gist.githubusercontent.com/ryanwi/84deb8774a6922ff3704/raw/75c33ad71d0d41301533cbc645fa9846736d5eb0/gistfile1.txt
This seems to mostly happen in my development environment, when running a
single broker. I don't see any broker failure