Kafka 0.8's VerifyConsumerRebalance reports an error

2014-06-24 Thread Yury Ruchin
Hi, I've run into the following problem. I'm trying to read from a 50-partition Kafka topic using the high-level consumer with 8 streams. I'm using an 8-thread pool, each thread handling one stream. After a short time, the threads reading from the stream stop reading. Lag between topic latest offset and the c

How does number of partitions affect sequential disk IO

2014-06-24 Thread Daniel Compton
I’ve been reading the Kafka docs and one thing that I’m having trouble understanding is how partitions affect sequential disk IO. One of the reasons Kafka is so fast is that you can do lots of sequential IO with read-ahead cache and all of that goodness. However, if your broker is responsible fo

Re: How does number of partitions affect sequential disk IO

2014-06-24 Thread Paul Mackles
You'll want to account for the number of disks per node. Normally, partitions are spread across multiple disks. Even more important, the OS file cache reduces the amount of seeking provided that you are reading mostly sequentially and your consumers are keeping up. On 6/24/14 3:58 AM, "Daniel Comp

Re: Consumer offset is getting reset back to some old value automatically

2014-06-24 Thread Hemath Kumar
Yes Kane, I have the replication factor configured as 3. On Tue, Jun 24, 2014 at 2:42 AM, Kane Kane wrote: > Hello Neha, can you explain your statements: > >>Bringing one node down in a cluster will go smoothly only if your > replication factor is 1 and you enabled controlled shutdown on the brok

Re: How does number of partitions affect sequential disk IO

2014-06-24 Thread Daniel Compton
Good point. We've only got two disks per node and two topics so I was planning to have one disk/partition. Our workload is very write heavy so I'm mostly concerned about write throughput. Will we get write speed improvements by sticking to 1 partition/disk or will the difference between 1 and

Re: How does number of partitions affect sequential disk IO

2014-06-24 Thread Paul Mackles
It's probably best to run some tests that simulate your usage patterns. I think a lot of it will be determined by how effectively you are able to utilize the OS file cache, in which case you could have many more partitions. It's a delicate balance, but you definitely want to err on the side of having m

What's the behavior when Kafka is deleting messages and consumers are still reading

2014-06-24 Thread Lian, Li
When the Kafka broker is trying to delete messages according to the log retention setting, triggered either by log age or topic partition size, what will happen if at the same time there are still consumers reading the topic? Best regards, Lex Lian Email: ll...@ebay.com

Announcing Kafka Web Console v2.0.0

2014-06-24 Thread Claude Mamo
Announcing the second major release of Kafka Web Console: https://github.com/claudemamo/kafka-web-console/releases/tag/v2.0.0. Highlights: - I've borrowed some ideas from Kafka Offset Monitor and added graphs to show the history of consumers offsets and lag as well as message throughput - Added p

Re: Announcing Kafka Web Console v2.0.0

2014-06-24 Thread Joe Stein
Awesome Claude, thanks! Joe Stein, Founder, Principal Consultant, Big Data Open Source Security LLC, http://www.stealth.ly, Twitter: @allthingshadoop On Tue, Jun 2

Re: high level consumer not working

2014-06-24 Thread Guozhang Wang
Hi Li Li, If you use the same consumer group id then offsets may have already been committed to Kafka, hence messages before that will not be consumed. Guozhang On Mon, Jun 23, 2014 at 6:09 PM, Li Li wrote: > no luck by adding props.put("auto.offset.reset", "smallest"); > but running consumer

Re: What's the behavior when Kafka is deleting messages and consumers are still reading

2014-06-24 Thread Guozhang Wang
Hi Li, The log operations are protected by a lock, so if there is a concurrent read on this partition it will not be deleted. But when it is deleted, the next fetch/read will result in an OffsetOutOfRange exception and the consumer needs to restart from an offset reset value. Guozhang On Tue,

Experiences with larger message sizes

2014-06-24 Thread Denny Lee
By any chance has anyone worked with using Kafka with message sizes that are approximately 50MB in size?  Based on some of the previous threads, there are probably some concerns on memory pressure due to the compression on the broker and decompression on the consumer, and a best practices on

Apache Kafka Commercial Support

2014-06-24 Thread Diego Alvarez Zuluaga
Hello, Are there any vendors who provide commercial support for Kafka? We're very interested in using Kafka, but our infrastructure team asks us (DevTeam) for commercial support. Tks diegoalva...@sura.com.co http://www.gruposuramericana.com/en/Pages/OurPortfolio/Suramericana.aspx

Re: restarting a broker during partition reassignment

2014-06-24 Thread Luke Forehand
My hypothesis for how Partition [luke3,3] with leader 11, had offset reset to zero, caused by reboot of leader broker during partition reassignment: The replicas for [luke3,3] were in progress being reassigned from broker 10,11,12 -> 11,12,13 I rebooted broker 11 which was the leader for [luke3.3

Re: Using GetOffsetShell against non-existent topic creates the topic which cannot be deleted

2014-06-24 Thread Luke Forehand
Definitely, this is version 0.8.1.1 https://issues.apache.org/jira/browse/KAFKA-1507 Luke Forehand | Networked Insights | Software Engineer On 6/23/14, 6:58 PM, "Guozhang Wang" wrote: >Luke, > >Thanks for the findings, could you file a JIRA to keep track of this bug? > >Guozhang > > >O

Re: Apache Kafka Commercial Support

2014-06-24 Thread Joe Stein
Hi Diego, Big Data Open Source Security LLC https://www.linkedin.com/company/big-data-open-source-security-llc provides commercial support around Apache Kafka. We currently do this as a retained professional service rate (so the cost is not nodes or volume like product vendors). We have also bee

Re: How does number of partitions affect sequential disk IO

2014-06-24 Thread Jay Kreps
The primary relevant factor here is the fsync interval. Kafka's replication guarantees do not require fsyncing every message, so the reason for doing so is to handle correlated power loss (a pretty uncommon failure in a real data center). Replication will handle most other failure modes with much m
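As a concrete illustration of the trade-off Jay describes, these are the 0.8 broker settings that control the fsync interval (the values shown are illustrative, not recommendations; by default Kafka leaves flushing to the OS and relies on replication for durability):

```properties
# Flush to disk after this many messages accumulate on a partition
# (illustrative value; unset by default)
log.flush.interval.messages=10000
# Or flush after this much time has elapsed, in milliseconds
# (illustrative value; unset by default)
log.flush.interval.ms=1000
```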

Re: Experiences with larger message sizes

2014-06-24 Thread Joe Stein
Hi Denny, have you considered saving those files to HDFS and sending the "event" information to Kafka? You could then pass that off to Apache Spark in a consumer and get data locality for the file saved (or something of the sort [no pun intended]). You could also stream every line (or however you

Re: Experiences with larger message sizes

2014-06-24 Thread Denny Lee
Hey Joe, Yes, I have - my original plan was to do something similar to what you suggested: simply push the data into HDFS / S3 and then have only the event information within Kafka, so that multiple consumers can just read the event information and ping HDFS/S3 for the actual me

Re: Experiences with larger message sizes

2014-06-24 Thread Joe Stein
You could then chunk the data (wrapped in an outer message so you have meta data like file name, total size, current chunk size) and produce that with the partition key being filename. We are in progress working on a system for doing file loading to Kafka (which will eventually support both chunke
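The chunking scheme Joe outlines can be sketched roughly as follows. This is a hypothetical illustration, not the file-loading system he mentions is being built; the function names and metadata fields (`file_name`, `total_size`, `chunk_seq`, etc.) are invented for the example. In practice the producer would send each emitted message with the filename as the partition key so all chunks of one file land on the same partition, in order.

```python
import base64
import json

CHUNK_SIZE = 1024 * 1024  # 1 MB per chunk, well under a typical message.max.bytes


def chunk_file(filename, data, chunk_size=CHUNK_SIZE):
    """Wrap each chunk of `data` in an outer message carrying metadata
    so a consumer can reassemble the original file."""
    total = len(data)
    chunks = [data[i:i + chunk_size] for i in range(0, total, chunk_size)]
    for seq, chunk in enumerate(chunks):
        yield json.dumps({
            "file_name": filename,
            "total_size": total,
            "chunk_count": len(chunks),
            "chunk_seq": seq,
            "chunk_size": len(chunk),
            # Payload is base64-encoded so the envelope stays valid JSON.
            "payload": base64.b64encode(chunk).decode("ascii"),
        })


def reassemble(messages):
    """Reassemble chunks on the consumer side; ordering within a
    partition is guaranteed, but we sort defensively by sequence."""
    parts = sorted((json.loads(m) for m in messages),
                   key=lambda p: p["chunk_seq"])
    return b"".join(base64.b64decode(p["payload"]) for p in parts)
```

For example, a 3 MB + 17 byte file yields four messages: three full 1 MB chunks and one 17-byte tail.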

Re: What's the behavior when Kafka is deleting messages and consumers are still reading

2014-06-24 Thread Neha Narkhede
The behavior on the consumer in this case is governed by the value of the "auto.offset.reset" config. Depending on this config, it will reset its offset to either the earliest or the latest in the log. Thanks, Neha On Tue, Jun 24, 2014 at 8:17 AM, Guozhang Wang wrote: > Hi Li, > > The log ope
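For reference, the 0.8 high-level consumer setting Neha refers to looks like this (values as documented for 0.8):

```properties
# What to do when there is no committed offset, or the committed offset
# falls outside the range still retained by the broker:
#   "smallest" - restart from the earliest available offset
#   "largest"  - skip ahead to the latest offset (the default)
auto.offset.reset=smallest
```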

Re: Consumer offset is getting reset back to some old value automatically

2014-06-24 Thread Neha Narkhede
Can you elaborate your notion of "smooth"? I thought if you have replication factor=3 in this case, you should be able to tolerate loss of a node? Yes, you should be able to tolerate the loss of a node but if controlled shutdown is not enabled, the delay between loss of the old leader and election
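The controlled shutdown Neha mentions can be enabled via a broker config in 0.8.1+ (in 0.8.0 it was instead triggered with the kafka.admin.ShutdownBroker admin tool):

```properties
# Broker config (0.8.1+): on shutdown, attempt to migrate partition
# leadership away from this broker first, shrinking the window during
# which partitions have no leader.
controlled.shutdown.enable=true
```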

Re: Kafka 0.8's VerifyConsumerRebalance reports an error

2014-06-24 Thread Neha Narkhede
Is it possible that maybe the zookeeper url used for the VerifyConsumerRebalance tool is incorrect? On Tue, Jun 24, 2014 at 12:02 AM, Yury Ruchin wrote: > Hi, > > I've run into the following problem. I try to read from a 50-partition > Kafka topic using high level consumer with 8 streams. I'm u

Re: Kafka 0.8's VerifyConsumerRebalance reports an error

2014-06-24 Thread Yury Ruchin
I've just double-checked. The URL is correct, the same one is used by Kafka clients. 2014-06-24 22:21 GMT+04:00 Neha Narkhede : > Is it possible that maybe the zookeeper url used for the > VerifyConsumerRebalance tool is incorrect? > > > On Tue, Jun 24, 2014 at 12:02 AM, Yury Ruchin > wrote: >

Re: Consumer offset is getting reset back to some old value automatically

2014-06-24 Thread Kane Kane
Hello Neha, >>ZK cluster of 3 nodes will tolerate the loss of 1 node, but if there is a subsequent leader election for any reason, there is a chance that the cluster does not reach a quorum. It is less likely but still risky to some extent. Does it mean if you have to tolerate 1 node loss without

Re: Consumer offset is getting reset back to some old value automatically

2014-06-24 Thread Kane Kane
Sorry, i meant 5 nodes in previous question. On Tue, Jun 24, 2014 at 12:36 PM, Kane Kane wrote: > Hello Neha, > >>>ZK cluster of 3 nodes will tolerate the loss of 1 node, but if there is a > subsequent leader election for any reason, there is a chance that the > cluster does not reach a quorum. I

Re: Kafka 'reassign-partitions' behavior if the replica does not catch up

2014-06-24 Thread Virendra Pratap Singh
In process of giving 0.8.1.1 a try. However, I believe the question still holds true. If the amount of data getting pumped into a partition is such that any new replica partition can never catch up, then what would the partition reassignment tool's behavior be? If it will be in an infinite wait mode, wai

Re: How does number of partitions affect sequential disk IO

2014-06-24 Thread Daniel Compton
Thanks Jay, that's exactly what I was looking for. On 25 June 2014 04:18, Jay Kreps wrote: > The primary relevant factor here is the fsync interval. Kafka's replication > guarantees do not require fsyncing every message, so the reason for doing > so is to handle correlated power loss (a pretty

Re: Kafka 'reassign-partitions' behavior if the replica does not catch up

2014-06-24 Thread Neha Narkhede
We have a JIRA to track the cancel feature - https://issues.apache.org/jira/browse/KAFKA-1506. Thanks, Neha On Tue, Jun 24, 2014 at 1:32 PM, Virendra Pratap Singh < vpsi...@yahoo-inc.com.invalid> wrote: > In process of giving 0.8.1.1 a try. > > However I believe the question still holds true. >

Re: Kafka 0.8's VerifyConsumerRebalance reports an error

2014-06-24 Thread Neha Narkhede
I would turn on DEBUG on the tool to see which url it reads and doesn't find the owners. On Tue, Jun 24, 2014 at 11:28 AM, Yury Ruchin wrote: > I've just double-checked. The URL is correct, the same one is used by Kafka > clients. > > > 2014-06-24 22:21 GMT+04:00 Neha Narkhede : > > > Is it p

Re: Consumer offset is getting reset back to some old value automatically

2014-06-24 Thread Neha Narkhede
See the explanation from the zookeeper folks here - " Because Zookeeper requires a majority, it is best to use an odd number of machines. For example, with four machines ZooKeeper can only handle the failure of a single machine; if two
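The majority rule from the ZooKeeper docs reduces to simple arithmetic: an ensemble of n servers needs a strict majority to stay up, so it tolerates at most (n - 1) // 2 failures, which is why 4 servers are no better than 3.

```python
def tolerated_failures(ensemble_size: int) -> int:
    """A ZooKeeper ensemble needs a strict majority of servers alive,
    so it can lose at most (n - 1) // 2 of them."""
    return (ensemble_size - 1) // 2


for n in (3, 4, 5):
    print(n, "servers tolerate", tolerated_failures(n), "failure(s)")
```

This is why ensembles are sized with odd numbers: going from 3 to 4 servers adds a machine without adding fault tolerance, while 5 servers tolerate 2 failures.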

Kafka 0.8/VIP/SSL

2014-06-24 Thread Reiner Stach
I'm looking for advice on running Kafka 0.8 behind VIPs. The goal is to support SSL traffic, with encryption and decryption being performed by back-to-back VIPs at the client and in front of the broker. That is: Kafka client --> vip1a.myco.com:8080 (SSL encrypt) --- WAN ---> VIP 1b (SSL decryp

Re: What's the behavior when Kafka is deleting messages and consumers are still reading

2014-06-24 Thread Lian, Li
Guozhang, Thanks for explaining. I thought there might be such a lock mechanism but could not confirm it in any documentation on the Kafka website. It would be better if this could be written down in some Wiki or FAQ. Best regards, Lex Lian Email: ll...@ebay.com On 6/24/14, 11:17 PM, "Guozh

Re: What's the behavior when Kafka is deleting messages and consumers are still reading

2014-06-24 Thread Guozhang Wang
I think we can probably update the documentation page with this: https://kafka.apache.org/documentation.html#compaction On Tue, Jun 24, 2014 at 3:54 PM, Lian, Li wrote: > GuoZhang, > > Thanks for explaining. I thought there might be such kind of lock > mechanism but cannot confirm it in

Uneven distribution of kafka topic partitions across multiple brokers

2014-06-24 Thread Virendra Pratap Singh
Have a Kafka cluster with 10 brokers (Kafka 0.8.0). All of the brokers were set up upfront. None was added later. The default number of partitions is set to 4 and default replication to 2. Have 3 topics in the system. None of these topics was manually created upfront when the cluster was set up. So re

Monitoring Producers at Large Scale

2014-06-24 Thread Bhavesh Mistry
We use Kafka as a transport layer to transport application logs. How do we monitor producers at large scale: about 6000 boxes x 4 topics per box, so roughly 24000 producers (spread across multiple data centers.. we have brokers per DC)? We do the monitoring based on logs. I have tried intercepting lo

Blacklisting Brokers

2014-06-24 Thread Lung, Paul
Hi All, Is there any way to blacklist brokers? Sometimes we run into situations where there are certain hardware failures on a broker machine, and the machine goes into a “half dead” state. The broker process is up and participating in the cluster, but it can’t actually transmit messages proper

kafka.common.LeaderNotAvailableException

2014-06-24 Thread Zack Payton
Hi all, I have 3 zookeeper servers and 2 Kafka servers. Running Kafka version 0.8.1.1. Running zookeeper 3.3.5-cdh3u6. From the Kafka servers I can access the zookeeper servers on 2181. From one of the Kafka servers I can create a topic no problem: [root@kafka1 kafka-0.8.1.1-src]# bin/kafka-topi

Re: kafka.common.LeaderNotAvailableException

2014-06-24 Thread Joe Stein
Are there any errors in the broker's logs? Joe Stein, Founder, Principal Consultant, Big Data Open Source Security LLC, http://www.stealth.ly, Twitter: @allthingshadoop

Re: kafka.common.LeaderNotAvailableException

2014-06-24 Thread Zack Payton
server.log has a lot of these errors: [2014-06-24 20:07:16,124] ERROR [KafkaApi-6] error when handling request Name: FetchRequest; Version: 0; CorrelationId: 81138; ClientId: ReplicaFetcherThread-0-5; ReplicaId: 6; MaxWait: 500 ms; MinBytes: 1 bytes; RequestInfo: [test1,0] -> PartitionFetchInfo(0,

Re: Uneven distribution of kafka topic partitions across multiple brokers

2014-06-24 Thread Joe Stein
Take a look at bin/kafka-reassign-partitions.sh. Its --broker-list option takes the list of brokers to which the partitions need to be reassigned in
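In 0.8.1 the tool is typically run as `bin/kafka-reassign-partitions.sh --zookeeper <zk> --reassignment-json-file <file> --execute`, where the JSON file describes the target assignment. A sketch of such a file (the topic name and broker ids below are made up for illustration):

```json
{
  "version": 1,
  "partitions": [
    {"topic": "topic1", "partition": 0, "replicas": [3, 4]},
    {"topic": "topic1", "partition": 1, "replicas": [4, 5]}
  ]
}
```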

Getting "java.io.IOException: Too many open files"

2014-06-24 Thread Lung, Paul
Hi All, I just upgraded my cluster from 0.8.1 to 0.8.1.1. I’m seeing the following error messages on the same 3 brokers once in a while: [2014-06-24 21:43:44,711] ERROR Error in acceptor (kafka.network.Acceptor) java.io.IOException: Too many open files at sun.nio.ch.ServerSocketChann

Re: Getting "java.io.IOException: Too many open files"

2014-06-24 Thread Prakash Gowri Shankor
How many files does each broker itself have open ? You can find this from 'ls -l /proc/<pid>/fd' On Tue, Jun 24, 2014 at 10:18 PM, Lung, Paul wrote: > Hi All, > > > I just upgraded my cluster from 0.8.1 to 0.8.1.1. I’m seeing the following > error messages on the same 3 brokers once in a while: >
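The /proc-based check Prakash suggests can also be scripted, for example to poll a broker's descriptor count and compare it against the ulimit (Linux only; `open_fd_count` is a helper invented for this sketch, and you would pass the Kafka broker's pid rather than our own):

```python
import os


def open_fd_count(pid: int) -> int:
    """Count a process's open file descriptors by listing /proc/<pid>/fd
    (Linux only; equivalent to `ls /proc/<pid>/fd | wc -l`)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


# Smoke test on our own process; for a broker, substitute its pid.
print(open_fd_count(os.getpid()))
```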

Re: Getting "java.io.IOException: Too many open files"

2014-06-24 Thread Lung, Paul
The controller machine has 3500 or so, while the other machines have around 1600. Paul Lung On 6/24/14, 10:31 PM, "Prakash Gowri Shankor" wrote: >How many files does each broker itself have open ? You can find this from >'ls -l /proc/<pid>/fd' > > > > >On Tue, Jun 24, 2014 at 10:18 PM, Lung, Paul w

Re: Uneven distribution of kafka topic partitions across multiple brokers

2014-06-24 Thread Neha Narkhede
Looking at the output of list topics, here is what I think happened. When topic1 and topic3 were created, only brokers 1&2 were online and alive. When topic2 was created, almost all brokers were online. Only brokers that are alive at the time of topic creation can be assigned replicas for the topic

Re: Getting "java.io.IOException: Too many open files"

2014-06-24 Thread Lung, Paul
Ok. What I just saw was that when the controller machine reaches around 4100+ files, it crashes. Then I think the controller bounced between 2 other machines, taking them down too, and then circled back to the original machine. Paul Lung On 6/24/14, 10:51 PM, "Lung, Paul" wrote: >The controller