Re: Performance of jBOD

2013-10-03 Thread Graeme Wallace
It's whatever dstat reports; I think it must be MBytes? On Wed, Oct 2, 2013 at 10:29 PM, Jun Rao wrote: > If you use a reasonable fetch size in the consumer (e.g. 100+KB) or above, > you could probably get most of the sequential scan performance from those > disks. Do you mean 250 Mbits or MBy

Re: Metadata API returns localhost.localdomain for one of the brokers in EC2

2013-10-03 Thread David Arthur
You can configure the hostname for the broker with the "host.name" property in the broker's config (server.properties?). If you don't specify one there, the broker binds to all interfaces and one of them is chosen to be published via ZooKeeper (which is what the metadata API reads). See: http://kafk

Re: Num of streams for consumers using TopicFilter.

2013-10-03 Thread Jun Rao
See https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example Thanks, Jun On Wed, Oct 2, 2013 at 9:42 PM, Jason Rosenberg wrote: > Jun, > > Thanks, can you point me to the client code to issue a metadata request! > > Jason > > > On Thu, Oct 3, 2013 at 12:24 AM, Jun Rao w

Re: Metadata API returns localhost.localdomain for one of the brokers in EC2

2013-10-03 Thread Jun Rao
There is an FAQ too. https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-OnEC2%2Cwhycan%27tmyhighlevelconsumersconnecttothebrokers%3F Thanks, Jun On Thu, Oct 3, 2013 at 6:43 AM, David Arthur wrote: > You can configure the hostname for the broker with the "host.name" > property in the b

Re: Num of streams for consumers using TopicFilter.

2013-10-03 Thread Jason Rosenberg
Ah, So this is exposed directly in the simple consumer (but not the high-level one?). Jason On Thu, Oct 3, 2013 at 10:25 AM, Jun Rao wrote: > See > > https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example > > Thanks, > > Jun > > > On Wed, Oct 2, 2013 at 9:42 PM, Jason

Re: Metadata API returns localhost.localdomain for one of the brokers in EC2

2013-10-03 Thread Aniket Bhatnagar
Thanks Jun and David. I think the FAQ explains why it's not possible to connect to the broker from outside. In my case, all servers (producers and brokers) are in the same VPC. A call to InetAddress.getLocalHost.getHostAddress should return an internal IP to which producers should be able to connect. Th
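One way to see which address a host resolves for itself is to look up its own hostname and check whether the result is a loopback address. The sketch below is a rough Python analogue of the InetAddress.getLocalHost.getHostAddress call mentioned above; it is not Kafka's code, and the function name `self_resolved_address` and its `resolve` parameter are hypothetical names introduced here for illustration.

```python
import ipaddress
import socket


def self_resolved_address(hostname=None, resolve=socket.gethostbyname):
    """Return (address, is_loopback) for this machine's own hostname.

    `resolve` is injectable so the check can be exercised without a real
    lookup; it is an assumption of this sketch, not part of any Kafka API.
    """
    # Fall back to the local hostname when none is given.
    name = hostname if hostname is not None else socket.gethostname()
    address = resolve(name)
    # A loopback result (127.0.0.0/8) is not reachable from other machines.
    return address, ipaddress.ip_address(address).is_loopback
```

Calling `self_resolved_address()` with no arguments on the broker host performs a real lookup. If the second element of the result is `True`, the hostname maps to a loopback address, which other machines in the VPC cannot connect to.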

Re: as i understand rebalance happens on client side

2013-10-03 Thread Keith Bourgoin
Hi Kane, I just wanted to chime in as well. I currently maintain samsa ( https://github.com/getsamsa/samsa or PyPI), which does implement the rebalancing logic, but currently doesn't support 0.8. If you're on 0.7.x still, it might be helpful. If not, we're working on 0.8 support, but it probably w

Re: Num of streams for consumers using TopicFilter.

2013-10-03 Thread Jason Rosenberg
I filed this, to address the need for allowing parallelism when consuming multiple single-partition topics selected with a topic filter: https://issues.apache.org/jira/browse/KAFKA-1072 On Thu, Oct 3, 2013 at 10:56 AM, Jason Rosenberg wrote: > Ah, > > So this is exposed directly in the simple c

JMX connections timing out in EC2 in Kafka 0.8

2013-10-03 Thread Aniket Bhatnagar
I don't think this issue is related to Kafka at all but I just wanted to try my luck in this user mailing list just in case people have run into similar issues. I am trying to enable JMX monitoring in Kafka running on EC2 by setting the JMX_PORT variable to 1099. Once Kafka is done booting, I am ab

Re: Metadata API returns localhost.localdomain for one of the brokers in EC2

2013-10-03 Thread Rajasekar Elango
Hi Aniket, We had the same issue. It turns out that the IP-to-hostname mapping must be correctly configured in the /etc/hosts file. For example, if you had something like 127.0.0.1 localhost localhost as the first line in the /etc/hosts file, you will get this error. To fix we need to ad

Re: JMX connections timing out in EC2 in Kafka 0.8

2013-10-03 Thread Rajasekar Elango
Maybe this helps. See the last post on http://qnalist.com/questions/4522179/jmx Thanks, Raja. On Thu, Oct 3, 2013 at 11:17 AM, Aniket Bhatnagar < aniket.bhatna...@gmail.com> wrote: > I don't think this issue is related to Kafka at all but I just wanted to > try my luck in this user mailing list j

Re: as i understand rebalance happens on client side

2013-10-03 Thread Kane Kane
Hi Keith, thanks for the update! Interestingly, I found your library yesterday and was going to borrow your ZooKeeper code for partition management. Do you have any ETA for the 0.8 support? Thanks! On Thu, Oct 3, 2013 at 8:09 AM, Keith Bourgoin wrote: > Hi Kane, > > I just wanted to chime in as w

Re: Metadata API returns localhost.localdomain for one of the brokers in EC2

2013-10-03 Thread Jay Kreps
Is there something we could be doing in Kafka to avoid this problem? -Jay On Thu, Oct 3, 2013 at 8:29 AM, Rajasekar Elango wrote: > Hi Aniket, > > We had same issue it turns out that we need to make sure ip to hostname > mapping should be correctly configured in /etc/hosts file. > > For eg: If

Re: Performance of jBOD

2013-10-03 Thread Graeme Wallace
Jun, Just to follow up: do you have any stats on the disk read and write throughput you are getting in production at LinkedIn? regards, Graeme On Thu, Oct 3, 2013 at 7:17 AM, Graeme Wallace < graeme.wall...@farecompare.com> wrote: > Its whatever dstat reports - i think it must be MBytes ? > >

Re: is it possible to commit offsets on a per stream basis?

2013-10-03 Thread Jason Rosenberg
I added a comment/suggestion to: https://issues.apache.org/jira/browse/KAFKA-966 Basically to expose an api for marking an offset for commit, such that the auto-commit would only commit offsets up to the last message 'markedForCommit', and not the last 'consumed' offset, which may or may not have
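The idea of committing only up to a marked offset, rather than the last consumed one, can be sketched as follows. This is an illustration of the suggestion only, not the design in KAFKA-966 or any existing Kafka consumer API; the class `MarkedOffsetTracker` and all of its method names are hypothetical.

```python
class MarkedOffsetTracker:
    """Tracks consumed and marked-for-commit offsets per partition.

    Hypothetical helper: an auto-commit loop would persist the result of
    offsets_to_commit() instead of the last consumed offsets.
    """

    def __init__(self):
        self._consumed = {}  # partition -> highest offset handed to the app
        self._marked = {}    # partition -> highest offset the app marked done

    def record_consumed(self, partition, offset):
        # Called whenever a message is delivered to the application.
        self._consumed[partition] = max(offset, self._consumed.get(partition, -1))

    def mark_for_commit(self, partition, offset):
        # The application calls this once a message is fully processed.
        if offset > self._consumed.get(partition, -1):
            raise ValueError("cannot mark an offset that has not been consumed")
        self._marked[partition] = max(offset, self._marked.get(partition, -1))

    def offsets_to_commit(self):
        # Only marked offsets are eligible; unmarked partitions are omitted.
        return dict(self._marked)
```

With this shape, a message that was consumed but not yet processed never advances the committed position, so a crash would lead to redelivery of that message rather than to silently skipping it.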

ISR differs between Kafka Metadata and Zookeeper

2013-10-03 Thread Florian Weingarten
Hi list, I am trying to debug a strange issue we are seeing. We are using "Sarama" [1], our own Go implementation of the Kafka API. Somehow, we either found a bug in Kafka, have a bug in our own code, or got our cluster into a weird state. If we use our library to query Kafka for metadata about a

Re: as i understand rebalance happens on client side

2013-10-03 Thread Keith Bourgoin
To be honest, 0.8 is a pretty big change from how things are currently managed in samsa. I don't have much of an ETA at the moment, but I'm hoping to find the time in the next month or two. Keith. On Thu, Oct 3, 2013 at 11:59 AM, Kane Kane wrote: > Hi Keith, thanks for update! Interestingly

Re: Strategies for auto generating broker ID

2013-10-03 Thread Neha Narkhede
>> what is the procedure for having the new broker assume the identity of the previously failed one? Copying the meta file to any one of the data directories on the new broker prior to the initial startup will work. Thanks, Neha On Wed, Oct 2, 2013 at 11:47 AM, Jason Rosenberg wrote: > The on

Re: ISR differs between Kafka Metadata and Zookeeper

2013-10-03 Thread Neha Narkhede
When you issue a metadata request to any broker, it responds with the cluster metadata in memory. The broker gets it through an UpdateMetadata request issued by the controller broker. If there are state changes in flight, there may be a delay in the UpdateMetadata request propagation to the broker

Re: Performance of jBOD

2013-10-03 Thread Neha Narkhede
The average observed per-broker write bandwidth is roughly 5 MB/s and read bandwidth is roughly 20 MB/s. But these statistics are from clusters that are nowhere close to maxing out the write/read bandwidth available. In most cases though, you will hit the network bottleneck (~125 MB/s) first. Tha
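The ~125 MB/s figure is consistent with a 1 Gbit/s network link, since 1000 Mbit/s divided by 8 bits per byte is 125 MB/s. The link speed is an assumption here (the message does not state it), and the function names below are hypothetical. A minimal sketch of the conversion, which also bears on the Mbits versus MBytes question earlier in the thread:

```python
def link_capacity_mbytes_per_sec(link_gbits_per_sec):
    """Convert a link speed in Gbit/s to MB/s (decimal units, 8 bits per byte)."""
    return link_gbits_per_sec * 1000 / 8


def utilization(observed_mbytes_per_sec, link_gbits_per_sec=1.0):
    """Fraction of the link used by an observed throughput in MB/s.

    The 1 Gbit/s default is an assumption, not a figure from the thread.
    """
    return observed_mbytes_per_sec / link_capacity_mbytes_per_sec(link_gbits_per_sec)
```

Under that assumption, the quoted 20 MB/s average read bandwidth works out to `utilization(20)`, about 16% of the link, which fits the remark that these clusters are far from saturated.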

Cannot run producer from local windows

2013-10-03 Thread Yi Jiang
Hi everyone, I have installed a Kafka cluster in EC2, and I am sure all ports have been opened. I want to push some data into Kafka in the cloud, but I cannot even run the producer from my local machine. It always throws an exception with "failed after 3 retries". But there is no problem when I ru