How does offsets.retention.minutes work

2017-03-15 Thread tao xiao
Hi team, I know that Kafka deletes offset for a consumer group after a period of time (configured by offsets.retention.minutes) if the consumer group is inactive for this amount of time. I want to understand the definition of "inactive". I came across this post[1] and it suggests that no offset co

Need kafka client v0.9.0.2 / fix for KAFKA-3594

2017-03-15 Thread Phil Adams
I'm currently using the kafka-clients v0.9.0.1 library along with the v0.9.0.1 kafka broker. Unfortunately, we've run into this problem: https://issues.apache.org/jira/browse/KAFKA-3594 According to the JIRA, this has been fixed in v0.9.0.2 but I can't find a pre-built kafka-clients library for thi

Re: restart Kafka Streams application takes around 5 minutes

2017-03-15 Thread Guozhang Wang
Are you building with a released version of Kafka or with a build from Kafka trunk? We have made a few fixes in trunk since 0.10.2 that have largely reduced the rebalance latency, so I'd recommend testing with a build from Kafka trunk if possible. Guozhang On Wed, Mar 15, 2017

Re: Kafka Stream: RocksDBKeyValueStoreSupplier performance

2017-03-15 Thread Eno Thereska
Tianji, A couple of things: - for now could you use RocksDb without the cache? I've opened a JIRA to verify why it's slower with the cache: https://issues.apache.org/jira/browse/KAFKA-4904 - you can tune the RocksDb performance further by in

Re: Kafka Stream: RocksDBKeyValueStoreSupplier performance

2017-03-15 Thread Tianji Li
Hi Eno, Rocksdb without caching took around 7 minutes. Tianji On Wed, Mar 15, 2017 at 9:40 AM, Eno Thereska wrote: > Tianji, > > Could you provide a third data point, running with RocksDb, but without > caching, i.e: > > > StateStoreSupplier stateStoreSupplier = Stores.create(storeName) > >

Re: Not Serializable Result Error

2017-03-15 Thread Michael Noll
Hi Armaan, > org.apache.spark.SparkException: Job aborted due to stage failure: >Task 0.0 in stage 0.0 (TID 0) had a not serializable result: org.apache.kafka.clients.consumer.ConsumerRecord perhaps you should ask that question in the Spark mailing list, which should increase your chances of

Javadoc for org.apache.kafka.common.requests package

2017-03-15 Thread Afshartous, Nick
Hi, I'm trying to use kafka-clients-0.10.1.0.jar and don't see Javadoc for the classes in package org.apache.kafka.common.requests. Am I looking in the right place here: https://kafka.apache.org/0100/javadoc/index.html ? Thanks fo

Re: restart Kafka Streams application takes around 5 minutes

2017-03-15 Thread Tianji Li
It seems independent of the RocksDB size. It also took 5 minutes when there were 375 MB this morning... On Wed, Mar 15, 2017 at 9:13 AM, Sachin Mittal wrote: > RocksDB state store initialization may be taking up that time. > What's the size of your RocksDB state directory? Maybe partitioning

Re: Offset commit request failing

2017-03-15 Thread Robert Quinlivan
I should also mention that this error was seen on broker version 0.10.1.1. I found that this condition sounds somewhat similar to KAFKA-4362, but that issue was submitted in 0.10.1.1 so they appear to be different issues. On Wed, Mar 15, 2017 at 1

Help with SASL configuration for Zookeeper on the Microsoft AD.

2017-03-15 Thread Shrikant Patel
Hi, Has anyone had experience securing the Kafka-to-Zookeeper connection and setting up SASL with a Microsoft AD account? We created a keytab and principal for Kafka and ZK using https://www.confluent.io/blog/apache-kafka-security-authorization-authentication-encryption/ We see these principal in our

Re: Performance and Encryption

2017-03-15 Thread Hans Jespersen
You are correct that a Kafka broker is not just writing to one file. Jay Kreps wrote a great blog post with lots of links to even greater detail on the topic of Kafka and disk write performance. Still a good read many years later. https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-

Offset commit request failing

2017-03-15 Thread Robert Quinlivan
Good morning, I'm hoping for some help understanding the expected behavior for an offset commit request and why this request might fail on the broker. *Context:* For context, my configuration looks like this: - Three brokers - Consumer offsets topic replication factor set to 3 - Auto c

RE: Streams 0.10.2.0 + RocksDB + Avro

2017-03-15 Thread Adrian McCague
Hi Damian, That is the SerDe we are using, agreed that looks like a good modification to make here for a state store version. I would add that it is a good idea to include the record type as well since an edit of the topology arrangement could still lead to issues down the line. Thank you for

Re: Streams 0.10.2.0 + RocksDB + Avro

2017-03-15 Thread Damian Guy
Hi Adrian, The state store names are local names and hence don't have the applicationId prefix, i.e., they are laid out on disk like so: /state.dir/applicationId/task/state-store-name. Their corresponding change-logs are prefixed with the applicationId. However, I can see in the case of the schema-re
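
The naming described in this message can be sketched as follows. This is a minimal illustration, not code from the thread: the class and method names are hypothetical, the directory layout follows the path given in the message, and the "-changelog" suffix is the conventional Kafka Streams changelog topic format rather than something stated in the message itself.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
class StateStoreNaming {

    // Local store directory, following the layout quoted in the message:
    // /state.dir/applicationId/task/state-store-name
    static Path localStoreDir(String stateDir, String applicationId,
                              String taskId, String storeName) {
        return Paths.get(stateDir, applicationId, taskId, storeName);
    }

    // Changelog topic name: the store name prefixed with the applicationId.
    // The "-changelog" suffix is an assumption based on the usual convention.
    static String changelogTopic(String applicationId, String storeName) {
        return applicationId + "-" + storeName + "-changelog";
    }

    public static void main(String[] args) {
        // Example values are made up.
        System.out.println(localStoreDir("/tmp/kafka-streams", "my-app", "0_0", "counts"));
        System.out.println(changelogTopic("my-app", "counts"));
    }
}
```

Because the store name itself carries no application prefix, two applications that use the same store name would produce the same local name, which is consistent with the subject-name collisions reported in the original question.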

RE: Performance and Encryption

2017-03-15 Thread Nicolas MOTTE
Ok that makes sense, thanks ! The next question I have regarding performance is about the way Kafka writes in the data files. I often hear Kafka is very performant because it writes in an append-only fashion. So even with hard disk (not SSD) we get a great performance because it writes in seque

Re: Trying to use Kafka Stream

2017-03-15 Thread Mina Aslani
Hi Eno, Great finding! You were right! I had to change KAFKA_ADVERTISED_LISTENERS to be PLAINTEXT://$(docker-machine ip ): to make it work from IDE. Step 2 (pointing to : in my stream app) was already done. Later, I'll try using CLI as mentioned here https://github.com/confluentinc/examples/blob

Consumer group stable state

2017-03-15 Thread Igor Velichko
Hi all, is it possible to determine on the client side when a consumer group's state becomes stable in the group coordinator? From what I see (Kafka 0.9.0.1), when a consumer subscribes to a topic, it's not immediately ready to receive messages because of the rebalancing process in the broker. So I'd like

Re: Kafka connection to start from latest offset

2017-03-15 Thread Stephen Durfey
Yes, it is pretty coarse. I have a pull request open for supporting overriding those settings at the connector level, I'm just waiting for it to be pulled. So, if any committers are interested in reviewing/pulling it for me, that would be great :) https://github.com/apache/kafka/pull/2548 On Wed

Re: Kafka connection to start from latest offset

2017-03-15 Thread Aaron Niskode-Dossett
Thank you Stephen! That's a very coarse setting, as you note, since it's at the worker level, but I'll take it. -Aaron On Tue, Mar 14, 2017 at 8:07 PM Stephen Durfey wrote: > Producer and consumer overrides used by the connect worker can be > overridden by prefixing the specific kafka config w

Re: Kafka Stream: RocksDBKeyValueStoreSupplier performance

2017-03-15 Thread Eno Thereska
Tianji, Could you provide a third data point, running with RocksDb, but without caching, i.e.:
> StateStoreSupplier stateStoreSupplier = Stores.create(storeName)
>     .withKeys(stringSerde)
>     .withValues(avroSerde)
>     .persistent()
>     .disableLogging()
>     .build();

Re: restart Kafka Streams application takes around 5 minutes

2017-03-15 Thread Sachin Mittal
RocksDB state store initialization may be taking up that time. What's the size of your RocksDB state directory? Maybe partitioning the source topic, increasing the number of threads/instances processing the source, and reducing the time window of aggregation can help reduce the startup time.

restart Kafka Streams application takes around 5 minutes

2017-03-15 Thread Tianji Li
Hi there, In the experiments I am doing now, if I restart the streams application, I have to wait for around 5 minutes for some reason. I can see something in the Kafka logs: [2017-03-15 08:36:18,118] INFO [GroupCoordinator 0]: Preparing to restabilize group xxx-test25 with old generation 2 (kaf

Streams 0.10.2.0 + RocksDB + Avro

2017-03-15 Thread Adrian McCague
Hi all, We are getting collisions with subject names in our schema registry due to state stores that are holding Avro events: "KSTREAM-JOINOTHER-07-store-value", "KSTREAM-JOINOTHER-06-store-value", "KSTREAM-JOINOTHER-05-store-value", "KSTREAM-OUTEROTHER-05-s

Kafka Stream: RocksDBKeyValueStoreSupplier performance

2017-03-15 Thread Tianji Li
Hi there, It seems that the RocksDB state store is quite slow in my case and I wonder if I did anything wrong. I have a topic, that I groupBy() and then aggregate() 50 times. That is, I will create 50 results topics and a lot more changelog and repartition topics. There are a few things that are

Re: Trying to use Kafka Stream

2017-03-15 Thread Michael Noll
Ah, I see. > However, running the program (e.g. https://github.com/confluentinc/examples/blob/3.2.x/kafka-streams/src/main/java/io/confluent/examples/streams/WordCountLambdaExample.java#L178-L181) in my IDE was not and still is not working. Another thing to try is to run the program above from

Re: Trying to use Kafka Stream

2017-03-15 Thread Mina Aslani
Hi Michael, I was aware that the output should be written in a kafka topic not the console. To understand if streams can reach the kafka as Eno asked in earlier email I found http://docs.confluent.io/3.2.0/streams/quickstart.html#goal-of-this-quickstart and went through the steps mentioned and r

Re: Lost ISR when upgrading kafka from 0.10.0.1 to any newer version like 0.10.1.0 or 0.10.2.0

2017-03-15 Thread Ismael Juma
Looking at the output you pasted, broker `0` was the one being upgraded? A few things to check: 1. Does broker `0` connect to the other brokers after the restart 2. Is broker `0` able to connect to zookeeper 3. Does everything look OK in the controller and state-change logs in the controller node

Re: Trying to use Kafka Stream

2017-03-15 Thread Eno Thereska
Hi Mina, It might be that you need to set this property on the Kafka broker config file (server.properties): advertised.listeners=PLAINTEXT://your.host.name:9092 The problem might be this: within docker you run Kafka and Kafka’s address is localhost:9092. Great. Then say you have another con
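
On the client side, the advice above amounts to pointing the application at the same host and port the broker advertises. A minimal sketch, assuming plain string property keys and a hypothetical application id; the host name is the placeholder from the message, not a real address:

```java
import java.util.Properties;

// Hypothetical helper, for illustration only.
class StreamsClientConfig {

    // Builds client properties that target the address the broker advertises
    // (advertised.listeners in server.properties), not "localhost" as seen
    // from inside the container.
    static Properties clientProps(String advertisedHostPort) {
        Properties props = new Properties();
        props.put("application.id", "wordcount-example"); // made-up id
        props.put("bootstrap.servers", advertisedHostPort);
        return props;
    }

    public static void main(String[] args) {
        Properties props = clientProps("your.host.name:9092");
        System.out.println(props.getProperty("bootstrap.servers"));
    }
}
```

If the broker still advertises localhost:9092, a client outside the container may reach the bootstrap address and then be redirected to an address it cannot resolve, which is the failure mode the message describes.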

Re: Trying to use Kafka Stream

2017-03-15 Thread Michael Noll
Mina, in your original question you wrote: > However, I do not see the word count when I try to run below example. Looks like that it does not connect to Kafka. The WordCount demo example writes its output to Kafka only -- it *does not* write any results to the console/STDOUT. From what I can

Re: Kafka Retention Policy to Indefinite

2017-03-15 Thread Joe San
> > I am saying that replication quotas will mitigate one of the potential > downsides of setting an infinite retention policy. I was just interested in all of the potential downsides! Could you please point me to documentation that has more information on this? On Tue, Mar 14, 2017 a