Re: NPE on startup with a low-level API based application

2017-06-15 Thread Frank Lyaruu
Yes, compression was on (lz4). Key and value sizes fluctuate: keys are small (<10 bytes), values fluctuate too but nothing crazy, up to about 100kb. I did some stepping through the code and at some point I saw a branch that used a different path depending on protocol version (something w

Re: Single Key Aggregation

2017-06-15 Thread Sameer Kumar
Ok, let me try to explain it again. Let's say my source processor has a different key; the value it contains, let's say, contains an identifier type, which basically denotes a category, and I am counting on different categories. A specific case would be: I do a filter and output only a specifi
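A minimal sketch of the pipeline being described (not Kafka Streams API — plain Python standing in for a filter step followed by a per-category count; the record shapes and the `wanted` parameter are illustrative assumptions):

```python
from collections import Counter

def count_categories(records, wanted):
    """records: (key, category) pairs; keep only the wanted category, then count."""
    filtered = [(k, c) for k, c in records if c == wanted]  # the filter step
    return Counter(c for _, c in filtered)                  # the count step

# Three records, two in category "A": the filter drops "B" before counting.
counts = count_categories([(1, "A"), (2, "B"), (3, "A")], wanted="A")
```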

Re: Re-Balancing Kafka topics - Best practices

2017-06-15 Thread Abhimanyu Nagrath
Hi Todd, I am not able to access the link https://github.com/linkedin/kafka-tools (kafka-assigner). Regards, Abhimanyu On Fri, Jun 16, 2017 at 12:26 AM, karan alang wrote: > Thanks Todd.. for the detailed reply. > > regds, > Karan Alang > > On Tue, Jun 13, 2017 at 3:19 PM, Todd Palino wrote: > >

Re: Dropped messages in kstreams?

2017-06-15 Thread Caleb Welton
Short answer seems to be that my Kafka LogRetentionTime was such that the metrics I was writing were getting purged from Kafka during the test. Hence the dropped metrics. On Thu, Jun 15, 2017 at 1:32 PM, Caleb Welton wrote: > I have encapsulated the repro into a small self contained project: > https://git
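For reference, retention can be lengthened so test data outlives the test run; a hedged sketch of the relevant settings (values here are illustrative, and the exact tooling for per-topic overrides varies by Kafka version):

```properties
# Broker-wide default in server.properties (7 days):
log.retention.hours=168

# Or per topic, e.g. via the topics tool in 0.9/0.10-era releases:
# bin/kafka-topics.sh --zookeeper localhost:2181 --alter \
#   --topic metrics --config retention.ms=604800000
```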

Re: IllegalStateException when putting to state store in Transformer implementation

2017-06-15 Thread Guozhang Wang
Adrian, I looked though the 0.10.2.1 code but I cannot nail down to any obvious places where the processor context is set to null, which could trigger your exception. Also from your stack trace there is no direct clues available. Would you mind creating a JIRA and attach the link to your JSON and

Re: UNKNOWN_TOPIC_OR_PARTITION with SASL_PLAINTEXT ACL

2017-06-15 Thread Tarun Garg
Hi, This doesn't give any insight into the problem. Please do two things first. 1. Change the log level in config/log4j.properties for the authorizer (I don't remember the exact property name, but it is at the bottom of the file): this will give the broker log. 2. Change the log level in the config/tools-log4j file:
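The authorizer logger the message refers to looks roughly like the following in the shipped config/log4j.properties (property names from the Kafka distributions I've seen; double-check your version's file):

```properties
# Raise this from WARN/INFO to DEBUG to see per-request authorization decisions:
log4j.logger.kafka.authorizer.logger=DEBUG, authorizerAppender
log4j.additivity.kafka.authorizer.logger=false
```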

Re: NPE on startup with a low-level API based application

2017-06-15 Thread Apurva Mehta
Finally, was compression enabled when you hit this exception? If so, which compression algorithm was enabled? On Thu, Jun 15, 2017 at 5:04 PM, Apurva Mehta wrote: > Frank: it would be even better if you could share the key and value which > was causing this problem. Maybe share it on the JIRA: >

Re: NPE on startup with a low-level API based application

2017-06-15 Thread Apurva Mehta
Frank: it would be even better if you could share the key and value which was causing this problem. Maybe share it on the JIRA: https://issues.apache.org/jira/browse/KAFKA-5456 ? Thanks, Apurva On Thu, Jun 15, 2017 at 4:07 PM, Apurva Mehta wrote: > Hi Frank, > > What is the value of `batch.s

Re: UNKNOWN_TOPIC_OR_PARTITION with SASL_PLAINTEXT ACL

2017-06-15 Thread Vahid S Hashemian
Hi Arunkumar, Could you please take a look at this article: https://developer.ibm.com/opentech/2017/05/31/kafka-acls-in-practice/ The error message you posted earlier suggests there is some missing ACL (as indicated in the article). Let me know if that doesn't resolve the issue. Thanks. --Vahid

Re: UNKNOWN_TOPIC_OR_PARTITION with SASL_PLAINTEXT ACL

2017-06-15 Thread Arunkumar
I also created ACLs for both producer and consumer. Still no luck: bin/kafka-acls --producer host:9097 --topic sample1 --add -allow-host hostname9097 --allow-principal User:arun --authorizer-properties zookeeper.connect=zookeeperhost:2182 bin/kafka-acls --consumer host:9097 --topic sample1 -

Re: UNKNOWN_TOPIC_OR_PARTITION with SASL_PLAINTEXT ACL

2017-06-15 Thread Arunkumar
Hi Vahid, Thank you for the quick response. I set the ACL for the topic and also created the JAAS permission as per the document for both producer and consumer. I have set what I have posted below. Do I need to set an ACL like we set for topics -- bin/kafka-acls --topic * --add -allow-host host:9097 --allow-
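For comparison, a hedged sketch of the usual form of these commands (flag names per the Kafka docs of that era; the ZooKeeper address, principal, and group name below are placeholders — note `--consumer` also requires `--group`):

```shell
# Producer-side ACLs (Write, Describe, Create on the topic):
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=zookeeperhost:2182 \
  --add --allow-principal User:arun --producer --topic sample1

# Consumer-side ACLs (Read, Describe on the topic; Read on the group):
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=zookeeperhost:2182 \
  --add --allow-principal User:arun --consumer --topic sample1 --group my-group
```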

Re: UNKNOWN_TOPIC_OR_PARTITION with SASL_PLAINTEXT ACL

2017-06-15 Thread Vahid S Hashemian
Hi Arunkumar, Have you given your Kafka consumer/producer the necessary permissions to consume/produce messages? --Vahid From: Arunkumar To: Date: 06/15/2017 04:07 PM Subject: UNKNOWN_TOPIC_OR_PARTITION with SASL_PLAINTEXT ACL Hi I am setting up ACL with SASL_PLAINTEXT. My z

Re: Kafka Streams vs Spark Streaming : reduce by window

2017-06-15 Thread Matthias J. Sax
Hi Paolo, This SO question might help, too: https://stackoverflow.com/questions/38935904/how-to-send-final-kafka-streams-aggregation-result-of-a-time-windowed-ktable For Streams, the basic model is based on "change" and we report updates to the "current" result immediately reducing latency to a m
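The "change" model described above can be sketched in plain Python (this is an illustrative simulation, not the Kafka Streams API): every input record immediately produces an updated result, and the latest update per key is the current answer.

```python
# Illustrative sketch of a changelog-style aggregation: each input record
# emits an update to the *current* result, rather than a single final
# result when a window "closes".

def changelog_aggregate(records):
    """records: list of (key, value); returns the stream of update events."""
    state = {}
    updates = []
    for key, value in records:
        state[key] = state.get(key, 0) + value  # running sum per key
        updates.append((key, state[key]))       # emit the current result
    return updates

# Each input produces one downstream update; the last update per key
# carries the final value.
updates = changelog_aggregate([("a", 1), ("a", 2), ("b", 5), ("a", 3)])
```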

Re: NPE on startup with a low-level API based application

2017-06-15 Thread Apurva Mehta
Hi Frank, What is the value of `batch.size` in your producer? What is the size of the key and value you are trying to write? Thanks, Apurva On Thu, Jun 15, 2017 at 2:28 AM, Frank Lyaruu wrote: > Hey people, I see an error I haven't seen before. It is on a lowlevel-API > based streams applic

UNKNOWN_TOPIC_OR_PARTITION with SASL_PLAINTEXT ACL

2017-06-15 Thread Arunkumar
Hi, I am setting up ACL with SASL_PLAINTEXT. My zookeeper and broker start without error. But when I try to start my consumer, or if I send a message through a producer, it throws an exception (both producer and consumer are the Kafka CLI). Stack trace for my consumer below. Any insight is highly apprec

Re: kafka-consumer-groups.sh vs ConsumerOffsetChecker

2017-06-15 Thread Vahid S Hashemian
Is it possible that the consumer group or the corresponding topic got deleted? If you don't have consumers running in the group but there are offsets associated with (valid) topics in the group, the group offsets should not get removed from ZK. --Vahid From: karan alang To: users@kafka.apac

Re: Kafka Connect fails to read offset

2017-06-15 Thread Randall Hauch
At any time, did your standalone worker config contain `internal.value.converter.schemas.enable=false`? On Thu, Jun 15, 2017 at 5:00 AM, Artem B. wrote: > Hi all, > > Each time I start my source connector it fails to read the offset stored in > a file with the following error: > > 21:05:01:519 |

Scenario with ZK ensemble partitioned from Kafka cluster + ISR broker going down

2017-06-15 Thread Kostas Christidis
Assume:
1. A Kafka cluster with 3 brokers: B0, B1, B2
2. A single topic with a single partition
3. default.replication.factor = 3
4. min.insync.replicas = 3
5. acks = all for the producers

Let:
1. B0 be the controller of the cluster and the leader of the partition for simplicity
2. ISR = [B0 B2]
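Under these assumptions, the broker-side acceptance rule for acks=all producers can be sketched as follows (an illustrative simulation, not broker code): a produce request is accepted only while the ISR size meets min.insync.replicas, otherwise the producer gets a NotEnoughReplicas-style error.

```python
# Illustrative check: with min.insync.replicas = 3, acks=all writes require
# all three replicas to be in the ISR.
MIN_INSYNC_REPLICAS = 3

def accepts_produce(isr, min_insync=MIN_INSYNC_REPLICAS):
    """Return True if an acks=all produce request would be accepted."""
    return len(isr) >= min_insync

# Full ISR: writes accepted.
full_isr = accepts_produce(["B0", "B1", "B2"])
# The scenario's ISR = [B0 B2]: acks=all writes are already rejected,
# even before any further broker goes down.
partial_isr = accepts_produce(["B0", "B2"])
```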

Re: kafka-consumer-groups.sh vs ConsumerOffsetChecker

2017-06-15 Thread karan alang
Hi Vahid, In ZK, I don't see any reference to myGroup (is this because the consumer is no longer running, or is there some other reason?) *ls /consumers[replay-log-producer]* Also, when I run the ConsumerOffsetChecker w/o the topic, I get the error shown below -> *$KAFKA_HOME/bin/kafka-run-class.sh

Re: Dropped messages in kstreams?

2017-06-15 Thread Caleb Welton
I have encapsulated the repro into a small self contained project: https://github.com/cwelton/kstreams-repro Thanks, Caleb On Thu, Jun 15, 2017 at 11:30 AM, Caleb Welton wrote: > I do have a TimestampExtractor setup and for the 10 second windows that > are emitted all the values expected in

Re: kafka-consumer-groups.sh vs ConsumerOffsetChecker

2017-06-15 Thread Vahid S Hashemian
Hi Karan, The message "No topic available for consumer group provided" appears when there is no topic under the consumer group in ZooKeeper (under /consumers/myGroup/owners/). Can you check whether the topic 'newBroker1' exists under this ZK path? Also, do you still get the rows below if you run

Re: kafka-consumer-groups.sh vs ConsumerOffsetChecker

2017-06-15 Thread karan alang
Please note -> Even when I run the command as shown below (not as new-consumer), I don't get the required result. *$KAFKA_HOME/bin/kafka-consumer-groups.sh --describe --zookeeper localhost:2181 --group myGroup* No topic available for consumer group provided GROUP, TOPIC, PARTITION, CURRENT OFFSET

Re: Re-Balancing Kafka topics - Best practices

2017-06-15 Thread karan alang
Thanks Todd.. for the detailed reply. regds, Karan Alang On Tue, Jun 13, 2017 at 3:19 PM, Todd Palino wrote: > A few things here… > > 1) auto.leader.rebalance.enable can have serious performance impacts on > larger clusters. It’s currently in need of some development work to enable > it to batc

kafka-consumer-groups.sh vs ConsumerOffsetChecker

2017-06-15 Thread karan alang
Hi All - I have Kafka 0.9 (going forward I will be migrating to Kafka 0.10) and am trying to use the ConsumerOffsetChecker and bin/kafka-consumer-groups.sh to check for offsets. I'm seeing different behavior. Here is what I did -> a) When I use ConsumerOffsetChecker $KAFKA_HOME/bin/kafka-run-class.sh kaf

Re: Dropped messages in kstreams?

2017-06-15 Thread Caleb Welton
I do have a TimestampExtractor setup and for the 10 second windows that are emitted all the values expected in those windows are present, e.g. each 10 second window gets 100 values aggregated into it. I have no metrics with null keys or values. I will try to get the entire reproduction case packa

Re: Dropped messages in kstreams?

2017-06-15 Thread Matthias J. Sax
Another thing to consider: do you have records with null key or value? Those records would be dropped and not processed. -Matthias On 6/15/17 6:24 AM, Garrett Barton wrote: > Is your time usage correct? It sounds like you want event time not > load/process time which is default unless you have a

Re: Single Key Aggregation

2017-06-15 Thread Eno Thereska
I'm not sure if I fully understand this, but let me check:
- if you start 2 instances, one instance will process half of the partitions, the other instance will process the other half
- for any given key, like key 100, it will only be processed on one of the instances, not both.
Does this help?
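The "one key, one instance" property follows from partitioning by key; a minimal sketch of the idea (illustrative only — the real Kafka partitioner uses murmur2, and the partition counts here are assumptions):

```python
import zlib

NUM_PARTITIONS = 4
NUM_INSTANCES = 2

def partition_for(key):
    # Hash the key to pick a partition (crc32 stands in for murmur2 here).
    return zlib.crc32(str(key).encode()) % NUM_PARTITIONS

def instance_for(key):
    # Partitions are divided between the running instances, so each
    # partition (and thus each key) is owned by exactly one instance.
    return partition_for(key) % NUM_INSTANCES

# The same key always maps to the same partition, hence the same instance;
# it is never processed on both.
owner_of_key_100 = instance_for(100)
```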

Re: KStream and KTable different behaviour on filter() operation

2017-06-15 Thread Eno Thereska
Yeah the semantics are slightly different. For a KTable, a null value just means that the record is a tombstone, and will be anyways ignored by subsequent processing: http://docs.confluent.io/current/streams/javadocs/org/apache/kafka/streams/kstream/KTable.html#filter-org.apache.kafka.streams.kst
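The difference can be sketched in plain Python (an illustrative simulation, not the Streams API): a KStream filter simply drops non-matching records, while a KTable filter forwards a null value (a tombstone) for them, so downstream state can delete the stale entry.

```python
# None stands in for a tombstone value.

def kstream_filter(records, pred):
    # KStream semantics: non-matching records are dropped entirely.
    return [(k, v) for k, v in records if pred(v)]

def ktable_filter(records, pred):
    # KTable semantics: non-matching records become tombstones, so a
    # downstream table knows to remove the old value for that key.
    return [(k, v if pred(v) else None) for k, v in records]

records = [("a", 5), ("b", 50)]
big = lambda v: v >= 10
stream_out = kstream_filter(records, big)  # only ("b", 50) survives
table_out = ktable_filter(records, big)    # ("a", None) tombstone is forwarded
```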

Re: Kafka Streams vs Spark Streaming : reduce by window

2017-06-15 Thread Paolo Patierno
Hi Eno, regarding the closing window, I think that it's up to the streaming application. I mean ... if I want something like I described, I know that a value outside my 5 seconds window will be taken into account for the next processing (in the next 5 seconds). I don't think I'm losing a record, I

Re: Kafka Streams vs Spark Streaming : reduce by window

2017-06-15 Thread Eno Thereska
Hi Paolo, Yeah, so if you want fewer records, you should actually "not" disable cache. If you disable cache you'll get all the records as you described. About closing windows: if you close a window and a late record arrives that should have been in that window, you basically lose the ability to
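The late-record point can be sketched as follows (illustrative Python, not Streams code; the window size and max aggregation are assumptions): because windows are never closed, a record whose timestamp falls into an earlier window still updates that window's result instead of being lost.

```python
WINDOW_MS = 5000

def window_start(ts):
    # Tumbling-window assignment: each timestamp belongs to one window.
    return ts - (ts % WINDOW_MS)

def aggregate(records):
    """records: (timestamp_ms, value) pairs; returns max value per window."""
    windows = {}
    for ts, value in records:
        w = window_start(ts)
        windows[w] = max(windows.get(w, value), value)
    return windows

# The record at t=4500 arrives *after* t=6000, yet it still lands in
# window [0, 5000) and updates that window's max.
result = aggregate([(1000, 3), (6000, 7), (4500, 9)])
```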

Re: NPE on startup with a low-level API based application

2017-06-15 Thread Frank Lyaruu
It seems to happen when using Streams 0.11.1 snapshot against a 0.10.2 (release) broker, the problem disappeared after I upgraded the broker. On Thu, Jun 15, 2017 at 11:28 AM, Frank Lyaruu wrote: > Hey people, I see an error I haven't seen before. It is on a lowlevel-API > based streams applica

Re: Kafka Streams vs Spark Streaming : reduce by window

2017-06-15 Thread Paolo Patierno
Hi Eno, thanks for the reply! Regarding the cache, I'm already using CACHE_MAX_BYTES_BUFFERING_CONFIG = 0 (so disabling the cache). Regarding the interactive query API (I'll take a look): it means that it's up to the application to do something like we have out of the box with Spark. May I ask what do you m

Re: Kafka Streams vs Spark Streaming : reduce by window

2017-06-15 Thread Eno Thereska
Hi Paolo, That is indeed correct. We don’t believe in closing windows in Kafka Streams. You could reduce the number of downstream records by using record caches: http://docs.confluent.io/current/streams/developer-guide.html#record-caches-in-the-dsl
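The effect of a record cache can be sketched like this (an illustrative simulation of the coalescing behavior, not the actual cache implementation): updates to the same key are buffered, and only the latest value per key is forwarded when the cache flushes.

```python
def cached_flush(updates):
    """updates: (key, value) pairs buffered between flushes."""
    cache = {}
    for key, value in updates:
        cache[key] = value          # later updates overwrite earlier ones
    return sorted(cache.items())    # forwarded downstream on flush/commit

# Five successive updates to the "max" aggregate collapse into a single
# downstream record carrying the latest value.
flushed = cached_flush([("max", v) for v in (1, 4, 2, 9, 7)])
```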

Re: Kafka Streams vs Spark Streaming : reduce by window

2017-06-15 Thread Tom Bentley
It sounds like you want a tumbling time window, rather than a sliding window https://kafka.apache.org/documentation/streams#streams_dsl_windowing On 15 June 2017 at 14:38, Paolo Patierno wrote: > Hi, > > > using the streams library I noticed a difference (or there is a lack of > knowledge on my
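The distinction can be sketched numerically (illustrative Python, not the Streams windowing API): a tumbling window assigns each timestamp to exactly one window, while a hopping window (size larger than the advance) assigns it to several overlapping windows.

```python
def tumbling_windows(ts, size):
    # Non-overlapping windows: each timestamp falls in exactly one.
    return [ts - (ts % size)]

def hopping_windows(ts, size, advance):
    # Overlapping windows: every window start w (a multiple of `advance`)
    # with w <= ts < w + size contains the timestamp.
    w = (ts // advance) * advance
    starts = []
    while w > ts - size and w >= 0:
        starts.append(w)
        w -= advance
    return sorted(starts)

# t=7000 with a 5s window: one tumbling window, five hopping windows
# when the hop is 1s.
one = tumbling_windows(7000, 5000)          # [5000]
many = hopping_windows(7000, 5000, 1000)    # [3000, 4000, 5000, 6000, 7000]
```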

Kafka Streams vs Spark Streaming : reduce by window

2017-06-15 Thread Paolo Patierno
Hi, using the streams library I noticed a difference (or there is a lack of knowledge on my side) with Apache Spark. Imagine the following scenario ... I have a source topic where numeric values come in, and I want to check the maximum value in the latest 5 seconds, but ... putting the max value in

Re: Dropped messages in kstreams?

2017-06-15 Thread Garrett Barton
Is your time usage correct? It sounds like you want event time, not load/process time, which is the default unless you have a TimestampExtractor defined somewhere upstream. Otherwise I could see far fewer events coming out, as streams is just aggregating whatever showed up in that 10-second window. On

KStream and KTable different behaviour on filter() operation

2017-06-15 Thread Paolo Patierno
Hi all, I was asking why the different behaviour of the filter() operation on a KStream and a KTable. On a KStream, if the predicate is false, the message isn't passed to the next node (so, for example, with a sink node, it doesn't arrive at the destination topic). On a KTable, if the predicate is true, a m

Kafka Connect fails to read offset

2017-06-15 Thread Artem B.
Hi all, Each time I start my source connector it fails to read the offset stored in a file with the following error: 21:05:01:519 | ERROR | pool-1-thread-1 | o.a.k.c.s.OffsetStorageReaderImpl | CRITICAL: Failed to deserialize offset data when getting offsets for task with namespace zohocrm-sour

NPE on startup with a low-level API based application

2017-06-15 Thread Frank Lyaruu
Hey people, I see an error I haven't seen before. It is on a low-level-API based streams application. I've started it once, then it ran fine, then did a graceful shutdown, and since then I always see this error on startup. I'm using yesterday's trunk. It seems that the MemoryRecordsBuilder overflow

RE: IllegalStateException when putting to state store in Transformer implementation

2017-06-15 Thread Adrian McCague
Hi Guozhang, thanks for your reply. I can confirm that the init method is quite basic:

public void init(ProcessorContext context) {
    context.schedule(TIMEOUT.getMillis());
    this.context = context;
    this.store = (KeyValueStore) this.context.getStateStore(storeName);
}

Omitting try catch a