Re: Single Key Aggregation

2017-06-14 Thread Sameer Kumar
Also, I am writing a single key to the output all the time. I believe machine2 will have to write a key, and since a state store is local it wouldn't know about the counter state on the other machine. So, I guess this will happen. -Sameer. On Thu, Jun 15, 2017 at 11:11 AM, Sameer Kumar wrote: > Th
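Sameer's reasoning follows from Kafka's keyed partitioning: every record with the same key hashes to the same partition, so a single task (and its local state store) sees all of them. A toy sketch of the idea (illustrative only; Kafka's DefaultPartitioner actually murmur2-hashes the serialized key bytes, and `hashCode()` merely stands in for it here):

```java
public class SinglePartitionDemo {
    // Illustrative stand-in for Kafka's partitioner: same key, same partition.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // With 60 partitions, a single output key always maps to one
        // partition, hence one task/machine and one local state store.
        System.out.println("key 'clicks' -> partition "
                + partitionFor("clicks", 60));
    }
}
```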

Re: Single Key Aggregation

2017-06-14 Thread Sameer Kumar
The input topic contains 60 partitions and data is distributed well across the partitions on different keys. During consumption, I am doing some filtering and writing data for only a single key. The output would be something of the form: Machine 1 2017-06-13 16:49:10 INFO LICountClickImprMR2:116 - l

Producer/consumer JMX metrics

2017-06-14 Thread Tarun Garg
Hi, I am stuck on a dumb problem, or at least it seems like one. I have a collectd conf file ``` LoadPlugin java ObjectName "kafka.producer:type=producer-metrics,clientId=([-.\w]+)" InstancePrefix "all" InstancePrefix "kafka-producer-
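Independently of collectd, an MBean name can be sanity-checked with the JDK's own javax.management API before wiring it into a config. Note the `client-id` key used below reflects newer Java producer metric names and is an assumption (older clients exposed `clientId`), and `*` is JMX pattern syntax, whereas collectd's GenericJMX plugin matches ObjectName with a regular expression:

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class JmxPatternCheck {
    // Returns true if the string parses as a JMX ObjectName pattern.
    static boolean isValidPattern(String name) {
        try {
            return new ObjectName(name).isPattern();
        } catch (MalformedObjectNameException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(
            isValidPattern("kafka.producer:type=producer-metrics,client-id=*"));
    }
}
```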

Re: Dropped messages in kstreams?

2017-06-14 Thread Caleb Welton
Disabling the cache with:

```
streamsConfiguration.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0)
```

results in:
- Emitting many more intermediate calculations.
- Still losing data.

In my test case it emitted 342476 intermediate calculations for 3414 distinct windows, 14400 distinct were

Re: kafka streams first time did not print output

2017-06-14 Thread Matthias J. Sax
Hi, in your first example with .print() "before" and "after" .to(), I want to clarify that the order in which you add operators does not really matter here. The DAG you build will branch out anyway:

                          +--> print()
                          |
    KTable -> changelog --+--> to()

Re: Dropped messages in kstreams?

2017-06-14 Thread Matthias J. Sax
This seems to be related to internal KTable caches. You can disable them by setting cache size to zero. http://docs.confluent.io/current/streams/developer-guide.html#memory-management -Matthias On 6/14/17 4:08 PM, Caleb Welton wrote: > Update, if I set `StreamsConfig.NUM_STREAM_THREADS_CONFIG=
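In properties-file form, an assumed equivalent of the setting described in the linked guide (the key name matches StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG):

```
# Disable Streams record caches so every aggregation update is
# forwarded downstream immediately instead of being deduplicated.
cache.max.bytes.buffering=0
```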

Re: Dropped messages in kstreams?

2017-06-14 Thread Caleb Welton
Update: if I set `StreamsConfig.NUM_STREAM_THREADS_CONFIG=1` then the problem does not manifest; at `StreamsConfig.NUM_STREAM_THREADS_CONFIG=2` or higher the problem shows up. When the number of threads is 1, the speed of data through the first part of the topology (before the KTable) slows down co
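For reference, the assumed properties-file form of the setting being varied in this thread:

```
# Kafka Streams processing thread count; 1 avoids the reported
# problem, 2 or higher reproduces it.
num.stream.threads=1
```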

Log Cleanup/Deletion isn't working correctly [KAFKA-1194]

2017-06-14 Thread Mohammed Manna
Hi, I have bothered you guys a lot so far regarding this, and the bug ticket is still open in Kafka JIRA (1194). It seems that log.cleanup.policy=DELETE doesn't close the current file gracefully and schedule a deletion. My stacktrace for this error is given below: [2017-06-14 23:07:34,965] INFO Partit

Dropped messages in kstreams?

2017-06-14 Thread Caleb Welton
I have a topology of KStream -> KTable -> KStream:

```
final KStreamBuilder builder = new KStreamBuilder();
final KStream<String, MyThing> metricStream = builder.stream(ingestTopic);
final KTable<Windowed<String>, MyThing> myTable = metricStream
    .groupByKey(stringSerde, mySerde)
    .reduce(MyThing::merge, TimeWin
```

Re: Slow Consumer Group Startup

2017-06-14 Thread Bryan Baugher
It does seem like we are in a similar situation to the one described in the KIP ( https://cwiki.apache.org/confluence/display/KAFKA/KIP-134%3A+Delay+initial+consumer+group+rebalance). For some historical reason we had set our session.timeout.ms to a high value (5 minutes), which corresponds with the amount of
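The two settings being discussed, in assumed properties-file form (values illustrative; 3000 ms is the KIP-134 broker-side default):

```
# Broker side (KIP-134): delay before the initial consumer group rebalance.
group.initial.rebalance.delay.ms=3000
# Consumer side: the unusually high 5-minute value described above.
session.timeout.ms=300000
```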

Re: IllegalStateException when putting to state store in Transformer implementation

2017-06-14 Thread Guozhang Wang
Hello Adrian, when you call "put()" on the windowed state store without specifying a timestamp, `context.timestamp()` is retrieved and used as the default timestamp:

    public synchronized void put(final K key, final V value) {
        put(key, value, context.timestamp());
    }
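The failure mode can be sketched in plain Java (class, field, and method names here are illustrative, not Kafka's actual internals): the two-argument put delegates to a context timestamp that is only defined while a record is being processed, so calling it from any other code path throws.

```java
public class TimestampDelegationSketch {
    // null models "no record currently being processed".
    static Long currentRecordTimestamp = null;

    static long timestamp() {
        if (currentRecordTimestamp == null) {
            throw new IllegalStateException(
                "timestamp() should only be called while a record is processed");
        }
        return currentRecordTimestamp;
    }

    // Two-arg put delegates to the context timestamp, like the store above.
    static String put(String key, String value) {
        return put(key, value, timestamp());
    }

    static String put(String key, String value, long ts) {
        return key + "=" + value + " @ " + ts;
    }

    public static void main(String[] args) {
        currentRecordTimestamp = 100L;       // a record is "being processed"
        System.out.println(put("k", "v"));   // ok: uses the context timestamp
        currentRecordTimestamp = null;       // e.g. an init/punctuate path
        try {
            put("k", "v");                   // throws, like the reported error
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```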

IllegalStateException when putting to state store in Transformer implementation

2017-06-14 Thread Adrian McCague
Hi All, We have a transformer implementation in our Kafka Streams application that sometimes raises this exception when starting: "java.lang.IllegalStateException: This should not happen as timestamp() should only be called while a record is processed" This happens when 'put' is called on a s

Re: Slow Consumer Group Startup

2017-06-14 Thread Bryan Baugher
While I do have some logs, it's not trivial to share them since they are spread across 16 JVMs and a few different hosts. On Wed, Jun 14, 2017 at 10:34 AM Eno Thereska wrote: > The delay in that KIP is just 3 seconds, not minutes though, right? Would > you have any logs to share? > > Thanks > Eno > > On 1

Re: Question on Support provision

2017-06-14 Thread Eno Thereska
Hi Sofia, Thank you for your recent enquiry for Kafka support services. Confluent employs some of the world’s foremost Apache Kafka experts, and that expertise shows in the level of support we can provide. The subscription offers a scali

Re: Slow Consumer Group Startup

2017-06-14 Thread Eno Thereska
The delay in that KIP is just 3 seconds, not minutes though, right? Would you have any logs to share? Thanks Eno > On 14 Jun 2017, at 16:14, Bryan Baugher wrote: > > Our consumer group isn't doing anything stateful and we've seen this > behavior for existing groups as well. It seems like timing

Re: Slow Consumer Group Startup

2017-06-14 Thread Bryan Baugher
Our consumer group isn't doing anything stateful and we've seen this behavior for existing groups as well. It seems like timing could be an issue, thanks for the information. On Tue, Jun 13, 2017 at 7:39 PM James Cheng wrote: > Bryan, > > This sounds related to > https://cwiki.apache.org/conflue

Question on Support provision

2017-06-14 Thread Sofia Miari
Good Day Team - may I please ask if there are any companies in EMEA supporting the Kafka tool? First Data would like to utilize the tool, but we will need support, so we are looking for someone in EMEA who can provide it (paid, subscription, etc.). Thank you in advance, Sofia Miari Manager, Strat

Fat partition - Kafka Spark streaming

2017-06-14 Thread GURU PRAVEEN
Hi, We have an integrated Kafka-Spark streaming app that listens to Twitter and pushes the tweets to Kafka, which are later consumed by the Spark app. We constantly see one of the Kafka partitions having more data than the other partitions, and we are not able to zero in on the root cause. We

Re: Adding/Removing Brokers to Kafka, while data is flowing into Kafka topics

2017-06-14 Thread Mohammed Manna
I don't believe it is safe to do rebalancing whilst producing into the topic. This creates mismatches in the partition/replication process, unless this has been handled in the upcoming 0.11 release? Kindest Regards, On 14 June 2017 at 02:52, karan alang wrote: > Some more details about