kafka-streams 0.11.0.1 rocksdb bug?

2017-09-22 Thread Ara Ebrahimi
Hi, We just upgraded to kafka-streams 0.11.0.1 and noticed that in the cluster deployment reduce() never gets called. Funny thing is, it does get called in the unit tests. And no, it’s not a data issue. What I have noticed is that all rocksdb folders (//1_0// and so on) are empty. I do not see a

Re: session window bug not fixed in 0.10.2.1?

2017-05-08 Thread Ara Ebrahimi
xed in 2645. Could you help verify if that is the case? In which case we can re-open https://issues.apache.org/jira/browse/KAFKA-4851 and investigate further. Guozhang On Tue, May 2, 2017 at 1:02 PM, Ara Ebrahimi <ara.ebrah...@argyledata.com> wrot

Re: session window bug not fixed in 0.10.2.1?

2017-05-02 Thread Ara Ebrahimi
e to both trunk and 0.10.2.1, I just checked. What error are you seeing, could you give us an update? Thanks Eno On Fri, Apr 28, 2017 at 7:10 PM, Ara Ebrahimi wrote: Hi, I upgraded to 0.10.2.1 yesterday, enabled caching fo

session window bug not fixed in 0.10.2.1?

2017-04-28 Thread Ara Ebrahimi
ould build from source and not have to create the StateStoreSupplier. Thanks, Damian On Mon, 27 Mar 2017 at 21:56 Ara Ebrahimi wrote: Thanks for the response Matthias! The reason we want this exact task assignment to happen is that a critical

Re: more uniform task assignment across kafka stream nodes

2017-03-28 Thread Ara Ebrahimi
Reply-To: users@kafka.apache.org Great! So overall, the issue is not related to task assignment. Also, the description below does not indicate that a different task assignment would change anything.

Re: more uniform task assignment across kafka stream nodes

2017-03-27 Thread Ara Ebrahimi
. Also, the description below does not indicate that a different task assignment would change anything. -Matthias On 3/27/17 3:08 PM, Ara Ebrahimi wrote: Let me clarify, because I think we’re using different terminology: - message key is phone number, reversed - all call records for a phone number

Re: more uniform task assignment across kafka stream nodes

2017-03-27 Thread Ara Ebrahimi
"100s of records" does not sound like much to me. Streams can process multiple hundreds of thousands of records per thread. That is the reason why I think that the fix Damian pointed out will most likely fix your problem. -Matthias On 3/27/17 1:56 PM, Ara Ebrahimi wrote: Thanks for the response M

Re: more uniform task assignment across kafka stream nodes

2017-03-27 Thread Ara Ebrahimi
erry-picked to the 0.10.2 branch, so you could build from source and not have to create the StateStoreSupplier. Thanks, Damian On Mon, 27 Mar 2017 at 21:56 Ara Ebrahimi wrote: Thanks for the response Matthias! The reason we want this exact task ass

Re: more uniform task assignment across kafka stream nodes

2017-03-27 Thread Ara Ebrahimi
- it might be worth improving Streams here. -Matthias On 3/27/17 12:57 PM, Ara Ebrahimi wrote: Hi, So, I simplified the topology by making sure we have only 1 source topic. Now I have 1 source topic, 8 partitions, 2 instances. And here’s what the topology looks like: instance 1: KafkaStream

Re: more uniform task assignment across kafka stream nodes

2017-03-27 Thread Ara Ebrahimi
. Otherwise, I cannot give further advice. -Matthias On 3/25/17 6:08 PM, Ara Ebrahimi wrote: Via: builder.stream("topic1"); builder.stream("topic2"); builder.stream("topic3"); These are different kinds of topics consuming different avro objects. Ara. On Mar 25, 2017

Re: more uniform task assignment across kafka stream nodes

2017-03-25 Thread Ara Ebrahimi
"); Both are handled differently with regard to creating tasks (partition to task assignment also depends on your downstream code though). If this does not help, can you maybe share the structure of processing? To dig deeper, we would need to know the topology DAG. -Matthias On 3/25/17 5:56 PM, Ara Eb

Re: more uniform task assignment across kafka stream nodes

2017-03-25 Thread Ara Ebrahimi
the same number of tasks (and thus partitions) assigned. What is the overall assignment? Do you have standby tasks configured? What version do you use? -Matthias On 3/24/17 8:09 PM, Ara Ebrahimi wrote: Hi, Is there a way to tell kafka streams to uniformly assign partitions across in

more uniform task assignment across kafka stream nodes

2017-03-24 Thread Ara Ebrahimi
Hi, Is there a way to tell kafka streams to uniformly assign partitions across instances? If I have n kafka streams instances running, I want each to handle EXACTLY 1/n of the partitions. No dynamic task assignment logic. Just dumb 1/n assignment. Here’s our scenario. Let’s say we have an
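To make the "dumb 1/n assignment" in the question concrete, here is a minimal sketch in plain Java. It is not Kafka Streams' task assignor and uses no Kafka API; the class and method names are hypothetical, and the 8-partition, 2-instance figures are only example inputs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: this is NOT how Kafka Streams assigns tasks.
final class UniformAssignmentSketch {

    // Hands out partitions 0..numPartitions-1 round-robin, so each instance
    // receives either floor(p/n) or ceil(p/n) partitions.
    static List<List<Integer>> assign(int numPartitions, int numInstances) {
        if (numInstances <= 0) {
            throw new IllegalArgumentException("numInstances must be positive");
        }
        List<List<Integer>> perInstance = new ArrayList<>();
        for (int i = 0; i < numInstances; i++) {
            perInstance.add(new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            perInstance.get(p % numInstances).add(p);
        }
        return perInstance;
    }

    public static void main(String[] args) {
        // prints [[0, 2, 4, 6], [1, 3, 5, 7]]
        System.out.println(assign(8, 2));
    }
}
```

With a partition count that is not a multiple of n, the first few instances simply get one extra partition.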

Re: kafka streams locking issue in 0.10.2.0

2017-02-25 Thread Ara Ebrahimi
y need to increase the max.poll.interval.ms Thanks, Damian On Thu, 23 Feb 2017 at 18:26 Ara Ebrahimi wrote: Hi, After upgrading to 0.10.2.0 I got this: Caused by: org.apache.kafka.clients.consumer.CommitFailedExc

kafka streams locking issue in 0.10.2.0

2017-02-23 Thread Ara Ebrahimi
Hi, After upgrading to 0.10.2.0 I got this: Caused by: org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was long
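The reply in this thread suggests raising max.poll.interval.ms. A minimal sketch of setting that consumer property with plain java.util.Properties follows; the application id, broker address, and the 10-minute value are assumptions chosen for illustration, not recommendations from the thread.

```java
import java.util.Properties;

// Sketch only: builds the properties one might hand to a Kafka client.
final class PollIntervalConfigSketch {

    static Properties streamsProps() {
        Properties props = new Properties();
        // Hypothetical identifiers, replace with real values.
        props.setProperty("application.id", "example-streams-app");
        props.setProperty("bootstrap.servers", "localhost:9092");
        // Maximum allowed gap between poll() calls before the consumer is
        // considered failed and the group rebalances. 600000 ms (10 minutes)
        // is an assumed value for illustration.
        props.setProperty("max.poll.interval.ms", "600000");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(streamsProps().getProperty("max.poll.interval.ms"));
    }
}
```

A larger interval gives slow processing more headroom, at the cost of slower detection of a stuck instance.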

KTable TTL

2017-02-13 Thread Ara Ebrahimi
Hi, I have a ktable and I want to keep entries in it only for the past 24 hours. How can I do that? I understand rocksdb has support for ttl. Should I set that? How? Should I use kafka-streams window functionality? Would it remove data from old windows? I want to do this because I’m seeing a

Re: kafka-consumer-offset-checker complaining about NoNode for X in zk

2017-02-02 Thread Ara Ebrahimi
han zookeeper. You can use the --new-consumer option to check for kafka stored offsets. Best Jan On 01.02.2017 21:14, Ara Ebrahimi wrote: Hi, For a subset of our topics we get this error:

kafka-consumer-offset-checker complaining about NoNode for X in zk

2017-02-01 Thread Ara Ebrahimi
Hi, For a subset of our topics we get this error: $KAFKA_HOME/bin/kafka-consumer-offset-checker.sh --group argyle-streams --topic topic_name --zookeeper $ZOOKEEPERS [2017-02-01 12:08:56,115] WARN WARNING: ConsumerOffsetChecker is deprecated and will be dropped in releases following 0.9.0. Use C

kafka-streams ktable recovery after rebalance crash

2017-02-01 Thread Ara Ebrahimi
Hi, My kafka-streams application crashed due to a rebalance event (seems like I need to increase max.poll.interval.ms even more!) and then when I restarted the app I noticed existing rocksdb files were gone and while the rest of the pipeline was processing the part dealing with ktable was sitti

Re: kafka streams consumer partition assignment is uneven

2017-01-09 Thread Ara Ebrahimi
of your three Streams nodes process "more" topics/partitions than the other two nodes. -Michael On Mon, Jan 9, 2017 at 6:52 PM, Ara Ebrahimi wrote: Hi, I have 3 kafka brokers, each with 4 disks. I have 12 partitions.

kafka streams consumer partition assignment is uneven

2017-01-09 Thread Ara Ebrahimi
Hi, I have 3 kafka brokers, each with 4 disks. I have 12 partitions. I have 3 kafka streams nodes. Each is configured to have 4 streaming threads. My topology is quite complex and I have 7 topics and lots of joins and states. What I have noticed is that each of the 3 kafka streams nodes gets co

kafka streams passes org.apache.kafka.streams.kstream.internals.Change to my app!!

2016-12-08 Thread Ara Ebrahimi
Hi, Once in a while and quite randomly this happens, but it does happen every few hundred thousand messages: 2016-12-03 11:48:05 ERROR StreamThread:249 - stream-thread [StreamThread-4] Streams application error during processing: java.lang.ClassCastException: org.apache.kafka.streams.kstream.int

Re: Initializing StateStores takes *really* long for large datasets

2016-11-30 Thread Ara Ebrahimi
+1 on this. Ara. On Nov 30, 2016, at 5:18 AM, Mathieu Fenniak wrote: I'd like to quickly reinforce Frank's opinion regarding the rocksdb memory usage. I was also surprised by the amount of non-JVM-heap memory being used and had to tune the 100 MB default down considerably. It's al

Re: Hang while close() on KafkaStream

2016-11-17 Thread Ara Ebrahimi
This happens for me too. On 10.1.0. Seems like it just sits there waiting in the streamThread.join() call. Ara. On Nov 17, 2016, at 6:02 PM, mordac2k wrote: Hello all, I have a Java application in which I use an instance of KafkaStreams that is working splendidly, under normal circu

Re: kafka streams rocksdb tuning (or memory leak?)

2016-11-16 Thread Ara Ebrahimi
http://packages.confluent.io/maven/ org.apache.kafka kafka-streams 0.10.1.0-cp2 org.apache.kafka kafka-clients 0.10.1.0-cp2 Thanks, Damian On Wed, 16 Nov 2016 at 14:52 Ara Ebrahimi wrote:

kafka streams rocksdb tuning (or memory leak?)

2016-11-16 Thread Ara Ebrahimi
Hi, I have a few KTables in my application. Some of them have unlimited windows. If I leave the application to run for a few hours, I see the java process consume more and more memory, way above the -Xmx limit. I understand this is due to the rocksdb native lib used by kafka streams. What I don

Re: kafka streaming rocks db lock bug?

2016-10-24 Thread Ara Ebrahimi
o investigate this issue further with you. Guozhang On Sun, Oct 23, 2016 at 1:45 PM, Ara Ebrahimi <ara.ebrah...@argyledata.com> wrote: And then this on a different node: 2016-10-23 13:43:57 INFO StreamThread:286 - stream-thread [StreamThread-3] Stream thread shutdown c

Re: kafka streaming rocks db lock bug?

2016-10-23 Thread Ara Ebrahimi
.<init>(ProcessorStateManager.java:98) at org.apache.kafka.streams.processor.internals.AbstractTask.<init>(AbstractTask.java:69) ... 13 more Ara. On Oct 23, 2016, at 1:24 PM, Ara Ebrahimi <ara.ebrah...@argyledata.com> wrote: Hi, This happens when I hammer our 5 kafka streaming nodes (each with 4 str

kafka streaming rocks db lock bug?

2016-10-23 Thread Ara Ebrahimi
Hi, This happens when I hammer our 5 kafka streaming nodes (each with 4 streaming threads) hard enough for an hour or so: 2016-10-23 13:04:17 ERROR StreamThread:324 - stream-thread [StreamThread-2] Failed to flush state for StreamTask 3_8: org.apache.kafka.streams.errors.ProcessorStateException

Re: micro-batching in kafka streams

2016-09-28 Thread Ara Ebrahimi
to write a blog post about step-by-step instructions to leverage this feature for use cases just like yours soon. Guozhang On Wed, Sep 28, 2016 at 2:19 PM, Ara Ebrahimi <ara.ebrah...@argyledata.com> wrote: I need this ReadOnlyKeyValueStore. In my use case, I do an aggregateByKey(

Re: micro-batching in kafka streams

2016-09-28 Thread Ara Ebrahimi
using this feature I'd like to learn more of your error scenario. Guozhang On Tue, Sep 27, 2016 at 9:41 AM, Ara Ebrahimi <ara.ebrah...@argyledata.com> wrote: One more thing: Guozhang pointed me towards this sample for micro-batching: https://github.com/

Re: micro-batching in kafka streams

2016-09-27 Thread Ara Ebrahimi
(or any other such KTable-producing operation) and such periodic triggers. Ara. On Sep 26, 2016, at 10:40 AM, Ara Ebrahimi <ara.ebrah...@argyledata.com> wrote: Hi, So, here’s the situation: - for classic batching of writes to external systems, right now I simply hack it. This sp

Re: micro-batching in kafka streams

2016-09-26 Thread Ara Ebrahimi
scenes should (at least in an ideal world) be of no concern to the user. -Michael On Mon, Sep 5, 2016 at 9:10 PM, Ara Ebrahimi wrote: Hi, What’s the best way to do micro-batching in Kafka Streams? Any plans for

Re: micro-batching in kafka streams

2016-09-12 Thread Ara Ebrahimi
" mechanism. This is not exactly the same as micro-batching, but it also serves to reduce IO costs as well as data traffic: https://cwiki.apache.org/confluence/display/KAFKA/KIP-63%3A+Unify+store+and+downstream+caching+in+streams Let me know if these references are helpfu

Re: Performance issue with KafkaStreams

2016-09-10 Thread Ara Ebrahimi
Hi Eno, Could you elaborate more on tuning Kafka Streaming applications? What are the relationships between partitions and num.stream.threads, num.consumer.fetchers, and other such parameters? On a single node setup with x partitions, what’s the best way to make sure these partitions are consumed

Re: enhancing KStream DSL

2016-09-09 Thread Ara Ebrahimi
tCallCommType().equalsIgnoreCase("VOICE") || callRecord.getCallCommType().equalsIgnoreCase("DATA")) ); This would give you: KStream voiceRecords = branches[0]; KStream dataRecords = branc

enhancing KStream DSL

2016-09-08 Thread Ara Ebrahimi
Let’s say I have this: KStream[] branches = allRecords .branch( (imsi, callRecord) -> "VOICE".equalsIgnoreCase(callRecord.getCallCommType()), (imsi, callRecord) -> "DATA".equalsIgnoreCase(callRecord.getCallCommType()), (imsi, callRecord) -> true ); KS
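The branch() call above routes each record to the first predicate it matches, with the final catch-all predicate collecting everything else. A minimal plain-Java sketch of that first-match routing follows, with no Kafka dependency; the class name and the sample strings standing in for getCallCommType() values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Sketch of first-match routing; not the Kafka Streams implementation.
final class BranchSketch {

    // Each item goes to the output list of the FIRST predicate that accepts
    // it; items matching no predicate are dropped.
    @SafeVarargs
    static List<List<String>> branch(List<String> items, Predicate<String>... predicates) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < predicates.length; i++) {
            out.add(new ArrayList<>());
        }
        for (String item : items) {
            for (int i = 0; i < predicates.length; i++) {
                if (predicates[i].test(item)) {
                    out.get(i).add(item);
                    break; // stop at the first match
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> branches = branch(
                Arrays.asList("voice", "DATA", "SMS", "Voice"),
                t -> "VOICE".equalsIgnoreCase(t),
                t -> "DATA".equalsIgnoreCase(t),
                t -> true);
        // prints [[voice, Voice], [DATA], [SMS]]
        System.out.println(branches);
    }
}
```

Putting the constant first in "VOICE".equalsIgnoreCase(...), as the original snippet does, also avoids a NullPointerException when the comm type is null.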

micro-batching in kafka streams

2016-09-05 Thread Ara Ebrahimi
Hi, What’s the best way to do micro-batching in Kafka Streams? Any plans for a built-in mechanism? Perhaps StateStore could act as the buffer? What exactly are ProcessorContext.schedule()/punctuate() for? They don’t seem to be used anywhere? http://hortonworks.com/blog/apache-storm-design-patt
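One way to picture the buffering idea raised in the question is a small size-bounded batcher, sketched below in plain Java. It is an assumption-laden illustration, not a Kafka Streams feature: MicroBatcherSketch and its methods are hypothetical names, the buffer lives in memory rather than in a StateStore, and the class is not thread-safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical in-memory micro-batcher; not part of Kafka Streams.
final class MicroBatcherSketch<T> {
    private final int maxBatchSize;
    private final Consumer<List<T>> sink;
    private final List<T> buffer = new ArrayList<>();

    MicroBatcherSketch(int maxBatchSize, Consumer<List<T>> sink) {
        if (maxBatchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        this.maxBatchSize = maxBatchSize;
        this.sink = sink;
    }

    // Buffers one record and flushes once the batch is full.
    void add(T record) {
        buffer.add(record);
        if (buffer.size() >= maxBatchSize) {
            flush();
        }
    }

    // Sends a copy of the buffered records to the sink, then clears the buffer.
    // Could also be invoked from a periodic callback to bound latency.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

Usage would look like `new MicroBatcherSketch<String>(100, batch -> writeToExternalSystem(batch))`, where writeToExternalSystem is a placeholder for whatever bulk write the application performs. Records held only in memory are lost on a crash, which is why the question's suggestion of a durable buffer is worth considering.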

kafka streams: join 2 Windowed tables

2016-09-01 Thread Ara Ebrahimi
Hi, Is joining 2 streams based on Windowed keys supposed to work? I have 2 KTables: - KTable, T> events: I process events and aggregate events that have a common criteria using aggregateByKey and UnlimitedWindows as window (for now) - KTable, S> hourlyStats: I calculate some stats using aggreg