Re: Kafka Streams punctuate with slow-changing KTables

2017-02-02 Thread Elliot Crosby-McCullough
Sorry I left out too much context there. The current plan is to take a raw stream of events as a source and split them into a stream of facts and a table (or tables) of dimensions. Because this is denormalising the data, we only want one copy of each dimension entry despite the original events re

Re: Kafka Streams punctuate with slow-changing KTables

2017-02-02 Thread Damian Guy
Hi Elliot, With GlobalKTables your processor wouldn't be able to write directly to the table - they are read-only from all threads except the thread that keeps them up-to-date. You could potentially write the dimension data to the topic that is the source of the GlobalKTable, but there is no guara
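The read-only GlobalKTable pattern Damian describes can be sketched as below. This is a minimal topology sketch, not a runnable program (it needs a running Kafka cluster and serde configuration), and it uses the later `StreamsBuilder` API rather than the `KStreamBuilder` of the 0.10.x era; the topic names `dimensions`, `raw-events`, and `enriched-facts` are hypothetical:

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;

public class DimensionJoinSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // The GlobalKTable is read-only from the processing threads; the only
        // way to change it is to produce to its backing topic ("dimensions").
        GlobalKTable<String, String> dimensions = builder.globalTable("dimensions");

        KStream<String, String> facts = builder.stream("raw-events");

        // Enrich facts against the dimension table. As the thread notes, there
        // is no ordering guarantee between a write to "dimensions" and a later
        // fact that expects to see it.
        facts.join(dimensions,
                   (factKey, factValue) -> factKey,               // map fact to dimension key
                   (factValue, dimValue) -> factValue + "|" + dimValue)
             .to("enriched-facts");
    }
}
```

The topology would still need to be wrapped in a `KafkaStreams` instance and started; that boilerplate is omitted here.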

Re: Kafka Streams punctuate with slow-changing KTables

2017-02-02 Thread Elliot Crosby-McCullough
Right. Maybe it's best to use some kind of idempotent foreign key then, or at least a small in-thread cache. Thanks for the info. On 2 February 2017 at 09:46, Damian Guy wrote: > Hi Elliot, > > With GlobalKTables your processor wouldn't be able to write directly to the > table - they are read-

Kafka Streams delivery semantics and state store

2017-02-02 Thread Krzysztof Lesniewski, Nexiot AG
Hello, In multiple sources I read that Kafka Streams has at-least-once delivery semantics, meaning that in case of failure, a message can be processed more than once, but it will not be lost. It is achieved by committing the offset only after the message processing is completely finished and all
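The commit-after-processing pattern behind at-least-once delivery can be illustrated with a plain consumer loop. This is a sketch requiring a running broker, with hypothetical topic and group names; Kafka Streams does the equivalent internally on its commit interval:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class AtLeastOnceLoop {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "demo-group");
        props.put("enable.auto.commit", "false"); // commit manually, after processing
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("input"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(100);
                for (ConsumerRecord<String, String> record : records) {
                    handle(record); // a crash here is safe: the offset is not yet
                                    // committed, so the record is re-delivered
                }
                consumer.commitSync(); // commit only after the batch is fully processed
            }
        }
    }

    static void handle(ConsumerRecord<String, String> record) { /* side effects here */ }
}
```

A crash between `handle` and `commitSync` means the batch is processed again after restart, which is exactly the "more than once, but never lost" guarantee described above.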

Streams InvalidStateStoreException: Store ... is currently closed

2017-02-02 Thread Mathieu Fenniak
Hey all, When an instance of a streams Processor is closed, is it supposed to call close() on any state stores that it retrieved from the ProcessorContext in its own close()? I started following the pattern of having every Processor close every state store based upon this documentation's example

Re: Streams InvalidStateStoreException: Store ... is currently closed

2017-02-02 Thread Damian Guy
Hi Mathieu, You shouldn't close the stores in your custom processors. They are closed automatically by the framework during rebalances and shutdown. There is a good chance that your closing of the stores is causing the issue. Of course if you see the exception again then please report back so we ca
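A processor following this advice leaves its `close()` empty with respect to state stores. The sketch below uses the pre-2.7 `Processor<K, V>` API that was current at the time of this thread; the store name `counts-store` is hypothetical:

```java
import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;

public class CountingProcessor implements Processor<String, String> {
    private KeyValueStore<String, Long> counts;

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        // The store was registered with the topology; we only look it up here.
        counts = (KeyValueStore<String, Long>) context.getStateStore("counts-store");
    }

    @Override
    public void process(String key, String value) {
        Long current = counts.get(key);
        counts.put(key, current == null ? 1L : current + 1);
    }

    @Override
    public void punctuate(long timestamp) { }

    @Override
    public void close() {
        // Deliberately does NOT call counts.close(): the framework owns the
        // store's lifecycle and closes it during rebalances and shutdown.
        // Closing it here can surface as InvalidStateStoreException later.
    }
}
```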

Re: Streams InvalidStateStoreException: Store ... is currently closed

2017-02-02 Thread Mathieu Fenniak
Thanks for the quick response Damian. I'll update my processors and retest. 👍 On Thu, Feb 2, 2017 at 9:27 AM, Damian Guy wrote: > Hi Mathieu, > You shouldn't close the stores in your custom processors. They are closed > automatically by the framework during rebalances and shutdown. > There is a

Can KafkaStreams be configured to use multiple broker clusters in the same topology?

2017-02-02 Thread Roger Vandusen
We would like to source topics from one cluster and sink them to a different cluster from the same topology. If this is not currently supported then is there a KIP/JIRA to track work to support this in the future? 0.10.2.0? -Roger

Re: Can KafkaStreams be configured to use multiple broker clusters in the same topology?

2017-02-02 Thread Damian Guy
Hi Roger, This is not currently supported and won't be available in 0.10.2.0. This has been discussed, but it doesn't look like there is a JIRA for it yet. Thanks, Damian On Thu, 2 Feb 2017 at 16:51 Roger Vandusen wrote: > We would like to source topics from one cluster and sink them to a > differe

Re: Can KafkaStreams be configured to use multiple broker clusters in the same topology?

2017-02-02 Thread Roger Vandusen
Thanks for the quick reply Damian. So the work-around would be to configure our source topologies with a processor component that would use another app component as a stand-alone KafkaProducer, let’s say an injected Spring bean, configured to the other (sink) cluster, and then publish sink topi

Re: Can KafkaStreams be configured to use multiple broker clusters in the same topology?

2017-02-02 Thread Damian Guy
Hi, yes you could attach a custom processor that writes to another Kafka cluster. The problem is going to be guaranteeing at least once delivery without impacting throughput. To guarantee at least once you would need to do a blocking send on every call to process, i.e., producer.send(..).get(), thi
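The blocking-send work-around Damian describes can be sketched as a custom processor that owns a second producer configured against the sink cluster. This is a sketch, not a vetted implementation: it needs two reachable clusters, uses the pre-2.7 `Processor<K, V>` API of the era, and the topic name `sink-topic` is hypothetical:

```java
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;

public class CrossClusterForwarder implements Processor<String, String> {
    // Configured with the bootstrap.servers of the *other* (sink) cluster.
    private final KafkaProducer<String, String> sinkProducer;

    public CrossClusterForwarder(KafkaProducer<String, String> sinkProducer) {
        this.sinkProducer = sinkProducer;
    }

    @Override
    public void init(ProcessorContext context) { }

    @Override
    public void process(String key, String value) {
        try {
            // Blocking send: process() does not return until the sink cluster
            // has acked, so the source offset can only be committed after the
            // record is safely across. This is what caps throughput.
            sinkProducer.send(new ProducerRecord<>("sink-topic", key, value)).get();
        } catch (InterruptedException | ExecutionException e) {
            // Fail the task so the record is retried after restart.
            throw new RuntimeException("Forwarding to sink cluster failed", e);
        }
    }

    @Override
    public void punctuate(long timestamp) { }

    @Override
    public void close() {
        sinkProducer.close();
    }
}
```

Replacing the `.get()` with an async send would restore throughput but, as Damian explains further down the thread, would break the at-least-once guarantee, since offsets could be committed before the sink acked.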

Re: Can KafkaStreams be configured to use multiple broker clusters in the same topology?

2017-02-02 Thread Roger Vandusen
Very helpful advice, thanks again Damian. On 2/2/17, 10:35 AM, "Damian Guy" wrote: Hi, yes you could attach a custom processor that writes to another Kafka cluster. The problem is going to be guaranteeing at least once delivery without impacting throughput. To guarantee at least once

Re: Can KafkaStreams be configured to use multiple broker clusters in the same topology?

2017-02-02 Thread Roger Vandusen
Damian, We could lessen the producer.send(..).get() impact on throughput by simply handing it off to another async worker component in our Spring Boot app, any feedback on that? -Roger On 2/2/17, 10:35 AM, "Damian Guy" wrote: Hi, yes you could attach a custom processor that writes to ano

Re: Kafka Streams delivery semantics and state store

2017-02-02 Thread Matthias J. Sax
Hi, About message acks: writes will be acked, however asynchronously (not synchronously as you describe it). Only before an actual commit is KafkaProducer#flush called, and all not-yet-received acks are collected (i.e., blocking/sync) before the commit is done. About state guarantees: there are none -- state might b

Re: Can KafkaStreams be configured to use multiple broker clusters in the same topology?

2017-02-02 Thread Damian Guy
Hi Roger, The problem is that you can't do it async and still guarantee at-least-once delivery. For example: if your streams app looked something like this: builder.stream("input").mapValues(...).process(yourCustomProcessorSupplier); On the commit interval, Kafka Streams will commit the consumed

Re: Can KafkaStreams be configured to use multiple broker clusters in the same topology?

2017-02-02 Thread Roger Vandusen
Ah, yes, I see your point and use case, thanks for the feedback. On 2/2/17, 11:02 AM, "Damian Guy" wrote: Hi Roger, The problem is that you can't do it async and still guarantee at-least-once delivery. For example: if your streams app looked something like this: bui

Re: Can KafkaStreams be configured to use multiple broker clusters in the same topology?

2017-02-02 Thread Jan Filipiak
Sometimes I wake up cause I dreamed that this had gone down: https://cwiki.apache.org/confluence/display/KAFKA/Hierarchical+Topics On 02.02.2017 19:07, Roger Vandusen wrote: Ah, yes, I see your point and use case, thanks for the feedback. On 2/2/17, 11:02 AM, "Damian Guy" wrote: Hi R

Re: kafka-consumer-offset-checker complaining about NoNode for X in zk

2017-02-02 Thread Jan Filipiak
Hi, if it's a Kafka Streams app, it's most likely going to store its offsets in Kafka rather than ZooKeeper. You can use the --new-consumer option to check for Kafka-stored offsets. Best Jan On 01.02.2017 21:14, Ara Ebrahimi wrote: Hi, For a subset of our topics we get this error: $KAFKA_HO

Re: kafka-consumer-offset-checker complaining about NoNode for X in zk

2017-02-02 Thread Jan Filipiak
Hi, sorry, use the consumer group tool instead of the offset checker. On 02.02.2017 20:08, Jan Filipiak wrote: Hi, if its a kafka stream app, its most likely going to store its offsets in kafka rather than zookeeper. You can use the --new-consumer option to check for kafka stored offs
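Concretely, the consumer group tool invocation being suggested looks like the fragment below (appropriate for the 0.10.x tools of this thread; `--new-consumer` was later deprecated and removed). The broker address and group name are placeholders; for a Kafka Streams app, the group name is its `application.id`:

```shell
# Kafka-committed (new consumer) offsets are not stored in ZooKeeper, so the
# old kafka-consumer-offset-checker reports NoNode. Describe the group instead:
bin/kafka-consumer-groups.sh --new-consumer \
  --bootstrap-server localhost:9092 \
  --describe --group my-streams-app
```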

Re: Kafka Streams delivery semantics and state store

2017-02-02 Thread Krzysztof Lesniewski, Nexiot AG
Thank you Matthias for your answer. Of course, wherever it is possible I will use idempotent updates, but unfortunately it does not apply to all my cases. I thought before about the alternative to idempotent updates you have proposed, but I think it carries a risk of breaking at-least-once de

Re: Kafka Streams delivery semantics and state store

2017-02-02 Thread Eno Thereska
Hi Krzysztof, There are several scenarios where you want a set of records to be sent atomically (to a state store, downstream topics etc). In case of failure then, either all of them commit successfully, or none does. We are working to add exactly-once processing to Kafka Streams and I suspect y

Will KafkaStreams 0.10.2.0 work with 0.10.0.0 Kafka broker cluster

2017-02-02 Thread Roger Vandusen

Re: kafka-consumer-offset-checker complaining about NoNode for X in zk

2017-02-02 Thread Ara Ebrahimi
Thanks! Worked like a charm. For some partitions, which do have data, I see “unknown” reported as offset. Any idea what unknown means? Also, what’s the new command for setting offsets? Specifically, to move it back to offset 0, and also to move it to the end. Ara. > On Feb 2, 2017, at 11:21 AM, Jan
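For context: "unknown" in the group tool's output typically indicates that the group has no committed offset for that partition yet. As for resetting offsets, the tools of this era had no built-in reset command (that arrived later as `--reset-offsets`), so one work-around is a small program that seeks and commits under the group's id. This is a sketch with hypothetical broker, group, topic, and partition values:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ResetGroupOffset {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "my-group"); // the group whose offset we want to move
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("my-topic", 0);
            consumer.assign(Collections.singletonList(tp)); // assign, not subscribe
            consumer.seekToBeginning(Collections.singletonList(tp)); // back to the start
            // or: consumer.seekToEnd(Collections.singletonList(tp)); // jump to latest
            long position = consumer.position(tp); // forces the lazy seek to resolve
            consumer.commitSync(Collections.singletonMap(tp, new OffsetAndMetadata(position)));
        }
    }
}
```

The group's consumers must be stopped while this runs, otherwise the live group members will overwrite the committed offset on their next commit.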

Re: Will KafkaStreams 0.10.2.0 work with 0.10.0.0 Kafka broker cluster

2017-02-02 Thread Ismael Juma
KafkaStreams 0.10.2.0 will work with 0.10.1.0 brokers (but not 0.10.0.0 brokers) because it requires the Create Topics request, which was only added in 0.10.1.0. Consumers, producers and connect with version 0.10.2.0 will work fine with 0.10.0.0 brokers. On Thu, Feb 2, 2017 at 9:09 PM, Roger Vand

Re: Kafka Streams delivery semantics and state store

2017-02-02 Thread Matthias J. Sax
Your assumption is not completely correct. After a crash and state store restore, the store will contain exactly the same data as written to the underlying changelog. Thus, if your update was buffered but never sent, the store will not contain the update after restore and thus the record will not b

Surveying for Applications of Kafka

2017-02-02 Thread Collin Lee
Hi Everyone, My name is Collin Lee and I'm a PhD student in Computer Science at Stanford University doing some research on notification systems. I'm studying how applications are structured around systems like Kafka and so I'd love to know how you all use Kafka. If you are up for it, I'd lov

KAFKA-3703: Graceful close for consumers and producer with acks=0

2017-02-02 Thread Pascu, Ciprian (Nokia - FI/Espoo)
Hi, Can anyone tell me in which release this fix will be present? https://github.com/apache/kafka/pull/1836 It is not present in the current release (0.10.1.1), which I don't quite understand, because it has been committed in November last year to the trunk. To which branch the 0.10.1.1 tag

Re: KAFKA-3703: Graceful close for consumers and producer with acks=0

2017-02-02 Thread Manikumar
It is fixed on trunk and will be part of upcoming 0.10.2.0 release. On Fri, Feb 3, 2017 at 10:58 AM, Pascu, Ciprian (Nokia - FI/Espoo) < ciprian.pa...@nokia.com> wrote: > Hi, > > Can anyone tell me in which release this fix will be present? > > > https://github.com/apache/kafka/pull/1836 > > > I