Re: KStreams State Store - state.dir does not have .checkpoint file

2022-06-01 Thread Steven Schlansker
> On Jun 1, 2022, at 5:27 AM, Neeraj Vaidya > wrote: > > Thanks John ! > It seems if I send a TERM signal to my KStreams application which is running > inside a Docker container, then it results in a Clean shutdown. > This also then creates a checkpoint file successfully. > So, I guess I need

ConsumerRecord / ProducerRecord common interface?

2021-10-22 Thread Steven Schlansker
Hello Kafka friends, We are writing test code that sometimes will talk to a MockProducer and sometimes to a normal KafkaProducer. Therefore the test harness will sometimes read the MockProducer-produced records (ProducerRecord) directly, and sometimes will read actual ConsumerRecords through a rea

Re: Consulting ReadOnlyKeyValueStore from Processor can lead to deadlock

2018-05-10 Thread Steven Schlansker
> On May 10, 2018, at 10:48 AM, Steven Schlansker > wrote: > > But it still remains -- when you go an read that ROKVS documentation, it sure > doesn't prepare you to this possibility! And, it's a little frustrating that > we have to have this 'caching'

Consulting ReadOnlyKeyValueStore from Processor can lead to deadlock

2018-05-10 Thread Steven Schlansker
Hello again fellow Kafkans, Yesterday we observed a production deadlock take down one of our instances. Upon digging, it's clear that our usage of Kafka is the proximate cause, but the danger of our approach is not clear at all just from the Javadocs. We have stream processors that read off an in

Re: [External] Topic/Partition Assignment in streams application cluster

2018-01-08 Thread Steven Schlansker
For what it's worth, we run 32 partitions per topic and have also observed imbalanced balancing, where a large number of A partitions are assigned to worker 1 and a large number of B partitions are assigned to worker 2, leading to imbalanced load. Nothing super bad for us yet but the effect is not

Re: Consumer service that supports retry with exponential backoff

2017-10-09 Thread Steven Schlansker
> On Oct 9, 2017, at 2:41 PM, John Walker wrote: > > I have a pair of services. One dispatches commands to the other for > processing. > > My consumer sometimes fails to execute commands as a result of transient > errors. To deal with this, commands are retried after an exponentially > increa

Re: Kafka Streams: Problems implementing rate limiting processing step

2017-06-19 Thread Steven Schlansker
> On Jun 19, 2017, at 2:02 PM, Andre Eriksson wrote: > > I then tried implementing my own scheduling that periodically sends/clears > out messages using the ProcessorContext provided to the aforementioned > transform step. However, it seems that when I call forward() from my > scheduler (i.e.

Re: Reliably implementing global KeyValueStore#get

2017-06-07 Thread Steven Schlansker
issue but writing a >> custom sink is not to hard. >> >> Best Jan >> >> >> On 07.06.2017 23:47, Steven Schlansker wrote: >>> I was actually considering writing my own KeyValueStore backed >>> by e.g. a Postgres or the like. >>> >>&g

Re: Reliably implementing global KeyValueStore#get

2017-06-07 Thread Steven Schlansker
sing connect to put data into a store that is more > reasonable for your kind of query requirements? > > Best Jan > > On 07.06.2017 00:29, Steven Schlansker wrote: >>> On Jun 6, 2017, at 2:52 PM, Damian Guy wrote: >>> >>> Steven, >>> >>

Re: Finding StreamsMetadata with value-dependent partitioning

2017-06-07 Thread Steven Schlansker
state it seems this map is small enough so > maybe not worth the repartitioning. > > > Guozhang > > > > > > > On Tue, Jun 6, 2017 at 8:36 AM, Michael Noll wrote: > >> Happy to hear you found a working solution, Steven! >> >> -Michael >>

Re: Reliably implementing global KeyValueStore#get

2017-06-06 Thread Steven Schlansker
ception only fires *during a migration* not *after a migration that may have invalidated your metadata lookup completes* > > HTH, > Damian > > On Tue, 6 Jun 2017 at 18:11 Steven Schlansker > wrote: > >> >>> On Jun 6, 2017, at 6:16 AM, Eno Thereska wrote: >>

Re: Reliably implementing global KeyValueStore#get

2017-06-06 Thread Steven Schlansker
sufficient, as querying different all workers at different times in the presence of migrating data can still in theory miss it given pessimal execution. I'm sure I've long wandered off into the hypothetical, but I dream of some day being cool like Jepsen :) > Eno > > >

Reliably implementing global KeyValueStore#get

2017-06-05 Thread Steven Schlansker
Hi everyone, me again :) I'm still trying to implement my "remoting" layer that allows my clients to see the partitioned Kafka Streams state regardless of which instance they hit. Roughly, my lookup is: Message get(Key key) { RemoteInstance instance = selectPartition(key); return instanc

Re: Finding StreamsMetadata with value-dependent partitioning

2017-06-02 Thread Steven Schlansker
; > Thus, your approach to get all metadata is the only way you can go. Thanks for confirming this. The code is a little ugly but I've done worse :) > > > Very interesting (and quite special) use case. :) > > > -Matthias > > On 6/2/17 2:32 PM, Steven Schlans

Re: Suppressing intermediate topics feeding (Global)KTable

2017-06-02 Thread Steven Schlansker
ot; the information -- and Kafka Streams applications > use topics to exchange data. Thus, we need a topic anyhow. > > Does this make sense? > > So your overall architecture seems to be sound to me. > > > -Matthias > > > On 6/2/17 2:37 PM, Steven Schlansker wro

Re: Suppressing intermediate topics feeding (Global)KTable

2017-06-02 Thread Steven Schlansker
.mapValues(v -> v == null ? null : v.getResolvedDestination().toString()) .to(Serdes.String(), Serdes.String(), DEST_INDEX); builder.globalTable(Serdes.String(), Serdes.String(), DEST_INDEX, DEST_INDEX); > > -Matthias > > > On 6/2/17 12:28 PM, Ste

Re: Finding StreamsMetadata with value-dependent partitioning

2017-06-02 Thread Steven Schlansker
efully that explains my situation a bit more? Thanks! > > -Matthias > > > > On 6/2/17 10:34 AM, Steven Schlansker wrote: >> I have a KTable and backing store whose partitioning is value dependent. >> I want certain groups of messages to be ordered and that group

Suppressing intermediate topics feeding (Global)KTable

2017-06-02 Thread Steven Schlansker
Hi everyone, another question for the list :) I'm creating a cluster of KTable (and GlobalKTable) based off the same input stream K,V. It has a number of secondary indices (think like a RDBMS) K1 -> K K2 -> K etc These are all based off of trivial mappings from my main stream that also feeds the

Finding StreamsMetadata with value-dependent partitioning

2017-06-02 Thread Steven Schlansker
I have a KTable and backing store whose partitioning is value dependent. I want certain groups of messages to be ordered and that grouping is determined by one field (D) of the (possibly large) value. When I lookup by only K, obviously you don't know the partition it should be on. So I will build

Re: [DISCUSS] KIP-156 Add option "dry run" to Streams application reset tool

2017-05-08 Thread Steven Schlansker
> On May 8, 2017, at 11:14 AM, BigData dev wrote: > > Hi All, > I want to start a discussion on this simple KIP for Kafka Streams reset > tool (kafka-streams-application-reset.sh). > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=69410150 I've not used this tool, but if I were

disable KTable caching?

2017-05-03 Thread Steven Schlansker
I'm designing a Streams application that provides an API that acts on messages. Messages have a sender. I have a KStream and a KTable The first time a message is sent, you need to ensure the sender exists beforehand. Roughly, void send(Message m) { if (senderTable.get(m.getSenderId())) {

Re: How to implement use case

2017-04-27 Thread Steven Schlansker
> On Apr 27, 2017, at 3:25 AM, Vladimir Lalovic wrote: > > Hi all, > > > > Our system is about ride reservations and acts as broker between customers > and drivers. > ... > Most of our rules are function of time and some reservation’s property > (e.g. check if there are any reservations whe

Re: [VOTE] 0.10.2.1 RC0

2017-04-11 Thread Steven Schlansker
> On Apr 7, 2017, at 5:12 PM, Gwen Shapira wrote: > > Hello Kafka users, developers and client-developers, > > This is the first candidate for the release of Apache Kafka 0.10.2.1. This > is a bug fix release and it includes fixes and improvements from 24 JIRAs > (including a few critical bugs)

Kafka Streams and reliable state stores

2017-03-23 Thread Steven Schlansker
Hello everyone, I am looking to enhance my Kafka Streams based application from one instance to many. Part of the difficulty is the it seems that all of the state providers are "instance local", either in memory or on local disk. This means to answer queries for non-local partitions you have to

Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API

2017-03-13 Thread Steven Schlansker
> On Mar 13, 2017, at 12:30 PM, Matthias J. Sax wrote: > > Jay, > > thanks for your feedback > >> What if instead we called it KStreamsBuilder? > > That's the current name and I personally think it's not the best one. > The main reason why I don't like KStreamsBuilder is, that we have the > c

Re: Chatty StreamThread commit messages

2017-03-01 Thread Steven Schlansker
r 2017 at 07:15, Guozhang Wang wrote: >> >>> Hey Steven, >>> >>> That is a good question, and I think your proposal makes sense. Could you >>> file a JIRA for this change to keep track of it? >>> >>> Guozhang >>> >>> On T

Re: Kafka Streams vs Spark Streaming

2017-02-28 Thread Steven Schlansker
y than a newly created tasks that needs to reply the changelog to > rebuild the state first. > > > > -Matthias > > On 2/28/17 8:17 AM, Steven Schlansker wrote: >> >>> On Feb 28, 2017, at 12:17 AM, Michael Noll wrote: >>> >>> Sachin, >>> >

Chatty StreamThread commit messages

2017-02-28 Thread Steven Schlansker
Hi everyone, running with Kafka Streams 0.10.2.0, I see this every commit interval: 2017-02-28T21:27:16.659Z INFO <> [StreamThread-1] o.a.k.s.p.internals.StreamThread - stream-thread [StreamThread-1] Committing task StreamTask 1_31 2017-02-28T21:27:16.659Z INFO <> [StreamThread-1] o.a.k.s.p.in

Re: Kafka Streams vs Spark Streaming

2017-02-28 Thread Steven Schlansker
> On Feb 28, 2017, at 12:17 AM, Michael Noll wrote: > > Sachin, > > disabling (change)logging for state stores disables the fault-tolerance of > the state store -- i.e. changes to the state store will not be backed up to > Kafka, regardless of whether the store uses a RocksDB store, an in-memor

Re: KIP-121 [VOTE]: Add KStream peek method

2017-02-15 Thread Steven Schlansker
Oops, sorry, a number of votes were sent only to -dev and not to -user and so I missed those in the email I just sent. The actual count is more like +8 > On Feb 15, 2017, at 12:24 PM, Steven Schlansker > wrote: > > From reading the bylaws it's not entirely clear who close

Re: KIP-121 [VOTE]: Add KStream peek method

2017-02-15 Thread Steven Schlansker
rtly :) > On Feb 14, 2017, at 3:36 PM, Zakee wrote: > > +1 > > -Zakee >> On Feb 14, 2017, at 1:56 PM, Jay Kreps wrote: >> >> +1 >> >> Nice improvement. >> >> -Jay >> >> On Tue, Feb 14, 2017 at 1:22 PM, Steven Schlansker &

Re: KIP-121 [VOTE]: Add KStream peek method

2017-02-14 Thread Steven Schlansker
ra wrote: >> +1 (binding) >> >> On Wed, Feb 8, 2017 at 4:45 PM, Steven Schlansker >> wrote: >>> Hi everyone, >>> >>> Thank you for constructive feedback on KIP-121, >>> KStream.peek(ForeachAction) ; >>> it seems like it is ti

Re: Partitioning behavior of Kafka Streams without explicit StreamPartitioner

2017-02-10 Thread Steven Schlansker
aborator to feed it data over Kafka (yet). I get that I can use the normal Producer with the partitioner below, but I consider the code a little ugly and probably could be improved. Is the lack of a Kafka Streams level produce intentional? Am I thinking about the problem wrong? > Mathieu > >

Re: Partitioning behavior of Kafka Streams without explicit StreamPartitioner

2017-02-10 Thread Steven Schlansker
fy a > custom StreamPartitioner that will be called by Streams (not the > producer) to compute the partition before serializing the data. For this > case, the partition (to write the data into) is given to the producer > directly and the producer does not call it's own partitioner. > &g

Partitioning behavior of Kafka Streams without explicit StreamPartitioner

2017-02-09 Thread Steven Schlansker
Hi, I discovered what I consider to be really confusing behavior -- wondering if this is by design or a bug. The Kafka Partitioner interface: public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster); has both "Object value" and "byte[] va

KIP-121 [VOTE]: Add KStream peek method

2017-02-08 Thread Steven Schlansker
Hi everyone, Thank you for constructive feedback on KIP-121, KStream.peek(ForeachAction) ; it seems like it is time to call a vote which I hope will pass easily :) https://cwiki.apache.org/confluence/display/KAFKA/KIP-121%3A+Add+KStream+peek+method I believe the PR attached is already in good sh

Re: KIP-121 [Discuss]: Add KStream peek method

2017-02-08 Thread Steven Schlansker
just be a >>> special impl of `peek()` then, like we did for `count` as for `aggregate`? >>> I.e. we can replace the `KeyValuePrinter` class with an internal ForEach >>> impl within `peek()`. >>> >>> >>> Guozhang >>> >>> &g

Re: KIP-121 [Discuss]: Add KStream peek method

2017-02-07 Thread Steven Schlansker
wrote: >>>>> Steven, >>>>> >>>>> Thanks for your KIP. I move this discussion to dev mailing list -- >> KIPs >>>>> need to be discussed there (and can be cc'ed to user list). >>>>> >>>>> Can you also

KIP-121 [Discuss]: Add KStream peek method

2017-02-06 Thread Steven Schlansker
Hello users@kafka, I would like to propose a small KIP on the Streams framework that simply adds a KStream#peek implementation. https://cwiki.apache.org/confluence/display/KAFKA/KIP-121%3A+Add+KStream+peek+method https://issues.apache.org/jira/browse/KAFKA-4720 https://github.com/apache/kafka/pul

Re: Anyone using log4j Appender for Kafka?

2015-02-22 Thread Steven Schlansker
015, at 11:26 PM, anthony musyoki wrote: > Theres also another one here. > > https://github.com/danielwegener/logback-kafka-appender. > > It has a fallback appender which might address the issue of Kafka being > un-available. > > > On Mon, Feb 23, 2015 at 9:45 AM,

Re: Anyone using log4j Appender for Kafka?

2015-02-22 Thread Steven Schlansker
Here’s my attempt at a Logback version, should be fairly easily ported: https://github.com/opentable/otj-logging/blob/master/kafka/src/main/java/com/opentable/logging/KafkaAppender.java On Feb 22, 2015, at 1:36 PM, Scott Chapman wrote: > I am just starting to use it and could use a little guidan

Re: No longer supporting Java 6, if? when?

2014-11-06 Thread Steven Schlansker
Java 6 has been End of Life since Feb 2013. Java 7 (and 8, but unfortunately that's too new still) has very compelling features which can make development a lot easier. The sooner more projects drop Java 6 the better, in my opinion :) On Nov 5, 2014, at 7:45 PM, Worthy LaFollette wrote: > Mostl

Re: kafka java api (written in 100% clojure)

2014-10-13 Thread Steven Schlansker
Couple of mostly-uninformed comments inline, On Oct 13, 2014, at 2:00 AM, Gerrit Jansen van Vuuren wrote: > Hi Daniel, > > At the moment redis is a spof in the architecture, but you can setup > replication and I'm seriously looking into using redis cluster to eliminate > this. > Some docs t

Re: [DISCUSS] Kafka Security Specific Features

2014-06-06 Thread Steven Schlansker
Hi, I’m glad there’s so much thought into getting security right! But as a user of Kafka who doesn’t need Enterprise Security sort of features, I would ask whether doing such a large project built into Kafka is the appropriate use of developer time at this point in its lifecycle. For example, eve

Re: Compression in Kafka: GZIP or Snappy

2014-05-16 Thread Steven Schlansker
On May 7, 2014, at 7:16 AM, Maung Than wrote: > Hi All, > > I have read this posting from linkedIn Team member; > http://geekmantra.wordpress.com/2013/03/28/compression-in-kafka-gzip-or-snappy/ > ; Thanks. > > I have few questions and thoughts: > > 4) Has any one else done Snappy Vs. GZ