For changes that may be backwards incompatible or change the APIs we usually do a short KIP first (e.g., I just did one yesterday: https://cwiki.apache.org/confluence/display/KAFKA/KIP-114%3A+KTable+materialization+and+improved+semantics). It's not meant to be overly burdensome, and it encourages the community to participate in the design. In this case I suspect the KIP can be very short, a paragraph or so.
Thanks
Eno

> On 16 Jan 2017, at 22:52, Nicolas Fouché <nfou...@onfocus.io> wrote:
>
> In the case of KAFKA-4468, it's more about state stores. But still, keys
> would not be backward compatible. What is the "official" policy about this
> kind of change?
>
> 2017-01-16 23:47 GMT+01:00 Nicolas Fouché <nfou...@onfocus.io>:
>
>> Hi Eno,
>> I thought it would be impossible to put this in Kafka because of backward
>> incompatibility with the existing windowed keys, no?
>> In my case, I had to recreate a new output topic, reset the topology, and
>> reprocess all my data.
>>
>> 2017-01-16 23:05 GMT+01:00 Eno Thereska <eno.there...@gmail.com>:
>>
>>> Nicolas,
>>>
>>> I'm checking with Bill, who was originally interested in KAFKA-4468. If he
>>> isn't actively working on it, why don't you give it a go and create a pull
>>> request (PR) for it? That way your contribution is properly acknowledged,
>>> etc. We can help you through with that.
>>>
>>> Thanks
>>> Eno
>>>
>>>> On 16 Jan 2017, at 18:46, Nicolas Fouché <nfou...@onfocus.io> wrote:
>>>>
>>>> My current implementation:
>>>> https://gist.github.com/nfo/eaf350afb5667a3516593da4d48e757a
>>>> I just appended the window `end` at the end of the byte array.
>>>> Comments and suggestions are welcome!
>>>>
>>>> 2017-01-16 15:48 GMT+01:00 Nicolas Fouché <nfou...@onfocus.io>:
>>>>
>>>>> Hi Damian,
>>>>>
>>>>> I recall now that I copied the `WindowedSerde` class [1] from Confluent's
>>>>> examples, which uses the internal `WindowedSerializer` class.
>>>>> Better to write my own Serde then. You're right, I should not rely on
>>>>> internal classes, especially for data written outside Kafka Streams
>>>>> topologies.
>>>>>
>>>>> Thanks for the insights on KAFKA-4468.
>>>>>
>>>>> [1]
>>>>> https://github.com/confluentinc/examples/blob/89db45c6890cf757b8e18565bdf7bc23f119a2ff/kafka-streams/src/main/java/io/confluent/examples/streams/utils/WindowedSerde.java
>>>>>
>>>>> Nicolas.
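[A minimal, hypothetical sketch of the approach described in the message above (the gist itself is not reproduced here): serialize the inner key, then append both the window start and the window end as big-endian longs. Class and method names are illustrative, not taken from the gist or from Kafka itself:]

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative codec: inner key bytes, then 8-byte start, then 8-byte end.
// Windows of different sizes that share a start time no longer collide,
// because their end timestamps differ.
public class WindowedKeyCodec {

    public static byte[] serialize(byte[] innerKey, long start, long end) {
        return ByteBuffer.allocate(innerKey.length + 16)
                .put(innerKey)
                .putLong(start)   // window start, big-endian
                .putLong(end)     // window end, appended at the tail
                .array();
    }

    public static long extractStart(byte[] bytes) {
        return ByteBuffer.wrap(bytes, bytes.length - 16, 8).getLong();
    }

    public static long extractEnd(byte[] bytes) {
        return ByteBuffer.wrap(bytes, bytes.length - 8, 8).getLong();
    }

    public static void main(String[] args) {
        byte[] key = "page-42".getBytes(StandardCharsets.UTF_8);
        // A 1-day and a 1-week window starting at the same instant now
        // produce distinct keys.
        byte[] daily  = serialize(key, 0L, 86_400_000L);
        byte[] weekly = serialize(key, 0L, 604_800_000L);
        System.out.println(extractEnd(daily));                        // 86400000
        System.out.println(extractEnd(weekly));                       // 604800000
        System.out.println(java.util.Arrays.equals(daily, weekly));   // false
    }
}
```

[The tail placement of the two longs keeps the inner key a plain prefix, so range scans over the inner key still work; that design choice is this sketch's, not necessarily the gist's.]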
>>>>>
>>>>> 2017-01-16 12:31 GMT+01:00 Damian Guy <damian....@gmail.com>:
>>>>>
>>>>>> Hi Nicolas,
>>>>>>
>>>>>> I guess you are using the Processor API for your topology? The
>>>>>> `WindowedSerializer` is an internal class that is used as part of the
>>>>>> DSL. In the DSL a topic will be created for each window operation, so
>>>>>> we don't need the end time, as it can be calculated from the window
>>>>>> size. However, there is an open JIRA for this:
>>>>>> https://issues.apache.org/jira/browse/KAFKA-4468
>>>>>>
>>>>>> Thanks,
>>>>>> Damian
>>>>>>
>>>>>> On Mon, 16 Jan 2017 at 11:18 Nicolas Fouché <nfou...@onfocus.io> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> In the same topology, I generate aggregates with 1-day windows and
>>>>>>> 1-week windows and write them to one single topic. On Mondays, these
>>>>>>> windows have the same start time. The effect: these aggregates
>>>>>>> override each other.
>>>>>>>
>>>>>>> That happens because `WindowedSerializer` [1] only serializes the
>>>>>>> window start time. I'm a bit surprised; a window has by definition a
>>>>>>> start and an end. I suppose someone wanted to save on key sizes?
>>>>>>> And/or one would consider that topics should not contain aggregates
>>>>>>> with different granularities?
>>>>>>>
>>>>>>> I have two choices then: either create as many output topics as I
>>>>>>> have granularities, or create my own serializer which also includes
>>>>>>> the window end time. What would the community recommend?
>>>>>>>
>>>>>>> Getting back to the core problem:
>>>>>>> I could understand that it's not "right" to store different
>>>>>>> granularities in one topic, and I thought it would save resources
>>>>>>> (fewer topics to manage by Kafka). But I'm really not sure about this
>>>>>>> default serializer: it does not serialize all instance variables of
>>>>>>> the `Window` class, and more generally does not comply with the
>>>>>>> definition of a window.
>>>>>>>
>>>>>>> [1]
>>>>>>> https://github.com/apache/kafka/blob/0.10.1/streams/src/main/java/org/apache/kafka/streams/kstream/internals/WindowedSerializer.java
>>>>>>>
>>>>>>> Thanks.
>>>>>>> Nicolas
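[To make the overriding behavior concrete, here is a small demo of a start-only key layout in the spirit of the `WindowedSerializer` referenced above. This is a simplified sketch, not the actual Kafka code: two windows of different lengths that both open on a Monday midnight serialize to byte-identical keys, so the later aggregate overwrites the earlier one in the output topic's keyed view:]

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Demo of the collision: a key format that encodes only the inner key
// plus the window START time, ignoring the window end.
public class StartOnlyKeyDemo {

    static byte[] startOnlyKey(byte[] innerKey, long windowStart) {
        return ByteBuffer.allocate(innerKey.length + 8)
                .put(innerKey)
                .putLong(windowStart) // no end time -> window size is lost
                .array();
    }

    public static void main(String[] args) {
        long mondayMidnight = 1_484_524_800_000L; // 2017-01-16 00:00:00 UTC
        // Same inner key, same start: a 1-day and a 1-week window are
        // indistinguishable once serialized this way.
        byte[] dailyKey  = startOnlyKey("page-42".getBytes(), mondayMidnight);
        byte[] weeklyKey = startOnlyKey("page-42".getBytes(), mondayMidnight);
        System.out.println(Arrays.equals(dailyKey, weeklyKey)); // true
    }
}
```

[Identical keys mean the two aggregates compete for the same slot, which is exactly the override Nicolas observed on Mondays.]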