Hi Eno, I thought it would be impossible to put this in Kafka because of
backward incompatibility with the existing windowed keys, no? In my case, I
had to create a new output topic, reset the topology, and reprocess all my
data.
2017-01-16 23:05 GMT+01:00 Eno Thereska <eno.there...@gmail.com>:

> Nicolas,
>
> I'm checking with Bill, who was originally interested in KAFKA-4468. If he
> isn't actively working on it, why don't you give it a go and create a pull
> request (PR) for it? That way your contribution is properly acknowledged,
> etc. We can help you through with that.
>
> Thanks
> Eno
>
> On 16 Jan 2017, at 18:46, Nicolas Fouché <nfou...@onfocus.io> wrote:
> >
> > My current implementation:
> > https://gist.github.com/nfo/eaf350afb5667a3516593da4d48e757a . I just
> > appended the window `end` at the end of the byte array.
> > Comments and suggestions are welcome!
> >
> > 2017-01-16 15:48 GMT+01:00 Nicolas Fouché <nfou...@onfocus.io>:
> >
> >> Hi Damian,
> >>
> >> I recall now that I copied the `WindowedSerde` class [1] from the
> >> Confluent examples, which uses the internal `WindowedSerializer` class.
> >> Better to write my own Serde then. You're right, I should not rely on
> >> internal classes, especially for data written outside Kafka Streams
> >> topologies.
> >>
> >> Thanks for the insights on KAFKA-4468.
> >>
> >> [1]
> >> https://github.com/confluentinc/examples/blob/89db45c6890cf757b8e18565bdf7bc23f119a2ff/kafka-streams/src/main/java/io/confluent/examples/streams/utils/WindowedSerde.java
> >>
> >> Nicolas.
> >>
> >> 2017-01-16 12:31 GMT+01:00 Damian Guy <damian....@gmail.com>:
> >>
> >>> Hi Nicolas,
> >>>
> >>> I guess you are using the Processor API for your topology? The
> >>> WindowedSerializer is an internal class that is used as part of the
> >>> DSL. In the DSL a topic will be created for each window operation, so
> >>> we don't need the end time, as it can be calculated from the window
> >>> size.
> >>> However, there is an open jira for this:
> >>> https://issues.apache.org/jira/browse/KAFKA-4468
> >>>
> >>> Thanks,
> >>> Damian
> >>>
> >>> On Mon, 16 Jan 2017 at 11:18 Nicolas Fouché <nfou...@onfocus.io>
> >>> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> In the same topology, I generate aggregates with 1-day windows and
> >>>> 1-week windows and write them to one single topic. On Mondays, these
> >>>> windows have the same start time. The effect: these aggregates
> >>>> override each other.
> >>>>
> >>>> That happens because WindowedSerializer [1] only serializes the
> >>>> window start time. I'm a bit surprised: a window has, by definition,
> >>>> a start and an end. I suppose one wanted to save on key sizes? And/or
> >>>> one considered that topics should not contain aggregates with
> >>>> different granularities?
> >>>>
> >>>> I have two choices then: either create as many output topics as I
> >>>> have granularities, or create my own serializer which also includes
> >>>> the window end time. What would the community recommend?
> >>>>
> >>>> Getting back to the core problem: I could understand that it's not
> >>>> "right" to store different granularities in one topic, and I thought
> >>>> it would save resources (fewer topics to manage by Kafka). But I'm
> >>>> really not sure about this default serializer: it does not serialize
> >>>> all instance variables of the `Window` class, and more generally does
> >>>> not comply with the definition of a window.
> >>>>
> >>>> [1]
> >>>> https://github.com/apache/kafka/blob/0.10.1/streams/src/main/java/org/apache/kafka/streams/kstream/internals/WindowedSerializer.java
> >>>>
> >>>> Thanks.
> >>>> Nicolas
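For anyone following the thread: the approach Nicolas describes (appending the window `end` after the window `start` in the key bytes) can be sketched in plain Java without any Kafka dependencies. This is a minimal illustration of the key layout only, not the actual gist code; the class and method names are hypothetical, and it assumes a `[inner key bytes][8-byte start][8-byte end]` big-endian layout:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: a windowed-key layout that appends BOTH the window
// start and end timestamps after the serialized inner key. Windows of
// different sizes that share a start time (e.g. 1-day and 1-week windows
// beginning on a Monday) then map to distinct keys, instead of colliding
// as they do with a start-only layout.
public class WindowedKeySketch {

    // Layout: [inner key bytes][8-byte start][8-byte end], big-endian longs.
    static byte[] serialize(byte[] innerKey, long start, long end) {
        return ByteBuffer.allocate(innerKey.length + 16)
                .put(innerKey)
                .putLong(start)
                .putLong(end)
                .array();
    }

    public static void main(String[] args) {
        byte[] inner = "page-views".getBytes(StandardCharsets.UTF_8);
        long monday = 1484524800000L;          // Mon, 16 Jan 2017 00:00 UTC
        long day = 24L * 60 * 60 * 1000;

        byte[] dailyKey  = serialize(inner, monday, monday + day);
        byte[] weeklyKey = serialize(inner, monday, monday + 7 * day);

        // The leading [inner key][start] prefix is identical for both
        // windows; only the trailing end timestamp tells them apart.
        System.out.println(Arrays.equals(dailyKey, weeklyKey));
    }
}
```

With a start-only key (drop the trailing 8 bytes), `dailyKey` and `weeklyKey` become byte-identical, which is exactly the overwrite described in the first message of the thread.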