For changes that may be backwards-incompatible or change the APIs, we usually do 
a short KIP first (e.g., I just did one yesterday: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-114%3A+KTable+materialization+and+improved+semantics).
It's not meant to be overly burdensome, and it encourages the community to 
participate in the design. In this case I suspect the KIP can be very short, a 
paragraph or so.

Thanks
Eno

> On 16 Jan 2017, at 22:52, Nicolas Fouché <nfou...@onfocus.io> wrote:
> 
> In the case of KAFKA-4468, it's more about state stores. But still, keys
> would not be backward compatible. What is the "official" policy on this
> kind of change?
> 
> 2017-01-16 23:47 GMT+01:00 Nicolas Fouché <nfou...@onfocus.io>:
> 
>> Hi Eno,
>> I thought it would be impossible to put this in Kafka because of backward
>> incompatibility with the existing windowed keys, no?
>> In my case, I had to create a new output topic, reset the topology, and
>> reprocess all my data.
>> 
>> 2017-01-16 23:05 GMT+01:00 Eno Thereska <eno.there...@gmail.com>:
>> 
>>> Nicolas,
>>> 
>>> I'm checking with Bill, who was originally interested in KAFKA-4468. If he
>>> isn't actively working on it, why don't you give it a go and create a pull
>>> request (PR) for it? That way your contribution is properly acknowledged
>>> etc. We can help you through that.
>>> 
>>> Thanks
>>> Eno
>>>> On 16 Jan 2017, at 18:46, Nicolas Fouché <nfou...@onfocus.io> wrote:
>>>> 
>>>> My current implementation:
>>>> https://gist.github.com/nfo/eaf350afb5667a3516593da4d48e757a . I just
>>>> appended the window `end` at the end of the byte array.
>>>> Comments and suggestions are welcome!
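The layout described above (window end appended after the window start) can be sketched without any Kafka classes; this is an illustrative stand-in for the gist, not its actual code, and the class and key names are made up. With the end included, the 1-day and 1-week keys stay distinct even when both windows open on the same Monday:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class WindowedKeyWithEnd {
    // Illustrative layout: [key bytes][8-byte big-endian start][8-byte big-endian end]
    static byte[] serialize(byte[] key, long startMs, long endMs) {
        return ByteBuffer.allocate(key.length + 16)
                .put(key)
                .putLong(startMs)
                .putLong(endMs)
                .array();
    }

    public static void main(String[] args) {
        byte[] key = "page-views".getBytes();
        long monday = 1484524800000L; // Mon, 16 Jan 2017 00:00:00 UTC
        long day = 24L * 60 * 60 * 1000;
        long week = 7 * day;
        // Same key, same start, different ends: the serialized keys differ,
        // so the two aggregates no longer overwrite each other.
        byte[] dayWindowKey  = serialize(key, monday, monday + day);
        byte[] weekWindowKey = serialize(key, monday, monday + week);
        System.out.println(Arrays.equals(dayWindowKey, weekWindowKey)); // prints false
    }
}
```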
>>>> 
>>>> 
>>>> 2017-01-16 15:48 GMT+01:00 Nicolas Fouché <nfou...@onfocus.io>:
>>>> 
>>>>> Hi Damian,
>>>>> 
>>>>> I recall now that I copied the `WindowedSerde` class [1] from the
>>>>> Confluent examples, which uses the internal `WindowedSerializer`
>>>>> class. Better to write my own Serde then. You're right, I should not
>>>>> rely on internal classes, especially for data written outside Kafka
>>>>> Streams topologies.
>>>>> 
>>>>> Thanks for the insights on KAFKA-4468.
>>>>> 
>>>>> https://github.com/confluentinc/examples/blob/89db45c6890cf757b8e18565bdf7bc23f119a2ff/kafka-streams/src/main/java/io/confluent/examples/streams/utils/WindowedSerde.java
>>>>> 
>>>>> Nicolas.
>>>>> 
>>>>> 2017-01-16 12:31 GMT+01:00 Damian Guy <damian....@gmail.com>:
>>>>> 
>>>>>> Hi Nicolas,
>>>>>> 
>>>>>> I guess you are using the Processor API for your topology? The
>>>>>> WindowedSerializer is an internal class that is used as part of the
>>>>>> DSL. In the DSL a topic will be created for each window operation, so
>>>>>> we don't need the end time, as it can be calculated from the window
>>>>>> size. However, there is an open JIRA for this:
>>>>>> https://issues.apache.org/jira/browse/KAFKA-4468
>>>>>> 
>>>>>> Thanks,
>>>>>> Damian
>>>>>> 
>>>>>> On Mon, 16 Jan 2017 at 11:18 Nicolas Fouché <nfou...@onfocus.io>
>>> wrote:
>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> In the same topology, I generate aggregates with 1-day windows and
>>>>>>> 1-week windows and write them to one single topic. On Mondays, these
>>>>>>> windows have the same start time. The effect: these aggregates
>>>>>>> override each other.
>>>>>>> 
>>>>>>> That happens because WindowedSerializer [1] only serializes the
>>>>>>> window start time. I'm a bit surprised, since a window by definition
>>>>>>> has a start and an end. I suppose this was done to save on key sizes?
>>>>>>> And/or it was assumed that topics should not contain aggregates with
>>>>>>> different granularities?
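The overwrite described above can be reproduced without any Kafka classes. This is a sketch under stated assumptions (the class name, key, and layout are illustrative, mimicking a start-only encoding): a 1-day window and a 1-week window that both open on the same Monday serialize to identical bytes, so one aggregate clobbers the other.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class StartOnlyKeyCollision {
    // Illustrative start-only layout: [key bytes][8-byte big-endian window start]
    static byte[] serialize(byte[] key, long windowStartMs) {
        return ByteBuffer.allocate(key.length + 8)
                .put(key)
                .putLong(windowStartMs)
                .array();
    }

    public static void main(String[] args) {
        byte[] key = "page-views".getBytes();
        long monday = 1484524800000L; // Mon, 16 Jan 2017 00:00:00 UTC
        // Both the 1-day and the 1-week window start on Monday at midnight;
        // with only the start serialized, the window size is lost entirely.
        byte[] dayWindowKey  = serialize(key, monday);
        byte[] weekWindowKey = serialize(key, monday);
        System.out.println(Arrays.equals(dayWindowKey, weekWindowKey)); // prints true
    }
}
```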
>>>>>>> 
>>>>>>> I have two choices then: either create as many output topics as I
>>>>>>> have granularities, or create my own serializer which also includes
>>>>>>> the window end time. What would the community recommend?
>>>>>>> 
>>>>>>> Getting back to the core problem:
>>>>>>> I could understand that it's not "right" to store different
>>>>>>> granularities in one topic, and I thought it would save resources
>>>>>>> (fewer topics for Kafka to manage). But I'm really not sure about
>>>>>>> this default serializer: it does not serialize all instance variables
>>>>>>> of the `Window` class, and more generally does not comply with the
>>>>>>> definition of a window.
>>>>>>> 
>>>>>>> [1]
>>>>>>> https://github.com/apache/kafka/blob/0.10.1/streams/src/main/java/org/apache/kafka/streams/kstream/internals/WindowedSerializer.java
>>>>>>> 
>>>>>>> Thanks.
>>>>>>> Nicolas
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 
>> 
