Hi John,
On 1/8/19 12:45 PM, Peter Levart wrote:
I looked at your custom transfomer, and it looks almost correct to
me. The
only flaw seems to be that it only looks
for closed windows for the key currently being processed, which means
that
if you have key "A" buffered, but don't get another event for it for a
while after the window closes, you won't emit the final result. This
might
actually take longer than the window retention period, in which case,
the
data would be deleted without ever emitting the final result.
So in DSL case, the suppression works by flushing *all* of the "ripe"
windows in the whole buffer whenever a singe event comes in with
recent enough timestamp regardless of the key of that event?
Is the buffer shared among processing tasks or does each task maintain
its own private buffer that only contains its share of data pertaining
to assigned input partitions? In case the tasks are executed on
several processing JVM(s) the buffer can't really be shared, right? In
that case a single event can't flush all of the "ripe" windows, but
just those that are contained in the task's part of buffer...
Just a question about your comment above:
/"This might actually take longer than the window retention period, in
which case, the data would be deleted without ever emitting the final
result"/
Are you talking about the buffer log topic retention? Aren't log topics
configured to "compact" rather than "delete" messages? So the last
"version" of the buffer entry for a particular key should stay forever?
What are the keys in suppression buffer log topic? Are they a pair of
(timestamp, key) ? Probably not since in that case the compacted log
would grow indefinitely...
Another question:
What are the keys in WindowStore's log topic? If the input keys to the
processor that uses such WindowStore consist of a bounded set of values
(for example user ids), would compacted log of such WindowStore also be
bounded?
Regards, Peter