On 1/8/19 12:57 PM, Peter Levart wrote:
Hi John,
On 1/8/19 12:45 PM, Peter Levart wrote:
I looked at your custom transfomer, and it looks almost correct to
me. The
only flaw seems to be that it only looks
for closed windows for the key currently being processed, which
means that
if you have key "A" buffered, but don't get another event for it for a
while after the window closes, you won't emit the final result. This
might
actually take longer than the window retention period, in which
case, the
data would be deleted without ever emitting the final result.
So in DSL case, the suppression works by flushing *all* of the "ripe"
windows in the whole buffer whenever a singe event comes in with
recent enough timestamp regardless of the key of that event?
Is the buffer shared among processing tasks or does each task
maintain its own private buffer that only contains its share of data
pertaining to assigned input partitions? In case the tasks are
executed on several processing JVM(s) the buffer can't really be
shared, right? In that case a single event can't flush all of the
"ripe" windows, but just those that are contained in the task's part
of buffer...
Just a question about your comment above:
/"This might actually take longer than the window retention period, in
which case, the data would be deleted without ever emitting the final
result"/
Are you talking about the buffer log topic retention? Aren't log
topics configured to "compact" rather than "delete" messages? So the
last "version" of the buffer entry for a particular key should stay
forever? What are the keys in suppression buffer log topic? Are they a
pair of (timestamp, key) ? Probably not since in that case the
compacted log would grow indefinitely...
Another question:
What are the keys in WindowStore's log topic? If the input keys to the
processor that uses such WindowStore consist of a bounded set of
values (for example user ids), would compacted log of such WindowStore
also be bounded?
In case the key of WindowStore log topic is (timestamp, key) then would
explicitly deleting flushed entries from WindowStore (by putting null
value into the store) keep the compacted log bounded? In other words,
does WindowStore log topic support a special kind of "tombstone" message
that effectively removes the key from the compacted log?
In that case, my custom processor could keep entries in its WindowStore
for as log as needed, depending on the activity of a particular input key...
Regards, Peter