Is it possible to process the same stream in two different ways? I can’t find
anything in the documentation definitively stating this is possible, but nor do
I find anything stating it isn’t. My attempt had some unexpected results,
which I’ll explain below:
Essentially, I have a stream of data I’m pulling from Kafka. I want to build
aggregate metrics on this data set using both tumbling windows as well as
session windows. So, I do something like the following:
DataStream<MyRecordType> baseStream =
env.addSource(….); // pulling data from kafka
.map(…) // parse the raw input
.assignTimestampsAndWatermarks(…);
DataStream <Tuple..<…>> timeWindowedStream =
baseStream.keyBy(…)
.timeWindow(…) // tumbling
window
.apply(…); //
aggregation over tumbling window
DataStream <Tuple..<…>> sessionWindowedStream =
baseStream.keyBy(…)
.window(EventTimeSessionWindows.withGap(…)) // session window
.apply(…);
// aggregation over
session window
The issue is that when I view my job in the Flink dashboard, it indicates that
each type of windowing is only receiving half of the records. Is what I’m
trying simply unsupported or is there something I’m missing?
Thanks!
________________________________
The information contained in this communication is confidential and intended
only for the use of the recipient named above, and may be legally privileged
and exempt from disclosure under applicable law. If the reader of this message
is not the intended recipient, you are hereby notified that any dissemination,
distribution or copying of this communication is strictly prohibited. If you
have received this communication in error, please resend it to the sender and
delete the original message and copy of it from your computer system. Opinions,
conclusions and other information in this message that do not relate to our
official business should be understood as neither given nor endorsed by the
company.