Hey Hector,
thanks for your reply. Your assumption is entirely correct: I already have a
few million records on the topic to test a streaming use case. I am
planning on testing it with a variety of settings, but the problems occur with
any cluster configuration. For example Parallelism 1 with
Hi Theo
In your initial email, you mentioned that you have "a bit of Data on it"
when referring to your topic with ten partitions. Correct me if I'm wrong,
but that sounds like the data in your topic is bounded and that you are trying to test a
streaming use-case. What kind of parallelism do you have configure
Hey,
so one more thing, the query looks like this:
SELECT window_start, window_end, a, b, c, count(*) AS x
FROM TABLE(
  TUMBLE(TABLE data.v1, DESCRIPTOR(timeStampData), INTERVAL '1' HOUR))
GROUP BY window_start, window_end, a, b, c
When the non-determinism occurs, the topic is not keyed at all.
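
For context, a source table that this query could run against might be declared roughly as in the sketch below. Only the column names and the watermark expression (timeStampData minus 10 seconds) come from this thread. The column types, the topic name, the broker address, the group id, the startup mode and the format are assumptions made for illustration, and the table is created as v1 on the assumption that the current database is the one the query calls data.

```sql
-- Hypothetical table definition; types and connector options are assumptions.
CREATE TABLE v1 (
  a STRING,
  b STRING,
  c STRING,
  timeStampData TIMESTAMP(3),
  -- Event-time attribute: the watermark trails the timestamp by 10 seconds.
  WATERMARK FOR timeStampData AS timeStampData - INTERVAL '10' SECOND
) WITH (
  'connector' = 'kafka',
  'topic' = 'example-topic',                          -- placeholder topic name
  'properties.bootstrap.servers' = 'localhost:9092',  -- placeholder broker
  'properties.group.id' = 'example-group',            -- placeholder group id
  'scan.startup.mode' = 'earliest-offset',            -- read the existing data
  'format' = 'json'                                   -- assumed record format
);
```

With a definition like this, rows whose timestamp is behind the current watermark when they reach the window operator are treated as late, so the counts can depend on the order in which the partitions are read.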
Hey Yuxia,
thanks for your response. I also figured that the events arrive in a (somewhat)
random order and thus cause non-determinism. I used a watermark like
this: "timeStampData - INTERVAL '10' SECOND". Increasing the watermark interval
does not solve the problem though, the results are stil
Hi Theo,
I'm wondering what the Event-Time-Windowed Query you are using looks like.
For example, how do you define the watermark?
Considering you read records from the 10 partitions, it may well be that the
records arrive at the window operator out of order.
Is it possible that the re