It would be interesting to see a design for this. You'll need to partition,
or it won't scale, because the SQL "OVER" clause is linear and sorted in this
case. Other than that, it should be a pretty straightforward implementation
using state + timers + @RequiresTimeSortedInput. Sorting in any other way
w
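A minimal sketch of the kind of stateful DoFn being described might look roughly
like the following in Java. This is only an illustration: it assumes the input is
already keyed by user_id, and Event / EnrichedEvent are placeholder types standing
in for the actual payload.

  import org.apache.beam.sdk.state.StateSpec;
  import org.apache.beam.sdk.state.StateSpecs;
  import org.apache.beam.sdk.state.ValueState;
  import org.apache.beam.sdk.transforms.DoFn;
  import org.apache.beam.sdk.values.KV;
  import org.joda.time.Instant;

  // Sketch only: Event and EnrichedEvent are hypothetical user-defined types.
  class AttachPreviousTimestampFn
      extends DoFn<KV<String, Event>, KV<String, EnrichedEvent>> {

    // Per-key (per user_id) state holding the timestamp of the previous event.
    @StateId("prevTimestamp")
    private final StateSpec<ValueState<Instant>> prevTimestampSpec = StateSpecs.value();

    @RequiresTimeSortedInput  // elements for each key are delivered sorted by event time
    @ProcessElement
    public void processElement(
        @Element KV<String, Event> element,
        @Timestamp Instant timestamp,
        @StateId("prevTimestamp") ValueState<Instant> prevTimestamp,
        OutputReceiver<KV<String, EnrichedEvent>> out) {
      // null for the very first event of this user.
      Instant previous = prevTimestamp.read();
      out.output(KV.of(element.getKey(), new EnrichedEvent(element.getValue(), previous)));
      // The current event becomes the "previous" event for the next one.
      prevTimestamp.write(timestamp);
    }
  }

Keying by user_id is what provides the partitioning mentioned above; a cleanup timer
could additionally be used to garbage-collect state for keys that stop sending events.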
1. Yes, using state is a better fit than Beam windowing. You will want to
use the new feature of annotating the DoFn with @RequiresTimeSortedInput.
This ensures you are actually getting the "previous" event; without this
annotation, events can arrive in any order. You won't be
Hi Kenn,
Thanks for the clarification.
1. Just to put an example in front: for every event that comes in, I
need to find the corresponding previous event for the same user_id and pass
previous_event_timestamp down in the current event's payload (and the
current event then becomes the previous event for future events).
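To make that concrete, the wiring for such a pipeline could look roughly like the
sketch below (again an assumption, not code from the thread: events, getUserId(),
and the AttachPreviousTimestampFn shown earlier are made-up names, and imports from
org.apache.beam.sdk.transforms and org.apache.beam.sdk.values are omitted).

  // Key raw events by user_id so state is partitioned per user,
  // then attach the previous event's timestamp with the stateful DoFn.
  PCollection<KV<String, Event>> keyedEvents =
      events.apply("KeyByUser",
          WithKeys.of((Event e) -> e.getUserId())
                  .withKeyType(TypeDescriptors.strings()));

  PCollection<KV<String, EnrichedEvent>> enriched =
      keyedEvents.apply("AttachPreviousTimestamp",
          ParDo.of(new AttachPreviousTimestampFn()));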
Just want to clarify that Beam's concept of windowing is really an
event-time-based key, and all windows are processed logically simultaneously.
SQL's concept of a windowing function is to sort rows and process them
linearly. They are actually totally different. From your queries it seems
you are intere
Hi Alexey,
Thank you for the reference to that discussion; I actually have pretty
similar thoughts on what Beam SQL needs.
Update from my side:
I actually did find a workaround for the issue with windowing functions on a
stream. It basically boils down to using a sliding window to collect and
aggregate
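The message is cut off here, so the exact aggregation isn't visible, but the
sliding-window workaround presumably looks something like the following in the
Java SDK (window size, period, and the keyedEvents / Event names are made up
for illustration):

  // Collect each user's events in overlapping windows and aggregate them,
  // so a "previous row" can be recovered within each window.
  PCollection<KV<String, Iterable<Event>>> perUserWindows =
      keyedEvents
          .apply("SlidingWindow",
              Window.<KV<String, Event>>into(
                  SlidingWindows.of(Duration.standardMinutes(10))
                      .every(Duration.standardMinutes(1))))
          .apply("CollectPerUser", GroupByKey.create());

The caveat, compared with the stateful approach discussed above, is that a
previous event older than the window size is simply not visible, which is
presumably part of why state was suggested as the better fit.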
Hi Piotr,
Thanks for the details! I'm cross-posting this to dev@ as well, since I guess
people there can provide more insights on this.
A while ago, I faced similar issues trying to run Beam SQL against the TPC-DS
benchmark.
We had a discussion around that [1]; please take a look, since it can be
helpful.