Why do you have two watermarks? Once you apply a watermark to a column
(e.g., "time"), it can be used in all later operations as long as the
column is preserved. So the above code should be equivalent to
df.withWatermark("time","window size").dropDuplicates("id").groupBy(window("time","window siz
1. Yes. All times are in event time, not processing time. So you may get 10AM
event time data at 11AM processing time, but it will still be compared
against all data within 9-10AM event times.
2. Show us your code.
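
To make point 1 concrete, here is a rough plain-Python sketch (not Spark code) of how a record can be bucketed by its event time no matter when it arrives. The helper name window_start, the one-hour window size, and the sample timestamps are all made up for illustration, and the epoch-aligned bucketing is only an approximation of what a tumbling window does.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def window_start(ts: datetime, size: timedelta) -> datetime:
    """Return the start of the epoch-aligned tumbling window containing ts.

    Hypothetical helper for illustration only; it is not part of any Spark API.
    """
    # timedelta // timedelta yields an int: the number of whole windows since EPOCH.
    return EPOCH + ((ts - EPOCH) // size) * size

# Assumed sample record: event time 9:45 AM, arriving at 11:00 AM processing time.
event_time = datetime(2020, 2, 27, 9, 45)
arrival_time = datetime(2020, 2, 27, 11, 0)
one_hour = timedelta(hours=1)

# Bucketing by event time puts the record in the 9-10 AM window ...
event_bucket = window_start(event_time, one_hour)
# ... whereas bucketing by arrival time would have put it in the 11-12 window.
arrival_bucket = window_start(arrival_time, one_hour)

print("event-time window starts at:  ", event_bucket)
print("arrival-time window starts at:", arrival_bucket)
```

The late arrival does not change which window the record belongs to; only the event-time column decides that, which is why a single watermark on that column is enough for the later operations.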
On Thu, Feb 27, 2020 at 2:30 AM lec ssmi wrote:
> Hi:
> I'm new to structured stre