Hi Suket,

Sorry, this was a typo in the pseudo-code I sent. Of course, what you suggested (using the same event-time attribute for both the watermark and the window) is what my code does in reality. Sorry for the confusion.
On 5/14/19 4:14 PM, suket arora wrote:
> Hi Joe,
> As per the Spark Structured Streaming documentation, and I quote:
> "withWatermark must be called on the same column as the timestamp column
> used in the aggregate. For example, df.withWatermark("time", "1
> min").groupBy("time2").count() is invalid in Append output mode, as
> watermark is defined on a different column from the aggregation column."
>
> And after referring to the following code:
>
> // Group the data by window and word and compute the count of each group
> val windowedCounts = words
>   .withWatermark("timestamp", "10 minutes")
>   .groupBy(
>     window($"timestamp", "10 minutes", "5 minutes"),
>     $"word")
>   .count()
>
> I would suggest you try the following code:
>
> df = inputStream
>   .withWatermark("eventtime", "20 seconds")
>   .groupBy($"sharedId", window($"eventtime", "20 seconds", "10 seconds"))
>
> And if this doesn't work, you can try a trigger on the query.

Can you maybe explain what you mean by "try trigger on query"? I don't understand that.

--
CU, Joe

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
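For reference, a minimal sketch of the pattern the quoted documentation describes, with the watermark and the window defined on the same event-time column. Column and variable names (`inputStream`, `eventtime`, `sharedId`) follow the snippet above and are illustrative; this assumes a streaming DataFrame read from some source via an existing SparkSession:

```scala
import org.apache.spark.sql.functions.window

// Sketch only: `inputStream` is assumed to be a streaming DataFrame
// with columns `eventtime: Timestamp` and `sharedId: String`.
val windowedCounts = inputStream
  // The watermark column and the window column must be the SAME
  // event-time column, or the aggregation is invalid in Append mode.
  .withWatermark("eventtime", "20 seconds")
  .groupBy(
    $"sharedId",
    window($"eventtime", "20 seconds", "10 seconds"))
  .count()
```

In Append mode, a window's counts are only emitted once the watermark passes the end of that window, which is why Spark insists the two refer to the same column.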