Hi everyone,

Anyone knows how to use withWatermark  on Dataset?

I have tried the following but hit this exception:

dataset org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
cannot be cast to "MyType"

The code looks like the following:

dataset
.withWatermark("timestamp", "5 seconds")
.groupBy("timestamp", "customer_id")
.agg(MyAggregator)
.writeStream....

Where dataset has MyType for each row.
Where MyType is:
case class MyTpe(customer_id: Long, timestamp: Timestamp, product_id: Long)

MyAggregator which takes MyType as the input type did some maths on the
product_id and outputs a set of product_ids.

Best Regards,

Jerry

Reply via email to