ameyc commented on issue #11366:
URL: https://github.com/apache/datafusion/issues/11366#issuecomment-2221829306
@mustafasrepo & @ozankabak thanks for the feedback. the target usecases we
were going for are flink style workloads, with data read from kafka that is
generally not be ordered and thus needs to be watermarked. we tried the vanilla
aggregates and ran into PipelineBreaking panics.
An example workload we're trying to compute is of the nature, lmk if this
can already be expressed with current operators as is --
```
let windowed_df = df
.clone()
.streaming_window(
vec![],
vec![
max(col("imu_measurement").field("gps").field("speed")),
min(col("imu_measurement").field("gps").field("altitude")),
count(col("imu_measurement")).alias("count"),
],
Duration::from_millis(5_000), // 5 second window
Some(Duration::from_millis(1_000)), // 1 second slide
)
.unwrap();
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]