Hello,

I am using Flink as the streaming execution engine for a low-latency
alerting application. The use case also requires ad-hoc querying on batch
data, which I plan to serve with Flink as well, to avoid the complexity of
maintaining two separate engines.

My current understanding is that the ORDER BY operator in the Blink planner
(on DataStream) requires a time attribute as the primary sort column. This
is quite limiting for ad-hoc querying. It seems I can use the DataSet API
to obtain a globally sorted output on an arbitrary column, but that would
force me to use the older Flink planner.

Specifically, I am looking for guidance from the community on the following
questions:

   1. Is it possible to obtain a globally sorted output on a DataStream,
   sorted on an arbitrary column?
   2. What are the tradeoffs between DataSet and DataStream in terms of
   performance, long-term support, etc.?
   3. Is there any other way to address this issue?

Regards,
Satyam
