Hi Satyam,

Are you using the Blink planner in streaming mode? AFAIK, the Blink planner in batch mode can sort on arbitrary columns.
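To illustrate why the two modes differ, here is a plain-Java toy model. It is not Flink code: `SortModes`, `Event`, `batchSort`, and `emitUpTo` are made-up names, and the watermark is just a number passed in by hand. It only sketches the general idea that a bounded input can be fully read before sorting on any column, while an unbounded input can only be emitted in time order as time progresses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Toy model only: none of these names are Flink classes or methods.
class SortModes {
    static final class Event {
        final long ts;       // event time
        final int severity;  // an arbitrary non-time column
        Event(long ts, int severity) { this.ts = ts; this.severity = severity; }
    }

    // Batch: the input is bounded, so every row is available before sorting
    // and any comparator (any column) yields a correct global order.
    static List<Event> batchSort(List<Event> rows, Comparator<Event> order) {
        List<Event> out = new ArrayList<>(rows);
        out.sort(order);
        return out;
    }

    // Streaming: rows wait in a buffer ordered by event time. Once the
    // watermark promises that no row with ts <= watermark can still arrive,
    // those rows can be emitted in time order. The same promise does not
    // exist for an arbitrary column on an unbounded input.
    static List<Event> emitUpTo(PriorityQueue<Event> buffer, long watermark) {
        List<Event> out = new ArrayList<>();
        while (!buffer.isEmpty() && buffer.peek().ts <= watermark) {
            out.add(buffer.poll());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Event> rows = Arrays.asList(
                new Event(3, 1), new Event(1, 9), new Event(2, 5));

        // Bounded input: sort on the non-time column.
        for (Event e : batchSort(rows, Comparator.comparingInt((Event e) -> e.severity))) {
            System.out.println("batch  ts=" + e.ts + " severity=" + e.severity);
        }

        // Unbounded input: emit only what the (hand-picked) watermark allows.
        PriorityQueue<Event> buffer =
                new PriorityQueue<>(Comparator.comparingLong((Event e) -> e.ts));
        buffer.addAll(rows);
        for (Event e : emitUpTo(buffer, 2L)) {
            System.out.println("stream ts=" + e.ts + " severity=" + e.severity);
        }
    }
}
```

In this sketch the streaming path never sees the full input, so it can only ever produce an order that follows time. That is the intuition behind the time-attribute requirement in streaming mode.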
On Sat, May 30, 2020 at 6:19 AM, Satyam Shekhar <satyamshek...@gmail.com> wrote:

> Hello,
>
> I am using Flink as the streaming execution engine for building a
> low-latency alerting application. The use case also requires ad-hoc
> querying on batch data, which I also plan to serve using Flink to avoid the
> complexity of maintaining two separate engines.
>
> My current understanding is that the Order By operator in the Blink planner
> (on DataStream) requires a time attribute as the primary sort column. This is
> quite limiting for ad-hoc querying. It seems I can use the DataSet API to
> obtain a globally sorted output on an arbitrary column, but that will force
> me to use the older Flink planner.
>
> Specifically, I am looking for guidance from the community on the
> following questions -
>
> 1. Is it possible to obtain a globally sorted output on DataStreams on
>    an arbitrary sort column?
> 2. What are the tradeoffs in using DataSet vs DataStream in
>    performance, long-term support, etc.?
> 3. Is there any other way to address this issue?
>
> Regards,
> Satyam
>

--
Best,
Benchao Li