(Oops, I mistakenly sent my response only to Rong Rong. Sorry Rong...) Hi Morrisa,
This is due to a bug in the old flink planner. The `createTable(new PlannerQueryOperation(relational.rel))` method in flink-table-planner -> `TableEnvImpl` -> `sqlQuery` should be `createTable(new PlannerQueryOperation(relational.project()))` as the last projection (if exists) should be performed. This bug is already fixed in the blink planner (flink-table-planner-blink). You can use blink planner and runner instead of the legacy planner to solve this problem. Rong Rong <walter...@gmail.com> 于2019年7月13日周六 上午7:08写道: > Hi Morrisa, > > Can you share more information regarding what type of function > "formatDate" is and how did you configure the return type of that function? > For the question on the first query If the return type is String, then ASC > on a string value should be on alphabetical ordering. > > However on the third query, if the GROUP BY and ORDER BY are both > operating on the same input to your UDF it shouldn't be part of the output > columns. > This looks like a bug to me. > > -- > Rong > > On Thu, Jul 11, 2019 at 11:45 AM Morrisa Brenner < > morrisa.bren...@klaviyo.com> wrote: > >> Hi Flink folks, >> >> We have a custom date formatting function that we use to format the >> output of columns containing dates. Ideally what we want is to format the >> output in the select statement but be able to order by the underlying >> datetime (so that and output with formatted dates "February 2019" and >> "April 2019" is guaranteed to have the rows sorted in time order rather >> than alphabetical order). >> >> When I go to add the unformatted column to the order by, however, that >> gets appended as an extra column to the select statement during the query >> planning process within Calcite. (In the order by parsing, it's considering >> this a different column from the one in the select statement.) When the >> group by column is different in the same way but there's no order by >> column, the extra column isn't added. I've included a couple of simple >> examples below. >> >> Is this the intended behavior of the query planner? Does anyone know of a >> way around this without needing to change the formatting so that it makes >> the output dates correctly sortable? >> >> Thanks for your help! >> >> Morrisa >> >> >> >> Example query and output with order by using formatted date: >> >> SELECT >> >> formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH'), >> >> sum(`testTable`.`count`) >> >> FROM `testTable` >> >> GROUP BY formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH') >> >> ORDER BY formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH') ASC >> >> Month >> >> SUM VALUE >> >> April 2019 >> >> 1052 >> >> February 2019 >> >> 1 >> >> >> Example query and output without order by but group by using unformatted >> date: >> >> SELECT >> >> formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH'), >> >> sum(`testTable`.`count`) >> >> FROM `testTable` >> >> GROUP BY floor(`testTable`.`timestamp` TO MONTH) >> >> Month >> >> SUM VALUE >> >> February 2019 >> >> 1 >> >> April 2019 >> >> 1052 >> >> We would like to enforce the ordering, so although this output is what we >> want, I don't think we can use this solution. >> >> Example query and output with order by using unformatted date: >> >> SELECT >> >> formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH'), >> >> sum(`testTable`.`count`) >> >> FROM `testTable` >> >> GROUP BY floor(`testTable`.`timestamp` TO MONTH) >> >> ORDER BY floor(`testTable`.`timestamp` TO MONTH) ASC >> >> Month >> >> SUM VALUE >> >> February 2019 >> >> 1 >> >> 2/1/2019 12:00 AM >> >> April 2019 >> >> 1052 >> >> 4/1/2019 12:00 AM >> >> >> -- >> Morrisa Brenner >> Software Engineer >> 225 Franklin St, Boston, MA 02110 >> klaviyo.com <https://www.klaviyo.com> >> [image: Klaviyo Logo] >> >