(Oops, I mistakenly sent my response only to Rong Rong. Sorry Rong...)

Hi Morrisa,

This is due to a bug in the old flink planner. The `createTable(new
PlannerQueryOperation(relational.rel))` method in flink-table-planner ->
`TableEnvImpl` -> `sqlQuery` should be `createTable(new
PlannerQueryOperation(relational.project()))` as the last projection (if
exists) should be performed. This bug is already fixed in the blink planner
(flink-table-planner-blink). You can use blink planner and runner instead
of the legacy planner to solve this problem.

Rong Rong <walter...@gmail.com> 于2019年7月13日周六 上午7:08写道:

> Hi Morrisa,
>
> Can you share more information regarding what type of function
> "formatDate" is and how did you configure the return type of that function?
> For the question on the first query If the return type is String, then ASC
> on a string value should be on alphabetical ordering.
>
> However on the third query, if the GROUP BY and ORDER BY are both
> operating on the same input to your UDF it shouldn't be part of the output
> columns.
> This looks like a bug to me.
>
> --
> Rong
>
> On Thu, Jul 11, 2019 at 11:45 AM Morrisa Brenner <
> morrisa.bren...@klaviyo.com> wrote:
>
>> Hi Flink folks,
>>
>> We have a custom date formatting function that we use to format the
>> output of columns containing dates. Ideally what we want is to format the
>> output in the select statement but be able to order by the underlying
>> datetime (so that and output with formatted dates "February 2019" and
>> "April 2019" is guaranteed to have the rows sorted in time order rather
>> than alphabetical order).
>>
>> When I go to add the unformatted column to the order by, however, that
>> gets appended as an extra column to the select statement during the query
>> planning process within Calcite. (In the order by parsing, it's considering
>> this a different column from the one in the select statement.) When the
>> group by column is different in the same way but there's no order by
>> column, the extra column isn't added. I've included a couple of simple
>> examples below.
>>
>> Is this the intended behavior of the query planner? Does anyone know of a
>> way around this without needing to change the formatting so that it makes
>> the output dates correctly sortable?
>>
>> Thanks for your help!
>>
>> Morrisa
>>
>>
>>
>> Example query and output with order by using formatted date:
>>
>> SELECT
>>
>> formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH'),
>>
>> sum(`testTable`.`count`)
>>
>> FROM `testTable`
>>
>> GROUP BY formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH')
>>
>> ORDER BY formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH') ASC
>>
>> Month
>>
>> SUM VALUE
>>
>> April 2019
>>
>> 1052
>>
>> February 2019
>>
>> 1
>>
>>
>> Example query and output without order by but group by using unformatted
>> date:
>>
>> SELECT
>>
>> formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH'),
>>
>> sum(`testTable`.`count`)
>>
>> FROM `testTable`
>>
>> GROUP BY floor(`testTable`.`timestamp` TO MONTH)
>>
>> Month
>>
>> SUM VALUE
>>
>> February 2019
>>
>> 1
>>
>> April 2019
>>
>> 1052
>>
>> We would like to enforce the ordering, so although this output is what we
>> want, I don't think we can use this solution.
>>
>> Example query and output with order by using unformatted date:
>>
>> SELECT
>>
>> formatDate(floor(`testTable`.`timestamp` TO MONTH), 'MONTH'),
>>
>> sum(`testTable`.`count`)
>>
>> FROM `testTable`
>>
>> GROUP BY floor(`testTable`.`timestamp` TO MONTH)
>>
>> ORDER BY floor(`testTable`.`timestamp` TO MONTH) ASC
>>
>> Month
>>
>> SUM VALUE
>>
>> February 2019
>>
>> 1
>>
>> 2/1/2019 12:00 AM
>>
>> April 2019
>>
>> 1052
>>
>> 4/1/2019 12:00 AM
>>
>>
>> --
>> Morrisa Brenner
>> Software Engineer
>> 225 Franklin St, Boston, MA 02110
>> klaviyo.com <https://www.klaviyo.com>
>> [image: Klaviyo Logo]
>>
>

Reply via email to