[ https://issues.apache.org/jira/browse/SPARK-51428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marko Ilic updated SPARK-51428:
-------------------------------
Epic Link: SPARK-46830
> Implicit aliases of collated trees are assigned non-deterministically
> ---------------------------------------------------------------------
>
> Key: SPARK-51428
> URL: https://issues.apache.org/jira/browse/SPARK-51428
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Vladimir Golubev
> Assignee: Vladimir Golubev
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Consider the following collated queries and their schemas:
> 1.
> ```
> SELECT 'a' COLLATE UTF8_LCASE < 'A'
> ->
> (collate(a, UTF8_LCASE) < 'A' collate UTF8_LCASE)
> ```
> 2.
> ```
> SELECT CONCAT_WS('a', col1, col1) FROM VALUES ('a' COLLATE UTF8_LCASE)
> ->
> concat_ws(a, col1, col1)
> ```
> In the first case, the output alias marks the 'A' literal as collated, which
> is correct. In the second case, however, the 'a' literal is not marked as
> collated in the implicit alias, even though `CollationTypeCoercion` does
> collate it. The output schema for the second query should be
> `concat_ws('a' collate UTF8_LCASE, col1, col1)`.
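
The mismatch described above can be illustrated with a small, self-contained Python model. This is only a sketch and not Spark's analyzer code: the `Literal`, `Column`, and `Call` classes and the `implicit_alias` and `coerce_literals` helpers are hypothetical names invented for the illustration, and literals are always printed in quotes for simplicity. It shows that an alias rendered before the coercion step omits the literal's collation, while one rendered after it includes the collation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Literal:
    value: str
    collation: Optional[str] = None  # None = no collation attached yet


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Call:
    func: str
    args: Tuple["Expr", ...]


Expr = Union[Literal, Column, Call]


def implicit_alias(expr: Expr) -> str:
    """Render a name for an expression, spelling out any literal collation."""
    if isinstance(expr, Literal):
        text = f"'{expr.value}'"
        return f"{text} collate {expr.collation}" if expr.collation else text
    if isinstance(expr, Column):
        return expr.name
    args = ", ".join(implicit_alias(arg) for arg in expr.args)
    return f"{expr.func}({args})"


def coerce_literals(expr: Expr, collation: str) -> Expr:
    """Toy stand-in for coercion: tag every uncollated literal (hypothetical)."""
    if isinstance(expr, Literal):
        return expr if expr.collation else Literal(expr.value, collation)
    if isinstance(expr, Call):
        return Call(expr.func, tuple(coerce_literals(a, collation) for a in expr.args))
    return expr


# Toy version of: CONCAT_WS('a', col1, col1), where col1 is UTF8_LCASE.
query = Call("concat_ws", (Literal("a"), Column("col1"), Column("col1")))

before = implicit_alias(query)  # alias taken before coercion
after = implicit_alias(coerce_literals(query, "UTF8_LCASE"))  # alias taken after

print(before)
print(after)
```

In this model the only difference between the two names is the point at which the alias is computed relative to the coercion step, which is one way an alias can end up depending on rule ordering rather than on the query alone.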