[ https://issues.apache.org/jira/browse/CALCITE-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ruben Quesada Lopez updated CALCITE-2959:
-----------------------------------------
Description:
Currently, the class {{RelFieldCollation}} is used to define _"the ordering of
one field of a RelNode whose output is to be sorted"_. This representation can
hold only "simple" fields. In the case of struct fields, a projection must be
applied so that the struct field can be referenced as a simple one. For example,
given this table:
{code}
CREATE TYPE Address AS (
  street VARCHAR(20) NOT NULL,
  zipcode VARCHAR(20) NOT NULL,
  city VARCHAR(20) NOT NULL);
CREATE TABLE Person (
  id VARCHAR(20) NOT NULL,
  name VARCHAR(20) NOT NULL,
  address Address NOT NULL);
{code}
With a SQL query such as "{{SELECT p.name, p.address.city FROM Person p ORDER
BY p.address.city}}", the generated pseudo-plan would look like:
{code}
Sort(1) // --> Collation: [1]
  Project(0=$1, 1=$2.city)
    Scan(table=Person)
{code}
However, what if we had a specialized Scan operator that guaranteed that the
records are returned already ordered by address.city? Something like:
{code}
EnhancedScan(table=Person, sort=$2.city) // --> Collation???
{code}
The collation of such an operator cannot be represented with the current
Calcite capabilities ({{RelFieldCollation}}), because the sort key is not a
"simple" field but a field nested inside a struct. We would need a new
collation abstraction to represent it, e.g. [2.city] or [2.2].
I would like to open the discussion to see if / how we could find a solution to
represent this case.
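To make the [2.2] notation above more concrete, here is a minimal sketch of what a path-based collation key might look like. The class {{StructFieldCollation}}, its {{Direction}} enum and all of its methods are hypothetical names invented for this illustration; they are not part of Calcite and are not a proposed API. The sketch assumes 0-based ordinals, so in the Person table address is field 2 and city is field 2 inside Address.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical sketch (not a Calcite class): a collation key that addresses
 * a field by a path of ordinals instead of a single top-level index.
 */
final class StructFieldCollation {
    /** Sort direction; a simplified stand-in for illustration only. */
    enum Direction { ASCENDING, DESCENDING }

    private final List<Integer> fieldPath;
    private final Direction direction;

    /**
     * @param direction sort direction
     * @param fieldPath 0-based ordinals, outermost first; e.g. (2, 2) means
     *                  "field 2 of the row, then field 2 of that struct"
     */
    StructFieldCollation(Direction direction, int... fieldPath) {
        if (fieldPath.length == 0) {
            throw new IllegalArgumentException("field path must not be empty");
        }
        List<Integer> path = new ArrayList<>();
        for (int ordinal : fieldPath) {
            if (ordinal < 0) {
                throw new IllegalArgumentException("negative ordinal: " + ordinal);
            }
            path.add(ordinal);
        }
        this.fieldPath = Collections.unmodifiableList(path);
        this.direction = direction;
    }

    /** True if the key refers to a top-level ("simple") field. */
    boolean isSimple() {
        return fieldPath.size() == 1;
    }

    List<Integer> getFieldPath() {
        return fieldPath;
    }

    Direction getDirection() {
        return direction;
    }

    /** Renders the path in dotted form, e.g. "2.2" or "1 DESC". */
    @Override public String toString() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fieldPath.size(); i++) {
            if (i > 0) {
                sb.append('.');
            }
            sb.append(fieldPath.get(i));
        }
        if (direction == Direction.DESCENDING) {
            sb.append(" DESC");
        }
        return sb.toString();
    }
}
```

Under these assumptions, the EnhancedScan above could report a key built as {{new StructFieldCollation(Direction.ASCENDING, 2, 2)}}, while a key with a single-element path would behave like today's simple field index.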
was:
Currently, the class {{RelFieldCollation}} is used to define _"the ordering of
one field of a RelNode whose output is to be sorted"_. This representation can
hold only "simple" fields. In case of struct fields, a projection needs to be
applied in order to reference the struct field as a simple one. For example,
given this table:
{code}
CREATE TYPE Address AS (
  street VARCHAR(20) NOT NULL,
  zipcode VARCHAR(20) NOT NULL,
  city VARCHAR(20) NOT NULL);
CREATE TABLE Person (
  id VARCHAR(20) NOT NULL,
  name VARCHAR(20) NOT NULL,
  address Address NOT NULL);
{code}
With a SQL query such as: "{{SELECT p.name, p.address.city FROM Person p ORDER
BY p.address.city}}" the pseudo-plan generated would look like:
{code}
Sort(1) // --> Collation: [1]
  Project(0=$1, 1=$2.city)
    Scan(table=Person)
{code}
However, what would happen if we had a specific Scan operator that would
guarantee us that the records would be scanned already ordered by address.city?
Something like:
{code}
EnhancedScan(table=Person, sort=$2.city) // --> Collation???
{code}
The collation of such an operator cannot be represented with the current
Calcite capabilities (RelFieldCollation), because it would not be a "simple"
field, but a struct field, i.e. we would need a new collation abstraction to
represent it, e.g. [2.city] or [2.2]
I would like to open the discuss to see if / how we could find a solution to
represent this case.
> Support collation on struct fields
> ----------------------------------
>
> Key: CALCITE-2959
> URL: https://issues.apache.org/jira/browse/CALCITE-2959
> Project: Calcite
> Issue Type: New Feature
> Reporter: Ruben Quesada Lopez
> Priority: Major
>
> Currently, the class {{RelFieldCollation}} is used to define _"the ordering
> of one field of a RelNode whose output is to be sorted"_. This representation
> can hold only "simple" fields. In the case of struct fields, a projection
> must be applied so that the struct field can be referenced as a simple one.
> For example, given this table:
> {code}
> CREATE TYPE Address AS (
>   street VARCHAR(20) NOT NULL,
>   zipcode VARCHAR(20) NOT NULL,
>   city VARCHAR(20) NOT NULL);
> CREATE TABLE Person (
>   id VARCHAR(20) NOT NULL,
>   name VARCHAR(20) NOT NULL,
>   address Address NOT NULL);
> {code}
> With a SQL query such as "{{SELECT p.name, p.address.city FROM Person p
> ORDER BY p.address.city}}", the generated pseudo-plan would look like:
> {code}
> Sort(1) // --> Collation: [1]
>   Project(0=$1, 1=$2.city)
>     Scan(table=Person)
> {code}
> However, what if we had a specialized Scan operator that guaranteed that the
> records are returned already ordered by address.city? Something like:
> {code}
> EnhancedScan(table=Person, sort=$2.city) // --> Collation???
> {code}
> The collation of such an operator cannot be represented with the current
> Calcite capabilities ({{RelFieldCollation}}), because the sort key is not a
> "simple" field but a field nested inside a struct. We would need a new
> collation abstraction to represent it, e.g. [2.city] or [2.2].
> I would like to open the discussion to see if / how we could find a solution
> to represent this case.
--
This message was sent by Atlassian Jira
(v8.3.2#803003)