ahirner opened a new issue, #10935:
URL: https://github.com/apache/datafusion/issues/10935
### Is your feature request related to a problem or challenge?
Currently where clauses in a sub query are not pushed below `Unnest`
expressions. In conjunction with some other limitations, it composing views
with `unnest()` not ideal.
In `0.39.0` cli:
```
> CREATE TABLE IF NOT EXISTS v AS VALUES(1,[1,2,3]),(2,[3,4,5]);
0 row(s) fetched.
Elapsed 0.008 seconds.
```
We get equal results with or w/o a subquery:
```
> select unnest(column2) from v where column1=2;
+-------------------+
| unnest(v.column2) |
+-------------------+
| 3 |
| 4 |
| 5 |
+-------------------+
3 row(s) fetched.
Elapsed 0.011 seconds.
```
```
> select "unnest(v.column2)" from (select unnest(column2), column1 from v)
where column1=2;
+-------------------+
| unnest(v.column2) |
+-------------------+
| 3 |
| 4 |
| 5 |
+-------------------+
3 row(s) fetched.
Elapsed 0.015 seconds.
```
`Filter` and `FilterExec` are above projections with `unnest`, instead of
below.
Ok:
```
> explain select unnest(column2) from v where column1=2;
+---------------+------------------------------------------------------------------------+
| plan_type | plan
|
+---------------+------------------------------------------------------------------------+
| logical_plan | Unnest: lists[unnest(v.column2)] structs[]
|
| | Projection: v.column2 AS unnest(v.column2)
|
| | Filter: v.column1 = Int64(2)
|
| | TableScan: v projection=[column1, column2]
|
| physical_plan | UnnestExec
|
| | RepartitionExec: partitioning=RoundRobinBatch(8),
input_partitions=1 |
| | ProjectionExec: expr=[column2@1 as unnest(v.column2)]
|
| | CoalesceBatchesExec: target_batch_size=8192
|
| | FilterExec: column1@0 = 2
|
| | MemoryExec: partitions=1, partition_sizes=[1]
|
| |
|
+---------------+------------------------------------------------------------------------+
```
Problematic:
```
> explain select "unnest(v.column2)" from (select unnest(column2), column1
from v) where column1=2;
+---------------+---------------------------------------------------------------------------------------+
| plan_type | plan
|
+---------------+---------------------------------------------------------------------------------------+
| logical_plan | Projection: unnest(v.column2)
|
| | Filter: v.column1 = Int64(2)
|
| | Unnest: lists[unnest(v.column2)] structs[]
|
| | Projection: v.column2 AS unnest(v.column2),
v.column1 |
| | TableScan: v projection=[column1, column2]
|
| physical_plan | ProjectionExec: expr=[unnest(v.column2)@0 as
unnest(v.column2)] |
| | CoalesceBatchesExec: target_batch_size=8192
|
| | FilterExec: column1@1 = 2
|
| | RepartitionExec: partitioning=RoundRobinBatch(8),
input_partitions=1 |
| | UnnestExec
|
| | ProjectionExec: expr=[column2@1 as
unnest(v.column2), column1@0 as column1] |
| | MemoryExec: partitions=1, partition_sizes=[1]
|
| |
|
+---------------+---------------------------------------------------------------------------------------+
```
### Describe the solution you'd like
`FilterExec` is pushed down to tables that support them.
### Describe alternatives you've considered
A workaround is trying to not use subqueries. However, then one cannot unest
structs with multiple lists, or unnest a column more than once. Those are also
unsupported.
```
> select unnest(unnest([[1],[1,2],[1,2,3]]));
type_coercion
caused by
This feature is not implemented: Unnest should be rewritten to
LogicalPlan::Unnest before type coercion
```
```
> select unnest(column2) as x, unnest(column2) as y from v;
Error during planning: Projections require unique expression names but the
expression "v.column2 AS unnest(v.column2)" at position 0 and "v.column2 AS
unnest(v.column2)" at position 1 have the same name. Consider aliasing ("AS")
one of them.
```
(note: a real case would be something like `select
unnest(named_struct)['foo'] as foo, unnest(named_struct)['bar'] as bar`).
### Additional context
Maybe related: #5364
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]