lostmygithubaccount opened a new issue, #10830:
URL: https://github.com/apache/datafusion/issues/10830
### Describe the bug
in running some benchmarking, was getting errors like:
```
Exception: This feature is not implemented: Physical plan does not support
logical expression InSubquery(InSubquery { expr: Column(Column { relation:
None, name: "ps_suppkey" }), subquery: <subquery>, negated: false })
```
tracking this down shows `x NOT IN y` works but `NOT (x IN y) doesn't`
### To Reproduce
probably a more succinct way, but:
```python
import datafusion
ctx = datafusion.SessionContext()
tables = {
"part": {
"p_partkey": [1, 2, 3, 4, 5],
"p_brand": ["Brand#45", "Brand#45", "Brand#46", "Brand#47",
"Brand#48"],
"p_type": ["MEDIUM POLISHED", "MEDIUM POLISHED", "MEDIUM POLISHED",
"MEDIUM POLISHED", "MEDIUM POLISHED"],
"p_size": [49, 14, 23, 45, 19],
},
"partsupp": {
"ps_partkey": [1, 2, 3, 4, 5],
"ps_suppkey": [1, 2, 3, 4, 5],
},
"supplier": {
"s_suppkey": [1, 2, 3, 4, 5],
"s_comment": ["Customer Complaints", "Customer Complaints",
"Customer Complaints", "Customer Complaints", "Customer Complaints"],
},
}
for table_name, data in tables.items():
ctx.from_pydict(data, name=table_name)
sqlA = """
select
p_brand,
p_type,
p_size,
count(distinct ps_suppkey) as supplier_cnt
from
partsupp,
part
where
p_partkey = ps_partkey
and p_brand <> 'Brand#45'
and p_type not like 'MEDIUM POLISHED%'
and p_size in (49, 14, 23, 45, 19, 3, 36, 9)
and ps_suppkey not in (
select
s_suppkey
from
supplier
where
s_comment like '%Customer%Complaints%'
)
group by
p_brand,
p_type,
p_size
order by
supplier_cnt desc,
p_brand,
p_type,
p_size;
"""
ctx.sql(sqlA).collect() # this is fine
sqlB = """
SELECT
p_brand,
p_type,
p_size,
COUNT(DISTINCT ps_suppkey) AS supplier_cnt
FROM partsupp, part
WHERE
p_partkey = ps_partkey
AND p_brand <> 'Brand#45'
AND NOT p_type LIKE 'MEDIUM POLISHED%'
AND p_size IN (49, 14, 23, 45, 19, 3, 36, 9)
AND NOT ps_suppkey IN (
SELECT
s_suppkey
FROM supplier
WHERE
s_comment LIKE '%Customer%Complaints%'
)
GROUP BY
p_brand,
p_type,
p_size
ORDER BY
supplier_cnt DESC NULLS LAST,
p_brand,
p_type,
p_size;
"""
ctx.sql(sqlB).collect() # this is not fine
# Exception: This feature is not implemented: Physical plan does not support
logical expression InSubquery(InSubquery { expr: Column(Column { relation:
Some(Bare { table: "partsupp" }), name: "ps_suppkey" }), subquery: <subquery>,
negated: false })
```
```
### Expected behavior
query works in both cases
### Additional context
SQLGlot makes the above transformation that should be valid, which is how I
hit this initially
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]