cisaacson opened a new issue, #13994:
URL: https://github.com/apache/datafusion/issues/13994
### Describe the bug
The `supports_filters_pushdown` fn is invoked more than once on the same
Custom Data Source for a query. Further, it is called with different `filters`
list, so it is not consistent. This occurs when the Custom Data Source has a
query where multiple `filters` could qualify for pushdown, but only one is
desired.
I simple query example is:
```sql
SELECT a.c1
FROM a
WHERE a.c2 > 5
AND a.c3 = 'test_string'
```
Of these 2 `filter`s the intent is to pick the better `filter` (and only
one), which in this case would be `a.c3 = 'test_string'` as it would use a
better index in the Custom Data Source.
Sometimes the `supports_filters_pushdown` is sending both `filters`, then it
is called again with just a single `filter`, and it can be the wrong choice as
the `supports_filters_pushdown` fn does not have both `filters` to choose from
the second time. I have seen it vary unpredictably as to when it sends both
`filter`s and when it sends just one (in other words the order is inconsistent).
Then the `scan` fn is always called only once (correct) but it can be called
with the wrong filter.
### To Reproduce
To fully reproduce this I would need to create an example Custom Data
Source. If it cannot be figured out otherwise I'll invest the time and do that.
But in my case it will be a fair amount of work.
### Expected behavior
The `supports_filters_pushdown` should be called exactly once (I believe
that is correct), and it should provide all of the available `filter`s for the
Custom Data Source table. Then `scan` should faithfully provide the `filter`
selected and that one only based on the result `vec` of the
`TableProviderFilterPushDown` values that were returned in
`supports_filters_pushdown`. Otherwise a Custom Data Source cannot control
pushdown `filter`s, a key benefit of DataFusion.
### Additional context
I do not see how my Custom Data Source code could cause this, since the
`supports_filters_pushdown` fn is invoked by DataFusion, not by user code.
I was able to trigger a `panic` on the second call of
`supports_filters_pushdown`, here is the stacktrace. Hopefully this can shed
light on this, to see if it is indeed expected behavior or a bug.
Here is a stack trace I generated from when the supports_filters_pushdown fn
was called a second time on the same data source. This query has 2 filters on
the same table, the first call sent 1 filter, the second sent both filters.
Other times I have seen these reversed, so I suspect a threading issue. Let me
know what you think and if I should file a bug with just the stacktrace.
This is v42, you just released v44, if you want I can try that version.
This definitely would appear to be a bug, you can see it is all DataFusion
code that is in the trace.
```
supports_filters_pushdown was called twice!
0: rust_begin_unwind
at
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/std/src/panicking.rs:647:5
1: core::panicking::panic_fmt
at
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/panicking.rs:72:14
2: <rt_query::fbs_datasource::CompanyDataSource as
datafusion_catalog::table::TableProvider>::supports_filters_pushdown
at
./target/debug/build/rt-query-098625d83c072beb/out/fbs_datasource_generated.rs:3438:13
3: <datafusion::datasource::default_table_source::DefaultTableSource as
datafusion_expr::table_source::TableSource>::supports_filters_pushdown
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/core/src/datasource/default_table_source.rs:70:9
4: <datafusion_optimizer::push_down_filter::PushDownFilter as
datafusion_optimizer::optimizer::OptimizerRule>::rewrite
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/optimizer/src/push_down_filter.rs:876:31
5: datafusion_optimizer::optimizer::optimize_plan_node
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/optimizer/src/optimizer.rs:338:16
6: <datafusion_optimizer::optimizer::Rewriter as
datafusion_common::tree_node::TreeNodeRewriter>::f_down
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/optimizer/src/optimizer.rs:316:13
7: datafusion_common::tree_node::TreeNode::rewrite
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/common/src/tree_node.rs:179:37
8:
datafusion_common::tree_node::TreeNode::rewrite::{{closure}}::{{closure}}
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/common/src/tree_node.rs:179:64
9: datafusion_expr::logical_plan::tree_node::rewrite_arc
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/expr/src/logical_plan/tree_node.rs:387:5
10: datafusion_expr::logical_plan::tree_node::<impl
datafusion_common::tree_node::TreeNode for
datafusion_expr::logical_plan::plan::LogicalPlan>::map_children
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/expr/src/logical_plan/tree_node.rs:83:19
11: datafusion_common::tree_node::TreeNode::rewrite::{{closure}}
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/common/src/tree_node.rs:28:37
12: datafusion_common::tree_node::Transformed<T>::transform_children
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/common/src/tree_node.rs:735:24
13: datafusion_common::tree_node::TreeNode::rewrite
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/common/src/tree_node.rs:179:9
14: datafusion_optimizer::optimizer::Optimizer::optimize
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/optimizer/src/optimizer.rs:388:42
15: datafusion::execution::session_state::SessionState::optimize
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/core/src/execution/session_state.rs:716:13
16:
datafusion::execution::session_state::SessionState::create_physical_plan::{{closure}}
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/core/src/execution/session_state.rs:734:28
17: datafusion::dataframe::DataFrame::create_physical_plan::{{closure}}
at
/Users/coryisaacson/git/rtip-datafusion/datafusion/core/src/dataframe/mod.rs:215:61
18: rt_query_common::query_control::QueryControl::create::{{closure}}
at ./rt-query-common/src/query_control.rs:159:55
19: <rt_ng_db_service::ng_db_service_impl::NgDbSqlServiceImpl as
arrow_flight::sql::server::FlightSqlService>::get_flight_info_statement::{{closure}}
at ./rt-ng-db-service/src/ng_db_service_impl.rs:357:61
20: <core::pin::Pin<P> as core::future::future::Future>::poll
at
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/future/future.rs:124:9
21: arrow_flight::sql::server::<impl
arrow_flight::gen::flight_service_server::FlightService for
T>::get_flight_info::{{closure}}
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-flight-53.0.0/src/sql/server.rs:610:64
22: <core::pin::Pin<P> as core::future::future::Future>::poll
at
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/future/future.rs:124:9
23: <<arrow_flight::gen::flight_service_server::FlightServiceServer<T> as
tower_service::Service<http::request::Request<B>>>::call::GetFlightInfoSvc<T>
as
tonic::server::service::UnaryService<arrow_flight::gen::FlightDescriptor>>::call::{{closure}}
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-flight-53.0.0/src/arrow.flight.protocol.rs:1242:88
24: <core::pin::Pin<P> as core::future::future::Future>::poll
at
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/future/future.rs:124:9
25: tonic::server::grpc::Grpc<T>::unary::{{closure}}
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tonic-0.12.2/src/server/grpc.rs:251:14
26: <arrow_flight::gen::flight_service_server::FlightServiceServer<T> as
tower_service::Service<http::request::Request<B>>>::call::{{closure}}
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-flight-53.0.0/src/arrow.flight.protocol.rs:1264:59
27: <core::pin::Pin<P> as core::future::future::Future>::poll
at
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/future/future.rs:124:9
28: <F as futures_core::future::TryFuture>::try_poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/future.rs:82:9
29: <futures_util::future::try_future::into_future::IntoFuture<Fut> as
core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/try_future/into_future.rs:34:9
30: <futures_util::future::future::map::Map<Fut,F> as
core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/future/map.rs:55:37
31: <futures_util::future::future::Map<Fut,F> as
core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/lib.rs:91:13
32: <futures_util::future::try_future::MapOk<Fut,F> as
core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/lib.rs:91:13
33: <tower::util::map_response::MapResponseFuture<F,N> as
core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tower-0.4.13/src/macros.rs:38:17
34: <core::pin::Pin<P> as core::future::future::Future>::poll
at
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/future/future.rs:124:9
35: <tower::util::oneshot::Oneshot<S,Req> as
core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tower-0.4.13/src/util/oneshot.rs:97:38
36: <axum::routing::route::RouteFuture<E> as
core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/axum-0.7.5/src/routing/route.rs:162:61
37: <tonic::service::router::RoutesFuture as
core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tonic-0.12.2/src/service/router.rs:163:22
38: <tonic::transport::service::grpc_timeout::ResponseFuture<F> as
core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tonic-0.12.2/src/transport/service/grpc_timeout.rs:83:38
39: <tower::util::either::Either<A,B> as
core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tower-0.4.13/src/util/either.rs:71:57
40: <tonic::transport::server::service::recover_error::ResponseFuture<F>
as core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tonic-0.12.2/src/transport/server/service/recover_error.rs:60:20
41: <tonic::transport::server::SvcFuture<F> as
core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tonic-0.12.2/src/transport/server/mod.rs:942:50
42: <core::pin::Pin<P> as core::future::future::Future>::poll
at
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/future/future.rs:124:9
43: <tower::util::oneshot::Oneshot<S,Req> as
core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tower-0.4.13/src/util/oneshot.rs:97:38
44: <hyper_util::service::TowerToHyperServiceFuture<S,R> as
core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.6/src/service.rs:60:9
45: hyper::proto::h2::server::H2Stream<F,B>::poll2
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-1.4.1/src/proto/h2/server.rs:447:37
46: <hyper::proto::h2::server::H2Stream<F,B> as
core::future::future::Future>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-1.4.1/src/proto/h2/server.rs:537:9
47: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.39.2/src/runtime/task/core.rs:328:17
48: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.39.2/src/loom/std/unsafe_cell.rs:16:9
49: tokio::runtime::task::core::Core<T,S>::poll
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.39.2/src/runtime/task/core.rs:317:13
50: tokio::runtime::task::harness::poll_future::{{closure}}
at
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.39.2/src/runtime/task/harness.rs:485:19
51: <core::panic::unwind_safe::AssertUnwindSafe<F> as
core::ops::function::FnOnce<()>>::call_once
at
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/panic/unwind_safe.rs:272:9
52: std::panicking::try::do_call
at
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/std/src/panicking.rs:554:40
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]