cisaacson opened a new issue, #13994:
URL: https://github.com/apache/datafusion/issues/13994

   ### Describe the bug
   
   The `supports_filters_pushdown` fn is invoked more than once on the same 
Custom Data Source for a query. Further, it is called with different `filters` 
list, so it is not consistent. This occurs when the Custom Data Source has a 
query where multiple `filters` could qualify for pushdown, but only one is 
desired.
   
   I simple query example is:
   
   ```sql
   SELECT a.c1
   FROM a
   WHERE a.c2 > 5
     AND a.c3 = 'test_string'
   ```
   
   Of these 2 `filter`s the intent is to pick the better `filter` (and only 
one), which in this case would be `a.c3 = 'test_string'` as it would use a 
better index in the Custom Data Source.
   
   Sometimes the `supports_filters_pushdown` is sending both `filters`, then it 
is called again with just a single `filter`, and it can be the wrong choice as 
the `supports_filters_pushdown` fn does not have both `filters` to choose from 
the second time. I have seen it vary unpredictably as to when it sends both 
`filter`s and when it sends just one (in other words the order is inconsistent).
   
   Then the `scan` fn is always called only once (correct) but it can be called 
with the wrong filter. 
   
   ### To Reproduce
   
   To fully reproduce this I would need to create an example Custom Data 
Source. If it cannot be figured out otherwise I'll invest the time and do that. 
But in my case it will be a fair amount of work.
   
   ### Expected behavior
   
   The `supports_filters_pushdown` should be called exactly once (I believe 
that is correct), and it should provide all of the available `filter`s for the 
Custom Data Source table. Then `scan` should faithfully provide the `filter` 
selected and that one only based on the result `vec` of the 
`TableProviderFilterPushDown` values that were returned in 
`supports_filters_pushdown`. Otherwise a Custom Data Source cannot control 
pushdown `filter`s, a key benefit of DataFusion.
   
   ### Additional context
   
   I do not see how my Custom Data Source code could cause this, since the 
`supports_filters_pushdown` fn is invoked by DataFusion, not by user code.
   
   I was able to trigger a `panic` on the second call of 
`supports_filters_pushdown`, here is the stacktrace. Hopefully this can shed 
light on this, to see if it is indeed expected behavior or a bug.
   
   Here is a stack trace I generated from when the supports_filters_pushdown fn 
was called a second time on the same data source. This query has 2 filters on 
the same table, the first call sent 1 filter, the second sent both filters. 
Other times I have seen these reversed, so I suspect a threading issue. Let me 
know what you think and if I should file a bug with just the stacktrace.
   
   This is v42, you just released v44, if you want I can try that version.
   
   This definitely would appear to be a bug, you can see it is all DataFusion 
code that is in the trace.
   
   ```
   supports_filters_pushdown was called twice!
      0: rust_begin_unwind
                at 
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/std/src/panicking.rs:647:5
      1: core::panicking::panic_fmt
                at 
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/panicking.rs:72:14
      2: <rt_query::fbs_datasource::CompanyDataSource as 
datafusion_catalog::table::TableProvider>::supports_filters_pushdown
                at 
./target/debug/build/rt-query-098625d83c072beb/out/fbs_datasource_generated.rs:3438:13
      3: <datafusion::datasource::default_table_source::DefaultTableSource as 
datafusion_expr::table_source::TableSource>::supports_filters_pushdown
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/core/src/datasource/default_table_source.rs:70:9
      4: <datafusion_optimizer::push_down_filter::PushDownFilter as 
datafusion_optimizer::optimizer::OptimizerRule>::rewrite
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/optimizer/src/push_down_filter.rs:876:31
      5: datafusion_optimizer::optimizer::optimize_plan_node
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/optimizer/src/optimizer.rs:338:16
      6: <datafusion_optimizer::optimizer::Rewriter as 
datafusion_common::tree_node::TreeNodeRewriter>::f_down
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/optimizer/src/optimizer.rs:316:13
      7: datafusion_common::tree_node::TreeNode::rewrite
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/common/src/tree_node.rs:179:37
      8: 
datafusion_common::tree_node::TreeNode::rewrite::{{closure}}::{{closure}}
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/common/src/tree_node.rs:179:64
      9: datafusion_expr::logical_plan::tree_node::rewrite_arc
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/expr/src/logical_plan/tree_node.rs:387:5
     10: datafusion_expr::logical_plan::tree_node::<impl 
datafusion_common::tree_node::TreeNode for 
datafusion_expr::logical_plan::plan::LogicalPlan>::map_children
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/expr/src/logical_plan/tree_node.rs:83:19
     11: datafusion_common::tree_node::TreeNode::rewrite::{{closure}}
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/common/src/tree_node.rs:28:37
     12: datafusion_common::tree_node::Transformed<T>::transform_children
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/common/src/tree_node.rs:735:24
     13: datafusion_common::tree_node::TreeNode::rewrite
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/common/src/tree_node.rs:179:9
     14: datafusion_optimizer::optimizer::Optimizer::optimize
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/optimizer/src/optimizer.rs:388:42
     15: datafusion::execution::session_state::SessionState::optimize
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/core/src/execution/session_state.rs:716:13
     16: 
datafusion::execution::session_state::SessionState::create_physical_plan::{{closure}}
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/core/src/execution/session_state.rs:734:28
     17: datafusion::dataframe::DataFrame::create_physical_plan::{{closure}}
                at 
/Users/coryisaacson/git/rtip-datafusion/datafusion/core/src/dataframe/mod.rs:215:61
     18: rt_query_common::query_control::QueryControl::create::{{closure}}
                at ./rt-query-common/src/query_control.rs:159:55
     19: <rt_ng_db_service::ng_db_service_impl::NgDbSqlServiceImpl as 
arrow_flight::sql::server::FlightSqlService>::get_flight_info_statement::{{closure}}
                at ./rt-ng-db-service/src/ng_db_service_impl.rs:357:61
     20: <core::pin::Pin<P> as core::future::future::Future>::poll
                at 
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/future/future.rs:124:9
     21: arrow_flight::sql::server::<impl 
arrow_flight::gen::flight_service_server::FlightService for 
T>::get_flight_info::{{closure}}
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-flight-53.0.0/src/sql/server.rs:610:64
     22: <core::pin::Pin<P> as core::future::future::Future>::poll
                at 
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/future/future.rs:124:9
     23: <<arrow_flight::gen::flight_service_server::FlightServiceServer<T> as 
tower_service::Service<http::request::Request<B>>>::call::GetFlightInfoSvc<T> 
as 
tonic::server::service::UnaryService<arrow_flight::gen::FlightDescriptor>>::call::{{closure}}
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-flight-53.0.0/src/arrow.flight.protocol.rs:1242:88
     24: <core::pin::Pin<P> as core::future::future::Future>::poll
                at 
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/future/future.rs:124:9
     25: tonic::server::grpc::Grpc<T>::unary::{{closure}}
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tonic-0.12.2/src/server/grpc.rs:251:14
     26: <arrow_flight::gen::flight_service_server::FlightServiceServer<T> as 
tower_service::Service<http::request::Request<B>>>::call::{{closure}}
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-flight-53.0.0/src/arrow.flight.protocol.rs:1264:59
     27: <core::pin::Pin<P> as core::future::future::Future>::poll
                at 
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/future/future.rs:124:9
     28: <F as futures_core::future::TryFuture>::try_poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/future.rs:82:9
     29: <futures_util::future::try_future::into_future::IntoFuture<Fut> as 
core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/try_future/into_future.rs:34:9
     30: <futures_util::future::future::map::Map<Fut,F> as 
core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/future/map.rs:55:37
     31: <futures_util::future::future::Map<Fut,F> as 
core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/lib.rs:91:13
     32: <futures_util::future::try_future::MapOk<Fut,F> as 
core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/lib.rs:91:13
     33: <tower::util::map_response::MapResponseFuture<F,N> as 
core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tower-0.4.13/src/macros.rs:38:17
     34: <core::pin::Pin<P> as core::future::future::Future>::poll
                at 
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/future/future.rs:124:9
     35: <tower::util::oneshot::Oneshot<S,Req> as 
core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tower-0.4.13/src/util/oneshot.rs:97:38
     36: <axum::routing::route::RouteFuture<E> as 
core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/axum-0.7.5/src/routing/route.rs:162:61
     37: <tonic::service::router::RoutesFuture as 
core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tonic-0.12.2/src/service/router.rs:163:22
     38: <tonic::transport::service::grpc_timeout::ResponseFuture<F> as 
core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tonic-0.12.2/src/transport/service/grpc_timeout.rs:83:38
     39: <tower::util::either::Either<A,B> as 
core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tower-0.4.13/src/util/either.rs:71:57
     40: <tonic::transport::server::service::recover_error::ResponseFuture<F> 
as core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tonic-0.12.2/src/transport/server/service/recover_error.rs:60:20
     41: <tonic::transport::server::SvcFuture<F> as 
core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tonic-0.12.2/src/transport/server/mod.rs:942:50
     42: <core::pin::Pin<P> as core::future::future::Future>::poll
                at 
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/future/future.rs:124:9
     43: <tower::util::oneshot::Oneshot<S,Req> as 
core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tower-0.4.13/src/util/oneshot.rs:97:38
     44: <hyper_util::service::TowerToHyperServiceFuture<S,R> as 
core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.6/src/service.rs:60:9
     45: hyper::proto::h2::server::H2Stream<F,B>::poll2
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-1.4.1/src/proto/h2/server.rs:447:37
     46: <hyper::proto::h2::server::H2Stream<F,B> as 
core::future::future::Future>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-1.4.1/src/proto/h2/server.rs:537:9
     47: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.39.2/src/runtime/task/core.rs:328:17
     48: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.39.2/src/loom/std/unsafe_cell.rs:16:9
     49: tokio::runtime::task::core::Core<T,S>::poll
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.39.2/src/runtime/task/core.rs:317:13
     50: tokio::runtime::task::harness::poll_future::{{closure}}
                at 
/Users/coryisaacson/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.39.2/src/runtime/task/harness.rs:485:19
     51: <core::panic::unwind_safe::AssertUnwindSafe<F> as 
core::ops::function::FnOnce<()>>::call_once
                at 
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/panic/unwind_safe.rs:272:9
     52: std::panicking::try::do_call
                at 
/rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/std/src/panicking.rs:554:40
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to