alamb commented on PR #14224: URL: https://github.com/apache/datafusion/pull/14224#issuecomment-2639896008
I found that with this small change I could rewrite our optimizer passes fairly easily (it also seems not to change this particular PR). We can fix it as a follow-on PR.

```diff
diff --git a/datafusion/physical-plan/src/source.rs b/datafusion/physical-plan/src/source.rs
index 0b7e6dd03..6f57870ce 100644
--- a/datafusion/physical-plan/src/source.rs
+++ b/datafusion/physical-plan/src/source.rs
@@ -176,8 +176,8 @@ impl DataSourceExec {
     /// Return the source object
     #[allow(unused)]
-    pub fn source(&self) -> Arc<dyn DataSource> {
-        Arc::clone(&self.source)
+    pub fn source(&self) -> &Arc<dyn DataSource> {
+        &self.source
     }
```

# Rationale

We have a bunch of code like this in influxdb_iox for rewriting ParquetExec in various ways: https://github.com/influxdata/influxdb3_core/blob/a5f6076c966f4940a67998e0b85d12c3e8596715/iox_query/src/physical_optimizer/cached_parquet_data.rs#L52-L88

To make this easier I am trying to replicate the pattern of `plan.as_any().downcast_ref::<ParquetExec>` and I came up with the wrapper below. (However, the Rust lifetime rules prevented it from compiling without the change above: the wrapper holds `&'a` references borrowed from the value that `source()` returns, so that value has to be a reference tied to the plan rather than a freshly cloned `Arc`.)

```rust
/// A view of a [`DataSourceExec`] that scans parquet files
pub struct ParquetExecWrapper<'a> {
    base_config: &'a FileScanConfig,
    parquet_source: &'a ParquetSource,
}

impl<'a> ParquetExecWrapper<'a> {
    /// Create a new wrapper from a [`ExecutionPlan`], returning `None` if the
    /// plan is not a [`DataSourceExec`] that scans Parquet files
    pub fn try_new(plan: &'a dyn ExecutionPlan) -> Option<Self> {
        let plan = plan.as_ref();
        let data_source_exec = plan.as_any().downcast_ref::<DataSourceExec>()?;
        let base_config = data_source_exec
            .source()
            .as_any()
            .downcast_ref::<FileScanConfig>()?;
        let parquet_source = base_config
            .file_source()
            .as_any()
            .downcast_ref::<ParquetSource>()?;
        Some(Self {
            base_config,
            parquet_source,
        })
    }

    /// Return the underlying [`FileScanConfig`]
    pub fn base_config(&self) -> &'a FileScanConfig {
        self.base_config
    }

    /// Return the underlying [`ParquetSource`]
    pub fn parquet_source(&self) -> &'a ParquetSource {
        self.parquet_source
    }
}
```

Otherwise things are going well; I am making good progress.
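For anyone who wants to see the lifetime issue in isolation, here is a minimal std-only sketch. The names `Source`, `FileConfig`, `MemoryConfig`, `Exec`, and `View` are hypothetical stand-ins that I made up for illustration; they are not the DataFusion types, and the sketch only assumes that the real types follow the same `as_any().downcast_ref()` pattern shown above.

```rust
use std::any::Any;
use std::sync::Arc;

/// Hypothetical stand-in for a `DataSource`-like trait (not the DataFusion trait).
trait Source {
    fn as_any(&self) -> &dyn Any;
}

/// Hypothetical stand-in for a file-based source configuration.
struct FileConfig {
    name: String,
}

impl Source for FileConfig {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// A second hypothetical source type, used to show the `None` path.
struct MemoryConfig;

impl Source for MemoryConfig {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Hypothetical stand-in for a `DataSourceExec`-like plan node.
struct Exec {
    source: Arc<dyn Source>,
}

impl Exec {
    /// Returns a reference tied to `&self`, so borrows derived from it can
    /// live as long as the borrow of the `Exec` itself.
    fn source(&self) -> &Arc<dyn Source> {
        &self.source
    }

    /// Returns an owned clone. Anything borrowed from the returned `Arc`
    /// cannot outlive the temporary that holds it.
    #[allow(dead_code)]
    fn source_cloned(&self) -> Arc<dyn Source> {
        Arc::clone(&self.source)
    }
}

/// A borrowed view, analogous in shape to the wrapper in the comment above.
struct View<'a> {
    config: &'a FileConfig,
}

impl<'a> View<'a> {
    fn try_new(exec: &'a Exec) -> Option<Self> {
        // Swapping `source()` for `source_cloned()` here is rejected by the
        // borrow checker: the cloned `Arc` is a temporary dropped at the end
        // of the statement, but `config` would still borrow from it.
        let config = exec.source().as_any().downcast_ref::<FileConfig>()?;
        Some(Self { config })
    }

    fn config(&self) -> &'a FileConfig {
        self.config
    }
}

fn main() {
    let exec = Exec {
        source: Arc::new(FileConfig {
            name: "a.parquet".to_string(),
        }),
    };
    let view = View::try_new(&exec).expect("source is a FileConfig");
    println!("file: {}", view.config().name);

    let other = Exec {
        source: Arc::new(MemoryConfig),
    };
    println!("is file view: {}", View::try_new(&other).is_some());
}
```

The sketch suggests that the by-reference accessor is what lets the view carry `&'a` borrows out of `try_new`; callers that need an owned handle can still call `Arc::clone` on the returned reference.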