berkaysynnada commented on issue #15111: URL: https://github.com/apache/datafusion/issues/15111#issuecomment-2724593747
> [@berkaysynnada](https://github.com/berkaysynnada) is there any particular problem (like you are trying to implement some feature that you can not) you are trying to solve with this ticket? It doesn’t block anything, but the placement and dependencies seem strange, and this inconsistency extends to other custom source and sink implementations used by downstream users. For example, `DataSinkExec` currently resides in the `physical-plan `crate, while `DataSourceExec` is in the `datasource` crate. If someone (like us) wants to introduce two different Sink and Source definitions that depend on some common utilities (which is likely), they are forced to implement the Sink part in the datasource crate as well (conflicting with the DataSinkExec and DataSourceExec). If the Sink were implemented in physical-plan, it wouldn't have access to the datasource utilities. -------------------------------------------------------------------------------- @jayzhan211: > Given that the physical plan is the more high level crate to me. I think the existing dependency makes more sense. I am not proposing reverting the dependency (I should change the title), but we need a common crate: ``` datasource ->(depends) physical-plan-common physical-plan -> physical-plan-common physical-plan -> datasource ``` Do you think that there is a major blocker for this? > physical plan is the more high level crate to me I've also thought on this. Conceptually, it seems like you said, all datasources are physical-plan, but not inverse. So, having datasources depending physical-plan make more sense, but let's consider from this perspective: We use a tree representation having nodes accessing their children's, not parents (top-down approach) for ExecutionPlan's. So, if you imagine a typical physical-plan, at the end, it needs to keep its source exec, which is DataSourceExec. So, any physical-plan should see its data source to store it. However, datasources cannot access any physical-plan, as they do not need to access its parents. However, they just need to see the trait definition. So, that's why I propose to have a `-common` crate. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org