berkaysynnada commented on issue #15111:
URL: https://github.com/apache/datafusion/issues/15111#issuecomment-2724593747

   > [@berkaysynnada](https://github.com/berkaysynnada) is there any particular 
problem (like you are trying to implement some feature that you can not) you 
are trying to solve with this ticket?
   
   It doesn’t block anything, but the placement and dependencies seem strange, 
and this inconsistency extends to other custom source and sink implementations 
used by downstream users.
   
   For example, `DataSinkExec` currently resides in the `physical-plan `crate, 
while `DataSourceExec` is in the `datasource` crate. If someone (like us) wants 
to introduce two different Sink and Source definitions that depend on some 
common utilities (which is likely), they are forced to implement the Sink part 
in the datasource crate as well (conflicting with the DataSinkExec and 
DataSourceExec). If the Sink were implemented in physical-plan, it wouldn't 
have access to the datasource utilities.
   
   
--------------------------------------------------------------------------------
   
   @jayzhan211:
   > Given that the physical plan is the more high level crate to me. I think 
the existing dependency makes more sense.
   
   I am not proposing reverting the dependency (I should change the title), but 
we need a common crate:
   
   ```
   datasource ->(depends) physical-plan-common  
   physical-plan -> physical-plan-common 
   physical-plan -> datasource
   ```
   Do you think that there is a major blocker for this?
   
   > physical plan is the more high level crate to me
   I've also thought on this. Conceptually, it seems like you said, all 
datasources are physical-plan, but not inverse. So, having datasources 
depending physical-plan make more sense, but let's consider from this 
perspective: We use a tree representation having nodes accessing their 
children's, not parents (top-down approach) for ExecutionPlan's. So, if you 
imagine a typical physical-plan, at the end, it needs to keep its source exec, 
which is DataSourceExec. So, any physical-plan should see its data source to 
store it. However, datasources cannot access any physical-plan, as they do not 
need to access its parents. However, they just need to see the trait 
definition. So, that's why I propose to have a  `-common` crate.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to