vbarua commented on PR #13888:
URL: https://github.com/apache/datafusion/pull/13888#issuecomment-2560209539

   There are already some tests for TPCH functionality in 
https://github.com/apache/datafusion/blob/main/datafusion/substrait/tests/cases/consumer_integration.rs,
 but IMO those are weaker than what you're proposing because they only verify 
that we can read the TPCH plans that have been generated in 
https://github.com/substrait-io/consumer-testing/tree/7c1f5f1876f00c2685f722b592dbd00030662d5d/substrait_consumer/testdata/integration/tpch
   
   The full roundtrip you're proposing is more comprehensive and lets us find 
bugs in both consumer and producer. 
   
   > In most cases it looks like we need to support more of the LogicalPlan 
nodes when serializing to substrait.
   
   It looks like your approach has already found some missing features 🐛 🔨 
   
   > After this the recreated plan should be the same
   
   In many cases plans roundtrip the same, but not necessarily in every case. 
DataFusion might consume a plan and then emit a plan that is semantically 
equivalent but has different nodes. For example, all emit kind remaps might be 
converted into projections. It is a desirable property that plans roundtrip the 
same, and probably something we can work towards in most cases.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to