Re: Delta Lake support for DataFusion

2021-06-10 Thread Jorge Cardoso Leitão
Hi,

I agree with all of you. ^_^

I created https://github.com/apache/arrow-datafusion/issues/533 to track this. I tried to encapsulate the three main use-cases for the SQL extension. Feel free to edit at will.

Best,
Jorge

On Thu, Jun 10, 2021 at 8:37 AM QP Hou wrote:
> Thanks Daniël for st

Re: Delta Lake support for DataFusion

2021-06-09 Thread QP Hou
Thanks Daniël for starting the discussion! Looks like we are on the same page to take this as an opportunity to make DataFusion more extensible :) I think Neville and Daniël nailed the biggest missing piece at the moment: being able to extend the SQL parser and planner with new syntaxes and map them

Re: Delta Lake support for DataFusion

2021-06-09 Thread Andrew Lamb
> And probably some more I don't think of currently. I think this is useful
> work as it also would enable other "extensions" to work in a similar way

I 100% agree

On Wed, Jun 9, 2021 at 2:30 PM Daniël Heres wrote:
> Thanks all for the valuable input!
>
> I agree following the plugin / model ma

Re: Delta Lake support for DataFusion

2021-06-09 Thread Daniël Heres
Thanks all for the valuable input! I agree following the plugin / model makes a lot of sense for now (either in the arrow-datafusion repo or somewhere external, for example in delta-rs if we're OK with it not being part of Apache right now). In order to support certain Delta Lake features including SQL sy

Re: Delta Lake support for DataFusion

2021-06-09 Thread Neville Dipale
The correct approach might be to improve DataFusion support in delta-rs. TableProvider is already implemented here: https://github.com/delta-io/delta-rs/blob/main/rust/src/delta_datafusion.rs

I've pinged QP to ask for their advice.

Neville

On Wed, 9 Jun 2021 at 19:58, Andrew Lamb wrote:
> I th

Re: Delta Lake support for DataFusion

2021-06-09 Thread Andrew Lamb
I think the idea of DataFusion + DeltaLake is quite compelling and likely useful. However, I think DataFusion is ideally an "embeddable query engine" rather than a database system in itself, so in that mental model Delta Lake integration belongs somewhere other than the core DataFusion crate. My

Re: Delta Lake support for DataFusion

2021-06-09 Thread Jorge Cardoso Leitão
Hi,

Some questions that come to mind:

1. If we add vendor X to DataFusion, will we be open to other vendor Y? How do we compare vendors? How do we draw the line of "not sufficiently relevant"?
2. How do we ensure that we do not distort the same level playing field that some people expect from Dat