Re: [DISCUSS] SQL syntax extensions

2020-08-25 Thread Russell Spitzer
I think the moment we start touching Catalyst we should be using Scala. If in the future there is a stored procedure API in Spark, we can always go back to Java.

Re: [DISCUSS] SQL syntax extensions

2020-08-25 Thread Anton Okolnychyi
One more point we should clarify before implementing: where will the SQL extensions live? In the case of Presto, the extensions will be exposed as proper stored procedures and can be part of the Presto repo. In the case of Spark, we could either keep them in a new module in Iceberg or in a completely di…

Re: [DISCUSS] SQL syntax extensions

2020-08-04 Thread Anton Okolnychyi
During the last sync we discussed a blocker for this work raised by Carl. It was unclear how role-based access control will work in the proposed approach. Specifically, how to ensure that user `X` not only has access to a stored procedure but is also allowed to execute it on table `T`, where the table name…
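To make the concern concrete, here is an illustrative sketch (the procedure name, grant syntax, and table are hypothetical, not from the thread): a grant on the procedure alone says nothing about the tables it may be invoked against, because the table arrives as a runtime string argument.

```sql
-- Hypothetical: the grant covers the procedure itself...
GRANT EXECUTE ON PROCEDURE system.rollback_to_snapshot TO user_x;

-- ...but the target table is only a string argument at call time,
-- so the engine must separately check user_x's privileges on db.t:
CALL system.rollback_to_snapshot('db.t', 12345);
```

This is why argument-based authorization needs engine support: the authorization layer has to resolve the table identifier inside the call before deciding whether to allow it.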

Re: [DISCUSS] SQL syntax extensions

2020-07-29 Thread Ryan Blue
That looks like a good plan to me. Initially using stored procedures and adding custom syntax where possible sounds like a good way to start. For Spark, I agree that we can start exploring a plugin that can extend Spark's syntax. Having that done will make development faster and make it easier to…

Re: [DISCUSS] SQL syntax extensions

2020-07-27 Thread Anton Okolnychyi
Thanks everybody for taking a look at the doc. FYI, I’ve updated it. I would like to share some intermediate thoughts. 1. It seems beneficial to follow the stored procedures approach to call small actions like rollback or expire snapshots. Presto already allows connectors to define stored procedures…
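As a sketch of what such stored-procedure calls could look like (catalog, procedure, and argument names here are illustrative, not settled in the thread), the small maintenance actions mentioned above map naturally onto `CALL` statements:

```sql
-- Roll a table back to a previous snapshot
CALL my_catalog.system.rollback_to_snapshot('db.events', 5781947118336215154);

-- Expire snapshots older than a given timestamp
CALL my_catalog.system.expire_snapshots('db.events', TIMESTAMP '2020-07-01 00:00:00');
```

The appeal of this shape is that it needs no new grammar in engines that already support `CALL`, while engines without it (like Spark at the time) would need a parser extension to accept the same syntax.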

[DISCUSS] SQL syntax extensions

2020-07-23 Thread Anton Okolnychyi
Hi devs, I want to start a discussion on whether we want to have some SQL extensions in Iceberg that would help data engineers invoke Iceberg-specific functionality through SQL. I know companies have this internally, but I would like to unify this starting from Spark 3 and share the same syntax…