Re: Regarding the support of pluggable procedures in Iceberg

2021-11-29 Thread Ryan Murray
I've commented on the PR with a possible way forward. It's a bit weird because of the reliance on Spark imports; I would prefer to return names or Class objects rather than the actual `Procedure` from the Iceberg Catalog. Let me know what you think! Best, Ryan On Mon, Nov 29, 2021 at 2:04 PM Ryan
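
To illustrate the idea of returning names rather than `Procedure` instances, here is a minimal sketch. It is not code from the PR: `SketchProcedure`, `ProcedureNameProvider`, `SketchCatalog`, `AllBranchesExpireSnapshots`, and `ProcedureLoader` are hypothetical names, and `SketchProcedure` merely stands in for the Spark-dependent type. The catalog side hands back class names as strings, and only the engine side instantiates them.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for the engine-side (Spark-dependent) procedure type.
interface SketchProcedure {
  String describe();
}

// Hypothetical engine-specific implementation that would live in an engine module.
final class AllBranchesExpireSnapshots implements SketchProcedure {
  public AllBranchesExpireSnapshots() {}

  @Override
  public String describe() {
    return "expire snapshots across all branches";
  }
}

// Hypothetical catalog-side hook: it exposes class names only, so the
// catalog API itself never mentions the engine-side procedure type.
interface ProcedureNameProvider {
  Map<String, String> procedureClassNames();
}

final class SketchCatalog implements ProcedureNameProvider {
  @Override
  public Map<String, String> procedureClassNames() {
    // A real catalog would use a plain string literal here; getName() just
    // keeps this single-file sketch correct wherever it is compiled.
    return Collections.singletonMap(
        "expire_snapshots", AllBranchesExpireSnapshots.class.getName());
  }
}

// Engine-side helper that turns a class name into a procedure instance.
final class ProcedureLoader {
  private ProcedureLoader() {}

  static SketchProcedure instantiate(String className) {
    try {
      Class<?> cls = Class.forName(className);
      return (SketchProcedure) cls.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot load procedure class: " + className, e);
    }
  }
}
```

The trade-off in this sketch is that a misspelled class name only fails at load time, whereas returning instances would fail at compile time.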

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-29 Thread Ryan Blue
Yeah, that sounds about right. I'm not sure how we would want to do it exactly, but I think the catalog would be able to override the procedures it wants to. On Fri, Nov 26, 2021 at 9:44 AM Ryan Murray wrote: > Hey Ryan, > > Thanks for the suggestion. That makes a lot of sense. The immediate use

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-26 Thread Ryan Murray
Hey Ryan, Thanks for the suggestion. That makes a lot of sense. The immediate use case for Nessie is to supply a Nessie-ified version of the expire snapshots action which considers all branches. To be honest, this is something that likely can be (partially) merged to Iceberg once the branching/tagging work

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-19 Thread Ryan Blue
Thanks for the additional context. #1 makes more sense now. What are you thinking about exposing that isn't in the standard set of procedures? Are there procedures that you'd want for an Iceberg table when using Nessie as the catalog? Would it work to add a SupportsProcedures interface on the Icebe
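
One way to picture the `SupportsProcedures` suggestion is the sketch below. Only the interface name comes from the message; its method, `SketchProcedure`, `BranchAwareCatalog`, and `ProcedureResolver` are assumptions made for illustration, not Iceberg or Spark APIs. A catalog that implements the interface can supply its own version of a procedure, and everything else falls back to the standard set.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a stored procedure.
interface SketchProcedure {
  String describe();
}

// Hypothetical shape for the interface named in the message; the method
// signature is an assumption.
interface SupportsProcedures {
  Optional<SketchProcedure> loadProcedure(String name);
}

// Hypothetical catalog that overrides a single procedure.
final class BranchAwareCatalog implements SupportsProcedures {
  @Override
  public Optional<SketchProcedure> loadProcedure(String name) {
    if ("expire_snapshots".equals(name)) {
      SketchProcedure override = () -> "expire_snapshots across all branches";
      return Optional.of(override);
    }
    return Optional.empty();
  }
}

final class ProcedureResolver {
  private ProcedureResolver() {}

  // Ask the catalog first (if it opts in), then fall back to the standard set.
  static SketchProcedure resolve(
      Object catalog, String name, Map<String, SketchProcedure> standard) {
    if (catalog instanceof SupportsProcedures) {
      Optional<SketchProcedure> override =
          ((SupportsProcedures) catalog).loadProcedure(name);
      if (override.isPresent()) {
        return override.get();
      }
    }
    SketchProcedure fallback = standard.get(name);
    if (fallback == null) {
      throw new IllegalArgumentException("Unknown procedure: " + name);
    }
    return fallback;
  }
}
```

In this sketch a catalog that does not implement the interface behaves exactly as before, so opting in is the only way to change which procedure is resolved.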

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-19 Thread Ryan Murray
Hey Ryan, Thanks for the follow-up. As I see it, the use cases are as follows: 1) add more Iceberg-specific procedures to the existing catalog. This is when we are actually operating on Iceberg tables (not "to expose custom stored procedures for non-Iceberg systems") 2) modify existing OSS Iceberg

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-18 Thread Ryan Blue
I don’t see the code where the spark extensions can find other procedure catalogs w/o the user having to configure and reference another catalog. Yes, that’s right. If other systems want to add stored procedures, then they would need to add a catalog. Is there a strong use case around adding more

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-14 Thread Ajantha Bhat
Hi Ryan Blue and Ryan Murray, *Thanks for your input. But I think we still need to reach a conclusion on this.* @Ryan Blue: > You shouldn't need to extend Iceberg's SparkCatalog to plug in stored > procedures. The Iceberg Spark extensions should support stored procedures > exposed by any catalog p

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-12 Thread Ryan Murray
Thanks Ryan for the response. Maybe I am misunderstanding here, apologies for that. However, I don't see the code where the Spark extensions can find other procedure catalogs w/o the user having to configure and reference another catalog. Thinking about it more, I think the goal of this discussion

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-11 Thread Ryan Blue
I think there's a bit of a misunderstanding here. You shouldn't need to extend Iceberg's SparkCatalog to plug in stored procedures. The Iceberg Spark extensions should support stored procedures exposed by any catalog plugin that implements `ProcedureCatalog` across the Spark versions where Iceberg

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-11 Thread Ryan Murray
Hey Ryan, What is the timeline for ProcedureCatalog to be moved into Spark and will it be backported? I agree 100% that it's the 'correct' way to go long term but currently Iceberg has a `static final Map`[1] of valid procedures and no way for users to customize that. I personally don't love a stat
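
The limitation described here, a fixed `static final Map` with no customization hook, can be sketched as follows. This is not Iceberg's actual code: `SketchProcedure` and `ProcedureRegistry` are hypothetical, and the two built-in entries are only illustrative. The sketch keeps a fixed built-in map and adds an instance-level overlay that can extend or shadow it.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a stored procedure.
interface SketchProcedure {
  String describe();
}

final class ProcedureRegistry {
  // Fixed built-in set, analogous to the static final Map described above.
  private static final Map<String, Supplier<SketchProcedure>> BUILT_INS = builtIns();

  // Extra procedures supplied by a user or catalog (the customization hook).
  private final Map<String, Supplier<SketchProcedure>> extras;

  ProcedureRegistry(Map<String, Supplier<SketchProcedure>> extras) {
    // Defensive copy so later changes to the caller's map have no effect.
    this.extras = Collections.unmodifiableMap(new LinkedHashMap<>(extras));
  }

  private static Map<String, Supplier<SketchProcedure>> builtIns() {
    Map<String, Supplier<SketchProcedure>> map = new LinkedHashMap<>();
    map.put("expire_snapshots", () -> () -> "built-in expire_snapshots");
    map.put("rollback_to_snapshot", () -> () -> "built-in rollback_to_snapshot");
    return Collections.unmodifiableMap(map);
  }

  // Extras win over built-ins; unknown names are rejected.
  SketchProcedure load(String name) {
    Supplier<SketchProcedure> supplier = extras.getOrDefault(name, BUILT_INS.get(name));
    if (supplier == null) {
      throw new IllegalArgumentException("Unknown procedure: " + name);
    }
    return supplier.get();
  }
}
```

Constructing the registry with an empty map reproduces the fixed behaviour, while a non-empty map adds new names or shadows built-in ones.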

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-10 Thread Ryan Blue
I think that probably the best way to handle this use case is to have people implement the Iceberg `ProcedureCatalog` API. That's what we want to get upstream into Spark and is a really reasonable (and small) addition to Spark. The problem with adding pluggable procedures to Iceberg is that it is

Regarding the support of pluggable procedures in Iceberg

2021-11-10 Thread Ajantha Bhat
Hi Community! If Iceberg provides a capability to plug in procedures, it will be really helpful for users to plug in their own Spark actions to handle their business logic around Iceberg tables. So, can we have a mechanism that allows plugging additional implementations of *org.apache.spark.sql.conn