I've commented on the PR with a possible way forward. It's a bit awkward
because of the reliance on Spark imports; I would prefer to return names or
Class objects rather than the actual `Procedure` from the Iceberg catalog.
Let me know what you think!
Best,
Ryan
On Mon, Nov 29, 2021 at 2:04 PM Ryan …
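
A rough sketch of the idea in the note above, with hypothetical names (this
is not the shape of the actual PR): the catalog hands out class names only,
and the Spark side instantiates them reflectively, so the catalog module
never imports Spark types.

    import java.util.Map;

    // Hypothetical interface on the Iceberg-side catalog: no Spark types,
    // just procedure names mapped to implementation class names.
    public interface HasProcedureClasses {
      Map<String, String> procedureClassNames();
    }

    // Spark side (sketch): resolve and instantiate reflectively.
    // String className =
    //     catalog.procedureClassNames().get("expire_snapshots");
    // Procedure proc = (Procedure) Class.forName(className)
    //     .getDeclaredConstructor()
    //     .newInstance();
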
Yeah, that sounds about right. I'm not sure how we would want to do it
exactly, but I think the catalog would be able to override the procedures
it wants to.
On Fri, Nov 26, 2021 at 9:44 AM Ryan Murray wrote:
Hey Ryan,
Thanks for the suggestion. That makes a lot of sense. The immediate use
case for Nessie is to supply a Nessie-ified version of the expire snapshots
action which considers all branches. To be honest, this is something that
likely can be (partially) merged into Iceberg once the branching/tagging work …
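
For concreteness, here is roughly how such a procedure might be invoked,
assuming an existing SparkSession `spark` and a Nessie catalog named
`nessie`; the branch-aware behavior described is illustrative, not a
shipped API (the parameter names match Iceberg's stock expire_snapshots).

    // Illustrative only: a Nessie-aware expire-snapshots call. Iceberg's
    // stock procedure expires against a single table pointer; a
    // branch-aware variant would keep any snapshot still reachable from
    // some branch or tag.
    spark.sql(
        "CALL nessie.system.expire_snapshots("
            + "table => 'db.tbl', "
            + "older_than => TIMESTAMP '2021-10-01 00:00:00')")
        .show();
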
Thanks for the additional context. #1 makes more sense now. What are you
thinking about exposing that isn't in the standard set of procedures? Are
there procedures that you'd want for an Iceberg table when using Nessie as
the catalog? Would it work to add a SupportsProcedures interface on the
Iceberg …
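
Guessing at the shape of that SupportsProcedures idea (no such interface
exists in Iceberg as of this thread; ProcedureBuilder stands in for
Iceberg's internal SparkProcedures.ProcedureBuilder type):

    import java.util.Map;
    import java.util.function.Supplier;

    // Hypothetical opt-in interface: a catalog contributes procedure
    // builders by name, and entries here could shadow the built-in set.
    public interface SupportsProcedures {
      Map<String, Supplier<ProcedureBuilder>> procedureBuilders();
    }
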
Hey Ryan,
Thanks for the follow-up. As I see it, the use cases are as follows:
1) Add more Iceberg-specific procedures to the existing catalog. This is
when we are actually operating on Iceberg tables (not "to expose custom
stored procedures for non-Iceberg systems").
2) Modify existing OSS Iceberg …
I don't see the code where the Spark extensions can find other procedure
catalogs without the user having to configure and reference another catalog.
Yes, that’s right. If other systems want to add stored procedures, then
they would need to add a catalog. Is there a strong use case around adding
more …
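
That wiring is the standard Spark catalog configuration; the catalog
implementation class below is hypothetical, but the config keys are the
real Spark/Iceberg ones.

    import org.apache.spark.sql.SparkSession;

    public class ProcedureCatalogSetup {
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            // Iceberg's SQL extensions provide the CALL syntax.
            .config("spark.sql.extensions",
                "org.apache.iceberg.spark.extensions"
                    + ".IcebergSparkSessionExtensions")
            // A separate, explicitly configured catalog exposing
            // procedures; com.example.CustomProcedureCatalog is a
            // hypothetical implementation class.
            .config("spark.sql.catalog.custom",
                "com.example.CustomProcedureCatalog")
            .getOrCreate();
        // Procedures would then resolve against the catalog name
        // "custom" in CALL statements.
      }
    }
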
Hi Ryan Blue and Ryan Murray,
*Thanks for giving your inputs, but I think we still need to reach a
conclusion on this.*
@Ryan Blue:
> You shouldn't need to extend Iceberg's SparkCatalog to plug in stored
> procedures. The Iceberg Spark extensions should support stored procedures
> exposed by any catalog plugin that implements `ProcedureCatalog` …
Thanks Ryan for the response.
Maybe I am misunderstanding here; apologies for that. However, I don't see
the code where the Spark extensions can find other procedure catalogs
without the user having to configure and reference another catalog.
Thinking about it more, I think the goal of this discussion …
I think there's a bit of a misunderstanding here. You shouldn't need to
extend Iceberg's SparkCatalog to plug in stored procedures. The Iceberg
Spark extensions should support stored procedures exposed by any catalog
plugin that implements `ProcedureCatalog` across the Spark versions where
Iceberg …
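
For reference, implementing that interface looks roughly like the
following; the packages and signatures reflect my reading of Iceberg's
bundled copy of the API at the time, so treat them as approximate.

    import org.apache.spark.sql.connector.catalog.Identifier;
    import org.apache.spark.sql.connector.iceberg.catalog.Procedure;
    import org.apache.spark.sql.connector.iceberg.catalog.ProcedureCatalog;
    import org.apache.spark.sql.util.CaseInsensitiveStringMap;

    // Sketch of a catalog plugin that exposes custom stored procedures
    // by implementing the ProcedureCatalog interface Iceberg ships.
    public class CustomProcedureCatalog implements ProcedureCatalog {
      private String name;

      @Override
      public void initialize(String name, CaseInsensitiveStringMap options) {
        this.name = name;
      }

      @Override
      public String name() {
        return name;
      }

      @Override
      public Procedure loadProcedure(Identifier ident) {
        // Return a custom Procedure implementation here, e.g. a
        // hypothetical branch-aware expire_snapshots:
        // if ("expire_snapshots".equals(ident.name())) {
        //   return new BranchAwareExpireSnapshotsProcedure();
        // }
        throw new UnsupportedOperationException(
            "Unknown procedure: " + ident);
      }
    }
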
Hey Ryan,
What is the timeline for ProcedureCatalog to be moved into Spark, and will
it be backported? I agree 100% that it's the 'correct' way to go long term,
but currently Iceberg has a `static final Map`[1] of valid procedures and
no way for users to customize that. I personally don't love a static …
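
For anyone following along, the registry being referred to is roughly of
this shape (paraphrased from memory, not a verbatim copy of Iceberg's
SparkProcedures; the procedure classes live in
org.apache.iceberg.spark.procedures and ProcedureBuilder is the nested
builder interface on SparkProcedures):

    import java.util.Map;
    import java.util.function.Supplier;

    // Paraphrase of the pattern: a private static registry of procedure
    // builders that user code has no hook to extend or replace.
    class SparkProceduresSketch {
      private static final Map<String, Supplier<ProcedureBuilder>> BUILDERS =
          Map.of(
              "expire_snapshots", ExpireSnapshotsProcedure::builder,
              "rollback_to_snapshot", RollbackToSnapshotProcedure::builder);
    }
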
I think that probably the best way to handle this use case is to have
people implement the Iceberg `ProcedureCatalog` API. That's what we want to
get upstream into Spark and is a really reasonable (and small) addition to
Spark.
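
Whichever way it lands, the end-user surface stays the same CALL syntax;
assuming a SparkSession `spark` and a catalog named `custom` that
implements the interface (names illustrative):

    // Procedures resolve against whatever catalog exposes them.
    spark.sql(
        "CALL custom.system.expire_snapshots(table => 'db.tbl')").show();
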
The problem with adding pluggable procedures to Iceberg is that it is …