Thanks Yufei and everyone else that participated in the discussion. I'll open a vote shortly to get this moving.
On Tue, Aug 20, 2024 at 8:04 PM Yufei Gu <flyrain...@gmail.com> wrote: > I'm fine with the approach to present the endpoint ID as a plain string. > For any extension in the future, we can add a new field parallel with the > endpoint array. Thanks a lot for driving this, Edward. > > Yufei > > > > On Tue, Aug 20, 2024 at 8:05 AM Eduard Tudenhöfner < > etudenhoef...@apache.org> wrote: > >> @Karuppayya: I've answered your question in the doc. >> >> @Yufei: Part of the reason why the proposed format is so simple is >> because there were different opinions about what level of granularity we >> should have in the context of REST capabilities, which then led to this >> version. >> It seems people are generally onboard with the simple approach of using >> *"<httpVerb> >> <resourcePath>" * so maybe we should go with this, wdyt @Yufei? >> >> >> >> On Fri, Aug 16, 2024 at 6:50 PM Yufei Gu <flyrain...@gmail.com> wrote: >> >>> I’m OK with using a plain string for the endpoint ID, as described in >>> doc[1]. However, I’ve been thinking about how we can make this more >>> flexible, especially since we’ve had quite a few discussions about >>> granularity. >>> >>> For instance, if we expose a bit more of the API specification, we might >>> not need some of the feature flags. >>> >>> It might be worthwhile to consider adopting the OpenAPI path object >>> schema[2] if we decide to add more structure. Since Iceberg is already >>> using OpenAPI to define the REST specification and will continue to do so, >>> this approach could make future extensions easier. >>> >>> Looking forward to your thoughts on this! >>> >>> Ref: >>> >>> 1. API Endpoints >>> <https://learn.openapis.org/specification/paths.html#api-endpoints> >>> 2. Path Object <https://spec.openapis.org/oas/v3.1.0#paths-object> >>> >>> Yufei >>> >>> >>> On Fri, Aug 16, 2024 at 8:58 AM karuppayya <karuppayya1...@gmail.com> >>> wrote: >>> >>>> Can we consider adding `OPTIONS` verb to the resource paths, as part of >>>> the spec? >>>> That way the endpoint discovery endpoint could return only the list of >>>> supported endpoints, without the verbs. >>>> >>>> `OPTIONS` on the resource path can return the list of all supported >>>> verbs, and also other information regarding API usage >>>> >>>> - Karuppayya >>>> >>>> On Fri, Aug 16, 2024 at 7:34 AM Dmitri Bourlatchkov >>>> <dmitri.bourlatch...@dremio.com.invalid> wrote: >>>> >>>>> Sorry for a bit of back-tracking, but I'd like to clarify the meaning >>>>> of those endpoint strings. >>>>> >>>>> I initially assumed that the strings would need to be parsed into >>>>> components (verb / path) for use in runtime. My suggestion for using a >>>>> JSON >>>>> representation was meant to make the parsing more standard (i.e. avoid >>>>> custom format). >>>>> >>>>> After reading the impl. PR [1] I think parsing is actually _not_ >>>>> necessary. The whole endpoint string is basically another form of an >>>>> endpoint "ID". >>>>> >>>>> If we agree that the values in the new endpoints config list should be >>>>> treated as opaque IDs, I think it should be fine to use the simple string >>>>> representation. >>>>> >>>>> WDYT? >>>>> >>>>> Thanks, >>>>> Dmitri. >>>>> >>>>> [1] https://github.com/apache/iceberg/pull/10929 >>>>> >>>>> On Fri, Aug 16, 2024 at 4:43 AM Eduard Tudenhöfner < >>>>> etudenhoef...@apache.org> wrote: >>>>> >>>>>> If we really want to add more structure to the JSON representation, >>>>>> then I would probably prefer what Dmitri suggested in the doc as I think >>>>>> { "GET": {}, "POST": {} } looks a bit weird: >>>>>> >>>>>> "endpoints":[ >>>>>> {"verb": "GET","path": "/v1/{prefix}/namespaces/{namespace}"}, >>>>>> {"verb": "GET","path": "/v1/{prefix}/namespaces"}, >>>>>> {"verb": "POST","path": "/v1/{prefix}/namespaces"} >>>>>> ] >>>>>> >>>>>> What do people think of that? >>>>>> >>>>>> >>>>>> >>>>>> On Fri, Aug 16, 2024 at 8:13 AM Walaa Eldin Moustafa < >>>>>> wa.moust...@gmail.com> wrote: >>>>>> >>>>>>> Thank you Eduard for sharing this version of the proposal. Looks >>>>>>> simple, functional, and extensible. >>>>>>> >>>>>>> On Thu, Aug 15, 2024 at 1:10 PM Ryan Blue >>>>>>> <b...@databricks.com.invalid> wrote: >>>>>>> >>>>>>>> I think I'm fine either way. I lean toward the simplicity of the >>>>>>>> strings in the proposal but would not complain if we went with Yufei's >>>>>>>> suggestion. >>>>>>>> >>>>>>>> On Thu, Aug 15, 2024 at 12:12 PM Yufei Gu <flyrain...@gmail.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> The current proposal lists endpoints as plain strings, and I still >>>>>>>>> believe we could make things a bit smoother by adding some structure >>>>>>>>> to >>>>>>>>> them. Here's the example if the previous one throws you off. >>>>>>>>> >>>>>>>>> *Before:* >>>>>>>>> >>>>>>>>> "GET /v1/{prefix}/namespaces","POST /v1/{prefix}/namespaces","GET >>>>>>>>> /v1/{prefix}/namespaces/{namespace}","DELETE >>>>>>>>> /v1/{prefix}/namespaces/{namespace}" >>>>>>>>> >>>>>>>>> *After:* >>>>>>>>> >>>>>>>>> {"/v1/{prefix}/namespaces": { "GET": {}, "POST": {} } >>>>>>>>> },{"/v1/{prefix}/namespaces/{namespace}": { "GET": {}, "DELETE": {} } >>>>>>>>> } >>>>>>>>> >>>>>>>>> I think this approach would make the definitions more >>>>>>>>> machine-readable and easier to expand in the future. It's always good >>>>>>>>> to >>>>>>>>> plan for potential changes, and this structure should help us adapt >>>>>>>>> without >>>>>>>>> much hassle. >>>>>>>>> Plus, we could use a trimmed OpenAPI schema for this, so we >>>>>>>>> wouldn’t need to create a new schema from scratch. >>>>>>>>> >>>>>>>>> Yufei >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, Aug 15, 2024 at 11:02 AM Jack Ye <yezhao...@gmail.com> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> > But I propose to use a trimmed openAPI's format directly. >>>>>>>>>> Looking at the example, this feels quite complicated to me. >>>>>>>>>> >>>>>>>>>> > For example, it is easier if we want to include operationID >>>>>>>>>> I don't think we need to consider accommodating both, since >>>>>>>>>> operationId is an alternative to "<HTTP VERB> <resource path from >>>>>>>>>> REST >>>>>>>>>> spec>". >>>>>>>>>> >>>>>>>>>> > or adding feature flags, or adding parameters >>>>>>>>>> For more advanced feature flags, those could likely directly go >>>>>>>>>> to overrides just like today, but given we limited the scope to >>>>>>>>>> service >>>>>>>>>> endpoint discovery, I think that is another discussion later. >>>>>>>>>> >>>>>>>>>> So the current proposed solution from Eduard still feels better >>>>>>>>>> to me. And I think the argument for ambiguity is pretty strong so I >>>>>>>>>> am good >>>>>>>>>> with the proposed approach to use "<HTTP VERB> <resource path from >>>>>>>>>> REST >>>>>>>>>> spec>". >>>>>>>>>> >>>>>>>>>> -Jack >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Aug 15, 2024 at 9:46 AM Yufei Gu <flyrain...@gmail.com> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> +1 for the proposal. In terms of the format, the current >>>>>>>>>>> solution is simple enough. But I propose to use a trimmed openAPI's >>>>>>>>>>> format >>>>>>>>>>> directly. It won't add much cost as we can just take the minimum >>>>>>>>>>> fields we >>>>>>>>>>> want. But it opens a window to extend it in the future. For >>>>>>>>>>> example, it is >>>>>>>>>>> easier if we want to include operationID, or adding feature flags, >>>>>>>>>>> or >>>>>>>>>>> adding parameters. Here is an example: >>>>>>>>>>> >>>>>>>>>>> { >>>>>>>>>>> >>>>>>>>>>> "resources": [ >>>>>>>>>>> >>>>>>>>>>> { >>>>>>>>>>> >>>>>>>>>>> "/v1/{prefix}/namespaces": >>>>>>>>>>> >>>>>>>>>>> { >>>>>>>>>>> >>>>>>>>>>> "GET": { >>>>>>>>>>> >>>>>>>>>>> "description": "List all namespaces" >>>>>>>>>>> >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> }, >>>>>>>>>>> >>>>>>>>>>> { >>>>>>>>>>> >>>>>>>>>>> "POST": { >>>>>>>>>>> >>>>>>>>>>> "description": "Create a new namespace" >>>>>>>>>>> >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> }, >>>>>>>>>>> >>>>>>>>>>> { >>>>>>>>>>> >>>>>>>>>>> "path2": {} >>>>>>>>>>> >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> ... >>>>>>>>>>> >>>>>>>>>>> ] >>>>>>>>>>> >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Yufei >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, Aug 15, 2024 at 8:47 AM Russell Spitzer < >>>>>>>>>>> russell.spit...@gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> I'm on board for this proposal. I was in the off-mail chats and >>>>>>>>>>>> I think this is probably our simplest approach going forward. >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Aug 15, 2024 at 10:39 AM Dmitri Bourlatchkov >>>>>>>>>>>> <dmitri.bourlatch...@dremio.com.invalid> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> OpenAPI tool will WARN a lot if Operation IDs overlap. >>>>>>>>>>>>> Generated code/html may also look odd in case of overlaps. >>>>>>>>>>>>> >>>>>>>>>>>>> All-in-all, I think the best practice is to define unique >>>>>>>>>>>>> Operation IDs up front. >>>>>>>>>>>>> >>>>>>>>>>>>> For Iceberg REST API, the yaml file is the API definition, so >>>>>>>>>>>>> it should not be a problem to ensure that Operation IDs are >>>>>>>>>>>>> unique, I guess. >>>>>>>>>>>>> >>>>>>>>>>>>> Cheers, >>>>>>>>>>>>> Dmitri. >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Aug 15, 2024 at 11:32 AM Eduard Tudenhöfner < >>>>>>>>>>>>> etudenhoef...@apache.org> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hey Jack, >>>>>>>>>>>>>> >>>>>>>>>>>>>> thanks for the feedback. I replied in the doc but I can >>>>>>>>>>>>>> reiterate my answer here too: The *path* is unique and >>>>>>>>>>>>>> required so that feels more appropriate than requiring to have >>>>>>>>>>>>>> an optional >>>>>>>>>>>>>> *operationId* in the OpenAPI spec. >>>>>>>>>>>>>> Additionally, using the path is more straight-forward when we >>>>>>>>>>>>>> introduce v2 endpoints, while you would have to make sure that >>>>>>>>>>>>>> all >>>>>>>>>>>>>> *operationIds* are unique across endpoints (and I'm not sure >>>>>>>>>>>>>> if OpenAPI tools actually enforce uniqueness). >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, Aug 15, 2024 at 5:20 PM Jack Ye <yezhao...@gmail.com> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Eduard, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In general I agree with this proposal, thanks for putting >>>>>>>>>>>>>>> this up! Just one question (which I also added in the design), >>>>>>>>>>>>>>> what are the >>>>>>>>>>>>>>> thoughts behind using "<HTTP VERB> <resource path from REST >>>>>>>>>>>>>>> spec>", vs >>>>>>>>>>>>>>> using the operationId defined in the OpenAPI? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The operationId approach definitely looks much cleaner to >>>>>>>>>>>>>>> me, but (1) in OpenAPI it is not a requirement to define it, >>>>>>>>>>>>>>> and (2) right >>>>>>>>>>>>>>> now there are some inconsistent operationIds, for example >>>>>>>>>>>>>>> UpdateTable is >>>>>>>>>>>>>>> the operationId, but CommitTable is used for all request and >>>>>>>>>>>>>>> response >>>>>>>>>>>>>>> models. But these are all pretty solvable issues because we can >>>>>>>>>>>>>>> enforce >>>>>>>>>>>>>>> operationId to be required in the Iceberg spec, and fix it to be >>>>>>>>>>>>>>> consistent, assuming nobody is taking a dependency on these >>>>>>>>>>>>>>> operationIds >>>>>>>>>>>>>>> right now. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Personally speaking, I am pretty neutral on this topic, but >>>>>>>>>>>>>>> curious what everyone thinks. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>> Jack Ye >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Wed, Aug 14, 2024 at 9:20 AM Eduard Tudenhöfner < >>>>>>>>>>>>>>> etudenhoef...@apache.org> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hey Dmitri, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> this proposal is the result of the community feedback from >>>>>>>>>>>>>>>> the Capabilities proposal. Ultimately the capabilities turned >>>>>>>>>>>>>>>> out to entail >>>>>>>>>>>>>>>> more complexity than necessary and so this proposal solves the >>>>>>>>>>>>>>>> core problem >>>>>>>>>>>>>>>> while keeping complexity and spec changes to an absolute >>>>>>>>>>>>>>>> minimum. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Eduard >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Wed, Aug 14, 2024 at 5:15 PM Dmitri Bourlatchkov >>>>>>>>>>>>>>>> <dmitri.bourlatch...@dremio.com.invalid> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi Eduard, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> How is this proposal related to the Server Capabilities >>>>>>>>>>>>>>>>> discussion? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> Dmitri. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Wed, Aug 14, 2024 at 5:14 AM Eduard Tudenhöfner < >>>>>>>>>>>>>>>>> etudenhoef...@apache.org> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hey everyone, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I'd like to propose a way for REST servers to communicate >>>>>>>>>>>>>>>>>> to clients what endpoints it supports via a new >>>>>>>>>>>>>>>>>> *endpoints* field in the *CatalogConfig* of the >>>>>>>>>>>>>>>>>> *v1/config* endpoint. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> This enables clients to make better decisions and clearly >>>>>>>>>>>>>>>>>> signal that a particular endpoint isn’t supported. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I opened #10937 >>>>>>>>>>>>>>>>>> <https://github.com/apache/iceberg/issues/10937> to >>>>>>>>>>>>>>>>>> track the proposal in GH. Please find the proposal doc >>>>>>>>>>>>>>>>>> here >>>>>>>>>>>>>>>>>> <https://docs.google.com/document/d/1krcIaLfxtBFDABU5ssLmf64zyHgE8BRncpGPIMTWlxo/edit?usp=sharing> >>>>>>>>>>>>>>>>>> (estimated >>>>>>>>>>>>>>>>>> read time: 5 minutes). The proposal requires a Spec change, >>>>>>>>>>>>>>>>>> which can be >>>>>>>>>>>>>>>>>> seen in #10928 >>>>>>>>>>>>>>>>>> <https://github.com/apache/iceberg/pull/10928>. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Eduard >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Ryan Blue >>>>>>>> Databricks >>>>>>>> >>>>>>>