Thanks Yufei and everyone else that participated in the discussion. I'll
open a vote shortly to get this moving.

On Tue, Aug 20, 2024 at 8:04 PM Yufei Gu <flyrain...@gmail.com> wrote:

> I'm fine with the approach to present the endpoint ID as a plain string.
> For any extension in the future, we can add a new field parallel with the
> endpoint array. Thanks a lot for driving this, Edward.
>
> Yufei
>
>
>
> On Tue, Aug 20, 2024 at 8:05 AM Eduard Tudenhöfner <
> etudenhoef...@apache.org> wrote:
>
>> @Karuppayya: I've answered your question in the doc.
>>
>> @Yufei: Part of the reason why the proposed format is so simple is
>> because there were different opinions about what level of granularity we
>> should have in the context of REST capabilities, which then led to this
>> version.
>> It seems people are generally onboard with the simple approach of using 
>> *"<httpVerb>
>> <resourcePath>" * so maybe we should go with this, wdyt @Yufei?
>>
>>
>>
>> On Fri, Aug 16, 2024 at 6:50 PM Yufei Gu <flyrain...@gmail.com> wrote:
>>
>>> I’m OK with using a plain string for the endpoint ID, as described in
>>> doc[1]. However, I’ve been thinking about how we can make this more
>>> flexible, especially since we’ve had quite a few discussions about
>>> granularity.
>>>
>>> For instance, if we expose a bit more of the API specification, we might
>>> not need some of the feature flags.
>>>
>>> It might be worthwhile to consider adopting the OpenAPI path object
>>> schema[2] if we decide to add more structure. Since Iceberg is already
>>> using OpenAPI to define the REST specification and will continue to do so,
>>> this approach could make future extensions easier.
>>>
>>> Looking forward to your thoughts on this!
>>>
>>> Ref:
>>>
>>>    1. API Endpoints
>>>    <https://learn.openapis.org/specification/paths.html#api-endpoints>
>>>    2. Path Object <https://spec.openapis.org/oas/v3.1.0#paths-object>
>>>
>>> Yufei
>>>
>>>
>>> On Fri, Aug 16, 2024 at 8:58 AM karuppayya <karuppayya1...@gmail.com>
>>> wrote:
>>>
>>>> Can we consider adding `OPTIONS` verb to the resource paths, as part of
>>>> the spec?
>>>> That way the endpoint discovery endpoint could return only the list of
>>>> supported endpoints, without the verbs.
>>>>
>>>> `OPTIONS` on the resource path can return the list of all supported
>>>> verbs, and also other information regarding API usage
>>>>
>>>> - Karuppayya
>>>>
>>>> On Fri, Aug 16, 2024 at 7:34 AM Dmitri Bourlatchkov
>>>> <dmitri.bourlatch...@dremio.com.invalid> wrote:
>>>>
>>>>> Sorry for a bit of back-tracking, but I'd like to clarify the meaning
>>>>> of those endpoint strings.
>>>>>
>>>>> I initially assumed that the strings would need to be parsed into
>>>>> components (verb / path) for use in runtime. My suggestion for using a 
>>>>> JSON
>>>>> representation was meant to make the parsing more standard (i.e. avoid
>>>>> custom format).
>>>>>
>>>>> After reading the impl. PR [1] I think parsing is actually _not_
>>>>> necessary. The whole endpoint string is basically another form of an
>>>>> endpoint "ID".
>>>>>
>>>>> If we agree that the values in the new endpoints config list should be
>>>>> treated as opaque IDs, I think it should be fine to use the simple string
>>>>> representation.
>>>>>
>>>>> WDYT?
>>>>>
>>>>> Thanks,
>>>>> Dmitri.
>>>>>
>>>>> [1] https://github.com/apache/iceberg/pull/10929
>>>>>
>>>>> On Fri, Aug 16, 2024 at 4:43 AM Eduard Tudenhöfner <
>>>>> etudenhoef...@apache.org> wrote:
>>>>>
>>>>>> If we really want to add more structure to the JSON representation,
>>>>>> then I would probably prefer what Dmitri suggested in the doc as I think
>>>>>> { "GET": {}, "POST": {} } looks a bit weird:
>>>>>>
>>>>>> "endpoints":[
>>>>>>   {"verb": "GET","path": "/v1/{prefix}/namespaces/{namespace}"},
>>>>>>   {"verb": "GET","path": "/v1/{prefix}/namespaces"},
>>>>>>   {"verb": "POST","path": "/v1/{prefix}/namespaces"}
>>>>>> ]
>>>>>>
>>>>>> What do people think of that?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Aug 16, 2024 at 8:13 AM Walaa Eldin Moustafa <
>>>>>> wa.moust...@gmail.com> wrote:
>>>>>>
>>>>>>> Thank you Eduard for sharing this version of the proposal. Looks
>>>>>>> simple, functional, and extensible.
>>>>>>>
>>>>>>> On Thu, Aug 15, 2024 at 1:10 PM Ryan Blue
>>>>>>> <b...@databricks.com.invalid> wrote:
>>>>>>>
>>>>>>>> I think I'm fine either way. I lean toward the simplicity of the
>>>>>>>> strings in the proposal but would not complain if we went with Yufei's
>>>>>>>> suggestion.
>>>>>>>>
>>>>>>>> On Thu, Aug 15, 2024 at 12:12 PM Yufei Gu <flyrain...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> The current proposal lists endpoints as plain strings, and I still
>>>>>>>>> believe we could make things a bit smoother by adding some structure 
>>>>>>>>> to
>>>>>>>>> them. Here's the example if the previous one throws you off.
>>>>>>>>>
>>>>>>>>> *Before:*
>>>>>>>>>
>>>>>>>>> "GET /v1/{prefix}/namespaces","POST /v1/{prefix}/namespaces","GET 
>>>>>>>>> /v1/{prefix}/namespaces/{namespace}","DELETE 
>>>>>>>>> /v1/{prefix}/namespaces/{namespace}"
>>>>>>>>>
>>>>>>>>> *After:*
>>>>>>>>>
>>>>>>>>> {"/v1/{prefix}/namespaces": { "GET": {}, "POST": {} } 
>>>>>>>>> },{"/v1/{prefix}/namespaces/{namespace}": { "GET": {}, "DELETE": {} } 
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> I think this approach would make the definitions more
>>>>>>>>> machine-readable and easier to expand in the future. It's always good 
>>>>>>>>> to
>>>>>>>>> plan for potential changes, and this structure should help us adapt 
>>>>>>>>> without
>>>>>>>>> much hassle.
>>>>>>>>> Plus, we could use a trimmed OpenAPI schema for this, so we
>>>>>>>>> wouldn’t need to create a new schema from scratch.
>>>>>>>>>
>>>>>>>>> Yufei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Aug 15, 2024 at 11:02 AM Jack Ye <yezhao...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> > But I propose to use a trimmed openAPI's format directly.
>>>>>>>>>> Looking at the example, this feels quite complicated to me.
>>>>>>>>>>
>>>>>>>>>> > For example, it is easier if we want to include operationID
>>>>>>>>>> I don't think we need to consider accommodating both, since
>>>>>>>>>> operationId is an alternative to "<HTTP VERB> <resource path from 
>>>>>>>>>> REST
>>>>>>>>>> spec>".
>>>>>>>>>>
>>>>>>>>>> > or adding feature flags, or adding parameters
>>>>>>>>>> For more advanced feature flags, those could likely directly go
>>>>>>>>>> to overrides just like today, but given we limited the scope to 
>>>>>>>>>> service
>>>>>>>>>> endpoint discovery, I think that is another discussion later.
>>>>>>>>>>
>>>>>>>>>> So the current proposed solution from Eduard still feels better
>>>>>>>>>> to me. And I think the argument for ambiguity is pretty strong so I 
>>>>>>>>>> am good
>>>>>>>>>> with the proposed approach to use "<HTTP VERB> <resource path from 
>>>>>>>>>> REST
>>>>>>>>>> spec>".
>>>>>>>>>>
>>>>>>>>>> -Jack
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 15, 2024 at 9:46 AM Yufei Gu <flyrain...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> +1 for the proposal. In terms of the format, the current
>>>>>>>>>>> solution is simple enough. But I propose to use a trimmed openAPI's 
>>>>>>>>>>> format
>>>>>>>>>>> directly. It won't add much cost as we can just take the minimum 
>>>>>>>>>>> fields we
>>>>>>>>>>> want. But it opens a window to extend it in the future. For 
>>>>>>>>>>> example, it is
>>>>>>>>>>> easier if we want to include operationID, or adding feature flags, 
>>>>>>>>>>> or
>>>>>>>>>>> adding parameters. Here is an example:
>>>>>>>>>>>
>>>>>>>>>>> {
>>>>>>>>>>>
>>>>>>>>>>>   "resources": [
>>>>>>>>>>>
>>>>>>>>>>>     {
>>>>>>>>>>>
>>>>>>>>>>>       "/v1/{prefix}/namespaces":
>>>>>>>>>>>
>>>>>>>>>>>         {
>>>>>>>>>>>
>>>>>>>>>>>           "GET": {
>>>>>>>>>>>
>>>>>>>>>>>           "description": "List all namespaces"
>>>>>>>>>>>
>>>>>>>>>>>           }
>>>>>>>>>>>
>>>>>>>>>>>         },
>>>>>>>>>>>
>>>>>>>>>>>         {
>>>>>>>>>>>
>>>>>>>>>>>           "POST": {
>>>>>>>>>>>
>>>>>>>>>>>           "description": "Create a new namespace"
>>>>>>>>>>>
>>>>>>>>>>>           }
>>>>>>>>>>>
>>>>>>>>>>>         }
>>>>>>>>>>>
>>>>>>>>>>>     },
>>>>>>>>>>>
>>>>>>>>>>>     {
>>>>>>>>>>>
>>>>>>>>>>>       "path2": {}
>>>>>>>>>>>
>>>>>>>>>>>     }
>>>>>>>>>>>
>>>>>>>>>>>   ...
>>>>>>>>>>>
>>>>>>>>>>>   ]
>>>>>>>>>>>
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yufei
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Aug 15, 2024 at 8:47 AM Russell Spitzer <
>>>>>>>>>>> russell.spit...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'm on board for this proposal. I was in the off-mail chats and
>>>>>>>>>>>> I think this is probably our simplest approach going forward.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Aug 15, 2024 at 10:39 AM Dmitri Bourlatchkov
>>>>>>>>>>>> <dmitri.bourlatch...@dremio.com.invalid> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> OpenAPI tool will WARN a lot if Operation IDs overlap.
>>>>>>>>>>>>> Generated code/html may also look odd in case of overlaps.
>>>>>>>>>>>>>
>>>>>>>>>>>>> All-in-all, I think the best practice is to define unique
>>>>>>>>>>>>> Operation IDs up front.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For Iceberg REST API, the yaml file is the API definition, so
>>>>>>>>>>>>> it should not be a problem to ensure that Operation IDs are 
>>>>>>>>>>>>> unique, I guess.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Dmitri.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Aug 15, 2024 at 11:32 AM Eduard Tudenhöfner <
>>>>>>>>>>>>> etudenhoef...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hey Jack,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks for the feedback. I replied in the doc but I can
>>>>>>>>>>>>>> reiterate my answer here too: The *path* is unique and
>>>>>>>>>>>>>> required so that feels more appropriate than requiring to have 
>>>>>>>>>>>>>> an optional
>>>>>>>>>>>>>> *operationId* in the OpenAPI spec.
>>>>>>>>>>>>>> Additionally, using the path is more straight-forward when we
>>>>>>>>>>>>>> introduce v2 endpoints, while you would have to make sure that 
>>>>>>>>>>>>>> all
>>>>>>>>>>>>>> *operationIds* are unique across endpoints (and I'm not sure
>>>>>>>>>>>>>> if OpenAPI tools actually enforce uniqueness).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Aug 15, 2024 at 5:20 PM Jack Ye <yezhao...@gmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Eduard,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In general I agree with this proposal, thanks for putting
>>>>>>>>>>>>>>> this up! Just one question (which I also added in the design), 
>>>>>>>>>>>>>>> what are the
>>>>>>>>>>>>>>> thoughts behind using "<HTTP VERB> <resource path from REST 
>>>>>>>>>>>>>>> spec>", vs
>>>>>>>>>>>>>>> using the operationId defined in the OpenAPI?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The operationId approach definitely looks much cleaner to
>>>>>>>>>>>>>>> me, but (1) in OpenAPI it is not a requirement to define it, 
>>>>>>>>>>>>>>> and (2) right
>>>>>>>>>>>>>>> now there are some inconsistent operationIds, for example 
>>>>>>>>>>>>>>> UpdateTable is
>>>>>>>>>>>>>>> the operationId, but CommitTable is used for all request and 
>>>>>>>>>>>>>>> response
>>>>>>>>>>>>>>> models. But these are all pretty solvable issues because we can 
>>>>>>>>>>>>>>> enforce
>>>>>>>>>>>>>>> operationId to be required in the Iceberg spec, and fix it to be
>>>>>>>>>>>>>>> consistent, assuming nobody is taking a dependency on these 
>>>>>>>>>>>>>>> operationIds
>>>>>>>>>>>>>>> right now.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Personally speaking, I am pretty neutral on this topic, but
>>>>>>>>>>>>>>> curious what everyone thinks.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Jack Ye
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Aug 14, 2024 at 9:20 AM Eduard Tudenhöfner <
>>>>>>>>>>>>>>> etudenhoef...@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hey Dmitri,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> this proposal is the result of the community feedback from
>>>>>>>>>>>>>>>> the Capabilities proposal. Ultimately the capabilities turned 
>>>>>>>>>>>>>>>> out to entail
>>>>>>>>>>>>>>>> more complexity than necessary and so this proposal solves the 
>>>>>>>>>>>>>>>> core problem
>>>>>>>>>>>>>>>> while keeping complexity and spec changes to an absolute 
>>>>>>>>>>>>>>>> minimum.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Eduard
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Aug 14, 2024 at 5:15 PM Dmitri Bourlatchkov
>>>>>>>>>>>>>>>> <dmitri.bourlatch...@dremio.com.invalid> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Eduard,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> How is this proposal related to the Server Capabilities
>>>>>>>>>>>>>>>>> discussion?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Dmitri.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Aug 14, 2024 at 5:14 AM Eduard Tudenhöfner <
>>>>>>>>>>>>>>>>> etudenhoef...@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hey everyone,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'd like to propose a way for REST servers to communicate
>>>>>>>>>>>>>>>>>> to clients what endpoints it supports via a new
>>>>>>>>>>>>>>>>>> *endpoints* field in the *CatalogConfig* of the
>>>>>>>>>>>>>>>>>> *v1/config* endpoint.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This enables clients to make better decisions and clearly
>>>>>>>>>>>>>>>>>> signal that a particular endpoint isn’t supported.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I opened #10937
>>>>>>>>>>>>>>>>>> <https://github.com/apache/iceberg/issues/10937> to
>>>>>>>>>>>>>>>>>> track the proposal in GH. Please find the proposal doc
>>>>>>>>>>>>>>>>>> here
>>>>>>>>>>>>>>>>>> <https://docs.google.com/document/d/1krcIaLfxtBFDABU5ssLmf64zyHgE8BRncpGPIMTWlxo/edit?usp=sharing>
>>>>>>>>>>>>>>>>>>  (estimated
>>>>>>>>>>>>>>>>>> read time: 5 minutes). The proposal requires a Spec change, 
>>>>>>>>>>>>>>>>>> which can be
>>>>>>>>>>>>>>>>>> seen in #10928
>>>>>>>>>>>>>>>>>> <https://github.com/apache/iceberg/pull/10928>.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Eduard
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Ryan Blue
>>>>>>>> Databricks
>>>>>>>>
>>>>>>>

Reply via email to