Re: remote signing, I agree that it does not look like a server capability that a client can / should discover. It is more like something that the server instructs / configures the client to do.
Cheers, Dmitri. On Tue, Jul 9, 2024 at 12:05 PM Jack Ye <yezhao...@gmail.com> wrote: > I was reconciling the discussion yesterday, one point that was interesting > to me was that we agreed the purpose of these capabilities is to "control > client-side fallback behavior", or at least the client should behave > differently based on these capabilities. However, this seems to be only > needed so far for views, or more specifically, for loadView API only > because it impacts the fallback behavior to resolve the identifier as a > table or not. > > For all the other capabilities listed, and even the other endpoints in > view, because a server can decide to implement it partially anyway and just > document the behavior, does it make a difference if I declare the > capability or not? The client will not stop the request, the server will > just error out if it is not supported. Maybe the error is not in the > expected code or message, but it is still an error. In that case, why do we > need all these other capabilities like tables, remote-signing, etc. in the > first place? > > Maybe it is too extreme of a thought, but could anyone help describe how > the other capabilities could be used beyond potentially returning an error > earlier? > > -Jack > > > > > On Tue, Jul 9, 2024 at 8:02 AM Dmitri Bourlatchkov > <dmitri.bourlatch...@dremio.com.invalid> wrote: > >> Hi Eduard, >> >> > I've also added the 501 error to the response of the respective >> endpoints but worth mentioning that *HEAD* / *GET *requests must not >> return a 501 >> <https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/501> (this >> implies that the server impl would e.g. return a *404* in such a case). >> >> My reading on the Mozilla page makes me think that it is phrased too >> narrowly. Reading RFC 2616 [1] I believe that it does not preclude >> responding with 501 to GET and HEAD requests. I think it means that GET and >> HEAD methods must be supported by "general purpose" servers. The Iceberg >> REST server is not a general purpose server for resources. So, I think it >> should be fine to respond with 501 to unimplemented endpoints. >> >> Cheers, >> Dmitri. >> >> [1] https://www.rfc-editor.org/rfc/rfc2616#section-5.1.1 >> >> On Tue, Jul 9, 2024 at 9:44 AM Eduard Tudenhöfner < >> etudenhoef...@apache.org> wrote: >> >>> Hey everyone, >>> >>> I watched the catalog sync recording today and updated the PR >>> <https://github.com/apache/iceberg/pull/9940> to remove fine-grained >>> capabilities like *register-table / table-metrics*. >>> >>> The current capabilities (with versioning information) in the PR are: >>> >>> - tables >>> - views >>> - remote-signing >>> - vended-credentials >>> - multi-table-commit >>> >>> For servers that only *partially* implement endpoints under a >>> capability the spec requires the server to throw a *501 Not Implemented*. >>> I've also added the 501 error to the response of the respective endpoints >>> but worth mentioning that *HEAD* / *GET *requests must not return a 501 >>> <https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/501> (this >>> implies that the server impl would e.g. return a *404* in such a case). >>> >>> >>> Regards >>> Eduard >>> >>> On Thu, Jul 4, 2024 at 3:59 PM Jean-Baptiste Onofré <j...@nanthrax.net> >>> wrote: >>> >>>> Hi Eduard, >>>> >>>> It makes sense to return 501 for servers which don't implement all >>>> endpoints. It means that the server will at least have to implement >>>> empty endpoints if needed (that makes sense to me). >>>> >>>> I think we should focus on only "identified capabilities". I think >>>> that I proposed before that the capabilities can be >>>> overridden/provided by server implementation. Else, I'm afraid we >>>> won't be flexible enough or always behind the implementation (if an >>>> implementation wants to add "my-foo-cap"). >>>> >>>> Regards >>>> JB >>>> >>>> On Thu, Jul 4, 2024 at 9:32 AM Eduard Tudenhöfner >>>> <etudenhoef...@apache.org> wrote: >>>> > >>>> > I have clarified the wording in #9940 around the requirement on >>>> having to implement all endpoints under a particular capability. >>>> > >>>> > For servers that only partially implement endpoints under a >>>> capability the spec requires the server to throw a 501 Not Implemented. >>>> This was suggested by Jack and it seems reasonable to do that. >>>> > >>>> > Regarding the inclusion of table-spec / view-spec as a capability: I >>>> think this might make sense for the next iteration of the REST spec but as >>>> I mentioned earlier I don't see any clear benefit for the current REST spec >>>> as the client wouldn't do anything with that information. >>>> > If there is a clear benefit of having this, then this can still be >>>> added later to the current REST spec but I believe we should rather have a >>>> few well-defined and actionable capabilities rather than too many. >>>> > >>>> > Eduard >>>> > >>>> > On Wed, Jul 3, 2024 at 5:44 AM Renjie Liu <liurenjie2...@gmail.com> >>>> wrote: >>>> >>> >>>> >>> Spec is an interesting topic we did not discuss. Robert, how do you >>>> envision this to be used? >>>> >>> In my mind, if a new table format v3 is launched, there are 2 >>>> approaches we can go with, taking CreateTable as an example: >>>> >>> (1) increment the related operation version, which means that POST >>>> /v2/{prefix}/namespaces/{ns}/tables will be created and allow creating >>>> tables in the v3 version. >>>> >>> (2) update the existing table metadata model to support both v2 and >>>> v3 fields, and the server enforces the payload differently based on the >>>> TableMetadata.format-version field. If the server does not support v3, it >>>> can return unsupported at that time. >>>> >>> Either way we go, the table-spec version does not need to be a >>>> capability. (1) seems to be cleaner, but has some overhead in provisioning >>>> a new endpoint compared to (2). >>>> >>> Do you see another way to do this leveraging the table-spec version? >>>> >> >>>> >> >>>> >> 2 is cleaner but maybe inconsistent with current behavior, since >>>> /v1/tables operation supports both v1 and v3. We should only go to 2 only >>>> when we have incompatible fields/break changes according to discussion. >>>> >> >>>> >> Generally I agree with adding table-spec into capabilities. For >>>> example, we can expose this to user in api so that user could choose a >>>> supported table format version without throwing exception. >>>> >> >>>> >> On Wed, Jul 3, 2024 at 12:18 AM Jack Ye <yezhao...@gmail.com> wrote: >>>> >>> >>>> >>> Spec is an interesting topic we did not discuss. Robert, how do you >>>> envision this to be used? >>>> >>> >>>> >>> In my mind, if a new table format v3 is launched, there are 2 >>>> approaches we can go with, taking CreateTable as an example: >>>> >>> >>>> >>> (1) increment the related operation version, which means that POST >>>> /v2/{prefix}/namespaces/{ns}/tables will be created and allow creating >>>> tables in the v3 version. >>>> >>> >>>> >>> (2) update the existing table metadata model to support both v2 and >>>> v3 fields, and the server enforces the payload differently based on the >>>> TableMetadata.format-version field. If the server does not support v3, it >>>> can return unsupported at that time. >>>> >>> >>>> >>> Either way we go, the table-spec version does not need to be a >>>> capability. (1) seems to be cleaner, but has some overhead in provisioning >>>> a new endpoint compared to (2). >>>> >>> >>>> >>> Do you see another way to do this leveraging the table-spec version? >>>> >>> >>>> >>> -Jack >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> On Tue, Jul 2, 2024 at 6:03 AM Eduard Tudenhöfner >>>> <eduard.tudenhoef...@databricks.com.invalid> wrote: >>>> >>>> >>>> >>>> >>>> >>>> I couldn't make it to the catalog sync meeting yesterday but I >>>> watched the recording today (thanks for providing that). >>>> >>>> >>>> >>>>> The missing piece is how (new, capabilities-aware) clients handle >>>> the case when a service does _not_ return the capabilities field (absent). >>>> My proposal would be that a client should in this case assume that all >>>> _currently_ existing capabilities are supported. >>>> >>>>> >>>> >>>>> - tables: [1] >>>> >>>>> - views: [1] >>>> >>>>> - remote-signing: [1] >>>> >>>>> - multi-table-commit: [1] >>>> >>>>> - register-table: [1] >>>> >>>>> - table-metrics: [1] >>>> >>>>> - table-spec: [1,2] >>>> >>>>> - view-spec: [1,2] >>>> >>>>> >>>> >>>>> >>>> >>>> The one thing I would like to add here is that the current PR uses >>>> the tables capability (as version 1) as the default when a server doesn't >>>> return capabilities but it might be also ok to include views (as version 1) >>>> because the current client impl has some code to deal with errors in case >>>> endpoints don't exist. >>>> >>>> >>>> >>>> Unless we agree that the currently existing functionality in the >>>> REST spec is the default behavior to be assumed for older server, I'm not >>>> sure about including remote-signing / multi-table-commit / register-table / >>>> table-metrics as it has been indicated in earlier comments on the PR/ML >>>> that not every REST server supports these. >>>> >>>> >>>> >>>> That being said, we should discuss whether we want the default >>>> behavior (when an older server doesn't send back capabilities) to be >>>> >>>> a) tables (version 1) only >>>> >>>> b) the currently existing functionality as defined in the REST >>>> spec (as version 1) >>>> >>>> >>>> >>>> >>>> >>>> On another note: Including table-spec / view-spec seems to be more >>>> informative in its nature as I don't think a client would act differently >>>> right now when seeing these. >>>> >>>> >>>> >>>> Thanks >>>> >>>> Eduard >>>> >>>> >>>> >>>