Hi,

I still have concerns regarding the missing table-spec/view-spec capabilities. Newer clients can send create/update requests with requirements/updates of newer Iceberg table/view/udf specs to a server that doesn't support those spec versions - the outcome is rather undefined. What should a server do? Ignore the unknown fields and requirement/update types and hence do what it's potentially _not_ supposed to do? Reply with a then ambiguous 501 (is it the endpoint that's not implemented or the request content not supported)? Similar, what if a server decides to not support for example table-spec v1 and just drop the manifest-file list in a table snapshot leading to data loss?

IMO capabilities must contain the table/view/... spec versions supported by the server.

There's also the concern about the behavior if the `capabilties` field is missing (see https://github.com/apache/iceberg/pull/9940/files#r1676113409, not sure why the comment thread's resolved). The PR currently says: "tables -> default capability in case the `capabilities` property doesn't exist or is empty in the response" - meaning: the server would _only_ support tables. This phrase in the spec proposal effectively removes the view functionality from all currently existing Iceberg REST implementations.


On 11.07.24 08:42, Eduard Tudenhöfner wrote:
Are there any other concerns with the proposal or should we start a VOTE thread?

Eduard

On Wed, Jul 10, 2024 at 5:20 PM Dmitri Bourlatchkov <dmitri.bourlatch...@dremio.com.invalid> wrote:

            Re: remote signing, I agree that it does not look like a
            server capability that a client can / should discover. It
            is more like something that the server instructs /
            configures the client to do.

        While a server can control this behavior and instruct the
        client to use remote signing, technically nothing is
        preventing a client from configuring
        s3.remote-signing-enabled=true. In such a case it seems more
        appropriate to indicate that this capability isn't supported
        rather than a generic 501, because not every server will
        support remote signing.


    Good point regarding clients taking initiative and using request
    singing without an explicit server-provided config. It moves the
    client operations into a mode where the server has more control
    (over having longer term client-side credentials), so it looks
    like a reasonable mode to support from the security perspective.

    Let's keep that capability flag.

    Cheers,
    Dmitri.

    On Wed, Jul 10, 2024 at 5:48 AM Eduard Tudenhöfner
    <etudenhoef...@apache.org> wrote:

        Hey everyone,

        I've added a few inline comments below.

            Re: remote signing, I agree that it does not look like a
            server capability that a client can / should discover. It
            is more like something that the server instructs /
            configures the client to do.

        While a server can control this behavior and instruct the
        client to use remote signing, technically nothing is
        preventing a client from configuring
        s3.remote-signing-enabled=true. In such a case it seems more
        appropriate to indicate that this capability isn't supported
        rather than a generic 501, because not every server will
        support remote signing.

        The *vended-credentials* capability on the other hand is more
        informative in its nature and a server indeed configures a
        client. I think that was also one of the reasons I removed
        this capability but added it later back due to a comment from
        Jack.

        I'm ok either way in terms of removing / keeping
        *vended-credentials* as a capability but given that we'd want
        to include *actionable* capabilities at this point, I'd just
        remove it (nothing is preventing us from adding it later if
        necessary).


            In that case, why do we need all these other capabilities
            like tables, remote-signing, etc. in the first place?


        Given that capabilities also carry versioning information,
        clients can make more informed decisions on which endpoints to
        call. One could argue that generally throwing a 501 on
        everything that isn't supported might be sufficient, but that
        doesn't necessarily help a client in knowing which versions of
        a capability are safe to call/use.

        Regarding the control of client-side fallback behavior:
        I think the default fallback behavior should be *tables* (with
        version 1) with a property in the REST catalog that allows
        configuring this to e.g.
        *rest-default-capabilities=tables,views,abc,xyz* (all of them
        defaulting to version 1).


        Eduard


        On Tue, Jul 9, 2024 at 7:00 PM Jack Ye <yezhao...@gmail.com>
        wrote:

            Yes I agree that sounds like a valid use case. So the
            criteria so far is that capabilities are used for:
            - controlling client-side fallback behavior
            - failing expensive operations early if we know it will
            eventually fail due to missing capability

            Do we agree if this is the criteria we should use? What
            about the other capabilities, namly tables,
            remote-signing, credential-vending?

            -Jack


            On Tue, Jul 9, 2024 at 9:27 AM Ryan Blue
            <b...@databricks.com.invalid> wrote:

                > does it make a difference if I declare the
                capability or not?

                I think that it does in other cases. Multi-table
                commits, for example, are a building block for
                multi-statement transactions. If a service doesn't
                support multi-table commits then we ideally want
                clients to know that ahead of time so that they don't
                run a big transaction and then fail because the commit
                is not supported.

                On Tue, Jul 9, 2024 at 9:12 AM Dmitri Bourlatchkov
                <dmitri.bourlatch...@dremio.com.invalid> wrote:

                    Re: remote signing, I agree that it does not look
                    like a server capability that a client can /
                    should discover. It is more like something that
                    the server instructs / configures the client to do.

                    Cheers,
                    Dmitri.

                    On Tue, Jul 9, 2024 at 12:05 PM Jack Ye
                    <yezhao...@gmail.com> wrote:

                        I was reconciling the discussion yesterday,
                        one point that was interesting to me was that
                        we agreed the purpose of these capabilities is
                        to "control client-side fallback behavior", or
                        at least the client should behave differently
                        based on these capabilities. However, this
                        seems to be only needed so far for views, or
                        more specifically, for loadView API only
                        because it impacts the fallback behavior to
                        resolve the identifier as a table or not.

                        For all the other capabilities listed, and
                        even the other endpoints in view, because a
                        server can decide to implement it partially
                        anyway and just document the behavior, does it
                        make a difference if I declare the capability
                        or not? The client will not stop the request,
                        the server will just error out if it is not
                        supported. Maybe the error is not in the
                        expected code or message, but it is still an
                        error. In that case, why do we need all these
                        other capabilities like tables,
                        remote-signing, etc. in the first place?

                        Maybe it is too extreme of a thought, but
                        could anyone help describe how the other
                        capabilities could be used beyond potentially
                        returning an error earlier?

                        -Jack




                        On Tue, Jul 9, 2024 at 8:02 AM Dmitri
                        Bourlatchkov
                        <dmitri.bourlatch...@dremio.com.invalid> wrote:

                            Hi Eduard,

                            > I've also added the 501 error to the
                            response of the respective endpoints but
                            worth mentioning that *HEAD* / *GET
                            *requests must not return a 501
                            
<https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/501> (this
                            implies that the server impl would e.g.
                            return a *404* in such a case).

                            My reading on the Mozilla page makes me
                            think that it is phrased too narrowly.
                            Reading RFC 2616 [1] I believe that it
                            does not preclude responding with 501 to
                            GET and HEAD requests. I think it means
                            that GET and HEAD methods must be
                            supported by "general purpose" servers.
                            The Iceberg REST server is not a general
                            purpose server for resources. So, I think
                            it should be fine to respond with 501 to
                            unimplemented endpoints.

                            Cheers,
                            Dmitri.

                            [1]
                            https://www.rfc-editor.org/rfc/rfc2616#section-5.1.1

                            On Tue, Jul 9, 2024 at 9:44 AM Eduard
                            Tudenhöfner <etudenhoef...@apache.org> wrote:

                                Hey everyone,

                                I watched the catalog sync recording
                                today and updated the PR
                                <https://github.com/apache/iceberg/pull/9940>
                                to remove fine-grained capabilities
                                like *register-table / table-metrics*.

                                The current capabilities (with
                                versioning information) in the PR are:

                                  * tables
                                  * views
                                  * remote-signing
                                  * vended-credentials
                                  * multi-table-commit

                                For servers that only
                                *partially* implement endpoints under
                                a capability the spec requires the
                                server to throw a *501 Not
                                Implemented*. I've also added the 501
                                error to the response of the
                                respective endpoints but worth
                                mentioning that *HEAD* / *GET
                                *requests must not return a 501
                                
<https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/501> (this
                                implies that the server impl would
                                e.g. return a *404* in such a case).


                                Regards
                                Eduard

                                On Thu, Jul 4, 2024 at 3:59 PM
                                Jean-Baptiste Onofré <j...@nanthrax.net>
                                wrote:

                                    Hi Eduard,

                                    It makes sense to return 501 for
                                    servers which don't implement all
                                    endpoints. It means that the
                                    server will at least have to implement
                                    empty endpoints if needed (that
                                    makes sense to me).

                                    I think we should focus on only
                                    "identified capabilities". I think
                                    that I proposed before that the
                                    capabilities can be
                                    overridden/provided by server
                                    implementation. Else, I'm afraid we
                                    won't be flexible enough or always
                                    behind the implementation (if an
                                    implementation wants to add
                                    "my-foo-cap").

                                    Regards
                                    JB

                                    On Thu, Jul 4, 2024 at 9:32 AM
                                    Eduard Tudenhöfner
                                    <etudenhoef...@apache.org> wrote:
                                    >
                                    > I have clarified the wording in
                                    #9940 around the requirement on
                                    having to implement all endpoints
                                    under a particular capability.
                                    >
                                    > For servers that only partially
                                    implement endpoints under a
                                    capability the spec requires the
                                    server to throw a 501 Not
                                    Implemented. This was suggested by
                                    Jack and it seems reasonable to do
                                    that.
                                    >
                                    > Regarding the inclusion of
                                    table-spec / view-spec as a
                                    capability: I think this might
                                    make sense for the next iteration
                                    of the REST spec but as I
                                    mentioned earlier I don't see any
                                    clear benefit for the current REST
                                    spec as the client wouldn't do
                                    anything with that information.
                                    > If there is a clear benefit of
                                    having this, then this can still
                                    be added later to the current REST
                                    spec but I believe we should
                                    rather have a few well-defined and
                                    actionable capabilities rather
                                    than too many.
                                    >
                                    > Eduard
                                    >
                                    > On Wed, Jul 3, 2024 at 5:44 AM
                                    Renjie Liu
                                    <liurenjie2...@gmail.com> wrote:
                                    >>>
                                    >>> Spec is an interesting topic
                                    we did not discuss. Robert, how do
                                    you envision this to be used?
                                    >>> In my mind, if a new table
                                    format v3 is launched, there are 2
                                    approaches we can go with, taking
                                    CreateTable as an example:
                                    >>> (1) increment the related
                                    operation version, which means
                                    that POST
                                    /v2/{prefix}/namespaces/{ns}/tables
                                    will be created and allow creating
                                    tables in the v3 version.
                                    >>> (2) update the existing table
                                    metadata model to support both v2
                                    and v3 fields, and the server
                                    enforces the payload differently
                                    based on the
                                    TableMetadata.format-version
                                    field. If the server does not
                                    support v3, it can return
                                    unsupported at that time.
                                    >>> Either way we go, the
                                    table-spec version does not need
                                    to be a capability. (1) seems to
                                    be cleaner, but has some overhead
                                    in provisioning a new endpoint
                                    compared to (2).
                                    >>> Do you see another way to do
                                    this leveraging the table-spec
                                    version?
                                    >>
                                    >>
                                    >> 2 is cleaner but maybe
                                    inconsistent with current
                                    behavior, since /v1/tables
                                    operation supports both v1 and v3.
                                    We should only go to 2 only when
                                    we have incompatible fields/break
                                    changes according to discussion.
                                    >>
                                    >> Generally I agree with adding
                                    table-spec into capabilities. For
                                    example, we can expose this to
                                    user in api so that user could
                                    choose a supported table format
                                    version without throwing exception.
                                    >>
                                    >> On Wed, Jul 3, 2024 at 12:18 AM
                                    Jack Ye <yezhao...@gmail.com> wrote:
                                    >>>
                                    >>> Spec is an interesting topic
                                    we did not discuss. Robert, how do
                                    you envision this to be used?
                                    >>>
                                    >>> In my mind, if a new table
                                    format v3 is launched, there are 2
                                    approaches we can go with, taking
                                    CreateTable as an example:
                                    >>>
                                    >>> (1) increment the related
                                    operation version, which means
                                    that POST
                                    /v2/{prefix}/namespaces/{ns}/tables
                                    will be created and allow creating
                                    tables in the v3 version.
                                    >>>
                                    >>> (2) update the existing table
                                    metadata model to support both v2
                                    and v3 fields, and the server
                                    enforces the payload differently
                                    based on the
                                    TableMetadata.format-version
                                    field. If the server does not
                                    support v3, it can return
                                    unsupported at that time.
                                    >>>
                                    >>> Either way we go, the
                                    table-spec version does not need
                                    to be a capability. (1) seems to
                                    be cleaner, but has some overhead
                                    in provisioning a new endpoint
                                    compared to (2).
                                    >>>
                                    >>> Do you see another way to do
                                    this leveraging the table-spec
                                    version?
                                    >>>
                                    >>> -Jack
                                    >>>
                                    >>>
                                    >>>
                                    >>>
                                    >>>
                                    >>> On Tue, Jul 2, 2024 at 6:03 AM
                                    Eduard Tudenhöfner
                                    <eduard.tudenhoef...@databricks.com.invalid>
                                    wrote:
                                    >>>>
                                    >>>>
                                    >>>> I couldn't make it to the
                                    catalog sync meeting yesterday but
                                    I watched the recording today
                                    (thanks for providing that).
                                    >>>>
                                    >>>>> The missing piece is how
                                    (new, capabilities-aware) clients
                                    handle the case when a service
                                    does _not_ return the capabilities
                                    field (absent). My proposal would
                                    be that a client should in this
                                    case assume that all _currently_
                                    existing capabilities are supported.
                                    >>>>>
                                    >>>>> - tables: [1]
                                    >>>>> - views: [1]
                                    >>>>> - remote-signing: [1]
                                    >>>>> - multi-table-commit: [1]
                                    >>>>> - register-table: [1]
                                    >>>>> - table-metrics: [1]
                                    >>>>> - table-spec: [1,2]
                                    >>>>> - view-spec: [1,2]
                                    >>>>>
                                    >>>>>
                                    >>>> The one thing I would like to
                                    add here is that the current PR
                                    uses the tables capability (as
                                    version 1) as the default when a
                                    server doesn't return capabilities
                                    but it might be also ok to include
                                    views (as version 1) because the
                                    current client impl has some code
                                    to deal with errors in case
                                    endpoints don't exist.
                                    >>>>
                                    >>>> Unless we agree that the
                                    currently existing functionality
                                    in the REST spec is the default
                                    behavior to be assumed for older
                                    server, I'm not sure about
                                    including remote-signing /
                                    multi-table-commit /
                                    register-table / table-metrics as
                                    it has been indicated in earlier
                                    comments on the PR/ML that not
                                    every REST server supports these.
                                    >>>>
                                    >>>> That being said, we should
                                    discuss whether we want the
                                    default behavior (when an older
                                    server doesn't send back
                                    capabilities) to be
                                    >>>> a) tables (version 1) only
                                    >>>> b) the currently existing
                                    functionality as defined in the
                                    REST spec (as version 1)
                                    >>>>
                                    >>>>
                                    >>>> On another note: Including
                                    table-spec / view-spec seems to be
                                    more informative in its nature as
                                    I don't think a client would act
                                    differently right now when seeing
                                    these.
                                    >>>>
                                    >>>> Thanks
                                    >>>> Eduard
                                    >>>>



-- Ryan Blue
                Databricks

--
Robert Stupp
@snazy

Reply via email to