Re: [DISCUSS] Describing REST Server capabilities

2024-06-27 Thread Péter Váry
Ignore my previous email - fat thumbed... Here is the full version: I think most of us agree that the server should announce its exact capabilities, so the clients don't need to guess. The debate is around how granular this definition should be. If we do it on service level, then the client need

Iceberg - PySpark overwrite with a condition

2024-06-27 Thread Ha Cao
Hello, I am experimenting with PySpark's DataFrameWriterV2 overwrite() to an Iceberg table with existing data in a target partition. My goal is that instead of overwriting the

Re: [DISCUSS] Describing REST Server capabilities

2024-06-27 Thread Péter Váry
I think most of us agree that the server should announce its exact capabilities, so the clients don't need to guess. The debate is around how granular this definition should be. If we do it on service level, then the client needs to examine each and every service it is using whether it has the spe

Re: [Discussion] Apache Iceberg Community Guideline - Initial Version

2024-06-27 Thread Jack Ye
To provide an update here, I have consolidated most of the comments in the initial version, with the following changes: (1) condensed the section of roles and responsibilities, with pointers to different pages in ASF and existing Iceberg project pages. (2) clarified voting details, regarding thin

Re: [DISCUSS] Describing REST Server capabilities

2024-06-27 Thread Robert Stupp
IMO that would be a list of "capability" to "set of versions" tuples. The reason to have a "set of (integer) version" is that you have to plan for the future, now. I also think we do need "logical" capabilities to express for example which table/view/etc specs a service supports and to express

Re: [DISCUSS] Describing REST Server capabilities

2024-06-27 Thread Robert Stupp
On 27.06.24 19:05, Micah Kornfield wrote: Maybe it pays to prototype the individual end point approach to demonstrate its relative complexity? The math is pretty simple: you need to duplicate all endpoints, all request/response schema types, all tests, duplicate and/or adopt client code. No

Re: [DISCUSS] Describing REST Server capabilities

2024-06-27 Thread Micah Kornfield
> If it means returning a response with 20-30ish hard-coded entries, and the client is configured based on that, that seems totally reasonable to me.
It reads to me that a lot of the debate is around the complexity of one approach for the other. Maybe it pays to prototype the individual end

Iceberg-arrow vectorized read bug

2024-06-27 Thread Lessard, Steve
I have found unexpected behavior in iceberg-arrow’s vectorized read support. After quite a bit of digging and collaboration with Eduard Tudenhoefner we have determined that there is a bug in iceberg-arrow, but we have not been able to determine exactly what the bug is. Can you please help identi

Re: [Discussion] Apache Iceberg Guidelines for Committership and PMC Membership

2024-06-27 Thread Daniel Weeks
Rich, I largely agree with what you're saying about not making arbitrary rules/guidelines that exclude, but I don't think this document should impose any form of restrictions. I believe this should not be framed as "rules" or even "guidelines" as it should be informative about the considerations a

Re: [DISCUSS] Describing REST Server capabilities

2024-06-27 Thread Jack Ye
I feel Alex is already tapping into the more complex territory I do not want to go into, because as he says, a "capability" is logical, and it can be a set of overlapping endpoints, small features in some endpoints, etc. We already saw that in the original PR we tried to say "pagination" is a capab

Re: [DISCUSS] Describing REST Server capabilities

2024-06-27 Thread Alex Dutra
Hi all, So far we've been thinking of capabilities as equivalent to a set of endpoints. That's a rather technical definition. It also brings one important limitation: one endpoint can only be "governed" by one capability. Granted, most capabilities do require implementing specific endpoints. But

Re: [Discussion] Apache Iceberg Guidelines for Committership and PMC Membership

2024-06-27 Thread Rich Bowen
Once again, please understand that I'm an outsider, and have no vote here, but have a few years of experience with Apache communities, and so have a lot of opinions. Forgive me if I wax philosophical. First of all, framing this document as guidelines, rather than rules, is the right approach. I

RE: Re: Feedback Collection: Bylaws in Iceberg

2024-06-27 Thread Shane Curcuru
Some additional perspectives from someone not on Iceberg, but who's looked into a lot of ASF project communities. On 2024/06/25 17:54:48 Tyler Akidau wrote: ...snip... 1. I like the idea of guidelines on committership and PMC membership, but worry about overspecification limiting who might be c

Re: [DISCUSS] Describing REST Server capabilities

2024-06-27 Thread Jean-Baptiste Onofré
Hi Jack I like Robert's proposal. Back to the topics, I think grouping with tags is more "flexible" (it was what we included in the REST spec proposal as well). Regards JB On Wed, Jun 26, 2024 at 6:26 PM Jack Ye wrote: > > It seems like there are 2 sub-topics here: > 1. should we group operatio

Re: [DISCUSS] Describing REST Server capabilities

2024-06-27 Thread Eduard Tudenhöfner
IMO a capability is a coarse-grained way of describing that a catalog supports X, Y, Z. In order to support X it needs to implement the particular endpoints under X, otherwise it doesn't fully support X. From a user's perspective this makes it easy to understand whether a catalog fully supports e