Re: [DISCUSS] Describing REST Server capabilities

2024-06-26 Thread Péter Váry
I don't have a very strong opinion in the groups vs. single services debate, but I lean towards grouping, as that makes the result of the service human readable too. I expect that small/incomplete services will be the exception in the long run, and they can highlight the implemented services in the

Re: Iceberg-arrow vectorized read bug

2024-06-26 Thread Amogh Jahagirdar
Hey Steve, Thanks for the clear reproduction test case; I think that's very helpful. I did some debugging locally, and my suspicion is that it's incorrect/unexpected for NullVectorReader to be used for reading the new optional column. I could be wrong, but it seems like we should be allocating a

Iceberg-arrow vectorized read bug

2024-06-26 Thread Lessard, Steve
I have found unexpected behavior in iceberg-arrow’s vectorized read support. After quite a bit of digging and collaboration with Eduard Tudenhoefner we have determined that there is a bug in iceberg-arrow, but we have not been able to determine exactly what the bug is. Can you please help identi

Re: [DISCUSS] Describing REST Server capabilities

2024-06-26 Thread Jack Ye
> evaluate a REST service and if it's good for their use case Feels like you are talking from the perspective of people choosing a vendor product. I believe most vendors will offer near-full capability. But I am coming from an angle of small organizations that are building REST servers just for op

Re: [DISCUSS] Describing REST Server capabilities

2024-06-26 Thread Amogh Jahagirdar
I'm in favor of grouping by tags. The way I look at this, there are 2 primary considerations: 1.) The client/server protocol complexity tradeoffs. On the first consideration, unless I'm missing something the client side becomes significantly more complex; if this has been sketched out earlier i

Re: [Discussion] Apache Iceberg Community Guideline - Initial Version

2024-06-26 Thread Jack Ye
+1 for adding to the site. I am putting it as a doc for now since Google doc is easier to comment (I think?). My plan is to: (1) publish it as a PR after a vote has passed. We can do one more sanity check in the PR, but the information will be exactly as it is presented in the Google doc, maybe a

Re: [Discuss] Geospatial Support

2024-06-26 Thread Szehon Ho
Hi, It was great to meet in person with Snowflake engineers, and we had a good discussion on the paths forward. Meeting notes for Snowflake-Iceberg sync. - Iceberg proposed Geometry type defaults to (edges=planar, crs=CRS84). - Snowflake has two types: Geography (spherical) and Geometry (pl

Re: [DISCUSS] Describing REST Server capabilities

2024-06-26 Thread Jack Ye
It seems like there are 2 sub-topics here: 1. should we group operations with tags, or should we do this per-operation/endpoint? 2. how should we do the capability/versioning for each unit (either per tag or per operation) Shall we first conclude on 1? For 1, my take is that we will need to do it

Re: [Discussion] Apache Iceberg Community Guideline - Initial Version

2024-06-26 Thread Ryan Blue
+1 for adding this to the site once we agree on the changes. One thing that has been raised several times but hasn't yet been addressed is how we want to tackle this. Many of us have asked to review the additional bylaws individually and discuss the purpose and merits of each one. It's great to ha

Re: [DISCUSS] Describing REST Server capabilities

2024-06-26 Thread Daniel Weeks
I think Robert's approach is a reasonable compromise here. If we wanted a "per operation/endpoint" versioning, I think I'd prefer Micah's OpenAPI spec based approach because it's more standardized, but I feel it adds a lot of client complexity. -Dan On Wed, Jun 26, 2024 at 6:59 AM Robert Stupp w

Re: Iceberg Catalog Syncs Invite

2024-06-26 Thread Jack Ye
Oh, I thought I had done that; I probably missed some. Let me check each one and add the missing GitHub proposals. And yes, I will do the recording and also publish updates to devlist. Any major design decisions will be voted on devlist; the discussion meetings will not be used to pass any decisions. -Jack

Re: [Discussion] Apache Iceberg Community Guideline - Initial Version

2024-06-26 Thread Micah Kornfield
Hi Jack, I think it would make sense to convert this to a PR, so it can be version tracked in the future (and that way it avoids another review if the intent is to transition to GitHub)? Thanks, Micah On Tue, Jun 25, 2024 at 9:07 AM Jack Ye wrote: > Hi everyone, > > Thanks for the feedback in th

Re: Iceberg Catalog Syncs Invite

2024-06-26 Thread Daniel Weeks
Thanks Jack, Are we set up to have a recording for this? I think it's important that people who cannot attend due to timezone or conflicts can still keep up to speed with the discussion. I don't think a summary is sufficient for full context, but would be great to include with a linked recording

Re: [DISCUSS] Describing REST Server capabilities

2024-06-26 Thread Robert Stupp
(I think, compatibility deserves a separate thread - it's a "huge" topic) Based on experience, we decided on the following with Nessie: * Unknown fields/attributes in a structure _DO_ cause (de)serialization failures. * "Stable API versions" - endpoint additions and/or added query parame
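The first rule quoted above (unknown fields cause deserialization failures) can be illustrated with a minimal Python sketch. This is not Nessie's or Iceberg's actual code: the `TableRef` structure and the `strict_from_dict` helper are hypothetical names introduced purely for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TableRef:
    """Hypothetical payload structure, used only for this illustration."""
    name: str
    namespace: str


def strict_from_dict(cls, payload: dict):
    """Build `cls` from `payload`, rejecting any field the class does not declare."""
    known = {f.name for f in fields(cls)}
    unknown = set(payload) - known
    if unknown:
        # Strict mode: an unknown attribute is an error, not something to ignore.
        raise ValueError(f"unknown fields for {cls.__name__}: {sorted(unknown)}")
    return cls(**payload)
```

Under this strict policy, a client built against an older schema fails loudly when a server adds a field, rather than silently dropping data it does not understand.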

Re: [Discussion] Apache Iceberg Guidelines for Committership and PMC Membership

2024-06-26 Thread Robert Stupp
Thanks for drafting the doc! Having guidelines is a good thing; however, I feel that it's too "coding oriented". Not every "good contributor" writes code; some organize events, talk about Iceberg, etc. And vice versa, not every good coder likes to speak publicly. This applies to both committers

Re: Iceberg Catalog Syncs Invite

2024-06-26 Thread Jean-Baptiste Onofré
Hi Jack, Thanks! Just a reminder: we have to provide regular updates on the dev mailing list (as we say at Apache: if it doesn't happen on the mailing list, it never happened). Regards JB On Wed, Jun 26, 2024 at 7:11 AM Jack Ye wrote: > > Hi everyone, > > Sorry for the wait, here is the invite

Re: [Discussion] Apache Iceberg Guidelines for Committership and PMC Membership

2024-06-26 Thread Jean-Baptiste Onofré
Hi, Thanks for proposing this. I left some comments. Regards JB On Tue, Jun 25, 2024 at 8:10 PM Jack Ye wrote: > > Hi everyone, > > Here is a draft proposal for the guidelines for committership and PMC > membership: > > https://docs.google.com/document/d/1ka0F9Cn0QeL3IJbds3aGyz3XLnzlS5khoY5B8y

Re: [Discussion] Apache Iceberg Guidelines for Committership and PMC Membership

2024-06-26 Thread Ajantha Bhat
Thanks for working on it. Added my comments in the doc. I think we should also mention how each metric is computed (I have added the github filters). Once this is finalized, maybe after every release (quarter) it is good to go through existing contributors and check whether anyone matches the crit

Re: [Discussion] Apache Iceberg Guidelines for Committership and PMC Membership

2024-06-26 Thread Eduard Tudenhöfner
Thanks Jack for adding this doc. I just had one minor comment about the duration of "sustained" contributions but overall the doc LGTM Eduard On Tue, Jun 25, 2024 at 8:11 PM Jack Ye wrote: > Hi everyone, > > Here is a draft proposal for the guidelines for committership and PMC > membership: >