Re: [Discussion] Versioned SQL UDFs (Catalog routines) in Iceberg

2024-07-15 Thread Ajantha Bhat
Hi, just another reminder since we didn't get any review on the proposal. Initially proposed on June 4. - Ajantha On Mon, Jun 24, 2024 at 4:21 PM Ajantha Bhat wrote: > Hi everyone, > > We've only received one review so far (from Benny). > > We would appreciate more eyes on this. > > - Ajantha >

Re: [VOTE] spec: remove the JSON spec for content file and file scan task sections

2024-07-15 Thread Steven Wu
+1 from me too (binding). I am closing the votes and thanks everyone for the votes. There are 18 YES votes, out of which 6 were binding. And there are no disapproving votes. Hence the proposal is approved and I will merge the PR. On Thu, Jul 11, 2024 at 7:09 PM Honah J. wrote: > +1 (non-bindi

Re: [DISCUSS] Merging specification clarifications

2024-07-15 Thread Micah Kornfield
OK, I started a vote thread for the PRs. Thanks, Micah On Mon, Jul 15, 2024 at 12:44 AM Fokko Driesprong wrote: > Hey Micah, > > Thanks for raising this. I was going over all the open PRs on the table > spec, and I think it would be great to get these in since they provide some > valuable clari

[VOTE] Merge table spec clarifications on time travel and equality deletes

2024-07-15 Thread Micah Kornfield
I'd like to raise a vote on modifying the table specification with clarifications on time travel and equality deletes [1][2]. The PRs have links to prior mailing list discussions where there was apparent consensus that these were the expectations for functionality. Possible votes: [ ] +1 Merge the PRs [

Re: [Early Feedback] Variant and Subcolumnarization Support

2024-07-15 Thread Aihua Xu
Thanks for the discussion. I will move forward with the spec PR. Regarding the implementation, we will have a module for Variant support in Iceberg, so we will not have to bring in Spark libraries. I'm reposting the meeting invite in case it's not clear in my original email since I included i

Re: [DISCUSS] Describing REST Server capabilities

2024-07-15 Thread Dmitri Bourlatchkov
So I would argue to define the current set of APIs and specs as the default > if the `capabilities` field is missing. There have been two sides to this in prior discussions. Having *tables* as the default vs having what's *currently in the spec* as the default. The argument for having *tables* as

Re: [DISCUSS] Describing REST Server capabilities

2024-07-15 Thread Robert Stupp
On 15.07.24 16:10, Eduard Tudenhöfner wrote: Current servers do not send a `capabilities` field at all. You're suggesting to use a new `rest-default-capabilities` property to let newer clients assume `1`.  Once the table/view/etc-spec capabilities are needed, those newer clients

Re: [DISCUSS] Describing REST Server capabilities

2024-07-15 Thread Eduard Tudenhöfner
> > Current servers do not send a `capabilities` field at all. You're > suggesting to use a new `rest-default-capabilities` property to let newer > clients assume `1`. Once the table/view/etc-spec capabilities are needed, > those newer clients would assume table-spec v1. That's wrong IMO. That s

Re: [DISCUSS] Enable the discussion tab for iceberg github repos

2024-07-15 Thread Renjie Liu
Hi: > But one minor concern, since this has to be enabled at the repo level(go, > rust, core), wondering if this can lead to limited participation as against > when discussed in a mailing list. This is a good point, but as we discussed, github discussions are usually used for user related quest

Re: [DISCUSS] Unencoded Variable Length Column Size Statistics

2024-07-15 Thread Ajantha Bhat
Hi Samrose, Thanks for the proposal. +1 from my side as Iceberg should definitely leverage all info provided by Parquet. This can help in query planning (especially as the join and exchange happen with raw data). I have also tagged Micah on the proposal as he worked on the same on the Parquet side.

Re: [DISCUSS] Describing REST Server capabilities

2024-07-15 Thread Robert Stupp
Sorry, I don't understand the two suggestions, especially when used in combination. Current servers do not send a `capabilities` field at all. You're suggesting to use a new `rest-default-capabilities` property to let newer clients assume `1`.  Once the table/view/etc-spec capabilities are need

Re: [VOTE] Release Apache Iceberg 1.6.0 RC0

2024-07-15 Thread Fokko Driesprong
Thanks JB for running the release! +1 (binding) - Checked signatures and checksums - Ran license check - Ran tests - Verified against example notebooks - Ran some tests regarding the split of the uri/oauth2-server-uri Kind regards, F

Re: [DISCUSS] Describing REST Server capabilities

2024-07-15 Thread Eduard Tudenhöfner
I would suggest adding *table-spec / view-spec / udf-spec *capabilities later when new requirements/updates get added. The current implementation wouldn't make any use of these capabilities, so I don't see a good enough reason to add them at this point. The PR currently says: "tables -> default ca

Re: [VOTE] Fix property names in REST spec for statistics / partition statistics

2024-07-15 Thread Eduard Tudenhöfner
The vote passed with *5 binding +1* votes and *10 non-binding +1* votes. Thanks everyone, I'll get the PR merged. On Fri, Jul 12, 2024 at 4:10 AM Honah J. wrote: > +1 (non-binding) > > Best regards, > Honah > > On Wed, Jul 10, 2024 at 1:19 PM Péter Váry > wrote: > >> +1 (non-binding - at least

Re: [DISCUSS] Describing REST Server capabilities

2024-07-15 Thread Robert Stupp
Hi, I still have concerns regarding the missing table-spec/view-spec capabilities. Newer clients can send create/update requests with requirements/updates of newer Iceberg table/view/udf specs to a server that doesn't support those spec versions - the outcome is rather undefined. What should

[DISCUSS] Unencoded Variable Length Column Size Statistics

2024-07-15 Thread Samrose Ahmed
Hello, I have added a proposal to be able to optionally track uncompressed unencoded column size statistics for variable length columns. Currently, it isn't possible to estimate memory size of variable length columns as `columnSizes` only contains compressed sizes. I've created an issue (https://

Re: [VOTE] Release Apache Iceberg 1.6.0 RC0

2024-07-15 Thread Ajantha Bhat
+1 (non-binding) * validated checksum and signature * checked license docs & ran RAT checks * ran build and tests with JDK11 * verified CI against Trino by bumping Iceberg version (https://github.com/trinodb/trino/pull/22667) * verified Nessie REST catalog with Iceberg 1.6.0 - Ajantha On Sat, J

Re: [DISCUSS] Merging specification clarifications

2024-07-15 Thread Fokko Driesprong
Hey Micah, Thanks for raising this. I was going over all the open PRs on the table spec, and I think it would be great to get these in since they provide some valuable clarification. I think a VOTE is the most straightforward way to get it in, you can find an example here