Re: [DISCUSS] Iceberg Summit proposal

2024-02-21 Thread Manu Zhang
I think the event time is odd for people in Asia to attend. > We can set up dedicated slack channels to collect questions beforehand and continue discussion afterwards. On Thu, Feb 22, 2024 at 6:43 AM Ryan Blue wrote: > Thanks, Jacob! I updated the doc to fix this. I think we are aiming for > Tu

Re: Materialized view integration with REST spec

2024-02-21 Thread Walaa Eldin Moustafa
Thanks Jack! I feel Question 0 is very broad, essentially capturing the whole design. Can we start by discussing more granular questions? On Wed, Feb 21, 2024 at 8:53 PM Jack Ye wrote: > Thanks everyone for the help in organizing the thoughts! > > I have moved the summary of everyone's comments

Re: Materialized view integration with REST spec

2024-02-21 Thread Jack Ye
Thanks everyone for the help in organizing the thoughts! I have moved the summary of everyone's comments here also to the doc that Jan linked under question 0. We can continue to have more discussions there and cast votes! Best, Jack Ye On Wed, Feb 21, 2024 at 12:14 PM Jan Kaul wrote: > Thanks

Re: [VOTE] Release Apache Iceberg 1.5.0 RC3

2024-02-21 Thread Renjie Liu
+1 (non-binding) - verified signature and checksum - verified RAT license check - verified build/tests passing with JDK11 On Thu, Feb 22, 2024 at 7:39 AM Drew wrote: > +1 (non-binding) > > - verified signature and checksum > - verified RAT license check > - verified build/tests passing with JDK

Re: Proposal for RESTful Data Operations

2024-02-21 Thread Ryan Blue
Okay, so it sounds like the motivation is to improve the story around CDC. That’s a good area to work on, but I don’t see how extending the REST protocol like this would make an impact on that problem. In addition, I’m not following your rationale for a few things, so we should probably take a look

Re: [DISCUSS] spec: remove the file scan task JSON serialization section from table spec

2024-02-21 Thread Jack Ye
I see. I was asking for the devlist discussion history, because this is related to our proposal discussion. I think we should establish some rules like "no change should be added to any spec without devlist discussions", and then we can use this rule to justify the removal of this spec change that

Re: Proposal for RESTful Data Operations

2024-02-21 Thread Jack Ye
Thanks for the response Ryan! > The solution to the problem above is to add more to the API — maybe have a single endpoint that can delete and append files in a single commit. But then pushing this to the server requires that we also support validations to ensure the swap is valid when there are r

Re: [DISCUSS] spec: remove the file scan task JSON serialization section from table spec

2024-02-21 Thread Steven Wu
Here is the PR for the spec update: https://github.com/apache/iceberg/pull/9771 > Were there any prior discussions on devlist for adding it to the spec? Jack, there was no separate discussion on adding it to the spec. It was a mistake on my part. It was added in the PR from 8 months ago as linked. [2]

Re: [VOTE] Release Apache Iceberg 1.5.0 RC3

2024-02-21 Thread Drew
+1 (non-binding) - verified signature and checksum - verified RAT license check - verified build/tests passing with JDK17 - ran manual tests with GlueCatalog on Spark 3.5 Drew On Wed, Feb 21, 2024 at 9:33 AM Ajantha Bhat wrote: > Hi Everyone, > > I propose that we release the following RC

Re: [DISCUSS] Iceberg Summit proposal

2024-02-21 Thread Ryan Blue
Thanks, Jacob! I updated the doc to fix this. I think we are aiming for Tuesday and Wednesday. On Wed, Feb 21, 2024 at 11:59 AM Jacob Marble wrote: > The document contains conflicting dates. May 13th & 14th, or May 14th & > 15th? > > On Tue, Feb 20, 2024 at 10:23 PM Ajantha Bhat > wrote: > >> T

Re: Table Schema History Pruning

2024-02-21 Thread Ben Dilday (BLOOMBERG/ 120 PARK)
Hey Everyone, We put in a feature request / proposal on this topic a few days ago, with the idea of storing the schemas in files that are external to metadata.json https://github.com/apache/iceberg/issues/9734 - would be really interested in getting some feedback on it and seeing if folks think

Re: [DISCUSS] spec: remove the file scan task JSON serialization section from table spec

2024-02-21 Thread Ryan Blue
I think I would probably remove it from the spec with a note and a pointer to the class that implements it. Right now we don't have anyone that I'm aware of relying on this serialization format across engines so it isn't a format-level contract. Though we should note that Flink relies on the forma

Re: [DISCUSS] spec: remove the file scan task JSON serialization section from table spec

2024-02-21 Thread Jack Ye
Were there any prior discussions on devlist for adding it to the spec? Could you help link those conversations? Thanks, Jack Ye On Wed, Feb 21, 2024 at 1:05 PM Steven Wu wrote: > > In the recent PR review [1], Ryan and emkornfield have raised a question > why file scan task JSON serialization was

[DISCUSS] spec: remove the file scan task JSON serialization section from table spec

2024-02-21 Thread Steven Wu
In the recent PR review [1], Ryan and emkornfield have raised a question why file scan task JSON serialization was added to the table spec [2]. We seem to have a consensus that it *shouldn't* have been added to the table spec. Now the question is what's the process of removing an invalid section f

Re: Materialized view integration with REST spec

2024-02-21 Thread Jan Kaul
Thanks Micah, I think the voting chips are great. @Szehon, actually what I had in mind was not to have one thread per question but rather have smaller threads that can be resolved more easily. I have the fear that one thread for the current question would lead to a very long and unmanageable d

Re: [DISCUSS] Iceberg Summit proposal

2024-02-21 Thread Jacob Marble
The document contains conflicting dates. May 13th & 14th, or May 14th & 15th? On Tue, Feb 20, 2024 at 10:23 PM Ajantha Bhat wrote: > Thanks for the proposal. > Looking forward to the first official Iceberg summit. > > I think the event time is odd for people in Asia to attend. > Suggestions are

Re: Proposal for RESTful Data Operations

2024-02-21 Thread Ryan Blue
Thanks for pushing this forward, Drew and Jack! Jack just asked “how would such endpoints work with multi-table transactions?” — that demonstrates a big concern that I have about adding remove or delete file append endpoints. I don’t think that those endpoints can or should be used for transaction

Re: Materialized view integration with REST spec

2024-02-21 Thread Micah Kornfield
> > Of course we also need threads that express our preferences (voting). I > would suggest to keep these separate from discussions about single points > so that they can be persisted in the document. Not sure if it is helpful, but I added voting chips to Question 0, as maybe an easier way to keep trac

Re: Materialized view integration with REST spec

2024-02-21 Thread Szehon Ho
Thanks Jan. +1 on having just one thread per question for vote/preference. Where do you suggest we have it, on the discussion question itself? It would be to keep the existing threads and move it there. Also, I think it makes sense with making a Slack channel (for quick question, reply), and a

[VOTE] Release Apache Iceberg 1.5.0 RC3

2024-02-21 Thread Ajantha Bhat
Hi Everyone, I propose that we release the following RC as the official Apache Iceberg 1.5.0 release. The commit ID is 0c8703078443a3c73a5aa5a6bd1cf904e0b5ce09 * This corresponds to the tag: apache-iceberg-1.5.0-rc3 * https://github.com/apache/iceberg/commits/apache-iceberg-1.5.0-rc3 * https://gi

Re: Materialized view integration with REST spec

2024-02-21 Thread Jan Kaul
Thank you Jack for driving the consensus for the MV spec and thank you all for the discussion. I really like the idea about incremental consensus because we often lose sight in detailed discussions. As Jack mentioned, the highest priority question currently is: *Should the Iceberg MV be reali