Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-28 Thread Jan Kaul
d in queries (again early vs late). For example, views and tables in queries can withstand default catalog renames, but tables cannot when they are used inside views -- it even applies to views inside views, which makes this very ha

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-25 Thread Jan Kaul
example, if we want to validate that the tables referenced in the view exist, how can we do that when default-catalog isn't defined, since the view hasn't been created or loaded yet? Thanks, Walaa. On Thu, Apr 24, 2025 at 7:02 AM Jan Kaul wrote: Yes

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-24 Thread Jan Kaul
ate binding behavior (similar to the proposal), as opposed to using some catalog that "stores" the view definition. Thanks, Walaa On Tue, Apr 22, 2025 at 11:01 AM Jan Kaul wrote: Hi Walaa, Thanks for clarifying the aspects of non-determinism. Let me try to address y

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-22 Thread Jan Kaul
ion's default catalog (i.e., without specifying a catalog) in the current Iceberg spec? These questions are important because if we can’t unambiguously recover the "view catalog" from metadata, then defaulting to it is problematic. And if views can't be created in the default

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-22 Thread Jan Kaul
Hi Walaa, thank you for your proposal. If I understood correctly, your proposal is composed of three parts: - session default catalog as fallback for "default-catalog" - session default namespace as fallback for "default-namespace" - Late binding + UUID validation I have some comments regardi

Re: [DISCUSS] Proposal to buffer manifest files before updating manifest-list

2024-11-22 Thread Jan Kaul
to think a bit more about it but above are my concerns so far. Kind regards, Fokko On Fri, 22 Nov 2024 at 15:26, Jan Kaul wrote: Hi all, I'd like to propose an optimization for how we track manifest files in Iceberg tables, specifically focusing on r

Re: [DISCUSS] Deprecate embedded manifests

2024-11-22 Thread Jan Kaul
Hi all, I've been thinking about how we could make Iceberg tables more performant for streaming inserts. And I thought about using the manifests field as a buffer for manifest files before they are written to the manifest-list. This reduces the write amplification and simplifies the conflict

[DISCUSS] Proposal to buffer manifest files before updating manifest-list

2024-11-22 Thread Jan Kaul
Hi all, I'd like to propose an optimization for how we track manifest files in Iceberg tables, specifically focusing on reducing write amplification and simplifying conflict resolution during fast-append operations. Background: Replace vs. Change-Based Updates To frame this proposal,

Re: [DISCUSS] Iceberg Materialzied Views

2024-09-20 Thread Jan Kaul
"PartialIdentifier" without the catalog name sounds good to me. The storage table and MV have to be in the same catalog. That would be a good fifth requirement to add to the list. Thanks Benny On Thu, Sep 19, 2024 at 1:27 AM Jan Kaul wrote:

Re: [DISCUSS] Iceberg Materialzied Views

2024-09-19 Thread Jan Kaul
ll the key requirements above is much more important. Thanks Benny On Sat, Sep 14, 2024 at 2:01 AM Jan Kaul wrote: How about we make the /catalog_name field/ of the identifier optional? If the field is missing, it references a table/view in the same catalog. If it is present it h

Re: [DISCUSS] Iceberg Materialzied Views

2024-09-14 Thread Jan Kaul
iers without the catalog name.  The MV and storage table would have to be in the same catalog. Thanks Benny On Fri, Sep 13, 2024 at 2:08 AM Jan Kaul wrote: Hi, regarding our recent discussion on table identifiers with respect to different catalog_names with different query eng

Re: [DISCUSS] Iceberg Materialzied Views

2024-09-13 Thread Jan Kaul
support catalog name, it should be at the representation level, but catalog name does not really depend on the “dialect” but rather on the “engine”; hence the discussion becomes a little more involved. Thanks, Walaa. On Wed, Sep 11, 2024 at 1:11 PM Jan Kaul wrote: Hi Benny, I think

Re: [DISCUSS] Iceberg Materialzied Views

2024-09-11 Thread Jan Kaul
er job at handling CRUD by UUID instead of engine specific identifiers. Another scenario we need to think through is a view that joins tables from two different catalogs.  How would we represent the lineage for that in an engine agnostic way? Thanks Benny On Tue, Sep 10, 2024 at 7:21 AM

Re: [DISCUSS] Iceberg Materialzied Views

2024-09-10 Thread Jan Kaul
identifiers of multiple representations in Approach 1. The takeaway is that SQL identifiers are highly coupled with engines, just like views. It makes sense to track both together for consistency. Thanks, Walaa. On Sat, Sep 7, 2024 at 8:15 AM Jan K

Re: [DISCUSS] Iceberg Materialzied Views

2024-09-03 Thread Jan Kaul
on the UUID vs table identifier discussion. I will do that by next week. Thanks, Walaa. On Thu, Aug 29, 2024 at 5:31 AM Jan Kaul wrote: Hi all,

Re: [DISCUSS] Iceberg Materialzied Views

2024-09-02 Thread Jan Kaul
at by next week. Thanks, Walaa. On Thu, Aug 29, 2024 at 5:31 AM Jan Kaul wrote: Hi all, to move the Iceberg Materialized View Proposal forward, I created a PR (https://github.com/apache/iceberg/pull/11041) that adds a section on Materialized Views to the View Spec. I hope we

[DISCUSS] Iceberg Materialzied Views

2024-08-29 Thread Jan Kaul
Hi all, to move the Iceberg Materialized View Proposal forward, I created a PR (https://github.com/apache/iceberg/pull/11041) that adds a section on Materialized Views to the View Spec. I hope we can resolve any remaining questions there before we start the voting process for the Proposal

Re: [DISCUSS] Materialized Views: Lineage and State information

2024-08-16 Thread Jan Kaul
creation time by the respective engine, the refresh operation does not need to parse the SQL, correct? Thanks, Walaa. On Fri, Aug 16, 2024 at 12:24 AM Jan Kaul wrote: As the table I created is not properly shown in the mailing list I'll reformat the summary of

Re: [DISCUSS] Materialized Views: Lineage and State information

2024-08-16 Thread Jan Kaul
view version required if child view is updated (#5) On 16.08.24 09:17, Jan Kaul wrote: Hi, Thanks Micah for clearly stating the requirements. I think this gives better clarity for the discussion. It seems like we don't have a solution that satisfies all requirements at once. So we wou

Re: [DISCUSS] Materialized Views: Lineage and State information

2024-08-16 Thread Jan Kaul
ement for MVs.  The view lineage makes sharing of views between engines without common SQL dialects possible. Benny On Thu, Aug 15, 2024 at 12:22 AM Jan Kaul wrote:

Re: [DISCUSS] Materialized Views: Lineage and State information

2024-08-15 Thread Jan Kaul
Hi all, I would like to reemphasize the purpose of the refresh-state for materialized views. The purpose is to determine if the precomputed data is fresh, stale or invalid. For that the current snapshot-id of every table in the query tree has to be fetched from the catalog by using its full i

Re: Iceberg MV Refresh

2024-06-20 Thread Jan Kaul
Thanks Benny for bringing these issues up. I would agree with both of your propositions. Regarding the naming of the fields, we can go with the naming that you suggested. I just wanted to wait and see whether more people chime in with their opinions. Jan On 20.06.24 23:16, Benny Chow wrote: > So ba

Re: Summary of Iceberg Materialized View Meeting

2024-06-20 Thread Jan Kaul
n the doc for a week, then meet again if they are still open? Thanks, Walaa. On Fri, Jun 7, 2024 at 12:27 PM Jan Kaul wrote: No that's great, thank you. I'm thankful for the input. Jan On 07.06.2024 at 17:53, Benny Chow wrote: Looks good Jan. I'm a bit

Agenda Community Sync 19th June

2024-06-18 Thread Jan Kaul
Hi all, I was wondering whether there was an agenda for the community sync tomorrow. There currently is no entry in the google doc. Best wishes, Jan

Re: Summary of Iceberg Materialized View Meeting

2024-06-07 Thread Jan Kaul
No that's great, thank you. I'm thankful for the input. Jan On 07.06.2024 at 17:53, Benny Chow wrote: Looks good Jan. I'm a bit nitpicky about picking good names so I left some comments around that to see what others think. Thanks On Fri, Jun 7, 2024 at 2:26 AM

Re: Summary of Iceberg Materialized View Meeting

2024-06-07 Thread Jan Kaul
erg is an open project and we realize not everyone can attend virtual meetings and want you to know you are welcome. On Jun 6, 2024, at 7:11 AM, Jan Kaul wrote:  Hi all, thanks to all of you who attended the

Summary of Iceberg Materialized View Meeting

2024-06-06 Thread Jan Kaul
Hi all, thanks to all of you who attended the meeting yesterday! It was great to talk to you and I think we made great progress. For those of you who weren't able to attend the meeting, I summarized the main points below: *Question 1*: Should we store the "storage table pointer" as a view pr

Re: Iceberg Materialized View Meeting

2024-06-04 Thread Jan Kaul
ng Jan. I’ll be there! Benny On Jun 3, 2024, at 11:15 PM, Jan Kaul wrote: Hi all, we will have a video call to get together and discuss Iceberg Materialized Views. The call is on Wednesday, 5 June 2024, 16:00:00 UTC (9:00 PDT) and you can jo

Iceberg Materialized View Meeting

2024-06-03 Thread Jan Kaul
Hi all, we will have a video call to get together and discuss Iceberg Materialized Views. The call is on *Wednesday, 5 June 2024, 16:00:00 UTC (9:00 PDT)* and you can join the meeting with the following link: https://meet.google.com/ttr-xwnk-wiz On the agenda are: * Store "Storage table po

Re: Materialized Views: Next Steps

2024-05-08 Thread Jan Kaul
ts.apache.org/thread/rotmqzmwk5jrcsyxhzjhrvcjs5v3yjcc Thanks, Walaa. On Wed, May 8, 2024 at 2:31 AM Jan Kaul wrote: The original google doc <https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?usp=sharing> discussed multiple aspects of the Materialized View spec. One was

Re: Materialized Views: Next Steps

2024-05-08 Thread Jan Kaul
nal doc and issue as references in the PR description. Please let me know if this works. Happy to hear others' thoughts on the best way to move forward. Thanks, Walaa. On Wed, May 8, 2024 at 12:56 AM Jan Kaul wrote: Thanks Walaa for trying to move things along. However I don't

Re: Materialized Views: Next Steps

2024-05-08 Thread Jan Kaul
Thanks Walaa for trying to move things along. However I don't think it's a good idea to start a separate discussion about the metadata for materialized views because we already had this discussion and reached consensus in this google doc: https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwX

Re: [Proposal] Add support for Materialized Views in Iceberg

2024-04-26 Thread Jan Kaul
0 On 25.04.24 21:54, Alagappan Maruthappan wrote: +1 for separate table and view objects. On Thu, Apr 25, 2024 at 12:33 PM Russell Spitzer wrote: +1 to separate. On Apr 25, 2024, at 2:08 PM, Jean-Baptiste Onofré wrote: +1 to separate, it makes sense to me.

Re: Materialized view integration with REST spec

2024-03-26 Thread Jan Kaul
the entire MV metadata object in process and returns the view metadata part. When the “loadTable” method of the TableCatalog is then called to obtain the storage table, it returns the table part of the cached MV metadata object. Best wishes, Jan On 3/26/24 9:08 AM, Jan Kaul wrote: I th

Re: Materialized view integration with REST spec

2024-03-26 Thread Jan Kaul
and their pros and cons we can move forward. How does that sound? Thanks, Walaa. On Mon, Mar 25, 2024 at 7:45 AM Jan Kaul wrote: I have the feeling that the current pros and cons from the summary target a version of the MV spec that wasn't really part of the discussi

Re: Materialized view integration with REST spec

2024-03-25 Thread Jan Kaul
I have the feeling that the current pros and cons from the summary target a version of the MV spec that wasn't really part of the discussion. The current arguments target a completely new specification for materialized views which, as we agreed, is out of scope. Instead of a completely new speci

Re: New committer: Renjie Liu

2024-03-10 Thread Jan Kaul
Congrats! On 09.03.2024 at 22:38, Micah Kornfield wrote: Congrats On Saturday, March 9, 2024, Hussein Awala wrote: Congrats Renjie! On Sat, Mar 9, 2024 at 8:55 PM Yufei Gu wrote: Congratulations and thanks for the great work in rust iceberg, Renjie! Yufei On Sat, Ma

Re: Materialized view integration with REST spec

2024-02-29 Thread Jan Kaul
ning Catalog API's with the metadata. Thanks Szehon On Thu, Feb 29, 2024 at 5:45 AM Jan Kaul wrote: Hi all, I would like to provide my perspective on the question of what a materialized view is and elaborate on Jack's recent proposal

Re: Materialized view integration with REST spec

2024-02-29 Thread Jan Kaul
Hi all, I would like to provide my perspective on the question of what a materialized view is and elaborate on Jack's recent proposal to view a materialized view as a catalog concept. Firstly, let's look at the role of the catalog. Every entity in the catalog has a *unique identifier*, and t

Re: Materialized view integration with REST spec

2024-02-21 Thread Jan Kaul
a separate meeting. On Wed, Feb 21, 2024 at 12:40 AM Jan Kaul wrote: Thank you Jack for driving the consensus for the MV spec and thank you all for the discussion. I really like the idea about incremental consensus because we often lose sight in detailed d

Re: Materialized view integration with REST spec

2024-02-21 Thread Jan Kaul
Thank you Jack for driving the consensus for the MV spec and thank you all for the discussion. I really like the idea about incremental consensus because we often lose sight in detailed discussions. As Jack mentioned, the highest-priority question currently is: *Should the Iceberg MV be reali

Re: [VOTE] Release Apache Iceberg Rust 0.2.0 RC1

2024-02-16 Thread Jan Kaul
+1 [x] Download links are valid. [x] Checksums and signatures. [x] LICENSE/NOTICE files exist [x] No unexpected binary files [x] All source files have ASF headers [x] Can compile from source On 15.02.24 17:00, Xuanwo wrote: +1 non-binding So happy to get a v0.2.0 release! [x] Download links a

Re: Process for creating new Proposals

2024-02-07 Thread Jan Kaul
egards JB On Mon, Jan 15, 2024 at 9:14 AM Jan Kaul wrote: Hey all, I was wondering if the community decided on a standard way to create new

Process for creating new Proposals

2024-01-15 Thread Jan Kaul
Hey all, I was wondering if the community decided on a standard way to create new proposals. In the community meeting it sounded like there is a consensus on using GitHub issues with a special "proposal" label. I think it would also be great to decide on how the proposal process should look lik

Re: Branching and Tagging for Iceberg Views

2023-11-14 Thread Jan Kaul
Thank you for your comments. I should have provided a user story to make the use case clearer. While the WAP pattern is probably the most common usage for the branching feature of Iceberg tables, it could also be used in different ways. The following is a user story showcasing the branching

Branching and Tagging for Iceberg Views

2023-11-13 Thread Jan Kaul
Hi all, I was wondering what you think about a Branching and Tagging feature for Iceberg Views similar to the one for Iceberg Tables. The difference is that instead of having references to table snapshots you would have references to view versions. This could be accomplished similarly to Iceberg Tables by in

Re: Feedback on Iceberg Materialized View Spec

2023-11-10 Thread Jan Kaul
Up until 2 weeks ago the discussion took place in the GitHub issue, but since then most people have joined the discussion in the Google doc. Since the Google doc seems to have more visibility I would propose to continue the discussion there. I hope that's fine. Best wishes, Jan On 08.11.23 07:09,

Re: Feedback on Iceberg Materialized View Spec

2023-11-07 Thread Jan Kaul
another look, I would really appreciate your input. Best wishes, Jan On 27.10.23 11:48, Jan Kaul wrote: Thank you Dan and the others for your helpful comments. I've added some sections to address the points that you mentioned. I'm not really sure what you mean by fail after grace peri

Re: Feedback on Iceberg Materialized View Spec

2023-10-27 Thread Jan Kaul
I like the idea of GitHub. Why not enable (in .asf.yml) GitHub discussions? A GitHub Discussion could be a good place to share the doc and exchange both in the doc and in the discussion

Feedback on Iceberg Materialized View Spec

2023-10-24 Thread Jan Kaul
Hi all, a while ago I created an issue to propose a design for a Materialized View Spec. After further discussion we reached a first draft for the spec. It would be great if you could have another look at the design and share your feedback.

Re: Discussion about the location of language clients

2023-08-10 Thread Jan Kaul
Hi all, first off, thanks Brian for starting the conversation and thanks Renjie for the write-up. I'm also in the multi-repo camp because of the already mentioned benefits. One point I would like to add is that the potential drawback of having less visibility with multi-repos can be mitigate

Location of rust repo

2023-07-19 Thread Jan Kaul
Hey all, we just had our first sync for the rust iceberg developers and it was great to talk to everyone. The most important point that came up was the location where the rust development should take place. The two options are either to have a separate "iceberg-rust" repository or to create

Re: Rust support support

2023-07-10 Thread Jan Kaul
Hi Brian, thanks for starting the discussion around a native library for Iceberg. I'm Jan, the owner of the iceberg-rust repository. Recently, another project started the implementation of an Iceberg Rust library (https://github.com/icelake-io/icelake). I think this is great timing to start a

Re: C++/Rust SDK sync

2023-04-12 Thread Jan Kaul
Hi Jan, Thanks for raising this, and I'd love to join the sync. I did quite a bit of work on the Python implementation, and I'm happy to help with the Rust/C++ SDK as well. I'm neither a R

C++/Rust SDK sync

2023-04-07 Thread Jan Kaul
those who want to join the meeting, it would be great if you could answer this email with the dates that you are available. I will then create an online meeting for the date where most people can join. I'm looking forward to talking to you. Best wishes, Jan Kaul

Re: Iceberg Materialized View Spec

2022-12-22 Thread Jan Kaul
Hi Walaa, as you pointed out, design 1 in the GitHub issue with a common view and a linked storage table seems to be the most promising going forward. I therefore put together an initial proposal for a specification. I realize that my proposal