Re: [VOTE] Document Snapshot Summary Optional Fields as Subsection of Appendix F in Spec

2025-01-21 Thread Péter Váry
+1 On Wed, Jan 22, 2025, 06:06 huaxin gao wrote: > +1 (non-binding) > > On Tue, Jan 21, 2025 at 6:04 PM Manu Zhang > wrote: > >> +1 (non-binding) >> >> Thanks & Regards >> >> On Wed, Jan 22, 2025 at 8:06 AM Daniel Weeks wrote: >> >>> +1 (binding) >>> >>> On Tue, Jan 21, 2025 at 1:05 PM Szehon

Re: [VOTE] Document Snapshot Summary Optional Fields as Subsection of Appendix F in Spec

2025-01-21 Thread huaxin gao
+1 (non-binding) On Tue, Jan 21, 2025 at 6:04 PM Manu Zhang wrote: > +1 (non-binding) > > Thanks & Regards > > On Wed, Jan 22, 2025 at 8:06 AM Daniel Weeks wrote: > >> +1 (binding) >> >> On Tue, Jan 21, 2025 at 1:05 PM Szehon Ho >> wrote: >> >>> +1 (binding) >>> >>> Thanks >>> Szehon >>> >>> O

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread huaxin gao
+1 (non-binding) On Tue, Jan 21, 2025 at 4:20 PM Amogh Jahagirdar <2am...@gmail.com> wrote: > +1 Thank you Christian! > > On Tue, Jan 21, 2025 at 12:35 PM Sreeram Garlapati < > gsreeramku...@gmail.com> wrote: > >> +1 >> >> Thanks for cleaning this up. >> >> Best, >> Sreeram >> >> On Mon, Jan 20,

Re: Proposal: Parquet footer size in Iceberg metadata

2025-01-21 Thread Sreeram Garlapati
Thanks for the nice idea/suggestion, Dan. Yes, we have been employing a technique similar to the one you noted below and kinda arrived at the conclusion that there is no deterministic way to achieve the optimal situation, i.e., a single I/O call to S3 to read the Parquet footer. Best, Sreeram On Tue

Re: [DISCUSS] Use pr title + pr description as default git commit title + message in iceberg-rust

2025-01-21 Thread Renjie Liu
Thanks everyone for joining the discussion. I'll submit a Jira ticket to enable it for iceberg-rust, iceberg-cpp, iceberg-go, and pyiceberg. On Mon, Jan 20, 2025 at 9:44 AM Junwang Zhao wrote: > hi Renjie, > > It would be great if iceberg-cpp adopted the same methodology. > > Regards > Junwang Zha

Re: [VOTE] Document Snapshot Summary Optional Fields as Subsection of Appendix F in Spec

2025-01-21 Thread Manu Zhang
+1 (non-binding) Thanks & Regards On Wed, Jan 22, 2025 at 8:06 AM Daniel Weeks wrote: > +1 (binding) > > On Tue, Jan 21, 2025 at 1:05 PM Szehon Ho wrote: > >> +1 (binding) >> >> Thanks >> Szehon >> >> On Tue, Jan 21, 2025 at 12:55 PM Yufei Gu wrote: >> >>> +1 Thanks Honah! >>> >>> Yufei >>> >

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-21 Thread Manu Zhang
Hi Ryan, "I think you could achieve what you're looking for by setting the age to 1 ms and the minimum number of snapshots to keep." I'm not sure how I can set the minimum number of snapshots to keep for tables with different update frequencies. For a daily updated table, I might set it to 5. How

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Amogh Jahagirdar
+1 Thank you Christian! On Tue, Jan 21, 2025 at 12:35 PM Sreeram Garlapati wrote: > +1 > > Thanks for cleaning this up. > > Best, > Sreeram > > On Mon, Jan 20, 2025 at 11:25 PM Christian Thiel < > christian.t.b...@gmail.com> wrote: > >> Hi everyone, >> >> based on good feedback on the [DISCUSS]

Re: Proposal: Parquet footer size in Iceberg metadata

2025-01-21 Thread Daniel Weeks
Hey Sreeram, I think it's worthwhile to consider what value would be added by tracking the footer size in metadata, but there are other options to address these optimization use cases. For example, if you take a look at the RangeReadable

Re: [VOTE] Document Snapshot Summary Optional Fields as Subsection of Appendix F in Spec

2025-01-21 Thread Daniel Weeks
+1 (binding) On Tue, Jan 21, 2025 at 1:05 PM Szehon Ho wrote: > +1 (binding) > > Thanks > Szehon > > On Tue, Jan 21, 2025 at 12:55 PM Yufei Gu wrote: > >> +1 Thanks Honah! >> >> Yufei >> >> >> On Tue, Jan 21, 2025 at 12:45 PM Russell Spitzer < >> russell.spit...@gmail.com> wrote: >> >>> +1 >>>

Re: Reminder: Iceberg Catalog Community Sync

2025-01-21 Thread Honah J.
No problem! Here is the Google meeting link: https://meet.google.com/xbs-yqfb-joa Best regards, Honah On Tue, Jan 21, 2025 at 2:34 PM Manish Malhotra < manish.malhotra.w...@gmail.com> wrote: > Thanks, > > Can you please share the meeting link as well? > > Regards, > Manish > > On Tue, Jan 21, 2

Re: Reminder: Iceberg Catalog Community Sync

2025-01-21 Thread Manish Malhotra
Thanks, Can you please share the meeting link as well? Regards, Manish On Tue, Jan 21, 2025 at 2:23 PM Honah J. wrote: > Hi everyone, > > FYI, the first catalog community sync in 2025 will be tomorrow, > Wednesday 01/22 at 9AM (US/Pacific). Here are the meeting notes/recordings: > > https://

Reminder: Iceberg Catalog Community Sync

2025-01-21 Thread Honah J.
Hi everyone, FYI, the first catalog community sync in 2025 will be tomorrow, Wednesday 01/22 at 9AM (US/Pacific). Here are the meeting notes/recordings: https://docs.google.com/document/d/1iPGVCIcr-M0XtAiudOguWAvmqIdVgpYN5vz5ohO8PKw/edit?tab=t.0#heading=h.cr6o1g2rn5hc Please feel free to add topi

Re: [VOTE] Document Snapshot Summary Optional Fields as Subsection of Appendix F in Spec

2025-01-21 Thread Szehon Ho
+1 (binding) Thanks Szehon On Tue, Jan 21, 2025 at 12:55 PM Yufei Gu wrote: > +1 Thanks Honah! > > Yufei > > > On Tue, Jan 21, 2025 at 12:45 PM Russell Spitzer < > russell.spit...@gmail.com> wrote: > >> +1 >> >> On Tue, Jan 21, 2025 at 2:36 PM rdb...@gmail.com >> wrote: >> >>> +1 >>> >>> On Tu

Re: [VOTE] Document Snapshot Summary Optional Fields as Subsection of Appendix F in Spec

2025-01-21 Thread Yufei Gu
+1 Thanks Honah! Yufei On Tue, Jan 21, 2025 at 12:45 PM Russell Spitzer wrote: > +1 > > On Tue, Jan 21, 2025 at 2:36 PM rdb...@gmail.com wrote: > >> +1 >> >> On Tue, Jan 21, 2025 at 12:20 PM Honah J. wrote: >> >>> Hi everyone, >>> >>> In the last VOTE >>>

Re: [VOTE] Document Snapshot Summary Optional Fields as Subsection of Appendix F in Spec

2025-01-21 Thread Russell Spitzer
+1 On Tue, Jan 21, 2025 at 2:36 PM rdb...@gmail.com wrote: > +1 > > On Tue, Jan 21, 2025 at 12:20 PM Honah J. wrote: > >> Hi everyone, >> >> In the last VOTE >> thread >> on documenting snapshot summary optional fields, we decid

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-21 Thread Russell Spitzer
I do think this comes up a lot and is one of the more confusing things about snapshot expiration. Definitely one of my most answered questions is: "When I set min-snapshots to 1, why do I not get only 1 snapshot?" I agree adding another behavior may be even more confusing but I wouldn't be oppo

Re: [VOTE] Document Snapshot Summary Optional Fields as Subsection of Appendix F in Spec

2025-01-21 Thread rdb...@gmail.com
+1 On Tue, Jan 21, 2025 at 12:20 PM Honah J. wrote: > Hi everyone, > > In the last VOTE > thread > on documenting snapshot summary optional fields, we decided to move the > documentation to a subsection of Appendix F – Implementa

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-21 Thread rdb...@gmail.com
I think you could achieve what you're looking for by setting the age to 1 ms and the minimum number of snapshots to keep. Then snapshot expiration would always expire all snapshots other than the min number, getting you what you want. It probably wouldn't make sense to set a maximum as well. Right
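
For illustration, the retention behavior described in this thread can be modeled with a small Python sketch. This is a simplified model and not Iceberg's implementation: it assumes a linear snapshot history, and the names Snapshot, retained_snapshots, max_age_ms and min_to_keep are hypothetical. The assumed rule is that a snapshot survives expiration if it is among the min_to_keep most recent snapshots or if it is younger than the maximum age.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical, minimal stand-in for a table snapshot.
    snapshot_id: int
    timestamp_ms: int


def retained_snapshots(snapshots, now_ms, max_age_ms, min_to_keep):
    """Return the snapshots that survive expiration, newest first.

    Assumed rule (simplified): keep a snapshot if it is one of the
    `min_to_keep` most recent, or if its age does not exceed `max_age_ms`.
    """
    ordered = sorted(snapshots, key=lambda s: s.timestamp_ms, reverse=True)
    kept = []
    for rank, snap in enumerate(ordered):
        within_min_count = rank < min_to_keep
        young_enough = (now_ms - snap.timestamp_ms) <= max_age_ms
        if within_min_count or young_enough:
            kept.append(snap)
    return kept


# Four snapshots committed at t = 0, 1000, 2000 and 3000 ms.
history = [Snapshot(i + 1, i * 1000) for i in range(4)]

# Age of 1 ms plus a minimum of 2: only the 2 newest snapshots remain.
tight = retained_snapshots(history, now_ms=10_000, max_age_ms=1, min_to_keep=2)

# Minimum of 1 with a long age: more than 1 snapshot remains.
loose = retained_snapshots(history, now_ms=10_000, max_age_ms=9_500, min_to_keep=1)
```

Under this model, setting the age to 1 ms makes the minimum count the only thing that matters, which is the workaround suggested above. With a longer age, the minimum acts as a floor rather than a cap, which is one way to read the "why do I not get only 1 snapshot" question raised elsewhere in this thread.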

[VOTE] Document Snapshot Summary Optional Fields as Subsection of Appendix F in Spec

2025-01-21 Thread Honah J.
Hi everyone, In the last VOTE thread on documenting snapshot summary optional fields, we decided to move the documentation to a subsection of Appendix F – Implementation Notes. Since this is a significant change, I canceled the pre

Proposal: Parquet footer size in Iceberg metadata

2025-01-21 Thread Sreeram Garlapati
Hello Team! This is a small improvement proposal to store the *Parquet footer size* as part of the *data_file* metadata in the Iceberg manifest. *manifest_entry > (2) data_file > (146 Optional) footer_size_in_bytes* *Motivation*: - We have se
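
As background for the proposal, a Parquet file ends with the footer metadata, followed by a 4-byte little-endian footer length and the 4-byte magic "PAR1". A reader that does not know the footer length therefore needs one read for the 8-byte tail and a second read for the footer itself. The Python sketch below illustrates that difference using an in-memory stand-in for an object store. It is an illustration under stated assumptions, not Iceberg or Parquet library code: CountingBlob, make_fake_file and both reader functions are hypothetical, the footer bytes are fake rather than real Thrift metadata, and the file size is assumed to be known up front.

```python
import struct

MAGIC = b"PAR1"


class CountingBlob:
    """In-memory stand-in for an object store; counts range reads."""

    def __init__(self, data: bytes):
        self.data = data
        self.reads = 0

    def size(self) -> int:
        # Assumed to be known without I/O (e.g. from table metadata).
        return len(self.data)

    def read_range(self, start: int, length: int) -> bytes:
        self.reads += 1
        return self.data[start:start + length]


def make_fake_file(body: bytes, footer: bytes) -> bytes:
    # Layout: magic, data, footer, 4-byte little-endian footer length, magic.
    return MAGIC + body + footer + struct.pack("<I", len(footer)) + MAGIC


def read_footer_two_reads(blob: CountingBlob) -> bytes:
    """Footer length unknown: read the 8-byte tail, then the footer."""
    file_size = blob.size()
    tail = blob.read_range(file_size - 8, 8)
    if tail[4:] != MAGIC:
        raise ValueError("missing trailing magic")
    (footer_len,) = struct.unpack("<I", tail[:4])
    return blob.read_range(file_size - 8 - footer_len, footer_len)


def read_footer_one_read(blob: CountingBlob, footer_size_in_bytes: int) -> bytes:
    """Footer length known from metadata: one range read covers footer and tail."""
    file_size = blob.size()
    start = file_size - 8 - footer_size_in_bytes
    chunk = blob.read_range(start, footer_size_in_bytes + 8)
    if chunk[-4:] != MAGIC:
        raise ValueError("missing trailing magic")
    (stored_len,) = struct.unpack("<I", chunk[-8:-4])
    if stored_len != footer_size_in_bytes:
        raise ValueError("footer size in metadata does not match the file")
    return chunk[:footer_size_in_bytes]


fake_footer = b"fake-thrift-metadata"
file_bytes = make_fake_file(b"row-group-bytes" * 10, fake_footer)

unknown = CountingBlob(file_bytes)
footer_a = read_footer_two_reads(unknown)

known = CountingBlob(file_bytes)
footer_b = read_footer_one_read(known, len(fake_footer))
```

In this sketch the reader with a known footer size issues one range read where the other issues two. Whether that saving justifies a new manifest field is the question the thread discusses.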

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Sreeram Garlapati
+1 Thanks for cleaning this up. Best, Sreeram On Mon, Jan 20, 2025 at 11:25 PM Christian Thiel wrote: > Hi everyone, > > based on good feedback on the [DISCUSS] thread [1] I would like to raise > a vote to deprecate the `snapshot-id` field of the `SetStatisticsUpdate` > in the IRC. It is redun

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Honah J.
+1 Thanks Christian! Best regards, Honah On Tue, Jan 21, 2025 at 10:54 AM Yufei Gu wrote: > +1 Thanks for removing the redundancy! > Yufei > > > On Tue, Jan 21, 2025 at 9:28 AM Jean-Baptiste Onofré > wrote: > >> +1 (non binding) >> >> Thanks Christian ! >> >> Regards >> JB >> >> On Tue, Jan 21,

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Yufei Gu
+1 Thanks for removing the redundancy! Yufei On Tue, Jan 21, 2025 at 9:28 AM Jean-Baptiste Onofré wrote: > +1 (non binding) > > Thanks Christian ! > > Regards > JB > > On Tue, Jan 21, 2025 at 8:25 AM Christian Thiel > wrote: > > > > Hi everyone, > > > > based on good feedback on the [DISCUSS] t

Re: [DISCUSS] Add metadata stats/metrics management on the REST Spec

2025-01-21 Thread Jean-Baptiste Onofré
Hi Dan, The target is about exposing stats & metrics from the metadata (relaying partition stats, etc), and give the option for a REST Catalog implementation to extend with additional metrics/stats. The purpose of the REST Catalog interface is to expose that from the query planner, be able to use

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-21 Thread Daniel Weeks
Hey Manu, I think I understand what you're trying to achieve here and I feel like the most important part is to have an updated version of the retention procedure to clearly state how this interacts with the other settings as part of the

Re: [DISCUSS] Add metadata stats/metrics management on the REST Spec

2025-01-21 Thread Daniel Weeks
Hey JB, I'm not sure I fully understand what the proposal is, but I also realise it's probably not completely fleshed out yet. When you say "manage metadata", the first concern that I have is whether you mean to just query/get the info or to also modify it. Table metadata is immutable and requir

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Jean-Baptiste Onofré
+1 (non binding) Thanks Christian ! Regards JB On Tue, Jan 21, 2025 at 8:25 AM Christian Thiel wrote: > > Hi everyone, > > based on good feedback on the [DISCUSS] thread [1] I would like to raise > a vote to deprecate the `snapshot-id` field of the `SetStatisticsUpdate` > in the IRC. It is redu

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Daniel Weeks
+1 On Tue, Jan 21, 2025 at 8:38 AM Eduard Tudenhöfner wrote: > +1 > > On Tue, Jan 21, 2025 at 5:34 PM Marc Cenac > wrote: > >> +1 non-binding >> >> On Tue, Jan 21, 2025 at 8:19 AM Sung Yun wrote: >> >>> +1 non-binding >>> >>> Thanks for driving this Christian! >>> >>> On 2025/01/21 12:39:26 Ru

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Kevin Liu
+1 (non-binding) Thanks Christian! I also created an issue to track the deprecation for the pyiceberg side https://github.com/apache/iceberg-python/issues/1556 Best, Kevin Liu On Tue, Jan 21, 2025 at 8:39 AM Eduard Tudenhöfner wrote: > +1 > > On Tue, Jan 21, 2025 at 5:34 PM Marc Cenac > wrot

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Eduard Tudenhöfner
+1 On Tue, Jan 21, 2025 at 5:34 PM Marc Cenac wrote: > +1 non-binding > > On Tue, Jan 21, 2025 at 8:19 AM Sung Yun wrote: > >> +1 non-binding >> >> Thanks for driving this Christian! >> >> On 2025/01/21 12:39:26 Russell Spitzer wrote: >> > +1 >> > >> > On Tue, Jan 21, 2025 at 4:34 AM Alex Dutra

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Marc Cenac
+1 non-binding On Tue, Jan 21, 2025 at 8:19 AM Sung Yun wrote: > +1 non-binding > > Thanks for driving this Christian! > > On 2025/01/21 12:39:26 Russell Spitzer wrote: > > +1 > > > > On Tue, Jan 21, 2025 at 4:34 AM Alex Dutra > > > wrote: > > > > > +1 (nb) > > > > > > On Tue, Jan 21, 2025 at 1

Re: FileRewrite API refactor

2025-01-21 Thread Russell Spitzer
To bump this back up, I think this is a pretty important change to the core library so it's necessary that we get more folks involved in this discussion. I agree that the Rewrite Data Files needs to be broken up and realigned if we want to be able to reuse the code in Flink. I think I prefer t

[DISCUSS] Add metadata stats/metrics management on the REST Spec

2025-01-21 Thread Jean-Baptiste Onofré
Hi folks, I know we don't want to "expose" the whole metadata tables in the REST API, but I would like to discuss adding metadata stats and metrics management. We are discussing this as part of the Apache Polaris TMS proposal. The purpose is: 1. To add interfaces to manage metadata stats and metr

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Sung Yun
+1 non-binding Thanks for driving this Christian! On 2025/01/21 12:39:26 Russell Spitzer wrote: > +1 > > On Tue, Jan 21, 2025 at 4:34 AM Alex Dutra > wrote: > > > +1 (nb) > > > > On Tue, Jan 21, 2025 at 11:30 AM Piotr Findeisen < > > piotr.findei...@gmail.com> wrote: > > > >> +1 non-binding >

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Russell Spitzer
+1 On Tue, Jan 21, 2025 at 4:34 AM Alex Dutra wrote: > +1 (nb) > > On Tue, Jan 21, 2025 at 11:30 AM Piotr Findeisen < > piotr.findei...@gmail.com> wrote: > >> +1 non-binding >> >> On Tue, 21 Jan 2025 at 10:25, Fokko Driesprong wrote: >> >>> +1 >>> >>> Thanks for cleaning this up Christian! >>>

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Alex Dutra
+1 (nb) On Tue, Jan 21, 2025 at 11:30 AM Piotr Findeisen wrote: > +1 non-binding > > On Tue, 21 Jan 2025 at 10:25, Fokko Driesprong wrote: > >> +1 >> >> Thanks for cleaning this up Christian! >> >> Kind regards, >> Fokko >> >> Op di 21 jan 2025 om 08:25 schreef Christian Thiel < >> christian.t.

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Piotr Findeisen
+1 non-binding On Tue, 21 Jan 2025 at 10:25, Fokko Driesprong wrote: > +1 > > Thanks for cleaning this up Christian! > > Kind regards, > Fokko > > Op di 21 jan 2025 om 08:25 schreef Christian Thiel < > christian.t.b...@gmail.com>: > >> Hi everyone, >> >> based on good feedback on the [DISCUSS] t

Re: [VOTE] Deprecate IRC snapshot-id Field of SetStatisticsUpdate

2025-01-21 Thread Fokko Driesprong
+1 Thanks for cleaning this up Christian! Kind regards, Fokko Op di 21 jan 2025 om 08:25 schreef Christian Thiel < christian.t.b...@gmail.com>: > Hi everyone, > > based on good feedback on the [DISCUSS] thread [1] I would like to raise > a vote to deprecate the `snapshot-id` field of the `SetSt