Re: [VOTE] Merge guidelines for committing PRs

2024-08-29 Thread rdb...@gmail.com
-0

While I appreciate the motivation, I think that this is going to lead to
more problems, not fewer.

On Wed, Aug 28, 2024 at 10:54 PM Renjie Liu  wrote:

> +1 (binding)
>
> On Thu, Aug 29, 2024 at 8:59 AM Amogh Jahagirdar <2am...@gmail.com> wrote:
>
>> +1 (binding)
>>
>> On Wed, Aug 28, 2024 at 6:45 PM Yufei Gu  wrote:
>>
>>> +1 (binding)
>>> Yufei
>>>
>>>
>>> On Wed, Aug 28, 2024 at 4:56 PM Anton Okolnychyi 
>>> wrote:
>>>
 +1 (binding)

 Thanks, Micah!

 On Wed, Aug 28, 2024 at 16:36 Dmitri Bourlatchkov wrote:

> +1 (nb)
>
> Cheers,
> Dmitri.
>
> On Wed, Aug 28, 2024 at 12:29 PM Micah Kornfield <
> emkornfi...@gmail.com> wrote:
>
>> I propose to merge https://github.com/apache/iceberg/pull/10780 as a
>> starting place for describing community norms around merging/discussing 
>> PRs.
>>
>> We've discussed this [1] and gone through a bunch of revisions on the
>> PR to arrive at a minimal starting point for describing the merge process.
>>
>> The vote will remain open for at least 72 hours.
>>
>> Vote:
>> [] +1
>> [] +0
>> [] -1, do not merge because ...
>>
>>
>> My vote is +1 (non-binding)
>>
>> Thanks,
>> Micah
>>
>>
>> [1] https://lists.apache.org/thread/l5d1vrnt5n3xmkqnx1yk7mhyfn2z0cx9
>>
>


Re: [DISCUSS] Row Lineage Proposal

2024-08-29 Thread rdb...@gmail.com
+1 for making row lineage and equality deletes mutually exclusive.

The idea behind equality deletes is to avoid needing to read existing data
in order to delete records. That doesn't fit with row lineage because the
purpose of lineage is to be able to identify when a row changes by
maintaining an identifier that would have to be read.

On Wed, Aug 28, 2024 at 4:16 PM Anton Okolnychyi 
wrote:

> I went through the proposal and left comments as well. Thanks for working
> on it, Russell!
>
> I don't see a good solution to how row lineage can work with equality
> deletes. If so, I would be in favor of not allowing equality deletes at all
> if row lineage is enabled as opposed to treating all added data records as
> new. I will spend more time thinking if we can make it work.
>
> - Anton
>
> On Wed, Aug 28, 2024 at 12:41 Ryan Blue wrote:
>
>> Sounds good to me. Thanks for pushing this forward, Russell!
>>
>> On Tue, Aug 27, 2024 at 7:17 PM Russell Spitzer <
>> russell.spit...@gmail.com> wrote:
>>
>>> I think folks have had a lot of good comments and since there haven't
>>> been a lot of strong opinions I'm going to try to take what I think are the
>>> least interesting options and move them into the "discarded section".
>>> Please continue to comment and let's please make sure anything that folks
>>> think are blockers for a Spec PR are eliminated. If we have general
>>> consensus at a high level I think we can move to discussing the actual spec
>>> changes on a spec change PR.
>>>
>>> I'm going to be keeping the proposals for :
>>>
>>> Global Identifier as the Identifier
>>> and
>>> Last Updated Sequence number as the Version
>>>
>>>
>>>
>>> On Tue, Aug 20, 2024 at 3:21 AM Ryan Blue 
>>> wrote:
>>>
 The situation in which you would use equality deletes is when you do
 not want to read the existing table data. That seems at odds with a feature
 like row-level tracking where you want to keep track. To me, it would be a
 reasonable solution to just say that equality deletes can't be used in
 tables where row-level tracking is enabled.

 On Mon, Aug 19, 2024 at 11:34 AM Russell Spitzer <
 russell.spit...@gmail.com> wrote:

> As far as I know Flink is actually the only engine we have at the
> moment that can produce Equality deletes and only Equality deletes have
> this
> specific problem. Since an equality delete can be written without actually
> knowing whether rows are being updated or not, it is always ambiguous as 
> to
> whether a new row is an updated row, a newly added row, or a row which was
> deleted but then a newly added row was also appended.
>
> I think in this case we need to ignore row_versioning and just give
> every new row a brand new identifier. For a reader this means all updates
> look like a "delete" and "add" and no "updates". For other processes (COW
> and Position Deletes) we only mark records as being deleted or updated
> after finding them first, this makes it easy to take the lineage 
> identifier
> from the source record and change it. For Spark, we just kept working on
> engine improvements (like SPJ, Dynamic partition pushdown) to try to make
> that scan and join faster, but latency is probably still somewhat higher.
>
> I think we could theoretically resolve equality deletes into updates
> at compaction time again but only if the user first defines accurate "row
> identity" columns because otherwise we have no way of determining whether
> rows were updated or not. This is basically the issue we have now in the
> CDC procedures. Ideally, I think we need to find a way to have flink 
> locate
> updated rows at runtime using some better indexing structure or something
> like that as you suggested.
>
> On Sat, Aug 17, 2024 at 1:07 AM Péter Váry <
> peter.vary.apa...@gmail.com> wrote:
>
>> Hi Russell,
>>
>> As discussed offline, this would be very hard to implement with the
>> current Flink CDC write strategies. I think this is true for every
>> streaming writer.
>>
>> For tracking the previous version of the row, the streaming writer
>> would need to scan the table. This needs to be done for every record to
>> find the previous version. It could be possible if the data were stored in
>> a way that supports fast queries on the primary key, like an LSM tree (see:
>> Paimon [1]); otherwise it would be prohibitively costly and unfeasible for
>> higher loads. So adding a new storage strategy could be one solution.
>>
>> Alternatively we might find a way for the compaction to update the
>> lineage fields. We could provide a way to link the equality deletes to 
>> the
>> new rows which updated them during write, then on compaction we could
>> update the lineage fields based on this info.
>>
>> Is there any better ideas w

Re: [VOTE] Merge REST Spec Change To Add New Scan Planning APIs

2024-09-03 Thread rdb...@gmail.com
+1

I think it would be good to give an overview of the current proposal since
it has evolved quite a bit from the original like Jack said.

On Tue, Sep 3, 2024 at 9:09 AM Jack Ye  wrote:

> Thanks for keeping pushing for this Rahil. Personally I am +1 (binding)
> for this, with just some minor comments in the latest PR.
>
> But I think the initial DISCUSS thread [1] was quite a while ago and a lot
> has changed after a lot of comments and reviews. Should we restart another
> DISCUSS thread before voting, to make sure people are aware of the latest
> design and address any additional comments?
>
> Best,
> Jack Ye
>
> [1] https://lists.apache.org/thread/qq13468x6gk0vxnsckzc5xd02tjlvpkm
>
>
> On Mon, Sep 2, 2024 at 9:22 PM Chertara, Rahil 
> wrote:
>
>> Hi all,
>>
>>
>>
>> I've opened a PR [1] to add REST spec changes for a new protocol around
>> table scan planning. For context around the design discussions, see the
>> original google doc proposal [2], the dev list discussion thread [3], and
>> finally the discussion that has happened on the spec change PR.
>>
>>
>>
>> Please vote on merging this change. The vote will remain open for at
>> least 72 hours.
>>
>>
>>
>> [] +1
>>
>> [] +0
>>
>> [] -1, do not merge because ...
>>
>>
>>
>> [1] https://github.com/apache/iceberg/pull/9695
>>
>> [2]
>> https://docs.google.com/document/d/1FdjCnFZM1fNtgyb9-v9fU4FwOX4An-pqEwSaJe8RgUg/edit#heading=h.cftjlkb2wh4h
>>
>> [3] https://lists.apache.org/thread/qq13468x6gk0vxnsckzc5xd02tjlvpkm
>>
>>
>>
>> Thanks,
>>
>> Rahil Chertara
>>
>


Re: Time-based partitioning on long column type

2024-09-10 Thread rdb...@gmail.com
Maybe we could update the time-based partition functions to be applied to a
long column directly. It would treat that column like a timestamp in
milliseconds. Would that work? I need to think more about the implications
of doing that, but I don't think that we currently have an issue with
extending the supported source column types like that.
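
As a rough sketch of that interpretation (not the actual Java implementation;
it just treats the long as milliseconds since the Unix epoch, the way the
existing transforms treat timestamps), the day and hour transforms would
reduce to something like:

import java.time.Instant;
import java.time.LocalDate;

public class LongAsMillisTransforms {
  // Day ordinal: days since 1970-01-01, the same value the existing day
  // transform produces for a timestamp column.
  static int day(long epochMillis) {
    return Math.toIntExact(Math.floorDiv(epochMillis, 86_400_000L));
  }

  // Hour ordinal: hours since 1970-01-01T00:00:00Z.
  static int hour(long epochMillis) {
    return Math.toIntExact(Math.floorDiv(epochMillis, 3_600_000L));
  }

  public static void main(String[] args) {
    long millis = Instant.parse("2024-09-10T16:00:00Z").toEpochMilli();
    System.out.println(day(millis) + " = " + LocalDate.ofEpochDay(day(millis)));
    System.out.println("hour ordinal: " + hour(millis));
  }
}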

On Mon, Sep 9, 2024 at 9:05 PM Manu Zhang  wrote:

> Hi all,
>
> I'd like to bump this thread since we don't want to allow long to
> timestamp promotion in V3
> .
> What other options do we have?
>
> Thanks,
> Manu
>
> On Fri, Apr 5, 2024 at 12:09 AM Jean-Baptiste Onofré 
> wrote:
>
>> Ah yes, milestone is fine. Thanks !
>>
>> All good.
>>
>> Regards
>> JB
>>
>> On Thu, Apr 4, 2024 at 5:12 PM Eduard Tudenhoefner 
>> wrote:
>> >
>> > There is the V3 Spec milestone where it's tracked (amongst other
>> things).
>> >
>> > On Thu, Apr 4, 2024 at 9:44 AM Jean-Baptiste Onofré 
>> wrote:
>> >>
>> >> Hi Eduard,
>> >>
>> >> Thanks for the update ! It makes sense to me.
>> >>
>> >> Maybe a GH label with spec or v3_spec would help to see what is
>> planned for v3 ?
>> >>
>> >> Regards
>> >> JB
>> >>
>> >> On Thu, Apr 4, 2024 at 9:36 AM Eduard Tudenhoefner 
>> wrote:
>> >> >
>> >> > Type promotion from Long to Timestamp is on the roadmap for the V3
>> Spec, so that would be the preferred way.
>> >> >
>> >> > On Wed, Apr 3, 2024 at 10:38 AM Jean-Baptiste Onofré <
>> j...@nanthrax.net> wrote:
>> >> >>
>> >> >> Hi Manu
>> >> >>
>> >> >> TIMESTAMP_LONG type promotion could be the easiest way, it would
>> work
>> >> >> with the existing transform.
>> >> >>
>> >> >> Would it work for you?
>> >> >>
>> >> >> Regards
>> >> >> JB
>> >> >>
>> >> >> On Wed, Apr 3, 2024 at 5:56 AM Manu Zhang 
>> wrote:
>> >> >> >
>> >> >> > Hi all,
>> >> >> >
>> >> >> > We have source data with a timestamp field in LONG type to land
>> in an Iceberg table. We want to partition the table with the timestamp
>> field such that we can query recent data more efficiently. However, LONG is
>> not supported as the source type of time-based transform (hour, day, etc)
>> >> >> >
>> >> >> > I find the previous discussion
>> https://github.com/apache/iceberg/issues/417 and Ryan suggested two
>> solutions
>> >> >> >
>> >> >> > 1. type promotion from LONG to TIMESTAMP
>> >> >> > 2. custom transform
>> >> >> >
>> >> >> > As I understand it, neither solution has already been implemented
>> yet. Is there any progress in either direction? Which solution does the
>> community prefer? Any other suggestions are also appreciated.
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Manu
>>
>


[DISCUSS] September board report

2024-09-10 Thread rdb...@gmail.com
Hi everyone,

It’s time for another ASF board report! Here’s my current draft. Please
reply if you think there is something that I should add or change. Thanks!

Ryan
Description:

Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.
Project Status:

Current project status: Ongoing
Issues for the board: None
Membership Data:

Apache Iceberg was founded 2020-05-19 (4 years ago)
There are currently 31 committers and 21 PMC members in this project.
The Committer-to-PMC ratio is roughly 4:3.

Community changes, past quarter:

   - Amogh Jahagirdar was added to the PMC on 2024-08-12
   - Eduard Tudenhoefner was added to the PMC on 2024-08-12
   - Honah J. was added to the PMC on 2024-07-22
   - Renjie Liu was added to the PMC on 2024-07-22
   - Peter Vary was added to the PMC on 2024-08-12
   - Piotr Findeisen was added as committer on 2024-07-24
   - Kevin Liu was added as committer on 2024-07-24
   - Sung Yun was added as committer on 2024-07-24
   - Hao Ding was added as committer on 2024-07-23

Project Activity:

Releases:

   - Java 1.6.1 was released on 2024-08-28
   - Rust 0.3.0 was released on 2024-08-20
   - PyIceberg 0.7.1 was released on 2024-08-18
   - PyIceberg 0.7.0 was released on 2024-07-30
   - Java 1.6.0 was released on 2024-07-23

Table format:

   - Work for v3 is picking up
   - Committed timestamp_ns implementation
   - Ongoing discussion/proposal for improvements to row-level deletes
   - Ongoing discussion/proposal for row-level metadata for change tracking
   - Discussion for adding variant type and where to maintain the spec
   (Parquet)
   - Making progress on geometry types
   - Clarified transform requirements to add transforms as needed (to
   support geo)
   - Discovered issues affecting new type promotion cases, reduced scope

REST protocol specification:

   - Added server-side scan planning
   - Support for removing partition specs
   - Support for endpoint discovery for future additions
   - Clarified failure requirements for unknown actions or validations

Java:

   - Added classes for v3 table writes
   - Fixed rewrites in tables with 1000+ columns
   - Added Kafka Connect runtime bundle
   - Support for Flink 1.20
   - Added range distribution support in Flink
   - Dropped support for Java 8

PyIceberg:

   - Discussed adding a dependency on iceberg-rust for native extensions
   - Write support for time and identity transforms
   - Parallelized large writes
   - Support for deletes using filter predicates
   - Staged table creation for atomic CTAS
   - Support manifest merging on write
   - Better integration with PyArrow to produce lazy readers from scans
   - New API to add existing Parquet files
   - Support custom catalogs

Rust:

   - Established subproject pyiceberg_core to support PyIceberg
   - Implemented OAuth for catalog REST client
   - Added Parquet writer and reader capabilities with support for data
   projection.
   - Introduced memory catalog and memory file IO support
   - Initialized SQL Catalog
   - Added support for GCS storage and AWS session tokens
   - Implemented concurrent table scans and data file fetching
   - Enhanced predicate builders and expression evaluators
   - Added support for timestamp columns in row filters

Go:

   - Implemented expressions and expression visitors

Community Health:

Several new committers and PMC members were added this quarter, which is a
good indicator of community health. There was also a significant number of
threads on the mailing list about setting expectations for contributors and
clearly documenting how the community operates. New guidelines for merging PRs
have been added to the website, and the community is also discussing guidelines
for how contributors can become committers. This builds on work from last
quarter that clarified the process for design discussions.

Many of the topics under discussion were raised because of the acquisition
that was noted in the last board report. The community has been working to
address the concerns raised, which are primarily in 3 areas:

   - How decisions are made about designs and commits (now clarified)
   - How contributors become committers and PMC members (under discussion)
   - How the community operates when people cannot reach consensus

The last concern has historically not been a problem; people have so far
chosen to “disagree and commit” when a large majority in the community has
a different opinion. However, the first instance of this was encountered near
the end of the quarter. The community and PMC need to discuss how to make
progress on the issue.


Re: [DISCUSS] September board report

2024-09-11 Thread rdb...@gmail.com
Thanks for the updates! I'll add those.

On Wed, Sep 11, 2024 at 8:02 AM Jean-Baptiste Onofré 
wrote:

> Hi Ryan,
>
> It looks good to me. Thanks !
>
> Regards
> JB
>
> On Tue, Sep 10, 2024 at 11:43 PM rdb...@gmail.com 
> wrote:
> >
> > Hi everyone,
> >
> > It’s time for another ASF board report! Here’s my current draft. Please
> reply if you think there is something that I should add or change. Thanks!
> >
> > Ryan
> >
> > Description:
> >
> > Apache Iceberg is a table format for huge analytic datasets that is
> designed
> > for high performance and ease of use.
> >
> > Project Status:
> >
> > Current project status: Ongoing
> > Issues for the board: None
> >
> > Membership Data:
> >
> > Apache Iceberg was founded 2020-05-19 (4 years ago)
> > There are currently 31 committers and 21 PMC members in this project.
> > The Committer-to-PMC ratio is roughly 4:3.
> >
> > Community changes, past quarter:
> >
> > Amogh Jahagirdar was added to the PMC on 2024-08-12
> > Eduard Tudenhoefner was added to the PMC on 2024-08-12
> > Honah J. was added to the PMC on 2024-07-22
> > Renjie Liu was added to the PMC on 2024-07-22
> > Peter Vary was added to the PMC on 2024-08-12
> > Piotr Findeisen was added as committer on 2024-07-24
> > Kevin Liu was added as committer on 2024-07-24
> > Sung Yun was added as committer on 2024-07-24
> > Hao Ding was added as committer on 2024-07-23
> >
> > Project Activity:
> >
> > Releases:
> >
> > Java 1.6.1 was released on 2024-08-28
> > Rust 0.3.0 was released on 2024-08-20
> > PyIceberg 0.7.1 was released on 2024-08-18
> > PyIceberg 0.7.0 was released on 2024-07-30
> > Java 1.6.0 was released on 2024-07-23
> >
> > Table format:
> >
> > Work for v3 is picking up
> > Committed timestamp_ns implementation
> > Ongoing discussion/proposal for improvements to row-level deletes
> > Ongoing discussion/proposal for row-level metadata for change tracking
> > Discussion for adding variant type and where to maintain the spec
> (Parquet)
> > Making progress on geometry types
> > Clarified transform requirements to add transforms as needed (to support
> geo)
> > Discovered issues affecting new type promotion cases, reduced scope
> >
> > REST protocol specification:
> >
> > Added server-side scan planning
> > Support for removing partition specs
> > Support for endpoint discovery for future additions
> > Clarified failure requirements for unknown actions or validations
> >
> > Java:
> >
> > Added classes for v3 table writes
> > Fixed rewrites in tables with 1000+ columns
> > Added Kafka Connect runtime bundle
> > Support for Flink 1.20
> > Added range distribution support in Flink
> > Dropped support for Java 8
> >
> > PyIceberg:
> >
> > Discussed adding a dependency on iceberg-rust for native extensions
> > Write support for time and identity transforms
> > Parallelized large writes
> > Support for deletes using filter predicates
> > Staged table creation for atomic CTAS
> > Support manifest merging on write
> > Better integration with PyArrow to produce lazy readers from scans
> > New API to add existing Parquet files
> > Support custom catalogs
> >
> > Rust:
> >
> > Established subproject pyiceberg_core to support PyIceberg
> > Implemented OAuth for catalog REST client
> > Added Parquet writer and reader capabilities with support for data
> projection.
> > Introduced memory catalog and memory file IO support
> > Initialized SQL Catalog
> > Added support for GCS storage and AWS session tokens
> > Implemented concurrent table scans and data file fetching
> > Enhanced predicate builders and expression evaluators
> > Added support for timestamp columns in row filters
> >
> > Go:
> >
> > Implemented expressions and expression visitors
> >
> > Community Health:
> >
> > Several new committers and PMC members were added this quarter, which is
> a good
> > indicator for community health. There was also a significant number of
> threads
> > on the mailing list about setting expectations for contributors and
> clearly
> > document how the community operates. New guidelines for merging PRs have
> been
> > added to the website and the community is also discussing guidelines for
> how
> > contributors can become committers. This builds on work from last
> quarter that
> > clarified the process for desig

Re: [DISCUSS] Define calendar used in specification?

2024-09-12 Thread rdb...@gmail.com
The spec purposely avoids timestamp conversion. Iceberg returns values as
they are passed from the engine and it is the engine's responsibility to do
any date/time conversion. I don't think that we should change this and take
responsibility in Iceberg.
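
As a small illustration (plain Java, not Iceberg code) of why the calendar
matters at all for old dates, the same wall-clock date maps to different day
ordinals under Java's hybrid Julian/Gregorian calendar and java.time's
proleptic Gregorian calendar:

import java.time.LocalDate;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class CalendarDifference {
  public static void main(String[] args) {
    // java.time is proleptic Gregorian, matching what OffsetDateTime uses.
    long prolepticDays = LocalDate.of(1000, 1, 1).toEpochDay();

    // java.util.GregorianCalendar switches to the Julian calendar before
    // 1582-10-15 by default, so the same wall-clock date is a different day.
    GregorianCalendar hybrid = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
    hybrid.clear();
    hybrid.set(1000, Calendar.JANUARY, 1);
    long hybridDays = Math.floorDiv(hybrid.getTimeInMillis(), 86_400_000L);

    // The two ordinals differ by several days for a date this old.
    System.out.println(prolepticDays + " vs " + hybridDays);
  }
}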

On Thu, Sep 12, 2024 at 12:32 AM Bart Samwel 
wrote:

> I have some historical context that may or may not be relevant. I still
> remember how we did the transition for Spark. This was ca. 2019, and there
> were still many people mixing Spark 2.x and 3.0. Also, many other systems
> were still using Java 7 which only supported Julian. As a result, Spark
> 3.0+ can even still write with the Julian calendar, at least if using the
> Spark-native parquet read and write path.
>
> 1) The parquet files written by Spark 3.0+ have metadata keys that contain
> a Spark version ("org.apache.spark.version") and whether the timestamps are
> in Julian a.k.a. Java 7 ("org.apache.spark.legacyDateTime"). There's also
> "org.apache.spark.legacyINT96" which is about whether INT96 timestamps have
> been written with Julian calendar in the date part.
>
> 2) Files that don't have a Spark version are interpreted as Julian or
> proleptic Gregorian depending on a config
> "spark.sql.parquet.datetimeRebaseModeInRead" /
> "spark.sql.parquet.int96RebaseModeInRead". (There are similar configs for
> ORC and avro.) This defaults to EXCEPTION, which means "if a date is
> different in the two calendars, fail the write and force the users to
> choose". If it's set to LEGACY, then Spark will actually "rebase" the dates
> at read time because Spark 3.0+ uses the Java 8 proleptic gregorian
> calendar internally.
>
> 3) Writing mode is controlled by configs
> "spark.sql.parquet.datetimeRebaseModeInWrite" and
> "spark.sql.parquet.int96RebaseModeInWrite". These were also until recently
> set to EXCEPTION (i.e., force the user to choose when a value is
> encountered where it matters). See
> https://issues.apache.org/jira/browse/SPARK-46440.
>
> I'm not sure if any of this matters for Iceberg though. It may matter if
> any Iceberg implementation writes using the Spark native parquet/orc/avro
> write path AND the user has configured it to use LEGACY dates. Or are there
> paths where Iceberg can convert from Parquet files? Then you might
> encounter these metadata flags. I'm not sure if it's worth complicating the
> spec by supporting this. :)
>
> On Thu, Sep 12, 2024 at 8:03 AM Micah Kornfield 
> wrote:
>
>> At the moment, the specification is ambiguous on which calendar is used
>> for temporal conversion/writing [1]. Reading the java code it appears it is
>> using Java's OffsetDateTime which conforms to ISO8601 [2].  ISO8601 appears
>> to explicitly disallow the Julian calendar (but only says proleptic
>> gregorian can be used by mutual consent [3]).
>>
>> Therefore I'd propose:
>> 1. We make the  ISO8601 + proleptic Gregorian + Gregorian calendars
>> explicit in the specification.
>> 2. Mention in an implementation note, that data migrated from other
>> systems or data written by older systems might follow the Julian calendar
>> (e.g. it looks like Spark only transitioned in 3.0 [4]).
>>   *  Does anybody know of metadata available for systems to make this
>> determination?
>>   *  Or a recommendation on how to handle these?
>>
>> Thoughts?
>>
>> Thanks,
>> Micah
>>
>> [1] This is esoteric but a few systems use 0001-01-01 as a sentinel value
>> for null so does have some wider applicability
>> [2]
>> https://docs.oracle.com/javase/8/docs/api/java/time/OffsetDateTime.html
>> [3] https://en.wikipedia.org/wiki/ISO_8601#Dates
>> [4] https://issues.apache.org/jira/browse/SPARK-26651
>>
>>


Re: Time-based partitioning on long column type

2024-09-12 Thread rdb...@gmail.com
I was thinking about adding to the specification how the current date/time
transforms work with source columns that are longs. I don't think it helps
much to introduce a new transform, although that would make the
interpretation of the source data more obvious to users applying the
transform.

On Wed, Sep 11, 2024 at 10:26 PM Micah Kornfield 
wrote:

> Maybe we could update the time-based partition functions to be applied to
>> a long column directly. It would treat that column like a timestamp in
>> milliseconds. Would that work? I need to think more about the implications
>> of doing that, but I don't think that we currently have an issue with
>> extending the supported source column types like that.
>
>
> By update do you mean adding new functions like
> HOUR_FROM_UNIX_EPOCH_MILLIS or overloading the definition to accept Long
> values and treat them as milliseconds. The former seems more reasonable to
> me. The latter I think has many of the same draw-backs raised on the other
> thread.  IMO, both aren't super pleasing from a long term maintainability
> perspective.
>
> Cheers,
> Micah
>
> On Tue, Sep 10, 2024 at 8:50 AM rdb...@gmail.com  wrote:
>
>> Maybe we could update the time-based partition functions to be applied to
>> a long column directly. It would treat that column like a timestamp in
>> milliseconds. Would that work? I need to think more about the implications
>> of doing that, but I don't think that we currently have an issue with
>> extending the supported source column types like that.
>>
>> On Mon, Sep 9, 2024 at 9:05 PM Manu Zhang 
>> wrote:
>>
>>> Hi all,
>>>
>>> I'd like to bump this thread since we don't want to allow long to
>>> timestamp promotion in V3
>>> <https://lists.apache.org/thread/79y866zdhs2fmyv0nsfq3xvdsmqh7h8c>.
>>> What other options do we have?
>>>
>>> Thanks,
>>> Manu
>>>
>>> On Fri, Apr 5, 2024 at 12:09 AM Jean-Baptiste Onofré 
>>> wrote:
>>>
>>>> Ah yes, milestone is fine. Thanks !
>>>>
>>>> All good.
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> On Thu, Apr 4, 2024 at 5:12 PM Eduard Tudenhoefner 
>>>> wrote:
>>>> >
>>>> > There is the V3 Spec milestone where it's tracked (amongst other
>>>> things).
>>>> >
>>>> > On Thu, Apr 4, 2024 at 9:44 AM Jean-Baptiste Onofré 
>>>> wrote:
>>>> >>
>>>> >> Hi Eduard,
>>>> >>
>>>> >> Thanks for the update ! It makes sense to me.
>>>> >>
>>>> >> Maybe a GH label with spec or v3_spec would help to see what is
>>>> planned for v3 ?
>>>> >>
>>>> >> Regards
>>>> >> JB
>>>> >>
>>>> >> On Thu, Apr 4, 2024 at 9:36 AM Eduard Tudenhoefner <
>>>> edu...@tabular.io> wrote:
>>>> >> >
>>>> >> > Type promotion from Long to Timestamp is on the roadmap for the V3
>>>> Spec, so that would be the preferred way.
>>>> >> >
>>>> >> > On Wed, Apr 3, 2024 at 10:38 AM Jean-Baptiste Onofré <
>>>> j...@nanthrax.net> wrote:
>>>> >> >>
>>>> >> >> Hi Manu
>>>> >> >>
>>>> >> >> TIMESTAMP_LONG type promotion could be the easiest way, it would
>>>> work
>>>> >> >> with the existing transform.
>>>> >> >>
>>>> >> >> Would it work for you?
>>>> >> >>
>>>> >> >> Regards
>>>> >> >> JB
>>>> >> >>
>>>> >> >> On Wed, Apr 3, 2024 at 5:56 AM Manu Zhang <
>>>> owenzhang1...@gmail.com> wrote:
>>>> >> >> >
>>>> >> >> > Hi all,
>>>> >> >> >
>>>> >> >> > We have source data with a timestamp field in LONG type to land
>>>> in an Iceberg table. We want to partition the table with the timestamp
>>>> field such that we can query recent data more efficiently. However, LONG is
>>>> not supported as the source type of time-based transform (hour, day, etc)
>>>> >> >> >
>>>> >> >> > I find the previous discussion
>>>> https://github.com/apache/iceberg/issues/417 and Ryan suggested two
>>>> solutions
>>>> >> >> >
>>>> >> >> > 1. type promotion from LONG to TIMESTAMP
>>>> >> >> > 2. custom transform
>>>> >> >> >
>>>> >> >> > As I understand it, neither solution has already been
>>>> implemented yet. Is there any progress in either direction? Which solution
>>>> does the community prefer? Any other suggestions are also appreciated.
>>>> >> >> >
>>>> >> >> > Thanks,
>>>> >> >> > Manu
>>>>
>>>


Re: [Discuss] test logging is broken and Avro 1.12.0 upgraded slf4j-api dep to 2.x

2024-09-16 Thread rdb...@gmail.com
If I understand the SLF4J announcement correctly, it sounds like the best
option is to rely on binary compatibility between the 1.x and 2.x clients.

As long as we don't use the newer API, the compiled code can use
either a 1.7.x or 2.0.x API Jar. The API Jar needs to match the provider
version, so it should be supplied by downstream code instead of Iceberg.
Luckily, it doesn't look like we ship the SLF4J API in our runtime
binaries, so we already have a situation where downstream projects can
choose the SLF4J version of both the API and provider Jars.

In that case, I think the best path forward is to upgrade to 2.x but not
use the new API features that will cause problems if downstream libraries
are not already on 2.x.

Does that sound reasonable?
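
To make the "don't use the newer API" part concrete, here is a sketch of the
kind of call that stays on the shared surface (the class and message are just
for illustration):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingExample {
  private static final Logger LOG = LoggerFactory.getLogger(LoggingExample.class);

  void logSomething(String tableName) {
    // Parameterized logging like this exists in both 1.7.x and 2.0.x, so code
    // compiled against the 2.x API jar still links against a 1.7.x API jar
    // supplied by a downstream project.
    LOG.info("Loaded table {}", tableName);

    // The 2.0-only fluent API, e.g. LOG.atInfo().log(...), is the kind of call
    // to avoid, since it would break downstream builds still on 1.7.x.
  }
}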

On Wed, Sep 11, 2024 at 11:17 AM Steven Wu  wrote:

> Following up on the discussion from the community sync meeting. Right now,
> Iceberg test code is in the 3rd row in the table pasted below. With the
> recent Avro 1.12 upgrade (and slf4j 2.x), the main code is also affected.
> That means downstream applications (Spark, Trino, Flink, ...) may run into
> the same situation when upgrading to the next Iceberg 1.7 release.
>
> [image: image.png]
>
> From the slf4j doc, it seems that *slf4j 2.x API is backward compatible
> as long as provider/binding is updated to 2.x too*.
> https://www.slf4j.org/faq.html#changesInVersion200
>
> We have two options
> 1. Exclude the slf4j 2.x transitive dependency from Avro 1.12. but we
> would need to be comprehensive on forcing the slf4j dependency version
> resolution to 1.x everywhere until Iceberg is ready to move forward to
> slf4j 2.x.
> 2. Move forward with slf4j 2.x now with a clear callout in the 1.7 release
> notes.
>
> First option is the more conservative approach and defer the slf4j 2.x
> upgrade decision to the downstream applications. But it will add a little
> burden to make sure future dependency upgrades don't bring in slf4j 2.x
> transitively. Maybe we can leverage the force resolution from Gradle?
>
> configurations.all {
>   resolutionStrategy {
> force 'org.slf4j:slf4j-api:1.7.35'
>   }
> }
>
>
> Thanks,
> Steven
>
> SLF4J 2.0 stable release was announced *2 years ago*:
> https://mailman.qos.ch/pipermail/announce/2022/000176.html. It explained
> binary compatibility.
>
> "Mixing different versions of slf4j-api.jar and SLF4J provider can cause
> problems. For example, if you are usingslf4j-api-2.0.0.jar, then you should
> also use slf4j-simple-2.0.0.jar, using slf4j-simple-1.5.5.jar will not
> work."
>
> More notes from slf4j FAQ page.
>
> "SLF4J 2.0.0 incorporates an optional fluent api
> . Otherwise, there are no
> client facing API changes in 2.0.x. For most users, upgrading to version
> 2.0.x should be a drop-in replacement, as long as the logging provider is
> updated as well.
>
> From the clients perspective, the SLF4J API, more specifically the
> org.slf4j package, is backward compatible for all versions. This means
> that you can upgrade from SLF4J version 1.0 to any later version without
> problems. Code compiled with *slf4j-api-versionN.jar* will work with
> *slf4j-api-versionM.jar* for any versionN and any versionM. *To date,
> binary compatibility in slf4j-api has never been broken."*
>
>
> On Mon, Sep 9, 2024 at 9:22 AM Steven Wu  wrote:
>
>> Bump the thread to bring the awareness of the issue and implication of
>> slf4j 2.x upgrade.
>>
>> On Mon, Aug 26, 2024 at 12:24 PM Steve Zhang
>>  wrote:
>>
>>> I believe dependabot tried to upgrade self4j to 2.x in [1] but JB
>>> mentioned there's -1 on this upgrade, maybe he has more context.
>>>
>>> [1]https://github.com/apache/iceberg/pull/9688
>>>
>>> Thanks,
>>> Steve Zhang
>>>
>>>
>>>
>>> On Aug 24, 2024, at 7:37 PM, Steven Wu  wrote:
>>>
>>> Hi,
>>>
>>> It seems that test logging is broken in many modules (like core, flink)
>>> because slf4j-api was upgraded to 2.x while slf4j-simple provider is still
>>> on 1.7. I created a PR that upgraded slf4j-simple testImplementation to 2.x
>>> for all subprojects.
>>>
>>> https://github.com/apache/iceberg/pull/11001
>>>
>>> That fixed the test logging problem (e.g. TestInMemoryCatalog). You can
>>> find more details in the PR description. Test logging seems to have been
>>> broken for a while (from 1.4). But those dep problems have been for *test
>>> runtime only*.
>>>
>>> Recent change [1] on Avro 1.12.0 introduced slf4j-api 2.x change for
>>> runtime, as verified by the cmd below on the *main branch*.
>>>
>>> ./gradlew -q :iceberg-core:dependencyInsight --dependency slf4j-api
>>> --configuration runtimeClasspath
>>>
>>> This thread is to raise awareness on the slf4j-api dep change to 2.x as
>>> downstream projects/applications can be affected. Looking forward to
>>> feedback on the path forward.
>>>
>>>1. continue the current upgrade path and document the slf4j-api 2.x
>>>change in the next 1.7 release. But we need to be cautious of not porting
>>

Re: Spec changes for deletion vectors

2024-10-14 Thread rdb...@gmail.com
> I think it might be worth mentioning the current proposal makes some,
mostly minor, design choices to try to be compatible with Delta Lake
deletion vectors.

Yes it does, and thanks for pointing this out, Micah. I think it's
important to consider whether compatibility is important to this community.
I just replied to Piotr on the PR, but I'll adapt some of that response
here to reach the broader community.

I think there is value in supporting compatibility with older Delta
readers, but I acknowledge that this may be my perspective because my
employer has a lot of Delta customers that we are going to support now and
in the future.

The main use case for maintaining compatibility with the Delta format is
that it's hard to move old jobs to new code in a migration. We see a
similar issue in Hive to Iceberg migrations, where unknown older readers
prevent migration entirely because they are hard to track down and often
read files directly from the backing object store. I'd like to avoid the
same problem here, where all readers need to be identified and migrated at
the same time. Compatibility with the format those readers expect makes it
possible to maintain Delta metadata for them temporarily. That increases
confidence that things won't randomly break and makes it easier to get
people to move forward.

The second reason for maintaining compatibility is that we want for the
formats to become more similar. My hope is that we can work across both
communities and come up with a common metadata format in a future version
-- which explains my interest in smooth migrations. Maintaining
compatibility in cases like this builds trust and keeps our options open:
if we have compatible data layers, then it's easier to build a compatible
metadata layer. I'm hoping that if we make the blob format compatible, we
can get the Delta community to start using Puffin for better
self-describing delete files.

Other people may not share those goals, so I think it helps to consider
what is being compromised for this compatibility. I don't think it is too
much. There are 2 additional fields:
* A 4-byte length field (that Iceberg doesn't need)
* A 4-byte CRC to validate the contents of the bitmap

There are also changes to how these would have been added if the Iceberg
community were building this independently.
* Our initial version didn't include a CRC at all, but now that we think
it's useful compatibility means using a CRC-32 checksum rather than a newer
one
* The Delta format uses big endian for its fields (or mixed endian if you
consider RoaringBitmap is LE)
* The magic bytes (added to avoid reading the Puffin footer) would have
been different

Overall, I don't think that those changes to what we would have done are
unreasonable. It's only 8 extra bytes and half of them are for a checksum
that is a good idea.
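
To put the overhead in perspective, here is a rough sketch of the framing
around the serialized 64-bit roaring bitmap. This is illustrative only: the
magic value, what the length and CRC cover, and the exact field order are
whatever the spec PRs define, not what is shown here.

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.CRC32;

public class DeleteVectorFraming {
  // magicBytes and serializedBitmap stand in for the magic sequence defined in
  // the spec PR and the portable 64-bit RoaringBitmap serialization.
  static byte[] frame(byte[] magicBytes, byte[] serializedBitmap) {
    CRC32 crc = new CRC32();
    crc.update(serializedBitmap);          // the 4-byte CRC-32 over the bitmap

    ByteBuffer buf = ByteBuffer
        .allocate(4 + magicBytes.length + serializedBitmap.length + 4)
        .order(ByteOrder.BIG_ENDIAN);      // Delta's framing fields are big endian
    buf.putInt(magicBytes.length + serializedBitmap.length);  // the 4-byte length
    buf.put(magicBytes);
    buf.put(serializedBitmap);
    buf.putInt((int) crc.getValue());
    return buf.array();
  }
}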

I'm looking forward to what the rest of the community thinks about this.
Thanks for reviewing the PR!

Ryan


On Sun, Oct 13, 2024 at 10:45 PM Jean-Baptiste Onofré 
wrote:

> Hi
>
> Thanks for the PRs ! I reviewed Anton's document, I will do a pass on the
> PRs.
>
> Imho, it's important to get feedback from query engines, as, if delete
> vectors is not a problem per se (it's what we are using as internal
> representation), the use of Puffin files to store it is "impactful"
> for the query engines (probably some query engines might need to
> implement Puffin spec (read/write) using other language than Java, for
> instance Apache Impala).
>
> I like the proposal, I just hope we won't "surprise" some query
> engines with extra work :)
>
> Regards
> JB
>
> On Thu, Oct 10, 2024 at 11:41 PM rdb...@gmail.com 
> wrote:
> >
> > Hi everyone,
> >
> > There seems to be broad agreement around Anton's proposal to use
> deletion vectors in Iceberg v3, so I've opened two PRs that update the spec
> with the proposed changes. The first, PR #11238, adds a new Puffin blob
> type, delete-vector-v1, that stores a delete vector. The second, PR #11240,
> updates the Iceberg table spec.
> >
> > Please take a look and comment!
> >
> > Ryan
>


Re: Spec changes for deletion vectors

2024-10-15 Thread rdb...@gmail.com
Thanks, Szehon.

To clarify on compatibility, using the same format for the blobs makes it
so that existing Delta readers can read and use the DVs written by Iceberg.
I'd love for Delta to adopt Puffin, but if we adopt the extra fields they
would not need to change how readers work. That's why I think there is a
benefit to using the same format: we avoid unnecessary fragmentation and make
sure data and delete files are compatible.

Ryan

On Tue, Oct 15, 2024 at 10:57 AM Szehon Ho  wrote:

> This is awesome work by Anton and Ryan, it looks like a ton of effort has
> gone into the V3 position vector proposal to make it clean and efficient, a
> long time coming and Im truly excited to see the great improvement in
> storage/perf.
>
> wrt to these fields, I think most of the concerns are already mentioned by
> the other community members in the prs
> https://github.com/apache/iceberg/pull/11238 and
> https://github.com/apache/iceberg/pull/11238, so not much to add.  The DV
> itself is RoaringBitmap 64-bit format so that's great, the argument for CRC
> seems reasonable, and I dont have enough data to be opinionated towards
> endian/magic byte.
>
> But I do lean towards the many PR comments that the extra length field is
> unnecessary, and would just add confusion.  It seemed to me that the
> Iceberg community has made so much effort to trim to spec to the bare
> minimum for cleanliness and efficiency, so I do feel the field is not in
> the normal direction of the project.  Also Im not clear on the plan for old
> Delta readers, they cant read Puffin anyway, if Delta adopts Puffin, then
> new readers could adopt?  Anyway great work again, thanks for raising the
> issue on devlist!
>
> Thanks,
> Szehon
>
> On Mon, Oct 14, 2024 at 5:14 PM rdb...@gmail.com  wrote:
>
>> > I think it might be worth mentioning the current proposal makes some,
>> mostly minor, design choices to try to be compatible with Delta Lake
>> deletion vectors.
>>
>> Yes it does, and thanks for pointing this out, Micah. I think it's
>> important to consider whether compatibility is important to this community.
>> I just replied to Piotr on the PR, but I'll adapt some of that response
>> here to reach the broader community.
>>
>> I think there is value in supporting compatibility with older Delta
>> readers, but I acknowledge that this may be my perspective because my
>> employer has a lot of Delta customers that we are going to support now and
>> in the future.
>>
>> The main use case for maintaining compatibility with the Delta format is
>> that it's hard to move old jobs to new code in a migration. We see a
>> similar issue in Hive to Iceberg migrations, where unknown older readers
>> prevent migration entirely because they are hard to track down and often
>> read files directly from the backing object store. I'd like to avoid the
>> same problem here, where all readers need to be identified and migrated at
>> the same time. Compatibility with the format those readers expect makes it
>> possible to maintain Delta metadata for them temporarily. That increases
>> confidence that things won't randomly break and makes it easier to get
>> people to move forward.
>>
>> The second reason for maintaining compatibility is that we want for the
>> formats to become more similar. My hope is that we can work across both
>> communities and come up with a common metadata format in a future version
>> -- which explains my interest in smooth migrations. Maintaining
>> compatibility in cases like this builds trust and keeps our options open:
>> if we have compatible data layers, then it's easier to build a compatible
>> metadata layer. I'm hoping that if we make the blob format compatible, we
>> can get the Delta community to start using Puffin for better
>> self-describing delete files.
>>
>> Other people may not share those goals, so I think it helps to consider
>> what is being compromised for this compatibility. I don't think it is too
>> much. There are 2 additional fields:
>> * A 4-byte length field (that Iceberg doesn't need)
>> * A 4-byte CRC to validate the contents of the bitmap
>>
>> There are also changes to how these would have been added if the Iceberg
>> community were building this independently.
>> * Our initial version didn't include a CRC at all, but now that we think
>> it's useful compatibility means using a CRC-32 checksum rather than a newer
>> one
>> * The Delta format uses big endian for its fields (or mixed endian if you
>> consider RoaringBitmap is LE)
>> * The magic byte

Re: [VOTE] Table V3 Spec: Row Lineage

2024-10-09 Thread rdb...@gmail.com
+1

Thanks for shepherding this, Russell!

On Tue, Oct 8, 2024 at 7:07 PM Russell Spitzer 
wrote:

> Hi Y'all!
>
> I think we are more or less in agreement on adding Row Lineage to the spec
> apart from a few details which may change a bit during implementation.
> Because of this, I'd like to call for an overall vote on whether or not
> Row-Lineage as described in  PR 11130
>  can be added to the spec.
>
> I'll note this is basically giving a thumbs up for reviewers and
> implementers to go ahead with the pull-request and acknowledging that you
> support the direction this proposal is going. I do think we'll probably dig
> a few things out when we write the reference implementation, but I think in
> general we have defined the required behaviors we want to see.
>
> Please vote in the next 72 hours
>
> [ ] +1, commit the proposed spec changes
> [ ] -0
> [ ] -1, do not make these changes because . . .
>
>
> Thanks everyone,
>
> Russ
>


Re: [Discuss] Iceberg community maintaining the docker images

2024-10-09 Thread rdb...@gmail.com
I think it's important for a project to remain focused on its core purpose,
and I've always advocated for Iceberg to remain a library that is easy to
plug into other projects. I think that should be the guide here as well.
Aren't projects like Spark and Trino responsible for producing easy to use
Docker images of those environments? Why would the Iceberg project build
and maintain them?

I would prefer not to be distracted by these things, unless we need them
for cases like supporting testing and validation of things that are part of
the core purpose of the project.

On Tue, Oct 8, 2024 at 6:08 AM Ajantha Bhat  wrote:

> Hello everyone,
>
> Now that the test fixtures are in [1], we can create a runtime JAR for the
> REST catalog adapter [2] from the TCK.
> Following that, we can build and maintain the Docker image based on it [3].
>
> I also envision the Iceberg community maintaining some quick-start Docker
> images, such as spark-iceberg-rest, Trino-iceberg-rest, among others.
>
> I've looked into other Apache projects, and it seems that Apache Infra can
> assist us with this process.
> As we have the option to publish Iceberg docker images under the Apache
> Docker Hub account.
>
> [image: image.png]
>
> I am more than willing to maintain this code, please find the PRs related
> to the same [2] & [3].
>
> Any suggestions on the same? contributions are welcome if we agree to
> maintain it.
>
> [1] https://github.com/apache/iceberg/pull/10908
> [2] https://github.com/apache/iceberg/pull/11279
> [3] https://github.com/apache/iceberg/pull/11283
>
> - Ajantha
>


Re: Clarification on DayTransform Result Type

2024-10-07 Thread rdb...@gmail.com
Yes. When we return the Spark type, it shows up as date and Spark correctly
displays the value.
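
As a small illustration (plain Java, not Iceberg code), the day transform's
int result is exactly the ordinal a DATE stores, which is why reinterpreting
it as a date only changes how it is rendered:

import java.time.Instant;
import java.time.LocalDate;

public class DayOrdinalExample {
  public static void main(String[] args) {
    Instant ts = Instant.parse("2024-10-07T15:30:00Z");
    long dayOrdinal = Math.floorDiv(ts.getEpochSecond(), 86_400L); // days since 1970-01-01
    LocalDate asDate = LocalDate.ofEpochDay(dayOrdinal);           // same value, shown as a date
    System.out.println(dayOrdinal + " -> " + asDate);              // e.g. 20003 -> 2024-10-07
  }
}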

On Mon, Sep 30, 2024 at 9:56 AM Kevin Liu  wrote:

> Thank you both for the insights and context.
>
> As Russell pointed out, the "day partition transform" result is indeed of
> int type. The Types.DateType
> <https://github.com/apache/iceberg/blob/dddb5f423b353d961b8a08eb2cb4371d453c2959/api/src/main/java/org/apache/iceberg/transforms/Days.java#L47>
>  corresponds
> to TypeID.DATE
> <https://github.com/apache/iceberg/blob/09370ddbc39fc3920fb8cbd3dff11b377dd37e40/api/src/main/java/org/apache/iceberg/types/Types.java#L181>,
> which is also an Integer type
> <https://github.com/apache/iceberg/blob/113c6e7d62e53d3e3cb15b1712f3a1db473ca940/api/src/main/java/org/apache/iceberg/types/Type.java#L37>.
> So, this behavior conforms to the spec.
>
> The issue with DayTransform in PyIceberg (#1208
> <https://github.com/apache/iceberg-python/pull/1208>) is due to the
> changes in the PR. The problem arises from how the partition value is
> displayed in the partition metadata table. As Ryan mentioned, Spark
> displays the partition value as `date`. However, the PR removes
> `DateType` as the `result_type`, which causes PyIceberg to display the
> partition value as `int` since the epoch.
>
> > if we just change the type to `date`, engines could correctly display
> the value
>
> I found a related discussion in apache/iceberg/#279
> <https://github.com/apache/iceberg/issues/279#issuecomment-521322801>,
> specifically: "That will cause the partition tuple's field type to be a
> date, which should also cause the metadata table to display formatted dates
> instead of the day ordinal in Spark." I want to confirm my understanding:
> is this behavior due to the Iceberg-to-Spark DateType conversion in `
> <https://github.com/apache/iceberg/blob/main/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/TypeToSparkType.java#L103-L104>
> TypeToSparkType`
> <https://github.com/apache/iceberg/blob/09370ddbc39fc3920fb8cbd3dff11b377dd37e40/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/TypeToSparkType.java#L103-L104>
> ?
>
> Best,
> Kevin
>
>
>
> On Fri, Sep 27, 2024 at 1:52 PM rdb...@gmail.com  wrote:
>
>> The background is that the result of the day function and dates are
>> basically the same: the number of days from the Unix epoch. When we started
>> using metadata tables, we realized that a lot of people use the day
>> function but then get a weird ordinal value out, but if we just change the
>> type to `date`, engines could correctly display the value. This isn't
>> required by the spec, it's just a convenience.
>>
>> On Fri, Sep 27, 2024 at 8:30 AM Russell Spitzer <
>> russell.spit...@gmail.com> wrote:
>>
>>> Good thing DateType is an Integer :)
>>> https://github.com/apache/iceberg/blob/113c6e7d62e53d3e3cb15b1712f3a1db473ca940/api/src/main/java/org/apache/iceberg/types/Type.java#L37
>>>
>>> On Thu, Sep 26, 2024 at 8:38 PM Kevin Liu 
>>> wrote:
>>>
>>>> Hey folks,
>>>>
>>>> While reviewing a PR to fix DayTransform in PyIceberg (#1208
>>>> <https://github.com/apache/iceberg-python/pull/1208>), we found an
>>>> inconsistency between the spec and the Java Iceberg library.
>>>>
>>>> According to the spec
>>>> <https://iceberg.apache.org/spec/#partition-transforms>, the result
>>>> type for the "day partition transform" should be `int`, similar to other
>>>> time-based partition transforms (year/month/hour). However, in the Java
>>>> Iceberg library, the result type for day partition transform is `DateType` 
>>>> (
>>>> source
>>>> <https://github.com/apache/iceberg/blob/dddb5f423b353d961b8a08eb2cb4371d453c2959/api/src/main/java/org/apache/iceberg/transforms/Days.java#L47>).
>>>> This seems to be a discrepancy from the spec, as the day partition
>>>> transform is the only time-based transform with a non-int result
>>>> type—whereas the others use IntegerType (source
>>>> <https://grep.app/search?q=getResultType&filter[repo][0]=apache/iceberg&filter[path][0]=api/src/main/java/org/apache/iceberg/>
>>>> ).
>>>>
>>>> Could someone confirm if my understanding is correct? If so, is there
>>>> any historical context for this difference? Lastly, how should we approach
>>>> resolving this moving forward?
>>>>
>>>> Best,
>>>> Kevin
>>>>
>>>>


Re: [DISCUSS] Remove iceberg-pig module ?

2024-10-19 Thread rdb...@gmail.com
+1

On Thu, Oct 17, 2024 at 11:56 PM Steve Zhang
 wrote:

> +1
>
> Thanks,
> Steve Zhang
>
>
>
> On Oct 17, 2024, at 11:16 PM, roryqi  wrote:
>
> +1.
>
> On Fri, Oct 18, 2024 at 13:44 Péter Váry wrote:
>
>> +1
>>
>> On Fri, Oct 18, 2024, 04:50 Manu Zhang  wrote:
>>
>>> +1
>>>
>>> On Fri, Oct 18, 2024 at 8:50 AM Rodrigo Meneses 
>>> wrote:
>>>
 +1

 On Thu, Oct 17, 2024 at 4:38 PM Bryan Keller  wrote:

> +1
>
>
> On Oct 17, 2024, at 1:51 PM, Anton Okolnychyi 
> wrote:
>
> +1
>
> On Thu, Oct 17, 2024 at 13:42 Steven Wu wrote:
>
>> +1
>>
>> On Thu, Oct 17, 2024 at 10:44 AM John Zhuge 
>> wrote:
>>
>>> +1 (non-binding)
>>>
>>> On Thu, Oct 17, 2024 at 10:21 AM Yufei Gu 
>>> wrote:
>>>
 +1 for deprecating it in 1.7
 Yufei


 On Thu, Oct 17, 2024 at 9:51 AM Ajantha Bhat 
 wrote:

> +1 for dropping it.
>
> On Thu, Oct 17, 2024 at 8:55 PM Daniel Weeks 
> wrote:
>
>> +1 for deprecating and dropping
>>
>> On Thu, Oct 17, 2024 at 7:46 AM Eduard Tudenhöfner <
>> etudenhoef...@apache.org> wrote:
>>
>>> +1 for marking the project deprecated (in 1.7.0) and dropping it
>>> in the next release (1.8.0)
>>>
>>> On Thu, Oct 17, 2024 at 4:36 PM Russell Spitzer <
>>> russell.spit...@gmail.com> wrote:
>>>
 +1 (oink)

 If anyone really cares please chime in but seriously we should
 drop it

 On Thu, Oct 17, 2024 at 8:07 AM Jean-Baptiste Onofré <
 j...@nanthrax.net> wrote:

> Hi folks,
>
> Even if it seems the project is pretty close to 0.18 release,
> Apache
> Pig is a "dormant" project.
>
> I would like to discuss here if it would not make sense to
> remove the
> iceberg-pig module.
>
> Thoughts ?
>
> Regards
> JB
>

>>>
>>> --
>>> John Zhuge
>>>
>>
>
>


Re: Spec changes for deletion vectors

2024-10-19 Thread rdb...@gmail.com
em X in system Y. At the same time, I 
>>>>> have
>>>>> never heard a user complaining about having inconsistent endianness in
>>>>> their protocols.
>>>>>
>>>>> On Thu, Oct 17, 2024 at 11:02 AM Jean-Baptiste Onofré 
>>>>> wrote:
>>>>>
>>>>>> Hi folks,
>>>>>>
>>>>>> As Daniel said, I think we have actually two proposals in one:
>>>>>> 1. The first proposal is "improvement of positional delete files",
>>>>>> using delete vectors stored in Puffin files. I like this proposal, it
>>>>>> makes a lot of sense. I think with a kind of consensus here (we
>>>>>> discussed about how to parse Puffin files, etc, good discussion).
>>>>>> 2. Then, based on (1), is support vector format "compatible" with
>>>>>> Delta. This is also interesting. However, do we really need this in
>>>>>> Spec V3 ? Why not focus on the original proposal (improvement of
>>>>>> positional delete) with a simple approach, and evaluate Delta
>>>>>> compatibility later ? If the compatibility is "easy", I'm not against
>>>>>> to include in V3, but users might be disappointed if bringing this
>>>>>> means a tradeoff.
>>>>>>
>>>>>> Imho, I will focus on 1 because it would be a great feature for the
>>>>>> Iceberg community.
>>>>>>
>>>>>> Regards
>>>>>> JB
>>>>>>
>>>>>> On Wed, Oct 16, 2024 at 9:16 PM Daniel Weeks 
>>>>>> wrote:
>>>>>> >
>>>>>> > Hey Everyone,
>>>>>> >
>>>>>> > I feel like at this point we've articulated all of the various
>>>>>> options and paths forward, but this really just comes down to a matter of
>>>>>> whether we want to make a concession here for the purpose of 
>>>>>> compatibility.
>>>>>> >
>>>>>> > If we were building this with no prior art, I would expect to omit
>>>>>> the length and align the endianness, but given there's an opportunity to
>>>>>> close the gap with minor inefficiency, it merits real consideration.
>>>>>> >
>>>>>> > This proposal takes into consideration bi-directional compatibility
>>>>>> while maintaining backward compatibility.  Do we feel this is beneficial 
>>>>>> to
>>>>>> the larger community or should we discard efforts for compatibility?
>>>>>> >
>>>>>> > -Dan
>>>>>> >
>>>>>> > On Wed, Oct 16, 2024 at 11:01 AM rdb...@gmail.com 
>>>>>> wrote:
>>>>>> >>
>>>>>> >> Thanks, Russell for the clear summary of the pros and cons! I
>>>>>> agree there's some risk to Iceberg implementations, but I think that is
>>>>>> mitigated somewhat by code reuse. For example, an engine like Trino could
>>>>>> simply reuse code for reading Delta bitmaps, so we would get some
>>>>>> validation and support more easily.
>>>>>> >>
>>>>>> >> Micah, I agree with the requirements that you listed, but I would
>>>>>> say #2 is not yet a "requirement" for the design. It's a consideration 
>>>>>> that
>>>>>> I think has real value, but it's up to the community whether we want to 
>>>>>> add
>>>>>> some cost to #1 to make #2 happen. I definitely think that #3 is a
>>>>>> requirement so that we can convert Delta to Iceberg metadata (as in the
>>>>>> iceberg-delta-lake module).
>>>>>> >>
>>>>>> >> For the set of options, I would collapse a few of those options
>>>>>> because I think that we would use the same bitmap representation, the
>>>>>> portable 64-bit roaring bitmap.
>>>>>> >>
>>>>>> >> If that's the case (and probably even if we had some other
>>>>>> representation), then Delta can always add support for reading Iceberg
>>>>>> delete vectors. That means we either go with the current proposal (a) 
>>>>>> that
>>>>>> preserves the ability for existing Delta clie

Re: [DISCUSS] Discrepancy Between Iceberg Spec and Java Implementation for Snapshot summary's 'operation' key

2024-10-19 Thread rdb...@gmail.com
I can provide some historical context here about how the table spec evolved
and how the REST spec works with respect to table versions.

We initially did not have the snapshot summary or operation. When I added
the summary, the operation was intended to be required in cases where the
summary is present. It should always be there if the summary is and the
summary should always be there unless you wrote the metadata.json file way
back in 2017 or 2018. It looks like the spec could be more clear that the
operation is required when summary is present. Anyone want to open a PR?

Anton, I don't think there is a top-level operation field. The Java
Snapshot class tracks the operation as top-level, but it is always stored
in the summary. I think this is consistent with the spec.

For the REST spec, I think that it should be strictly optional to support
v1 tables written with no summary, but it should always be present. I'd
probably leave it required since it already is and is catching a valuable
error case here.
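
A minimal sketch of that handling on the read side, using the Java Snapshot
accessors (the "unknown" fallback is just for illustration):

import java.util.Map;

import org.apache.iceberg.Snapshot;

public class SnapshotOperations {
  // Tolerate very old v1 metadata written before summaries existed, while
  // treating the operation as present whenever a summary is.
  static String operationOf(Snapshot snapshot) {
    Map<String, String> summary = snapshot.summary();
    if (summary == null || summary.isEmpty()) {
      return "unknown";                  // pre-summary metadata from 2017/2018
    }
    return snapshot.operation();         // required whenever a summary exists
  }
}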

When building the REST spec, I took the same approach as the Java parser
(which is also used to parse table metadata coming from REST servers). That is,
it is compatible with v1 metadata; fields that were not required in v1 are
optional. This fits with the principle of "be liberal in what you accept
and strict in what you produce". The REST spec allows passing metadata for
any table version so that the specs are not tightly coupled. The table
version is passed when loading and clients should reject table versions
that are newer than what they can support. The REST protocol just needs to
facilitate passing the table metadata.

Most v2 structures, like the `schemas` list, are introduced as optional in
v1 and made required in v2. That way, it's possible to add metadata to
existing format versions and make the upgrade path easier. Maintaining the
newer structures even though there are different writer versions deployed
is one of the reasons why the REST spec uses a change-based model.
New metadata only needs to be supported by the service maintaining the
metadata.json files and any writers that want to update it.

I see some points about being able to remove old table versions. I don't
think that the REST protocol itself is the place to do this. The protocol
is format-agnostic. Implementations are free to reject requests to create
tables with older versions, or to update the table to a new version.

Ryan

On Fri, Oct 18, 2024 at 6:42 AM Sung Yun  wrote:

> Folks, thank you for your responses! These are really helpful insights.
>
> > I agree that the REST spec should aim to support v1, v2, and potentially
> the upcoming v3. In practice, it seems like the choice of table spec might
> ultimately be dictated by the REST catalog implementation.
>
> > A best practice would be for the server to strive to support all Iceberg
> versions, but the REST spec itself should remain flexible enough to
> accommodate less strict table specs.
>
> Yufei, yes that makes sense, and I agree that the server should strive to
> support all format versions, because otherwise an older client
> application may just not be compatible with a REST Catalog running on a
> higher version of the table spec. I think we have two choices here in ensuring
> that the REST Catalog server is able to support multiple versions of the
> Table Spec:
>
> 1. We could create single components that are common denominators of all
> existing table specs to accommodate the less strict table specs. The REST Catalog
> Spec currently falls short in this approach, and I've put up this PR to
> show what this change would look like just for the Snapshot component:
> https://github.com/apache/iceberg/pull/11353 - My take on this is that
> this approach will make the REST catalog spec more confusing and difficult
> to manage as we add more Table Spec versions moving forward. The discussion
> on this mail list thread is I think a great demonstration of that confusion
> :)
> 2. We could instead create separate Table Spec version specific components
> on the REST Catalog Open API Spec. For example, a Snapshot component could
> be anyOf SnapshotV1 and SnapshotV2, which match the Table Spec V1 and V2
> definitions. I think creating explicit components that match the spec
> definitions will work in our favor when we continue to introduce more Spec
> changes and manage their lifecycles. And perhaps, maybe we could also
> indicate what format-versions the REST Catalog Server supports through an
> endpoint, and communicate it to a client application.
>
> I'd love to hear the community's opinion on suggestion (2)! I'm very
> curious to hear if we've considered it before.
>
> Sung
>
> On 2024/10/18 05:13:15 Péter Váry wrote:
> > Hi Team,
> > Apart from fixing this current issue by relaxing the current spec
> > constraints, to support both v1 and v2 specifications, we should think
> > about how to handle table spec evolution for the long term.
> >
> > What are the base factors we can s

Re: Overwrite old properties on table replace with REST catalog

2024-10-20 Thread rdb...@gmail.com
Hi Vladimir,

This isn't a bug. The behavior of CREATE OR REPLACE is to replace the data
of a table, but to maintain things like other refs, snapshot history,
permissions (if supported by the catalog), and table properties. Table
properties are replaced if they are set in the operation like `b` in your
example. This is not the same as a drop and create, which may be what you
want instead.

The reason for this behavior is that the CREATE OR REPLACE operation is
used to replace a table's data without needing to handle schema changes
between versions. For example, producing a daily report table that replaces
the previous day. However, the table still exists and it is valuable to be
able to time travel to older versions or to be able to use branches and
tags. Clearly, that means that table history and refs stick around so the
table is not completely new every time it is replaced.

Adding on to that, properties control things like ref and snapshot
retention, file format, compression, and other settings. These aren't
settings that need to be carried through in every replace operation. And it
would make no sense if you set the snapshot retention because older
snapshots are retained, only to have it discarded the next time you replace
the table data. A good way to think about this is that table properties are
set infrequently, while table data changes regularly. And the person
changing the data may not be the person tuning the table settings.
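
As a rough sketch through the Java catalog API (the table name, schema, and
the "b" property below are made up for illustration), a createOrReplace
transaction only overrides what it sets explicitly, while under the semantics
described above other properties, refs, and history carry over:

import org.apache.iceberg.Schema;
import org.apache.iceberg.Transaction;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.types.Types;

public class CreateOrReplaceSketch {
  static void replaceDailyReport(Catalog catalog) {
    Schema schema =
        new Schema(Types.NestedField.required(1, "id", Types.LongType.get()));

    // Under the semantics described above, existing table properties (for
    // example snapshot retention settings) are kept; only "b" is overridden
    // because it is explicitly set by this replace.
    Transaction replace =
        catalog
            .buildTable(TableIdentifier.of("reports", "daily"), schema)
            .withProperty("b", "3")
            .createOrReplaceTransaction();

    // ... append the new day's data to replace.table() here ...
    replace.commitTransaction();
  }
}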

Hopefully that helps,

Ryan

On Sun, Oct 20, 2024 at 9:45 AM Vladimir Ozerov 
wrote:

> Hi,
>
> Consider a REST catalog and a user calls "CREATE OR REPLACE "
> command. When processing the command, engines will usually initiate a
> "createOrReplace" transaction and add metadata, such as the properties of a
> new table.
>
> Users expect a table to be replaced with a new one if it exists,
> including properties. However, I observe the following:
>
>1. RESTSessionCatalog loads previous table metadata, adds new
>properties (MetadataUpdate.SetProperties), and invokes the backend
>2. The backend (e.g., Polaris) will typically invoke
>"CatalogHandler.updateTable." There, the previous table state, including
>its properties, is loaded
>3. Finally, metadata updates are applied, and old table properties are
>merged with new ones. That is, if the old table has properties [a=1, b=2],
>and the new table has properties [b=3, c=4], then the final properties
>would be [a=1, b=3, c=4], while the user expects [b=3, c=4].
>
> It looks like a bug because the user expects complete property replacement
> instead of a merge. Shall we explicitly clear all previous properties
> in RESTSessionCatalog.Builder.replaceTransaction?
>
> Regards,
> Vladimir.
>
>
>
> --
> *Vladimir Ozerov*
> Founder
> querifylabs.com
>


Re: [DISCUSS] Discrepancy Between Iceberg Spec and Java Implementation for Snapshot summary's 'operation' key

2024-10-20 Thread rdb...@gmail.com
Was it ever valid to have a summary field without the operation key?

No. They were introduced at the same time.

Would it be helpful to create alternative versions of the REST spec
specifically for referencing V1 and V2 tables?

I am strongly against this. The REST spec should be independent of the
table versions. Any table format version can be passed and the table format
should be the canonical reference for what is allowed. We want to avoid
cases where there are discrepancies. The table spec is canonical for table
metadata, and the REST spec allows passing it.

On Sun, Oct 20, 2024 at 11:18 AM Kevin Liu  wrote:

> Hey folks,
>
> Thanks, everyone for the discussion, and thanks Ryan for providing the
> historical context.
> Enforce the `operation` key in Snapshot’s `summary` field
>
> When serializing the `Snapshot` object from JSON, the Java implementation
> does not enforce that the `summary` field must contain an `operation` key.
> In the V1 spec, the `summary` field is optional, while in the V2 spec, it
> is required. However, in both versions, if a `summary` field is present, it
> must include an `operation` key. Any `summary` field lacking an `operation`
> key should be considered invalid.
>
> I’ve addressed this issue in PR 11354 [1] by adding this constraint when
> parsing JSON.
>
> > We initially did not have the snapshot summary or operation. When I
> added the summary, the operation was intended to be required in cases where
> the summary is present. It should always be there if the summary is and the
> summary should always be there unless you wrote the metadata.json file
> way back in 2017 or 2018.
>
> @Ryan, does this constraint also apply to `metadata.json` files from
> 2017/2018? Was it ever valid to have a `summary` field without the
> `operation` key?
>
> > Well, the spec says nothing about a top-level `operation` field in JSON
> [1]. Yet the Java implementation produces it [2] and removes the operation
> from the summary map. This seems inconsistent?
>
> @Anton, the Java `Snapshot` object includes both the `summary` and
> `operation` fields. When serializing to JSON, the `operation` field is
> included in the `summary` map [2], rather than as a top-level field. During
> deserialization from JSON, the `operation` field is extracted from the
> `summary` map [3].
>
> I believe this is consistent with the table spec, which defines the JSON
> output, not how the `Snapshot` object is implemented in Java.
> On REST spec and Table spec
>
> Thanks, Yufei, for highlighting the difference between the REST spec and
> the table spec. I mistakenly used the REST spec
> (`rest-catalog-open-api.yaml` [4]) as the source of truth for V2 tables.
>
> Looking at the REST spec file, it can be challenging to determine how a
> REST server should handle V1 versus V2 tables. Even for V2 tables, the
> current version of the file combines features from V2, along with
> additional changes made in preparation for the upcoming V3 spec.
>
> Would it be helpful to create alternative versions of the REST spec
> specifically for referencing V1 and V2 tables? The goal would be to have a
> "frozen" version of the REST spec dedicated to V1 tables and another for V2
> tables while allowing the current REST spec file to evolve as needed.
>
> Taking a step back, I think we need more documentation on the REST spec,
> including support for different table versions and guidance on upgrading
> from one version to another. I’d love to hear everyone’s thoughts on this.
>
>
> Best,
>
> Kevin Liu
>
>
> [1] https://github.com/apache/iceberg/pull/11354
>
> [2]
> https://github.com/apache/iceberg/blob/17f1c4d2205b59c2bd877d4d31bbbef9e90979c5/core/src/main/java/org/apache/iceberg/SnapshotParser.java#L63-L66
>
> [3]
> https://github.com/apache/iceberg/blob/17f1c4d2205b59c2bd877d4d31bbbef9e90979c5/core/src/main/java/org/apache/iceberg/SnapshotParser.java#L124-L137
>
> [4]
> https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml
>
>
> On Sat, Oct 19, 2024 at 7:48 PM Sung Yun  wrote:
>
>> Hi Ryan, thank you for your response!
>>
>> That detailed context is very helpful in allowing me to understanding why
>> the REST catalog spec has evolved the way it has, and how the Table Spec
>> and the REST Catalog Spec should each be referenced in the sub-communities
>> (like in PyIceberg). I'll keep those motivations in mind as we discuss
>> those Specs in the future.
>>
>> Also, here's a small PR to specify more explicitly that the operation
>> field should be a required field in the summary field:
>> https://github.com/apache/iceberg/pull/11355
>>
>> Sung
>>
>> On 2024/10/19 22

Re: Spec changes for deletion vectors

2024-10-16 Thread rdb...@gmail.com
dless of what
>> we pick since on  Iceberg side we really just need the bitmap and what
>> offset it is located at within a file, everything else could be in the
>> Iceberg metadata. We don’t have any disagreement on this aspect I think.
>>
>> The question is whether we would write additional Delta specific metadata
>> into the vector itself that an Iceberg implementation would not use so that
>> current Delta readers could read Iceberg delete vectors without a code
>> change or rewriting the vectors. The underlying representation would still
>> be the same between the two formats.
>>
>> The pros to doing this are that a reverse translation of iceberg to delta
>> would be much simpler.  Any implementers who already have delta vector read
>> code can probably mostly reuse it although our metadata would let them skip
>> to just reading the bitmap.
>>
>> The cons are that the metadata being written isn’t used by Iceberg so any
>> real tests would require using a delta reader, anything else would just be
>> synthetic. Theoretically we could end up with iceberg implementers who have
>> bugs in this part of the code and we wouldn’t even know it was an issue
>> till someone converted the table to delta. Other iceberg readers would just
>> be ignoring these bytes, so we essentially are adding a requirement and
>> complexity (although not that much) to Iceberg writers for the benefit of
>> current Delta readers. Delta would probably also have to add a new fields
>> to their metadata representations to capture the vector metadata to handle
>> our bitmaps.
>>
>> On Tue, Oct 15, 2024 at 5:56 PM Scott Cowell
>>  wrote:
>>
>>> From an engine perspective I think compatibility between Delta and
>>> Iceberg on DVs is a great thing to have.  The additions for cross-compat
>>> seem a minor thing to me that is vastly outweighed by a future where Delta
>>> tables with DVs were supported in Delta Uniform and could be read by any
>>> Iceberg V3 compliant engine.
>>>
>>> -Scott
>>>
>>> On Tue, Oct 15, 2024 at 2:06 PM Anton Okolnychyi 
>>> wrote:
>>>
>>>> Are there engines/vendors/companies in the community that support both
>>>> Iceberg and Delta and would benefit from having one blob layout for DVs?
>>>>
>>>> - Anton
>>>>
>>>> On Tue, Oct 15, 2024 at 11:10 rdb...@gmail.com  wrote:
>>>>
>>>>> Thanks, Szehon.
>>>>>
>>>>> To clarify on compatibility, using the same format for the blobs makes
>>>>> it so that existing Delta readers can read and use the DVs written by
>>>>> Iceberg. I'd love for Delta to adopt Puffin, but if we adopt the extra
>>>>> fields they would not need to change how readers work. That's why I think
>>>>> there is a benefit to using the same format. We avoid fragmentation and
>>>>> make sure data and delete files are compatible. No unnecessary
>>>>> fragmentation.
>>>>>
>>>>> Ryan
>>>>>
>>>>> On Tue, Oct 15, 2024 at 10:57 AM Szehon Ho 
>>>>> wrote:
>>>>>
>>>>>> This is awesome work by Anton and Ryan, it looks like a ton of effort
>>>>>> has gone into the V3 position vector proposal to make it clean and
>>>>>> efficient, a long time coming and Im truly excited to see the great
>>>>>> improvement in storage/perf.
>>>>>>
>>>>>> wrt to these fields, I think most of the concerns are already
>>>>>> mentioned by the other community members in the prs
>>>>>> https://github.com/apache/iceberg/pull/11238 and
>>>>>> https://github.com/apache/iceberg/pull/11238, so not much to add.
>>>>>> The DV itself is RoaringBitmap 64-bit format so that's great, the 
>>>>>> argument
>>>>>> for CRC seems reasonable, and I dont have enough data to be opinionated
>>>>>> towards endian/magic byte.
>>>>>>
>>>>>> But I do lean towards the many PR comments that the extra length
>>>>>> field is unnecessary, and would just add confusion.  It seemed to me that
>>>>>> the Iceberg community has made so much effort to trim to spec to the bare
>>>>>> minimum for cleanliness and efficiency, so I do feel the field is not in
>>>>>> the normal direction of the project.  Also Im not clear on the plan for 
>>>>>> ol

Re: Iceberg View Spec Improvements

2024-10-09 Thread rdb...@gmail.com
+1 for Steven's comment. There is already an implicit assumption that the
catalog names are consistent across engines. The best practice is to not
reference identifiers across catalogs, but there isn't much we can do about
the assumption here without rewriting SQL to fully qualify identifiers.

On Tue, Oct 8, 2024 at 11:16 PM Walaa Eldin Moustafa 
wrote:

> Hi Steven,
>
> Assumption 1 in "Portable SQL table identifiers" states:
>
> *All engines resolve a fully specified SQL identifier x.y.z to the same
> storage table identifier b.c in the same catalog a.*
>
> I think this assumption encodes the 4th assumption you shared. Assuming
> "x.y.z" resolves to "b.c" in storage catalog "a" across all engines is
> true, the following is also true:
>
> 1- When resolving to the same storage table identifier, the same catalog
> name must be used -- This is encoded by the fact that SQL catalog part is
> common as "x". (this addresses the first part of the 4th assumption you
> shared).
>
> 2- The mapping from "x.y.z" to "b.c" in storage catalog "a" can be done by
> federation, but that is an implementation detail. As long as we maintain
> the constraint that "x.y.z" is resolved consistently across engines to the
> same storage table, implementing this by federation or something else is
> irrelevant. (this addresses the second part of the 4th assumption you
> shared).
>
> Further, the Assumption 1 above is more comprehensive since it prescribes
> how to go about all of the SQL catalog part, namespace and table name, not
> only the catalog part.
>
> Thanks,
> Walaa.
>
> On Tue, Oct 8, 2024 at 9:41 PM Steven Wu  wrote:
>
>> Walaa, it doesn't seem to me that the doc captured Russell's idea. There
>> could be a new assumption:
>> 4. If the catalog name is part of the table identifier, it should be
>> consistent across engines. Catalog federation can achieve the
>> normalization/standardization of the catalog names.
>>
>>
>>
>> On Tue, Oct 8, 2024 at 6:17 PM Walaa Eldin Moustafa <
>> wa.moust...@gmail.com> wrote:
>>
>>> Just opened Comment access to the doc. Link here again for convenience
>>> [1].
>>>
>>> [1]
>>> https://docs.google.com/document/d/1e5orD_sBv0VlNNLZRgUtalVUllGuztnAGTtqo8J0UG8/edit
>>>
>>> Thanks,
>>> Walaa.
>>>
>>>
>>> On Tue, Oct 8, 2024 at 10:42 AM Walaa Eldin Moustafa <
>>> wa.moust...@gmail.com> wrote:
>>>
 Thanks Steven! I think this fits in the framework of "portable table
 identifiers" in the doc. I have stated the assumptions that should be added
 to the Iceberg spec in that case (in the doc they are more abstract/generic
 than the version you shared). Would be great to provide your feedback on
 the assumptions in the doc.

 Thanks,
 Walaa.


 On Tue, Oct 8, 2024 at 9:40 AM Steven Wu  wrote:

> I like to follow up on Russel's suggestion of using a federated
> catalog for resolving the catalog name/alias problem. I think Russel's 
> idea
> is that the federated catalog standardizes the catalog names (for
> referencing). That could solve the problem.
>
> There are two cases:
> (1) single catalog: there is no need to include catalog name in the
> table identifier.
> (2) multiple catalogs (backends): the view and storage table should be
> defined in a federated catalog. the references to source tables should
> include the source catalog names, which are standardized by the federated
> catalog.
>
> Thanks,
> Steven
>
>
> On Mon, Oct 7, 2024 at 11:16 PM Walaa Eldin Moustafa <
> wa.moust...@gmail.com> wrote:
>
>> Hi Everyone,
>>
>> As part of our discussions on the Materialized View (MV) spec, the
>> topic of "SQL table identifiers" has been a constant source of 
>> complexity.
>> After several iterations, the community has agreed not to use SQL table
>> identifiers in the table-side representation of MVs. However, that still
>> does not preclude referencing SQL table identifiers in views since they 
>> are
>> integral to view definitions. Therefore, it’s crucial to properly design
>> this aspect of the spec in order to improve the view spec as well as
>> unblock the progress on the MV spec.
>>
>> I’ve outlined the current gaps in the view spec along with some
>> proposed ways to address them in this document [1]. It would be great to
>> get your feedback so we can simplify future discussions around views and
>> materialized views.
>>
>> Looking forward to hearing your thoughts.
>>
>> [1]
>> https://docs.google.com/document/d/1e5orD_sBv0VlNNLZRgUtalVUllGuztnAGTtqo8J0UG8/edit
>>
>> Thanks,
>> Walaa
>>
>>


Re: [Discuss] Iceberg community maintaining the docker images

2024-10-10 Thread rdb...@gmail.com
I was specifically replying to this suggestion to add docker images for
Trino and Spark:

> I also envision the Iceberg community maintaining some quick-start Docker
images, such as spark-iceberg-rest, Trino-iceberg-rest, among others.

It sounds like we're mostly agreed that the Iceberg project itself isn't a
good place to do that. As for an image that is for catalog implementations
to test against, I think that's a good idea (supporting testing and
validation).

On Thu, Oct 10, 2024 at 10:56 AM Jean-Baptiste Onofré 
wrote:

> It's actually what I meant by REST Catalog docker image for test.
>
> Personally, I would not include any docker images in the Iceberg project
> (but more in the "iceberg" ecosystem, which is different from the project
> :)).
>
> However, if the community has a different view on that, no problem.
>
> Regards
> JB
>
> On Thu, Oct 10, 2024 at 9:50 AM Daniel Weeks  wrote:
>
>> I think we should focus on the docker image for the test REST Catalog
>> implementation.  This is somewhat different from the TCK since it's used by
>> the python/rust/go projects for testing the client side of the REST
>> specification.
>>
>> As for the quickstart/example type images, I'm open to discussing what
>> makes sense here, but we should decouple that and other docker images from
>> getting a test REST catalog image out.  (Seems like there's general
>> consensus around that).
>>
>> -Dan
>>
>> On Thu, Oct 10, 2024 at 4:29 AM Ajantha Bhat 
>> wrote:
>>
>>> Yes, the PRs I mentioned are about running TCK as a docker container and
>>> keeping/maintaining that docker file in the Iceberg repo.
>>>
>>> I envisioned maintaining other docker images also because I am not sure
>>> about the roadmap of the ones in our quickstart
>>> <https://iceberg.apache.org/spark-quickstart/> (example:
>>> tabulario/spark-iceberg).
>>>
>>> Thanks,
>>> Ajantha
>>>
>>> On Thu, Oct 10, 2024 at 3:50 PM Jean-Baptiste Onofré 
>>> wrote:
>>>
>>>> Hi
>>>>
>>>> I think there's context missing here.
>>>>
>>>> I agree with Ryan that Iceberg should not provide any docker image or
>>>> runtime things (we had the same discussion about REST server).
>>>>
>>>> However, my understanding is that this discussion is also related to
>>>> the REST TCK. The TCK validation run needs a runtime, and I remember a
>>>> discussion we had with Daniel (running TCK as a docker container).
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> On Wed, Oct 9, 2024 at 2:20 PM rdb...@gmail.com 
>>>> wrote:
>>>>
>>>>> I think it's important for a project to remain focused on its core
>>>>> purpose, and I've always advocated for Iceberg to remain a library that is
>>>>> easy to plug into other projects. I think that should be the guide here as
>>>>> well. Aren't projects like Spark and Trino responsible for producing easy
>>>>> to use Docker images of those environments? Why would the Iceberg project
>>>>> build and maintain them?
>>>>>
>>>>> I would prefer not to be distracted by these things, unless we need
>>>>> them for cases like supporting testing and validation of things that are
>>>>> part of the core purpose of the project.
>>>>>
>>>>> On Tue, Oct 8, 2024 at 6:08 AM Ajantha Bhat 
>>>>> wrote:
>>>>>
>>>>>> Hello everyone,
>>>>>>
>>>>>> Now that the test fixtures are in [1],we can create a runtime JAR for
>>>>>> the REST catalog adapter [2] from the TCK.
>>>>>> Following that, we can build and maintain the Docker image based on
>>>>>> it [3].
>>>>>>
>>>>>> I also envision the Iceberg community maintaining some quick-start
>>>>>> Docker images, such as spark-iceberg-rest, Trino-iceberg-rest, among 
>>>>>> others.
>>>>>>
>>>>>> I've looked into other Apache projects, and it seems that Apache
>>>>>> Infra can assist us with this process.
>>>>>> As we have the option to publish Iceberg docker images under the
>>>>>> Apache Docker Hub account.
>>>>>>
>>>>>> [image: image.png]
>>>>>>
>>>>>> I am more than willing to maintain this code, please find the PRs
>>>>>> related to the same [2] & [3].
>>>>>>
>>>>>> Any suggestions on the same? contributions are welcome if we agree to
>>>>>> maintain it.
>>>>>>
>>>>>> [1] https://github.com/apache/iceberg/pull/10908
>>>>>> [2] https://github.com/apache/iceberg/pull/11279
>>>>>> [3] https://github.com/apache/iceberg/pull/11283
>>>>>>
>>>>>> - Ajantha
>>>>>>
>>>>>


Spec changes for deletion vectors

2024-10-10 Thread rdb...@gmail.com
Hi everyone,

There seems to be broad agreement around Anton's proposal to use deletion
vectors in Iceberg v3, so I've opened two PRs that update the spec with the
proposed changes. The first, PR #11238
, adds a new Puffin
blob type, delete-vector-v1, that stores a delete vector. The second, PR
#11240 , updates the
Iceberg table spec.

Please take a look and comment!

Ryan


Re: [EXTERNAL] Re: [DISCUSS] Column to Column filtering

2024-10-04 Thread rdb...@gmail.com
I don't think that the doc shows cases where this filtering would eliminate
data files. The example in the doc is airline flights, where a flight has
both a scheduled departure time and an actual departure time. The query
looks for cases where the actual departure is later than the scheduled
departure. (And similar for arrival times.)

In order to evaluate whether a column-to-column filter would be helpful, I
think we need a way to think about how the data would be laid out in files.
You could have files that are written as the complete records are created
-- in that case the records would be basically clustered by actual arrival
time. Or you might recluster the data to organize it by flight number or by
some departure time.

Assuming the first case, the arrival times of a data file would be
clustered into a small period of time, but the flights could be of varying
duration. So the scheduled departure times could be all over the place,
from 8 hours ago to 1 hour ago. Similarly, the actual departure times would
have a wide range. As a result, it's likely that those ranges would overlap
and a column-to-column filter would not be able to select files. For the
arrival time query, there's a similar problem. Assuming most flights are on
time, the arrival time and scheduled arrival time (upper bound) would be
similar for at least one flight, so the ranges would likely overlap and the
expression wouldn't be helpful again.

If I think about clustering the data by departure time, the same logic
applies and I don't think there would be much value to the expression. If
you were to cluster by other columns, then you'd get the same conclusion.
Clustering by flight number, for example, would more randomly distribute
flights and result in wider overall ranges that still overlap.

The only case in which I think this might work is if the data were
clustered by a derived delay value -- but wouldn't it be easier to simply
calculate that value as a column and filter by it directly?
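
For example, with a hypothetical materialized column, the filter becomes an
ordinary single-column predicate that the existing min/max file pruning can
already use (a sketch; the column name is made up):

import org.apache.iceberg.expressions.Expression;
import org.apache.iceberg.expressions.Expressions;

public class DerivedDelayFilterSketch {
  public static void main(String[] args) {
    // departure_delay_minutes = actual departure - scheduled departure,
    // computed by the writer and stored as a regular column.
    Expression lateButUnderAnHour =
        Expressions.and(
            Expressions.greaterThan("departure_delay_minutes", 0),
            Expressions.lessThanOrEqual("departure_delay_minutes", 60));
    System.out.println(lateButUnderAnHour);
  }
}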

I definitely see the value of column-to-column filters for rows, but in a
table format it doesn't seem to me like it would result in much filtering.
Maybe there's a different use case?

On Thu, Oct 3, 2024 at 11:41 PM Benny Chow  wrote:

> Assuming the table contained smaller and better correlated files, I think
> a workaround where you materialized the timestamp difference between two
> columns could be effective for data file pruning.  So if a particular
> planned departure date was associated with a lot of delays and the table
> was partitioned by destination_cd and sorted by planned departure date,
> materializing the diff between planned departure date and actual departure
> date will result in a single field with min/max bounds that could be
> filtered on.  You could then get data file pruning for a filter like late
> departure but not more than an hour late.
>
> On Mon, Sep 30, 2024 at 12:56 PM Baldwin, Jennifer
>  wrote:
>
>> It has come to my attention that there was no attachment.  I have created
>> google doc instead.  Thanks.
>>
>>
>>
>>
>> https://docs.google.com/document/d/1HZa3AyPPfgz9VOVA9rPhJJ8f3F-3tEel_53nIlvYlo0/edit?usp=sharing
>>
>>
>>
>> *From: *Baldwin, Jennifer 
>> *Date: *Friday, September 27, 2024 at 12:54 PM
>> *To: *dev@iceberg.apache.org 
>> *Cc: *jennifer.bald...@teradata.com.invalid
>> 
>> *Subject: *Re: [EXTERNAL] Re: [DISCUSS] Column to Column filtering
>>
>>
>> Please see attached, I hope this provides you with more clarity on the
>> use case we hope to support.  Let me know if you have any further questions
>>
>>
>>
>> *From: *Russell Spitzer 
>> *Date: *Wednesday, September 18, 2024 at 6:15 PM
>> *To: *dev@iceberg.apache.org 
>> *Cc: *jennifer.bald...@teradata.com.invalid
>> 
>> *Subject: *[EXTERNAL] Re: [DISCUSS] Column to Column filtering
>>
>>
>>
>>
>> I have similar concerns to Ryan although I could see that if we were
>> writing smaller and better correlated files that this could be a big help.
>> Specifically with variant use cases this may be very useful. I would love
>> to hear more about the use cases and rationale for adding this. Do you have
>> any specific examples you can go into detail on?
>>
>>
>>
>> On Wed, Sep 18, 2024 at 4:48 PM rdb...@gmail.com 
>> wrote:
>>
>> I'm curious to learn more about this feature. Is there a driving use case
>> that you're implementing it for? Are there common situations in which these
>> fi

Re: [DISCUSS] Discrepancy Between Iceberg Spec and Java Implementation for Snapshot summary's 'operation' key

2024-10-22 Thread rdb...@gmail.com
> For example, the `Snapshot` `summary` field is optional in V1 but
required in V2. Therefore, the REST spec definition should mark the
`summary` field as optional to support both versions.

Yeah, this is technically true. But as I said in my first email, unless you
have tables that are 5 years old, it's unlikely that this is going to be a
problem. A failure here is more likely with newer implementations that have
a bug. So I'd argue there's value in leaving it as required.

On Mon, Oct 21, 2024 at 9:41 AM Kevin Liu  wrote:

> > No. They were introduced at the same time.
> Great! Since the `summary` field and the `operation` key were introduced
> together, we should enforce the rule that the `summary` field must always
> have an accompanying `operation` key. This has been addressed in PR 11354
> [1].
>
> > I am strongly against this. The REST spec should be independent of the
> table versions.
> That makes sense. For the REST spec to support both V1 and V2 tables, it
> should "accept" the least common denominator between the two versions. For
> example, the `Snapshot` `summary` field is optional in V1 but required in
> V2. Therefore, the REST spec definition should mark the `summary` field as
> optional to support both versions. However, the current REST spec leans
> towards the V2 table spec; fields that are optional in V1 and required in
> V2 are marked as required in the spec, such as `TableMetadata.table-uuid`
> [2][3] and `Snapshot.summary` [4][5].
>
> Would love to get other people's thoughts on this.
>
> Best,
> Kevin Liu
>
> [1] https://github.com/apache/iceberg/pull/11354
> [2]
> https://github.com/apache/iceberg/blob/8e743a5b5209569f84b6bace36e1106c67e1eab3/open-api/rest-catalog-open-api.yaml#L2414
> [3] https://iceberg.apache.org/spec/#table-metadata-fields
> [4]
> https://github.com/apache/iceberg/blob/8e743a5b5209569f84b6bace36e1106c67e1eab3/open-api/rest-catalog-open-api.yaml#L2325
> [5] https://iceberg.apache.org/spec/#snapshots
>
> On Sun, Oct 20, 2024 at 11:24 AM rdb...@gmail.com 
> wrote:
>
>> Was it ever valid to have a summary field without the operation key?
>>
>> No. They were introduced at the same time.
>>
>> Would it be helpful to create alternative versions of the REST spec
>> specifically for referencing V1 and V2 tables?
>>
>> I am strongly against this. The REST spec should be independent of the
>> table versions. Any table format version can be passed and the table format
>> should be the canonical reference for what is allowed. We want to avoid
>> cases where there are discrepancies. The table spec is canonical for table
>> metadata, and the REST spec allows passing it.
>>
>> On Sun, Oct 20, 2024 at 11:18 AM Kevin Liu 
>> wrote:
>>
>>> Hey folks,
>>>
>>> Thanks, everyone for the discussion, and thanks Ryan for providing the
>>> historical context.
>>> Enforce the `operation` key in Snapshot’s `summary` field
>>>
>>> When serializing the `Snapshot` object from JSON, the Java
>>> implementation does not enforce that the `summary` field must contain an
>>> `operation` key. In the V1 spec, the `summary` field is optional, while in
>>> the V2 spec, it is required. However, in both versions, if a `summary`
>>> field is present, it must include an `operation` key. Any `summary` field
>>> lacking an `operation` key should be considered invalid.
>>>
>>> I’ve addressed this issue in PR 11354 [1] by adding this constraint when
>>> parsing JSON.
>>>
>>> > We initially did not have the snapshot summary or operation. When I
>>> added the summary, the operation was intended to be required in cases where
>>> the summary is present. It should always be there if the summary is and the
>>> summary should always be there unless you wrote the metadata.json file
>>> way back in 2017 or 2018.
>>>
>>> @Ryan, does this constraint also apply to `metadata.json` files from
>>> 2017/2018? Was it ever valid to have a `summary` field without the
>>> `operation` key?
>>>
>>> > Well, the spec says nothing about a top-level `operation` field in
>>> JSON [1]. Yet the Java implementation produces it [2] and removes the
>>> operation from the summary map. This seems inconsistent?
>>>
>>> @Anton, the Java `Snapshot` object includes both the `summary` and
>>> `operation` fields. When serializing to JSON, the `operation` field is
>>> included in the `summary` map [2], rather than as a top-level field. During
>>> deserialization from JSON, the `operation` field is extracted from

Re: Overwrite old properties on table replace with REST catalog

2024-10-22 Thread rdb...@gmail.com
Thanks Vladimir! Would you like to open a PR to make that change? It sounds
like another good item to put into the "Implementation notes" section.

On Sun, Oct 20, 2024 at 11:41 PM Vladimir Ozerov 
wrote:

> Hi Jean-Baptiste,
>
> Agreed. REST spec looks good. I am talking about the general spec, where
> it might be useful to add a hint to engine developers that CREATE OR
> REPLACE in Iceberg is expected to follow slightly different
> semantics. This is already broken in Trino: depending on the catalog type,
> users may get either the classical "DROP + CREATE" (for non-REST catalogs) or
> "CREATE AND UPDATE" for the REST catalog. For Flink, the official docs say
> that CREATE OR REPLACE == DROP + CREATE, while for Iceberg tables this
> should not be the case. These are definitely things that should be fixed
> at the engine level. But at the same time it highlights that engine developers
> are having a hard time defining proper semantics for CREATE OR REPLACE in the
> Iceberg integrations, so a paragraph or so in the main Iceberg spec may
> help us align expectations.
>
> Regards,
> Vladimir.
>
> On Mon, Oct 21, 2024 at 8:28 AM Jean-Baptiste Onofré 
> wrote:
>
>> Hi Vladimir,
>>
>> As Ryan said, it's not a bug: CREATE OR REPLACE can be seen as "CREATE
>> AND UPDATE" from table format perspective. Specifically for the
>> properties, it makes sense to not delete the current properties as it
>> can be used in several use cases (security, tables grouping, ...).
>> I'm not sure a REST Spec update is required, probably more on the
>> engine side. In the REST Spec, you can create a table
>> (
>> https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L553
>> )
>> and update a table
>> (
>> https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L975
>> ),
>> and it's up to the query engine to implement the "CREATE OR REPLACE"
>> with the correct semantic.
>>
>> Regards
>> JB
>>
>> On Sun, Oct 20, 2024 at 9:26 PM Vladimir Ozerov 
>> wrote:
>> >
>> > Hi Ryan,
>> >
>> > Thanks for the clarification. Yes, I think my confusion was caused by
>> the fact that many engines treat CREATE OR REPLACE as a semantic equivalent
>> of DROP + CREATE, which is performed atomically (e.g., Flink [1]). Table
>> formats add history on top of that, which is expected to be retained, no
>> questions here. Permission propagation also make sense. For properties
>> things become a bit blurry, because on the one hand there are Iceberg
>> specific properties, which may affect table maintenance, and on the other
>> hand there are user-defined properties in the same bag. The question
>> appeared in the first place because I observed a discrepancy in Trino: all
>> catalogs except for REST completely overrides table properties on REPLACE,
>> and REST catalog merges them, which might be confusing to end users.
>> Perhaps some clarification at the spec level might be useful, because
>> without agreement between engines the could be some subtle bugs in
>> multi-engine environments, such as sudden data format changes between
>> replaces, etc.
>> >
>> > [1]
>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#create-or-replace-table
>> >
>> > Regards,
>> > Vladimir.
>> >
>> > On Sun, Oct 20, 2024 at 9:20 PM rdb...@gmail.com 
>> wrote:
>> >>
>> >> Hi Vladimir,
>> >>
>> >> This isn't a bug. The behavior of CREATE OR REPLACE is to replace the
>> data of a table, but to maintain things like other refs, snapshot history,
>> permissions (if supported by the catalog), and table properties. Table
>> properties are replaced if they are set in the operation like `b` in your
>> example. This is not the same as a drop and create, which may be what you
>> want instead.
>> >>
>> >> The reason for this behavior is that the CREATE OR REPLACE operation
>> is used to replace a table's data without needing to handle schema changes
>> between versions. For example, producing a daily report table that replaces
>> the previous day. However, the table still exists and it is valuable to be
>> able to time travel to older versions or to be able to use branches and
>> tags. Clearly, that means that table history and refs stick around so the
>> table is not completely new every time it is replaced.
>> >>
>> >> Adding on to that, properties control things like ref and snapshot

Re: Spec changes for deletion vectors

2024-10-22 Thread rdb...@gmail.com
Thanks for the careful consideration here, everyone.

I completely agree that we do not want this to be confused as setting a
precedent about delegating design decisions. I don't think the Delta
community would do that either! We have to make decisions that are right
for Iceberg.

It sounds like we're mostly in agreement on both of Russell's concerns. The
value of compatibility in this case likely overrides the pain and annoyance.

I'll update the spec PRs so that we can move forward.

Ryan

On Mon, Oct 21, 2024 at 3:05 PM Szehon Ho 
wrote:

> Im +1 for adding DV (Goal 1) and also +1 for the ability for Iceberg
> readers to read Delta Lake DV’s, as the magic byte, CRC make sense
> design-wise (Goal 2).
>
> It's nice that there's cross-community collaboration, probably in other
> areas I'm not looking at, but I'm -0.5 on writing an otherwise unjustifiable
> (albeit small) format change to be compatible with Delta Lake readers (Goal
> 3), to register that this is an irregular direction for the project, for
> Iceberg to choose a spec just so it can be read by another project.  For
> example, it can be debated, but discussions in the past about adding export
> functionality to Hive were not supported for addition to the Iceberg project.
> So I won't block if the community supports it, but I want to raise awareness.
>
> Thanks,
> Szehon
>
>
>
> On Oct 21, 2024, at 2:36 PM, Micah Kornfield 
> wrote:
>
> I agree with everything Russell said.  I think we should move forward with
> the current format of DVs to favor compatibility.  I'll add that I think
> the collaboration aspect likely applies to other aspects as well outside of
> Deletion Vectors (e.g. the work that is happening on Variant type).
>
> Thanks,
> Micah
>
> On Mon, Oct 21, 2024 at 1:45 PM Russell Spitzer 
> wrote:
>
>> I've thought about this a lot and talked it over with a lot of folks. As
>> I've noted before my main concerns are
>>
>> A. Setting a precedent that we are delegating design decisions to another
>> project
>> B. Setting unnecessary requirements that can only really be checked by
>> integration tests with another system
>>
>> I think the value of compatibility can override the pain/annoyance of B.
>>
>> For A, I just want to make clear I will not go along with any sort of
>> concession in design in the future. I think it's ok that we do it this
>> time, but in the future if the Delta Delete Vector format changes I would
>> really hope that the community around Delta would make those decisions in
>> collaboration with the Apache Iceberg community. This is probably going to be a
>> red line for me, and I won't be able to go ahead in the future
>> with any changes that aren't necessary for the Iceberg project regardless
>> of their adoption in other formats.
>>
>> So with that said, I'm in support of any of the above solutions but I
>> think just going with full compatibility with Delta (down to storage format
>> details) is the right choice to try to get the two communities working
>> together in the future.
>>
>> On Sat, Oct 19, 2024 at 4:38 PM rdb...@gmail.com 
>> wrote:
>>
>>> Thanks for the summary, Szehon!
>>>
>>> I would add one thing to the "minimum" for each option. Because we want
>>> to be able to seek directly to the DV for a particular data file, I think
>>> it's important to start the blob with magic bytes. That way the reader can
>>> validate that the offset was correct and that the contents of the blob are
>>> in the expected format. So I'd add magic bytes to options (1) and (2). In
>>> (2) we would want the magic bytes to match the ones from Delta to be able
>>> to read DV files written by Delta.
>>>
>>> Ryan
>>>
>>> On Thu, Oct 17, 2024 at 8:55 PM Szehon Ho 
>>> wrote:
>>>
>>>> So based on Micah's original goals, switch 2 and 3:
>>>>
>>>> 1. The best possible implementation of DVs (limited redundancy, no
>>>> extraneous fields, CPU efficiency, minimal space, etc).
>>>> 2.  The ability for Iceberg readers to read Delta Lake DVs
>>>> 3.  The ability for Delta Lake readers to read Iceberg DVs
>>>>
>>>> The minimum for each option is:
>>>> (1) = DV
>>>> (2) = DV (little-endian) + CRC (big-endian)
>>>> (3) = Len (big-endian) + Magic + DV (little-endian) + CRC (big-endian)
>>>>
>>>> Design wise, CRC can be useful, Magic may be useful/OK, but Len is
>>>> controversial as it's a different len

Re: [VOTE] Endpoint for refreshing vended credentials

2024-10-22 Thread rdb...@gmail.com
+1 (binding)

Thanks for your work on this!

On Tue, Oct 22, 2024 at 2:47 PM Prashant Singh 
wrote:

> +1 (non-binding)
>
> Regards,
> Prashant
>
> On Tue, Oct 22, 2024 at 10:50 AM John Zhuge  wrote:
>
>> +1 (non-binding)
>>
>> John Zhuge
>>
>>
>> On Tue, Oct 22, 2024 at 9:45 AM Jack Ye  wrote:
>>
>>> +1 (binding)
>>>
>>> Best,
>>> Jack Ye
>>>
>>> On Tue, Oct 22, 2024 at 9:32 AM Dmitri Bourlatchkov
>>>  wrote:
>>>
 Thanks for the reply Eduard!

 I think it is fine to defer fine-tuning credential refreshes to a later
 PR.

 I'm upgrading my vote to +1 (non-binding).

 Cheers,
 Dmitri.

 On Tue, Oct 22, 2024 at 11:11 AM Eduard Tudenhöfner <
 etudenhoef...@apache.org> wrote:

> Hey Dmitri,
>
> the idea behind the endpoint itself is really just to provide *valid*
> credentials for a given table when a client asks for them.
> If the server returned you two S3 credentials, the client will use the
> one with the longest prefix and if that credential expires, it will ask 
> the
> server again for *valid* credentials.
> That means the server can again return you two S3 credentials, even if
> that second unused credential from the previous endpoint call didn't 
> expire
> yet.
> I don't think we'd want to complicate the endpoint *at this point* to
> have a differentiation between what specific credentials a client wants to
> receive from the server.
>
> Thanks,
> Eduard
>
> On Mon, Oct 21, 2024 at 6:36 PM Dmitri Bourlatchkov
>  wrote:
>
>> -0 (non-binding)
>>
>> If multiple credentials are vended for a table (which is allowed) the
>> current API requires all credentials to be refreshed, when any of the
>> previous credentials expires. I think this is suboptimal (but can 
>> probably
>> be made to work in most practical cases).
>>
>> Cheers,
>> Dmitri.
>>
>> On Mon, Oct 21, 2024 at 6:07 AM Eduard Tudenhöfner <
>> etudenhoef...@apache.org> wrote:
>>
>>> Hey everyone,
>>>
>>> I'd like to vote on #11281
>>> , which introduces a
>>> new endpoint and allows retrieving/refreshing vended credentials for a
>>> given table.
>>>
>>> Please vote +1 if you generally agree with the path forward.
>>>
>>> Please vote in the next 72 hours
>>>
>>> [ ] +1, commit the proposed spec changes
>>> [ ] -0
>>> [ ] -1, do not make these changes because . . .
>>>
>>>
>>> Thanks everyone,
>>>
>>> Eduard
>>>
>>


Re: [DISCUSS] Discrepancy Between Iceberg Spec and Java Implementation for Snapshot summary's 'operation' key

2024-10-28 Thread rdb...@gmail.com
> I'd rather we just enforce the spec as written and provide
separate functionality for cleaning/repairing out of spec metadata.

I definitely prefer to keep metadata clean, but is it really worth causing
failures? When there's a problem with the metadata for a table, it's
unlikely that a user is going to be able to go and fix it themselves. So if
there's a bug in a library that produces bad metadata (for example, a case
statement that returns null instead of throwing an exception to generate
the operation field) then the table would be wedged. We can hopefully fix a
lot of this in catalog service implementations (reject bad commits) but I'm
worried about the idea of just not being able to read or fix tables.

On Mon, Oct 28, 2024 at 9:35 AM Russell Spitzer 
wrote:

> This is one of the reasons I'm opposed to metadata we don't use/need. We
> end up forking the spec and then we have some odd behaviors: metadata
> that is illegal in one implementation (PyIceberg) will be legal in another
> (Iceberg Java) (and with the suggestion above would be silently modified to bring
> it to spec). I'd rather we just enforce the spec as written and provide
> separate functionality for cleaning/repairing out of spec metadata. If we
> do want to auto-correct metadata.json we need to do so at read time,
> otherwise a user of the Iceberg library can read an "invalid"
> metadata.json, try to use the operation key and fail in an unexpected way,
> only to modify the table in an unrelated DML and have it start working again
>
> That said I don't feel strongly enough to not revert if that's the
> consensus here. At minimum we should forbid the Iceberg java implementation
> from being able to write a Snapshot without an Operation.
>
> On Mon, Oct 28, 2024 at 6:31 AM Fokko Driesprong  wrote:
>
>> Hey everyone,
>>
>> Thanks Kevin for spotting and raising this.
>>
>> I agree with Ryan that we still should be able to read the table. We had
>> similar issues in the past (looking at you -1 for current-snapshot-id)
>> which we can fix. Can I suggest that when we encounter a missing operation and
>> we write the table again, we set the operation to overwrite? This seems
>> to be the safest option. This way the Java library doesn't produce any
>> metadata that conflicts with the spec. I checked quickly, and it looks like
>> it allows writing a summary without an operation:
>>
>> @Test
>> public void testJsonConversionSummaryWithoutOperationFails() {
>>   String json =
>>   String.format(
>>   "{\n"
>>   + "  \"snapshot-id\" : 2,\n"
>>   + "  \"parent-snapshot-id\" : 1,\n"
>>   + "  \"timestamp-ms\" : %s,\n"
>>   + "  \"summary\" : {\n"
>>   + "\"files-added\" : \"4\",\n"
>>   + "\"files-deleted\" : \"100\"\n"
>>   + "  },\n"
>>   + "  \"manifests\" : [ \"/tmp/manifest1.avro\",
>> \"/tmp/manifest2.avro\" ],\n"
>>   + "  \"schema-id\" : 3\n"
>>   + "}",
>>   System.currentTimeMillis());
>>
>>   Snapshot snap = SnapshotParser.fromJson(json);
>>   assertThat(snap.operation()).isNull();
>>   assertThat(SnapshotParser.toJson(snap)).contains("operation");  // This
>> fails
>> }
>>
>> Kind regards,
>> Fokko
>>
>>
>> Op ma 28 okt 2024 om 00:16 schreef Kevin Liu :
>>
>>> Hi Ryan,
>>>
>>> I've created a revert PR [1]. I agree that we should take a more
>>> permissive approach when reading a table, allowing for reading
>>> non-compliant table metadata, especially for an opportunity to "fix" the
>>> metadata. However, I think we still need a way to enforce the table
>>> specification to ensure that other operations interact with a compliant
>>> table.
>>>
>>> Perhaps we could permit `SnapshotParser` to read the metadata but
>>> enforce the `operation` field at a different location to guarantee the
>>> table’s compliance with the specification.
>>>
>>> Best,
>>> Kevin Liu
>>>
>>> [1] https://github.com/apache/iceberg/pull/11409
>>>
>>> On Sat, Oct 26, 2024 at 11:05 AM rdb...@gmail.com 
>&g

Re: [DISCUSS] Apache Iceberg 1.7.0 Release Cutoff

2024-10-24 Thread rdb...@gmail.com
> Do I recall our agreement that any V3 spec changes that will be released
> bear no compatibility guarantees until we close V3 and vote on it as a
> whole?

Yes. We can make changes to v3 until the community votes to adopt and close
the version to any new forward-breaking changes. That vote isn’t tied to a
specific Java release, but in my opinion, we do want to have released the
features in at least one version before adopting v3.

> There is no need to block 1.7 for these changes (except any potential
> reverts for sync maintenance), but we may want to do 1.8 some time in
> November.

I agree. As we’ve always done in the past, I think we want to prefer
getting releases out and not holding them for new features. More frequent
releases are great so I’d support releasing a 1.8.0 version to get these
items released. I think there’s a good chance that we will also have
working Variant code (restricted to v3 tables for preview) that we will
want to start testing as well.

> I think we can still wait a few days before the freeze, but not weeks.

Agreed, and I trust Russell as release manager to make the right call on
when we should wait.

Ryan

On Wed, Oct 23, 2024 at 11:22 PM Jean-Baptiste Onofré 
wrote:

> Hi Anton,
>
> AFAIR, when we discussed the 1.7.0 release, we agreed about the "best
> effort" regarding V3. Regarding the major PRs you are mentioning, we
> can always include in 1.7.1 and following.
> I think we can still wait a few days before the freeze, but not weeks.
> I think we should move forward on 1.7.0 as proposed by Russ and be
> ready to submit new 1.7.x releases to vote soon.
>
> Thoughts ?
>
> Regards
> JB
>
> On Thu, Oct 24, 2024 at 8:16 AM Anton Okolnychyi 
> wrote:
> >
> > Do I recall our agreement that any V3 spec changes that will be released
> bear no compatibility guarantees until we close V3 and vote on it as a
> whole?
> >
> > There are a few major PRs unrelated to V3 that are close but unlikely to
> be merged by the end of this week:
> > - Amogh and I were working on sync maintenance for V2 position deletes
> in core and Spark. The final PRs are being reviewed but we are
> re-evaluating if the broadcast approach is the way to go. We may need to
> revert some previously merged parts in 1.7 to avoid breaking the API in 1.8.
> > - Ajantha wrote a utility for computing partition stats, which is
> already in main. There is an open PR for writers but I think we may need to
> switch to "internal" writers, which will need a bit of time to iterate.
> > - Wing Yew Poon has a PR to support V2 tables in ChangelogScan but we
> need to optimize the computation and make it incremental.
> >
> > There is no need to block 1.7 for these changes (except any potential
> reverts for sync maintenance), but we may want to do 1.8 some time in
> November.
> >
> > - Anton
> >
> >> On Wed, Oct 23, 2024 at 17:28 Manu Zhang  wrote:
> >>
> >> Can we also include https://github.com/apache/iceberg/pull/11157? Much
> appreciated if I can get more eyes on it.
> >>
> >> Thanks,
> >> Manu
> >>
> >> On Wed, Oct 23, 2024 at 11:03 PM Russell Spitzer <
> russell.spit...@gmail.com> wrote:
> >>>
> >>> Keep up coming :) I did a pass on Prashant's as well
> >>>
> >>> On Wed, Oct 23, 2024 at 12:47 AM Jean-Baptiste Onofré 
> wrote:
> 
>  Hi Prashant
> 
>  Thanks for the heads up. We still have time to review and include in
>  the release.
> 
>  I will do a pass on your PR.
> 
>  Thanks !
>  Regards
>  JB
> 
>  On Tue, Oct 22, 2024 at 9:29 PM Prashant Singh <
> prashant010...@gmail.com> wrote:
>  >
>  > Hi Russell,
>  >
>  > I recently stumbled across an issue in Iceberg Kafka-Connect which
> shows up when we are using UUID / FixedType, which leads to failure when we
> are actually writing the files, so makes it kinda unusable when we are
> using these data-types.
>  > I have a fix out for it already :
> https://github.com/apache/iceberg/pull/11346/files . Would request you
> include this as well into 1.7.0.
>  >
>  > Regards,
>  > Prashant Singh
>  >
>  > On Mon, Oct 21, 2024 at 9:13 AM Wing Yew Poon
>  wrote:
>  >>
>  >> Hi Russell,
>  >> There is a data correctness issue (
> https://github.com/apache/iceberg/issues/11221) that I have a fix for (
> https://github.com/apache/iceberg/pull/11247). This is a serious issue,
> and I'd like to see the fix go into 1.7.0.
>  >> Eduard has already approved the PR, but he asked if you or Amogh
> would take a look as well.
>  >> Thanks,
>  >> Wing Yew
>  >>
>  >>
>  >> On Mon, Oct 21, 2024 at 8:56 AM Russell Spitzer <
> russell.spit...@gmail.com> wrote:
>  >>>
>  >>> That's still my current plan
>  >>>
>  >>> On Mon, Oct 21, 2024 at 10:52 AM Rodrigo Meneses <
> rmene...@gmail.com> wrote:
>  
>   Hi, team. Are we still targeting to cut off on October 25th and
> release by Oct 31the, for the 1.7.0 release?
>   Thanks
>   -Rodrigo
> >>>

Re: [PROPOSAL] Refactore use of Guava Lists.*

2024-10-25 Thread rdb...@gmail.com
It’s correct that these methods aren’t strictly needed. We could translate
every case into a slightly different form:

Lists.newArrayList() -> new ArrayList<>()
Lists.newArrayList(iter) -> new ArrayList<>(); Iterators.addAll(list, iter)
Lists.newArrayList(iterable) -> new ArrayList<>(); Iterators.addAll(list, iterable.iterator())
Lists.newArrayList(collection) -> new ArrayList<>(collection)
Lists.newArrayList(1, 2, 3) -> new ArrayList<>(); Collections.addAll(list, 1, 2, 3)

You could make that argument for all syntactic sugar. These constructors
are concise, clear, and avoid a bit of code that’s specific to the thing
you’re converting to a list. That’s why this is part of the project’s style.

I think there’s value here, but even if this were an arbitrary choice I
don’t think there’s value in refactoring to avoid it. If and when these
convenience methods are deprecated, we can always replace them with our own
utility in iceberg-common and move on without a ton of changes.
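
If that ever became necessary, the replacement could be as small as the
following sketch (a hypothetical class, not something that exists in
iceberg-common today):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class IcebergLists {
  private IcebergLists() {}

  // Same shape as Guava's factory methods, so call sites would not change much.
  public static <E> List<E> newArrayList() {
    return new ArrayList<>();
  }

  @SafeVarargs
  public static <E> List<E> newArrayList(E... elements) {
    List<E> list = new ArrayList<>(elements.length);
    Collections.addAll(list, elements);
    return list;
  }

  public static <E> List<E> newArrayList(Iterable<? extends E> iterable) {
    List<E> list = new ArrayList<>();
    iterable.forEach(list::add);
    return list;
  }
}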

Ryan

On Thu, Oct 24, 2024 at 9:58 AM Jean-Baptiste Onofré 
wrote:

> Hi Eduard
>
> Yeah, I mean checkstyle (not spotless).
>
> AFAIR, I saw a couple of locations without the diamond syntax. Let me
> find it out. Maybe we can start with fixing there.
>
> Thanks !
> Regards
> JB
>
> On Thu, Oct 24, 2024 at 5:07 PM Eduard Tudenhöfner
>  wrote:
> >
> > Hey JB,
> >
> > I don't think we're ever using e.g. Lists.newArrayList() without the
> diamond syntax in the codebase, so it's typically always List list
> = Lists.newArrayList().
> > So I wonder how much of an issue that actually is? Do you have examples
> in the codebase that don't use the diamond syntax and is it worth updating
> the entire codebase?
> >
> >> to loosen the Lists/Maps/... enforcement in spotless)
> >
> >
> > FYI this is being enforced by checkstyle and was first introduced by
> https://github.com/apache/iceberg/pull/3689 in order to have a consistent
> style around collection instantiation.
> >
> > Thanks
> > Eduard
> >
> > On Thu, Oct 24, 2024 at 8:20 AM Jean-Baptiste Onofré 
> wrote:
> >>
> >> Hi folks,
> >>
> >> We are using Guava for different "utils" methods. Especially, we are
> >> using Guava to create lists and maps. For instance, we do (and we
> >> force the use of):
> >>
> >> List myList = Lists.newArrayList();
> >>
> >> or
> >>
> >> Map myMap = Maps.newHashMap();
> >>
> >> While these methods were a good idea up to JDK 7, they are now unnecessary,
> >> and the "regular" constructors should be preferred in order to cleanly use
> >> the diamond syntax
> >> (
> https://docs.oracle.com/javase/tutorial/java/generics/genTypeInference.html#type-inference-instantiation
> ).
> >>
> >> The Guava Javadoc actually clearly states this:
> >>
> https://github.com/google/guava/blob/v33.3.1/guava/src/com/google/common/collect/Lists.java#L79
> >>
> >> So, I propose:
> >> - to loosen the Lists/Maps/... enforcement (in spotless)
> >> - to refactor Iceberg to use JDK list and map constructors instead of
> >> Guava Lists/Maps/...
> >>
> >> I started the change, but as it's a pretty large one, I would like
> >> your feedback before completing the PR.
> >>
> >> Regards
> >> JB
>


Re: [DISCUSS] Discrepancy Between Iceberg Spec and Java Implementation for Snapshot summary's 'operation' key

2024-10-26 Thread rdb...@gmail.com
I see it's been merged, but I don't think it is a good idea to enforce
this. The spec can and should require the `operation` but we want to be
careful about creating situations where bad metadata can needlessly break a
table. I would be much more permissive here, which is why this probably
wasn't enforced in the first place.

On Fri, Oct 25, 2024 at 2:36 PM Kevin Liu  wrote:

> Thanks, everyone! The PR[1] has been merged
>
> Best,
> Kevin Liu
>
> [1] https://github.com/apache/iceberg/pull/11354
>
>
> On Fri, Oct 25, 2024 at 1:02 PM Kevin Liu  wrote:
>
>> Thanks, Ryan! That makes sense.
>>
>> I want to follow up on the original issue. I've made a PR [1] to enforce
>> that the Snapshot `summary` map must have an `operation` key. Please take a
>> look. Thank you @nastra for the comments and reviews.
>>
>> Best,
>> Kevin Liu
>>
>> [1] https://github.com/apache/iceberg/pull/11354
>>
>>
>>
>> On Tue, Oct 22, 2024 at 4:06 PM rdb...@gmail.com 
>> wrote:
>>
>>> > For example, the `Snapshot` `summary` field is optional in V1 but
>>> required in V2. Therefore, the REST spec definition should mark the
>>> `summary` field as optional to support both versions.
>>>
>>> Yeah, this is technically true. But as I said in my first email, unless
>>> you have tables that are 5 years old, it's unlikely that this is going to
>>> be a problem. A failure here is more likely with newer implementations that
>>> have a bug. So I'd argue there's value in leaving it as required.
>>>
>>> On Mon, Oct 21, 2024 at 9:41 AM Kevin Liu 
>>> wrote:
>>>
>>>> > No. They were introduced at the same time.
>>>> Great! Since the `summary` field and the `operation` key were
>>>> introduced together, we should enforce the rule that the `summary`
>>>> field must always have an accompanying `operation` key. This has been
>>>> addressed in PR 11354 [1].
>>>>
>>>> > I am strongly against this. The REST spec should be independent of
>>>> the table versions.
>>>> That makes sense. For the REST spec to support both V1 and V2 tables,
>>>> it should "accept" the least common denominator between the two versions.
>>>> For example, the `Snapshot` `summary` field is optional in V1 but required
>>>> in V2. Therefore, the REST spec definition should mark the `summary` field
>>>> as optional to support both versions. However, the current REST spec leans
>>>> towards the V2 table spec; fields that are optional in V1 and required in
>>>> V2 are marked as required in the spec, such as `TableMetadata.table-uuid`
>>>> [2][3] and `Snapshot.summary` [4][5].
>>>>
>>>> Would love to get other people's thoughts on this.
>>>>
>>>> Best,
>>>> Kevin Liu
>>>>
>>>> [1] https://github.com/apache/iceberg/pull/11354
>>>> [2]
>>>> https://github.com/apache/iceberg/blob/8e743a5b5209569f84b6bace36e1106c67e1eab3/open-api/rest-catalog-open-api.yaml#L2414
>>>> [3] https://iceberg.apache.org/spec/#table-metadata-fields
>>>> [4]
>>>> https://github.com/apache/iceberg/blob/8e743a5b5209569f84b6bace36e1106c67e1eab3/open-api/rest-catalog-open-api.yaml#L2325
>>>> [5] https://iceberg.apache.org/spec/#snapshots
>>>>
>>>> On Sun, Oct 20, 2024 at 11:24 AM rdb...@gmail.com 
>>>> wrote:
>>>>
>>>>> Was it ever valid to have a summary field without the operation key?
>>>>>
>>>>> No. They were introduced at the same time.
>>>>>
>>>>> Would it be helpful to create alternative versions of the REST spec
>>>>> specifically for referencing V1 and V2 tables?
>>>>>
>>>>> I am strongly against this. The REST spec should be independent of the
>>>>> table versions. Any table format version can be passed and the table 
>>>>> format
>>>>> should be the canonical reference for what is allowed. We want to avoid
>>>>> cases where there are discrepancies. The table spec is canonical for table
>>>>> metadata, and the REST spec allows passing it.
>>>>>
>>>>> On Sun, Oct 20, 2024 at 11:18 AM Kevin Liu 
>>>>> wrote:
>>>>>
>>>>>> Hey folks,
>>>>>>
>>>>>> Thanks, everyone for the discussion, and thanks Ryan for providing
>>>>>> t

Re: REST catalog removes void transform

2024-10-31 Thread rdb...@gmail.com
Vladimir, what is the context in which you want to maintain a partition
spec with only void transforms? Is this in a v2 table? In a v2 table, the
catalog should be free to remove void transforms. They are required for v1.

On Wed, Oct 30, 2024 at 5:00 AM Vladimir Ozerov 
wrote:

> Hi,
>
> When a user creates a table with void() transform on a single column, REST
> catalogs appear to ignore this and end up with a table with no
> partitioning information. The relevant code part is in
> RESTSessionCatalog.createChanges:
>
> PartitionSpec spec = meta.spec();
> if (spec != null && spec.isPartitioned()) {
>   changes.add(new MetadataUpdate.AddPartitionSpec(spec));
> } else {
>   changes.add(new MetadataUpdate.AddPartitionSpec(PartitionSpec.unpartitioned()));
> }
>
> My question is whether this is by design or not? From the user
> perspective, this appears to be ok, because the table is not partitioned
> anyway. However, some engines, such as Trino, currently retain void()
> partitioning info for non-REST catalogs. What would be the proper
> expectation from the Iceberg user in this case - should it observe void()
> in table schema or not?
>
> Regards,
> --
> *Vladimir Ozerov*
> Founder
> querifylabs.com
>


Re: [VOTE] Deletion Vectors in V3

2024-10-31 Thread rdb...@gmail.com
+1

Thanks, Anton!

On Wed, Oct 30, 2024 at 11:58 PM Fokko Driesprong  wrote:

> +1
>
> I had to read up a bit, thanks for driving this Anton.
>
> Kind regards,
> Fokko
>
> On Thu, Oct 31, 2024 at 07:53, Piotr Findeisen <
> piotr.findei...@gmail.com> wrote:
>
>> Thank you Anton,
>>
>> +1 (non-binding)
>>
>>
>>
>> On Thu, 31 Oct 2024 at 05:07, John Zhuge  wrote:
>>
>>> +1 (non-binding)
>>>
>>> On Wed, Oct 30, 2024 at 1:56 PM Anton Okolnychyi 
>>> wrote:
>>>
 +1 (binding)

 - Anton

 On Wed, Oct 30, 2024 at 21:32, Amogh Jahagirdar <2am...@gmail.com> wrote:

> +1 (binding)
>
> Thanks,
> Amogh Jahagirdar
>
> On Wed, Oct 30, 2024 at 1:28 PM Bryan Keller 
> wrote:
>
>> +1!
>>
>> On Oct 30, 2024, at 11:03 AM, Daniel Weeks  wrote:
>>
>> +1 (binding)
>>
>> -Dan
>>
>> On Wed, Oct 30, 2024 at 10:51 AM Prashant Singh <
>> prashant010...@gmail.com> wrote:
>>
>>> +1 (non-binding)
>>>
>>> Thanks,
>>> Prashant
>>>
>>> On Wed, Oct 30, 2024 at 10:16 AM Russell Spitzer <
>>> russell.spit...@gmail.com> wrote:
>>>
 +1  - My comments are clear on the other thread for future proposals

 On Wed, Oct 30, 2024 at 11:18 AM Szehon Ho 
 wrote:

> -0
>
> Great work and exciting functionality, but transferring concerns
> from the other thread about the decision.
>
> Thanks
> Szehon
>
> On Wed, Oct 30, 2024 at 9:12 AM Steven Wu 
> wrote:
>
>> +1
>>
>> On Wed, Oct 30, 2024 at 1:07 AM xianjin 
>> wrote:
>>
>>> +1 (non binding)
>>>
>>> On Wed, Oct 30, 2024 at 2:28 PM Jean-Baptiste Onofré <
>>> j...@nanthrax.net> wrote:
>>>
 +1 (non binding)

 Regards
 JB

 On Tue, Oct 29, 2024 at 10:45 PM Anton Okolnychyi <
 aokolnyc...@gmail.com> wrote:
 >
 > Hi folks,
 >
 > We have been discussing the new layout for position deletes
 in V3 for a while now. It seems the community reached consensus. 
 I'd like
 to start a vote on adding deletion vectors to the V3 spec as 
 described in
 PRs 11238 and 11240. You can see a prototype of the proposed spec 
 changes
 in PR 11302.
 >
 > Design doc discussion thread:
 https://lists.apache.org/thread/8ypt0jowf3zosw6f6bxt25jxc3vdgps9
 > Spec changes discussion thread:
 https://lists.apache.org/thread/wyon0kvroxsmkxh153444xzscwbb68o1
 >
 > Please vote in the next 72 hours
 >
 > [ ] +1, commit the proposed spec changes
 > [ ] -0
 > [ ] -1, do not make these changes because...
 >
 > - Anton

>>>
>>
>>>
>>> --
>>> John Zhuge
>>>
>>


Re: [VOTE] Drop Python3.8 Support in PyIceberg 0.8.0

2024-09-23 Thread rdb...@gmail.com
+1

On Mon, Sep 23, 2024 at 10:31 AM Steven Wu  wrote:

> +1 (binding). makes sense.
>
> On Mon, Sep 23, 2024 at 9:38 AM Yufei Gu  wrote:
>
>> +1 Thanks for bringing this up.
>>
>> Yufei
>>
>>
>> On Mon, Sep 23, 2024 at 9:27 AM Kevin Liu  wrote:
>>
>>> +1 non-binding. Thanks for starting this conversation!
>>>
>>>
>>> On Fri, Sep 20, 2024 at 2:02 PM Sung Yun  wrote:
>>>
 Hi folks,

 I'd like to start this thread to vote on dropping the support for
 Python3.8 in the upcoming 0.8.0 PyIceberg release.

 Python3.8 will be End-Of-Life in October 2024, and some of our
 dependencies have already dropped support for Python3.8 prebuilt
 wheels which makes our dependency management more complicated than
 needed.

 https://devguide.python.org/versions/

 Sung

>>>


[VOTE] Table v3 spec: Add unknown and new type promotion

2024-09-27 Thread rdb...@gmail.com
Hi everyone,

I'd like to vote on PR #10955 that has been open for a while with the
changes to add new type promotion cases.
After discussion, the PR has been scoped down to keep complexity low. It
now adds:

* An `unknown` type for cases when only `null` values have been observed
* Type promotion from `unknown` to any other type
* Type promotion from `date` to `timestamp` or `timestamp_ns`
* Clarification that promotion is not allowed if it breaks transform results

The set of changes is quite a bit smaller than originally proposed because
of the issue already discussed about lower and upper bounds values, and it
no longer includes variant. I think that we can add more type promotion
cases after we improve bounds metadata. This adds what we can now to keep
v3 moving forward.

Please vote in the next 72 hours:

[ ] +1, commit the proposed spec changes
[ ] -0
[ ] -1, do not make these changes because . . .

Thanks,

Ryan


Re: [DISCUSS] Modify ThreadPools.newWorkerPool to avoid unnecessary Shutdown Hook registration

2024-09-27 Thread rdb...@gmail.com
I'm okay with adding newFixedThreadPool as Steven suggests, but I don't
think that solves the problem that these are used more widely than intended
and without people knowing the behavior. Even though "non-exiting" is
awkward, it is maybe a good option to call out behavior. +1 for Javadoc,
and +1 for doing something here since there are improper uses throughout
Iceberg. Thanks for raising this, Peter!
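
For context, a rough sketch of the split being discussed, with the
shutdown-hook behavior isolated in the name (method names follow the
proposals in this thread, not a merged API):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import com.google.common.util.concurrent.MoreExecutors;
import com.google.common.util.concurrent.ThreadFactoryBuilder;

class WorkerPoolSketch {
  // Plain fixed-size daemon pool: no shutdown hook, the caller owns the lifecycle.
  static ExecutorService newFixedWorkerPool(String namePrefix, int poolSize) {
    return Executors.newFixedThreadPool(
        poolSize,
        new ThreadFactoryBuilder()
            .setDaemon(true)
            .setNameFormat(namePrefix + "-%d")
            .build());
  }

  // "Exiting" variant: Guava registers a JVM shutdown hook that drains the pool on
  // exit, which is what callers currently get implicitly from newWorkerPool.
  static ExecutorService newExitingWorkerPool(String namePrefix, int poolSize) {
    return MoreExecutors.getExitingExecutorService(
        (ThreadPoolExecutor) newFixedWorkerPool(namePrefix, poolSize));
  }
}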

On Thu, Sep 26, 2024 at 1:52 AM Jean-Baptiste Onofré 
wrote:

> Hi Steven,
>
> I agree with you here. I think we can use semantics similar to
> ThreadPoolExecutor/ScheduledThreadPoolExecutor (like
> newFixedThreadPool, newWorkStealingPool, ...).
>
> Regards
> JB
>
> On Thu, Sep 26, 2024 at 2:05 AM Steven Wu  wrote:
> >
> >
> > First, we should definitely add Javadoc to `ThreadPools.newWorkerPool`
> on its behavior with a shutdown hook. It is not obvious from the method
> name. I would actually go further to deprecate `newWorkerPool` with
> `newExitingWorkerPool`. The `newWorkerPool` method name makes misuse easy,
> as the intention is not obvious from the name.
> >
> > `newNonExitingWorkerPool` is a little awkward to me. `NonExiting` should
> be the default behavior. Maybe we can call this new method as
> `newFixedThreadPool(int poolSize, String prefix)`. Alternatively, we can
> just make `ThreadPools.newDaemonThreadFactory` public as the proposed
> `newNonExitingWorkerPool` really just saved one line on the thread factory
> construction.
> >
> >
> > On Wed, Sep 18, 2024 at 10:25 PM Péter Váry 
> wrote:
> >>
> >> Here are the cases where we call the `newWorkerPool` in our code:
> >>
> >> Correctly:
> >>
> >> S3FileIO.executorService
> >> HadoopFileIO.executorService
> >>
> >> Incorrectly:
> >>
> >> CountersBenchmark.defaultCounterMultipleThreads (core module)
> >> BaseDistributedDataScan.newMonitorPool (core module)
> >> FlinkInputFormat.createInputSplits (flink module)
> >> IcebergInputFormat.getSplits (flink module)
> >>
> >> Incorrectly, but typically called only once in the JVM lifecycle
> >>
> >> TableMigrationUtil.migrationService - the pool management is abandoned,
> and nothing prevents multiple pool creations (data module)
> >> IcebergCommitter (flink module)
> >> IcebergFilesCommitter.open (flink module)
> >> IcebergSource.planSplitsForBatch (flink module)
> >> StreamingMonitorFunction.open (flink module)
> >> ContinuousSplitPlannerImpl (flink module)
> >> Coordinator - Kafka coordinator - I'm not sure that this belongs
> to here (kafka-connect)
> >>
> >> The code we need to duplicate in core/data/flink/kafka module is:
> >>
> >>   public static ExecutorService newNonExitingWorkerPool(String namePrefix, int poolSize) {
> >>     return Executors.newFixedThreadPool(
> >>         poolSize,
> >>         new ThreadFactoryBuilder()
> >>             .setDaemon(true)
> >>             .setNameFormat(namePrefix + "-%d")
> >>             .build());
> >>   }
> >>
> >>
> >> Maybe adding another utility method to the `ThreadPools` would help
> future contributors to think twice about the need for using the `Exiting`
> solution, so I would prefer to add this method to the core `ThreadPools`
> with enough javadoc to highlight the intended usage.
> >>
> >> Thanks,
> >> Peter
> >>
> >> On Wed, Sep 18, 2024 at 23:26, rdb...@gmail.com wrote:
> >>>
> >>> I think this is the intended behavior. The code calls
> `MoreExecutors.getExitingExecutorService` internally to ensure the pool
> exits. I think the right fix is for callers to create their own
> `ExecutorService` rather than using `newWorkerPool`. That allows for
> customization without making Iceberg more complicated. `ThreadPools` isn't
> doing anything special here. It's just a convenience method to create an
> exiting, fixed-size thread pool that runs daemon threads. If that's not
> what you're looking for then isn't it reasonable to make your own
> convenience method?
> >>>
> >>> On Wed, Sep 18, 2024 at 1:22 PM Péter Váry <
> peter.vary.apa...@gmail.com> wrote:
> >>>>
> >>>> This is not just a Flink issue, the calls are spread out in multiple
> packages. We checked the code, and in many of the current use-cases in the
> Iceberg repo the pool is not used in a static environment, and closed
> manually. In these cases we should switch to a thread pool without a
> shutdown hook. So I think minimally we need to create a utility method to
> create such a pool.

Re: Clarification on DayTransform Result Type

2024-09-27 Thread rdb...@gmail.com
The background is that the result of the day function and dates are
basically the same: the number of days from the Unix epoch. When we started
using metadata tables, we realized that a lot of people use the day
function but then get a weird ordinal value out, but if we just change the
type to `date`, engines could correctly display the value. This isn't
required by the spec, it's just a convenience.
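
A small illustration of that point with plain JDK time classes (this is not
the Iceberg transform implementation):

import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

class DayOrdinalSketch {
  public static void main(String[] args) {
    OffsetDateTime ts = OffsetDateTime.of(2024, 9, 27, 10, 30, 0, 0, ZoneOffset.UTC);

    // The day transform yields the number of days since the Unix epoch...
    long dayOrdinal = ts.toLocalDate().toEpochDay();

    // ...which is the same value a date carries, so describing the result as a
    // date lets an engine render "2024-09-27" instead of the raw ordinal 19993.
    LocalDate asDate = LocalDate.ofEpochDay(dayOrdinal);

    System.out.println(dayOrdinal + " -> " + asDate); // 19993 -> 2024-09-27
  }
}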

On Fri, Sep 27, 2024 at 8:30 AM Russell Spitzer 
wrote:

> Good thing DateType is an Integer :)
> https://github.com/apache/iceberg/blob/113c6e7d62e53d3e3cb15b1712f3a1db473ca940/api/src/main/java/org/apache/iceberg/types/Type.java#L37
>
> On Thu, Sep 26, 2024 at 8:38 PM Kevin Liu  wrote:
>
>> Hey folks,
>>
>> While reviewing a PR to fix DayTransform in PyIceberg (#1208), we found an
>> inconsistency between the spec and the Java Iceberg library.
>>
>> According to the spec, the result type
>> for the "day partition transform" should be `int`, similar to other
>> time-based partition transforms (year/month/hour). However, in the Java
>> Iceberg library, the result type for the day partition transform is
>> `DateType` (source).
>> This seems to be a discrepancy from the spec, as the day partition
>> transform is the only time-based transform with a non-int result
>> type—whereas the others use IntegerType (source).
>>
>> Could someone confirm if my understanding is correct? If so, is there any
>> historical context for this difference? Lastly, how should we approach
>> resolving this moving forward?
>>
>> Best,
>> Kevin
>>
>>


Re: [DISCUSS] Modify ThreadPools.newWorkerPool to avoid unnecessary Shutdown Hook registration

2024-09-30 Thread rdb...@gmail.com
+1 for `newExitingWorkerPool`.

On Fri, Sep 27, 2024 at 4:23 PM Steven Wu  wrote:

> >  I don't think that solves the problem that these are used more widely
> than intended and without people knowing the behavior.
>
> Ryan, to solve this problem, I suggest we deprecate the current
> `newWorkerPool` with `newExitingWorkerPool`. This way, when people call
> `newExitingWorkerPool`, the intended behavior is clear from the method name.
>
> On Fri, Sep 27, 2024 at 1:58 PM rdb...@gmail.com  wrote:
>
>> I'm okay with adding newFixedThreadPool as Steven suggests, but I don't
>> think that solves the problem that these are used more widely than intended
>> and without people knowing the behavior. Even though "non-exiting" is
>> awkward, it is maybe a good option to call out behavior. +1 for Javadoc,
>> and +1 for doing something here since there are improper uses throughout
>> Iceberg. Thanks for raising this, Peter!
>>
>> On Thu, Sep 26, 2024 at 1:52 AM Jean-Baptiste Onofré 
>> wrote:
>>
>>> Hi Steven,
>>>
>>> I agree with you here. I think we can use semantics similar to
>>> ThreadPoolExecutor/ScheduledThreadPoolExecutor (like
>>> newFixedThreadPool, newWorkStealingPool, ...).
>>>
>>> Regards
>>> JB
>>>
>>> On Thu, Sep 26, 2024 at 2:05 AM Steven Wu  wrote:
>>> >
>>> >
>>> > First, we should definitely add Javadoc to `ThreadPools.newWorkerPool`
>>> on its behavior with a shutdown hook. It is not obvious from the method
>>> name. I would actually go further to deprecate `newWorkerPool` with
>>> `newExitingWorkerPool`. The `newWorkerPool` method name makes misuse easy,
>>> as the intention is not obvious from the name.
>>> >
>>> > `newNonExitingWorkerPool` is a little awkward to me. `NonExiting`
>>> should be the default behavior. Maybe we can call this new method as
>>> `newFixedThreadPool(int poolSize, String prefix)`. Alternatively, we can
>>> just make `ThreadPools.newDaemonThreadFactory` public as the proposed
>>> `newNonExitingWorkerPool` really just saved one line on the thread factory
>>> construction.
>>> >
>>> >
>>> > On Wed, Sep 18, 2024 at 10:25 PM Péter Váry <
>>> peter.vary.apa...@gmail.com> wrote:
>>> >>
>>> >> Here are the cases where we call the `newWorkerPool` in our code:
>>> >>
>>> >> Correctly:
>>> >>
>>> >> S3FileIO.executorService
>>> >> HadoopFileIO.executorService
>>> >>
>>> >> Incorrectly:
>>> >>
>>> >> CountersBenchmark.defaultCounterMultipleThreads (core module)
>>> >> BaseDistributedDataScan.newMonitorPool (core module)
>>> >> FlinkInputFormat.createInputSplits (flink module)
>>> >> IcebergInputFormat.getSplits (flink module)
>>> >>
>>> >> Incorrectly, but typically called only once in the JVM lifecycle
>>> >>
>>> >> TableMigrationUtil.migrationService - the pool management is
>>> abandoned, and nothing prevents multiple pool creations (data module)
>>> >> IcebergCommitter (flink module)
>>> >> IcebergFilesCommitter.open (flink module)
>>> >> IcebergSource.planSplitsForBatch (flink module)
>>> >> StreamingMonitorFunction.open (flink module)
>>> >> ContinuousSplitPlannerImpl (flink module)
>>> >> Coordinator - Kafka coordinator - I'm not sure that this
>>> belongs to here (kafka-connect)
>>> >>
>>> >> The code we need to duplicate in core/data/flink/kafka module is:
>>> >>
>>> >>   public static ExecutorService newNonExitingWorkerPool(String namePrefix, int poolSize) {
>>> >>     return Executors.newFixedThreadPool(
>>> >>         poolSize,
>>> >>         new ThreadFactoryBuilder()
>>> >>             .setDaemon(true)
>>> >>             .setNameFormat(namePrefix + "-%d")
>>> >>             .build());
>>> >>   }
>>> >>
>>> >>
>>> >> Maybe adding another utility method to the `ThreadPools` would help
>>> future contributors to think twice about the need for using the `Exiting`
>>> solution, so I would prefer to add this method to the core `ThreadPools`
>>> with enough javadoc to highlight the intended usage.
>>> >>
>>> >> Thanks,
>>> >> Peter
>>> >>
>>> >> rdb

Re: [Discuss] Geospatial Support

2024-09-30 Thread rdb...@gmail.com
I have a couple of comments that I'd like to see addressed.

First, I think that the definition of the bounding box needs to be more
clear: the bounding box must include all points that lie on an object's
edges or within an object. If that isn't required then we can't use the
bounding box for filtering because there may be points outside the box that
are part of the object.

Second, the encoding for lower and upper bound points needs to be a little
more specific about the binary and how to handle the optional values.
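
For illustration, a tiny sketch of why the inclusive definition matters for
file skipping (a hypothetical bounds holder, not the proposed spec encoding):

class BoundingBoxSkipSketch {
  // Per-file bounding box, as it might be derived from column bounds (illustrative only).
  record Box(double xMin, double yMin, double xMax, double yMax) {
    boolean intersects(Box other) {
      return xMin <= other.xMax && other.xMin <= xMax
          && yMin <= other.yMax && other.yMin <= yMax;
    }
  }

  // Skipping is only safe if every point of every geometry in the file, including
  // points on edges, lies inside the file's box; otherwise a matching row could be lost.
  static boolean canSkipFile(Box fileBounds, Box queryBounds) {
    return !fileBounds.intersects(queryBounds);
  }
}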

On Mon, Sep 30, 2024 at 1:30 PM Yufei Gu  wrote:

> Thanks Szehon! My comments were addressed. I'm ready to vote.
>
> Yufei
>
>
> On Mon, Sep 30, 2024 at 11:47 AM Russell Spitzer <
> russell.spit...@gmail.com> wrote:
>
>> All my concerns are addressed, I'm ready to vote.
>>
>> On Mon, Sep 30, 2024 at 1:21 PM Szehon Ho 
>> wrote:
>>
>>> Hi all,
>>>
>>> There have been several rounds of discussion on the PR:
>>> https://github.com/apache/iceberg/pull/10981 and I think most of the
>>> main points have been addressed.
>>>
>>> If anyone is interested, please take a look.  If there are no other
>>> major points, we plan to start a VOTE thread soon.
>>>
>>> I know Jia and team are also volunteering to work on the prototype
>>> immediately afterwards.
>>>
>>> Thank you,
>>> Szehon
>>>
>>> On Tue, Aug 20, 2024 at 1:57 PM Szehon Ho 
>>> wrote:
>>>
 Hi all

 Please take a look at the proposed spec change to support Geo type for
 V3 in : https://github.com/apache/iceberg/pull/10981, and comment or
 otherwise let me know your thoughts.

 Just as an FYI it incorporated the feedback from our last meeting (with
 Snowflake and Wherobots engineers).

 Thanks,
 Szehon

 On Wed, Jun 26, 2024 at 7:29 PM Szehon Ho 
 wrote:

> Hi
>
> It was great to meet in person with Snowflake engineers and we had a
> good discussion on the paths forward.
>
> Meeting notes for Snowflake- Iceberg sync.
>
>- Iceberg proposed Geometry type defaults to (edges=planar ,
>crs=CRS84).
>- Snowflake has two types Geography (spherical) and Geometry
>(planar, with customizable CRS).  The data layout/encoding is the same 
> for
>both types.  Let's see how we can support each in Iceberg type, 
> especially
>wrt Iceberg partition/file pruning
>- Geography type support
>- Main concern is the need for a suitable partition transform for
>   partition-level filter; the candidate is Michael Entin's proposal.
>   - Secondary concern is file and RG-level filtering.  Gang's Parquet
>   proposal allows storage of S2 / H3 IDs in Parquet stats, and so we can
>   also leverage that in Iceberg pruning code (Google and Uber libraries
>   are compatible)
>- Geometry type support
>   -  Main concern is partition transform needs to understand CRS,
>   but this can be solved by having XZ2 transform created with 
> customizable
>   min/max lat/long range (its all it needs)
>- Should (CRS, edges) be stored properties on Geography type in
>Phase 1?
>   - Should be fine to store, with only allowing defaults in Phase
>   1.
>   - Concern 1: If edges is stored, there will be ask to store
>   other properties like (orientation, epoch).  Solution is to punt 
> these
>   follow-on properties for later.
>   - Concern 2: if crs is stored, what format?  PROJJSON vs SRID.
>   Solution is to leave it as a string
>   - Concern 3: if crs is stored as a string, Iceberg cannot read
>   it.  This should be ok, as we only need this for XZ2 transform, 
> where the
>   user already passes in the info from CRS (up to user to make sure 
> these
>   align).
>
> Thanks
> Szehon
>
> On Tue, Jun 18, 2024 at 12:23 PM Szehon Ho 
> wrote:
>
>> Jia and I will sync with the Snowflake folks to see if we can have a
>> solution, or roadmap to solution, in the proposal.
>>
>> Thanks JB for the interest!  By the way, I want to schedule a meeting
>> to go over the proposal, it seems there's good feedback from folks from 
>> geo
>> side (and even Parquet community), but not too many eyes/feedback from
>> other folks/PMC on Iceberg community.  This might be due to lack of
>> familiarity/ time to read through it all.  In fact, a lot of the advanced
>> discussions like this one are for Phase 2 items, and Phase 1 items are
>> relatively straightforward, so wanted to explain that.  As I know its
>> summer vacation for some folks, we can do this in a week or early July,
>> hope t

Re: [VOTE] Table v3 spec: Add unknown and new type promotion

2024-09-30 Thread rdb...@gmail.com
+1 (binding)

On Mon, Sep 30, 2024 at 12:32 PM Daniel Weeks  wrote:

> +1 (binding)
>
> On Fri, Sep 27, 2024 at 2:41 PM Russell Spitzer 
> wrote:
>
>> +1 (binding)
>>
>> On Fri, Sep 27, 2024 at 4:37 PM rdb...@gmail.com 
>> wrote:
>>
>>> Hi everyone,
>>>
>>> I'd like to vote on PR #10955
>>> <https://github.com/apache/iceberg/pull/10955> that has been open for a
>>> while with the changes to add new type promotion cases. After discussion,
>>> the PR has been scoped down to keep complexity low. It now adds:
>>>
>>> * An `unknown` type for cases when only `null` values have been observed
>>> * Type promotion from `unknown` to any other type
>>> * Type promotion from `date` to `timestamp` or `timestamp_ns`
>>> * Clarification that promotion is not allowed if it breaks transform
>>> results
>>>
>>> The set of changes is quite a bit smaller than originally proposed
>>> because of the issue already discussed about lower and upper bounds values,
>>> and it no longer includes variant. I think that we can add more type
>>> promotion cases after we improve bounds metadata. This adds what we can now
>>> to keep v3 moving forward.
>>>
>>> Please vote in the next 72 hours:
>>>
>>> [ ] +1, commit the proposed spec changes
>>> [ ] -0
>>> [ ] -1, do not make these changes because . . .
>>>
>>> Thanks,
>>>
>>> Ryan
>>>
>>


Re: [DISCUSS] Modify ThreadPools.newWorkerPool to avoid unnecessary Shutdown Hook registration

2024-09-18 Thread rdb...@gmail.com
Since we're using standard interfaces, maybe we should just document this
behavior and you can control it by creating your own worker pool instead?

On Tue, Sep 17, 2024 at 2:20 AM Péter Váry 
wrote:

> Bumping this thread a bit.
>
> Cleaning up the pool in non-static cases should be a responsibility of the
> user. If they want a pool which is closed by a hook when the JVM exits
> they should explicitly "say" so, for example calling "newExitingWorkerPool".
>
> This is a behaviour change in the API, so I think we need feedback from
> the community before proceeding with it.
> What are your thoughts?
>
> Thanks,
> Peter
>
On Fri, Sep 13, 2024 at 17:16, 冯佳捷 wrote:
>
>> Hi all,
>>
>> During the investigation of a metaspace memory leak issue in Flink
>> IcebergSource ( https://github.com/apache/iceberg/pull/11073 ), a
>> discussion with @pvary revealed that *ThreadPools.newWorkerPool*
>> currently registers a Shutdown Hook via ExitingExecutorService for all
>> created thread pools. While this ensures graceful shutdown of the pools
>> when the JVM exits, it might lead to unnecessary Shutdown Hook
>> accumulation, especially when the pool is explicitly closed within the
>> application's lifecycle.
>>
>> I propose to *modify ThreadPools.newWorkerPool to not register a
>> Shutdown Hook by default*. This would prevent potential issues where
>> developers might unintentionally register numerous Shutdown Hooks when
>> using ThreadPools.newWorkerPool for short-lived thread pools.
>> To retain the existing functionality for long-lived thread pools that
>> require a Shutdown Hook, I suggest introducing a new, more descriptive
>> function, such as *newExitingWorkerPool*. This function would explicitly
>> create thread pools that are registered with a Shutdown Hook.
>>
>> *This change might potentially impact users who rely on the implicit
>> Shutdown Hook registration provided by the current
>> ThreadPools.newWorkerPool implementation.*
>> I would like to gather feedback from the community regarding this
>> proposed change, especially regarding potential compatibility concerns.
>>
>> Best regards,
>> Feng Jiajie
>>
>>


Re: [DISCUSS] Modify ThreadPools.newWorkerPool to avoid unnecessary Shutdown Hook registration

2024-09-18 Thread rdb...@gmail.com
I think this is the intended behavior. The code calls
`MoreExecutors.getExitingExecutorService` internally to ensure the pool
exits. I think the right fix is for callers to create their own
`ExecutorService` rather than using `newWorkerPool`. That allows for
customization without making Iceberg more complicated. `ThreadPools` isn't
doing anything special here. It's just a convenience method to create an
exiting, fixed-size thread pool that runs daemon threads. If that's not
what you're looking for then isn't it reasonable to make your own
convenience method?

On Wed, Sep 18, 2024 at 1:22 PM Péter Váry 
wrote:

> This is not just a Flink issue, the calls are spread out in multiple
> packages. We checked the code, and in many of the current use-cases in the
> Iceberg repo the pool is not used in a static environment, and closed
> manually. In these cases we should switch to a thread pool without a
> shutdown hook. So I think minimally we need to create a utility method to
> create such a pool.
>
> The main question is:
> - Is it a bug, or a feature, that we always provide a pool with a hook?
>
> If this is a bug, then we create a "newExitingWorkerPool", and change the
> callers to use the correct one.
> If this is a feature, then we create a "newNotExitingWorkerPool" (which is
> gross IMHO, but we should consider API compatibility), and change the
> callers to use the correct one.
>
> Thanks,
> Peter
>
> On Wed, Sep 18, 2024, 21:53 rdb...@gmail.com  wrote:
>
>> Since we're using standard interfaces, maybe we should just document this
>> behavior and you can control it by creating your own worker pool instead?
>>
>> On Tue, Sep 17, 2024 at 2:20 AM Péter Váry 
>> wrote:
>>
>>> Bumping this thread a bit.
>>>
>>> Cleaning up the pool in non-static cases should be a responsibility of
>>> the user. If they want a pool which is closed by a hook when the JVM exits
>>> they should explicitly "say" so, for example calling "newExitingWorkerPool".
>>>
>>> This is a behaviour change in the API, so I think we need feedback from
>>> the community before proceeding with it.
>>> What are your thoughts?
>>>
>>> Thanks,
>>> Peter
>>>
>>> On Fri, Sep 13, 2024 at 17:16, 冯佳捷 wrote:
>>>
>>>> Hi all,
>>>>
>>>> During the investigation of a metaspace memory leak issue in Flink
>>>> IcebergSource ( https://github.com/apache/iceberg/pull/11073 ), a
>>>> discussion with @pvary revealed that *ThreadPools.newWorkerPool*
>>>> currently registers a Shutdown Hook via ExitingExecutorService for all
>>>> created thread pools. While this ensures graceful shutdown of the pools
>>>> when the JVM exits, it might lead to unnecessary Shutdown Hook
>>>> accumulation, especially when the pool is explicitly closed within the
>>>> application's lifecycle.
>>>>
>>>> I propose to *modify ThreadPools.newWorkerPool to not register a
>>>> Shutdown Hook by default*. This would prevent potential issues where
>>>> developers might unintentionally register numerous Shutdown Hooks when
>>>> using ThreadPools.newWorkerPool for short-lived thread pools.
>>>> To retain the existing functionality for long-lived thread pools that
>>>> require a Shutdown Hook, I suggest introducing a new, more descriptive
>>>> function, such as *newExitingWorkerPool*. This function would
>>>> explicitly create thread pools that are registered with a Shutdown Hook.
>>>>
>>>> *This change might potentially impact users who rely on the implicit
>>>> Shutdown Hook registration provided by the current
>>>> ThreadPools.newWorkerPool implementation.*
>>>> I would like to gather feedback from the community regarding this
>>>> proposed change, especially regarding potential compatibility concerns.
>>>>
>>>> Best regards,
>>>> Feng Jiajie
>>>>
>>>>


Re: [DISCUSS] Column to Column filtering

2024-09-18 Thread rdb...@gmail.com
I'm curious to learn more about this feature. Is there a driving use case
that you're implementing it for? Are there common situations in which these
filters are helpful and selective?

My initial impression is that this kind of expression would have limited
utility at the table format level. Iceberg tracks column ranges for data
files and the primary use case for filtering is to skip data files at the
scan planning phase. For a column-to-column comparison, you would only be
able to eliminate data files that have non-overlapping ranges. That is, if
you're looking for rows where x < y, you can only eliminate a file when
min(x) >= max(y), so that no row in the file can satisfy the predicate. To
me, it seems unlikely that this would be generic enough
to be worth it, but if there are use cases where this can happen and speed
up queries I think it may make sense.
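
To make the pruning math concrete, a sketch of the only skip rule available
for an x < y predicate given per-file column ranges (a hypothetical stats
holder, not Iceberg's evaluator classes):

class ColumnComparisonPruningSketch {
  // Per-file lower/upper bounds for a column, as tracked in data file metrics.
  record Range(long min, long max) {}

  // For a predicate x < y, a file can be skipped only when no row can satisfy it,
  // which is guaranteed only when the ranges do not overlap: min(x) >= max(y).
  static boolean canSkipFileForXLessThanY(Range x, Range y) {
    return x.min() >= y.max();
  }
}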

Ryan

On Tue, Sep 17, 2024 at 6:21 AM Baldwin, Jennifer
 wrote:

> I’m starting a thread to discuss a feature for comparisons using column
> references on the left and right side of an expression wherever Iceberg
> supports column reference to literal value(s) comparisons.  The use case we
> want to support is filtering of date columns from a single table.  For
> instance:
>
>
>
> select * from travel_table
>
> where expected_date > travel_date;
>
>
>
> select * from travel_table
>
> where payment_date <>  due_date;
>
>
>
>
>
> The changes will impact row and scan file filtering.  Impacted jars are
> iceberg-api, iceberg-core, iceberg-orc and iceberg-parquet.
>
>
>
> Is this a feature the Iceberg community would be willing to accept?
>
>
>
> Here is a link to a Draft PR with current changes, Thanks.
>
> https://github.com/apache/iceberg/pull/11152
>
>
>


Re: [VOTE] Table v3 spec: Add unknown and new type promotion

2024-10-03 Thread rdb...@gmail.com
With 7 +1 votes and 1 -0, this passes. I'll work with Micah to clarify some
of the sections that he pointed out, but I don't think that we need to
block moving forward on the overall set of things that are proposed.
Thanks, everyone!

On Tue, Oct 1, 2024 at 1:49 AM Jean-Baptiste Onofré  wrote:

> +1 (non binding)
>
> Regards
> JB
>
> On Fri, Sep 27, 2024 at 11:36 PM rdb...@gmail.com 
> wrote:
> >
> > Hi everyone,
> >
> > I'd like to vote on PR #10955 that has been open for a while with the
> changes to add new type promotion cases. After discussion, the PR has been
> scoped down to keep complexity low. It now adds:
> >
> > * An `unknown` type for cases when only `null` values have been observed
> > * Type promotion from `unknown` to any other type
> > * Type promotion from `date` to `timestamp` or `timestamp_ns`
> > * Clarification that promotion is not allowed if it breaks transform
> results
> >
> > The set of changes is quite a bit smaller than originally proposed
> because of the issue already discussed about lower and upper bounds values,
> and it no longer includes variant. I think that we can add more type
> promotion cases after we improve bounds metadata. This adds what we can now
> to keep v3 moving forward.
> >
> > Please vote in the next 72 hours:
> >
> > [ ] +1, commit the proposed spec changes
> > [ ] -0
> > [ ] -1, do not make these changes because . . .
> >
> > Thanks,
> >
> > Ryan
>


Re: [Discuss] Iceberg View Interoperability

2024-10-25 Thread rdb...@gmail.com
Substrait is one of the reasons why we designed views with the ability to
have different representations. I think that SQL translation is not a great
solution. I'd like to see more focus on a portable intermediate
representation like Substrait. That would solve a lot of the limitations
with the SQL representation (like not wanting to rewrite after resolution).

On Fri, Oct 25, 2024 at 10:21 AM Szehon Ho  wrote:

> I don't have hands-on experience with Substrait, but wondering, is a substrait
> representation possible today with existing Iceberg view spec?  Ie, engines
> can store today the text serialized substrait representation with sql
> dialect 'substrait'?   Or is it an abuse of spec and we should make a
> proper field for IR representations?
>
> I imagine if it's possible, interested engines can start working on
> translating to and from substrait, and hopefully it will catch on, but as
> of now I'm not sure that Iceberg even provides a standard option?
>
> Thoughts?
> Szehon
>
> On Tue, Oct 22, 2024 at 4:05 AM Will Raschkowski
>  wrote:
>
>> It’s all nascent, but we have a relatively solid experience going from
>> Spark to Substrait [1], Substrait to Calcite [2], and Calcite to one of its
>> supported SQL dialects [3]. All these translations are Java-based.
>>
>>
>> [1]: https://github.com/substrait-io/substrait-java/pull/271
>>
>> [2]:
>> https://github.com/substrait-io/substrait-java/blob/main/isthmus/README.md
>>
>> [3]:
>> https://svn.apache.org/repos/asf/calcite/site/apidocs/org/apache/calcite/sql/dialect/package-summary.html
>>
>>
>>
>> *From: *Ajantha Bhat 
>> *Date: *Tuesday, 22 October 2024 at 08:22
>> *To: *dev@iceberg.apache.org 
>> *Subject: *Re: [Discuss] Iceberg View Interoperability
>>
>>
>>
>>
>> Thanks Dan for the reply.
>>
>> but I'm not convinced using a procedure is a good idea or really moves
>> things forward in that direction.
>> I just feel like the procedure approach has a number of drawbacks
>> including relying on the user to do the translation, being dependent on
>> Spark for defining the other representations (the API would typically be
>> available to you if you're using spark), there are a number of
>> quoting/escaping issues that make it awkward to define, and it introduces a
>> second way to define a view.
>>
>>
>> I didn't consider this as a problem as the procedure is not exposing any
>> new functionalities, users can still achieve this using java API and end up
>> with the same situations mentioned above. Also, not just Spark, each engine
>> can have this procedure.
>> But I see what you mean, having a centralized solution independent of the
>> engine can be better here.
>>
>> I feel like PyIceberg is the right path forward here.
>>
>>
>>
>> If we are using the python based solutions like SQLGlot, it makes sense
>> to use the PyIceberg to expose a view API to translate and add additional
>> dialects from it. But SQLGlot also doesn't support every engine out there
>> (Like Flink, Dremio, Impala etc). So, we may still need an API to manually
>> add the dialect for unsupported engines from python.
>>
>> I am sure we have people here who work on Coral, SQLGlot, Substrait and
>> similar technologies in this mailing list. I will wait for a week to see if
>> there are any opinions on the same. If not, we can start the development
>> with SQLGlot.
>>
>> - Ajantha
>>
>>
>>
>> On Thu, Oct 17, 2024 at 9:32 PM Daniel Weeks  wrote:
>>
>> Hey Ajantha,
>>
>>
>>
>> I think it's good to figure out a path forward for extending view
>> support, but I'm not convinced using a procedure is a good idea or really
>> moves things forward in that direction.
>>
>>
>>
>> As you already indicated, there are a number of different libraries to
>> translate views, but of the various options, It feels like there are two
>> approaches that resonate with me are either letting the engine perform the
>> translation (like Trino has done with Coral) or supporting views in Python
>> so that you can use a combination of libraries like SQLGlot.  Python is the
>> most universally available and is already used in a lot of tooling for
>> running/managing DDL statements.
>>
>>
>>
>> I just feel like the procedure approach has a number of drawbacks
>> including relying on the user to do the translation, being dependent on
>> Spark for defining the other representations (the API would typically be
>> available to you if you're using spark), there are a number of
>> quoting/escaping issues that make it awkward to define, and it introduces a
>> second way to define a view.
>>
>>
>>
>> Long-term IR (e.g. Substrait) is 

Re: [Discuss] Different file formats for ingestion and compaction

2024-10-25 Thread rdb...@gmail.com
Gabor,

The reason why the write format is a "default" is that I intended for it to
be something that engines could override. For cases where it doesn't make
sense to use the default because of memory pressure (as you might see in
ingestion processes) you could choose to override and use a format that
fits better with the use case. Then data services could go and compact into
a better long-term format.

We could also do what you're saying and introduce a property so that
streaming or ingest writers are specifically allowed to write with a
row-oriented format (Avro). I'm not sure how much value there is here,
though. It seems that most processing engines are more able now to
co-locate records and the number of open columnar files is no longer a
pressing concern.
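
For reference, a minimal sketch of that override path via the table property
(the catalog and table names here are made up):

import org.apache.iceberg.Table;
import org.apache.iceberg.TableProperties;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;

class WriteFormatSketch {
  // Assumes an already-configured catalog; "db.events" is a placeholder identifier.
  static void useAvroAsTableDefault(Catalog catalog) {
    Table table = catalog.loadTable(TableIdentifier.of("db", "events"));

    // Table-wide default (write.format.default); an ingest job can still override
    // per write, e.g. Spark exposes a "write-format" write option.
    table
        .updateProperties()
        .set(TableProperties.DEFAULT_FILE_FORMAT, "avro")
        .commit();
  }
}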

Ryan

On Fri, Oct 25, 2024 at 8:26 AM Gabor Kaszab  wrote:

> Hey Iceberg Community,
>
> I read this article the other day and there is this part that caught my
> attention (amongst
> others):
> "For high-throughput streaming ingestion, ...  durably store recently
> ingested tuples in a row-oriented format and periodically convert them to
> Parquet."
>
> So this made me wonder if it makes sense to give some support from the
> Iceberg lib for the writers to write different file formats when ingesting
> and different when they are compacting. Currently, we have
> "write.format.default" to tell the writers what format to use when writing
> to the table.
> I played with the idea, similarly to the quote above, to choose a format
> that is faster to write for streaming ingests and then periodically compact
> them into another format that is faster to read. Let's say ingest using
> AVRO and compact into Parquet.
>
> Do you think it would make sense to introduce another table property to
> split the file format between those use cases? E.g.:
> 1) Introduce "write.compact.format.default" to tell the writers what
> format to use for compactions and use existing "write.format.default" for
> everything else.
> Or
> 2) Introduce "write.stream-ingest.format.default" to tell the engines what
> format to use for streaming ingest and use the existing
> "write.format.default" for everything else?
>
> What do you think?
> Gabor
>
>


Re: [Discuss] Simplify tableExists API in HiveCatalog

2024-11-27 Thread rdb...@gmail.com
I'd support changing the behavior if we still have a way to match the
intent, which is to return true if the table exists in Hive and is an
Iceberg table.
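
A rough sketch of what that check could look like without loading Iceberg
metadata (hypothetical code against the Hive metastore client, not the actual
HiveCatalog change):

import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.NoSuchObjectException;
import org.apache.hadoop.hive.metastore.api.Table;
import org.apache.thrift.TException;

class HiveTableExistsSketch {
  // True only when an HMS table exists and its parameters mark it as an Iceberg table.
  static boolean icebergTableExists(IMetaStoreClient client, String db, String name)
      throws TException {
    try {
      Table hmsTable = client.getTable(db, name);
      String tableType = hmsTable.getParameters().get("table_type");
      return tableType != null && "iceberg".equalsIgnoreCase(tableType);
    } catch (NoSuchObjectException e) {
      return false;
    }
  }
}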

On Wed, Nov 27, 2024 at 11:26 AM Szehon Ho  wrote:

> Hm I think the thread got a bit sidetracked by the other question.
>
> The initial proposal by Steve is a performance improvement for
> HiveCatalog's tableExists().  Currently it loads both Hive and Iceberg
> table metadata, and if successful returns true.  The proposal is to load
> from Hive only, and return true if Hive metadata identifies that an Iceberg
> table exists with this name.
>
> Checking corruption of Iceberg's table metadata.json is a side-effect of
> the current behavior, but would not anymore with the proposed change.
> That's the question of the original thread, and so far there's agreement
> that it is not necessarily part of this scope of HiveCatalog's
> tableExists().
>
> At least this is my understanding.
> Thanks,
> Szehon
>
> On Wed, Nov 27, 2024 at 10:56 AM rdb...@gmail.com 
> wrote:
>
>> What kind of corruption are you referring to? I would expect corruption
>> to result in an exception when loading the table, but that the table should
>> still exist. The problem is likely that we determine if a table exists by
>> attempting to load it. We could fix that by not attempting to load the
>> table. I think that's a reasonable solution.
>>
>> On Wed, Nov 27, 2024 at 12:45 AM Manu Zhang 
>> wrote:
>>
>>> The current behavior's intent is not to check whether the metadata is
>>>> valid, it is to detect whether the table is an Iceberg table.
>>>
>>>
>>> Is there a way to detect this from HiveCatalog without loading the
>>> table?
>>>
>>>
>>> On Wed, Nov 27, 2024 at 2:01 PM Péter Váry 
>>> wrote:
>>>
>>>> I think we have an agreement, not to change the behavior wrt existing
>>>> non-Iceberg tables, and throw an exception.
>>>>
>>>> Are we also in agreement with the original proposal to return true when
>>>> the table exists but the metadata is somehow corrupted? Note: this is the
>>>> proposed change of behavior why the thread was originally started.
>>>>
>>>> On Tue, Nov 26, 2024, 21:30 rdb...@gmail.com  wrote:
>>>>
>>>>> I'd argue against changing this. The current behavior's intent is not
>>>>> to check whether the metadata is valid, it is to detect whether the table
>>>>> is an Iceberg table. It ignores non-Iceberg tables. Changing that behavior
>>>>> would be surprising, especially if we started throwing exceptions.
>>>>>
>>>>> On Fri, Nov 22, 2024 at 2:01 PM Kevin Liu 
>>>>> wrote:
>>>>>
>>>>>> > Should add, my personal preference is probably not to change the
>>>>>> existing behavior for this part
>>>>>>
>>>>>> +1. I realized that this is not a new behavior. The `loadTable`
>>>>>> implementation has this problem too.
>>>>>> It would be good to have a test case specifically for this edge case
>>>>>> and maybe call this out in the documentation.
>>>>>>
>>>>>> Thanks,
>>>>>> Kevin Liu
>>>>>>
>>>>>> On Fri, Nov 22, 2024 at 11:57 AM Szehon Ho 
>>>>>> wrote:
>>>>>>
>>>>>>> Should add, my personal preference is probably not to change the
>>>>>>> existing behavior for this part (false, if exists a Hive table with same
>>>>>>> name) at the moment, just adding another possibility for consideration.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Szehon
>>>>>>>
>>>>>>> On Fri, Nov 22, 2024 at 2:00 AM Szehon Ho 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks Kevin and Gabor, this is an interesting discussion.  I guess
>>>>>>>> a third option, instead of returning true/false in this case, is to change
>>>>>>>> it to throw a NoSuchIcebergTableException if it's a non-Iceberg table,
>>>>>>>> which I think is actually what this PR does?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Szehon
>>>>>>>>
>>>>>>>> On Fri, Nov 22, 2024 at 1:08 AM Gabor

Re: [Discuss] Simplify tableExists API in HiveCatalog

2024-11-27 Thread rdb...@gmail.com
What kind of corruption are you referring to? I would expect corruption to
result in an exception when loading the table, but that the table should
still exist. The problem is likely that we determine if a table exists by
attempting to load it. We could fix that by not attempting to load the
table. I think that's a reasonable solution.

On Wed, Nov 27, 2024 at 12:45 AM Manu Zhang  wrote:

> The current behavior's intent is not to check whether the metadata is
>> valid, it is to detect whether the table is an Iceberg table.
>
>
> Is there a way to detect this from HiveCatalog without loading the table?
>
>
> On Wed, Nov 27, 2024 at 2:01 PM Péter Váry 
> wrote:
>
>> I think we have an agreement, not to change the behavior wrt existing
>> non-Iceberg tables, and throw an exception.
>>
>> Are we also in agreement with the original proposal to return true when
>> the table exists but the metadata is somehow corrupted? Note: this is the
>> proposed change of behavior why the thread was originally started.
>>
>> On Tue, Nov 26, 2024, 21:30 rdb...@gmail.com  wrote:
>>
>>> I'd argue against changing this. The current behavior's intent is not to
>>> check whether the metadata is valid, it is to detect whether the table is
>>> an Iceberg table. It ignores non-Iceberg tables. Changing that behavior
>>> would be surprising, especially if we started throwing exceptions.
>>>
>>> On Fri, Nov 22, 2024 at 2:01 PM Kevin Liu  wrote:
>>>
>>>> > Should add, my personal preference is probably not to change the
>>>> existing behavior for this part
>>>>
>>>> +1. I realized that this is not a new behavior. The `loadTable`
>>>> implementation has this problem too.
>>>> It would be good to have a test case specifically for this edge case
>>>> and maybe call this out in the documentation.
>>>>
>>>> Thanks,
>>>> Kevin Liu
>>>>
>>>> On Fri, Nov 22, 2024 at 11:57 AM Szehon Ho 
>>>> wrote:
>>>>
>>>>> Should add, my personal preference is probably not to change the
>>>>> existing behavior for this part (false, if exists a Hive table with same
>>>>> name) at the moment, just adding another possibility for consideration.
>>>>>
>>>>> Thanks
>>>>> Szehon
>>>>>
>>>>> On Fri, Nov 22, 2024 at 2:00 AM Szehon Ho 
>>>>> wrote:
>>>>>
>>>>>> Thanks Kevin and Gabor, this is an interesting discussion.  I guess a
>>>>>> third option, instead of returning true/false in this case, is to change it
>>>>>> to throw a NoSuchIcebergTableException if it's a non-Iceberg table, which I
>>>>>> think is actually what this PR does?
>>>>>>
>>>>>> Thanks
>>>>>> Szehon
>>>>>>
>>>>>> On Fri, Nov 22, 2024 at 1:08 AM Gabor Kaszab
>>>>>>  wrote:
>>>>>>
>>>>>>> Hey,
>>>>>>>
>>>>>>> I think what Kevin says makes sense. However, it would then confuse
>>>>>>> the opposite use case of this function. Let's assume that we change the
>>>>>>> implementation of tableExists() to not load the table internally:
>>>>>>>
>>>>>>> if (tableExists(table_name)) {
>>>>>>> table = loadTable(table_name);
>>>>>>> }
>>>>>>>
>>>>>>> Here, you find that the table exists but when you try to load it it
>>>>>>> fails because it's not an Iceberg table. I don't think that any of 
>>>>>>> these 2
>>>>>>> are intuitive. I think the question here is how much an API of the 
>>>>>>> Iceberg
>>>>>>> table format should know about the existence of tables in other formats.
>>>>>>>
>>>>>>> If `tableExists` is meant to check for conflicting entries in the HMS
>>>>>>>
>>>>>>> Another interpretation of calling Catalog.tableExists() on an
>>>>>>> Iceberg API is instead "is there such an Iceberg table". TBH, not sure 
>>>>>>> if
>>>>>>> any of the 2 approaches are better than the other, I just wanted to show
>>>>>>> that there is another side of t

Re: Storing catalog directly on object store

2024-11-27 Thread rdb...@gmail.com
> We deprecated this recently and we don't have to deprecate it if object
stores support atomic operations like this.

I disagree because this misses many of the reasons for deprecation. It
isn't just that S3 didn't support a `putIfAbsent` operation. Other object
stores did and there are still several problems with this approach. The
fundamental issue is that it is attempting to solve problems at the wrong
level.

One of the reasons why Iceberg exists is that we saw people doing the same
thing with Parquet. People were trying to solve problems with their tables
by attempting to modify Parquet in wacky ways, like wanting to replace
the footer to make schema changes. Schema evolution needed to be solved at
the table level and in this community we've always tried to solve problems
more directly and elegantly by addressing them at the right layer of the
stack.

Iceberg tables scale up existing atomic operations to make transactional
guarantees on very large tables. Object stores and file systems aren't well
suited for this task. Just like they were not sufficient to provide
transactional guarantees across files and partitions, the primitives you
can use aren't sufficient for a database. Storage capabilities are also not
the right place to deliver other catalog features, like basic CRUD
operations.

The addition of `putIfAbsent` to S3 doesn't support transactions where you
need to modify multiple tables and it also doesn't address cases like the
need to atomically rename and delete tables. Schemes that use `putIfAbsent`
also rely either on consistently listing a large prefix or on maintaining a
version-hint file. That version-hint file can be out of date, so even with
one you still need to list or iteratively attempt to read metadata files to
determine the latest.

Getting a file-only scheme right is complicated and is specific to a
particular store (both commits and version-hint handling). Local file
systems would use an exclusive create operation to commit, Hadoop uses
atomic rename, and object stores use different `putIfAbsent` operations.
Making this work across languages and engines requires a lot of work to
specify requirements and document issues, only to get to single-table
functionality that doesn't deliver the catalog-level primitives like atomic
rename that are commonly used.

In the end, catalog problems are best solved at the catalog layer, not
through an elaborate scheme that uses storage-layer primitives, just as it
was not a good idea to deliver table behaviors using similar storage-layer
schemes. Adding `putIfAbsent` to S3 doesn't change that design principle.

I sympathize with the idea that it would be great if you didn't need a
catalog. Simpler infrastructure is generally better.

But trying to avoid a catalog limits the capabilities of this
infrastructure, while setting people up for later failure. When I talk with
people that have been trying to avoid having a catalog, they tend to have
tables scattered across buckets that they need to track down, they lack
observability to know what is being used, don't to know if they are
deleting data in compliance with regulations, and they often lack simple
and usable access controls.

I think that the solution is to make it easier to run or use a catalog, not
to try to build without one.

And I'm also looking forward to what Jack is alluding to.

On Tue, Nov 26, 2024 at 11:05 PM Ajantha Bhat  wrote:

> Interesting.
>
> We already have file system tables [1] in Iceberg (HadoopCatalog
> implements this spec).
> We deprecated this recently and we don't have to deprecate it if object
> stores support atomic operations like this.
>
> [1] https://iceberg.apache.org/spec/#file-system-tables
>
> - Ajantha
>
> On Wed, Nov 27, 2024 at 2:53 AM Nikhil Benesch 
> wrote:
>
>> Ah, fascinating. Thanks very much for the pointer.
>>
>> Here's the thread introducing the proposal [0], for anyone else curious.
>>
>> [0]: https://lists.apache.org/thread/kh4n98w4z22sc8h2vot4q8n44vdtnltg
>>
>> On Tue, Nov 26, 2024 at 3:27 PM Jean-Baptiste Onofré 
>> wrote:
>> >
>> > Hi Vignesh
>> >
>> > Thanks for the reminder, I remember we quickly discussed this during a
>> > community meeting.
>> >
>> > I will take a new look at the doc.
>> >
>> > Regards
>> > JB
>> >
>> > On Tue, Nov 26, 2024 at 9:19 PM Vignesh  wrote:
>> > >
>> > > Hi,
>> > > There was a proposal along the same lines, for the read portion few
>> weeks back by Ashvin.
>> > >
>> > >
>> https://docs.google.com/document/d/1yzLXSOtzBXyaWHfeVsWsMu4xmOH8rV6QyM5ZAnJZjMQ/edit?usp=drivesdk
>> > >
>> > > Thanks,
>> > > Vignesh.
>> > >
>> > >
>> > > On Tue, Nov 26, 2024, 11:59 AM Jean-Baptiste Onofré 
>> wrote:
>> > >>
>> > >> Hi Nikhil
>> > >>
>> > >> Thanks for your message, very interesting.
>> > >>
>> > >> I think it would be great to involve the Polaris project here as
>> well,
>> > >> as a REST Catalog implementation.
>> > >> The Polaris community is discussing storage/backend right now, so it
>> > >> would be the perfect timi

Re: [DISCUSS] Enforce table properties at catalog level

2024-11-27 Thread rdb...@gmail.com
Manu, this is something that you can easily build into a REST catalog
implementation. I think that's probably the best way to solve it, rather
than trying to implement this behavior across all of the catalogs in the
project, right?
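
As a sketch of what that could look like on the service side (purely
hypothetical classes; the REST spec does not prescribe this):

import java.util.HashMap;
import java.util.Map;

class TablePropertyPolicySketch {
  // Catalog-enforced properties that always win over whatever the client sent.
  private final Map<String, String> enforced;

  TablePropertyPolicySketch(Map<String, String> enforced) {
    this.enforced = new HashMap<>(enforced);
  }

  // Applied by the service when handling create-table and update-table requests,
  // so the rule holds regardless of which engine issued the request.
  Map<String, String> apply(Map<String, String> requested) {
    Map<String, String> result = new HashMap<>(requested);
    result.putAll(enforced);
    return result;
  }
}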

On Wed, Nov 27, 2024 at 8:47 AM Pucheng Yang 
wrote:

> I think the naming of the property should be fixed as it only applies for
> any new table creation.
>
> On Wed, Nov 27, 2024 at 2:21 AM Manu Zhang 
> wrote:
>
>> Hi all,
>>
>> Currently, we can *enforce default table properties* at catalog level
>> with configs like
>> spark.sql.catalog.*catalog-name*.table-override.*propertyKey*[1].  It
>> prevents users from overriding those properties when creating a table.
>> However, users can still override later through altering the table.
>> The Spark doc is inconsistent, saying that the table-override property
>> can't be overridden by users. Which behavior is expected?
>>
>>
>> 1. 
>> https://iceberg.apache.org/docs/nightly/spark-configuration/#catalog-configuration
>> 
>>
>>
>> Thanks,
>> Manu
>>
>


Re: [DISCUSS] Hive Support

2024-11-27 Thread rdb...@gmail.com
I think that we should remove Hive 2 and Hive 3. We already agreed to
remove Hive 2, but Hive 3 is no longer compatible with the project, is
already EOL, and will not see a release that restores compatibility. Anyone
using the existing Hive 3 support should be able to
continue using older releases.

In general, I think it's a good idea to let people use older releases when
these situations happen. It is difficult for the project to continue to
support libraries that are EOL and I don't think there's a great
justification for it, considering Iceberg support in Hive 4 is native and
much better!

On Wed, Nov 27, 2024 at 7:12 AM Cheng Pan  wrote:

> That said, it would be helpful if they continue running
> tests against the latest stable Hive releases to ensure that any
> changes don’t unintentionally break something for Hive, which would be
> beyond our control.
>
>
> I believe we should continue maintaining a Hive Iceberg runtime test suite
> with the latest version of Hive in the Iceberg repository.
>
>
> i think we can keep some basic Hive4 tests in iceberg repo
>
>
> Instead of running basic tests on the Iceberg repo, maybe let Iceberg
> publish daily snapshot jars to Nexus, and have a daily CI in Hive to
> consume those jars and run full Iceberg tests makes more sense?
>
> Thanks,
> Cheng Pan
>
>


Re: [DISCUSS] Deprecate embedded manifests

2024-11-27 Thread rdb...@gmail.com
I think it's reasonable to mark it deprecated in the spec, especially
because we don't allow it in v2. But I'm not sure how that would allow us
to remove code paths associated with it. If it is allowed by an older and
supported version of the spec, then how can we safely remove the code paths
that read it?

On Fri, Nov 22, 2024 at 2:56 AM Fokko Driesprong  wrote:

> Hey Ryan,
>
> The goal of the deprecation is to avoid other implementations to produce
> it. PyIceberg for example, does not support this and I think it would be
> good to avoid having others (rust, go, etc) to support this. Regarding the
> removal, Amogh expressed the same concern on the PR
> <https://github.com/apache/iceberg/pull/11586#discussion_r1848789823>.
>
> In my quest to make the Java implementation follow the spec as closely as
> possible, I noticed that we use a DummyFileIO to mimic a ManifestList. I
> ran into this when turning 503: added_snapshot_id into a required field
> (https://github.com/apache/iceberg/pull/11626/files#r1853683623). So the
> value is in removing paths, as Szehon pointed out. When removing support
> for the embedded manifest list, we can remove all that logic and keep the
> codebase nice and tidy.
>
> It would be good to start the discussion of deprecating support for older
> formats at some point, however, for a V2 reader is it fairly easy to
> project V1 metadata as V2. Except when embedded manifests are being used,
> marking this kind of oddities as deprecated I think will enable readers to
> support reading older versions for a longer time. My suggestion would be to
> mark the field as deprecated and revisit the actual removal. I've marked it
> up for removal in Java 2.0 for now to give it enough time.
>
> Kind regards,
> Fokko
>
>
>
> Op do 21 nov 2024 om 20:52 schreef rdb...@gmail.com :
>
>> Can we safely deprecate and remove this? The manifest list is required in
>> v2, but the spec has stated for a long time that v1 tables can use
>> manifests rather than a manifest list. It’s unlikely, but it would be
>> valid for other implementations to produce it.
>>
>> I would understand if other implementations chose to fail tables that
>> don’t have a manifest list to avoid adding code to handle manifests, but
>> I don’t think that there’s much value in removing support from the Java
>> implementation.
>>
>> Instead, what about discussing how to deprecate support for older format
>> versions? That seems like the main issue here. Once the majority of
>> implementations move to newer versions, we would like to deprecate the old
>> ones.
>>
>> On Thu, Nov 21, 2024 at 11:01 AM Szehon Ho 
>> wrote:
>>
>>> +1, great to have less possible paths.
>>>
>>> Thanks
>>> Szehon
>>>
>>> On Thu, Nov 21, 2024 at 10:33 AM Steve Zhang
>>>  wrote:
>>>
>>>> +1 to deprecate
>>>>
>>>> Thanks,
>>>> Steve Zhang
>>>>
>>>>
>>>>
>>>> On Nov 19, 2024, at 3:32 AM, Fokko Driesprong  wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> I would like to propose to deprecate embedded manifests
>>>> <https://github.com/apache/iceberg/pull/11586>. This has been used
>>>> before the manifest-list was introduced, but I don't think they are used
>>>> since the project has been open-sourced, and it would be good to
>>>> officially deprecate them from the spec. It is only supported by Iceberg
>>>> Java today, and I haven't seen any requests for PyIceberg to add support
>>>> for this.
>>>>
>>>> Any questions or concerns about deprecating the embedded manifests?
>>>>
>>>> Kind regards,
>>>> Fokko Driesprong
>>>>
>>>>
>>>>


Re: [Discuss] Simplify tableExists API in HiveCatalog

2024-11-26 Thread rdb...@gmail.com
I'd argue against changing this. The current behavior's intent is not to
check whether the metadata is valid, it is to detect whether the table is
an Iceberg table. It ignores non-Iceberg tables. Changing that behavior
would be surprising, especially if we started throwing exceptions.
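
To make the trade-off concrete, here is a minimal sketch (a hypothetical
helper, using only existing Catalog APIs) of the create-if-absent pattern
discussed below. Because tableExists() ignores non-Iceberg tables, the
createTable() call can still fail with AlreadyExistsException when an HMS
entry with the same name exists, so catching that exception is more robust
than relying on the existence check alone:

```
import org.apache.iceberg.PartitionSpec;
import org.apache.iceberg.Schema;
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.exceptions.AlreadyExistsException;

public class CreateIfAbsent {
  static Table createIfAbsent(Catalog catalog, TableIdentifier id, Schema schema) {
    if (catalog.tableExists(id)) {
      return catalog.loadTable(id);
    }
    try {
      return catalog.createTable(id, schema, PartitionSpec.unpartitioned());
    } catch (AlreadyExistsException e) {
      // the name was taken by a non-Iceberg table (or a concurrent writer);
      // loadTable() will still fail here if it is not an Iceberg table
      return catalog.loadTable(id);
    }
  }
}
```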

On Fri, Nov 22, 2024 at 2:01 PM Kevin Liu  wrote:

> > Should add, my personal preference is probably not to change the
> existing behavior for this part
>
> +1. I realized that this is not a new behavior. The `loadTable`
> implementation has this problem too.
> It would be good to have a test case specifically for this edge case and
> maybe call this out in the documentation.
>
> Thanks,
> Kevin Liu
>
> On Fri, Nov 22, 2024 at 11:57 AM Szehon Ho 
> wrote:
>
>> Should add, my personal preference is probably not to change the existing
>> behavior for this part (false, if exists a Hive table with same name) at
>> the moment, just adding another possibility for consideration.
>>
>> Thanks
>> Szehon
>>
>> On Fri, Nov 22, 2024 at 2:00 AM Szehon Ho 
>> wrote:
>>
>>> Thanks Kevin and Gabor, this is an interesting discussion.  I guess a
>>> third option instead of returning true/false in this case, is to change it
>>> to throw an NoSuchIcebergTableException if its a non-Iceberg table, which I
>>> think is actually what this pr does?
>>>
>>> Thanks
>>> Szehon
>>>
>>> On Fri, Nov 22, 2024 at 1:08 AM Gabor Kaszab
>>>  wrote:
>>>
 Hey,

 I think what Kevin says makes sense. However, it would then confuse the
 opposite use case of this function. Let's assume that we change the
 implementation of tableExists() to not load the table internally:

 if (tableExists(table_name)) {
 table = loadTable(table_name);
 }

 Here, you find that the table exists but when you try to load it it
 fails because it's not an Iceberg table. I don't think that any of these 2
 are intuitive. I think the question here is how much an API of the Iceberg
 table format should know about the existence of tables in other formats.

 If `tableExists` is meant to check for conflicting entries in the HMS

 Another interpretation of calling Catalog.tableExists() on an Iceberg
 API is instead "is there such an Iceberg table". TBH, not sure if any of
 the 2 approaches are better than the other, I just wanted to show that
 there is another side of the coin :)

 Regards,
 Gabor

 On Fri, Nov 22, 2024 at 3:13 AM Kevin Liu 
 wrote:

> Hi Steve,
>
> This makes sense to me. The semantics of `tableExists` focus on
> whether a table's name exists in the catalog. For the Hive catalog,
> checking the HMS entry should be sufficient.
>
> I do have a question about usage, though. Typically, I would use `
> tableExists` like this:
>
> ```
> if (!tableExists(table_name)) {
> table = createTable(table_name);
> }
> ```
> What happens when a Hive table with the same name already exists in
> the catalog? In the current implementation, `tableExists` would return
> `false` because `HiveOperationsBase.validateTableIsIceberg` throws a
> `NoSuchTableException`.
> This would cause the code above to attempt to create the table, only
> to fail since the name already exists in the HMS.
> If `tableExists` is meant to check for conflicting entries in the HMS,
> perhaps it should return true even when a Hive table with the same name
> exists.
>
> I’d love to hear your thoughts on this.
>
> Best,
> Kevin Liu
>
> On Thu, Nov 21, 2024 at 5:22 PM Szehon Ho 
> wrote:
>
>> Hi,
>>
>> It's a good performance find and improvement.   Left some comment on
>> the PR.
>>
>> IMO, the behavior actually more matches the API javadoc ("Check
>> whether table exists"), not whether it is corrupted or not, so I'm
>> supportive of it.
>>
>> Thanks
>> Szehon
>>
>> On Thu, Nov 21, 2024 at 10:57 AM Steve Zhang
>>  wrote:
>>
>>> Hi Iceberger,
>>>
>>>   I have a proposal to simplify the tableExists API in the Hive
>>> catalog, which involves a behavior change, and I’d like to hear your
>>> thoughts.
>>>
>>>   Currently, in our catalog interface[1], the tableExists method is
>>> implemented as a default API by invoking the loadTable method. It
>>> returns true if the table can be loaded without exceptions. This
>>> behavior implies two checks:
>>>
>>>1. The table entry exists in the catalog.
>>>2. The latest metadata.json for the table is not corrupted.
>>>
>>>   The behavior change I’m proposing focuses only on the first
>>> condition—checking if the table entry exists in the catalog. This 
>>> separates
>>> the concerns of table existence and table health (e.g., metadata not
>>> corrupted). Such a change could improve the performance of existence checks.

Re: [VOTE] Add Variant type to Iceberg Spec

2024-11-26 Thread rdb...@gmail.com
+1 and I agree with Russell. v3 is still under development, so I think it's
reasonable to include Variant based on the current Parquet spec.

On Mon, Nov 25, 2024 at 10:35 PM Jean-Baptiste Onofré 
wrote:

> I second Russell here. I think it makes sense to add variant type to
> V3 spec, even if the implementation details will come later.
>
> So +1 to add in the spec.
>
> Regards
> JB
>
> On Mon, Nov 25, 2024 at 6:21 PM Russell Spitzer
>  wrote:
> >
> > I'm +1,
> >
> > 1. I don't think we are going to change our decision on whether to
> include variants based on the timing of Parquet ratification
> > 2. We aren't going to formally close V3 Spec yet, so if we do end up in
> a situation where we want to close the spec and Parquet has not removed the
> tag, we can remove the variant from the spec then. (I think that scenario
> is unlikely)
> > 3. There is very little in our change set here that specifically
> references the Parquet spec except for our reference link to it.
> >
> > I don't think there is anything that will happen in the spec that will
> change what we would include in the Iceberg Spec (especially in this PR)
> >
> > On Fri, Nov 22, 2024 at 5:10 PM Micah Kornfield 
> wrote:
> >>
> >> My (non-binding) vote is -1 until the variant spec is formally adopted
> in Parquet.
> >>
> >> On Fri, Nov 22, 2024 at 2:51 PM Aihua Xu  wrote:
> >>>
> >>> Hi everyone,
> >>>
> >>> I've updated the Iceberg spec to include the new Variant type as part
> of #10831. The changes are basically complete. This is a heads-up about the
> upcoming change. Please review and +1 to acknowledge, so we will merge.
> >>>
> >>> Thanks,
> >>> Aihua
>


Re: [DISCUSS] Apache Iceberg Summit 2025 - Selection Committee

2024-11-26 Thread rdb...@gmail.com
I'd like to volunteer. Glad to see Iceberg Summit 2025 coming together!

On Tue, Nov 26, 2024 at 1:42 AM Jean-Baptiste Onofré 
wrote:

> Hi everyone,
>
> As you probably know, we've been having discussions about the Iceberg
> Summit 2025.
>
> The PMC pre-approved the Iceberg Summit proposal, and one of the first
> steps is to put together a selection committee that will be
> responsible for choosing talks and guiding the process.
> Once we have a selection committee, I will complete the concrete
> proposal for the ASF and the Iceberg PMC to request the ability to use
> the name Iceberg/Apache Iceberg.
>
> If you'd like to help and be part of the selection committee, please
> volunteer in a reply to this thread. Since we likely can't include
> everyone that volunteers, I propose that the PMC should choose the
> final committee from the set of people that volunteer.
>
> We'll leave this open up to Dec 10th to give people time (as
> Thanksgiving is this week).
>
> Thanks !
> Regards
> JB
>


Re: [DISCUSS] Update supported blob types in puffin spec

2025-02-04 Thread rdb...@gmail.com
Thanks for proposing this.

My main concern is that this doesn't seem to be aimed at standardizing this
metadata, but rather a way to pass existing Hive structures in a different
way. I commented on the PR, but I'll carry it over here for this discussion.

Iceberg already supports tracking column lower bounds, upper bounds, column
counts, null value counts, and NaN counts per column in table metadata. The
existing stats use column ID rather than column name so that stats are
compatible with schema evolution. These are kept at the file level and we
are adding partition stats files to handle aggregates at the partition
level. We also use snapshot summaries for similar purposes at the table
level. The proposed Hive structure doesn't seem like it adds much, and, if
anything, would be a feature regression because it doesn't use column IDs
and has extra metadata that would not be accurate (like `isPrimaryKey`).

The Puffin format also has an NDV sketch that is already in use (thanks,
Piotr!) so that seems to duplicate existing functionality as well.

The KLL sketch seems useful to me, but I would separate that from the Hive
blob.

Ryan

On Tue, Feb 4, 2025 at 8:16 AM Denys Kuzmenko  wrote:

> Hi Gabor,
>
> Thanks for your feedback!
>
> > In that use case however, we'd lose the stats we got previously from HMS
>
> For Iceberg tables Hive computes and stores the same stats object in a
> puffin file, previously persisted to HMS. So, there shouldn't be any
> changes for Impala other than changing the stats source.
>
> > We could gather all the column stats needed by different engines,
> standardize them into the Iceberg repo
>
> That is an option I mentioned above and provided the Hive schema,
> currently used to store column statistics.
> I can create a google doc to continue the discussion in that direction.
>
> > Aren't partition status just a more granular way of column stats.
>
> In Iceberg 1.7 Ajantha added a helper method to compute the basic
> partition stats for the given snapshot.
> Collection computeStats(Table table, Snapshot snapshot)
>
> Hopefully, we'll get reader and writer support in 1.8:
> https://github.com/apache/iceberg/pull/11216
>
> A similar functionality is needed for column stats.
> In the case of a partitioned table, we need to create 1 ColumnStatistics
> object per partition and store it as a separate blob in a puffin file.
>
> During the query planning, we'll compute and use aggregated stats based on
> a pruned partition list.
>
> Regards,
> Denys
>
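
For readers unfamiliar with the KLL sketch mentioned above, here is a minimal,
illustrative example using Apache DataSketches (the library that defines the
KLL format); how and whether the serialized bytes would be stored as a Puffin
blob is exactly the open spec question in this thread:

```
import org.apache.datasketches.kll.KllDoublesSketch;

public class KllExample {
  public static void main(String[] args) {
    // stream column values into the sketch; memory stays bounded
    KllDoublesSketch sketch = KllDoublesSketch.newHeapInstance();
    for (double value : new double[] {1.0, 5.0, 5.0, 9.0, 42.0}) {
      sketch.update(value);
    }

    // approximate quantiles that a planner could use for range estimates
    double median = sketch.getQuantile(0.5);
    double p95 = sketch.getQuantile(0.95);

    // candidate payload for a (not yet specified) Puffin blob
    byte[] serialized = sketch.toByteArray();
    System.out.printf("median=%.1f p95=%.1f bytes=%d%n", median, p95, serialized.length);
  }
}
```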


Re: [VOTE] Update partition stats spec for V3

2025-02-04 Thread rdb...@gmail.com
+1

On Tue, Feb 4, 2025 at 12:46 AM Honah J.  wrote:

> +1
>
> On Mon, Feb 3, 2025 at 11:42 PM Ajantha Bhat 
> wrote:
>
>> +1
>>
>> On Tue, Feb 4, 2025 at 11:30 AM Eduard Tudenhöfner <
>> etudenhoef...@apache.org> wrote:
>>
>>> +1
>>>
>>> On Mon, Feb 3, 2025 at 8:33 PM Dongjoon Hyun 
>>> wrote:
>>>
 +1 for the proposal.

 Dongjoon

 > On Mon, Feb 3, 2025 at 2:35 PM ConradJam  wrote:
 >
 > > +1 (non-binding)
 > >
 > > Steven Wu  于2025年2月3日周一 14:08写道:
 > >
 > >> +1
 > >>
 > >> The spec change makes sense. left a question in the PR.
 > >>
 > >> On Sun, Feb 2, 2025 at 8:52 PM roryqi  wrote:
 > >>
 > >>> +1
 > >>>
 > >>> Amogh Jahagirdar <2am...@gmail.com> 于2025年2月2日周日 10:16写道:
 > >>>
 >  +1
 > 
 >  On Sat, Feb 1, 2025 at 11:05 AM huaxin gao <
 huaxin.ga...@gmail.com>
 >  wrote:
 > 
 > > +1 (non-binding)
 > >
 > > On Sat, Feb 1, 2025 at 8:50 AM Manish Malhotra <
 > > manish.malhotra.w...@gmail.com> wrote:
 > >
 > >> +1(nonbinding)
 > >>
 > >> On Sat, Feb 1, 2025 at 2:49 AM Russell Spitzer <
 > >> russell.spit...@gmail.com> wrote:
 > >>
 > >>> +1
 > >>>
 > >>> On Sat, Feb 1, 2025 at 3:01 AM Anton Okolnychyi <
 > >>> aokolnyc...@gmail.com> wrote:
 > >>>
 >  Hi all,
 > 
 >  I propose the following updates to our partition stats spec
 in V3:
 > 
 >  - Modify `position_delete_record_count` to include a sum of
 >  position deletes across position delete files and DVs
 >  - Keep `position_delete_file_count` to represent the number
 of
 >  position delete files (ignoring DVs)
 >  - Add `dv_count` to represent the number of DVs
 >  - Make delete counts required to avoid ambiguity w.r.t NULL
 vs
 >  unknown.
 > 
 >  Here is the PR with the spec update:
 >  https://github.com/apache/iceberg/pull/12098
 > 
 >  - Anton
 > 
 > >>>
 >

>>>


Re: Iceberg handling of Parquet "2-level" lists

2025-02-06 Thread rdb...@gmail.com
Hi Matt,

If you want to work on getting this change in, I'd be happy to review it. I
think it is fine to support older, incorrectly written data. I looked
briefly at the PR and I think it needs to be updated to at least add tests
and to justify why the changes are correct. It looks like the repetition
and definition level thresholds are calculated based on the outermost level
and then left unadjusted, rather than being calculated from the correct path.

Ryan
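
For context, the difference between the two encodings looks like this when
expressed as Parquet schemas (illustration only, parsed with parquet-mr; the
field names are arbitrary):

```
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.MessageTypeParser;

public class ListEncodings {
  public static void main(String[] args) {
    // legacy "2-level" form: the element field is itself repeated
    MessageType twoLevel = MessageTypeParser.parseMessageType(
        "message legacy {\n"
            + "  optional group tags (LIST) {\n"
            + "    repeated binary array (UTF8);\n"
            + "  }\n"
            + "}");

    // standard "3-level" form required for newly written files:
    // an extra repeated group wraps the optional element
    MessageType threeLevel = MessageTypeParser.parseMessageType(
        "message standard {\n"
            + "  optional group tags (LIST) {\n"
            + "    repeated group list {\n"
            + "      optional binary element (UTF8);\n"
            + "    }\n"
            + "  }\n"
            + "}");

    System.out.println(twoLevel);
    System.out.println(threeLevel);
  }
}
```

The extra level is what shifts the repetition and definition levels for the
element column, which is why the thresholds mentioned above need to be computed
from the full column path rather than from the outermost level.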

On Mon, Feb 3, 2025 at 2:30 PM Matt Wallace 
wrote:

> I’d like to open discussion about the handling and support for so-called
> “2-level” lists in Parquet files by the Iceberg libraries.  This issue has
> been raised in https://github.com/apache/iceberg/issues/9497 and a PR was
> submitted at https://github.com/apache/iceberg/pull/9515.  However, this
> PR was not merged because it was brought up that the Iceberg specification
> says that 2-level lists are not supported by Iceberg.   The Parquet spec
> indicates that 3-level lists should be used for writing new Parquet files,
> but it also says that libraries may implement backwards compatibility.  Is
> there any strong reason not to do this?
>
> I have tested the fix proposed in PR 9515 and it works for me.  A strength
> of the Iceberg spec is that it doesn't require re-writing Parquet files in
> order to efficiently store metadata about these Parquet files.  However, by
> not supporting 2-level lists, Iceberg is cutting off support for a large
> subset of existing Parquet data.
>
> Thanks,
>
> Matt
>
>


Re: [VOTE] Add Geometry and Geography types for V3

2025-02-06 Thread rdb...@gmail.com
+1

Awesome to see this ready to go!

On Thu, Feb 6, 2025 at 12:01 PM Szehon Ho  wrote:

> Hi everyone
>
> We would like to add Geometry and Geography types to the Iceberg V3 spec:
>
> https://github.com/apache/iceberg/pull/10981
>
> This is proposed together with Apache Parquet format change to support
> geospatial data.
>
> https://github.com/apache/parquet-format/pull/240
>
> This vote will be open for at least 72 hours.
>
> [ ] +1 Add these types to the format specification
> [ ] +0
> [ ] -1 Do not add these types to the format specification because...
>
> Thanks,
> Szehon
>
>
>
>
>
>


Re: [DISCUSS] Table name in table metadata

2025-02-10 Thread rdb...@gmail.com
I don't think it is a good idea to add the table name to metadata because
it can easily get stale and would be misleading. Table name is a catalog
concern and we typically try to keep catalog concerns out of the table
space. Instead, I'd suggest updating the error that your users see so that
the error report has the information that you need.

On Mon, Feb 10, 2025 at 9:20 AM Yufei Gu  wrote:

> I see the table identifier (catalog.namespace.table), including the table
> name, as a catalog concept rather than a table property. You can register a
> table with the same location but different names, which makes sense from a
> catalog perspective.
>
> This also makes tables more portable when moving or copying them to
> different locations. What do you think?
>
> Yufei
>
>
> On Mon, Feb 10, 2025 at 2:38 AM Gabor Kaszab 
> wrote:
>
>> Hi Manu,
>>
>> I'm just brainstorming about how this addition could be problematic:
>> Even though it's not recommended, it's feasible to register the same
>> table in multiple catalogs. Different catalogs could use different names
>> for the same underlying table, see register_table(name,
>> metadata_location).
>>
>> Thanks,
>> Gabor
>>
>>
>> On Mon, Feb 10, 2025 at 8:10 AM Manu Zhang 
>> wrote:
>>
>>> Hi all,
>>>
>>> From time to time, users ask me about the status of their Iceberg tables
>>> by sending me a *path*, which they've received in a file system alert
>>> email. Usually I look for the corresponding *table name *and query
>>> metadata tables through Spark SQL.
>>> However, it's not easy to find the table name since it's not saved in
>>> the table metadata[1].
>>>
>>> Do you think it's valuable to optionally add the table name in the table
>>> metadata?
>>>
>>> 1. https://iceberg.apache.org/spec/#table-metadata
>>> 
>>>
>>> Thanks,
>>> Manu
>>>
>>


Re: Table metadata swap not work for REST Catalog (#12134)

2025-02-10 Thread rdb...@gmail.com
Yeah, it sounds like a "register table force" is the right concept here. I
think we want to make sure that table updates remain change-based as
the best practice in the REST API. But there are some irregular use cases
that justify having some mechanism to completely replace the state (like
push-based mirroring). I think it makes sense to revisit mirroring and this
use case and come up with a path forward.
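
For reference, the drop-and-re-register workaround mentioned below can be
written against the existing Catalog API roughly like this (a sketch only; the
gap between the two calls, during which readers see no table, is the reason a
"register table force" style operation is being discussed):

```
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;

public class MetadataSwap {
  static Table resetToMetadata(Catalog catalog, TableIdentifier id, String metadataLocation) {
    // purge=false: drop only the catalog entry, keep data and metadata files
    catalog.dropTable(id, false /* purge */);
    // point the catalog entry at the previously written metadata.json
    return catalog.registerTable(id, metadataLocation);
  }
}
```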

On Mon, Feb 10, 2025 at 3:12 PM Russell Spitzer 
wrote:

> I still would like a "register table" force" option
>
> On Mon, Feb 10, 2025 at 5:06 PM Steve Zhang
>  wrote:
>
>> Thank you Dan for your detailed reply. Based on your explanation, do you
>> think it would be worthwhile to support non-linear or complete metadata
>> replacements in the REST implementation? I am happy to contribute but might
>> need some guidance from the community on the best approach.
>>
>> For additional context, we explored the workaround of dropping the table
>> and re-registering it, with the concern that readers see no table in
>> between. There’s also an attempt to add a force option to the
>> register-table API (https://github.com/apache/iceberg/pull/5327), which
>> would allow for metadata swap on an existing table. However, it was
>> suggested that use TableOperations.commit(base, new) is preferred to
>> achieve atomicity.
>>
>> Thanks,
>> Steve Zhang
>>
>>
>>
>> On Feb 10, 2025, at 1:49 PM, Daniel Weeks  wrote:
>>
>> Hey Steve,
>>
>> I think the issue here is that you're using the commit api in table
>> operations to perform a non-incremental/linear change to the metadata.  The
>> REST implementation is a little more strict in that it builds a set of
>> updates based on the mutations made to the metadata and the commit process
>> applies those changes.  In this scenario, no changes have been made and the
>> call is attempting a complete replacement.
>>
>> The other implementations are just blindly swapping the location, so
>> while that operation does achieve the effect you're looking for, it's not
>> the right semantics for the commit.
>>
>> You might want to consider using the "register table" operation instead,
>> which takes the table identifier and location to perform this type of swap.
>>
>> -Dan
>>
>> On Fri, Feb 7, 2025 at 10:17 AM Steve Zhang
>>  wrote:
>>
>>> Hey Iceberg Experts:
>>>
>>>   I am seeking assistance and insights regarding an issue we’ve
>>> encountered with RESTTableOperations and its inability to support on-demand
>>> table metadata swaps. We are currently adopting the REST-based catalog from
>>> Hive and have noticed a potential gap in the TableOperations.commit()
>>> API. Typically, we use the commit API to revert a table to a previously
>>> known state, as demonstrated below:
>>>
>>> String desiredMetadataPath =
>>> "/var/newdb/table/metadata/3-579b23d1-4ca5-4acf-85ec-081e1699cb83.metadata.json";
>>> ops.commit(ops.current(), TableMetadataParser.read(ops.io(),
>>> desiredMetadataPath));
>>>
>>>   However, this approach is no longer working with the REST-based
>>> catalog. I suspect that the issue may be related to how the update type is
>>> modeled in RESTTableOperations.  I have shared a unit test that reproduces
>>> the problem on https://github.com/apache/iceberg/issues/12134, where it
>>> works on JDBC and in-memory catalogs, but not with RESTCatalog.
>>>
>>> Best Regards,
>>> Steve Zhang
>>>
>>>
>>>
>>>
>>


Re: [VOTE] Release Apache Iceberg 1.8.0 RC0

2025-02-11 Thread rdb...@gmail.com
+1

* Validated signature and checksum
* Ran RAT checks
* Ran tests that didn't require Docker in Java 17

As a follow up, I think that we should move any tests that require Docker
to integrationTest rather than test. We should try not to rely on Docker
containers in normal unit tests because containers can make the tests take
a lot longer. Docker is great for running services that would otherwise be
a mess of mocked interfaces, but I would prefer that we keep them separate
as integration tests.


On Mon, Feb 10, 2025 at 11:45 AM Anurag Mantripragada <
anuragmantr...@gmail.com> wrote:

> +1
>
> I verified signature, checksums, license, built and ran tests locally.
>
> Thanks for taking care of the release, Amogh!
>
> Thanks,
> Anurag
>
> On Sun, Feb 9, 2025 at 10:39 PM Amogh Jahagirdar <2am...@gmail.com> wrote:
>
>> Hi Everyone,
>>
>> I propose that we release the following RC as the official Apache Iceberg
>> 1.8.0 release.
>>
>> The commit ID is c277c2014a1b37fe755cfe37f173b6465bb8cb73
>> * This corresponds to the tag: apache-iceberg-1.8.0-rc0
>> * https://github.com/apache/iceberg/commits/apache-iceberg-1.8.0-rc0
>> *
>> https://github.com/apache/iceberg/tree/c277c2014a1b37fe755cfe37f173b6465bb8cb73
>>
>> The release tarball, signature, and checksums are here:
>> * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.8.0-rc0
>>
>> You can find the KEYS file here:
>> * https://downloads.apache.org/iceberg/KEYS
>>
>> Convenience binary artifacts are staged on Nexus. The Maven repository
>> URL is:
>> *
>> https://repository.apache.org/content/repositories/orgapacheiceberg-1182/
>>
>> Please download, verify, and test.
>>
>> Please vote in the next 72 hours.
>>
>> [ ] +1 Release this as Apache Iceberg 1.8.0
>> [ ] +0
>> [ ] -1 Do not release this because...
>>
>> Only PMC members have binding votes, but other community members are
>> encouraged to cast
>> non-binding votes. This vote will pass if there are 3 binding +1 votes
>> and more binding
>> +1 votes than -1 votes.
>>
>


Re: [VOTE] Add RemoveSchemas update type to REST spec

2025-02-11 Thread rdb...@gmail.com
+1

On Tue, Feb 11, 2025 at 10:50 AM Steve Zhang
 wrote:

> +1 nb
>
> Thanks,
> Steve Zhang
>
>
>
> On Feb 11, 2025, at 10:26 AM, Honah J.  wrote:
>
> +1
>
> On Tue, Feb 11, 2025 at 10:16 AM Christian Thiel <
> christian.t.b...@gmail.com> wrote:
>
>> +1 (non-binding)
>> Thanks Gabor!
>>
>> On Tue, 11 Feb 2025 at 18:30, Yufei Gu  wrote:
>>
>>> +1
>>> Yufei
>>>
>>>
>>> On Tue, Feb 11, 2025 at 8:57 AM Steven Wu  wrote:
>>>
 +1

 On Tue, Feb 11, 2025 at 8:55 AM Russell Spitzer <
 russell.spit...@gmail.com> wrote:

> +1
>
> On Tue, Feb 11, 2025 at 9:15 AM Fokko Driesprong 
> wrote:
>
>> +1
>>
>> Op di 11 feb 2025 om 13:52 schreef Jean-Baptiste Onofré <
>> j...@nanthrax.net>:
>>
>>> +1 (non binding)
>>>
>>> Regards
>>> JB
>>>
>>> On Tue, Feb 11, 2025 at 3:38 AM Gabor Kaszab 
>>> wrote:
>>> >
>>> > Hi Iceberg Community,
>>> >
>>> > I'm working on removing the unused schemas from the table metadata
>>> when running snapshot expiration. One part of this work is a change in 
>>> REST
>>> spec to add a new update type for removing schemas.
>>> >
>>> > I'd like to start a vote on this REST spec change:
>>> > https://github.com/apache/iceberg/pull/12022
>>> >
>>> > This vote will be open for at least 72 hours.
>>> >
>>> > [ ] +1 Add RemoveSchemas update type to REST spec
>>> > [ ] +0
>>> > [ ] -1 I have questions and/or concerns
>>> >
>>> > Thanks,
>>> > Gabor
>>>
>>
>


Re: Contributor guidelines for becoming a committer

2024-12-11 Thread rdb...@gmail.com
I want to make a clarification. I did comment on that PR that we are
describing how the community is operating today, but that was in response
to suggestions to reference the comdev project, lower the requirements, and
add a requirement for loving the project and helping the community. My
intent is to say that we're open to discussion, but that I want to make
sure we get the guidelines published before we make changes.

I was NOT saying that we don't value non-code contributions today. We do,
and we made sure that the doc doesn't talk about only code. It refers to
contributions and reviews, not code.

Ryan

On Wed, Dec 11, 2024 at 12:21 AM Jean-Baptiste Onofré 
wrote:

> Hi Justin
>
> For reference, here's the original PR:
> https://github.com/apache/iceberg/pull/11670
>
> I agree with you, and I commented in the PR about the same points:
> - considering non code contributions (also referring com dev page
> https://community.apache.org/pmc/adding-committers.html as "example")
> - considering same criteria across the project (between iceberg-java,
> iceberg-python, iceberg-go, iceberg-rust, ...) because a committer/PMC
> member is for the whole project (not only on one module)
>
> Ryan and Fokko reassured me: they want to describe how the PMC is
> "operating" today, but they will certainly consider our comments in
> the future (and update the page).
>
> Regards
> JB
>
> On Wed, Dec 11, 2024 at 12:10 AM Justin Mclean 
> wrote:
> >
> > Hi,
> >
> > From a quick glance, this seems far too focused on code contributions.
> Remember, people can become committers from non-code contributions. A
> committer is someone that is committed to the project and may not review or
> write code. It would also be good to set some expectations around the
> amount of activity needed to become a committer rather than saying there
> are none.
> >
> > Kind Regards,
> > Justin
>


[DISCUSS] December board report

2024-12-11 Thread rdb...@gmail.com
Hi everyone,

It’s time to report to the board again. Great to see all the progress here,
and awesome to have our first go release this quarter!

My draft is below. Please reply if there’s anything you’d like to add or
change. Thanks!

Ryan
Description:

Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.
Project Status:

Current project status: Ongoing
Issues for the board: None
Membership Data:

Apache Iceberg was founded 2020-05-19 (5 years ago)
There are currently 32 committers and 21 PMC members in this project.
The Committer-to-PMC ratio is roughly 4:3.

Community changes, past quarter:

   - No new PMC members. Last addition was Amogh Jahagirdar on 2024-08-12.
   - Matthew Topol was added as committer on 2024-12-09
   - Scott Donnelly was added as committer on 2024-12-10

Project Activity:

Releases

   - 1.7.1 was released on 2024-12-06.
   - 1.7.0 was released on 2024-11-08.
   - PyIceberg 0.8.1 was released on 2024-12-06.
   - PyIceberg 0.8.0 was released on 2024-11-18.
   - Go 0.1.0 was released on 2024-11-18.

Table format (v3)

   - Added deletion vectors and synchronous maintenance to improve
   row-level ops
   - Added row lineage fields and requirements for fine-grained row tracking
   - Proposal for geography and geometry types is close to consensus
   - Update to add Parquet’s variant type is approved, waiting on Parquet
   upstream
   - Finalized new type promotion rules

Puffin format

   - Added deletion vector blob type to support DVs in tables

REST catalog spec

   - Added storage credentials passing
   - Added credential refresh
   - Created a docker image for catalog testing
   - Discussing proposal for partial metadata commits
   - Discussed partial metadata loading

Views

   - Discussions about materialized view metadata are ongoing

Java

   - Released new Kafka Connect sink
   - Added default values implementation for Avro
   - Added nanosecond timestamps
   - Added v3 DV support in core, ongoing work in Spark
   - Flink: Made FLIP-27 source the default
   - Spark: Removed Spark 3.3 support
   - Hive: Removing Hive 2.x and 3.x (Iceberg support is in Hive for 4.x
   and on)
   - Pig: Removed the iceberg-pig module that is no longer used

PyIceberg

   - Support: Added Python 3.12, dropped Python 3.8

Rust

   - Support for default values and type promotion in reads
   - Added TableMetadataBuilder
   - Implemented table requirements

Go

   - Produced the first go release!
   - Supports scan planning and reading metadata
   - Supports loading and listing tables with the Glue catalog
   - Supports local and S3 storage

C++

   - Added a C++ repository for a Puffin implementation

Community Health:

The PMC has published guidelines for contributors that want to know more
about
how they can become committers on the Iceberg site. This guide should help
contributors understand how Iceberg and other ASF communities decide and add
committers, and should set expectations clearly. This was the most important
follow up from discussions on the dev list earlier this year, where it
became
clear that contributors did not understand the requirements or process.

The community has started planning a second Iceberg Summit, intended to be
held
in Spring of 2025. The proposal details are being finalized (such as the
members
of the selection committee) and will be submitted for approval in the next
few
weeks.

The community added two new committers this quarter and had a slight
increase in
the number of contributors.

There were also a number of commercial announcements from companies adding
or
expanding support for Iceberg.


Re: Contributor guidelines for becoming a committer

2024-12-11 Thread rdb...@gmail.com
Material contributions that show expertise can easily be documentation or
helping answer questions, right? This was one of the intended ways to
interpret that point.

The same goes for stable and maintainable. Although those are attributes we
commonly think about for code, it's important that contributions in other
areas, like build/CI or docs, are maintainable and don't require extensive
changes.

Also keep in mind that these are example questions that the PMC can ask.
This is not saying that you have to demonstrate every possible area.

Ryan

On Wed, Dec 11, 2024 at 12:25 PM Jean-Baptiste Onofré 
wrote:

> Hi Ryan,
>
> Thanks. I think the confusion is about what you describe like
> "contributions and reviews", reading the page:
>
> "
> ..
> * Has the candidate made independent material contributions to the
> community that show expertise?
> ...
> * Have the candidate’s contributions been stable and maintainable?
> ...
> "
> etc
>
> It sounds like "code contributions and code reviews". I think it's
> what Justin meant.
>
> Regards
> JB
>
> On Wed, Dec 11, 2024 at 6:34 PM rdb...@gmail.com  wrote:
> >
> > I want to make a clarification. I did comment on that PR that we are
> describing how the community is operating today, but that was in response
> to suggestions to reference the comdev project, lower the requirements, and
> add a requirement for loving the project and helping the community. My
> intent is to say that we're open to discussion, but that I want to make
> sure we get the guidelines published before we make changes.
> >
> > I was NOT saying that we don't value non-code contributions today. We
> do, and we made sure that the doc doesn't talk about only code. It refers
> to contributions and reviews, not code.
> >
> > Ryan
> >
> > On Wed, Dec 11, 2024 at 12:21 AM Jean-Baptiste Onofré 
> wrote:
> >>
> >> Hi Justin
> >>
> >> For reference, here's the original PR:
> >> https://github.com/apache/iceberg/pull/11670
> >>
> >> I agree with you, and I commented in the PR about the same points:
> >> - considering non code contributions (also referring com dev page
> >> https://community.apache.org/pmc/adding-committers.html as "example")
> >> - considering same criteria across the project (between iceberg-java,
> >> iceberg-python, iceberg-go, iceberg-rust, ...) because a committer/PMC
> >> member is for the whole project (not only on one module)
> >>
> >> Ryan and Fokko reassured me: they want to describe how the PMC is
> >> "operating" today, but they will certainly consider our comments in
> >> the future (and update the page).
> >>
> >> Regards
> >> JB
> >>
> >> On Wed, Dec 11, 2024 at 12:10 AM Justin Mclean <
> jus...@classsoftware.com> wrote:
> >> >
> >> > Hi,
> >> >
> >> > From a quick glance, this seems far too focused on code
> contributions. Remember, people can become committers from non-code
> contributions. A committer is someone that is committed to the project and
> may not review or write code. It would also be good to set some
> expectations around the amount of activity needed to become a committer
> rather than saying there are none.
> >> >
> >> > Kind Regards,
> >> > Justin
>


Re: [DISCUSS] December board report

2024-12-11 Thread rdb...@gmail.com
I'll update it. Thanks!

(By the way, the Avro default value support was in the Java section)

On Wed, Dec 11, 2024 at 2:00 PM Matt Topol  wrote:

> For the Go release, can we please point out that it supports reading the
> data too, not just metadata?
>
> It produces a stream of Arrow record batches.
>
> On Wed, Dec 11, 2024, 4:22 PM Walaa Eldin Moustafa 
> wrote:
>
>> Hi Ryan,
>>
>> For Table Format V3, we could point out that the default value support
>> for Avro has been merged and support for other formats is ongoing.
>>
>> Thanks,
>> Walaa.
>>
>>
>> On Wed, Dec 11, 2024 at 12:51 PM rdb...@gmail.com 
>> wrote:
>>
>>> Hi everyone,
>>>
>>> It’s time to report to the board again. Great to see all the progress
>>> here, and awesome to have our first go release this quarter!
>>>
>>> My draft is below. Please reply if there’s anything you’d like to add or
>>> change. Thanks!
>>>
>>> Ryan
>>> Description:
>>>
>>> Apache Iceberg is a table format for huge analytic datasets that is
>>> designed
>>> for high performance and ease of use.
>>> Project Status:
>>>
>>> Current project status: Ongoing
>>> Issues for the board: None
>>> Membership Data:
>>>
>>> Apache Iceberg was founded 2020-05-19 (5 years ago)
>>> There are currently 32 committers and 21 PMC members in this project.
>>> The Committer-to-PMC ratio is roughly 4:3.
>>>
>>> Community changes, past quarter:
>>>
>>>- No new PMC members. Last addition was Amogh Jahagirdar on
>>>2024-08-12.
>>>- Matthew Topol was added as committer on 2024-12-09
>>>- Scott Donnelly was added as committer on 2024-12-10
>>>
>>> Project Activity:
>>>
>>> Releases
>>>
>>>- 1.7.1 was released on 2024-12-06.
>>>- 1.7.0 was released on 2024-11-08.
>>>- PyIceberg 0.8.1 was released on 2024-12-06.
>>>- PyIceberg 0.8.0 was released on 2024-11-18.
>>>- Go 0.1.0 was released on 2024-11-18.
>>>
>>> Table format (v3)
>>>
>>>- Added deletion vectors and synchronous maintenance to improve
>>>row-level ops
>>>- Added row lineage fields and requirements for fine-grained row
>>>tracking
>>>- Proposal for geography and geometry types is close to consensus
>>>- Update to add Parquet’s variant type is approved, waiting on
>>>Parquet upstream
>>>- Finalized new type promotion rules
>>>
>>> Puffin format
>>>
>>>- Added deletion vector blob type to support DVs in tables
>>>
>>> REST catalog spec
>>>
>>>- Added storage credentials passing
>>>- Added credential refresh
>>>- Created a docker image for catalog testing
>>>- Discussing proposal for partial metadata commits
>>>- Discussed partial metadata loading
>>>
>>> Views
>>>
>>>- Discussions about materialized view metadata are ongoing
>>>
>>> Java
>>>
>>>- Released new Kafka Connect sink
>>>- Added default values implementation for Avro
>>>- Added nanosecond timestamps
>>>- Added v3 DV support in core, ongoing work in Spark
>>>- Flink: Made FLIP-27 source the default
>>>- Spark: Removed Spark 3.3 support
>>>- Hive: Removing Hive 2.x and 3.x (Iceberg support is in Hive for
>>>4.x and on)
>>>- Pig: Removed the iceberg-pig module that is no longer used
>>>
>>> PyIceberg
>>>
>>>- Support: Added Python 3.12, dropped Python 3.8
>>>
>>> Rust
>>>
>>>- Support for default values and type promotion in reads
>>>- Added TableMetadataBuilder
>>>- Implemented table requirements
>>>
>>> Go
>>>
>>>- Produced the first go release!
>>>- Supports scan planning and reading metadata
>>>- Supports loading and listing tables with the Glue catalog
>>>- Supports local and S3 storage
>>>
>>> C++
>>>
>>>- Added a C++ repository for a Puffin implementation
>>>
>>> Community Health:
>>>
>>> The PMC has published guidelines for contributors that want to know more
>>> about
>>> how they can become committers on the Iceberg site. This guide should
>>> help
>>> contributors understand how Iceberg and other ASF communities decide and
>>> add
>>> committers, and should set expectations clearly. This was the most
>>> important
>>> follow up from discussions on the dev list earlier this year, where it
>>> became
>>> clear that contributors did not understand the requirements or process.
>>>
>>> The community has started planning a second Iceberg Summit, intended to
>>> be held
>>> in Spring of 2025. The proposal details are being finalized (such as the
>>> members
>>> of the selection committee) and will be submitted for approval in the
>>> next few
>>> weeks.
>>>
>>> The community added two new committers this quarter and had a slight
>>> increase in
>>> the number of contributors.
>>>
>>> There were also a number of commercial announcements from companies
>>> adding or
>>> expanding support for Iceberg.
>>>
>>


Re: [DISCUSS] Spark Catalog - Drop vs Drop with Purge

2024-12-11 Thread rdb...@gmail.com
That plan sounds good to me. Thanks, Russell!

On Wed, Dec 11, 2024 at 1:43 PM Yufei Gu  wrote:

> +1 on adding a flag to support the Spark REST client behavior change
> between v1.8 and v2.0.
>
> At the same time, we may further clarify the behavior of the DropTable
> REST API:
> https://github.com/apache/iceberg/blob/feed4e2544b5839fbc2fe040965af3906d053302/open-api/rest-catalog-open-api.yaml#L1099-L1099
>
> Something like: We don't recommend that clients/engines delete table files
> when dropping a table managed by a REST catalog.
>
> Yufei
>
>
> On Wed, Dec 11, 2024 at 10:21 AM Russell Spitzer <
> russell.spit...@gmail.com> wrote:
>
>> Hi Y'all!
>>
>> Today we had a little discussion on the Apache Iceberg Catalog Community
>> Sync
>> about DROP and DROP WITH PURGE. Currently the SparkCatalog implementation
>> inside of the reference library has a unique method of DROP WITH PURGE vs
>> other
>> implementations. The pseudo code is essentially
>>
>>
>> ```
>> use Spark to list files to be removed and delete them
>> send a drop table request to the Catalog
>> ```
>>
>> As opposed to other systems
>>
>> ```
>> send a drop table request to the Catalog with the purge flag enabled
>> ```
>>
>> This has led us to a situation where it becomes difficult for REST
>> Catalogs
>> with custom purge implementations (or those with ignore purge) to
>> work properly with Spark.
>>
>> Bringing this behavior in line with non-Spark implementations
>> would have possibly dramatic impacts on users of the
>> iceberg library but our consensus in the Catalog Sync today was that we
>> should
>> eventually have that be the default behavior. To this end I propose the
>> following
>>
>>
>>- We support a flag to allow current Spark users to delegate to the
>>REST Catalog
>>    (all other catalog behaviors remain the same). PR available here
>>    *(Credit to Tobias, who wrote the PR and brought up this topic)*
>>-  We deprecate the client side delete for Spark
>>- In the next major release (Iceberg 2.0?) we change the behavior
>>officially  to only
>>send through the Drop Purge flag with no client side file removal.
>>- For all non-REST catalog implementations we keep the code the same
>>for legacy compatibility.
>>
>>
>> A user of 1.8 will then have the ability to choose for their Spark DROP
>> PURGES whether
>> or not to purge locally or Remotely for REST
>>
>> A user of 2.0 will only be able to do a remote purge
>>
>> Users of non-REST Catalogs will have no change in behavior.
>>
>>
>> Thanks for your consideration,
>> Russ
>>
>
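
At the Catalog API level, the difference described above boils down to which
side performs the purge. The sketch below is illustrative only (it is not the
actual Spark code path, and it deliberately shows no configuration flag, since
the real property name is defined in the linked PR rather than in this thread):

```
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;

public class DropPurgeBehaviors {
  // Proposed 2.0 behavior for REST: a single call, letting the catalog decide
  // how (and whether) to remove files for DROP TABLE ... PURGE.
  static boolean dropWithCatalogPurge(Catalog catalog, TableIdentifier id) {
    return catalog.dropTable(id, true /* purge */);
  }

  // Current Spark behavior (pseudo-steps): the engine removes files itself and
  // then drops only the catalog entry.
  static boolean dropWithClientPurge(Catalog catalog, TableIdentifier id) {
    // 1. plan the table's data, delete, and metadata files and delete them
    //    (omitted here; this is the engine-side work described above)
    // 2. drop the catalog entry without a catalog-side purge
    return catalog.dropTable(id, false /* purge */);
  }
}
```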


Contributor guidelines for becoming a committer

2024-12-10 Thread rdb...@gmail.com
Hi everyone,

Earlier this year, there were a few threads on this list that highlighted
that it wasn’t clear enough how contributors become committers, so the PMC
put together a doc to explain some of the common questions, including:

   - What are the responsibilities of a committer?
   - How are new committers added?
   - What does the PMC look for?

Our goal is to be clear about what committers do and how we evaluate
candidates and add committers to the project. We just added this to the
site’s community page as The Path from Contributor to Committer.
Please take a look!

Ryan


Re: [DISCUSS] Hive Support

2024-12-16 Thread rdb...@gmail.com
> I'm not sure there's an upgrade path before Spark 4.0. Any ideas?

We can at least separate the concerns. We can remove the runtime modules
that are the main issue. If we compile against an older version of the Hive
metastore module (leaving it unchanged) that at least has a dramatically
reduced surface area for Java version issues. As long as the API is
compatible (and we haven't heard complaints that it is not) then I think
users can override the version in their environments.

Ryan

On Sun, Dec 15, 2024 at 5:55 PM Manu Zhang  wrote:

> Hi Daniel,
> I'll start a vote once I get the PR ready.
>
> Hi Ryan,
> Sorry, I wasn't clear in the last email that the consensus is to upgrade
> Hive metastore support.
>
> Well, I was too optimistic about the upgrade. Spark has only added hive
> 4.0 metastore support recently for Spark 4.0[1] and there will be conflicts
> between Spark's hive 2.3.9 and our hive 4.0 dependencies.
> I'm not sure there's an upgrade path before Spark 4.0. Any ideas?
>
> 1. https://issues.apache.org/jira/browse/SPARK-45265
>
> Thanks,
> Manu
>
>
> On Sat, Dec 14, 2024 at 4:31 AM rdb...@gmail.com  wrote:
>
>> Oh, I think I see. The upgrade to Hive 4 is just for the Hive metastore
>> support? When I read the thread, I thought that we weren't going to change
>> the metastore. That seems reasonable to me. Sorry for the confusion.
>>
>> On Fri, Dec 13, 2024 at 10:24 AM rdb...@gmail.com 
>> wrote:
>>
>>> Sorry, I must have missed something. I don't think that we should
>>> upgrade anything in Iceberg to Hive 4. Why not simply remove the Hive
>>> support entirely? Why would anyone need Hive 4 support from Iceberg when it
>>> is built into Hive 4?
>>>
>>> On Thu, Dec 12, 2024 at 11:03 AM Daniel Weeks  wrote:
>>>
>>>> Hey Manu,
>>>>
>>>> I agree with the direction here, but we should probably hold a quick
>>>> procedural vote just to confirm since this is a significant change in
>>>> support for Hive.
>>>>
>>>> -Dan
>>>>
>>>> On Wed, Dec 11, 2024 at 5:19 PM Manu Zhang 
>>>> wrote:
>>>>
>>>>> Thanks all for sharing your thoughts. It looks there's a consensus on
>>>>> upgrading to Hive 4 and dropping hive-runtime.
>>>>> I've submitted a PR[1] as the first step. Please help review.
>>>>>
>>>>> 1. https://github.com/apache/iceberg/pull/11750
>>>>>
>>>>> Thanks,
>>>>> Manu
>>>>>
>>>>> On Thu, Nov 28, 2024 at 11:26 PM Shohei Okumiya 
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I also prefer option 1. I have some initiatives[1] to improve
>>>>>> integrations between Hive and Iceberg. The current style allows us to
>>>>>> develop both Hive's core and HiveIcebergStorageHandler simultaneously.
>>>>>> That would help us enhance integrations.
>>>>>>
>>>>>> - [1] https://issues.apache.org/jira/browse/HIVE-28410
>>>>>>
>>>>>> Regards,
>>>>>> Okumin
>>>>>>
>>>>>> On Thu, Nov 28, 2024 at 4:17 AM Fokko Driesprong 
>>>>>> wrote:
>>>>>> >
>>>>>> > Hey Cheng,
>>>>>> >
>>>>>> > Thanks for the suggestion. The nightly snapshots are available:
>>>>>> https://repository.apache.org/content/groups/snapshots/org/apache/iceberg/iceberg-core/,
>>>>>> which might help when working on features that are not released yet (eg
>>>>>> Nanosecond timestamps). Besides that, we should run RCs against Hive to
>>>>>> check if everything works as expected.
>>>>>> >
>>>>>> > I'm leaning toward removing Hive 2 and 3 as well.
>>>>>> >
>>>>>> > Kind regards,
>>>>>> > Fokko
>>>>>> >
>>>>>> > Op wo 27 nov 2024 om 20:05 schreef rdb...@gmail.com <
>>>>>> rdb...@gmail.com>:
>>>>>> >>
>>>>>> >> I think that we should remove Hive 2 and Hive 3. We already agreed
>>>>>> to remove Hive 2, but Hive 3 is not compatible with the project anymore 
>>>>>> and
>>>>>> is already EOL and will not see a release to update it so that it can be
>>

Re: [VOTE] Drop Hive runtime

2024-12-18 Thread rdb...@gmail.com
The PR looks good to me. +1

On Tue, Dec 17, 2024 at 6:00 PM Manu Zhang  wrote:

> Hi all,
>
> Thanks for sharing your ideas in the discussion of Hive support[1]. We
> have a consensus to drop Hive runtime and upgrade Hive metastore connector
> to Hive 4. However, it looks like we can't upgrade metastore support till
> Spark 4[2]. Hence, I went on to create a separate PR to remove Hive runtime
> first[3] as suggested by Ryan.
>
> I'd like to raise a vote to confirm whether the community is OK with the
> change.
>
>
> 1. https://lists.apache.org/thread/jfcqfw9vhq4j7h0kwnlf338jgyzcq8s4
> 2. https://github.com/apache/iceberg/pull/11750
> 3. https://github.com/apache/iceberg/pull/11801
>
> Thanks,
> Manu
>
>


Re: [DISCUSS] Relocate Parquet to Iceberg Core

2024-12-18 Thread rdb...@gmail.com
I was the person that originally suggested that we not move iceberg-parquet
into core, so it would probably help if I gave some context for my
rationale as I remember it and what's changed since then.

I pushed back on the original suggestion to move Parquet classes into core
because it wasn't clear that we needed to move them and couldn't just
introduce an interface or abstraction that could be implemented by the
iceberg-parquet module or other formats, like ORC. The idea was to keep the
core module free of unnecessary dependencies. Most of the pieces of this
strategy exist already because all of the core metadata classes are
interfaces, like `DataFile` and `ManifestFile`, and there are writer
interfaces that can be used with them, like `FileAppender`.

I think it's a good thing to reconsider this because a few things have
become more clear in the meantime. First, the exploration work to plug in
formats through an interface never materialized, so we should reconsider
rather than blocking that work. Second, I think the Iceberg ecosystem has
moved from being mostly format-agnostic because Parquet is the format that
is widely supported and optimized (Comet, for example, supports Parquet,
not to mention vendor offerings). Third, the use case has expanded from
stats files to other metadata.

Now that we are looking at introducing more columnar metadata and have not
added an abstraction to support multiple formats in the last year or so,
I'm reconsidering the value of being format-agnostic. The original choice
to use Avro for metadata was to maximize compatibility, and that compatibility
argument now increasingly favors Parquet, given how widely it is supported. If
we are going to keep the surface area small by specifying Parquet for
columnar metadata, then we don't need to do the extra work to add more
abstractions.

And now after having written that, I think I'm now convinced. +1 for
merging iceberg-parquet into core.
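
For concreteness, the "generic path in 'iceberg-data'" referenced below looks
roughly like this today: format-agnostic interfaces (FileAppender and the
metadata classes) on one side, with the Parquet-specific builder living in
iceberg-parquet and iceberg-data (sketch only; a local file stands in for a
real OutputFile):

```
import java.io.File;
import org.apache.iceberg.Files;
import org.apache.iceberg.Schema;
import org.apache.iceberg.data.GenericRecord;
import org.apache.iceberg.data.Record;
import org.apache.iceberg.data.parquet.GenericParquetWriter;
import org.apache.iceberg.io.FileAppender;
import org.apache.iceberg.parquet.Parquet;
import org.apache.iceberg.types.Types;

public class GenericParquetWrite {
  public static void main(String[] args) throws Exception {
    Schema schema = new Schema(
        Types.NestedField.required(1, "id", Types.LongType.get()),
        Types.NestedField.optional(2, "name", Types.StringType.get()));

    Record record = GenericRecord.create(schema);
    record.setField("id", 1L);
    record.setField("name", "a");

    // FileAppender is the format-agnostic interface; the builder below is the
    // Parquet-specific half that currently lives outside iceberg-core
    try (FileAppender<Record> appender =
        Parquet.write(Files.localOutput(new File("/tmp/example.parquet")))
            .schema(schema)
            .createWriterFunc(GenericParquetWriter::buildWriter)
            .build()) {
      appender.add(record);
    }
  }
}
```

Merging iceberg-parquet into core would let core code construct writers like
this directly, which is currently difficult because core cannot depend on these
classes.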

On Mon, Dec 9, 2024 at 3:21 AM Ajantha Bhat  wrote:

> Thanks Dan for the reply.
>
> This is also a good time to consider adding a native parquet read/write
>> path for use in core as the generic path in 'iceberg-data' isn't ideal.
>> Parquet metadata has been brought up in relation to improving stats
>> handling (allowing tracking of more column stats without impacting planning
>> performance), improving stats representations along with other possible
>> benefits like improved compression and scan performance.
>
>
> These two proposals are super interesting. I don't see any public
> discussion on the same. Could you please provide more details or point
> me to the discussions? +1 for moving the parquet module to core if it helps
> for these proposals.
>
> I feel like ORC is a separate discussion and while we may want to include
>> it, I wouldn't say there's a definitive answer unless we know there is
>> adequate investment in it.
>
> Having a module for ORC but not Parquet looks odd. I don't think the
> effort is huge for moving these files.
> I can take it up as well.
>
> I wasn't aware you had a PR as part of the prior discussion, but I'm happy
>> to revisit that if we decide this is a reasonable path forward.
>
> Sure. I can revive my PR.
>
> For those who are looking for previous discussion on the same topic last
> year.
> https://lists.apache.org/thread/8m6f3k7b425czktzf22q902vxgp2y10r
>
> - Ajantha
>
>
>
> On Sat, Dec 7, 2024 at 10:26 AM Daniel Weeks  wrote:
>
>> Hey Ajantha,
>>
>> I understand it was discussed before, but I think a lot of recent
>> discussions around improvements for parquet metadata/stats/etc is good
>> justification for revisiting the earlier discussion.
>>
>> Parquet metadata has been brought up in relation to improving stats
>> handling (allowing tracking of more column stats without impacting planning
>> performance), improving stats representations along with other possible
>> benefits like improved compression and scan performance.
>>
>> The original decision was more narrowly focused on the stats case and
>> there were viable (though possibly not ideal) workarounds to keep the
>> existing separation of subprojects, but at this point I see this more as a
>> barrier to exploring some of these ideas as it's quite difficult to allow
>> core to work directly with parquet.
>>
>> This is also a good time to consider adding a native parquet read/write
>> path for use in core as the generic path in 'iceberg-data' isn't ideal
>> (this might also be useful for projects like Kafka Connect).
>>
>> I feel like ORC is a separate discussion and while we may want to include
>> it, I wouldn't say there's a definitive answer unless we know there is
>> adequate investment in it.
>>
>> I wasn't aware you had a PR as part of the prior discussion, but I'm
>> happy to revisit that if we decide this is a reasonable path forward.
>>
>> -Dan
>>
>>
>>
>> On Fri, Dec 6, 2024 at 5:31 PM Ajantha Bhat 
>> wrote:
>>
>>> Hi Dan,
>>>
>>> I proposed the same last year while working on partition stats.
>>> I

Re: [DISCUSS] Hive Support

2024-12-13 Thread rdb...@gmail.com
Oh, I think I see. The upgrade to Hive 4 is just for the Hive metastore
support? When I read the thread, I thought that we weren't going to change
the metastore. That seems reasonable to me. Sorry for the confusion.

On Fri, Dec 13, 2024 at 10:24 AM rdb...@gmail.com  wrote:

> Sorry, I must have missed something. I don't think that we should upgrade
> anything in Iceberg to Hive 4. Why not simply remove the Hive support
> entirely? Why would anyone need Hive 4 support from Iceberg when it is
> built into Hive 4?
>
> On Thu, Dec 12, 2024 at 11:03 AM Daniel Weeks  wrote:
>
>> Hey Manu,
>>
>> I agree with the direction here, but we should probably hold a quick
>> procedural vote just to confirm since this is a significant change in
>> support for Hive.
>>
>> -Dan
>>
>> On Wed, Dec 11, 2024 at 5:19 PM Manu Zhang 
>> wrote:
>>
>>> Thanks all for sharing your thoughts. It looks there's a consensus on
>>> upgrading to Hive 4 and dropping hive-runtime.
>>> I've submitted a PR[1] as the first step. Please help review.
>>>
>>> 1. https://github.com/apache/iceberg/pull/11750
>>>
>>> Thanks,
>>> Manu
>>>
>>> On Thu, Nov 28, 2024 at 11:26 PM Shohei Okumiya 
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I also prefer option 1. I have some initiatives[1] to improve
>>>> integrations between Hive and Iceberg. The current style allows us to
>>>> develop both Hive's core and HiveIcebergStorageHandler simultaneously.
>>>> That would help us enhance integrations.
>>>>
>>>> - [1] https://issues.apache.org/jira/browse/HIVE-28410
>>>>
>>>> Regards,
>>>> Okumin
>>>>
>>>> On Thu, Nov 28, 2024 at 4:17 AM Fokko Driesprong 
>>>> wrote:
>>>> >
>>>> > Hey Cheng,
>>>> >
>>>> > Thanks for the suggestion. The nightly snapshots are available:
>>>> https://repository.apache.org/content/groups/snapshots/org/apache/iceberg/iceberg-core/,
>>>> which might help when working on features that are not released yet (eg
>>>> Nanosecond timestamps). Besides that, we should run RCs against Hive to
>>>> check if everything works as expected.
>>>> >
>>>> > I'm leaning toward removing Hive 2 and 3 as well.
>>>> >
>>>> > Kind regards,
>>>> > Fokko
>>>> >
>>>> > Op wo 27 nov 2024 om 20:05 schreef rdb...@gmail.com >>> >:
>>>> >>
>>>> >> I think that we should remove Hive 2 and Hive 3. We already agreed
>>>> to remove Hive 2, but Hive 3 is not compatible with the project anymore and
>>>> is already EOL and will not see a release to update it so that it can be
>>>> compatible. Anyone using the existing Hive 3 support should be able to
>>>> continue using older releases.
>>>> >>
>>>> >> In general, I think it's a good idea to let people use older
>>>> releases when these situations happen. It is difficult for the project to
>>>> continue to support libraries that are EOL and I don't think there's a
>>>> great justification for it, considering Iceberg support in Hive 4 is native
>>>> and much better!
>>>> >>
>>>> >> On Wed, Nov 27, 2024 at 7:12 AM Cheng Pan  wrote:
>>>> >>>
>>>> >>> That said, it would be helpful if they continue running
>>>> >>> tests against the latest stable Hive releases to ensure that any
>>>> >>> changes don’t unintentionally break something for Hive, which would
>>>> be
>>>> >>> beyond our control.
>>>> >>>
>>>> >>>
>>>> >>> I believe we should continue maintaining a Hive Iceberg runtime
>>>> test suite with the latest version of Hive in the Iceberg repository.
>>>> >>>
>>>> >>>
>>>> >>> I think we can keep some basic Hive 4 tests in the iceberg repo
>>>> >>>
>>>> >>>
>>>> >>> Instead of running basic tests in the Iceberg repo, maybe it makes
>>>> more sense to let Iceberg publish daily snapshot jars to Nexus and have a
>>>> daily CI in Hive consume those jars and run the full Iceberg tests?
>>>> >>>
>>>> >>> Thanks,
>>>> >>> Cheng Pan
>>>> >>>
>>>>
>>>


Re: [DISCUSS] Hive Support

2024-12-13 Thread rdb...@gmail.com
Sorry, I must have missed something. I don't think that we should upgrade
anything in Iceberg to Hive 4. Why not simply remove the Hive support
entirely? Why would anyone need Hive 4 support from Iceberg when it is
built into Hive 4?

On Thu, Dec 12, 2024 at 11:03 AM Daniel Weeks  wrote:

> Hey Manu,
>
> I agree with the direction here, but we should probably hold a quick
> procedural vote just to confirm since this is a significant change in
> support for Hive.
>
> -Dan
>
> On Wed, Dec 11, 2024 at 5:19 PM Manu Zhang 
> wrote:
>
>> Thanks all for sharing your thoughts. It looks like there's a consensus on
>> upgrading to Hive 4 and dropping hive-runtime.
>> I've submitted a PR[1] as the first step. Please help review.
>>
>> 1. https://github.com/apache/iceberg/pull/11750
>>
>> Thanks,
>> Manu
>>
>> On Thu, Nov 28, 2024 at 11:26 PM Shohei Okumiya 
>> wrote:
>>
>>> Hi all,
>>>
>>> I also prefer option 1. I have some initiatives[1] to improve
>>> integrations between Hive and Iceberg. The current style allows us to
>>> develop both Hive's core and HiveIcebergStorageHandler simultaneously.
>>> That would help us enhance integrations.
>>>
>>> - [1] https://issues.apache.org/jira/browse/HIVE-28410
>>>
>>> Regards,
>>> Okumin
>>>
>>> On Thu, Nov 28, 2024 at 4:17 AM Fokko Driesprong 
>>> wrote:
>>> >
>>> > Hey Cheng,
>>> >
>>> > Thanks for the suggestion. The nightly snapshots are available:
>>> https://repository.apache.org/content/groups/snapshots/org/apache/iceberg/iceberg-core/,
>>> which might help when working on features that are not released yet (eg
>>> Nanosecond timestamps). Besides that, we should run RCs against Hive to
>>> check if everything works as expected.
>>> >
>>> > I'm leaning toward removing Hive 2 and 3 as well.
>>> >
>>> > Kind regards,
>>> > Fokko
>>> >
>>> > Op wo 27 nov 2024 om 20:05 schreef rdb...@gmail.com >> >:
>>> >>
>>> >> I think that we should remove Hive 2 and Hive 3. We already agreed to
>>> remove Hive 2, but Hive 3 is not compatible with the project anymore and is
>>> already EOL and will not see a release to update it so that it can be
>>> compatible. Anyone using the existing Hive 3 support should be able to
>>> continue using older releases.
>>> >>
>>> >> In general, I think it's a good idea to let people use older releases
>>> when these situations happen. It is difficult for the project to continue
>>> to support libraries that are EOL and I don't think there's a great
>>> justification for it, considering Iceberg support in Hive 4 is native and
>>> much better!
>>> >>
>>> >> On Wed, Nov 27, 2024 at 7:12 AM Cheng Pan  wrote:
>>> >>>
>>> >>> That said, it would be helpful if they continue running
>>> >>> tests against the latest stable Hive releases to ensure that any
>>> >>> changes don’t unintentionally break something for Hive, which would
>>> be
>>> >>> beyond our control.
>>> >>>
>>> >>>
>>> >>> I believe we should continue maintaining a Hive Iceberg runtime test
>>> suite with the latest version of Hive in the Iceberg repository.
>>> >>>
>>> >>>
>>> >>> I think we can keep some basic Hive 4 tests in the iceberg repo
>>> >>>
>>> >>>
>>> >>> Instead of running basic tests in the Iceberg repo, maybe it makes
>>> more sense to let Iceberg publish daily snapshot jars to Nexus and have a
>>> daily CI in Hive consume those jars and run the full Iceberg tests?
>>> >>>
>>> >>> Thanks,
>>> >>> Cheng Pan
>>> >>>
>>>
>>


Re: [DISCUSS] Standardizing Error Handling in the Iceberg Spark Module

2024-12-19 Thread rdb...@gmail.com
This looks like a good improvement to me. Thanks, Huaxin!

On Wed, Dec 18, 2024 at 11:37 PM huaxin gao  wrote:

> Hi everyone,
>
> While working on integrating Spark 4.0 with Iceberg, I noticed that error
> conditions in the Spark module are primarily validated through the content
> of error messages. I need to revise some of the validation because the
> error messages have changed in Spark 4.0. Spark has standardized error
> handling by introducing error classes and SQLSTATE codes since 3.1. I would
> like to align the error handling in the Iceberg Spark module with Spark's
> standard error handling framework, specifically by shifting from validating
> error message content to validating error classes and SQLSTATE codes.  I
> have prepared a quick write-up
> 
> for background information and an example. Please let me know what you
> think. If there are no objections to this proposal, I will begin updating
> the error handling in the Spark module.
>
> Thanks,
> Huaxin
>
>
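
As an illustration of the direction described above, here is a minimal sketch of
what message-independent validation could look like. It assumes JUnit 5, a local
SparkSession, and that AnalysisException implements Spark's SparkThrowable
interface; the error class and SQLSTATE values shown are illustrative and are not
taken from the PR or the write-up.

    import static org.junit.jupiter.api.Assertions.assertEquals;
    import static org.junit.jupiter.api.Assertions.assertThrows;

    import org.apache.spark.SparkThrowable;
    import org.apache.spark.sql.AnalysisException;
    import org.apache.spark.sql.SparkSession;
    import org.junit.jupiter.api.Test;

    public class TestErrorConditionValidation {
      @Test
      public void validatesErrorClassInsteadOfMessageText() {
        SparkSession spark = SparkSession.builder().master("local[1]").getOrCreate();
        AnalysisException e =
            assertThrows(
                AnalysisException.class,
                () -> spark.sql("SELECT * FROM non_existing_table").collect());
        SparkThrowable t = (SparkThrowable) e;
        // Assert on the stable error class and SQLSTATE rather than the message
        // text, which differs between Spark 3.5 and 4.0.
        assertEquals("TABLE_OR_VIEW_NOT_FOUND", t.getErrorClass());
        assertEquals("42P01", t.getSqlState());
      }
    }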


Re: [Discuss] Proposal to Adjust Catalog Sync Schedule & Cancel Next Wednesday’s Meeting

2024-11-21 Thread rdb...@gmail.com
+1 for every 3 weeks instead of 2 out of 3.

On Thu, Nov 21, 2024 at 10:57 AM Dmitri Bourlatchkov
 wrote:

> Thanks for keeping track of this, Honah!
>
> +1 to keep the Wednesday 9 AM Pacific Time meeting every 3 weeks
>
> I'm ok to pause the 8 PM PST meeting - this time does not work for me
> personally.
>
> As for two time slots recurring every 6 weeks each, IMHO, if people in
> those meetings end up being distinct groups because of time zones being too
> far off, I'd prefer to have the same 3 week cadence. It won't cause extra
> burden if there's no overlap.
>
> Cheers,
> Dmitri.
>
> On Wed, Nov 20, 2024 at 6:48 PM Honah J.  wrote:
>
>> Hi everyone,
>>
>> Thank you all for your participation in the catalog community sync so
>> far! I'm writing to discuss changes to the meeting schedule to better fit
>> everyone's availability and make our discussions more productive.
>>
>> Here’s a quick summary of the current cadence
>> :
>>
>>- Wednesday 9 AM Pacific Time: Recurring every 3 weeks since August 7
>>(most recent: Nov 20).
>>- Wednesday 8 PM Pacific Time: Recurring every 3 weeks since August
>>14 (most recent: Nov 6).
>>
>> Given the upcoming U.S. Thanksgiving holiday, I suggest we *cancel* the 
>> *November
>> 27, 8 PM* meeting.
>>
>> Additionally, as Jack mentioned in a previous thread
>> ,
>> there’s been discussion about reducing the meeting frequency to once every
>> 3 weeks (instead of twice). With that in mind, I propose the following as a
>> starting point for discussion:
>>
>>- Keep the Wednesday 9 AM Pacific Time meeting every 3 weeks
>>- Pause the Wednesday 8 PM Pacific Time meeting for now and explore a
>>suitable alternative time for participants in different time zones.
>>- Once we find a time that works for other regions, switch to a new
>>cadence:
>>   - Wednesday 9 AM PT: Recurring every 6 weeks.
>>   - [Alternative Time for Other Time Zones]: Recurring every 6
>>   weeks, offset by 3 weeks from the 9 AM meeting.
>>
>> I’d love to hear your thoughts on this or any suggestions for alternative
>> approaches that might work better for the community. Please share your
>> feedback, and let’s discuss what works best for everyone and coordinate
>> accordingly.
>>
>> Best regards,
>> Honah
>>
>


Re: [VOTE] Deprecate and remove last-column-id

2024-11-21 Thread rdb...@gmail.com
+1

On Thu, Nov 21, 2024 at 5:22 AM Jean-Baptiste Onofré 
wrote:

> +1
>
> Regards
> JB
>
> On Tue, Nov 19, 2024 at 9:18 AM Fokko Driesprong  wrote:
> >
> > Hi everyone,
> >
> > Based on the positive feedback on the [DISCUSS] thread and the
> pull-request on GitHub, I would like to raise a vote to deprecate and
> remove the last-column-id field from the spec. Since this is a spec change,
> please vote in the next 72 hours:
> >
> > [ ] +1, commit the proposed spec changes
> > [ ] 0
> > [ ] -1, do not make these changes because...
> >
> > Kind regards,
> > Fokko
>


Re: [DISCUSS] Deprecate embedded manifests

2024-11-21 Thread rdb...@gmail.com
Can we safely deprecate and remove this? The manifest list is required in
v2, but the spec has stated for a long time that v1 tables can use manifests
rather than a manifest list. It’s unlikely, but it would be valid for other
implementations to produce it.

I would understand if other implementations chose to fail tables that don’t
have a manifest list to avoid adding code to handle manifests, but I don’t
think that there’s much value in removing support from the Java
implementation.

Instead, what about discussing how to deprecate support for older format
versions? That seems like the main issue here. Once the majority of
implementations move to newer versions, we would like to deprecate the old
ones.

On Thu, Nov 21, 2024 at 11:01 AM Szehon Ho  wrote:

> +1, great to have less possible paths.
>
> Thanks
> Szehon
>
> On Thu, Nov 21, 2024 at 10:33 AM Steve Zhang
>  wrote:
>
>> +1 to deprecate
>>
>> Thanks,
>> Steve Zhang
>>
>>
>>
>> On Nov 19, 2024, at 3:32 AM, Fokko Driesprong  wrote:
>>
>> Hi everyone,
>>
>> I would like to propose to deprecate embedded manifests
>> . This was used
>> before the manifest-list was introduced, but I don't think they have been
>> used since the project was open-sourced, and it would be good to
>> officially deprecate them from the spec. It is only supported by Iceberg
>> Java today, and I haven't seen any requests for PyIceberg to add support
>> for this.
>>
>> Any questions or concerns about deprecating the embedded manifests?
>>
>> Kind regards,
>> Fokko Driesprong
>>
>>
>>


Re: [DISCUSS, VOTE] OpenAPI Metadata Update for EnableRowLineage

2025-01-22 Thread rdb...@gmail.com
+1

On Wed, Jan 22, 2025 at 2:51 PM Russell Spitzer 
wrote:

> Hey Y'all
>
> Yet another Row Lineage Spec update. This adds a MetadataUpdate
> EnableRowLineage to the REST Spec. We briefly talked today
> about an alternative EnableFeature(Feature Name) API instead but in the
> absence of other features it doesn't seem
> like that's really a requirement now.
>
> I agreed that if we ever do have another feature we want to enable in a
> similar way I would take the blame for adding
> this API rather than a generic one.
>
> That said please take a look
>
> https://github.com/apache/iceberg/pull/12050
>
>
> Note: We only allow enabling row lineage, it cannot be disabled.
>
> Thanks for your time,
> Russ
>


Re: [DISCUSS] Support keeping at most N snapshots

2025-01-21 Thread rdb...@gmail.com
I think you could achieve what you're looking for by setting the age to 1
ms and the minimum number of snapshots to keep. Then snapshot expiration
would always expire all snapshots other than the min number, getting you
what you want.

It probably wouldn't make sense to set a maximum as well. Right now, the
min number of snapshots is a requirement that keeps snapshots around even
if they are eligible to be removed because of expiration. A maximum would
work differently and would be a second way to consider a snapshot eligible
for expiration -- or else we would have to redefine how the min works. I
think that would be a bit confusing to configure in practice because we'd
need to define these cases for which configuration takes precedence. It
seems much simpler to me to use the min snapshots setting with a very short
expiration interval if you want to always keep some number of snapshots
rather than using the age-based expiration.
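
Concretely, the workaround above can be expressed with the existing
ExpireSnapshots API; the sketch below assumes an already-loaded
org.apache.iceberg.Table and is only illustrative:

    import org.apache.iceberg.Table;

    public class KeepLastNSnapshots {
      // Keep at most the N most recent snapshots by expiring everything older
      // than "now" while retaining the last N.
      public static void keepLastN(Table table, int n) {
        table
            .expireSnapshots()
            .expireOlderThan(System.currentTimeMillis())
            .retainLast(n)
            .commit();
      }
    }

The same effect can be configured declaratively by setting
history.expire.max-snapshot-age-ms to a very small value together with
history.expire.min-snapshots-to-keep.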

On Tue, Jan 21, 2025 at 9:51 AM Daniel Weeks  wrote:

> Hey Manu,
>
> I think I understand what you're trying to achieve here and I feel like
> the most important part is to have an updated version of the retention
> procedure  to
> clearly state how this interacts with the other settings as part of the PR.
>
> -Dan
>
> On Thu, Jan 16, 2025 at 8:37 PM Yufei Gu  wrote:
>
>> It makes sense to have an option to control the max number of snapshots.
>> Thanks Manu for the proposal.
>>
>> Yufei
>>
>>
>> On Thu, Jan 16, 2025 at 7:46 PM Manu Zhang 
>> wrote:
>>
>>> Hi all,
>>>
>>> Do you have more comments on this feature? Do you have concerns about
>>> adding a new field to SnapshotRef?
>>>
>>> Thanks,
>>> Manu
>>>
>>> On Tue, Jan 7, 2025 at 2:37 PM Manu Zhang 
>>> wrote:
>>>
 Hi Ajantha,

 `history.expire.min-snapshots-to-keep` is the *minimum number of
 snapshots* we can keep. I'm proposing to decide the *maximum number of
 snapshots* to keep by count rather than by age.

 Thanks,
 Manu

 On Tue, Jan 7, 2025 at 2:18 PM Ajantha Bhat 
 wrote:

> Hi Manu,
>
> We already have `retain_last` and
> `history.expire.min-snapshots-to-keep` to retain the snapshots based on
> count. Can you please elaborate on why can't we use the same?
>
> - Ajantha
>
> On Tue, Jan 7, 2025 at 11:33 AM Walaa Eldin Moustafa <
> wa.moust...@gmail.com> wrote:
>
>> Thanks Manu for starting this discussion. That is definitely a valid
>> feature. I have always found that maintaining snapshots by day makes it
>> harder to provide different types of guarantees/contracts, especially when
>> table change rates are diverse or irregular. Maintaining by snapshot count
>> makes
>> a lot of sense and prevents table sizes from growing excessively when
>> change rate is frequent.
>>
>> Thanks,
>> Walaa.
>>
>>
>> On Mon, Jan 6, 2025 at 8:38 PM Manu Zhang 
>> wrote:
>>
>>> Hi all,
>>>
>>> While maintaining Iceberg tables for our customers, I find it's
>>> difficult to set a default snapshot expiration time
>>> (`history.expire.max-snapshot-age-ms`) for different workloads. The 
>>> default
>>> value of 5 days looks good for daily batch jobs but is too long for
>>> frequently-updated jobs.
>>>
>>> I'm thinking about adding another option like
>>> `history.expire.max-snapshots-to-keep` to keep at most N snapshots. A
>>> snapshot will be removed when either its age is larger than
>>> `history.expire.max-snapshot-age-ms` or it's the oldest in
>>> `history.expire.max-snapshots-to-keep + 1` snapshots. I've created a 
>>> draft
>>> PR to demo the idea[1].
>>>
>>> If you agree this is a valid feature request, we also need to update
>>> SnapshotRef[2] adding a new field `max-snapshots-to-keep`. Will there 
>>> be a
>>> compatibility issue or too much cost to maintain compatibility? My
>>> experiment shows many parsers need to be updated.
>>>
>>> I'd like to hear your thoughts on this.
>>>
>>> 1. https://github.com/apache/iceberg/pull/11879
>>> 2. https://iceberg.apache.org/spec/#snapshot-references
>>>
>>> Happy New Year!
>>> Manu
>>>
>>


Re: [VOTE] Document Snapshot Summary Optional Fields as Subsection of Appendix F in Spec

2025-01-21 Thread rdb...@gmail.com
+1

On Tue, Jan 21, 2025 at 12:20 PM Honah J.  wrote:

> Hi everyone,
>
> In the last VOTE
>  thread
> on documenting snapshot summary optional fields, we decided to move the
> documentation to a subsection of Appendix F – Implementation Notes. Since
> this is a significant change, I canceled the previous vote and would like
> to initiate a new one for the updated spec change.
>
> Key changes since the last vote:
>
>- Relocated the documentation to Appendix F – Implementation Details
>- Removed the statement: "Metrics must be accurate if written."
>- Added "engine-name" and "engine-version" under Other Optional Fields
>
>
> Useful links:
>
>- [DISCUSS] thread
>
>- pull request
>
>- Previous Vote
>
>
> Since this is a spec change, please vote in the next 72 hours:
>
> [ ] +1, commit the proposed spec changes
> [ ] 0
> [ ] -1, do not make these changes because...
>
> Best regards,
> Honah
>


Re: [VOTE] REST API changes for freshness-aware table loading

2025-01-24 Thread rdb...@gmail.com
+1

Thanks, Gabor!

On Fri, Jan 24, 2025 at 9:25 AM Christian Thiel 
wrote:

> +1 (non binding). Thanks Gabor!
>
> Daniel Weeks  schrieb am Fr. 24. Jan. 2025 um 17:15:
>
>> +1
>>
>> On Wed, Jan 22, 2025 at 1:19 PM Yufei Gu  wrote:
>>
>>> +1. Thanks, Gabor! A bit more context, we synced on this spec change
>>> during this morning's community catalog meeting and reached a general
>>> consensus on the approach.
>>>
>>> Yufei
>>>
>>>
>>> On Wed, Jan 22, 2025 at 12:05 PM Gabor Kaszab 
>>> wrote:
>>>
 Hi Iceberg Community,

 I have a PR for changing the REST spec
  as part of the
 freshness-aware table loading proposal. I feel that the spec part is
 something that the participants have agreed on so I think this is the time
 to start a vote.

 Links:
 PR for changing REST spec
  (this is the scope of
 the vote)
 Doc for freshness-aware table loading
 
 [DISCUSS] thread
 

 please vote in the next 72 hours:

 [ ] +1, commit the proposed spec changes
 [ ] 0
 [ ] -1, do not make these changes because...

 Thanks,
 Gabor

>>>


Re: [VOTE] Add initial/write defaults to REST spec

2025-01-24 Thread rdb...@gmail.com
+1

On Fri, Jan 24, 2025 at 2:25 PM Yufei Gu  wrote:

> +1
> Yufei
>
>
> On Fri, Jan 24, 2025 at 2:15 PM Amogh Jahagirdar <2am...@gmail.com> wrote:
>
>> +1 (binding)
>>
>> On Fri, Jan 24, 2025 at 2:02 PM Jean-Baptiste Onofré 
>> wrote:
>>
>>> +1 (non binding)
>>>
>>> It corresponds to the spec (initial/write default values).
>>>
>>> Thanks !
>>> Regards
>>> JB
>>>
>>> On Fri, Jan 24, 2025 at 5:11 PM Daniel Weeks  wrote:
>>> >
>>> > Everyone,
>>> >
>>> > I'd like to hold a quick vote for a small addition to the REST spec to
>>> include the initial/write defaults introduced in v3 as optional fields to
>>> the table schema.
>>> >
>>> > PR: OpenAPI: add initial/write defaults to schema #12094
>>> >
>>> > Thanks,
>>> > Dan
>>>
>>


Re: [DISCUSS/VOTE] Add in ChangeLog Reserved Field IDs to Spec and Decrement Row Lineage Reserved IDs

2025-01-28 Thread rdb...@gmail.com
+1

Thanks for catching this, Russell!

On Mon, Jan 27, 2025 at 12:51 PM Russell Spitzer 
wrote:

> Thanks everyone, I'll be merging that fix ASAP
>
> On Mon, Jan 27, 2025 at 6:01 AM Fokko Driesprong  wrote:
>
>> +1
>>
>> Op ma 27 jan 2025 om 10:54 schreef Honah J. :
>>
>>> +1, thanks for driving this!
>>>
>>> Best Regards,
>>> Honah
>>>
>>> On Sun, Jan 26, 2025 at 3:20 PM Steven Wu  wrote:
>>>
 +1

 On Sun, Jan 26, 2025 at 3:01 PM John Zhuge  wrote:

> +1 (non-binding)
>
> John Zhuge
>
>
> On Sun, Jan 26, 2025 at 2:59 PM Aihua Xu  wrote:
>
>> +1 (non-binding).
>>
>> Thanks for fixing it.
>>
>> On Sun, Jan 26, 2025 at 11:30 AM Anton Okolnychyi <
>> aokolnyc...@gmail.com> wrote:
>>
>>> +1 good catch
>>>
>>> нд, 26 січ. 2025 р. о 08:33 Steve Zhang
>>>  пише:
>>>
 +1 (non-binding)

 Thanks,
 Steve Zhang



 On Jan 25, 2025, at 10:48 AM, huaxin gao 
 wrote:

 +1 (non-binding)





Re: [VOTE] Document Snapshot Summary Optional Fields as Appendix in Spec

2025-01-16 Thread rdb...@gmail.com
+1

Thanks, Honah!

On Thu, Jan 16, 2025 at 3:38 PM Honah J.  wrote:

> Hi everyone,
>
> Thank you for your votes and valuable suggestions. I have updated the PR
> to remove the statement, "Metrics must be accurate if written," and have
> relocated the relevant documentation to Appendix F - Implementation Notes.
>
> Updated PR: https://github.com/apache/iceberg/pull/11660
>
> Given the recent restructuring and additional reviews/modifications in the
> PR, I would like to cancel the current vote and initiate a new one later.
> This will ensure that all votes are based on the latest version of the spec
> change.
>
> Best regards,
> Honah
>
> On Wed, Jan 15, 2025 at 10:16 AM Daniel Weeks  wrote:
>
>> I don't think we can include the statement: "Metrics must be accurate if
>> written"
>>
>> Equality deletes make this requirement very difficult to satisfy for some
>> of the fields.
>>
>> The reason I suggested appendix was that we shouldn't be adding new
>> requirements, just documenting field names for consistency across
>> implementations.
>>
>> -Dan
>>
>> On Wed, Jan 15, 2025 at 8:07 AM Russell Spitzer <
>> russell.spit...@gmail.com> wrote:
>>
>>> @Daniel Weeks what do you think?
>>>
>>> I know both you and I had the opposite feeling here.
>>>
>>> On Tue, Jan 14, 2025 at 6:21 PM rdb...@gmail.com 
>>> wrote:
>>>
>>>> The content looks correct to me, but because this states a requirement
>>>> ("Metrics must be accurate if written") I would rather move this content
>>>> into the section on the snapshot summary instead of an appendix.
>>>>
>>>> On Tue, Jan 14, 2025 at 1:30 PM huaxin gao 
>>>> wrote:
>>>>
>>>>> +1 non-binding
>>>>>
>>>>> On Tue, Jan 14, 2025 at 1:21 PM Steve Zhang
>>>>>  wrote:
>>>>>
>>>>>> +1 non-binding
>>>>>>
>>>>>> Thanks,
>>>>>> Steve Zhang
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Jan 14, 2025, at 1:14 PM, Kevin Liu  wrote:
>>>>>>
>>>>>> +1 non-binding.
>>>>>>
>>>>>>
>>>>>>


Re: [VOTE] Document Snapshot Summary Optional Fields as Appendix in Spec

2025-01-14 Thread rdb...@gmail.com
The content looks correct to me, but because this states a requirement
("Metrics must be accurate if written") I would rather move this content
into the section on the snapshot summary instead of an appendix.

On Tue, Jan 14, 2025 at 1:30 PM huaxin gao  wrote:

> +1 non-binding
>
> On Tue, Jan 14, 2025 at 1:21 PM Steve Zhang
>  wrote:
>
>> +1 non-binding
>>
>> Thanks,
>> Steve Zhang
>>
>>
>>
>> On Jan 14, 2025, at 1:14 PM, Kevin Liu  wrote:
>>
>> +1 non-binding.
>>
>>
>>


Re: pre-proposal: schema_id on DataFile

2025-02-14 Thread rdb...@gmail.com
We've considered this in the past and I'm undecided on it. There is some
benefit, like being able to prune files during planning if the file didn't
contain a column that is used in a non-null filter (i.e. `new_data_column
IN ("a", "b")`).

On the other hand, we don't want data files that were written with older
schemas to prevent removing a schema from table metadata. The current
schema can always be used to read and we don't want to compromise that,
need to keep around too many schemas, or need to scan all metadata to
remove schemas.

I think my preference is to instead include the highest field ID in the
schema used to write a file. That enables the `new_data_column` filter
logic above, but never requires keeping schemas around.
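
As a sketch of the planning check this would enable (nothing here is an existing
API; the method and parameter names are hypothetical):

    public class NewColumnPruning {
      // A column added after a file was written has a field ID larger than the
      // highest field ID known to that file's writer, so the column is all-null
      // in the file. A non-null predicate such as `new_data_column IN ("a", "b")`
      // can then skip the file entirely during planning.
      static boolean mayMatchNonNullFilter(int fileMaxFieldIdAtWrite, int columnFieldId) {
        return columnFieldId <= fileMaxFieldIdAtWrite;
      }
    }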

As Fokko said, this probably depends on your use case. I'm talking about
cases that I've thought about but if you have a different one in mind,
we're open to adding this.

Ryan

On Fri, Feb 14, 2025 at 11:42 AM Devin Smith
 wrote:

> Thanks for the info, it is very helpful. I see it debugging down through
> `org.apache.iceberg.ManifestReader#readMetadata`. It wasn't obvious to me
> that this sort of data would be in the avro metadata as opposed to the
> org.apache.iceberg.ManifestFile object. I may have some questions later
> about the writing side of the equation in these regards...
>
> BTW, it looks like either the spec is incorrect, or the java
> implementation is incorrect; I see `schema` being written to the manifest
> header metadata, but not `schema-id`.
>
>
> https://github.com/apache/iceberg/blob/apache-iceberg-1.8.0/core/src/main/java/org/apache/iceberg/ManifestWriter.java#L346-L355
>
>
> https://github.com/apache/iceberg/blob/apache-iceberg-1.8.0/core/src/main/java/org/apache/iceberg/ManifestWriter.java#L312-L321
>
>
>
> On Fri, Feb 14, 2025 at 10:26 AM Fokko Driesprong 
> wrote:
>
>> Hi Devin,
>>
>> The schema-id is stored in the Manifest Avro header:
>> https://iceberg.apache.org/spec/#manifests Also the schema itself is
>> stored there. Would that help your situation? I think this makes adding it
>> to the data file redundant.
>>
>> Kind regards,
>> Fokko
>>
>> Op vr 14 feb 2025 om 17:56 schreef Devin Smith
>> :
>>
>>> I want to make sure I'm not missing something that already exists;
>>> otherwise, hoping to get a quick thumbs up / thumbs down on a potential
>>> proposal before spending more time on it.
>>>
>>> It would be nice to know what Iceberg schema a writer used (/assumed)
>>> when writing a DataFile. Oftentimes, this information is written into the
>>> parquet file's metadata, but it would be great if Iceberg provided this
>>> directly. A schema_id on DataFile would be nice, I think.
>>>
>>> Thanks,
>>> -Devin
>>>
>>
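
For reference, the manifest header metadata discussed above can be inspected with
the plain Avro file API. The sketch below is illustrative; the path is a
placeholder, and the key names follow the manifest metadata listed in the spec,
so some of them (for example "schema-id") may come back null depending on the
writer, as noted in the thread.

    import java.io.File;
    import java.io.IOException;

    import org.apache.avro.file.DataFileReader;
    import org.apache.avro.generic.GenericDatumReader;
    import org.apache.avro.generic.GenericRecord;

    public class DumpManifestHeader {
      public static void main(String[] args) throws IOException {
        File manifest = new File("/tmp/warehouse/db/tbl/metadata/example-manifest.avro");
        try (DataFileReader<GenericRecord> reader =
            new DataFileReader<>(manifest, new GenericDatumReader<>())) {
          // Key/value metadata written into the Avro header by the manifest writer.
          System.out.println("schema            = " + reader.getMetaString("schema"));
          System.out.println("schema-id         = " + reader.getMetaString("schema-id"));
          System.out.println("partition-spec-id = " + reader.getMetaString("partition-spec-id"));
          System.out.println("format-version    = " + reader.getMetaString("format-version"));
        }
      }
    }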


Re: [VOTE] Add overwriteRequested to RegisterTableRequest in REST spec

2025-02-13 Thread rdb...@gmail.com
+1

On Thu, Feb 13, 2025 at 9:56 AM Huang-Hsiang Cheng 
wrote:

> +1 (non-binding)
>
> On Feb 13, 2025, at 9:36 AM, Daniel Weeks  wrote:
>
> +1
>
> On Thu, Feb 13, 2025 at 9:07 AM Fokko Driesprong  wrote:
>
>> +1
>>
>> Op do 13 feb 2025 om 18:06 schreef Steven Wu :
>>
>>> +1 here.
>>>
>>> already approved the PR yesterday
>>>
>>> On Thu, Feb 13, 2025 at 8:17 AM Russell Spitzer <
>>> russell.spit...@gmail.com> wrote:
>>>
 +1

 On Wed, Feb 12, 2025 at 5:30 PM Steve Zhang
  wrote:

> Hi Iceberg Community,
>
>   I'm working on supporting the registration of iceberg metadata for
> an existing table in the catalog. As part of this work, I'm proposing to
> add an optional boolean field in RegisterTableRequest.
>
>   I'd like to start a vote on this REST spec change:
> https://github.com/apache/iceberg/pull/12239
>
> This vote will be open for at least 72 hours.
>
> [ ] +1 Add overwriteRequested to RegisterTableRequest  in REST spec
> [ ] +0
> [ ] -1 I have questions and/or concerns
>
> Thanks,
> Steve Zhang
>
>
>
>
>


Re: [ANNOUNCE] Apache Iceberg release 1.8.0

2025-02-18 Thread rdb...@gmail.com
+1 to reverting PR 11560 in main and 1.8.1. That avoids unnecessary
incompatibility with older readers.

I also agree that we should update the spec to say what Russell suggests:
> that -1 has meant "no current snapshot" in the past and is equivalent to
missing/null.

That's a correct description of the behavior.

I don't think that we should create new requirements for this that didn't
previously exist -- that is, I don't see much value in going back to
mandate this behavior when writing v1 or v2 tables. If an implementation
were writing null for the current snapshot ID up until now, I don't think
that we can say that behavior was or is incorrect. It was an
incompatibility with the Java implementation and we should note the
behavior in the "Implementation Notes" section.

> How about reverting #11560 for 1.8.1, and then reinstating this for 2.0.0?

I think we need to fix this at a format version boundary, not a library
version boundary. I'd be up for reinstating the write change for v3.
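
To spell out the reader-side behavior being described, here is a minimal sketch
(using Jackson, purely illustrative, not the actual Java parser code) that treats
a missing field, an explicit null, and the legacy -1 value identically:

    import com.fasterxml.jackson.databind.JsonNode;

    public class CurrentSnapshotIds {
      // Returns null when the table metadata has no current snapshot, whether the
      // field is absent, explicitly null, or the legacy -1 marker.
      static Long currentSnapshotId(JsonNode tableMetadata) {
        JsonNode field = tableMetadata.get("current-snapshot-id");
        if (field == null || field.isNull() || field.asLong() == -1L) {
          return null;
        }
        return field.asLong();
      }
    }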

On Tue, Feb 18, 2025 at 7:56 AM Russell Spitzer 
wrote:

> The only thing I think I agree with is defining that -1 has meant "no
> current snapshot" in the past and
> is equivalent to a missing/null (if we have to specify that).
>
> I don't think there is any reason to change the behavior of writing null /
> missing unless
> that's really a point of confusion for folks. Is there a JSON library
> folks are using
>  which distinguishes between missing/null?
>
> Would having field:null be treated the same as missing have avoided the
> issue? It seems like libraries that only handled non-null values would also
> not have handled a "missing" current-snapshot-id.
>
> I'd really like to hear from other implementers here since changing all
> this again would be a lot of
> work and I thought we had understanding of Optional == Nullable as a valid
> thing in the spec.
>
> On Tue, Feb 18, 2025 at 7:35 AM Robert Stupp  wrote:
>
>> Correcting myself: schema/spec/sort seem to be always present - please
>> ignore that part in my previous email. The valid values for those fields
>> however should be defined.
>> On 18.02.25 14:29, Robert Stupp wrote:
>>
>> Reality is that Iceberg did write '-1' into current-snapshot-id (and
>> other "non-exist" marker values for schema/spec/sort) instead of omitting
>> the field.
>>
>> Side note: the table-spec says that these fields are optional, but
>> says nothing about whether they are nullable.
>>
>> The spec should at least be amended to explicitly define the valid values
>> (-1 for current-snapshot-id for no snapshot is just a fact now that it's
>> there). IMO that field being 'null' isn't defined in the spec, but the
>> absence of the field is.
>>
>> Proposal for the implementation:
>>
>> * revert https://github.com/apache/iceberg/pull/11560 in 1.8.1 and on
>> main
>>
>> Proposal for the spec (for current-snapshot-id):
>>
>> * Define the valid value ranges for the field
>> * Define that the absence of the field means "no current snapshot"
>> * Define that the value -1 of the field means "no current snapshot"
>> * Define that the value 'null' of the field means "no current snapshot"
>> * Define that new implementations must not write the field if there's no
>> current snapshot
>> Proposal for the spec (for schema/spec/sort):
>>
>> * Define the valid value ranges for the fields
>> * Define the "schema/spec/sort not present" values (the fields are
>> optional for v1 but required for v2+v3).
>> * OR Define that "schema/spec/sort must be absent" if there is no current
>> schema/spec/sort.
>>
>> WDYT?
>>
>> On 17.02.25 21:07, Russell Spitzer wrote:
>>
>> It sounds like the argument here is that we should change the Spec for
>> V1, V2, and V3 to mark current-snapshot-id
>> as required. Then we should change all other implementations to follow
>> this new standard. I'm not sure that
>> is a good solution going forwards but I'm not sure of how we can support
>> catalogs/engines that cannot handle a null
>> correctly in this situation otherwise. Perhaps we should source out to
>> see if any other implementers worked off the
>> assumption of a non-optional "current-snapshot-id" and if we get a
>> critical mass we can try to make that change?
>> Because of how wide that change would be, I think we would need pretty
>> broad consensus to do so.
>>
>> We could possibly also have a flag to allow the old behavior but that
>> also feels wrong to me, we have often gone with a motto of
>> read "wrong" write "correct" for things like this in the past and
>> continuing to write "wrong" is a disservice to any
>> new implementers. When we do have a contradiction between our
>> implementation and the spec I think we have
>> to trust that implementers followed the spec and fix the core library.
>>
>> Are there any other solutions here?
>>
>> On Mon, Feb 17, 2025 at 11:45 AM Fokko Driesprong 
>> wrote:
>>
>>> Hey Robert,
>>>
>>> The thing is, that -1 cannot "go away".
>>>
>>>
>>> Yes, I agree, but that's also the case for null, as the

Re: [DISCUSS] Apache Iceberg 1.8.1 release

2025-02-19 Thread rdb...@gmail.com
+1 for rolling back the AWS SDK update for the patch release. We should get
the better fix into main though.

On Wed, Feb 19, 2025 at 8:56 AM Eduard Tudenhöfner 
wrote:

> @Yuya maybe it makes sense to revert the AWS SDK to a version that didn't
> introduce those integrity protection changes (added in 2.30.0) for 1.8.1
> and then see how we fix this properly for 1.9.0?
> I've opened https://github.com/apache/iceberg/pull/12339 to do that.
>
>
>
> On Wed, Feb 19, 2025 at 11:24 AM Jean-Baptiste Onofré 
> wrote:
>
>> Hi Eduard
>>
>> Thanks for volunteering for the 1.8.1 release. It makes sense to me.
>> I'm working on some Flink related updates, but for 1.9.0.
>> I will do a pass on GitHub Issues.
>>
>> Regards
>> JB
>>
>> On Wed, Feb 19, 2025 at 10:27 AM Eduard Tudenhöfner
>>  wrote:
>> >
>> > Hey everyone,
>> >
>> > we're currently tracking all issues that should go into 1.8.1 in the
>> 1.8.1 milestone.
>> > So far, all the issues that were previously added have been backported
>> to 1.8.x.
>> >
>> > I'm volunteering to be the release manager for 1.8.1 and the question
>> to the community is, whether there's anything else that came up and should
>> be included in 1.8.1.
>> > Otherwise I'll go ahead and do an RC by the end of this week.
>> >
>> >
>> > Thanks
>> > Eduard
>>
>


Re: [DISCUSS] Rest Catalog 419 Response Code

2025-02-24 Thread rdb...@gmail.com
Yeah, I don't think that this response was used. We thought that it was
needed but it can probably be safely removed as I'm not aware of any
implementation that sent or handled it. If that's the right thing to do
because there are other more standard mechanisms for sending more
information about a 401, then that sounds reasonable to me.

We should make sure the folks that run systems that authenticate using
SigV4 are okay with this, too.

On Mon, Feb 24, 2025 at 1:15 AM Alex Dutra 
wrote:

> Hi all,
>
> Thanks Sung for your support – I will start a discussion thread soon on
> the topic of expected client behavior for both 401 and 419 response codes.
>
> And out of curiosity, I looked at how the AuthenticationTimeoutResponse
> type is currently handled in iceberg-core, and was surprised to see that
> it's not even materialized as a RESTResponse. There is no test verifying
> this response type either. If we force the server to send back this error
> code, the client surfaces a generic error to the user (instead of
> NotAuthorizedException for instance):
>
> org.apache.iceberg.exceptions.RESTException: Unable to process:
> Credentials have timed out
>
> So imho this definitely means that we need a better handling of this error
> (and authentication errors in general).
>
> Thanks,
>
> Alex
>
> On Sun, Feb 23, 2025 at 12:30 AM Sung Yun  wrote:
>
>> Thank you Daniel, Dmitri, Alex and Yufei for sharing your thoughts.
>>
>> I'm in favor of updating the REST Spec to be more clear about these
>> status codes.
>>
>> Here's a PR[1] that follows the RFC language to make expected and
>> recommended behaviors of the server and the client more explicit
>> (thanks Dmitri for the suggestion). Please feel free to review and
>> make suggested changes.
>>
>> And Alex, yes that sounds like a great idea. I think codifying the
>> expected client behavior will be helpful, as it will help client
>> implementations in the different language libraries be more
>> consistent.
>>
>> [1] https://github.com/apache/iceberg/pull/12376
>>
>> On Fri, Feb 21, 2025 at 7:25 PM Yufei Gu  wrote:
>> >
>> > I'm fine with 419 being considered an optional status code, in which
>> case:
>> >
>> > Servers can choose not to implement it.
>> > Clients shouldn't rely solely on it for token expiration.
>> >
>> > However, the current REST spec doesn't clarify this. We should make it
>> more explicit.
>> >
>> > We could try to deprecate it later if needed.
>> >
>> > Yufei
>> >
>> >
>> > On Fri, Feb 21, 2025 at 5:04 AM Alex Dutra
>>  wrote:
>> >>
>> >> Hi all,
>> >>
>> >> I think there are two aspects in this issue that should be discussed
>> separately:
>> >>
>> >> What are the pros and cons of having a custom HTTP response code for
>> authentication issues, as opposed to using 401?
>> >> How should clients deal with authentication issues, both 401 and 419?
>> >>
>> >> Regarding aspect 1: I am not sure that I see a good reason for having
>> both 419 vs 401.
>> >>
>> >> First, it's a custom response code, so it requires custom handling for
>> clients. Sticking to HTTP standards is way safer imho.
>> >>
>> >> Secondly, a custom response code would be interesting only if the
>> client could take a specific action to fix the issue. But what can a client
>> do to "fix" a 419 response? An interactive client would redirect the user
>> to a login form. A non-interactive client would probably just fetch new
>> credentials and retry. That's exactly what these clients would do when
>> receiving 401 as well.
>> >>
>> >> Moreover, the 401 response code, as defined by RFCs 7235 and 9110 [1],
>> already provides a mechanism to tell the client what's wrong with
>> authentication and how to fix it: that's the purpose of the
>> WWW-Authenticate response header [2].
>> >>
>> >> For example, when using OAuth2 and bearer token authentication, an
>> error code named invalid_token is defined as follows [3]:
>> >>
>> >>> The access token provided is expired, revoked, malformed, or invalid
>> for other reasons.  The resource SHOULD respond with the HTTP 401
>> (Unauthorized) status code. The client MAY request a new access token and
>> retry the protected resource request.
>> >>
>> >>
>> >> The above error code and recommended client behavior seems perfectly
>> fit for signaling expired OAuth2 credentials to clients.
>> >>
>> >> When using other authentication schemes, like SigV4, obviously other
>> error codes should be used, but the 401 challenge mechanism is by no means
>> tied to OAuth2. Thus I do not see a valid reason to also support 419. I
>> would prefer if each authentication scheme (OAuth, SigV4 etc.) could
>> provide specific guidance as to how to interpret the WWW-Authenticate
>> header contents appropriately. I would also love to see if someone in this
>> mailing list could confirm how SigV4-enabled servers currently handle
>> expired credentials – but I would be surprised if they use anything other
>> than 401.
>> >>
>> >> Regarding aspect 2: as I mentioned above, clie

Re: [VOTE] Java implementation notes around current-snapshot-id

2025-02-24 Thread rdb...@gmail.com
+1

On Mon, Feb 24, 2025 at 12:26 PM Daniel Weeks  wrote:

> +1
>
> On Mon, Feb 24, 2025, 11:00 AM Russell Spitzer 
> wrote:
>
>> +1
>>
>> On Mon, Feb 24, 2025 at 12:55 PM Fokko Driesprong 
>> wrote:
>>
>>> Hi everyone,
>>>
>>> Recently, there was confusion
>>>  about
>>> valid values for the current-snapshot-id, which led to implementation
>>> notes  in the spec.
>>> Thanks for the reviews so far and to Daniel for the additional historical
>>> context on snapshot ID generation. Since there were a couple of
>>> reviews, and everything has been addressed, I would like to raise a vote to
>>> add this to the spec.
>>>
>>> This vote will be open for at least 72 hours.
>>>
>>> [ ] +1 Add the implementation notes to the spec
>>> [ ] +0
>>> [ ] -1 I have questions and/or concerns
>>>
>>> Kind regards,
>>> Fokko
>>>
>>


Re: [VOTE] Deprecate or remove distinct_count

2025-02-24 Thread rdb...@gmail.com
I can provide some context here. The field is very old, and when we realized
that it was not only unused but also difficult to produce and use in
practice (the counts can't be combined), we deprecated the field. However, some folks
from Dremio wanted to bring it back because they said they could store
values there and had a way to use them.

+1, but it would be good to check in with some Dremio engineers and see if
they are using it. I assume they aren't since this thread hasn't gotten
much attention. Thanks for bringing this up!

On Thu, Feb 13, 2025 at 8:02 AM Jacob Marble
 wrote:

> Xuanwo, do you favor deprecating or removing `distinct_count`?
>
> Due to lack of any real implementation, I myself favor removal (PR 12183).
>
> Jacob Marble
> 🔥🐅
>
>
> On Tue, Feb 11, 2025 at 10:25 PM Xuanwo  wrote:
>
>> Here is my +1 binding.
>>
>> The current status of `distinct_count` is quite confusing, which has also
>> led to additional discussions in `iceberg-rust` about whether we need to
>> add it and how to maintain it.
>>
>> Removing it seems reasonable to me, as there are no known use cases for
>> `distinct_count` in a single data file.
>>
>> On Tue, Feb 11, 2025, at 23:05, Fokko Driesprong wrote:
>>
>> My mistake, I suggested sending out an email with a quick vote on the PR.
>> I like the suggestion to use this thread for discussion since the number of
>> options is limited.
>>
>> I'm in favor of deprecating the field, to avoid that we re-use the
>> field-id in the future.
>>
>> Kind regards,
>> Fokko
>>
>> Op di 11 feb 2025 om 05:46 schreef Manu Zhang :
>>
>> Hi Jacob,
>>
>> Thanks for initiating the vote.
>> Typically, we would first have a DISCUSSION thread to reach a consensus
>> on the preferred option and then follow it up with a VOTE thread for
>> confirmation.
>>
>> Maybe we can take this as a DISCUSSION thread?
>>
>> Best,
>> Manu
>>
>>
>> On Tue, Feb 11, 2025 at 7:20 AM Jacob Marble
>>  wrote:
>>
>> This vote will be open for at least 72 hours.
>>
>> I propose that distinct_counts be either deprecated (#12182
>> ) or removed (#12183
>> ) from the spec.
>>
>> According to #767 
>> data_file.distinct_counts was deprecated about four years ago. Furthermore,
>> it is not implemented in the canonical Java and Python implementations.
>>
>> Please share your thoughts, and vote one of the following:
>> - remove
>> - deprecate
>> - no-op
>>
>> Jacob Marble
>> 🔥🐅
>>
>> Xuanwo
>>
>> https://xuanwo.io/
>>
>>


Re: [VOTE] Allow Row-Lineage with Equality Deletes

2025-02-20 Thread rdb...@gmail.com
+1

On Thu, Feb 20, 2025 at 10:01 AM Aihua Xu  wrote:

> +1 (non-binding).
>
> On Thu, Feb 20, 2025 at 9:41 AM Huang-Hsiang Cheng
>  wrote:
>
>> +1 (non-binding)
>>
>> Thanks,
>> Huang-Hsiang
>>
>> On Feb 20, 2025, at 9:37 AM, huaxin gao  wrote:
>>
>> +1 (non-binding)
>>
>> Thanks Russell!
>>
>> On Thu, Feb 20, 2025 at 1:57 AM Fokko Driesprong 
>> wrote:
>>
>>> +1
>>>
>>> Thanks Russell!
>>>
>>> Op do 20 feb 2025 om 10:25 schreef Péter Váry <
>>> peter.vary.apa...@gmail.com>:
>>>
 +1

 Manu Zhang  ezt írta (időpont: 2025. febr.
 20., Cs, 8:06):

> +1 (non-binding)
>
> Regards
> Manu
>
> On Thu, Feb 20, 2025 at 2:57 PM Jean-Baptiste Onofré 
> wrote:
>
>> +1
>>
>> Regards
>> JB
>>
>> On Wed, Feb 19, 2025 at 11:13 PM Russell Spitzer
>>  wrote:
>> >
>> > The PR: https://github.com/apache/iceberg/pull/12230  is basically
>> ready now. So let's do a last vote to make sure everyone is aware of the
>> upcoming change.
>> >
>> > Before:
>> > Equality deletes are not allowed to be used when row-lineage is
>> enabled
>> >
>> > After:
>> > Equality deletes are allowed to be used when row-lineage is enabled.
>> >
>> >
>> > Thanks for your time,
>> > Russ
>>
>
>>