Re: [VOTE] Release Apache Iceberg 1.4.3 RC0

2023-12-22 Thread Carl Steinbach
+1 (binding)

On Fri, Dec 22, 2023 at 3:44 PM Ryan Blue  wrote:

> +1 (binding)
>
> - Ran license checks
> - Validated signature and checksum
> - Built and validated tests are passing; CI shows python tests are red
> though :(
>
> On Fri, Dec 22, 2023 at 1:49 AM Ajantha Bhat 
> wrote:
>
>> +1 (non-binding)
>>
>> - validated checksum and signature
>> - checked license docs & ran RAT checks
>>
>> Thanks,
>> Ajantha
>>
>> On Fri, Dec 22, 2023 at 3:09 PM Eduard Tudenhoefner 
>> wrote:
>>
>>> +1 (non-binding)
>>>
>>> * validated checksum and signature
>>> * checked license docs & ran RAT checks
>>> * ran build and tests with JDK11
>>>
>>>
>>>
>>> On Fri, Dec 22, 2023 at 6:42 AM Driesprong, Fokko 
>>> wrote:
>>>
 +1 (Binding)

 Thanks JB for running this release!

 - Checked the signature and checksums
 - Ran the license check
 - Ran the tests locally
 - Tested against Trino:
 https://github.com/trinodb/trino/pull/20207

 Kind regards,
 Fokko

 On Thu, Dec 21, 2023 at 16:53 Jean-Baptiste Onofré <
 j...@nanthrax.net> wrote:

> Hi Everyone,
>
> I propose that we release the following RC as the official Apache
> Iceberg 1.4.3 release.
>
> The commit ID is 9a5d24fee239352021a9a73f6a4cad8ecf464f01
> * This corresponds to the tag: apache-iceberg-1.4.3-rc0
> * https://github.com/apache/iceberg/commits/apache-iceberg-1.4.3-rc0
> *
> https://github.com/apache/iceberg/tree/9a5d24fee239352021a9a73f6a4cad8ecf464f01
>
> The release tarball, signature, and checksums are here:
> *
> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.4.3-rc0
>
> You can find the KEYS file here:
> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>
> Convenience binary artifacts are staged on Nexus. The Maven repository
> URL is:
> *
> https://repository.apache.org/content/repositories/orgapacheiceberg-1149/
>
> Please download, verify, and test.
>
> Please vote in the next 72 hours. (Weekends excluded)
>
> [ ] +1 Release this as Apache Iceberg 1.4.3
> [ ] +0
> [ ] -1 Do not release this because...
>
> Only PMC members have binding votes, but other community members are
> encouraged to cast
> non-binding votes. This vote will pass if there are 3 binding +1 votes
> and more binding
> +1 votes than -1 votes.
>

>
> --
> Ryan Blue
> Tabular
>


Re: Feedback Collection: Bylaws in Iceberg

2024-06-24 Thread Carl Steinbach
Hi Ryan and Jack,

The ASF's PMC Guide [1] is pretty clear on what belongs on the private list:


   - pre-disclosure security problems
   - pre-agreement discussions with third parties that require
   confidentiality
   - nominees for project committer, PMC or Foundation membership
   - personal conflicts among project personnel

As far as I can tell, none of the discussion so far falls into any of these
categories, so it should happen on the dev list. That includes the original
thread, which, unfortunately, started on the private list.

- Carl

[1] https://www.apache.org/dev/pmc.html#mailing-list-private


On Mon, Jun 24, 2024 at 12:09 PM Jack Ye  wrote:

> Sorry for the confusion Ryan, this is not mistakenly sent to devlist. As
> we discussed, this is the thread for collecting community feedback, which
> is essential for forming bylaws with the community.
>
> We have that separated discussion thread in the private list, which we
> will continue to iterate, and the vote will eventually be carried out in
> the private list as you said, according to the ASF guideline.
>
> -Jack
>
> On Mon, Jun 24, 2024 at 8:59 AM Ryan Blue 
> wrote:
>
>> Hey everyone, I think Jack mistakenly sent this to the dev list so please
>> let's pause discussion for now.
>>
>> There's a thread on the private list about this in which PMC members,
>> including me, have asked to keep it on the private list right now.
>>
>> The reason for that is that there's a long-standing norm to discuss the
>> conduct of individuals only on private lists. In this case, I think it
>> applies even though it is discussing hypothetical conduct. And note that
>> I'm one of the individuals here.
>>
>> I know that there are also things here that merit discussion (like how to
>> get PRs moving faster), but right now they are all tied up in a bundle of
>> proposed bylaws. I've asked on the PMC list to separate the concerns and to
>> discuss these issues individually. We will follow up with more discussion
>> on this list.
>>
>> Ryan
>>
>> On Mon, Jun 24, 2024 at 6:19 AM Renjie Liu 
>> wrote:
>>
>>> Thanks Jack for raising this; it is quite important for keeping this
>>> community healthy.
>>>
>>> I agree with Ajantha about the concerns of accumulated proposals and
>>> PRs, and maybe we should have another thread to discuss it?
>>>
>>> On Mon, Jun 24, 2024 at 20:37 Robert Stupp  wrote:
>>>
 Thanks Jack for the proposal.
 I’m generally +1 on this. There are a few details to clarify, but I
 suspect nothing that’s controversial.

 On 24. Jun 2024, at 12:45, Ajantha Bhat  wrote:

 Thank you, Jack, for your diligent work on this.

 This seems essential at the moment.

 I would like to address a couple of additional points that need our
 attention:


 *Criteria for Committership/PMC:* We've observed an inconsistency in
 how committership is granted. Contributors to sub-projects often attain
 committership to the main project more readily, while some who contribute
 significantly to the main project remain unrecognized. Although defining
 explicit criteria is challenging, it might be beneficial to establish
 guidelines or metrics that highlight the impact and quality of
 contributions. This could encourage more balanced and motivated
 participation across all project areas.


 *Accumulation of Proposals and PRs:* We have several proposals and PRs
 that are currently stalled despite multiple pings. Examples include the
 partition stats PR and proposals like table profitability, multi-table
 transactions, secondary indexing, etc. These important contributions are not
 making the expected progress. It might be helpful to create bylaws or
 procedures to ensure these proposals and PRs receive the necessary
 attention and are addressed promptly. This could involve setting timeframes
 for reviews or establishing a prioritization process.

 Thoughts and feedback on these suggestions would be highly valuable.

 - Ajantha

 On Mon, Jun 24, 2024 at 1:09 PM Jack Ye  wrote:

> Hi everyone,
>
> In light of the recent change of company for a few committers and PMC
> members, I hear an increasing ask from the community to define proper
> processes in Iceberg to ensure its vendor neutral stance.
>
> I propose that we put up a bylaws document like other projects such as
> Apache Hadoop and Apache ORC. I think this will put people at peace and
> remove many people's concerns about the future of the project and its
> vendor-neutral stance.
>
> Here is a document that I have drafted that can be used as the
> starting point (mostly just copied from the Hadoop one):
> https://docs.google.com/document/d/1BVHbshE2dmCH8QzkeMd9PQdJ86_slavDy1YPueqNSgI/edit
>
> This proposal is currently undergoing review by the PMC. At the same
> time, it is critical to also understand 

Re: Feedback Collection: Bylaws in Iceberg

2024-06-24 Thread Carl Steinbach
+ private for PMC members who may not follow dev

1/ I encourage the folks who have already responded on the private@ thread
to replay their comments here. As I noted earlier, this discussion falls
outside the categories that belong on the private list.

2/ I think adopting a set of clearly articulated bylaws and a process for
amending them is a net good for the Iceberg community.

3/ Regarding the overlap between the proposed bylaws and other ASF
documents, having all of the rules in one place reduces the potential for
confusion and also makes it easier for outsiders to understand how the
project actually works.

4/ Regarding automatic emeritus status for committers and PMC members, it's
been my experience on other Apache projects that having bylaws is pointless
if the PMC can't achieve the necessary quorums for voting and that this
becomes unavoidable as the project ages and the size of the PMC increases.
As long as it's easy for emeritus PMC/committers to reinstate themselves,
I don't see any harm in the proposed rules for automatic emeritus status.

5/ As someone who firmly believes that Iceberg should have project bylaws,
I'm concerned that it may be hard to reach a consensus on the current
proposal because of its length and the inclusion of several provisions that
address situations that have yet to be encountered by this community. While
it may lead to more work, we're more likely to succeed if we focus first on
adopting a pared-down set of bylaws that includes a clearly defined
process for amending them in the future.

Thanks.

- Carl

On Mon, Jun 24, 2024 at 12:44 PM Jack Ye  wrote:

> Thanks for pointing to the ASF guidelines Carl, I did not know that. I had
> the impression of engaging with the private list first due to responses in
> previous devlist discussions, but I guess I landed in the right place
> eventually :)
>
> > In light of the recent change of company for a few committers and PMC
> members
> Sorry that is probably my wrong use of the word "in light". I just mean
> this event creates a good opportunity for opening this bylaws discussion. I
> didn't mean at all to have bylaws specifically for speculating and
> restricting a few specific people. I think bylaw is something essential as
> the project grows larger and more participants are involved in the
> community.
>
> -Jack
>
>
>
> On Mon, Jun 24, 2024 at 9:37 AM Ryan Blue 
> wrote:
>
>> The motivation for bylaws was this: "In light of the recent change of
>> company for a few committers and PMC members".
>>
>> That means that we're talking about new rules based on what a few
>> specific people might do. Speculation like that belongs on a private list,
>> just like discussing actual behavior would. While no individual was singled
>> out, it's clear who these committers and PMC members are.
>>
>> On Mon, Jun 24, 2024 at 9:30 AM J G  wrote:
>>
>>> > The reason for that is that there's a long-standing norm to discuss
>>> the conduct of individuals only on private lists. In this case, I think it
>>> applies even though it is discussing hypothetical conduct. And note that
>>> I'm one of the individuals here.
>>>
>>> Respectfully, what does this mean, Ryan? No individual was even singled
>>> out here. This comes off as stifling discussion, not cool...
>>>
>>> On Mon, Jun 24, 2024, 9:08 AM Jack Ye  wrote:
>>>
 Sorry for the confusion Ryan, this is not mistakenly sent to devlist.
 As we discussed, this is the thread for collecting community feedback,
 which is essential for forming bylaws with the community.

 We have that separated discussion thread in the private list, which we
 will continue to iterate, and the vote will eventually be carried out in
 the private list as you said, according to the ASF guideline.

 -Jack

 On Mon, Jun 24, 2024 at 8:59 AM Ryan Blue 
 wrote:

> Hey everyone, I think Jack mistakenly sent this to the dev list so
> please let's pause discussion for now.
>
> There's a thread on the private list about this in which PMC members,
> including me, have asked to keep it on the private list right now.
>
> The reason for that is that there's a long-standing norm to discuss
> the conduct of individuals only on private lists. In this case, I think it
> applies even though it is discussing hypothetical conduct. And note that
> I'm one of the individuals here.
>
> I know that there are also things here that merit discussion (like how
> to get PRs moving faster), but right now they are all tied up in a bundle
> of proposed bylaws. I've asked on the PMC list to separate the concerns 
> and
> to discuss these issues individually. We will follow up with more
> discussion on this list.
>
> Ryan
>
> On Mon, Jun 24, 2024 at 6:19 AM Renjie Liu 
> wrote:
>
>> Thanks Jack for raising this; it is quite important for keeping this
>> community healthy.
>>
>> I agree with Ajanth

Re: [Discussion] Apache Iceberg Community Guideline - Initial Version

2024-07-03 Thread Carl Steinbach
> 1. modified the name from "bylaws" to "community guidelines", following
the latest ASF guideline

I want to make sure everyone is aware that there is a substantive
difference between the meaning of "bylaw" and "guideline." Here's how the
two words are defined in the Cambridge Dictionary:

- Bylaw: "a rule that GOVERNS the members of an organization." [1]
[emphasis added]

- Guideline: "information intended to advise people on how something SHOULD
BE done or what something SHOULD BE." [2] [emphasis added]

Wikipedia, while not an authoritative source, provides useful context on
how these terms are used in practice:

- "A bylaw ... is a set of rules or law established by an organization or
community so as to regulate itself, as allowed or provided for by some
higher authority." [3]

- "A guideline is similar to a rule, but are legally less binding as
justified deviations are possible." [4]

I am neither a lawyer nor a lexicographer, but it seems clear that a
guideline carries no more weight than an officially approved suggestion,
while a bylaw is a binding rule. It's up to the PMC to decide whether this
document is a set of non-binding suggestions that SHOULD BE [5] followed or
a set of binding laws that MUST BE followed, but in either case, I think
the PMC needs to clearly convey their intention by using the correct word.

[1] https://dictionary.cambridge.org/us/dictionary/english/bylaw
[2] https://dictionary.cambridge.org/us/dictionary/english/guideline
[3] https://en.wikipedia.org/w/index.php?title=By-law&oldid=1215430864
[4] https://en.wikipedia.org/w/index.php?title=Guideline&oldid=1185185478
[5] https://datatracker.ietf.org/doc/html/rfc2119
[6] https://lists.apache.org/thread/h15qjp35ghg446xr5bnmmlg06p3hdoj9

On Tue, Jul 2, 2024 at 9:26 AM Jack Ye  wrote:

> Yes I am totally aware of the situation of people on vacation and
> traveling, and was in the process of talking and resolving some people's
> comments in the doc, that's why I did not start the vote as originally
> planned. I think we are all aligned on this, sorry I did not make it very
> clear in the last reply.
>
> And thank you Owen, this would be a great idea! I also heard some concerns
> of me driving this since I am also backed by a vendor. I considered opening
> the access to all PMC members, but there are some technical challenges like
> people's devlist email are not exactly their Gmail and many people are
> still out of town, so things were also delayed at this front. Let us know
> what you think is the best way to proceed!
>
> Best,
> Jack Ye
>
>
>
>
> On Tue, Jul 2, 2024 at 9:14 AM Daniel Weeks  wrote:
>
>> Thanks Owen,  I really appreciate the offer to moderate the discussion.
>> I think that's a good idea and it would really benefit the community to
>> have someone facilitating the discussion and drafting docs who does not
>> have a commercial interest.
>>
>> A number of PMC members have expressed that they're currently traveling
>> or on vacation, which makes me concerned that the discussion isn't really
>> reflective of the PMC.
>>
>> I'd love to hear your thoughts on how we might want to proceed.
>>
>> Thanks,
>> Dan
>>
>> On Tue, Jul 2, 2024 at 1:57 AM Jean-Baptiste Onofré 
>> wrote:
>>
>>> Hi Owen
>>>
>>> Sorry I missed your message before replying. I agree, I think we
>>> should take more time on the proposal.
>>>
>>> Regards
>>> JB
>>>
>>> On Mon, Jul 1, 2024 at 10:14 PM Owen O'Malley 
>>> wrote:
>>> >
>>> > Sorry for coming into this conversation late, but I have a lot of
>>> experience with writing the bylaws for Apache projects (Hadoop & ORC). As a
>>> neutral third party (not working for Databricks or a cloud provider) who
>>> has a lot of Apache experience, I'd like to offer my service as a moderator
>>> for the discussion. I don't think it is appropriate for a small group to
>>> come back with a finished product for a final vote, especially during the
>>> summer when lots of people are travelling, this process should be much more
>>> gradual and inclusive.
>>> >
>>> > .. Owen
>>> >
>>> > On Mon, Jul 1, 2024 at 7:21 AM Jack Ye  wrote:
>>> >>
>>> >> Hi everyone,
>>> >>
>>> >> Thanks for all the comments and feedback on the document, I am
>>> working with a few commenters on some additional changes and wording, and
>>> then will carry out the vote.
>>> >>
>>> >> Best,
>>> >> Jack Ye
>>> >>
>>> >> On Thu, Jun 27, 2024 at 11:02 AM Jack Ye  wrote:
>>> >>>
>>> >>> To provide an update here, I have consolidated most of the comments
>>> in the initial version, with the following changes:
>>> >>>
>>> >>> (1) condensed the section of roles and responsibilities, with
>>> pointers to different pages in ASF and existing Iceberg project pages.
>>> >>>
>>> >>> (2) clarified voting details, regarding things like partial votes,
>>> difference of voting on mailing lists vs voting on GitHub PRs
>>> >>>
>>> >>> (3) clarified the section regarding lazy consensus. There is a
>>> definition difference between the ASF definition (no +1 vote ne

Re: [VOTE] Release Apache Iceberg 1.6.0 RC0

2024-07-17 Thread Carl Steinbach
+1 (binding)


On Wed, Jul 17, 2024 at 10:05 AM Amogh Jahagirdar <2am...@gmail.com> wrote:

> Following up,
>
> I think I confused myself on the original issue
> https://github.com/apache/iceberg/issues/8756 when testing. That issue
> was specific to REST implementations which use `CatalogHandlers` like
> `RESTCatalogAdapter` used in our unit tests. The fix in #10369 does address
> that case for creation. When testing I was creating a v2 table and
> attempting to replace it with a v1 table which I think makes sense to fail
> because the downgrade would possibly be lossy, and then rolling back would
> not be safe. My original statement that "I think clients should not fail to
> build the change set with the format version change." is probably not
> correct for the downgrade case; it sounds best to fail on the client side
> since it's known to be unsafe.
>
> So from a fix/issue perspective, I think we're covered. However, in terms
> of APIs there's still the case of the public constructor that I added in
> #10369. That should not be public.
>
> Thanks and sorry for the confusion there,
>
> Amogh Jahagirdar
>
>
>
>
> On Wed, Jul 17, 2024 at 9:48 AM Amogh Jahagirdar <2am...@gmail.com> wrote:
>
>> I'm -1 (non-binding).
>>
>> Aside from running through the standard checks, I was testing
>> https://github.com/apache/iceberg/pull/10369/files via Spark against a
>> REST catalog (a non-testing REST catalog) and the issue still exists
>> although the stack trace just looks a bit different now. The fix currently
>> handles it on the catalog handler's side which really masks the real issue
>> of failing to build the changes for the replace on the client side (so imo
>> it's not really a fix looking back on it). I'm still thinking through what
>> a robust solution is; in the end for REST, the service needs to be able to
>> handle it, but I think clients should not fail to build the change set with
>> the format version change.
>>
>> To be clear, I don't think I'd block on a fix for this since I'm not sure
>> how common a format version downgrade during a replace is, and if
>> there's interest in a 1.6.1, we can aim for a more thought-through solution
>> for that release.
>>
>> However the main concern I have is when I was going through the fix, the
>> table metadata builder constructor I added as part of this
>> https://github.com/apache/iceberg/pull/10369/files#diff-c540a31e66b157a8f080433c82a29a070096d0e08c6578a0099153f1229bdb7aR913
>> is marked public, which I think I'd prefer to change to private upfront
>> rather than have to go through a deprecation cycle/revAPI changes.
>>
>> Thanks,
>>
>> Amogh Jahagirdar
>>
>>
>> On Wed, Jul 17, 2024 at 2:29 AM Honah J.  wrote:
>>
>>> +1 (non-binding)
>>>
>>>- verified signature and checksum
>>>- verified license doc
>>>- verified build and tests with JDK 17
>>>
>>> Best regards,
>>> Honah
>>>
>>> On Tue, Jul 16, 2024 at 10:40 PM Ajantha Bhat 
>>> wrote:
>>>
 > Gentle reminder for the PMC members, we need at least two additional
 > binding votes.

 One additional vote. We have binding votes from Russell and Fokko
 already.

 On Wed, Jul 17, 2024 at 10:54 AM Jean-Baptiste Onofré 
 wrote:

> Gentle reminder for the PMC members, we need at least two additional
> binding votes.
>
> Thanks !
> Regards
> JB
>
> On Fri, Jul 12, 2024 at 4:48 PM Jean-Baptiste Onofré 
> wrote:
> >
> > Hi everyone,
> >
> > I propose that we release the following RC as the official Apache
> > Iceberg 1.6.0 release.
> >
> > The commit ID is ed228f79cd3e569e04af8a8ab411811803bf3a29
> > * This corresponds to the tag: apache-iceberg-1.6.0-rc0
> > * https://github.com/apache/iceberg/commits/apache-iceberg-1.6.0-rc0
> > *
> https://github.com/apache/iceberg/tree/ed228f79cd3e569e04af8a8ab411811803bf3a29
> >
> > The release tarball, signature, and checksums are here:
> > *
> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.6.0-rc0
> >
> > You can find the KEYS file here:
> > * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
> >
> > Convenience binary artifacts are staged on Nexus. The Maven
> repository URL is:
> > *
> https://repository.apache.org/content/repositories/orgapacheiceberg-1164/
> >
> > Please download, verify, and test.
> >
> > Please vote in the next 72 hours.
> >
> > [ ] +1 Release this as Apache Iceberg 1.6.0
> > [ ] +0
> > [ ] -1 Do not release this because...
> >
> > Only PMC members have binding votes, but other community members are
> > encouraged to cast non-binding votes. This vote will pass if there
> are
> > 3 binding +1 votes and more binding +1 votes than -1 votes.
> >
> > Thanks,
> > Regards
> > JB
>



Iceberg Community Events Calendars?

2024-07-24 Thread Carl Steinbach
There are two Google Calendars listed on the community page of the Apache
Iceberg website [1] - one for community events [2] and one for dev events
[3] - and as far as I can tell, both are defunct.

I'm writing with a couple of questions/requests:

1) Do we know who owns these calendars?

2) If so, can we either ask that person to start maintaining them or else
transfer them to someone else who can?

3) Can the person who currently owns the calendar invites for the weekly
sync meetings (dev, python, etc) please forward those invites to this list?

Thanks.

- Carl

[1] https://iceberg.apache.org/community/
[2]
https://calendar.google.com/calendar/u/0?cid=NTkzYmIwMGJmZTQ1N2QzMTkxNDEzNTBkZDI0Yzk2NGYzOWJkYmQ5ZmQyNDMyODFhODYzMmEwMDk2M2EyMWQ4NkBncm91cC5jYWxlbmRhci5nb29nbGUuY29t
[3]
https://calendar.google.com/calendar/u/0?cid=MzkwNWQ0OTJmMWI0NTBiYTA3MTJmMmFlNmFmYTc2ZWI3NTdmMTNkODUyMjBjYzAzYWE0NTI3ODg1YWRjNTYyOUBncm91cC5jYWxlbmRhci5nb29nbGUuY29t


Re: [DISCUSS] Iceberg 1.6.1 release

2024-08-19 Thread Carl Steinbach
I'm +1 on doing the 1.6.1 release now, followed by 1.6.2 with the Avro
changes once they become available. I'm also available to help with the
release, though Piotr has already volunteered to be the release manager,
which is great.

- Carl

On Thu, Aug 15, 2024 at 10:24 AM Piotr Findeisen 
wrote:

> Hey Fokko,
>
> Given that Avro 1.11.4 Java release was "1-2 weeks" a week ago, it should
> be done or in progress by now :)
> It seems the discussion
> https://lists.apache.org/thread/yycy9bp21r4cgq68vk9d66bkqrb162tq stalled
> 5 days ago though.
> Should we restart it, or rather go ahead with the release and let Avro fix
> come as 1.6.2?
> The patch we would be releasing was merged on Jul 26th and it would be
> great to make it available to downstream projects.
>
> I volunteer to help with Iceberg 1.6.1 release, to share the operational
> cost.
>
>
> Best
> Piotr
>
>
>
> On Thu, 8 Aug 2024 at 22:43, Fokko Driesprong  wrote:
>
>> Hey Piotr,
>>
>> We had some delays with the Avro 1.12.0 release, mostly because all the
>> languages were released at once. On the Avro devlist, I suggested
>> releasing 1.11.4 just for Java because of the CVE. Realistically this would
>> be around 1-2 weeks. Does that sound reasonable?
>>
>> Kind regards,
>> Fokko
>>
>> On Wed, Aug 7, 2024 at 20:03 Piotr Findeisen <
>> piotr.findei...@gmail.com> wrote:
>>
>>> Hey Fokko,
>>>
>>> thanks, that makes sense!
>>> Do you maybe know the timeline for the Avro release?
>>> Trino awaits the 1.6.1 release, so it would be great if we could get
>>> this rolling rather sooner than later.
>>>
>>> Best
>>> Piotr
>>>
>>>
>>>
>>> On Wed, 7 Aug 2024 at 16:33, Driesprong, Fokko 
>>> wrote:
>>>
 Hey Piotr,

 The Avro release still has to be done. We have 1.12.0, which has been
 released, but that
 also drops Java 8 support, so we can't backport it. We still have to run
 the 1.11.4 Avro release to backport the CVE fix.

 Kind regards,
 Fokko

 On Wed, Aug 7, 2024 at 16:15 Piotr Findeisen <
 piotr.findei...@gmail.com> wrote:

> Hi
>
> Thank you JB and Eduard for commenting!
>
> JB, which Avro version we would be updating to for the CVE fix?
>
> Best
> Piotr
>
>
> On Mon, 29 Jul 2024 at 13:36, Jean-Baptiste Onofré 
> wrote:
>
>> That's fair (and I agree), but as these coming Avro releases include
>> CVE fix, I think it's worth considering.
>>
>> Regards
>> JB
>>
>> On Mon, Jul 29, 2024 at 9:07 AM Eduard Tudenhöfner
>>  wrote:
>> >
>> > I don't think we should be including general dependency updates in
>> a patch release unless they are critical.
>> >
>> > On Mon, Jul 29, 2024 at 8:13 AM Jean-Baptiste Onofré <
>> j...@nanthrax.net> wrote:
>> >>
>> >> Hi,
>> >>
>> >> It would be great to include the Avro update in 1.6.1 release.
>> >>
>> >> I agree for a maintenance release on 1.6.x, but I would like to
>> >> include a couple of updates.
>> >>
>> >> Happy to drive this release :)
>> >>
>> >> Thanks !
>> >> Regards
>> >> JB
>> >>
>> >> On Fri, Jul 26, 2024 at 6:19 PM Piotr Findeisen
>> >>  wrote:
>> >> >
>> >> > Hi,
>> >> >
>> >> > ParallelIterable memory limit PR [1] is backported to 1.6.x
>> branch [2].
>> >> >
>> >> > Are there any other bug fixes that should go into 1.6.1 release?
>> >> >
>> >> > Best,
>> >> > Piotr
>> >> >
>> >> >
>> >> > [1] https://github.com/apache/iceberg/pull/10691
>> >> > [2] https://github.com/apache/iceberg/pull/10787
>> >> >
>> >> >
>>
>


[VOTE] Release Apache Iceberg 1.6.1 RC0

2024-08-19 Thread Carl Steinbach
Hi Everyone,

I propose that we release the following RC as the official Apache Iceberg
0.6.1 release.

The commit ID is e18a2fe10214f5f3ffa0a317a28af8b2a619817a
* This corresponds to the tag: apache-iceberg-0.6.1-rc0
* https://github.com/apache/iceberg/commits/apache-iceberg-0.6.1-rc0
*
https://github.com/apache/iceberg/tree/e18a2fe10214f5f3ffa0a317a28af8b2a619817a

The release tarball, signature, and checksums are here:
* https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-0.6.1-rc0

You can find the KEYS file here:
* https://dist.apache.org/repos/dist/dev/iceberg/KEYS

Convenience binary artifacts are staged on Nexus. The Maven repository URL
is:
* https://repository.apache.org/content/repositories/orgapacheiceberg-1169/

Please download, verify, and test.

Please vote in the next 72 hours.

[ ] +1 Release this as Apache Iceberg 0.6.1
[ ] +0
[ ] -1 Do not release this because...

Only PMC members have binding votes, but other community members are
encouraged to cast
non-binding votes. This vote will pass if there are 3 binding +1 votes and
more binding
+1 votes than -1 votes.
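
The passing rule stated above can be sketched as a small tally function. This is only an illustration: the `Vote` type and the `vote_passes` name are hypothetical, and it assumes that the "-1 votes" in the rule means binding -1 votes only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """One vote on a release candidate (hypothetical type for illustration)."""
    voter: str
    value: int     # +1, 0, or -1
    binding: bool  # True only for PMC members


def vote_passes(votes, required_binding_plus_ones=3):
    """Return True if there are at least 3 binding +1 votes and more
    binding +1 votes than binding -1 votes.

    Assumption: non-binding votes are ignored entirely by the tally.
    """
    binding = [v for v in votes if v.binding]
    plus = sum(1 for v in binding if v.value == 1)
    minus = sum(1 for v in binding if v.value == -1)
    return plus >= required_binding_plus_ones and plus > minus


if __name__ == "__main__":
    example = [
        Vote("pmc-a", 1, True),
        Vote("pmc-b", 1, True),
        Vote("pmc-c", 1, True),
        Vote("contributor-d", -1, False),  # non-binding, not counted
    ]
    print(vote_passes(example))
```

Under this reading, a non-binding -1 does not block the release by itself, although in practice the release manager may still choose to cut a new RC in response.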


Re: [VOTE] Release Apache Iceberg 1.6.1 RC0

2024-08-19 Thread Carl Steinbach
Wow, that's embarrassing. Let me work on RC1.

- Carl

On Mon, Aug 19, 2024 at 6:11 PM Ajantha Bhat  wrote:

> -1
>
> because the artifacts versions are incorrect.
> It should be 1.6.1 instead of 0.6.1
>
> - Ajantha
>
> On Tue, Aug 20, 2024 at 8:54 AM Carl Steinbach  wrote:
>
>> Hi Everyone,
>>
>> I propose that we release the following RC as the official Apache Iceberg
>> 0.6.1 release.
>>
>> The commit ID is e18a2fe10214f5f3ffa0a317a28af8b2a619817a
>> * This corresponds to the tag: apache-iceberg-0.6.1-rc0
>> * https://github.com/apache/iceberg/commits/apache-iceberg-0.6.1-rc0
>> *
>> https://github.com/apache/iceberg/tree/e18a2fe10214f5f3ffa0a317a28af8b2a619817a
>>
>> The release tarball, signature, and checksums are here:
>> * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-0.6.1-rc0
>>
>> You can find the KEYS file here:
>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>
>> Convenience binary artifacts are staged on Nexus. The Maven repository
>> URL is:
>> *
>> https://repository.apache.org/content/repositories/orgapacheiceberg-1169/
>>
>> Please download, verify, and test.
>>
>> Please vote in the next 72 hours.
>>
>> [ ] +1 Release this as Apache Iceberg 0.6.1
>> [ ] +0
>> [ ] -1 Do not release this because...
>>
>> Only PMC members have binding votes, but other community members are
>> encouraged to cast
>> non-binding votes. This vote will pass if there are 3 binding +1 votes
>> and more binding
>> +1 votes than -1 votes.
>>
>
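
A mislabeled RC like the one reported in the -1 above could be caught before the vote email goes out with a small consistency check. The sketch below is hypothetical and not part of the Iceberg release scripts; it only assumes the tag naming pattern `apache-iceberg-<version>-rc<N>` seen in this thread.

```python
import re

# Tag pattern assumed from the tags quoted in this thread,
# e.g. "apache-iceberg-1.6.1-rc1". Tags with a suffix such as
# "-incubating" are deliberately not handled here.
TAG_PATTERN = re.compile(r"^apache-iceberg-(\d+\.\d+\.\d+)-rc(\d+)$")


def parse_rc_tag(tag):
    """Split an RC tag into (version, rc_number); raise on unexpected input."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a recognized RC tag: {tag!r}")
    return match.group(1), int(match.group(2))


def check_rc_tag(tag, expected_version):
    """Return True if the version embedded in the tag equals expected_version."""
    version, _rc_number = parse_rc_tag(tag)
    return version == expected_version


if __name__ == "__main__":
    for tag in ("apache-iceberg-0.6.1-rc0", "apache-iceberg-1.6.1-rc1"):
        print(tag, check_rc_tag(tag, "1.6.1"))
```

The same comparison could be applied to the tarball directory name and the staged artifact versions, since all of them are expected to carry the same version string.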


[VOTE] Release Apache Iceberg 1.6.1 RC1

2024-08-19 Thread Carl Steinbach
Hi Everyone,

I propose that we release the following RC as the official Apache Iceberg
1.6.1 release.

The commit ID is e18a2fe10214f5f3ffa0a317a28af8b2a619817a
* This corresponds to the tag: apache-iceberg-1.6.1-rc1
* https://github.com/apache/iceberg/commits/apache-iceberg-1.6.1-rc1
*
https://github.com/apache/iceberg/tree/e18a2fe10214f5f3ffa0a317a28af8b2a619817a

The release tarball, signature, and checksums are here:
* https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.6.1-rc1

You can find the KEYS file here:
* https://dist.apache.org/repos/dist/dev/iceberg/KEYS

Convenience binary artifacts are staged on Nexus. The Maven repository URL
is:
* https://repository.apache.org/content/repositories/orgapacheiceberg-1170/

Please download, verify, and test.

Please vote in the next 72 hours.

[ ] +1 Release this as Apache Iceberg 1.6.1
[ ] +0
[ ] -1 Do not release this because...

Only PMC members have binding votes, but other community members are
encouraged to cast
non-binding votes. This vote will pass if there are 3 binding +1 votes and
more binding
+1 votes than -1 votes.
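
For the "download, verify, and test" step, the checksum part of the verification can be sketched as follows. This is a minimal illustration rather than the project's documented procedure: the function names are hypothetical, and it assumes the checksum file holds the SHA-512 hex digest as its first whitespace-separated token. Signature verification against the KEYS file is a separate step and is not covered here.

```python
import hashlib


def sha512_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(tarball_path, checksum_path):
    """Compare a tarball's digest with the one recorded in a checksum file.

    Assumption: the first whitespace-separated token in the checksum
    file is the expected hex digest.
    """
    with open(checksum_path, "r", encoding="utf-8") as handle:
        fields = handle.read().split()
    if not fields:
        raise ValueError(f"empty checksum file: {checksum_path}")
    return sha512_of(tarball_path) == fields[0].lower()
```

A caller would pass the local paths of the downloaded tarball and its checksum file; both file names depend on the release and are not assumed here.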


Re: [DISCUSS] Iceberg incubator report for February 2019

2019-02-06 Thread Carl Steinbach
Looks good to me. +1

- Carl

On Tue, Feb 5, 2019 at 4:17 PM Ryan Blue  wrote:

> Hi everyone,
>
> Here's a draft of the Iceberg community report for the IPMC this month.
> Please reply if you have comments or would like to add anything. I'm going
> to post this on the report page in the meantime, but I'll update it with
> any additions from this list. Thanks!
>
> 
> Iceberg
>
> Iceberg is a table format for large, slow-moving tabular data.
>
> Iceberg has been incubating since 2018-11-16.
>
> Three most important issues to address in the move towards graduation:
>
>   1. Update build for Apache release, add LICENSE/NOTICE to Jars.
>   2. Make the first Apache release.
> (https://github.com/apache/incubator-iceberg/milestone/1)
>   3. Grow the Iceberg community
>
> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
> aware of?
>
>   * No issues that require attention.
>
> How has the community developed since the last report?
>
>   * Pull requests from 6 contributors were merged, 7 new contributors
>
> How has the project developed since the last report?
>
>   * Submitted evidence for podling name search: PODLINGNAMESEARCH-163
>   * Netflix submitted a revised trademark agreement for counter-signing
>   * Abstracted data file locations for community use cases
>   * Reviewing proposed API update for file stream encryption plugins
>   * New contributor highlights:
> - A new contributor is fixing case sensitivity in expressions
> - A new contributor opened a PR to add a startsWith predicate
> - A new contributor reviewed 4 pull requests and opened another
>
> How would you assess the podling's maturity?
> Please feel free to add your own commentary.
>
>   [X] Initial setup (name clearance approval pending)
>   [X] Working towards first release
>   [X] Community building
>   [ ] Nearing graduation
>   [ ] Other:
>
> Date of last release:
>
>   None yet
>
> When were the last committers or PPMC members elected?
>
>   None yet
>
> Have your mentors been helpful and responsive or are things falling
> through the cracks? In the latter case, please list any open issues
> that need to be addressed.
>
>   Yes.
>
> Signed-off-by:
>
>   [X](iceberg) Ryan Blue
>  Comments: I wrote the first pass of the report, but after the
> deadline.
>   [ ](iceberg) Julien Le Dem
>  Comments:
>   [ ](iceberg) Owen O'Malley
>  Comments: Approval from +1 on dev list.
>   [ ](iceberg) James Taylor
>  Comments:
>   [ ](iceberg) Carl Steinbach
>  Comments:
>
>
> --
> Ryan Blue
>


Re: [DISCUSS] September report

2019-09-06 Thread Carl Steinbach
+1 to the report
+1 to graduation for the same set of reasons mentioned by Owen.

- Carl

On Fri, Sep 6, 2019 at 12:04 PM Owen O'Malley 
wrote:

> On Fri, Sep 6, 2019 at 12:19 AM Justin Mclean  wrote:
>
>> So why does the project think it's ready to graduate? Mentors do you
>> think the project is ready to graduate?
>>
>
> It has to make a release or two, but I agree with Ryan that it is approaching
> graduation. The project entered Apache with five Apache members from
> different companies. It has grown the community to include a few more
> companies. I think it is doing great.
>
> .. Owen
>
>


Re: [RESULT] [VOTE] Release Apache Iceberg 0.7.0-incubating RC4

2019-10-22 Thread Carl Steinbach
Belated +1 from me too.

On Tue, Oct 22, 2019, 12:43 PM Ryan Blue  wrote:

> Thanks James!
>
> The LICENSE and NOTICE in the runtime Jar are updated for the shaded
> dependencies. For NOTICE, only Airlift and ORC had notice file contents
> that needed to be included. For LICENSE, there are quite a few additional
> entries for the shaded Jars. You can find more detail on the PR:
> https://github.com/apache/incubator-iceberg/pull/356
>
> rb
>
> On Tue, Oct 22, 2019 at 12:37 PM James Taylor 
> wrote:
>
>> Belated +1 from me (will vote on incubator vote once started). I
>> successfully downloaded, verified license, verified checksum, verified
>> signature, built from source, and ran unit tests.
>>
>> I have similar questions as Jacques about the binary release. I think you
>> may need a top level LICENSE and NOTICE file that is a concatenation of the
>> LICENSE and NOTICE files of the transitive closure of dependencies. I
>> haven't had to deal with this in quite a while, though, so I may be off
>> here. Justin Mclean is very knowledgeable in this area. FWIW, most
>> incubator projects don't attempt a binary release as part of their first
>> release.
>>
>> On Tue, Oct 22, 2019 at 9:19 AM Ryan Blue  wrote:
>>
>>> With +1 votes and no +0 or -1 votes, this candidate passes.
>>>
>>> +1 votes:
>>> John Zhuge
>>> *Anton Okolnychyi
>>> Junjie Chen
>>> David Christle
>>> Thippana Vamsi Kalyan
>>> *Ryan Blue
>>> *Daniel Weeks
>>> *Parth Brahmbhatt
>>>
>>> *=binding PPMC vote
>>>
>>> The next step is to start a thread on the IPMC list. While in
>>> incubation, our releases are a two-stage process where the podling PMC
>>> validates and then the incubator PMC double-checks the result.
>>>
>>> rb
>>>
>>> On Fri, Oct 18, 2019 at 5:13 PM Ryan Blue  wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I propose the following RC to be released as official Apache Iceberg
>>>> 0.7.0-incubating release.
>>>>
>>>> The commit id is 9c81babac65351f7aa21dd878f01c5c81ae304af
>>>> * This corresponds to the tag: apache-iceberg-0.7.0-incubating-rc4
>>>> *
>>>> https://github.com/apache/incubator-iceberg/tree/apache-iceberg-0.7.0-incubating-rc4
>>>> *
>>>> https://github.com/apache/incubator-iceberg/tree/9c81babac65351f7aa21dd878f01c5c81ae304af
>>>>
>>>> The release tarball, signature, and checksums are here:
>>>> *
>>>> https://dist.apache.org/repos/dist/dev/incubator/iceberg/apache-iceberg-0.7.0-incubating-rc4/
>>>>
>>>> You can find the KEYS file here:
>>>> * https://dist.apache.org/repos/dist/dev/incubator/iceberg/KEYS
>>>>
>>>> Convenience binary artifacts are staged in Nexus. The Maven repository
>>>> URL is:
>>>> *
>>>> https://repository.apache.org/content/repositories/orgapacheiceberg-1004/
>>>>
>>>> This is the first Apache Iceberg release.
>>>>
>>>> Please download, verify, and test; then vote in the next 72 hours.
>>>>
>>>> [ ] +1 Release this as Apache Iceberg 0.7.0-incubating
>>>> [ ] +0
>>>> [ ] -1 Do not release this because...
>>>>
>>>> --
>>>> Ryan Blue
>>>>
>>>
>>>
>>> --
>>> Ryan Blue
>>>
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>


Re: [VOTE] Release Apache Iceberg 0.8.0-incubating RC2

2020-05-02 Thread Carl Steinbach
+1 (binding)


On Fri, May 1, 2020 at 9:38 AM RD  wrote:

> +1
> Validated all the steps mentioned.
>
> -R
>
> On Fri, May 1, 2020 at 9:31 AM Ryan Blue 
> wrote:
>
>> +1 (binding)
>>
>> Ran rat, validated checksums and signature, and ran the build.
>>
>> I noticed that the iceberg-spark-runtime Jar is about 22MB larger and it
>> looks like the problem is mainly that parquet-avro 1.11.0 is shading all of
>> fastutil without minimizing the Jar like parquet-column does. I tried
>> rolling back to 1.10.1, but that requires rolling back Avro as well, so I
>> think the best option right now is to continue with a 37MB runtime Jar. We
>> can fix this in a 0.8.1 release when Parquet releases 1.11.1 with a fix.
>>
>> rb
>>
>> On Thu, Apr 30, 2020 at 11:41 PM Gautam  wrote:
>>
>>>
>>> Ran checks on
>>> https://dist.apache.org/repos/dist/dev/incubator/iceberg/apache-iceberg-0.8.0-incubating-rc2/
>>>
>>> √ RAT checks passed
>>> √ signature is correct
>>> √ checksum is correct
>>> √ build from source (with java 8)
>>> √ run tests locally
>>>
>>> +1 (non-binding)
>>>
>>>
>>>
>>> On Thu, Apr 30, 2020 at 4:18 PM Samarth Jain 
>>> wrote:
>>>
 +1 (non-binding)
 all checks passed

 On Thu, Apr 30, 2020 at 4:06 PM John Zhuge  wrote:

> +1 (non-binding)
>
>1. Checked signature and checksum
>2. Checked license
>3. Built and ran unit tests.
>
>
> On Thu, Apr 30, 2020 at 2:24 PM Owen O'Malley 
> wrote:
>
>> +1
>>
>>1. Checked signature and checksum
>>2. Built and ran unit tests.
>>3. Checked ORC version :)
>>
>> On Monday, ORC released 1.6.3, so we should grab those fixes soon.
>>
>> .. Owen
>>
>> On Thu, Apr 30, 2020 at 12:34 PM Dongjoon Hyun <
>> dongjoon.h...@gmail.com> wrote:
>>
>>> +1.
>>>
>>> 1. Verified checksum, sig, and license
>>> 3. Build from the source and run UTs.
>>> 4. Run some manual ORC write/read tests with Apache Spark
>>> 2.4.6-SNAPSHOT (as of today).
>>>
>>> Thank you, all!
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>> On Thu, Apr 30, 2020 at 10:28 AM parth brahmbhatt <
>>> brahmbhatt.pa...@gmail.com> wrote:
>>>
 +1. checks passed, did not observe the unit test failure.

 Thanks
 Parth

 On Thu, Apr 30, 2020 at 9:13 AM Daniel Weeks 
 wrote:

> +1 all checks passed
>
> On Thu, Apr 30, 2020 at 8:53 AM Anton Okolnychyi
>  wrote:
>
>> That test uses many concurrent writes and I’ve seen cases when it
>> led to deadlocks in our test HMS. I think HMS is capable of recovering on
>> its own but that process can be slow in highly concurrent environments.
>> There is a 2 min timeout in that test so it can potentially fail. I’ve seen
>> a deadlock but 2 min was always enough for that test in my local env and
>> internal/upstream build pipelines. If there is an environment that
>> constantly or frequently hits this problem, it would be great to check
>> debug logs.
>>
>> I am +1 on releasing RC2. I checked it locally.
>>
>> - Anton
>>
>> On 30 Apr 2020, at 02:52, Mass Dosage 
>> wrote:
>>
>> The build for RC2 worked fine for me, I didn't get a failure on
>> "TestHiveTableConcurrency". Perhaps there is some kind of race condition in
>> the test? I have seen timeout errors like that when I ran tests on an
>> overloaded machine, could that have been the case?
>>
>> On Thu, 30 Apr 2020 at 08:32, OpenInx  wrote:
>>
>>> I checked the rc2, seems the TestHiveTableConcurrency is broken,
>>> may need to fix it.
>>>
>>> 1. Download the tarball and check the signature & checksum: OK
>>> 2. license checking: RAT checks passed.
>>> 3. Build and test the project (java8):
>>> org.apache.iceberg.hive.TestHiveTableConcurrency >
>>> testConcurrentConnections FAILED
>>> java.lang.AssertionError: Timeout
>>> at org.junit.Assert.fail(Assert.java:88)
>>> at org.junit.Assert.assertTrue(Assert.java:41)
>>> at
>>> org.apache.iceberg.hive.TestHiveTableConcurrency.testConcurrentConnections(TestHiveTableConcurrency.java:106)
>>>
>>> On Thu, Apr 30, 2020 at 9:29 AM Ryan Blue 
>>> wrote:
>>>
 Hi everyone,

 I propose the following candidate to be released as the
 official Apache Iceberg 0.8.0-incubating release.

 The commit id is 8c05a2f5f1c8b111c049d43cf15cd8a51920dda1
 * This corresponds to the tag:
 apache-iceberg-0.8.0-incubating-rc2
 *
>>>

Re: [VOTE] Release Apache Iceberg 0.9.1 RC0

2020-08-12 Thread Carl Steinbach
LGTM
+1 (binding)


On Wed, Aug 12, 2020 at 11:47 AM Ryan Murray  wrote:

> 1. Verify the signature: OK
> 2. Verify the checksum: OK
> 3. Untar the archive tarball: OK
> 4. Run RAT checks to validate license headers: RAT checks passed
> 5. Build and test the project: all unit tests passed.
>
> +1 (non-binding)
>
> Best,
>
> Ryan
>
> On Tue, Aug 11, 2020 at 6:56 PM Ryan Blue  wrote:
>
>> Hi everyone,
>>
>> I propose the following RC to be released as official Apache Iceberg
>> 0.9.1 release.
>>
>> The commit id is e7c59ec83bba9b2d934ef124d36eabfcdd33f319
>> * This corresponds to the tag: apache-iceberg-0.9.1-rc0
>> * https://github.com/apache/iceberg/commits/apache-iceberg-0.9.1-rc0
>> * https://github.com/apache/iceberg/tree/e7c59ec8
>>
>> The release tarball, signature, and checksums are here:
>> *
>> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-0.9.1-rc0/
>>
>> You can find the KEYS file here:
>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>
>> Convenience binary artifacts are staged in Nexus. The Maven repository
>> URL is:
>> *
>> https://repository.apache.org/content/repositories/orgapacheiceberg-1009/
>>
>> This is a patch release on top of 0.9.0 with several fixes:
>> * A correctness fix for ORC timestamps before 1 Jan 1970 that were
>> written by Spark
>> * A read fix for ORC decimals with precision less than 18
>> * A fix to support push-down predicates with negated expressions
>> * Fixes for imports from other shaded Guava locations
>>
>> For the full list, see the 0.9.x branch:
>> https://github.com/apache/iceberg/commits/0.9.x
>>
>> Please download, verify, and test.
>>
>> Please vote in the next 72 hours.
>>
>> [ ] +1 Release this as Apache Iceberg 0.9.1
>> [ ] +0
>> [ ] -1 Do not release this because...
>>
>>
>> --
>> Ryan Blue
>>
>
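
The "verify the checksum" step listed in the reply above can be sketched as follows. This is a hedged illustration, not the project's official procedure: it assumes the published checksum is SHA-512 and that the checksum file starts with the hex digest, and the helper names `sha512_of` and `checksum_matches` are hypothetical. Signature verification against the KEYS file is a separate step and is not covered here.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(tarball: Path, checksum_file: Path) -> bool:
    """Compare a tarball's digest with the one recorded in a checksum file.

    Assumption: the first whitespace-separated token in the checksum file
    is the expected hex digest (the "digest  filename" layout).
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(tarball) == expected
```

A caller would pass the downloaded tarball and its accompanying checksum file, and treat a `False` result as a reason to vote -1 or to download again.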


Re: Adobe Blog ..

2021-01-15 Thread Carl Steinbach
I'll do it!

On Fri, Jan 15, 2021 at 7:46 PM Ryan Blue  wrote:

> Yes, we should start a page with posts like these. LinkedIn recently had a
> good one, too:
> https://engineering.linkedin.com/blog/2021/fastingest-low-latency-gobblin
>
> Anyone want to start a PR for the docs?
>
> On Fri, Jan 15, 2021 at 4:28 PM Russell Spitzer 
> wrote:
>
>> Can't wait to read them!
>>
>> On Fri, Jan 15, 2021 at 6:25 PM Gautam  wrote:
>>
>>> > I think it would be great to add a section to the website linking to
>>> helpful articles, slide decks, etc about Iceberg. In the trenches
>>> information is often the most useful
>>>
>>> +1 ..IIRC  there were also some thoughts around adding a "*PoweredBy*"
>>> page? Our team often gets asked by folks: "*Who else is using Iceberg*?",
>>> I would love to point them to the fast growing list of companies/teams that
>>> do :-) .. Wdyt?
>>>
>>> On Fri, Jan 15, 2021 at 4:00 PM Jacques Nadeau 
>>> wrote:
>>>
 +1. This is a great series.

 I think it would be great to add a section to the website linking to
 helpful articles, slide decks, etc about Iceberg. In the trenches
 information is often the most useful.

 On Fri, Jan 15, 2021 at 3:43 PM Ryan Blue 
 wrote:

> Thanks, Gautam! I was just reading the one on query optimizations.
> Great that you are writing this series, I think it will be helpful.
>
> On Fri, Jan 15, 2021 at 3:36 PM Gautam 
> wrote:
>
>> Hello Devs,
>>   We at Adobe have been penning down our experiences
>> with Apache Iceberg thus far. Here is the third blog in that series titled:
>> "Taking Query Optimizations to the Next Level with Iceberg" *[1]*.
>> In case you haven't, here are the first two blogs titled "Iceberg at Adobe"
>> *[2]* and "High Throughput Ingestion with Iceberg" *[3]*.
>>
>> Hoping these are helpful to others..
>>
>> thanks and regards,
>> -Gautam.
>>
>> [1] -
>> https://medium.com/adobetech/taking-query-optimizations-to-the-next-level-with-iceberg-6c968b83cd6f
>> [2] - https://medium.com/adobetech/iceberg-at-adobe-88cf1950e866
>> [3] -
>> https://medium.com/adobetech/high-throughput-ingestion-with-iceberg-ccf7877a413f
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>

>
> --
> Ryan Blue
> Software Engineer
> Netflix
>


Re: [VOTE] Release Apache Iceberg 0.11.1 RC0

2021-03-31 Thread Carl Steinbach
+1 (binding)

+ Built and tested code following instructions in README
+ Verified checksums

- Carl

On Wed, Mar 31, 2021 at 1:13 PM John Zhuge  wrote:

> +1 (non-binding)
>
> Checked signature, checksum, and license.
> Ran build and test.
>
> On Wed, Mar 31, 2021 at 5:00 AM Peter Vary 
> wrote:
>
>> +1
>> Checked the signatures
>> Run build and tests
>>
>> Sadly did not have time to run manual Hive tests this time and I will be
>> out till mid next week. :(
>>
>> Thanks,
>> Peter
>>
>> On Mar 30, 2021, at 23:41, Russell Spitzer 
>> wrote:
>>
>> +1 -
>> Ran the tests
>> Checked the Checksum
>> Made sure there were no binary files anywhere in the source release
>> (there is drama going on elsewhere about this)
>>
>> On Tue, Mar 30, 2021 at 3:55 PM Edgar Rodriguez <
>> edgar.rodrig...@airbnb.com.invalid> wrote:
>>
>>> +1 (non-binding)
>>>
>>> - Verified build, signature and checksum.
>>> - Ran internal integration tests.
>>>
>>> Cheers,
>>>
>>> On Tue, Mar 30, 2021 at 7:50 AM Ryan Murray  wrote:
>>>
 +1 (non-binding)

 verified build, tests, signature, checksum.

 Best,
 Ryan

 On Tue, Mar 30, 2021 at 4:40 AM Jack Ye  wrote:

> +1 (non-binding)
>
> Verified build, unit test, AWS integration test, signature, checksum.
> Verified fix of #2146, #2267, #2333 in AWS EMR Spark3 environment.
>
> Best,
> Jack Ye
>
> On Mon, Mar 29, 2021 at 5:58 PM Anton Okolnychyi <
> aokolnyc...@apple.com.invalid> wrote:
>
>> +1 (binding)
>>
>> Checked the signature and checksum, ran RAT checks and tests.
>>
>> Had to trigger tests twice due to a HMS related failure in DELETE
>> tests in Spark extensions. We have noticed that problem while testing
>> 0.11.0 RCs and it was fixed in a later commit, which is not part of 0.11.1.
>>
>>
>> https://github.com/apache/iceberg/commit/19622dcfcb426485748fa017a6181e23df5732dc
>>
>> - Anton
>>
>> On 29 Mar 2021, at 17:42, Anton Okolnychyi <
>> aokolnyc...@apple.com.INVALID> wrote:
>>
>> Here is the link to steps we normally use to validate a release
>> candidate:
>>
>> https://lists.apache.org/thread.html/rd5e6b1656ac80252a9a7d473b36b6227da91d07d86d4ba4bee10df66%40%3Cdev.iceberg.apache.org%3E
>> 
>>
>> - Anton
>>
>> On 29 Mar 2021, at 17:41, Anton Okolnychyi <
>> aokolnyc...@apple.com.INVALID> wrote:
>>
>> Hi everyone,
>>
>> I propose the following RC to be released as official Apache Iceberg
>> 0.11.1 release.
>>
>> The commit id is 29cf712a821aa937e176f2d79a5593c4a1429e7f
>> * This corresponds to the tag: apache-iceberg-0.11.1-rc0
>> * https://github.com/apache/iceberg/commits/apache-iceberg-0.11.1-rc0
>> *
>> https://github.com/apache/iceberg/tree/29cf712a821aa937e176f2d79a5593c4a1429e7f
>>
>> The release tarball, signature, and checksums are here:
>> *
>> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-0.11.1-rc0/
>>
>> You can find the KEYS file here (make sure to import the new key that
>> was used to sign the release):
>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>
>> Convenience binary artifacts are staged in Nexus. The Maven
>> repository URL is:
>> *
>> https://repository.apache.org/content/repositories/orgapacheiceberg-1016/
>>
>> This patch release includes these fixes:
>> https://github.com/apache/iceberg/milestone/13?closed=1
>>
>> Please download, verify, and test.
>>
>> Please vote in the next 72 hours.
>>
>> [ ] +1 Release this as Apache Iceberg 0.11.1
>> [ ] +0
>> [ ] -1 Do not release this because…
>>
>> Thanks,
>> Anton
>>
>>
>>
>>
>>>
>>> --
>>> Edgar R
>>>
>>
>>
>
> --
> John Zhuge
>


[NOTES] 23 June 2021 Iceberg Community Meeting

2021-07-01 Thread Carl Steinbach
Iceberg Community Meetings are open to everyone. To receive an invitation
to the next meeting, please join the iceberg-s...@googlegroups.com
<https://groups.google.com/g/iceberg-sync> list.

Special thanks to Ryan Blue for contributing most of these notes.

Attendees: Anjali Norwood, Badrul Chowdhury, Ben Mears, Dan Weeks, Gustavo
Torres Torres, Jack Ye, Karuppayya Rajendran, Kyle Bendickson, Parth
Brahmbhatt, Russell Spitzer, Ryan Blue, Sreeram Garlapati, Szehon Ho, Wing Yew
Poon, Xinbin Huang, Yan Yan, Carl Steinbach


- Highlights
  - JDBC catalog was committed (Thanks, Ismail!)
  - DynamoDB catalog was committed (Thanks, Jack!)
  - Added predicate pushdown for partitions metadata table (Thanks,
    Szehon!)

- Releases
  - 0.12.0
    - New Actions API update
      - Almost done with compaction.
      - Need to make the old API deprecated (to confirm)
    - Spark 3.1 support
      - Recently rebased on master
        https://github.com/apache/iceberg/pull/2512
      - No longer adds new modules, should be ready to commit.
  - Feature-based or time-based release cycle?
    - Carl: A time-based release cycle would be more predictable, not
      slipping because of some feature that isn’t quite ready. This could
      be monthly or quarterly.
    - Ryan: We already try not to hold back releases to get features in
      because it is better to release more often than to let them slip.
      But we could be better about this. It’s important to continuously
      release so that changes get back out to contributors.
    - The consensus was to discuss this on the dev list. It is a
      promising idea.
  - Iceberg 1.0?
    - Carl: Semver is a lie, and there is a public perception around 1.0
      releases. Should we go ahead and target a 1.0 soon?
    - Ryan: What do you mean that semver is a lie?
    - Carl: If semver were followed carefully, most projects would be on
      a major version in the 100s. Many things change, and the version
      doesn’t always reflect it.
    - Ryan: That’s fair, but I think people still make downstream
      decisions based on how those version numbers change.
    - Jack: There is an expectation that breaking changes are signaled
      by increasing the major version, or more accurately, that not
      increasing the major version indicates no major APIs are broken.
    - Ryan: Also, bumping up to 1.0 is when people start expecting more
      rigid enforcement of semver, even if it isn’t always done. If we
      want to update to 1.0 and/or drop semver, we should figure out our
      guarantees and document them clearly. And we should also prepare
      for more API stability. Maybe add binary compatibility checks to
      the build.
    - The consensus was to discuss this more on the dev list and target
      a 1.0 for later this year with clear guidelines about API
      compatibility.

- New slack community: apache-iceberg.slack.com
  <https://communityinviter.com/apps/apache-iceberg/apache-iceberg-website>
  - It’s easy to sign up for ASF Slack here:
    https://s.apache.org/slack-invite
  - No need for an independent Iceberg workspace.

- Any updates on the secondary index design?
  - Miao and Guy weren’t at the meeting, so no update.
  - Jack is going to look into this and help out.

- Github triage permissions for project contributors
  - Carl opened an INFRA ticket for anyone with 2 or more contributions
  - We will see if infra can add everyone.
  - Ref: INFRA-22026, INFRA-22031

- Updating partitioning via Optimize/RewriteDataFiles
  - Russell: We ran into an issue where compaction with multiple
    partition specs will create many small files---planning groups files
    by current spec, but writing can split data for the new spec. Since
    this is a rare event (unmerged data in an old spec), the solution is
    to merge files for the old spec separately.
  - Ryan: sounds reasonable.

- Low-latency streaming
  - Sreeram: We are trying to see how frequently we can commit to an
    Iceberg table. Looking to get to commits every 1-2 seconds. One main
    issue we’ve found is that there are several metadata files written
    for every commit: at least one manifest, the manifest list, and the
    metadata JSON file. Plus, the metadata JSON file has many snapshots
    and gets quite large (3MB+) after a day of frequent commits. Is there
    a way to improve on how the JSON file tracks snapshots?
  - Ryan: There is space to improve this. I’ve thought about replacing
    t

Iceberg 0.12.0 Release Plan

2021-07-12 Thread Carl Steinbach
Hi Everyone,


I volunteered to be the release manager for the 0.12.0 release. My goal is
to start cutting release candidates this Friday, 7/17 at 24:00 PST. I am
tracking progress on this release on the Release 0.12.0 Project Board
<https://github.com/apache/iceberg/projects/1>. Note that this board only
tracks issues that BLOCK the release.


There are currently 7 PRs listed in the To-Do
<https://github.com/apache/iceberg/projects/1#column-14786627> column. I am
going to remove these PRs from the 0.12.0 release tomorrow, 7/13 at 24:00
PST, unless they are updated with the following information:


1) Why does the PR need to be included in the 0.12.0 release?

2) When will the PR be ready for review?


If you have any questions or concerns, please get in touch with me over
email or slack.


Thanks.


- Carl


Re: Iceberg 0.12.0 Release Plan

2021-07-12 Thread Carl Steinbach
Hi Grant,

Good catch! I added PR-1648 <https://github.com/apache/iceberg/pull/1648> to
the 0.12.0 project board.

- Carl

On Mon, Jul 12, 2021 at 1:16 PM Grant Nicholas  wrote:

> Howdy! Any updates on PR-1648
> <https://github.com/apache/iceberg/pull/1648> which upgrades the avro
> version used in iceberg? I do not see it in the "To-Do" section linked
> above, but the older avro version has caused major problems described in
> this issue and it would be nice to get in 0.12.0.
> https://github.com/apache/iceberg/issues/1654
>
>
>
> On Mon, Jul 12, 2021 at 2:39 PM Carl Steinbach  wrote:
>
>> Hi Everyone,
>>
>>
>> I volunteered to be the release manager for the 0.12.0 release. My goal
>> is to start cutting release candidates this Friday, 7/17 at 24:00 PST. I am
>> tracking progress on this release on the Release 0.12.0 Project Board
>> <https://github.com/apache/iceberg/projects/1>. Note that this board
>> only tracks issues that BLOCK the release.
>>
>>
>> There are currently 7 PRs listed in the To-Do
>> <https://github.com/apache/iceberg/projects/1#column-14786627> column. I
>> am going to remove these PRs from the 0.12.0 release tomorrow, 7/13 at
>> 24:00 PST, unless they are updated with the following information:
>>
>>
>> 1) Why does the PR need to be included in the 0.12.0 release?
>>
>> 2) When will the PR be ready for review?
>>
>>
>> If you have any questions or concerns, please get in touch with me over
>> email or slack.
>>
>>
>> Thanks.
>>
>>
>> - Carl
>>
>>
>>
>
> --
>
> [image: SpotHero] <http://spothero.com/>
>
> Grant Nicholas / Senior Data Engineer II
> Pronouns <https://lgbt.ucsf.edu/pronounsmatter>: he/his
> gr...@spothero.com
> spothero.com <http://www.spothero.com/> | LinkedIn
> <https://www.linkedin.com/in/grantanicholas>
> *Your perfect spot is waiting for you! Learn more at spothero.com/careers
> <http://spothero.com/careers>*
>


Re: Iceberg 0.12.0 Release Plan

2021-07-19 Thread Carl Steinbach
Hi Everyone,

Currently, there are three issues blocking the release of 0.12.0:


   1. #2308 Handle the case that RewriteFiles and RowDelta commit the
   transaction at the same time
   <https://github.com/apache/iceberg/issues/2308>
   2. #2783 Metadata Table Empty Projection - Unknown type for int field.
   Type name: java.lang.string
   <https://github.com/apache/iceberg/issues/2783>
   3. #2284 Core: reassign the partition field IDs and reuse any existing IDs
   <https://github.com/apache/iceberg/pull/2284>

#2284 is in review.

Ryan said he would take a look at #2308.

@Szehon Ho, can you please confirm whether or not
you're working on #2783?

Thanks.

- Carl



On Mon, Jul 19, 2021 at 12:31 PM Jack Ye  wrote:

> I haven't heard any news for the 0.12.0 release since then, are we still
> planning for the release?
>
> Please let me know if there is anything we can do to help speed up the
> process. (I just saw the release board, will try to at least review those
> PRs)
>
> Best,
> Jack Ye
>
> On Mon, Jul 12, 2021 at 5:41 PM Sreeram Garlapati 
> wrote:
>
>> Great, thanks Ryan.
>>
>> On Mon, Jul 12, 2021 at 5:17 PM Ryan Blue  wrote:
>>
>>> Sreeram, I was just waiting for tests to pass on that PR. I just merged
>>> it.
>>>
>>> On Mon, Jul 12, 2021 at 4:41 PM Sreeram Garlapati <
>>> gsreeramku...@gmail.com> wrote:
>>>
>>>> Hi Carl,
>>>>
>>>> Thanks a lot for managing 0.12.0 release.
>>>>
>>>> Can you also pl. add this PR:
>>>> https://github.com/apache/iceberg/pull/2752 - which adds the option "
>>>> streaming-skip-delete-snapshots" - to Spark3 micro_batch reader.
>>>> Without this, streaming reads will fail if a snapshot of type delete or
>>>> replace is encountered, & is pretty much unusable. This PR is already
>>>> approved by multiple Committers - Ryan and Russell.
>>>>
>>>> PS: I am unsure if new PRs will be merged apart from the list proposed
>>>> on the project board - into the *0.12.0* release, and hence, proposing
>>>> this. If this PR will be merged - no action is needed. pl. pardon my
>>>> ignorance.
>>>>
>>>> Best regards,
>>>> Sreeram
>>>>
>>>> On Mon, Jul 12, 2021 at 4:14 PM Carl Steinbach  wrote:
>>>>
>>>>> Hi Grant,
>>>>>
>>>>> Good catch! I added PR-1648
>>>>> <https://github.com/apache/iceberg/pull/1648> to the 0.12.0 project
>>>>> board.
>>>>>
>>>>> - Carl
>>>>>
>>>>> On Mon, Jul 12, 2021 at 1:16 PM Grant Nicholas 
>>>>> wrote:
>>>>>
>>>>>> Howdy! Any updates on PR-1648
>>>>>> <https://github.com/apache/iceberg/pull/1648> which upgrades the
>>>>>> avro version used in iceberg? I do not see it in the "To-Do" section
>>>>>> linked above, but the older avro version has caused major problems
>>>>>> described in this issue and it would be nice to get in 0.12.0.
>>>>>> https://github.com/apache/iceberg/issues/1654
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 12, 2021 at 2:39 PM Carl Steinbach 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Everyone,
>>>>>>>
>>>>>>>
>>>>>>> I volunteered to be the release manager for the 0.12.0 release. My
>>>>>>> goal is to start cutting release candidates this Friday, 7/17 at 24:00
>>>>>>> PST. I am tracking progress on this release on the Release 0.12.0
>>>>>>> Project Board <https://github.com/apache/iceberg/projects/1>. Note
>>>>>>> that this board only tracks issues that BLOCK the release.
>>>>>>>
>>>>>>>
>>>>>>> There are currently 7 PRs listed in the To-Do
>>>>>>> <https://github.com/apache/iceberg/projects/1#column-14786627>
>>>>>>> column. I am going to remove these PRs from the 0.12.0 release tomorrow,
>>>>>>> 7/13 at 24:00 PST, unless they are updated with the following
>>>>>>> information:
>>>>>>>
>>>>>>>
>>>>>>> 1) Why does the PR need to be included in the 0.12.0 release?
>>>>>>>
>>>>>>> 2) When will the PR be ready for review?
>>>>>>>
>>>>>>>
>>>>>>> If you have any questions or concerns, please get in touch with me
>>>>>>> over email or slack.
>>>>>>>
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>>
>>>>>>> - Carl
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> [image: SpotHero] <http://spothero.com/>
>>>>>>
>>>>>> Grant Nicholas / Senior Data Engineer II
>>>>>> Pronouns <https://lgbt.ucsf.edu/pronounsmatter>: he/his
>>>>>> gr...@spothero.com
>>>>>> spothero.com <http://www.spothero.com/> | LinkedIn
>>>>>> <https://www.linkedin.com/in/grantanicholas>
>>>>>> *Your perfect spot is waiting for you! Learn more
>>>>>> at spothero.com/careers <http://spothero.com/careers>*
>>>>>>
>>>>>
>>>
>>> --
>>> Ryan Blue
>>> Tabular
>>>
>>


Re: [VOTE] Adopt the v2 spec changes

2021-08-01 Thread Carl Steinbach
+1 (binding)

On Wed, Jul 28, 2021 at 6:43 PM Daniel Weeks  wrote:

> +1
>
> On Wed, Jul 28, 2021, 1:36 PM Kyle Bendickson
>  wrote:
>
>> +1 (non-binding)
>>
>> 
>>
>> *Kyle Bendickson*
>> Software Engineer
>> Apple
>> ACS Data
>> One Apple Park Way,
>> Cupertino, CA 95014, USA
>> kbendick...@apple.com
>>
>> This email and any attachments may be privileged and may contain
>> confidential information intended only for the recipient(s) named above.
>> Any other distribution, forwarding, copying or disclosure of this message
>> is strictly prohibited. If you have received this email in error, please
>> notify me immediately by telephone or return email, and delete this message
>> from your system.
>>
>> On Jul 27, 2021, at 9:58 AM, Ryan Blue  wrote:
>>
>> I’d like to propose that we adopt the pending v2 spec changes as the
>> supported v2 spec. The full list of changes is documented in the v2
>> summary section of the spec .
>>
>> The major breaking change is the addition of delete files and metadata to
>> track delete files. In addition, there are a few other minor breaking
>> changes. For example, v2 drops the block_size_in_bytes field in
>> manifests that was previously required and also omits fields in table
>> metadata that are now tracked by lists; schema is no longer written in
>> favor of schemas. Other changes are forward compatible, mostly
>> tightening field requirements where possible (e.g., schemas and
>> current-schema-id are now required).
>>
>> Adopting the changes will signal that the community intends to support
>> the current set of changes and will guarantee forward-compatibility for v2
>> tables that implement the current v2 spec. Any new breaking changes would
>> go into v3.
>>
>> Please vote on adopting the v2 changes in the next 72 hours.
>>
>> [ ] +1 Adopt the changes as v2
>> [ ] +0
>> [ ] -1 Do not adopt the changes, because…
>> --
>> Ryan Blue
>>
>>
>>
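
The tightened field requirements mentioned in the proposal above (`schemas` and `current-schema-id` becoming required in v2) can be illustrated with a small check over table metadata loaded as a dictionary. This is a rough sketch under stated assumptions: the function name `missing_v2_fields` is hypothetical, the version is assumed to be recorded under a `format-version` key, and only the two fields named in the email are checked, not the full v2 spec.

```python
def missing_v2_fields(metadata: dict) -> list[str]:
    """Return the v2-required fields named in the proposal that are absent.

    Assumptions: `metadata` is table metadata parsed from JSON, and the
    table version is stored under "format-version". Tables that are not
    v2 are not checked and yield an empty list.
    """
    if metadata.get("format-version") != 2:
        return []
    required = ("schemas", "current-schema-id")
    return [name for name in required if name not in metadata]


if __name__ == "__main__":
    # A v2-style document that still uses the older single "schema" field.
    legacy_style = {"format-version": 2, "schema": {}}
    print(missing_v2_fields(legacy_style))
```

The check only reports missing keys; validating the contents of each schema is out of scope for this sketch.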


Re: Iceberg 0.12.0 Release Plan

2021-08-02 Thread Carl Steinbach
I want to provide everyone with a quick update on the 0.12.0 release
process. At this point, #2906 Fix Partition field IDs in table replacement
<https://github.com/apache/iceberg/pull/2906> is the only remaining
blocker. Ryan is working on a fix and predicts that we'll be ready to cut
the first release candidate by the end of this week.

- Carl

On Mon, Jul 19, 2021 at 2:58 PM Szehon Ho  wrote:

> Hi Carl,
>
> For the Issue: https://github.com/apache/iceberg/issues/2783
>
> The status is: I gave it a bit of a try but couldn’t find an easy fix, so
> hoping someone more knowledgeable about this code has cycles to take a look
> at it.
>
> It would be great to fix it for 0.12 as it seems to block more metadata
> queries than before, but for timing purposes I’m not sure if it’s feasible.
>
> Thanks
> Szehon
>
> On Mon, Jul 19, 2021 at 2:19 PM Carl Steinbach  wrote:
>
>> Hi Everyone,
>>
>> Currently, there are three issues blocking the release of 0.12.0:
>>
>>
>>1. #2308 Handle the case that RewriteFiles and RowDelta commit the
>>transaction at the same time
>><https://github.com/apache/iceberg/issues/2308>
>>2. #2783 Metadata Table Empty Projection - Unknown type for int
>>field. Type name: java.lang.string
>><https://github.com/apache/iceberg/issues/2783>
>>3. #2284 Core: reassign the partition field IDs and reuse any
>>existing IDs <https://github.com/apache/iceberg/pull/2284>
>>
>> #2284 is in review.
>>
>> Ryan said he would take a look at #2308.
>>
>> @Szehon Ho, can you please confirm whether or not
>> you're working on #2783?
>>
>> Thanks.
>>
>> - Carl
>>
>>
>>
>> On Mon, Jul 19, 2021 at 12:31 PM Jack Ye  wrote:
>>
>>> I haven't heard any news for the 0.12.0 release since then, are we still
>>> planning for the release?
>>>
>>> Please let me know if there is anything we can do to help speed up the
>>> process. (I just saw the release board, will try to at least review those
>>> PRs)
>>>
>>> Best,
>>> Jack Ye
>>>
>>> On Mon, Jul 12, 2021 at 5:41 PM Sreeram Garlapati <
>>> gsreeramku...@gmail.com> wrote:
>>>
>>>> Great, thanks Ryan.
>>>>
>>>> On Mon, Jul 12, 2021 at 5:17 PM Ryan Blue  wrote:
>>>>
>>>>> Sreeram, I was just waiting for tests to pass on that PR. I just
>>>>> merged it.
>>>>>
>>>>> On Mon, Jul 12, 2021 at 4:41 PM Sreeram Garlapati <
>>>>> gsreeramku...@gmail.com> wrote:
>>>>>
>>>>>> Hi Carl,
>>>>>>
>>>>>> Thanks a lot for managing 0.12.0 release.
>>>>>>
>>>>>> Can you also please add this PR:
>>>>>> https://github.com/apache/iceberg/pull/2752 - which adds the option
>>>>>> "streaming-skip-delete-snapshots" to the Spark3 micro_batch reader.
>>>>>> Without this, streaming reads fail if a snapshot of type delete or
>>>>>> replace is encountered, which makes the reader pretty much unusable.
>>>>>> This PR is already approved by multiple committers - Ryan and Russell.
>>>>>>
>>>>>> PS: I am unsure whether new PRs, apart from the list proposed on the
>>>>>> project board, will be merged into the *0.12.0* release, hence this
>>>>>> request. If this PR will be merged anyway, no action is needed. Please
>>>>>> pardon my ignorance.
>>>>>>
>>>>>> Best regards,
>>>>>> Sreeram
>>>>>>
>>>>>> On Mon, Jul 12, 2021 at 4:14 PM Carl Steinbach 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Grant,
>>>>>>>
>>>>>>> Good catch! I added PR-1648
>>>>>>> <https://github.com/apache/iceberg/pull/1648> to the 0.12.0 project
>>>>>>> board.
>>>>>>>
>>>>>>> - Carl
>>>>>>>
>>>>>>> On Mon, Jul 12, 2021 at 1:16 PM Grant Nicholas 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Howdy! Any updates on PR-1648
>>>>>>>> <https://github.com/apache/iceberg/pull/1648> which upgrades the
>>>>>>>> avro version used in iceberg? I do not see it in the "To-Do" section 
>>>>>>&g

Re: Iceberg 0.12.0 Release Plan

2021-08-02 Thread Carl Steinbach
Hi Jack,

I added #2887 to the 0.12.0 release project board
<https://github.com/apache/iceberg/projects/1>. When do you think this
patch will be committed?

- Carl

On Mon, Aug 2, 2021 at 4:28 PM Ryan Blue  wrote:

> Jack, I've been reviewing that one so that we can get it in. Thanks for
> fixing it!
>
> On Mon, Aug 2, 2021 at 3:12 PM Jack Ye  wrote:
>
>> Thanks for the update Carl!
>>
>> Given that we have voted for the adoption of format v2, can we also get
>> this change in, so that people can start to use v2 tables?
>> https://github.com/apache/iceberg/pull/2887
>>
>> -Jack
>>
>> On Mon, Aug 2, 2021 at 3:08 PM Carl Steinbach  wrote:
>>
>>> I want to provide everyone with a quick update on the 0.12.0 release
>>> process. At this point, #2906 Fix Partition field IDs in table
>>> replacement <https://github.com/apache/iceberg/pull/2906> is the only
>>> remaining blocker. Ryan is working on a fix and predicts that we'll be
>>> ready to cut the first release candidate by the end of this week.
>>>
>>> - Carl
>>>
>>> On Mon, Jul 19, 2021 at 2:58 PM Szehon Ho 
>>> wrote:
>>>
>>>> Hi Carl,
>>>>
>>>> For the Issue: https://github.com/apache/iceberg/issues/2783
>>>>
>>>> The status is: I gave it a bit of a try but couldn’t find an easy fix, so
>>>> I'm hoping someone more knowledgeable about this code has cycles to take a
>>>> look at it.
>>>>
>>>> It would be great to fix it for 0.12 as it seems to block more metadata
>>>> queries than before, but given the timing I’m not sure it’s feasible.
>>>>
>>>> Thanks
>>>> Szehon
>>>>
>>>> On Mon, Jul 19, 2021 at 2:19 PM Carl Steinbach  wrote:
>>>>
>>>>> Hi Everyone,
>>>>>
>>>>> Currently, there are three issues blocking the release of 0.12.0:
>>>>>
>>>>>
>>>>>    1. #2308 Handle the case that RewriteFiles and RowDelta commit the
>>>>>       transaction at the same time
>>>>>       <https://github.com/apache/iceberg/issues/2308>
>>>>>    2. #2783 Metadata Table Empty Projection - Unknown type for int
>>>>>       field. Type name: java.lang.string
>>>>>       <https://github.com/apache/iceberg/issues/2783>
>>>>>    3. #2284 Core: reassign the partition field IDs and reuse any
>>>>>       existing IDs <https://github.com/apache/iceberg/pull/2284>
>>>>>
>>>>> #2284 is in review.
>>>>>
>>>>> Ryan said he would take a look at #2308.
>>>>>
>>>>> @Szehon Ho, can you please confirm whether or
>>>>> not you're working on #2783?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> - Carl
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jul 19, 2021 at 12:31 PM Jack Ye  wrote:
>>>>>
>>>>>> I haven't heard any news for the 0.12.0 release since then, are we
>>>>>> still planning for the release?
>>>>>>
>>>>>> Please let me know if there is anything we can do to help speed up
>>>>>> the process. (I just saw the release board, will try to at least review
>>>>>> those PRs)
>>>>>>
>>>>>> Best,
>>>>>> Jack Ye
>>>>>>
>>>>>> On Mon, Jul 12, 2021 at 5:41 PM Sreeram Garlapati <
>>>>>> gsreeramku...@gmail.com> wrote:
>>>>>>
>>>>>>> Great, thanks Ryan.
>>>>>>>
>>>>>>> On Mon, Jul 12, 2021 at 5:17 PM Ryan Blue  wrote:
>>>>>>>
>>>>>>>> Sreeram, I was just waiting for tests to pass on that PR. I just
>>>>>>>> merged it.
>>>>>>>>
>>>>>>>> On Mon, Jul 12, 2021 at 4:41 PM Sreeram Garlapati <
>>>>>>>> gsreeramku...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Carl,
>>>>>>>>>
>>>>>>>>> Thanks a lot for managing 0.12.0 release.
>>>>>>>>>
>>>>>>>>> Can you also pl. add this PR:
>>>>>>>>> https://github.com/apache/iceberg/pull/2752 - which adds the
>>>>>>>>> option "streami

Re: Iceberg 0.12.0 Release Plan

2021-08-03 Thread Carl Steinbach
Thanks, Jack and Ryan, for closing out these issues. Since there are no
more blockers, I will start preparing a release candidate.

- Carl

On Tue, Aug 3, 2021 at 3:29 PM Ryan Blue  wrote:

> I just merged the PR. Thanks for making it possible to create v2 tables
> easily, Jack!
>
> On Tue, Aug 3, 2021 at 1:02 PM Jack Ye  wrote:
>
>> Thanks! The PR is mostly ready and just got approved by Anton. As soon as
>> it is merged we can start the branch cut.
>> -Jack
>>
>> On Mon, Aug 2, 2021 at 11:39 PM Carl Steinbach 
>> wrote:
>>
>>> Hi Jack,
>>>
>>> I added #2887 to the 0.12.0 release project board
>>> <https://github.com/apache/iceberg/projects/1>. When do you think this
>>> patch will be committed?
>>>
>>> - Carl
>>>
>>> On Mon, Aug 2, 2021 at 4:28 PM Ryan Blue  wrote:
>>>
>>>> Jack, I've been reviewing that one so that we can get it in. Thanks for
>>>> fixing it!
>>>>
>>>> On Mon, Aug 2, 2021 at 3:12 PM Jack Ye  wrote:
>>>>
>>>>> Thanks for the update Carl!
>>>>>
>>>>> Given that we have voted for the adoption of format v2, can we also
>>>>> get this change in, so that people can start to use v2 tables?
>>>>> https://github.com/apache/iceberg/pull/2887
>>>>>
>>>>> -Jack
>>>>>
>>>>> On Mon, Aug 2, 2021 at 3:08 PM Carl Steinbach  wrote:
>>>>>
>>>>>> I want to provide everyone with a quick update on the 0.12.0 release
>>>>>> process. At this point, #2906 Fix Partition field IDs in table
>>>>>> replacement <https://github.com/apache/iceberg/pull/2906> is the
>>>>>> only remaining blocker. Ryan is working on a fix and predicts that we'll
>>>>>> be ready to cut the first release candidate by the end of this week.
>>>>>>
>>>>>> - Carl
>>>>>>
>>>>>> On Mon, Jul 19, 2021 at 2:58 PM Szehon Ho 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Carl,
>>>>>>>
>>>>>>> For the Issue: https://github.com/apache/iceberg/issues/2783
>>>>>>>
>>>>>>> The status is: I gave it a bit of a try but couldn’t find an easy fix,
>>>>>>> so I'm hoping someone more knowledgeable about this code has cycles to
>>>>>>> take a look at it.
>>>>>>>
>>>>>>> It would be great to fix it for 0.12 as it seems to block more
>>>>>>> metadata queries than before, but given the timing I’m not sure it’s
>>>>>>> feasible.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Szehon
>>>>>>>
>>>>>>> On Mon, Jul 19, 2021 at 2:19 PM Carl Steinbach 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Everyone,
>>>>>>>>
>>>>>>>> Currently, there are three issues blocking the release of 0.12.0:
>>>>>>>>
>>>>>>>>
>>>>>>>>    1. #2308 Handle the case that RewriteFiles and RowDelta commit
>>>>>>>>       the transaction at the same time
>>>>>>>>       <https://github.com/apache/iceberg/issues/2308>
>>>>>>>>    2. #2783 Metadata Table Empty Projection - Unknown type for int
>>>>>>>>       field. Type name: java.lang.string
>>>>>>>>       <https://github.com/apache/iceberg/issues/2783>
>>>>>>>>    3. #2284 Core: reassign the partition field IDs and reuse any
>>>>>>>>       existing IDs <https://github.com/apache/iceberg/pull/2284>
>>>>>>>>
>>>>>>>> #2284 is in review.
>>>>>>>>
>>>>>>>> Ryan said he would take a look at #2308.
>>>>>>>>
>>>>>>>> @Szehon Ho, can you please confirm whether
>>>>>>>> or not you're working on #2783?
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> - Carl
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jul 19, 2021 at 12:31 PM Jack Ye 
>>>>>>>> w

[VOTE] Release Apache Iceberg 0.12.0 RC2

2021-08-03 Thread Carl Steinbach
Hi everyone,

I propose that we release RC2 as the official Apache Iceberg 0.12.0
release. Please note that RC0 and RC1 were DOA.

The commit id for RC2 is 7c2fcfd893ab71bee41242b46e894e6187340070
* This corresponds to the tag: apache-iceberg-0.12.0-rc2
* https://github.com/apache/iceberg/commits/apache-iceberg-0.12.0-rc2
*
https://github.com/apache/iceberg/tree/7c2fcfd893ab71bee41242b46e894e6187340070

The release tarball, signature, and checksums are here:
* https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-0.12.0-rc2/

You can find the KEYS file here:
* https://dist.apache.org/repos/dist/dev/iceberg/KEYS

Convenience binary artifacts are staged in Nexus. The Maven repository URL
is:
* https://repository.apache.org/content/repositories/orgapacheiceberg-1017/

Please download, verify, and test.

Please vote in the next 72 hours.

[ ] +1 Release this as Apache Iceberg 0.12.0
[ ] +0
[ ] -1 Do not release this because...


[NOTES] Iceberg Community Meeting - July 21 2021

2021-08-03 Thread Carl Steinbach
Iceberg Community Meetings are open to everyone. To receive an invitation
to the next meeting, please join the iceberg-s...@googlegroups.com list.
Notes from previous meetings along with a running agenda for the next
meeting are available here:
https://docs.google.com/document/d/1YuGhUdukLP5gGiqCbk0A5_Wifqe2CZWgOd3TbhY3UQg/edit?pli=1#heading=h.z3dncl7gr8m1

21 July 2021

   - Releases
      - 0.12 Release status
         - Currently blocked on “Handle the case that RewriteFiles and
           RowDelta commit the transaction at the same time” (#2308). Ryan
           is working on a fix.
      - Consider dropping support for Spark 3.0 and 3.1 after 0.12 once
        Spark 3.2 is available
         - Spark 3.2 is set to include many changes to DSv2 which we can
           leverage to make our code simpler. Examples include eliminating
           the need to provide our own distribution and sort ordering utils
           for Spark, and the ability to deal with Spark expressions
           directly instead of via Iceberg wrapper code.
         - Should we just cut support for 3.0 and 3.1, and instead only do
           3.2 support in the next release, in order to avoid a three-way
           version split, which currently looks like it would require an
           additional Spark module that is 3.2 specific?
         - [Anton] This is not just about the tech debt added by shims. It’s
           also about not being able to use certain Spark APIs that have
           been introduced in new versions. For example, in 3.1 there is the
           purge flag, as well as APIs in structured streaming related to
           limit support. In 3.2 there is the distribution and ordering
           support. I’m in favor of keeping it simple and releasing 0.12
           with support for all Spark versions, and then migrating to
           Spark 3.2 in the next version of Iceberg.
         - [Ryan] To recap, the main issue is that we would need to bump the
           Spark version to 3.2 in order to pull in the new interfaces, and
           then when you roll back and use that same module with Spark 3.1,
           the interfaces are missing, so we can't actually load them. I
           think we may be able to solve this problem by not loading the
           interface until it is actually needed. In other words, have a
           method on the object that is from 3.2, and then basically copy
           the object and mix in the interface at that point. Sometimes you
           can get away with having an extra class in there, as long as you
           don't load the part of it that actually depends on the missing
           interface. I’ll do some testing and see if I can get this working
           between Spark 3.2 and 3.1.
         - Conclusion: keep this discussion open for a bit longer while Ryan
           does some exploration to see if his approach is viable.
   - Slack community
      - [Ryan] At the last meeting we discussed ways of making it easier for
        community members to join the Iceberg channel on the ASF’s Slack
        workspace. The discussion was tabled when it became known that
        there’s a self-invite link. Unfortunately, it turns out the link
        regularly breaks and the ASF INFRA team has declined to fix it this
        time because of an influx of spammers. Carl created a separate Slack
        workspace dedicated to Apache Iceberg. I think we should migrate to
        this space, since making it easy for everyone to join and enter the
        discussion is more important than leveraging the existing ASF
        infrastructure. Since I’m seeing lots of +1s for this on the chat, I
        think the next step is to raise this issue on the dev list. (related
        thread, Slack invite link)
      - Addendum: On the dev list thread we decided to move to the
        apache-iceberg Slack workspace.
   - Bucketing with Unicode characters (#2837)
      - Mateusz Gajewski at Starburst discovered that Iceberg’s bucket hash
        function for Strings generates values that don’t adhere to the
        Iceberg spec when the input String contains Unicode surrogate pair
        characters. The root cause of this issue is a bug in Guava’s
        Hashing.murmur3_32().hashString method.
      - It’s easy to work around this issu

Re: [VOTE] Release Apache Iceberg 0.12.0 RC2

2021-08-04 Thread Carl Steinbach
To make this easier for everyone else, here are the commands you need to
run to verify the release tarball:

curl https://dist.apache.org/repos/dist/dev/iceberg/KEYS -o KEYS
gpg --import KEYS
gpg --verify apache-iceberg-0.12.0.tar.gz.asc apache-iceberg-0.12.0.tar.gz

And here's what I see when I run the verify command:

gpg --verify apache-iceberg-0.12.0.tar.gz.asc apache-iceberg-0.12.0.tar.gz
gpg: Signature made Tue Aug  3 17:16:32 2021 PDT
gpg:using RSA key 160F51BE45616B94103ED24D5A5C7F6EB9542945
gpg: Good signature from "Carl W. Steinbach (CODE SIGNING KEY) <
c...@apache.org>" [ultimate]

- Carl
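
The commands above cover only the signature. As a complement, here is a
minimal sketch of the checksum step that several voters list. It assumes GNU
sha512sum is installed and that the published digest file stores the hex
digest as its first whitespace-separated field; the function name
verify_sha512 is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: check a downloaded file against its published SHA-512 digest.
# Assumption: the digest file holds the hex digest as its first
# whitespace-separated field (the layout sha512sum itself writes).

verify_sha512() {
  local file="$1" digest_file="$2"
  local expected actual
  # First field of the digest file, lowercased for a case-insensitive compare.
  expected="$(awk 'NR==1 {print tolower($1)}' "$digest_file")" || return 2
  # sha512sum is GNU coreutils; on macOS substitute `shasum -a 512`.
  actual="$(sha512sum "$file" | awk '{print tolower($1)}')" || return 2
  # Succeed only when a digest was read and it matches the computed one.
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

A call might look like
`verify_sha512 apache-iceberg-0.12.0.tar.gz apache-iceberg-0.12.0.tar.gz.sha512 && echo OK`.
The digest file name in that call is an assumption; use whatever name the RC
directory actually contains.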


On Wed, Aug 4, 2021 at 11:53 AM Ryan Blue  wrote:

> Ryan, did you re-import the KEYS file? Carl's code signing key is in the
> linked KEYS file.
>
> On Wed, Aug 4, 2021 at 11:12 AM Ryan Murray  wrote:
>
>> Hi all,
>>
>> Unfortunately I have to give -1
>>
>> I had trouble w/ the keys:
>>
>> gpg: assuming signed data in 'apache-iceberg-0.12.0.tar.gz'
>> gpg: Signature made Mon 02 Aug 2021 03:36:30 CEST
>> gpg:using RSA key FAFEB6EAA60C95E2BB5E26F01FF0803CB78D539F
>> gpg: Can't check signature: No public key
>>
>> And I have discovered a bug in NessieCatalog. It is unclear what is wrong
>> but the NessieCatalog doesn't play nice w/ Spark3.1. I will raise a patch
>> ASAP to fix it. Very sorry for the inconvenience.
>>
>> Best,
>> Ryan
>>
>> On Wed, Aug 4, 2021 at 3:20 AM Carl Steinbach  wrote:
>>
>>> Hi everyone,
>>>
>>> I propose that we release RC2 as the official Apache Iceberg 0.12.0
>>> release. Please note that RC0 and RC1 were DOA.
>>>
>>> The commit id for RC2 is 7c2fcfd893ab71bee41242b46e894e6187340070
>>> * This corresponds to the tag: apache-iceberg-0.12.0-rc2
>>> * https://github.com/apache/iceberg/commits/apache-iceberg-0.12.0-rc2
>>> *
>>> https://github.com/apache/iceberg/tree/7c2fcfd893ab71bee41242b46e894e6187340070
>>>
>>> The release tarball, signature, and checksums are here:
>>> *
>>> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-0.12.0-rc2/
>>>
>>> You can find the KEYS file here:
>>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>>
>>> Convenience binary artifacts are staged in Nexus. The Maven repository
>>> URL is:
>>> *
>>> https://repository.apache.org/content/repositories/orgapacheiceberg-1017/
>>>
>>> Please download, verify, and test.
>>>
>>> Please vote in the next 72 hours.
>>>
>>> [ ] +1 Release this as Apache Iceberg 0.12.0
>>> [ ] +0
>>> [ ] -1 Do not release this because...
>>>
>>
>
> --
> Ryan Blue
> Tabular
>


Re: [VOTE] Release Apache Iceberg 0.12.0 RC2

2021-08-04 Thread Carl Steinbach
Hi Ryan,

Can you please run the following command to see which keys in your public
keyring are associated with my UID?

% gpg  --list-keys c...@apache.org
pub   rsa4096/5A5C7F6EB9542945 2021-07-01 [SC]
  160F51BE45616B94103ED24D5A5C7F6EB9542945
uid [ultimate] Carl W. Steinbach (CODE SIGNING KEY) <
c...@apache.org>
sub   rsa4096/4158EB8A4F03D2AA 2021-07-01 [E]

Thanks.

- Carl
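
A mismatch like the one quoted below (a signature made with key FAFEB6EA...
while the KEYS file carries 160F51BE...) can also be caught mechanically.
This is a rough sketch, assuming that gpg --verify prints a "using ... key"
line containing the full 40-hex-digit fingerprint (recent GnuPG versions do;
older ones may print a shorter key ID) and that messages are in English. The
helper names signing_key and signed_by are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: pull the signing-key fingerprint out of `gpg --verify` output
# and compare it with the fingerprint you expect.

# Reads gpg output on stdin; prints the first 40-hex-digit fingerprint
# found on a "using ... key" line.
signing_key() {
  grep -i 'using .* key' | grep -Eo '[0-9A-Fa-f]{40}' | head -n 1
}

# Succeeds only if the fingerprint in the gpg output (stdin) equals $1,
# ignoring case and any spaces in the expected value.
signed_by() {
  local expected actual
  expected="$(printf '%s' "$1" | tr -d ' ' | tr '[:lower:]' '[:upper:]')"
  actual="$(signing_key | tr '[:lower:]' '[:upper:]')"
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

Because gpg writes its status lines to stderr, a call would need a redirect,
for example:
`LC_ALL=C gpg --verify apache-iceberg-0.12.0.tar.gz.asc apache-iceberg-0.12.0.tar.gz 2>&1 | signed_by 160F51BE45616B94103ED24D5A5C7F6EB9542945`.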

On Wed, Aug 4, 2021 at 11:12 AM Ryan Murray  wrote:

> Hi all,
>
> Unfortunately I have to give -1
>
> I had trouble w/ the keys:
>
> gpg: assuming signed data in 'apache-iceberg-0.12.0.tar.gz'
> gpg: Signature made Mon 02 Aug 2021 03:36:30 CEST
> gpg:using RSA key FAFEB6EAA60C95E2BB5E26F01FF0803CB78D539F
> gpg: Can't check signature: No public key
>
> And I have discovered a bug in NessieCatalog. It is unclear what is wrong
> but the NessieCatalog doesn't play nice w/ Spark3.1. I will raise a patch
> ASAP to fix it. Very sorry for the inconvenience.
>
> Best,
> Ryan
>
> On Wed, Aug 4, 2021 at 3:20 AM Carl Steinbach  wrote:
>
>> Hi everyone,
>>
>> I propose that we release RC2 as the official Apache Iceberg 0.12.0
>> release. Please note that RC0 and RC1 were DOA.
>>
>> The commit id for RC2 is 7c2fcfd893ab71bee41242b46e894e6187340070
>> * This corresponds to the tag: apache-iceberg-0.12.0-rc2
>> * https://github.com/apache/iceberg/commits/apache-iceberg-0.12.0-rc2
>> *
>> https://github.com/apache/iceberg/tree/7c2fcfd893ab71bee41242b46e894e6187340070
>>
>> The release tarball, signature, and checksums are here:
>> *
>> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-0.12.0-rc2/
>>
>> You can find the KEYS file here:
>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>
>> Convenience binary artifacts are staged in Nexus. The Maven repository
>> URL is:
>> *
>> https://repository.apache.org/content/repositories/orgapacheiceberg-1017/
>>
>> Please download, verify, and test.
>>
>> Please vote in the next 72 hours.
>>
>> [ ] +1 Release this as Apache Iceberg 0.12.0
>> [ ] +0
>> [ ] -1 Do not release this because...
>>
>


Re: [VOTE] Release Apache Iceberg 0.12.0 RC2

2021-08-08 Thread Carl Steinbach
Hi Wing Yew,

I will create a new RC once this patch is committed.

Thanks.

- Carl

On Sat, Aug 7, 2021 at 4:29 PM Wing Yew Poon 
wrote:

> Sorry to bring this up so late, but this just came up: there is a Spark
> 3.1 (runtime) compatibility issue (not found by existing tests), which I
> have a fix for in https://github.com/apache/iceberg/pull/2954. I think it
> would be really helpful if it can go into 0.12.0.
> - Wing Yew
>
>
> On Fri, Aug 6, 2021 at 11:36 AM Jack Ye  wrote:
>
>> +1 (non-binding)
>>
>> Verified release test and AWS integration test, issue found in test but
>> not blocking for release (https://github.com/apache/iceberg/pull/2948)
>>
>> Verified Spark 3.1 and 3.0 operations and new SQL extensions and
>> procedures on EMR.
>>
>> Thanks,
>> Jack Ye
>>
>> On Fri, Aug 6, 2021 at 1:19 AM Kyle Bendickson 
>> wrote:
>>
>>> +1 (binding)
>>>
>>> I verified:
>>>  - KEYS signature & checksum
>>>  - ./gradlew clean build (tests, etc)
>>>  - Ran Spark jobs on Kubernetes after building from the tarball at
>>> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-0.12.0-rc2/
>>>  - Spark 3.1.1 batch jobs against both Hadoop and Hive tables, using
>>> HMS for Hive catalog
>>>  - Verified default FileIO and S3FileIO
>>>  - Basic read and writes
>>>  - Jobs using Spark procedures (remove unreachable files)
>>>  - Special mention: verified that Spark catalogs can override hadoop
>>> configurations using configs prefixed with
>>> "spark.sql.catalog.(catalog-name).hadoop."
>>>  - one of my contributions to this release that has been asked about
>>> by several customers internally
>>>  - tested using
>>> `spark.sql.catalog.(catalog-name).hadoop.fs.s3a.impl` for two catalogs,
>>> both values respected as opposed to the default globally configured value
>>>
>>> Thank you Carl!
>>>
>>> - Kyle, Data OSS Dev @ Apple =)
>>>
>>> On Thu, Aug 5, 2021 at 11:49 PM Szehon Ho 
>>> wrote:
>>>
>>>> +1 (non-binding)
>>>>
>>>> * Verify Signature Keys
>>>> * Verify Checksum
>>>> * dev/check-license
>>>> * Build
>>>> * Run tests (though some timeout failures, on Hive MR test..)
>>>>
>>>> Thanks
>>>> Szehon
>>>>
>>>> On Thu, Aug 5, 2021 at 2:23 PM Daniel Weeks  wrote:
>>>>
>>>>> +1 (binding)
>>>>>
>>>>> I verified sigs/sums, license, build, and test
>>>>>
>>>>> -Dan
>>>>>
>>>>> On Wed, Aug 4, 2021 at 2:53 PM Ryan Murray  wrote:
>>>>>
>>>>>> After some wrestling w/ Spark I discovered that the problem was with
>>>>>> my test. Some SparkSession APIs changed, so all good here now.
>>>>>>
>>>>>> +1 (non-binding)
>>>>>>
>>>>>> On Wed, Aug 4, 2021 at 11:29 PM Ryan Murray  wrote:
>>>>>>
>>>>>>> Thanks for the help Carl, got it sorted out. The gpg check now
>>>>>>> works. For those who were interested I used a canned wget command in my
>>>>>>> history and it pulled the RC0 :-)
>>>>>>>
>>>>>>> Will have a PR to fix the Nessie Catalog soon.
>>>>>>>
>>>>>>> Best,
>>>>>>> Ryan
>>>>>>>
>>>>>>> On Wed, Aug 4, 2021 at 9:21 PM Carl Steinbach 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Ryan,
>>>>>>>>
>>>>>>>> Can you please run the following command to see which keys in your
>>>>>>>> public keyring are associated with my UID?
>>>>>>>>
>>>>>>>> % gpg  --list-keys c...@apache.org
>>>>>>>> pub   rsa4096/5A5C7F6EB9542945 2021-07-01 [SC]
>>>>>>>>   160F51BE45616B94103ED24D5A5C7F6EB9542945
>>>>>>>> uid [ultimate] Carl W. Steinbach (CODE SIGNING KEY)
>>>>>>>> 
>>>>>>>> sub   rsa4096/4158EB8A4F03D2AA 2021-07-01 [E]
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> - Carl
>>>>>>>>
>>>>>>>> On Wed, Aug 4, 

Re: [VOTE] Release Apache Iceberg 0.12.0 RC2

2021-08-09 Thread Carl Steinbach
I am withdrawing RC2 from voting because of issues that were found during
testing. RC3 will follow shortly.

Thanks, everyone, for your help testing RC2!

- Carl

On Mon, Aug 9, 2021 at 1:44 PM Szehon Ho 
wrote:

> Got it, I somehow thought changes were manually cherry-picked, thanks for
> clarification.
>
> Thanks
> Szehon
>
> On 9 Aug 2021, at 13:34, Ryan Blue  wrote:
>
> Szehon, I think that should make it because the RC will come from master.
>
> On Mon, Aug 9, 2021 at 12:56 PM Szehon Ho 
> wrote:
>
>> If it’s easy, would it make sense to also include Russell’s fix for the
>> metadata tables query, as it affects Spark 3.1 (a regression from Spark
>> 3.0)?  https://github.com/apache/iceberg/pull/2877/files
>>
>> The issue, https://github.com/apache/iceberg/issues/2783, was at some
>> point marked for the 0.12 release.  I had mentioned it’s OK to remove it
>> if it takes too long to fix, and now it is indeed fixed.
>>
>> Thanks,
>> Szehon
>>
>>
>>
>> On 9 Aug 2021, at 11:36, Ryan Blue  wrote:
>>
>> Thanks for pointing that one out, Jack! That would be good to get in as
>> well.
>>
>> On Mon, Aug 9, 2021 at 11:02 AM Jack Ye  wrote:
>>
>>> If we are considering recutting the branch, please also include this PR
>>> https://github.com/apache/iceberg/pull/2943 which fixes the validation
>>> when creating a schema with identifier fields, thank you!
>>>
>>> -Jack Ye
>>>
>>> On Mon, Aug 9, 2021 at 9:08 AM Wing Yew Poon <
>>> wyp...@cloudera.com.invalid> wrote:
>>>
>>>> Ryan,
>>>> Thanks for the review. Let me look into implementing your refactoring
>>>> suggestion.
>>>> - Wing Yew
>>>>
>>>>
>>>> On Mon, Aug 9, 2021 at 8:41 AM Ryan Blue  wrote:
>>>>
>>>>> Yeah, I agree. We should fix this for the 0.12.0 release. That said, I
>>>>> plan to continue testing this RC because it won't change that much since
>>>>> this affects the Spark extensions in 3.1. Other engines and Spark 3.0 or
>>>>> older should be fine.
>>>>>
>>>>> I left a comment on the PR. I think it looks good, but we should try
>>>>> to refactor to make sure we don't have more issues like this. I think when
>>>>> we update our extensions to be compatible with multiple Spark versions, we
>>>>> should introduce a factory method to create the Catalyst plan node and use
>>>>> that everywhere. That will hopefully cut down on the number of times this
>>>>> happens.
>>>>>
>>>>> Thank you, Wing Yew!
>>>>>
>>>>> On Sun, Aug 8, 2021 at 2:52 PM Carl Steinbach 
>>>>> wrote:
>>>>>
>>>>>> Hi Wing Yew,
>>>>>>
>>>>>> I will create a new RC once this patch is committed.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> - Carl
>>>>>>
>>>>>> On Sat, Aug 7, 2021 at 4:29 PM Wing Yew Poon <
>>>>>> wyp...@cloudera.com.invalid> wrote:
>>>>>>
>>>>>>> Sorry to bring this up so late, but this just came up: there is a
>>>>>>> Spark 3.1 (runtime) compatibility issue (not found by existing tests),
>>>>>>> which I have a fix for in
>>>>>>> https://github.com/apache/iceberg/pull/2954. I think it would be
>>>>>>> really helpful if it can go into 0.12.0.
>>>>>>> - Wing Yew
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Aug 6, 2021 at 11:36 AM Jack Ye  wrote:
>>>>>>>
>>>>>>>> +1 (non-binding)
>>>>>>>>
>>>>>>>> Verified release test and AWS integration test, issue found in test
>>>>>>>> but not blocking for release (
>>>>>>>> https://github.com/apache/iceberg/pull/2948)
>>>>>>>>
>>>>>>>> Verified Spark 3.1 and 3.0 operations and new SQL extensions and
>>>>>>>> procedures on EMR.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jack Ye
>>>>>>>>
>>>>>>>> On Fri, Aug 6, 2021 at 1:19 AM Kyle Bendickson <
>>>>>>>> kjbendick...@gmail.com> wrote:
>>>>>>>>
>>>&

Subject: [VOTE] Release Apache Iceberg 0.12.0 RC3

2021-08-09 Thread Carl Steinbach
Hi Everyone,

I propose the following RC to be released as the official Apache Iceberg
0.12.0 release.

The commit ID is 7ca1044655694dbbab660d02cef360ac1925f1c2
* This corresponds to the tag: apache-iceberg-0.12.0-rc3
* https://github.com/apache/iceberg/commits/apache-iceberg-0.12.0-rc3
*
https://github.com/apache/iceberg/tree/7ca1044655694dbbab660d02cef360ac1925f1c2

The release tarball, signature, and checksums are here:
* https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-0.12.0-rc3/

You can find the KEYS file here:
* https://dist.apache.org/repos/dist/dev/iceberg/KEYS

Convenience binary artifacts are staged in Nexus. The Maven repository URL
is:
* https://repository.apache.org/content/repositories/orgapacheiceberg-1018/

Please download, verify, and test.

Please vote in the next 72 hours.

[ ] +1 Release this as Apache Iceberg 0.12.0
[ ] +0
[ ] -1 Do not release this because...


Re: Subject: [VOTE] Release Apache Iceberg 0.12.0 RC3

2021-08-13 Thread Carl Steinbach
+1 (binding)

* Checked signatures of all artifacts.
* Ran build and test to completion without failures.
* Verified that RAT checks pass and that dates have the correct year.

- Carl

On Wed, Aug 11, 2021 at 12:59 AM John Zhuge  wrote:

> +1 (non-binding)
>
> - Checked signature, checksum, and license.
> - Ran build and test (failures in iceberg-mr and iceberg-hive3)
>
> On Tue, Aug 10, 2021 at 10:12 PM Szehon Ho 
> wrote:
>
>> +1 (non binding)
>>
>> * Checked Signature Keys
>> * Verified Checksum
>> * Rat checks
>> * Build and run tests, most functionality pass (also timeout errors on
>> Hive-MR)
>>
>> Thanks
>> Szehon
>>
>> On Tue, Aug 10, 2021 at 1:40 AM Ryan Murray  wrote:
>>
>>> +1 (non-binding)
>>>
>>> * Verify Signature Keys
>>> * Verify Checksum
>>> * dev/check-license
>>> * Build
>>> * Run tests (though some timeout failures, on Hive MR test..)
>>> * ran with Nessie in spark 3.1 and 3.0
>>>
>>> On Tue, Aug 10, 2021 at 4:21 AM Carl Steinbach  wrote:
>>>
>>>> Hi Everyone,
>>>>
>>>> I propose the following RC to be released as the official Apache
>>>> Iceberg 0.12.0 release.
>>>>
>>>> The commit ID is 7ca1044655694dbbab660d02cef360ac1925f1c2
>>>> * This corresponds to the tag: apache-iceberg-0.12.0-rc3
>>>> * https://github.com/apache/iceberg/commits/apache-iceberg-0.12.0-rc3
>>>> *
>>>> https://github.com/apache/iceberg/tree/7ca1044655694dbbab660d02cef360ac1925f1c2
>>>>
>>>> The release tarball, signature, and checksums are here:
>>>> *
>>>> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-0.12.0-rc3/
>>>>
>>>> You can find the KEYS file here:
>>>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>>>
>>>> Convenience binary artifacts are staged in Nexus. The Maven repository
>>>> URL is:
>>>> *
>>>> https://repository.apache.org/content/repositories/orgapacheiceberg-1018/
>>>>
>>>> Please download, verify, and test.
>>>>
>>>> Please vote in the next 72 hours.
>>>>
>>>> [ ] +1 Release this as Apache Iceberg 0.12.0
>>>> [ ] +0
>>>> [ ] -1 Do not release this because...
>>>>
>>>
>
> --
> John Zhuge
>


Re: [CWS] Re: Subject: [VOTE] Release Apache Iceberg 0.12.0 RC3

2021-08-14 Thread Carl Steinbach
Voting is now over. The motion to release RC3 as the Apache Iceberg 0.12.0
release passes with the following results:

3 binding +1s
3 non-binding +1s

Thanks.

- Carl
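
For reference, the tally above meets the usual ASF release-vote rule: at
least three binding +1 votes and more binding +1 votes than binding -1
votes. A tiny sketch of that check follows. The function name vote_passes is
hypothetical, and non-binding votes are left out on purpose because they do
not count toward the result.

```shell
#!/usr/bin/env bash
# Sketch: does a release vote pass under the ASF majority-approval rule?
# $1 = number of binding +1 votes, $2 = number of binding -1 votes
# (defaults to 0 when omitted).
vote_passes() {
  local binding_plus="$1" binding_minus="${2:-0}"
  # At least three binding +1s, and strictly more +1s than -1s.
  [ "$binding_plus" -ge 3 ] && [ "$binding_plus" -gt "$binding_minus" ]
}
```

For this vote the call would be `vote_passes 3 0`, which succeeds.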

On Sat, Aug 14, 2021 at 4:42 PM Ryan Blue  wrote:

> Everything is still looking good to me. I also tested Spark 3.1 using the
> following configuration:
>
> /home/blue/Apps/spark-3.1.1-bin-hadoop3.2/bin/spark-shell \
> --conf spark.jars.repositories=https://repository.apache.org/content/repositories/orgapacheiceberg-1018/ \
> --packages org.apache.iceberg:iceberg-spark3-runtime:0.12.0 \
> --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
> --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
> --conf spark.sql.catalog.local.type=hadoop \
> --conf spark.sql.catalog.local.warehouse=/home/blue/tmp/hadoop-warehouse \
> --conf spark.sql.catalog.local.default-namespace=default \
> --conf spark.sql.catalog.prodhive=org.apache.iceberg.spark.SparkCatalog \
> --conf spark.sql.catalog.prodhive.type=hive \
> --conf spark.sql.catalog.prodhive.warehouse=/home/blue/tmp/prod-warehouse \
> --conf spark.sql.catalog.prodhive.default-namespace=default \
> --conf spark.sql.defaultCatalog=local
>
>
>- Tested metadata tables (files, manifests, history)
>- Tested ALTER TABLE ADD PARTITION
>- Tested MERGE INTO
>- Tested updating a table to v2 via SET TBLPROPERTIES
>- Tested ALTER TABLE DROP PARTITION with v2 behavior (remove field)
>- Tested reading data in old partition spec
>- Tested DELETE FROM
>
> I also built local projects using 0.12.0 plus a couple of internal patches
> and tests are passing.
>
> Ryan
>
> On Sat, Aug 14, 2021 at 2:41 PM Daniel Weeks 
> wrote:
>
>> +1 (binding)
>>
>> Verified sigs, sums, license, build, and tests.
>>
>> -Dan
>>
>> On Fri, Aug 13, 2021 at 5:05 PM Ryan Blue  wrote:
>>
>>> +1 (binding)
>>>
>>>- Checked signatures, checksums, and RAT
>>>- Ran build and test. There were only failures in
>>>org.apache.iceberg.mr.hive.TestHiveIcebergStorageHandlerWithEngine
>>>that I think I hit last time
>>>
>>> I’ll do more checking over the weekend, but right now it looks good!
>>>
>>> On Fri, Aug 13, 2021 at 3:52 PM Carl Steinbach  wrote:
>>>
>>>> +1 (binding)
>>>>
>>>> * Checked signatures of all artifacts.
>>>> * Ran build and test to completion without failures.
>>>> * Verified that RAT checks pass and that dates have the correct year.
>>>>
>>>> - Carl
>>>>
>>>> On Wed, Aug 11, 2021 at 12:59 AM John Zhuge  wrote:
>>>>
>>>>> +1 (non-binding)
>>>>>
>>>>> - Checked signature, checksum, and license.
>>>>> - Ran build and test (failures in iceberg-mr and iceberg-hive3)
>>>>>
>>>>> On Tue, Aug 10, 2021 at 10:12 PM Szehon Ho 
>>>>> wrote:
>>>>>
>>>>>> +1 (non binding)
>>>>>>
>>>>>> * Checked Signature Keys
>>>>>> * Verified Checksum
>>>>>> * Rat checks
>>>>>> * Build and run tests, most functionality pass (also timeout errors
>>>>>> on Hive-MR)
>>>>>>
>>>>>> Thanks
>>>>>> Szehon
>>>>>>
>>>>>> On Tue, Aug 10, 2021 at 1:40 AM Ryan Murray  wrote:
>>>>>>
>>>>>>> +1 (non-binding)
>>>>>>>
>>>>>>> * Verify Signature Keys
>>>>>>> * Verify Checksum
>>>>>>> * dev/check-license
>>>>>>> * Build
>>>>>>> * Run tests (though some timeout failures, on Hive MR test..)
>>>>>>> * ran with Nessie in spark 3.1 and 3.0
>>>>>>>
>>>>>>> On Tue, Aug 10, 2021 at 4:21 AM Carl Steinbach 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Everyone,
>>>>>>>>
>>>>>>>> I propose the following RC to be released as the official Apache
>>>>>>>> Iceberg 0.12.0 release.
>>>>>>>>
>>>>>>>> The commit ID is 7ca1044655694dbbab660d02cef360ac1925f1c2
>>>>>>>> * This corresponds to the tag: apache-iceberg-0.12.0-rc3
>>>>>>>> *
>>>>>>>> https://github.com/apache/iceberg/commits/apache-iceberg-0.12.0-rc3
>>>>>>>> *
>>>>>>>> https://github.com/apache/iceberg/tree/7ca1044655694dbbab660d02cef360ac1925f1c2
>>>>>>>>
>>>>>>>> The release tarball, signature, and checksums are here:
>>>>>>>> *
>>>>>>>> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-0.12.0-rc3/
>>>>>>>>
>>>>>>>> You can find the KEYS file here:
>>>>>>>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>>>>>>>
>>>>>>>> Convenience binary artifacts are staged in Nexus. The Maven
>>>>>>>> repository URL is:
>>>>>>>> *
>>>>>>>> https://repository.apache.org/content/repositories/orgapacheiceberg-1018/
>>>>>>>>
>>>>>>>> Please download, verify, and test.
>>>>>>>>
>>>>>>>> Please vote in the next 72 hours.
>>>>>>>>
>>>>>>>> [ ] +1 Release this as Apache Iceberg 0.12.0
>>>>>>>> [ ] +0
>>>>>>>> [ ] -1 Do not release this because...
>>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>> John Zhuge
>>>>>
>>>>
>>>
>>> --
>>> Ryan Blue
>>> Tabular
>>>
>>
>
> --
> Ryan Blue
> Tabular
>


Re: [CWS] Re: Subject: [VOTE] Release Apache Iceberg 0.12.0 RC3

2021-08-14 Thread Carl Steinbach
The 0.12.0 release notes are ready for review here:
https://github.com/apache/iceberg/pull/2973

- Carl

On Sat, Aug 14, 2021 at 6:06 PM Carl Steinbach  wrote:

> Voting is now over. The motion to release RC3 as the Apache Iceberg 0.12.0
> release passes with the following results:
>
> 3 binding +1s
> 3 non-binding +1s
>
> Thanks.
>
> - Carl
>
> On Sat, Aug 14, 2021 at 4:42 PM Ryan Blue  wrote:
>
>> Everything is still looking good to me. I also tested Spark 3.1 using the
>> following configuration:
>>
>> /home/blue/Apps/spark-3.1.1-bin-hadoop3.2/bin/spark-shell \
>> --conf spark.jars.repositories=https://repository.apache.org/content/repositories/orgapacheiceberg-1018/ \
>> --packages org.apache.iceberg:iceberg-spark3-runtime:0.12.0 \
>> --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
>> --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
>> --conf spark.sql.catalog.local.type=hadoop \
>> --conf spark.sql.catalog.local.warehouse=/home/blue/tmp/hadoop-warehouse \
>> --conf spark.sql.catalog.local.default-namespace=default \
>> --conf spark.sql.catalog.prodhive=org.apache.iceberg.spark.SparkCatalog \
>> --conf spark.sql.catalog.prodhive.type=hive \
>> --conf spark.sql.catalog.prodhive.warehouse=/home/blue/tmp/prod-warehouse \
>> --conf spark.sql.catalog.prodhive.default-namespace=default \
>> --conf spark.sql.defaultCatalog=local
>>
>>
>>- Tested metadata tables (files, manifests, history)
>>- Tested ALTER TABLE ADD PARTITION
>>- Tested MERGE INTO
>>- Tested updating a table to v2 via SET TBLPROPERTIES
>>- Tested ALTER TABLE DROP PARTITION with v2 behavior (remove field)
>>- Tested reading data in old partition spec
>>- Tested DELETE FROM
>>
>> I also built local projects using 0.12.0 plus a couple of internal
>> patches and tests are passing.
>>
>> Ryan
>>
>> On Sat, Aug 14, 2021 at 2:41 PM Daniel Weeks 
>> wrote:
>>
>>> +1 (binding)
>>>
>>> Verified sigs, sums, license, build, and tests.
>>>
>>> -Dan
>>>
>>> On Fri, Aug 13, 2021 at 5:05 PM Ryan Blue  wrote:
>>>
>>>> +1 (binding)
>>>>
>>>>- Checked signatures, checksums, and RAT
>>>>- Ran build and test. There were only failures in
>>>>org.apache.iceberg.mr.hive.TestHiveIcebergStorageHandlerWithEngine
>>>>that I think I hit last time
>>>>
>>>> I’ll do more checking over the weekend, but right now it looks good!
>>>>
>>>> On Fri, Aug 13, 2021 at 3:52 PM Carl Steinbach  wrote:
>>>>
>>>>> +1 (binding)
>>>>>
>>>>> * Checked signatures of all artifacts.
>>>>> * Ran build and test to completion without failures.
>>>>> * Verified that RAT checks pass and that dates have the correct year.
>>>>>
>>>>> - Carl
>>>>>
>>>>> On Wed, Aug 11, 2021 at 12:59 AM John Zhuge  wrote:
>>>>>
>>>>>> +1 (non-binding)
>>>>>>
>>>>>> - Checked signature, checksum, and license.
>>>>>> - Ran build and test (failures in iceberg-mr and iceberg-hive3)
>>>>>>
>>>>>> On Tue, Aug 10, 2021 at 10:12 PM Szehon Ho 
>>>>>> wrote:
>>>>>>
>>>>>>> +1 (non binding)
>>>>>>>
>>>>>>> * Checked Signature Keys
>>>>>>> * Verified Checksum
>>>>>>> * Rat checks
>>>>>>> * Build and run tests, most functionality pass (also timeout errors
>>>>>>> on Hive-MR)
>>>>>>>
>>>>>>> Thanks
>>>>>>> Szehon
>>>>>>>
>>>>>>> On Tue, Aug 10, 2021 at 1:40 AM Ryan Murray 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> +1 (non-binding)
>>>>>>>>
>>>>>>>> * Verify Signature Keys
>>>>>>>> * Verify Checksum
>>>>>>>> * dev/check-license
>>>>>>>> * Build
>>>>>>>> * Run tests (though some timeout failures, on Hive MR test..)
>>>>>>>> * ran with Nessie in spark 3.1 and 3.0
>>>>>>>>

[RESULT][VOTE] Release Apache Iceberg 0.12.0 RC3

2021-08-14 Thread Carl Steinbach
Thanks to everyone who participated in the vote to approve Apache Iceberg
0.12.0 RC3.

The vote result is:

+1: 3 (binding), 3 (non-binding)
+0: 0 (binding), 0 (non-binding)
-1: 0 (binding), 0 (non-binding)

Therefore, the release candidate is accepted.


Re: Iceberg 0.12.0 Release Plan

2021-08-14 Thread Carl Steinbach
The 0.12.0 release has been approved, but before we can announce it we need
to add the release notes.

Please help us get over this hurdle by reviewing the release notes draft
here: https://github.com/apache/iceberg/pull/2973

Thanks!

- Carl

On Tue, Aug 3, 2021 at 4:46 PM Carl Steinbach  wrote:

> Thanks, Jack and Ryan, for closing out these issues. Since there are no
> more blockers, I will start preparing a release candidate.
>
> - Carl
>
> On Tue, Aug 3, 2021 at 3:29 PM Ryan Blue  wrote:
>
>> I just merged the PR. Thanks for making it possible to create v2 tables
>> easily, Jack!
>>
>> On Tue, Aug 3, 2021 at 1:02 PM Jack Ye  wrote:
>>
>>> Thanks! The PR is mostly ready and just got approved by Anton. As soon
>>> as it is merged we can start the branch cut.
>>> -Jack
>>>
>>> On Mon, Aug 2, 2021 at 11:39 PM Carl Steinbach 
>>> wrote:
>>>
>>>> Hi Jack,
>>>>
>>>> I added #2887 to the 0.12.0 release project board
>>>> <https://github.com/apache/iceberg/projects/1>. When do you think this
>>>> patch will be committed?
>>>>
>>>> - Carl
>>>>
>>>> On Mon, Aug 2, 2021 at 4:28 PM Ryan Blue  wrote:
>>>>
>>>>> Jack, I've been reviewing that one so that we can get it in. Thanks
>>>>> for fixing it!
>>>>>
>>>>> On Mon, Aug 2, 2021 at 3:12 PM Jack Ye  wrote:
>>>>>
>>>>>> Thanks for the update Carl!
>>>>>>
>>>>>> Given that we have voted for the adoption of format v2, can we also
>>>>>> get this change in, so that people can start to use v2 tables?
>>>>>> https://github.com/apache/iceberg/pull/2887
>>>>>>
>>>>>> -Jack
>>>>>>
>>>>>> On Mon, Aug 2, 2021 at 3:08 PM Carl Steinbach  wrote:
>>>>>>
>>>>>>> I want to provide everyone with a quick update on the 0.12.0 release
>>>>>>> process. At this point, #2906 Fix Partition field IDs in table
>>>>>>> replacement <https://github.com/apache/iceberg/pull/2906> is the
>>>>>>> only remaining blocker. Ryan is working on a fix and predicts that
>>>>>>> we'll be ready to cut the first release candidate by the end of this
>>>>>>> week.
>>>>>>>
>>>>>>> - Carl
>>>>>>>
>>>>>>> On Mon, Jul 19, 2021 at 2:58 PM Szehon Ho 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Carl,
>>>>>>>>
>>>>>>>> For the Issue: https://github.com/apache/iceberg/issues/2783
>>>>>>>>
>>>>>>>> The status is: I gave it a bit of a try but couldn’t find an easy fix,
>>>>>>>> so I'm hoping someone more knowledgeable about this code has cycles to
>>>>>>>> take a look at it.
>>>>>>>>
>>>>>>>> It would be great to fix it for 0.12, as it seems to block more
>>>>>>>> metadata queries than before, but given the timing I’m not sure if
>>>>>>>> it's feasible.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Szehon
>>>>>>>>
>>>>>>>> On Mon, Jul 19, 2021 at 2:19 PM Carl Steinbach 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Everyone,
>>>>>>>>>
>>>>>>>>> Currently, there are three issues blocking the release of 0.12.0:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>1. #2308 Handle the case that RewriteFiles and RowDelta commit
>>>>>>>>>the transaction at the same time
>>>>>>>>><https://github.com/apache/iceberg/issues/2308>
>>>>>>>>>2. #2783 Metadata Table Empty Projection - Unknown type for
>>>>>>>>>int field. Type name: java.lang.string
>>>>>>>>><https://github.com/apache/iceberg/issues/2783>
>>>>>>>>>3. #2284 Core: reassign the partition field IDs and reuse any
>>>>>>>>>existing IDs <https://github.com/apache/iceberg/pull/2284>
>>>>>>>>>
>>>

Re: Iceberg 0.12.0 Release Plan

2021-08-16 Thread Carl Steinbach
Update: still waiting for a +1 on the release notes
https://github.com/apache/iceberg/pull/2973

- Carl

On Sat, Aug 14, 2021 at 6:25 PM Carl Steinbach  wrote:

> The 0.12.0 release has been approved, but before we can announce it we
> need to add the release notes.
>
> Please help us get over this hurdle by reviewing the release notes draft
> here: https://github.com/apache/iceberg/pull/2973
>
> Thanks!
>
> - Carl
>
> On Tue, Aug 3, 2021 at 4:46 PM Carl Steinbach  wrote:
>
>> Thanks, Jack and Ryan, for closing out these issues. Since there are no
>> more blockers, I will start preparing a release candidate.
>>
>> - Carl
>>
>> On Tue, Aug 3, 2021 at 3:29 PM Ryan Blue  wrote:
>>
>>> I just merged the PR. Thanks for making it possible to create v2 tables
>>> easily, Jack!
>>>
>>> On Tue, Aug 3, 2021 at 1:02 PM Jack Ye  wrote:
>>>
>>>> Thanks! The PR is mostly ready and just got approved by Anton. As soon
>>>> as it is merged we can start the branch cut.
>>>> -Jack
>>>>
>>>> On Mon, Aug 2, 2021 at 11:39 PM Carl Steinbach 
>>>> wrote:
>>>>
>>>>> Hi Jack,
>>>>>
>>>>> I added #2887 to the 0.12.0 release project board
>>>>> <https://github.com/apache/iceberg/projects/1>. When do you think
>>>>> this patch will be committed?
>>>>>
>>>>> - Carl
>>>>>
>>>>> On Mon, Aug 2, 2021 at 4:28 PM Ryan Blue  wrote:
>>>>>
>>>>>> Jack, I've been reviewing that one so that we can get it in. Thanks
>>>>>> for fixing it!
>>>>>>
>>>>>> On Mon, Aug 2, 2021 at 3:12 PM Jack Ye  wrote:
>>>>>>
>>>>>>> Thanks for the update Carl!
>>>>>>>
>>>>>>> Given that we have voted for the adoption of format v2, can we also
>>>>>>> get this change in, so that people can start to use v2 tables?
>>>>>>> https://github.com/apache/iceberg/pull/2887
>>>>>>>
>>>>>>> -Jack
>>>>>>>
>>>>>>> On Mon, Aug 2, 2021 at 3:08 PM Carl Steinbach 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I want to provide everyone with a quick update on the 0.12.0
>>>>>>>> release process. At this point, #2906 Fix Partition field IDs in
>>>>>>>> table replacement <https://github.com/apache/iceberg/pull/2906> is
>>>>>>>> the only remaining blocker. Ryan is working on a fix and predicts that
>>>>>>>> we'll be ready to cut the first release candidate by the end of this 
>>>>>>>> week.
>>>>>>>>
>>>>>>>> - Carl
>>>>>>>>
>>>>>>>> On Mon, Jul 19, 2021 at 2:58 PM Szehon Ho 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Carl,
>>>>>>>>>
>>>>>>>>> For the Issue: https://github.com/apache/iceberg/issues/2783
>>>>>>>>>
>>>>>>>>> The status is: I gave it a bit of a try but couldn’t find an easy
>>>>>>>>> fix, so I'm hoping someone more knowledgeable about this code has
>>>>>>>>> cycles to take a look at it.
>>>>>>>>>
>>>>>>>>> It would be great to fix it for 0.12, as it seems to block more
>>>>>>>>> metadata queries than before, but given the timing I’m not sure if
>>>>>>>>> it's feasible.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Szehon
>>>>>>>>>
>>>>>>>>> On Mon, Jul 19, 2021 at 2:19 PM Carl Steinbach 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Everyone,
>>>>>>>>>>
>>>>>>>>>> Currently, there are three issues blocking the release of 0.12.0:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>1. #2308 Handle the case that RewriteFiles and RowDelta
>>>>>>>>>>commit the transaction at the same time
>>>>

Re: Iceberg 0.12.0 Release Plan

2021-08-16 Thread Carl Steinbach
Hi Dan,

By the time I saw your email I had already gotten approval from Jack and
committed the patch. I created a new PR to track any additional changes we
want to make before announcing the release:
https://github.com/apache/iceberg/pull/2986

Thanks.

- Carl


On Mon, Aug 16, 2021 at 8:56 PM Daniel Weeks  wrote:

> I think there might be a couple more bug fixes that we want to call out.
> I'm still reviewing, but will try to add any additions tomorrow.
>
> Thanks,
> Dan
>
>
>
> On Mon, Aug 16, 2021, 8:23 PM Carl Steinbach  wrote:
>
>> Update: still waiting for a +1 on the release notes
>> https://github.com/apache/iceberg/pull/2973
>>
>> - Carl
>>
>> On Sat, Aug 14, 2021 at 6:25 PM Carl Steinbach  wrote:
>>
>>> The 0.12.0 release has been approved, but before we can announce it we
>>> need to add the release notes.
>>>
>>> Please help us get over this hurdle by reviewing the release notes draft
>>> here: https://github.com/apache/iceberg/pull/2973
>>>
>>> Thanks!
>>>
>>> - Carl
>>>
>>> On Tue, Aug 3, 2021 at 4:46 PM Carl Steinbach  wrote:
>>>
>>>> Thanks, Jack and Ryan, for closing out these issues. Since there are no
>>>> more blockers, I will start preparing a release candidate.
>>>>
>>>> - Carl
>>>>
>>>> On Tue, Aug 3, 2021 at 3:29 PM Ryan Blue  wrote:
>>>>
>>>>> I just merged the PR. Thanks for making it possible to create v2
>>>>> tables easily, Jack!
>>>>>
>>>>> On Tue, Aug 3, 2021 at 1:02 PM Jack Ye  wrote:
>>>>>
>>>>>> Thanks! The PR is mostly ready and just got approved by Anton. As
>>>>>> soon as it is merged we can start the branch cut.
>>>>>> -Jack
>>>>>>
>>>>>> On Mon, Aug 2, 2021 at 11:39 PM Carl Steinbach 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Jack,
>>>>>>>
>>>>>>> I added #2887 to the 0.12.0 release project board
>>>>>>> <https://github.com/apache/iceberg/projects/1>. When do you think
>>>>>>> this patch will be committed?
>>>>>>>
>>>>>>> - Carl
>>>>>>>
>>>>>>> On Mon, Aug 2, 2021 at 4:28 PM Ryan Blue  wrote:
>>>>>>>
>>>>>>>> Jack, I've been reviewing that one so that we can get it in. Thanks
>>>>>>>> for fixing it!
>>>>>>>>
>>>>>>>> On Mon, Aug 2, 2021 at 3:12 PM Jack Ye  wrote:
>>>>>>>>
>>>>>>>>> Thanks for the update Carl!
>>>>>>>>>
>>>>>>>>> Given that we have voted for the adoption of format v2, can we
>>>>>>>>> also get this change in, so that people can start to use v2 tables?
>>>>>>>>> https://github.com/apache/iceberg/pull/2887
>>>>>>>>>
>>>>>>>>> -Jack
>>>>>>>>>
>>>>>>>>> On Mon, Aug 2, 2021 at 3:08 PM Carl Steinbach 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I want to provide everyone with a quick update on the 0.12.0
>>>>>>>>>> release process. At this point, #2906 Fix Partition field IDs in
>>>>>>>>>> table replacement <https://github.com/apache/iceberg/pull/2906>
>>>>>>>>>> is the only remaining blocker. Ryan is working on a fix and
>>>>>>>>>> predicts that we'll be ready to cut the first release candidate by
>>>>>>>>>> the end of this week.
>>>>>>>>>>
>>>>>>>>>> - Carl
>>>>>>>>>>
>>>>>>>>>> On Mon, Jul 19, 2021 at 2:58 PM Szehon Ho <
>>>>>>>>>> szehon.apa...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Carl,
>>>>>>>>>>>
>>>>>>>>>>> For the Issue: https://github.com/apache/iceberg/issues/2783
>>>>>>>>>>>
>>>>>>>>>>> The status is: I gave a bit of a try but couldn’t find an easy

[ANNOUNCE] Apache Iceberg release 0.12.0

2021-08-19 Thread Carl Steinbach
I'm pleased to announce the release of Apache Iceberg 0.12.0!

Apache Iceberg is an open table format for huge analytic datasets. Iceberg
delivers high query performance for tables with tens of petabytes of data,
along with atomic commits, concurrent writes, and SQL-compatible table
evolution.

Comprehensive release notes for 0.12.0 are available here:
https://iceberg.apache.org/releases/#0120-release-notes

The 0.12.0 source release tarball is available here:
https://www.apache.org/dyn/closer.cgi/iceberg/apache-iceberg-0.12.0/apache-iceberg-0.12.0.tar.gz

Java artifacts are available from Maven Central.

Thanks to everyone who contributed code or helped with the release process!

- Carl


Re: [VOTE] Release Apache Iceberg 1.6.1 RC1

2024-08-22 Thread Carl Steinbach
Working on it now.

- Carl

On Thu, Aug 22, 2024 at 5:13 AM Jean-Baptiste Onofré 
wrote:

> Yeah, it makes sense (and it was what I expected to be honest :) ).
>
> Eduard already reviewed and merged, so I think we are good for a new
> RC. I guess Carl will prepare a new one soon.
>
> Thanks !
>
> Regards
> JB
>
> On Thu, Aug 22, 2024 at 11:54 AM Driesprong, Fokko 
> wrote:
> >
> > It was not correctly backported, I do think we want to add this since it
> fixes a CVE as mentioned earlier. I've created a PR:
> https://github.com/apache/iceberg/pull/10988
> >
> > Kind regards,
> > Fokko
> >
> > On Thu, Aug 22, 2024 at 11:35 AM Jean-Baptiste Onofré wrote:
> >>
> >> Hi guys,
> >>
> >> FYI, the reason I mentioned ORC update is because the PR is "flagged"
> >> with milestone 1.6.1.
> >> So it's a bit surprising to not have it in 1.6.1.
> >>
> >> We should at least update the PR/issue removing the 1.6.1 milestone,
> >> else it would not be "accurate".
> >>
> >> Regards
> >> JB
> >>
> >> On Thu, Aug 22, 2024 at 12:04 AM Piotr Findeisen
> >>  wrote:
> >> >
> >> > Hi Eduard,
> >> >
> >> > JB wrote
> >> >
> >> >> For the record (maybe it helps users/reviewers), this release
> includes:
> >> >> - ORC 1.9.4 update
> >> >> - introduce memory limit on ParallelIterable
> >> >
> >> >
> >> > I can confirm the ParallelIterable change, but I am not sure whether the ORC
> update was part of the release.
> >> >
> >> >
> >> > Best
> >> > Piotr
> >> >
> >> >
> >> >
> >> >
> >> > On Wed, 21 Aug 2024 at 09:45, Fokko Driesprong 
> wrote:
> >> >>
> >> >> Hey Eduard,
> >> >>
> >> >> I think it relates to this PR. It contains a CVE and would be good
> to be backported. We wanted to include it in 1.6.1 if we needed another RC,
> but that didn't happen, so I think we didn't cherry-pick it to 1.6.x branch.
> >> >>
> >> >> Kind regards,
> >> >> Fokko
> >> >>
> >> >> On Wed, Aug 21, 2024 at 9:34 AM Eduard Tudenhöfner <
> etudenhoef...@apache.org> wrote:
> >> >>>
> >> >>> @Piotr can you please elaborate which ORC update you are referring
> to? Or did you mean the Avro update (which I think we were planning for
> 1.6.2)?
> >> >>>
> >> >>> On Tue, Aug 20, 2024 at 7:05 PM Piotr Findeisen <
> piotr.findei...@gmail.com> wrote:
> >> >>>>
> >> >>>> Hi
> >> >>>>
> >> >>>> -1 (non-binding)
> >> >>>>
> >> >>>> I verified source tarball matches the git tag (except it lacks
> jitpack.yml, docs/ and 'examples/Convert table to Iceberg.ipynb').
> >> >>>> However, i noted that source tarball verification is not part of
> https://iceberg.apache.org/how-to-release/#validating-a-source-release-candidate
> .
> >> >>>> I started a separate dev list thread about this (
> https://lists.apache.org/thread/24c0xhfbb2680nrqyd2jrngxtg6qoz8c).
> >> >>>>
> >> >>>> as to the changes, it looks like it contains the ParallelIterable
> change, but I don't see ORC update
> >> >>>>
> >> >>>> $ git diff apache-iceberg-1.6.0..apache-iceberg-1.6.1-rc1 --numstat
> >> >>>> 167 55 core/src/main/java/org/apache/iceberg/util/ParallelIterable.java
> >> >>>> 48  0  core/src/test/java/org/apache/iceberg/util/TestParallelIterable.java
> >> >>>>
> >> >>>> I tested with Trino https://github.com/trinodb/trino/pull/23083
> >> >>>> The ParallelIterable change caused a regression in Trino when
> planning queries with LIMIT.
> >> >>>> Now the query scheduler opens more manifests than it used to
> (test io.trino.plugin.iceberg.TestIcebergFileOperations#testSelectWithLimit
> in Trino).
> >> >>>> Reverting the change around queue low water mark [1][2] solved the
> test for me locally.
> >> >>>>
> >> >>>> Best,
> >> >>>> Piotr
> >> >>>>
> >> >>>> [1] https://github.com/apache/iceberg/pull/10978

[VOTE] Release Apache Iceberg 1.6.1 RC2

2024-08-22 Thread Carl Steinbach
Hi Everyone,

I propose that we release the following RC as the official Apache Iceberg
1.6.1 release.

The commit ID is 8e9d59d299be42b0bca9461457cd1e95dbaad086
* This corresponds to the tag: apache-iceberg-1.6.1-rc2
* https://github.com/apache/iceberg/commits/apache-iceberg-1.6.1-rc2
*
https://github.com/apache/iceberg/tree/8e9d59d299be42b0bca9461457cd1e95dbaad086

The release tarball, signature, and checksums are here:
* https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.6.1-rc2

You can find the KEYS file here:
* https://dist.apache.org/repos/dist/dev/iceberg/KEYS

Convenience binary artifacts are staged on Nexus. The Maven repository URL
is:
* https://repository.apache.org/content/repositories/orgapacheiceberg-1171/

This release includes the following changes:

8e9d59d29 Build: Bump orc from 1.9.3 to 1.9.4 (#10728) (#10988)
ed53c6d32 Drop ParallelIterable's queue low water mark (#10979)
e18a2fe10 Core: Limit ParallelIterable memory consumption by yielding in
tasks (backport #10691) (#10787)

Please download, verify, and test.

Please vote in the next 72 hours.

[ ] +1 Release this as Apache Iceberg 1.6.1
[ ] +0
[ ] -1 Do not release this because...

Only PMC members have binding votes, but other community members are
encouraged to cast
non-binding votes. This vote will pass if there are 3 binding +1 votes and
more binding
+1 votes than -1 votes.
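
The passing rule above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the function name, its
parameters, and the sample tallies are hypothetical and not part of any
Iceberg or ASF tooling.

```python
def vote_passes(binding_plus: int, binding_minus: int,
                min_binding_plus: int = 3) -> bool:
    """Apply the rule quoted in the vote email.

    Only binding votes decide the outcome: the vote passes when there are
    at least `min_binding_plus` binding +1 votes and strictly more binding
    +1 votes than binding -1 votes. Non-binding votes are advisory and are
    deliberately not inputs here.
    """
    return binding_plus >= min_binding_plus and binding_plus > binding_minus


if __name__ == "__main__":
    # Hypothetical tallies, for illustration only.
    for plus, minus in [(3, 0), (2, 0), (3, 3), (5, 4)]:
        print(f"+1={plus} -1={minus} passes={vote_passes(plus, minus)}")
```

Non-binding votes are left out of the function on purpose, since the rule
as written counts only binding votes.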


Re: [VOTE] Release Apache Iceberg 1.6.1 RC2

2024-08-27 Thread Carl Steinbach
Thanks to everyone who participated in the vote for Release Apache Iceberg
1.6.1 RC2.

The vote result is:

+1: 4 (binding), 5 (non-binding)
+0: 0 (binding), 0 (non-binding)
-1: 0 (binding), 0 (non-binding)

Therefore, the release candidate is passed.

On Tue, Aug 27, 2024 at 7:06 AM ConradJam  wrote:

> +1 (non binding)
>
>  checked:
> - NOTICE/LICENSE
> - build on jdk11 OK
> - tested on flink1.18.1 is ok to streaming write
>
> On Tue, Aug 27, 2024 at 5:24 PM Jean-Baptiste Onofré wrote:
>
>> +1 (non binding)
>>
>> I checked:
>> - NOTICE/LICENSE
>> - no binary file found in the source distribution
>> - signature and checksum are OK
>> - build OK
>> - quickly tested on iceland
>>
>> Thanks !
>>
>> Regards
>> JB
>>
>> On Thu, Aug 22, 2024 at 8:08 PM Carl Steinbach  wrote:
>> >
>> > Hi Everyone,
>> >
>> > I propose that we release the following RC as the official Apache
>> Iceberg 1.6.1 release.
>> >
>> > The commit ID is 8e9d59d299be42b0bca9461457cd1e95dbaad086
>> > * This corresponds to the tag: apache-iceberg-1.6.1-rc2
>> > * https://github.com/apache/iceberg/commits/apache-iceberg-1.6.1-rc2
>> > *
>> https://github.com/apache/iceberg/tree/8e9d59d299be42b0bca9461457cd1e95dbaad086
>> >
>> > The release tarball, signature, and checksums are here:
>> > *
>> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.6.1-rc2
>> >
>> > You can find the KEYS file here:
>> > * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>> >
>> > Convenience binary artifacts are staged on Nexus. The Maven repository
>> URL is:
>> > *
>> https://repository.apache.org/content/repositories/orgapacheiceberg-1171/
>> >
>> > This release includes the following changes:
>> >
>> > 8e9d59d29 Build: Bump orc from 1.9.3 to 1.9.4 (#10728) (#10988)
>> > ed53c6d32 Drop ParallelIterable's queue low water mark (#10979)
>> > e18a2fe10 Core: Limit ParallelIterable memory consumption by yielding
>> in tasks (backport #10691) (#10787)
>> >
>> > Please download, verify, and test.
>> >
>> > Please vote in the next 72 hours.
>> >
>> > [ ] +1 Release this as Apache Iceberg 1.6.1
>> > [ ] +0
>> > [ ] -1 Do not release this because...
>> >
>> > Only PMC members have binding votes, but other community members are
>> encouraged to cast
>> > non-binding votes. This vote will pass if there are 3 binding +1 votes
>> and more binding
>> > +1 votes than -1 votes.
>>
>
>
> --
> Best
>
> ConradJam
>


[ANNOUNCE] Apache Iceberg release 1.6.1

2024-08-28 Thread Carl Steinbach
I'm pleased to announce the release of Apache Iceberg 1.6.1!

Apache Iceberg is an open table format for huge analytic datasets. Iceberg
delivers high query performance for tables with tens of petabytes of data,
along with atomic commits, concurrent writes, and SQL-compatible table
evolution.

This release can be downloaded from:
https://dlcdn.apache.org/iceberg/apache-iceberg-1.6.1/apache-iceberg-1.6.1.tar.gz

Release notes: https://iceberg.apache.org/releases/#1.6.1-release

Java artifacts are available from Maven Central.

Thanks to everyone for contributing!
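
For anyone downloading the tarball above, here is a minimal sketch of
checking a SHA-512 digest in Python. It rests on assumptions: that a
checksum file sits next to the tarball and that its first
whitespace-separated token is the hex digest. The helper names are
hypothetical, and this does not replace verifying the GPG signature
against the KEYS file.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(tarball: Path, checksum_file: Path) -> bool:
    """Compare a file's SHA-512 digest with the one stored in checksum_file.

    Assumption: the first whitespace-separated token in checksum_file is
    the expected hex digest (the layout `sha512sum`-style tools produce).
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(tarball) == expected
```

A call would look like
matches_checksum(Path("apache-iceberg-1.6.1.tar.gz"), Path("apache-iceberg-1.6.1.tar.gz.sha512")),
where the second file name is an assumed convention.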


Re: project report

2018-12-04 Thread Carl Steinbach
+1

On Tue, Dec 4, 2018 at 5:14 PM Ryan Blue  wrote:

> Done. Thanks for working on the first draft.
>
> rb
>
> On Tue, Dec 4, 2018 at 4:47 PM Owen O'Malley 
> wrote:
>
> > Go ahead and edit the wiki
> >
> > https://wiki.apache.org/incubator/December2018
> >
> > I'd suggest that we do a source-only release and after the release
> passes,
> > push the binaries into Maven central.
> >
> > On Tue, Dec 4, 2018 at 4:43 PM Ryan Blue 
> > wrote:
> >
> >> Looks good to me!
> >>
> >> We might want to note that the codebase has been updated with Apache
> >> license headers and now complies with ASF guidelines for a source
> release.
> >>
> >> We still need to cut over to org.apache package names instead of
> >> com.netflix, which I think we should do before the first release. We
> >> should
> >> also decide whether to do a source-only first release or to go through
> the
> >> pain of publishing convenience binaries with their own LICENSE and
> NOTICE
> >> content.
> >>
> >> rb
> >>
> >> On Tue, Dec 4, 2018 at 3:37 PM Owen O'Malley 
> >> wrote:
> >>
> >> > I wrote a first pass of the report for the Apache board.
> >> >
> >> > Iceberg
> >> > >
> >> > > Iceberg is a table format for large, slow-moving tabular data.
> >> > >
> >> > > Iceberg has been incubating since 2018-11-16.
> >> > >
> >> > > Three most important issues to address in the move towards
> graduation:
> >> > >
> >> > >   1. Get the SGA accepted.
> >> > >   2. Finish the name clearance.
> >> > >   3. Make the first Apache release.
> >> > >
> >> > > Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to
> be
> >> > > aware of?
> >> > >
> >> > >   * Gitbox integration has helped a lot, although it is frustrating
> >> that
> >> > > the team members are not allowed to configure the project and
> >> must go
> >> > > through infra for every change.
> >> > >   * The traffic on the dev list from Github pull requests and issues
> >> is
> >> > > pretty heavy. It would be nice to have emails from creation go
> to
> >> > dev@
> >> > > ,
> >> > > while updates and resolutions would go the issues@.
> >> > >
> >> > > How has the community developed since the last report?
> >> > >
> >> > >   This is the first report.
> >> > >
> >> > > How has the project developed since the last report?
> >> > >
> >> > >   This is the first report.
> >> > >
> >> > > How would you assess the podling's maturity?
> >> > > Please feel free to add your own commentary.
> >> > >
> >> > >   [X] Initial setup
> >> > >   [ ] Working towards first release
> >> > >   [ ] Community building
> >> > >   [ ] Nearing graduation
> >> > >   [ ] Other:
> >> > >
> >> > > Date of last release:
> >> > >
> >> > >   None yet
> >> > >
> >> > > When were the last committers or PPMC members elected?
> >> > >
> >> > >   None yet
> >> > >
> >> > > Have your mentors been helpful and responsive or are things falling
> >> > > through the cracks? In the latter case, please list any open issues
> >> > > that need to be addressed.
> >> > >
> >> > >   We're working through the issues as they come up.
> >> > >
> >> >
> >>
> >>
> >> --
> >> Ryan Blue
> >> Software Engineer
> >> Netflix
> >>
> >
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>