Re: Question about replacing files and about Publishing Jars

2019-02-26 Thread Jacques Nadeau
We're using etag for better clarity on this at Dremio (for a different use case). I wonder if the same thing should be available in iceberg. -- Jacques Nadeau CTO and Co-Founder, Dremio On Tue, Feb 26, 2019 at 9:48 AM Ryan Blue wrote: > Hi Arvind, > > Iceberg assumes that all

Re: [DISCUSS] Community code reviews

2019-02-26 Thread Jacques Nadeau
I'm +1 (non-binding) if you allow a window for review (for example, I think others have suggested 1-2 business days before self+1). The post, self +1, merge in two minutes is not a great situation for anyone. -- Jacques Nadeau CTO and Co-Founder, Dremio On Tue, Feb 26, 2019 at 4:51 PM Ryan

Re: Updates/Deletes/Upserts in Iceberg

2019-05-07 Thread Jacques Nadeau
Awesome. This was on my list for some time. Glad you got it started. On Wed, May 8, 2019, 3:42 AM Anton Okolnychyi wrote: > Hi folks, > > Miguel (cc) and I have spent some time thinking about how to perform > updates/deletes/upserts on top of Iceberg tables. This functionality is > essential for

Re: Updates/Deletes/Upserts in Iceberg

2019-05-09 Thread Jacques Nadeau
think they do. Thanks again for putting this together! Jacques -- Jacques Nadeau CTO and Co-Founder, Dremio On Wed, May 8, 2019 at 3:42 AM Anton Okolnychyi wrote: > Hi folks, > > Miguel (cc) and I have spent some time thinking about how to perform > updates/deletes/upserts on top

Re: Updates/Deletes/Upserts in Iceberg

2019-05-21 Thread Jacques Nadeau
o be true. If we give control over the creation of synthetic key, wouldn't that resolve this issue? -- Jacques Nadeau CTO and Co-Founder, Dremio On Tue, May 21, 2019 at 7:54 AM Erik Wright wrote: > On Thu, May 16, 2019 at 4:13 PM Ryan Blue wrote: > >> Replies inline. >

Re: Updates/Deletes/Upserts in Iceberg

2019-05-21 Thread Jacques Nadeau
It would be useful to describe the types of concurrent operations that > would be supported (i.e., failed snapshotting could easily be recovered, > vs. the whole operation needing to be re-executed) vs. those that wouldn't. > Solving for unlimited concurrency cases may create way more complexity th

Re: Updates/Deletes/Upserts in Iceberg

2019-05-21 Thread Jacques Nadeau
> > It’s not at all clear why unique keys would be needed at all. If we turn your questions around, you answer yourself. If you have independent writers, you need unique keys. Also truly independent writers (like a job writing while a job compacts), > means effectively a distributed transaction,

Re: Updates/Deletes/Upserts in Iceberg

2019-05-21 Thread Jacques Nadeau
> That's my point, truly independent writers (two Spark jobs, or a Spark job > and Dremio job) means a distributed transaction. It would need yet another > external transaction coordinator on top of both Spark and Dremio, Iceberg > by itself > cannot solve this. > I'm not ready to accept this. Ice

Re: Updates/Deletes/Upserts in Iceberg

2019-05-21 Thread Jacques Nadeau
I agree with Anton that we should probably spend some time on hangouts further discussing things. Definitely differing expectations here and we seem to be talking a bit past each other. -- Jacques Nadeau CTO and Co-Founder, Dremio On Tue, May 21, 2019 at 3:44 PM Cristian Opris wrote: > I l

Re: Updates/Deletes/Upserts in Iceberg

2019-05-22 Thread Jacques Nadeau
669 900 6833 US (San Jose) 877 853 5257 US Toll-free 888 475 4499 US Toll-free Meeting ID: 415 730 2092 Find your local number: https://zoom.us/u/aH9XYBfm -- Jacques Nadeau CTO and Co-Founder, Dremio On Wed, May 22, 2019 at 8:54 AM Ryan Blue wrote: > 9AM on Friday works b

Re: Updates/Deletes/Upserts in Iceberg

2019-05-29 Thread Jacques Nadeau
Yeah, I totally forgot to record our discussion. Will do so next time, sorry. -- Jacques Nadeau CTO and Co-Founder, Dremio On Wed, May 29, 2019 at 4:24 PM Ryan Blue wrote: > It wasn't recorded, but I can summarize what we talked about. Sorry I > haven't sent this out earlie

Re: Subscribing to dev mailing list

2019-10-01 Thread Jacques Nadeau
You should send an email to dev-subscr...@iceberg.apache.org On Tue, Oct 1, 2019, 4:42 AM Thippana Vamsi Kalyan wrote: > Please add my email id. > > Thank you so much > -- > Best regards > T.Vamsi Kalyan >

Re: [DISCUSS] Iceberg community sync?

2019-10-03 Thread Jacques Nadeau
Sounds good to me. I'd vote for once a month. -- Jacques Nadeau CTO and Co-Founder, Dremio On Thu, Oct 3, 2019 at 4:56 PM Ryan Blue wrote: > Hi everyone, > > Other projects I'm involved in use a hangouts meetup every few weeks to > sync up about the status of diff

Re: [DISCUSS] Iceberg community sync?

2019-10-06 Thread Jacques Nadeau
v@iceberg.apache.org" >> *Date: *Friday, October 4, 2019 at 1:41 AM >> *To: *Iceberg Dev List >> *Subject: *Re: [DISCUSS] Iceberg community sync? >> >> >> >> +1 >> >> >> >> On 4 Oct 2019, at 07:14, Julien Le Dem >

Re: [VOTE] Release Apache Iceberg 0.7.0-incubating RC1

2019-10-14 Thread Jacques Nadeau
I don't see any reference to it. Not sure if it needs to go in both NOTICE and LICENSE or only one. It has been a long time since I did an incubator check so I may be wrong on both of these and would love someone who has done it more recently to chime in... -- Jacques Nadeau CTO and Co-

random comment

2020-01-02 Thread Jacques Nadeau
I have a random comment on this project versus others I'm involved in. This is not meant to be critical, it's just an observation. It feels like very little discussion happens on the dev list other than the random technical support email. Basically, all interaction is on Github (?) but there are n

Re: Welcome new committer and PPMC member Ratandeep Ratti

2020-02-16 Thread Jacques Nadeau
Congrats! On Sun, Feb 16, 2020, 7:06 PM xiaokun ding wrote: > CONGRATULATIONS > > On Mon, Feb 17, 2020 at 11:05 AM 李响 wrote: > >> CONGRATULATIONS!!! >> >> On Mon, Feb 17, 2020 at 9:50 AM Junjie Chen >> wrote: >> >>> Congratulations! >>> >>> On Mon, Feb 17, 2020 at 5:48 AM Ryan Blue wrote: >>> Hi everyo

Re: [DISCUSS] Graduating from the Apache Incubator

2020-05-11 Thread Jacques Nadeau
Agree with Owen. Great to see Iceberg's growth. -- Jacques Nadeau CTO and Co-Founder, Dremio On Mon, May 11, 2020 at 12:16 PM Owen O'Malley wrote: > +1 to graduation. It is exciting watching the project and its community > grow. > > .. Owen > > On Mon, May 11, 2020

Re: [VOTE] Graduate to a top-level project

2020-05-12 Thread Jacques Nadeau
I'm +1. (I think that is non-binding here but binding at the incubator level) -- Jacques Nadeau CTO and Co-Founder, Dremio On Tue, May 12, 2020 at 2:35 PM Romin Parekh wrote: > +1 > > On Tue, May 12, 2020 at 2:32 PM Owen O'Malley > wrote: > >> +1 >>

Re: [DISCUSS] August board report

2020-08-12 Thread Jacques Nadeau
The conference was free so all the recordings are available on-demand for free: https://subsurfaceconf.com/summer2020/recordings -- Jacques Nadeau CTO and Co-Founder, Dremio On Wed, Aug 12, 2020 at 7:07 PM OpenInx wrote: > > Community members gave 2 Iceberg talks at Subsurface Co

Re: [DISCUSS] August board report

2020-08-24 Thread Jacques Nadeau
.youtube.com/watch?v=5RJrqS8_u68&list=PL-gIUf9e9CCtewYqIGUKvz0fVcoyOYU1H&index=10 Dan https://www.youtube.com/watch?v=9uiaCN3tJyI&list=PL-gIUf9e9CCtewYqIGUKvz0fVcoyOYU1H&index=3 -- Jacques Nadeau CTO and Co-Founder, Dremio On Thu, Aug 13, 2020 at 7:49 PM OpenInx wrote: > Tha

New project integrated with Iceberg

2020-10-01 Thread Jacques Nadeau
ng it Project Nessie ( projectnessie.org) and we'd love your feedback. Our goal is to contribute our Iceberg integration into the project. You can check that work out here: https://github.com/projectnessie/nessie/tree/main/clients/iceberg Thanks, Jacques -- Jacques Nadeau CTO and Co-Founder, Dremio

Re: [VOTE] Release Apache Iceberg 0.8.0-incubating RC2

2020-11-01 Thread Jacques Nadeau
+1 (non-binding) Ran through steps 1-7, completed successfully. I also updated Nessie to pull from the staging maven repository and ran the Nessie test suite and it completed successfully with the staged 0.10.0 artifacts. -- Jacques Nadeau CTO and Co-Founder, Dremio On Sat, May 2, 2020 at 8

Re: Integrating Existing Iceberg Tables with a Metastore

2020-11-20 Thread Jacques Nadeau
overall operational load of Nessie is targeted to be a fraction of what HMS is. Full disclosure, I work on Nessie. Food for thought, anyway. -- Jacques Nadeau CTO and Co-Founder, Dremio On Fri, Nov 20, 2020 at 10:58 AM Marko Babic wrote: > Hi Peter. Thanks for responding. > >

Re: Iceberg/Hive properties handling

2020-11-25 Thread Jacques Nadeau
d follow #1 and only store properties that are about the ptr, not the content/metadata. Lastly, I believe #4 is the case but haven't tested it. Can someone confirm that it is true? And that it is possible/not problematic? -- Jacques Nadeau CTO and Co-Founder, Dremio On Wed, Nov 25, 2020 at 4:28 PM

Re: Iceberg/Hive properties handling

2020-11-25 Thread Jacques Nadeau
Minor error, my last example should have been: db1.table1_etl_branch => nessie.folder1.folder2.folder3.table1@etl_branch -- Jacques Nadeau CTO and Co-Founder, Dremio On Wed, Nov 25, 2020 at 4:56 PM Jacques Nadeau wrote: > I agree with Ryan on the core principles here. As I understan

Re: Iceberg/Hive properties handling

2020-12-01 Thread Jacques Nadeau
Would someone be willing to create a document that states the current proposal? It is becoming somewhat difficult to follow this thread. I also worry that without a complete statement of the current shape that people may be incorrectly thinking they are in alignment. -- Jacques Nadeau CTO and

Re: Iceberg At Adobe

2020-12-03 Thread Jacques Nadeau
>> https://medium.com/adobetech/iceberg-at-adobe-88cf1950e866 >> >> >> >> There will be a series of blogs to show more details. >> >> >> >> Miao >> > > > -- > John Zhuge > -- -- Jacques Nadeau CTO and Co-Founder, Dremio

Re: Iceberg/Hive properties handling

2020-12-07 Thread Jacques Nadeau
berg properties 4. Prefix for HMS only properties I generally think #2 is a no-go as it creates too much coupling between catalog implementations and core iceberg. It seems like Ryan Blue would prefer #4 (correct?). Any other strong opinions? -- Jacques Nadeau CTO and Co-Founder, Dremio On Thu, De

Re: Adobe Blog ..

2021-01-15 Thread Jacques Nadeau
+1. This is a great series. I think it would be great to add a section to the website linking to helpful articles, slide decks, etc. about Iceberg. In-the-trenches information is often the most useful. On Fri, Jan 15, 2021 at 3:43 PM Ryan Blue wrote: > Thanks, Gautam! I was just reading the one

Re: Welcoming Peter Vary as a new committer!

2021-01-25 Thread Jacques Nadeau
Congrats Peter! Thanks for all your great work On Mon, Jan 25, 2021 at 10:24 AM Ryan Blue wrote: > Hi everyone, > > I'd like to welcome Peter Vary as a new Iceberg committer. > > Thanks for all your contributions, Peter! > > rb > > -- > Ryan Blue >

Re: Proposal: Support for views in Iceberg

2021-07-22 Thread Jacques Nadeau
Some thoughts... - In general, many engines want (or may require) a resolved sql field. This--at minimum--typically includes star expansion since traditional view behavior is stars are expanded at view creation time (since this is the only way to guarantee that the view returns the sam

Re: [DISCUSS] UUID type

2021-07-27 Thread Jacques Nadeau
What specific arguments are there for it being a first class type besides it is elsewhere? Is there some kind of optimization iceberg or an engine could do if it was typed versus just a bucket of bits? Fixed width binary seems to cover the cases I see in terms of actual functionality in the iceberg

Re: [DISCUSS] Moving to apache-iceberg Slack workspace

2021-07-28 Thread Jacques Nadeau
My one recommendation would be that if you go off Apache infra that you make sure all the PMC members are admins of the new account. On Wed, Jul 28, 2021 at 8:35 AM Ryan Blue wrote: > No problem, the site hasn't been deployed yet. > > I've also looked a bit into the invite issue for the apache-i

Re: [DISCUSS] UUID type

2021-07-29 Thread Jacques Nadeau
> Yes I agree with Jacques that fixed binary is what it is in the end. I >>> think it is more about user experience, whether the conversion is done at >>> the user side or Iceberg and engine side. Many people just store UUID as a >>> 36 byte string instead of a 16 byte bin

Re: [DISCUSS] UUID type

2021-07-29 Thread Jacques Nadeau
n because the use as an ID makes them > likely to be join keys. > > If we want the values to be stored as 16-byte fixed, then we need to make > it easy to get the expected string representation in and out, just like we > do with date/time types. I don't think that's specific

Re: Proposal: Support for views in Iceberg

2021-08-26 Thread Jacques Nadeau
ed. >>>>>>>>>We can all agree that all of these options are non-trivial to >>>>>>>>>design/implement (perhaps a multi-year effort based on the option >>>>>>>>> chosen) >>>

Re: Proposal: Support for views in Iceberg

2021-08-26 Thread Jacques Nadeau
On Thu, Aug 26, 2021 at 2:44 PM Ryan Blue wrote: > Would a physical plan be portable for the purpose of an engine-agnostic > view? > My goal is it would be. There may be optional "hints" that a particular engine could leverage and others wouldn't but I think the goal should be that the IR is ent

A new project focused on serialized algebra

2021-09-08 Thread Jacques Nadeau
Hey all, For some time I've been thinking that having a common serialized representation of query plans would be helpful across multiple related projects. I started working on something independently in this vein several months ago. Since then, Arrow has started exploring "Arrow IR" and in Iceberg

Re: A new project focused on serialized algebra

2021-09-10 Thread Jacques Nadeau
t of goals for an >> existing project. >> >> Where is a good place to discuss this? Should we create a #substrait room >> on Iceberg Slack? ASF Slack? On this thread? >> >> Ryan >> >> On Wed, Sep 8, 2021 at 8:21 AM Jacques Nadeau >> wrote:

Re: [DISCUSS] UUID type

2021-09-17 Thread Jacques Nadeau
art talking about >>>>>>> views. >>>>>>> >>>>>>> Some of this argues for physical vs logical type abstraction. >>>>>>> (Something that was always challenging in Parquet but also helped to >>>>>>

Re: support of RCFile

2021-09-29 Thread Jacques Nadeau
I actually wonder if file formats should be an extension api so someone can implement a file format without any changes in Iceberg core (I don't think this is possible today). Let's say one wanted to create a proprietary format but use Iceberg semantics (not me). Could we make it such that o

Re: Iceberg python library sync

2021-10-05 Thread Jacques Nadeau
This might be a dumb question but...why is the iceberg python mailing list a Google group as opposed to an Apache mailing list? On Sat, Oct 2, 2021 at 9:57 PM Jun H. wrote: > Hi everyone, > > I just sent the invite for the next python library meeting on Tuesday > (10/12) at 9 AM (UTC-7, PDT). In

Re: Iceberg python library sync

2021-10-05 Thread Jacques Nadeau
). > > Notes, correspondence, scheduling, etc. should be done via the dev list. > > That's my understanding at least, > -Dan > > > > On Tue, Oct 5, 2021 at 9:31 AM Jacques Nadeau > wrote: > >> This might be a dumb question but...why is the iceberg python mailin

Re: [DISCUSS] Iceberg roadmap

2021-11-07 Thread Jacques Nadeau
A few additional observations about StarRocks... - As far as I can tell, StarRocks has an ASF incompatible license (Elastic License 2.0). - It appears to be a hard fork of Apache Doris, a project still in the incubator (and looks like it probably is destructive to the Doris project) - The project

Re: [Discuss] Iceberg View Interoperability

2024-11-29 Thread Jacques Nadeau
Hey Ajantha, thanks for looping me in. This is a great conversation. FYI, I'm a co-creator of Substrait so read this all with that in mind. Substrait has a couple of key underpinnings that are worth noting: 1. It's a specification first and foremost (with tools to help work with the specification

Re: [DISCUSS] Apache Iceberg Summit 2025 - Selection Committee

2024-11-29 Thread Jacques Nadeau
Happy to help on the selection committee. On Mon, Nov 25, 2024, 11:43 PM Jean-Baptiste Onofré wrote: > Hi everyone, > > As you probably know, we've been having discussions about the Iceberg > Summit 2025. > > The PMC pre-approved the Iceberg Summit proposal, and one of the first > steps is to put to