REST catalog proposal

2021-12-13 Thread Ryan Murray
Hi all, For those of you who haven't been following, there has been some interesting discussion around the proposal for a REST-based catalog[1]. One of the primary questions I had while reading it was 'what is the overall goal of the API?'. Given the size of this question I thought it might be b

Re: Supporting gs:// prefix in S3URI for Google Cloud S3 Storage

2021-12-03 Thread Ryan Murray
Echoing Laurent and Igor I wonder what the consequence of adding 'gs://' scheme to S3FileIO is if that scheme is already used by the hadoop gcs connector? Do we want to overload that scheme? I would almost think it should be an s3:// scheme or so right? Best, Ryan On Fri, Dec 3, 2021 at 9:26 AM M

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-29 Thread Ryan Murray
Blue wrote: > Yeah, that sounds about right. I'm not sure how we would want to do it > exactly, but I think the catalog would be able to override the procedures > it wants to. > > On Fri, Nov 26, 2021 at 9:44 AM Ryan Murray wrote: > >> Hey Ryan, >> >> Th

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-26 Thread Ryan Murray
at least to me. > > For modifying the existing procedures, we could look at plugging in > through the Iceberg catalog rather than directly in the Spark catalog as > well. > > On Fri, Nov 19, 2021 at 12:13 AM Ryan Murray wrote: > >> Hey Ryan, >> >> Thanks f

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-19 Thread Ryan Murray
l project by > just having dependency on Iceberg jars > > >1. Implement ProcedureCatalog >2. Configure a Spark catalog that uses your ProcedureCatalog >implementation (spark.sql.catalog.pcat = com.company.ProcedureCatalog) >3. Call procedures using that catalo

Re: Welcome new PMC members!

2021-11-18 Thread Ryan Murray
Congratulations both! Well deserved! On Thu, 18 Nov 2021, 09:19 Omar Al-Safi, wrote: > Congrats both of you! > > On Thu, Nov 18, 2021 at 8:31 AM Eduard Tudenhoefner > wrote: > >> Congrats Jack and Russell! Very well deserved. >> >> On Thu, Nov 18, 2021, 01:12 Ryan Blue wrote: >> >>> Hi everyon

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-12 Thread Ryan Murray
alog` directly. > > That translates to less code for Iceberg to manage and no long-term debt > supporting procedures plugged in through Iceberg instead of through a Spark > interface. > > On Thu, Nov 11, 2021 at 8:08 AM Ryan Murray wrote: > >> Hey Ryan, >> >

Re: Regarding the support of pluggable procedures in Iceberg

2021-11-11 Thread Ryan Murray
Hey Ryan, What is the timeline for ProcedureCatalog to be moved into Spark and will it be backported? I agree 100% that it's the 'correct' way to go long term but currently Iceberg has a `static final Map`[1] of valid procedures and no way for users to customize that. I personally don't love a stat

Re: Iceberg python library sync

2021-08-12 Thread Ryan Murray
I would love to join and +1 for the timing. On Thu, Aug 12, 2021 at 5:54 PM Ryan Blue wrote: > I know there are a couple of people in CEST (UTC+2) that would like to > join this sync. Maybe next Wednesday at 9 AM PDT (UTC-7) is a good time? > > On Thu, Aug 12, 2021 at 6:53 AM Jason Reid wrote:

Re: Subject: [VOTE] Release Apache Iceberg 0.12.0 RC3

2021-08-10 Thread Ryan Murray
+1 (non-binding) * Verify Signature Keys * Verify Checksum * dev/check-license * Build * Run tests (though some timeout failures, on Hive MR test..) * ran with Nessie in spark 3.1 and 3.0 On Tue, Aug 10, 2021 at 4:21 AM Carl Steinbach wrote: > Hi Everyone, > > I propose the following RC to be r

Re: [VOTE] Release Apache Iceberg 0.12.0 RC2

2021-08-04 Thread Ryan Murray
After some wrestling w/ Spark I discovered that the problem was with my test. Some SparkSession APIs changed, so all good here now. +1 (non-binding) On Wed, Aug 4, 2021 at 11:29 PM Ryan Murray wrote: > Thanks for the help Carl, got it sorted out. The gpg check now works. For > those wh

Re: [VOTE] Release Apache Iceberg 0.12.0 RC2

2021-08-04 Thread Ryan Murray
[ultimate] Carl W. Steinbach (CODE SIGNING KEY) < > c...@apache.org> > sub rsa4096/4158EB8A4F03D2AA 2021-07-01 [E] > > Thanks. > > - Carl > > On Wed, Aug 4, 2021 at 11:12 AM Ryan Murray wrote: > >> Hi all, >> >> Unfortunately I have to give

Re: [VOTE] Release Apache Iceberg 0.12.0 RC2

2021-08-04 Thread Ryan Murray
Hi all, Unfortunately I have to give -1. I had trouble w/ the keys: gpg: assuming signed data in 'apache-iceberg-0.12.0.tar.gz' gpg: Signature made Mon 02 Aug 2021 03:36:30 CEST gpg: using RSA key FAFEB6EAA60C95E2BB5E26F01FF0803CB78D539F gpg: Can't check signature: No public key A

Re: [VOTE] Adopt the v2 spec changes

2021-07-28 Thread Ryan Murray
+1 (non-binding) On Wed, Jul 28, 2021 at 5:14 PM Russell Spitzer wrote: > +1 (non-binding) > > On Jul 28, 2021, at 10:11 AM, Ryan Blue wrote: > > +1 > > On Tue, Jul 27, 2021 at 9:58 AM Ryan Blue wrote: > >> I’d like to propose that we adopt the pending v2 spec changes as the >> supported v2 sp

Re: [DISCUSS] Distinct count map

2021-07-23 Thread Ryan Murray
Hey Piotr, There are a few proposals around secondary indexes floating around[1][2]. The current thinking is that this would be the best place for sketches to live. Best, Ryan [1] https://docs.google.com/document/d/11o3T7XQVITY_5F9Vbri9lF9oJjDZKjHIso7K8tEaFfY/edit#heading=h.uqr5wcfm85p7 [2] http

Re: Proposal: Support for views in Iceberg

2021-07-22 Thread Ryan Murray
n of some new system for storing >> cross-engine compatible views on that. >> >> Is there something else we can use? >> Maybe the view definition should use some intermediate structured >> language that's not SQL? >> For example, it could represent logical str

Re: Proposal: Support for views in Iceberg

2021-07-20 Thread Ryan Murray
Thanks Anjali! I have left some comments on the document. I unfortunately have to miss the community meetup tomorrow but would love to chat more/help w/ implementation. Best, Ryan On Tue, Jul 20, 2021 at 7:42 AM Anjali Norwood wrote: > Hello, > > John Zhuge and I would like to propose the foll

Re: View definitions

2021-07-13 Thread Ryan Murray
r discussion soon. > > Regards > Anjali > > On Tue, Jul 13, 2021 at 2:29 AM Ryan Murray wrote: > >> Hi All, >> >> I was curious if anyone was working on bringing a proposal for Views to >> the community? I know extensive work at Netflix has already

View definitions

2021-07-13 Thread Ryan Murray
Hi All, I was curious if anyone was working on bringing a proposal for Views to the community? I know extensive work at Netflix has already been done[1] and it looks like a proposal could be relatively easy to extract from there. If no one else is currently working on it I can volunteer to take th

Re: Welcoming Jack Ye as a new committer!

2021-07-06 Thread Ryan Murray
Congrats Jack! Well deserved! On Tue, Jul 6, 2021 at 7:24 AM Eduard Tudenhoefner wrote: > Congratulations Jack > > On Tue, Jul 6, 2021, 05:59 Forward Xu wrote: > >> Congratulations Jack! >> >> >> forward >> >> Best >> >> On Tue, Jul 6, 2021 at 11:07 AM Szehon Ho wrote: >> >>> Congratulations Jack! >>> >>> On

Re: migrating Hadoop tables to tables with hive catalog

2021-07-01 Thread Ryan Murray
I had a short proposal here[1] suggesting the same as Russell. I think this is probably a more broadly useful operation but I don't really know the best place for it to live. I'm happy to finish the proposal if there are some opinions on where in Iceberg it is appropriate to add such functionality.

Re: Welcoming OpenInx as a new PMC member!

2021-06-29 Thread Ryan Murray
Congrats!! On Tue, Jun 29, 2021 at 10:53 PM Russell Spitzer wrote: > Congratulations! > > > On Jun 29, 2021, at 3:52 PM, Ryan Blue wrote: > > > > Hi everyone, > > > > I'd like to welcome OpenInx (Zheng Hu) as a new Iceberg PMC member. > > > > Thanks for all your contributions and commitment to

Re: Iceberg build and Nessie

2021-06-14 Thread Ryan Murray
time in classpath, so we should be more cautious >> about it. For option 2, yes the major concern will be the large Docker >> dependency, but since it is only test dependency it should be fine, so that >> is also a good way to go. >> >> -Jack Ye >> >> On Mon, Ju

Iceberg build and Nessie

2021-06-14 Thread Ryan Murray
Hi All, Currently the iceberg CI/development build relies on a custom gradle plugin to start a Nessie server for the nessie module tests. The underlying tech that powers a nessie server is Quarkus. Unfortunately Quarkus is dropping support for building and running from jdk8. This means that the Ja

Re: Iceberg Python library support

2021-04-14 Thread Ryan Murray
ed with read. > > Chen > > On Wed, Apr 14, 2021 at 10:28 AM Ryan Murray wrote: > >> Hey Chen Song, >> >> Answers inline below >> >> On Wed, Apr 14, 2021 at 4:04 PM Chen Song wrote: >> >>> Is https://iceberg.apache.org/python-feature-s

Re: Iceberg Python library support

2021-04-14 Thread Ryan Murray
Hey Chen Song, Answers inline below On Wed, Apr 14, 2021 at 4:04 PM Chen Song wrote: > Is https://iceberg.apache.org/python-feature-support/ still up to date? > Are the following statements true for Iceberg python library support? > >- The python library has only limited support for read (f

Re: CI problems

2021-04-12 Thread Ryan Murray
I am also seeing a forbidden error related to palantir/baseline which appears to still refer to palantir.bintray.com. Currently investigating that issue... On Mon, Apr 12, 2021 at 5:08 PM Peter Vary wrote: > Hi Team, > > Today morning I have pushed 2 commits which had green runs both (#2129, > #

Re: Welcoming Ryan Murray as a new committer!

2021-03-30 Thread Ryan Murray
wrote: >>>>>>>>> >>>>>>>>>> Congratulations Ryan >>>>>>>>>> >>>>>>>>>> Thirumalesh Reddy >>>>>>>>>> Dremio | VP of Engineering >>>>>>>>>

Re: [VOTE] Release Apache Iceberg 0.11.1 RC0

2021-03-30 Thread Ryan Murray
+1 (non-binding) verified build, tests, signature, checksum. Best, Ryan On Tue, Mar 30, 2021 at 4:40 AM Jack Ye wrote: > +1 (non-binding) > > Verified build, unit test, AWS integration test, signature, checksum. > Verified fix of #2146, #2267, #2333 in AWS EMR Spark3 environment. > > Best, > J

Re: Welcoming Russell Spitzer as a new committer

2021-03-29 Thread Ryan Murray
Congrats Russell! On Mon, Mar 29, 2021 at 6:11 PM Holden Karau wrote: > Congratulations Russell! > > On Mon, Mar 29, 2021 at 9:10 AM Anton Okolnychyi > wrote: > >> Hey folks, >> >> I’d like to welcome Russell Spitzer as a new committer to the project! >> >> Thanks for all your contributions, Rus

Re: Welcoming Yan Yan as a new committer!

2021-03-24 Thread Ryan Murray
Congratulations!! On Wed, 24 Mar 2021, 11:39 Szehon Ho, wrote: > Nice, congratulations! > > On 24 Mar 2021, at 11:37, Marton Bod wrote: > > Congratulations, well done! > > On Wed, 24 Mar 2021 at 11:32, Peter Vary > wrote: > >> Congratulations Yan! >> >> On Mar 24, 2021, at 05:43, Yufei Gu wro

Re: Nessie PRs

2021-03-16 Thread Ryan Murray
> Could not resolve >>>> org.projectnessie:org.projectnessie.gradle.plugin:0.4.0. >>>> 37 >>>> <https://github.com/edgarRd/iceberg/runs/2114545801?check_suite_focus=true#step:5:37> >>>> > Could not get resource ' >>>> http

Re: Nessie PRs

2021-03-15 Thread Ryan Murray
Thanks a lot Dan! On Mon, Mar 15, 2021 at 4:58 PM Daniel Weeks wrote: > Hey Ryan, I just took care of the first one and I might have some time to > look over the others today or tomorrow (unless someone else gets to them > first). > > -Dan > > On Mon, Mar 15, 2021 at 7:03

Nessie PRs

2021-03-15 Thread Ryan Murray
Hi All, I have a few Nessie PRs that I am hoping to try and get merged. Is there a committer who has a bit of free time? Happy to help w/ some of the details if the PRs aren't clear. I am @rymurr on the apache slack if you want a quick response. The nessie version bump is the most pressing and th

Re: Migrating legacy snapshot daily Hive table concept to Iceberg

2021-03-09 Thread Ryan Murray
Hey Edgar, Cheng Pan, I am not sure if you are aware of Project Nessie? It _may_ suit your needs. Nessie applies git-like functionality to Iceberg tables (in this case most useful are branches and tags). In effect you would be pivoting the snapshot partition into the t

Re: [ANNOUNCE] Apache Iceberg release 0.11

2021-01-27 Thread Ryan Murray
Congrats on the release all and thanks Jack! On Wed, Jan 27, 2021 at 4:45 AM Forward Xu wrote: > Thank you very much for driving this work. > > Best, > Forward > > On Wed, Jan 27, 2021 at 7:24 AM Jack Ye wrote: > >> Hi everyone, >> >> I'm pleased to announce the release of Apache Iceberg 0.11! >> >> Apache I

Re: Welcoming Peter Vary as a new committer!

2021-01-25 Thread Ryan Murray
Congrats!! On Mon, Jan 25, 2021 at 7:35 PM Russell Spitzer wrote: > Congratulations! > > On Jan 25, 2021, at 12:34 PM, Jacques Nadeau > wrote: > > Congrats Peter! Thanks for all your great work > > On Mon, Jan 25, 2021 at 10:24 AM Ryan Blue wrote: > >> Hi everyone, >> >> I'd like to welcome P

Re: [VOTE] Release Apache Iceberg 0.11.0 RC0

2021-01-25 Thread Ryan Murray
I have moved back to +1 (non-binding) As you said Ryan, the error message is bad and hides the real error. While I was testing misconfigured catalogs I kept getting the error 'Cannot initialize Catalog, missing no-arg constructor: org.apache.iceberg.hive.HiveCatalog' when the real error is (in th

Re: [VOTE] Release Apache Iceberg 0.11.0 RC0

2021-01-24 Thread Ryan Murray
that needs to be fixed for the release. > > On Sun, Jan 24, 2021 at 12:07 PM Ryan Murray wrote: > >> Based on #2144 <https://github.com/apache/iceberg/issues/2144> I would >> change my vote to -1. I will raise a fix asap. >> >> >> On Sun, Jan 24, 202

Re: [VOTE] Release Apache Iceberg 0.11.0 RC0

2021-01-24 Thread Ryan Murray
Based on #2144 <https://github.com/apache/iceberg/issues/2144> I would change my vote to -1. I will raise a fix asap. On Sun, Jan 24, 2021 at 3:12 PM Ryan Murray wrote: > > +1 (non-binding) > > I verified the build and ran the tests. Also verified both flink and spark >

Re: [VOTE] Release Apache Iceberg 0.11.0 RC0

2021-01-24 Thread Ryan Murray
+1 (non-binding) I verified the build and ran the tests. Also verified both flink and spark custom catalogs are working. One side note: I had to run the tests a few times to get the build to pass. Flaky Hive tests. Best, Ryan Murray On Sat, Jan 23, 2021 at 12:26 AM Jack Ye wrote: >

Re: test flakiness with SocketException of broken pipe in HiveMetaStoreClient

2021-01-11 Thread Ryan Murray
like a rather disruptive change. However it fixes a lot of issues like this and it (marginally) improves GC performance. I can respond back here when I post a candidate PR and we can discuss its value. Best Ryan Murray On Fri, Jan 8, 2021 at 7:50 PM Ryan Blue wrote: > It could be that th

Breaking change in IcebergSource

2020-12-14 Thread Ryan Murray
I would appreciate any feedback and to see if we can find a way forward so that this change can be included in the next iceberg release. Best Ryan Murray [1] https://github.com/apache/iceberg/pull/1783 [2] https://github.com/apache/iceberg/pull/1783#issuecomment-742889117 [3] https://

Re: [VOTE] Release Apache Iceberg 0.10.0 RC5

2020-11-09 Thread Ryan Murray
+1 (non-binding) all normal tests pass and tests against nessie also look good. Best, Ryan On Mon, Nov 9, 2020 at 12:03 PM Mass Dosage wrote: > +1 (non-binding) > > I tested the Hive read path in distributed mode for HadoopTables-backed > Iceberg tables and it worked fine. > > On Sun, 8 Nov

Re: [VOTE] Release Apache Iceberg 0.10.0 RC4

2020-11-04 Thread Ryan Murray
+1 (non-binding) 1. Download the source tarball, signature (.asc), and checksum (.sha512): OK 2. Import gpg keys: download KEYS and run gpg --import /path/to/downloaded/KEYS (optional if this hasn’t changed) : OK 3. Verify the signature by running: gpg --verify apache-iceberg-xx.tar.gz.asc: I g
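The checksum step in the release checklist above can be sketched in Python. This is a minimal illustration, not the project's official verification script: the helper names `sha512_of` and `checksum_matches` are hypothetical, and it assumes the `.sha512` file holds the hex digest as its first whitespace-separated token (the layout written by `shasum -a 512`). Signature verification with gpg is a separate step and is not covered here.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(tarball, checksum_file):
    """Compare a tarball's SHA-512 digest with the one recorded in checksum_file.

    Assumption: the first whitespace-separated token in checksum_file is the
    expected hex digest, e.g. "<digest>  apache-iceberg-xx.tar.gz".
    """
    expected = Path(checksum_file).read_text().split()[0].lower()
    return sha512_of(tarball) == expected
```

Calling `checksum_matches("apache-iceberg-xx.tar.gz", "apache-iceberg-xx.tar.gz.sha512")` on the downloaded files would return `True` only when the digests agree.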

Re: [VOTE] Release Apache Iceberg 0.10.0 RC2

2020-11-02 Thread Ryan Murray
+1 (non-binding) Ran through steps 1-7, completed successfully. Also tested locally against nessie and all looked good. Best, Ryan On Mon, Nov 2, 2020 at 2:28 PM Mass Dosage wrote: > +1 (non-binding) > > I ran the RC against a set of integration tests I have for a subset of the > Hive2 read fu

Re: New project integrated with Iceberg

2020-10-12 Thread Ryan Murray
We have published a draft PR here: https://github.com/apache/iceberg/pull/1587 Looking forward to your feedback! Best, Ryan On Thu, Oct 1, 2020 at 10:39 PM Jacques Nadeau wrote: > Hey All, > > Ryan Murray, Laurent Goujon and I have been working since early this year > on a way

Re: [VOTE] Release Apache Iceberg 0.9.1 RC0

2020-08-12 Thread Ryan Murray
1. Verify the signature: OK 2. Verify the checksum: OK 3. Untar the archive tarball: OK 4. Run RAT checks to validate license headers: RAT checks passed 5. Build and test the project: all unit tests passed. +1 (non-binding) Best, Ryan On Tue, Aug 11, 2020 at 6:56 PM Ryan Blue wrote: > Hi ever

Re: [VOTE] Release Apache Iceberg 0.9.0 RC5

2020-07-10 Thread Ryan Murray
1. Verify the signature: OK 2. Verify the checksum: OK 3. Untar the archive tarball: OK 4. Run RAT checks to validate license headers: RAT checks passed 5. Build and test the project: all unit tests passed. +1 (non-binding) I did see that my build took >12 minutes and used 100% of all 8 cores

Re: [DISCUSS] Changes for row-level deletes

2020-05-07 Thread Ryan Murray
FWIW I agree with Gautam on the changes. Keeping complexity down and easing transition to V2 should be a goal for this work. Is there a list of items that need to be finished for V2 schema/row level deletes to be ready? I would love to help but am not sure what is missing/in-progress. Best, Ryan

Re: [VOTE] Release Apache Iceberg 0.8.0-incubating RC1

2020-04-29 Thread Ryan Murray
+1 (non-binding) Validated checksums and signature, ran ./gradlew build, and ran license check. Loaded into a downstream project and it worked as expected. Best, Ryan On Wed, Apr 29, 2020 at 5:26 AM tison wrote: > Hi Ryan, > > Thanks for starting the voting process. > > +1 (non-binding) I ver