[DISCUSS] March board report

2021-03-05 Thread Ryan Blue
Hi everyone, Time for another board report! Here’s my current draft. It’s a little early since I’m going to be gone next week. If you want to add or update something, please comment soon and I’ll update it. Thanks! Description: Apache Iceberg is a table format for huge analytic datasets that is

Re: Question about Snappy compression format.

2021-03-05 Thread Russell Spitzer
I think they all have different names and that's what I would be whitelisting, so any table options or the like would be rejected as invalid options. On Fri, Mar 5, 2021 at 10:54 AM Ryan Blue wrote: > Do we support any table options passed through here? I thought we had > separate options defined

Re: Secondary Indexes - Pluggable File Filter interface for Apache Iceberg

2021-03-05 Thread Ryan Blue
I updated the invites. Sorry for the mixup! On Fri, Mar 5, 2021 at 2:10 AM webdev.andrei wrote: > Hi all, > > I would like to attend the discussion. I'm very interested in it as I'm > working with Miao's team on indexing. The PR for Iceberg support in > Hyperspace referred to by Miao is my work.

Re: Question about Snappy compression format.

2021-03-05 Thread Ryan Blue
Do we support any table options passed through here? I thought we had separate options defined that use shorter names (like target-size). On Fri, Mar 5, 2021 at 8:50 AM Russell Spitzer wrote: > I think if we are going to have our write behavior work like that we > should probably switch to a whi

Re: Question about Snappy compression format.

2021-03-05 Thread Russell Spitzer
I think if we are going to have our write behavior work like that, we should probably switch to whitelisting valid properties for Spark writes, so we can warn folks that some options won't actually do anything. I think the current behavior is a bit of a surprise; I also don't like silent opt
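
The whitelisting idea above can be sketched roughly as follows. This is only an illustration, not Iceberg's actual implementation: the function name `filter_write_options` and the contents of `ALLOWED_WRITE_OPTIONS` are hypothetical (`target-size` is simply the example name mentioned in this thread, and `write-format` is a placeholder).

```python
import warnings

# Hypothetical allow-list of write option names. This is an assumption for
# illustration only, not the list of options Iceberg actually accepts.
ALLOWED_WRITE_OPTIONS = frozenset({"target-size", "write-format"})


def filter_write_options(options, allowed=ALLOWED_WRITE_OPTIONS):
    """Split write options into accepted ones and ignored ones.

    A warning is emitted for every option that is not on the allow-list,
    so that it is not dropped silently.
    """
    accepted = {}
    ignored = []
    for key, value in options.items():
        if key in allowed:
            accepted[key] = value
        else:
            ignored.append(key)
            warnings.warn(
                f"Write option {key!r} is not recognized and will be ignored",
                stacklevel=2,
            )
    return accepted, sorted(ignored)
```

With a check like this, passing a table property as a write option would produce a visible warning instead of having no effect.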

Re: Secondary Indexes - Pluggable File Filter interface for Apache Iceberg

2021-03-05 Thread Andrei Taleanu
Hi guys, I could attend the meeting, but I believe @Andrei Ionescu would benefit more from it (it’s his PR not mine, likely an autocomplete fail 😃). Cheers, Andrei From: Miao Wang Date: Thursday, 4 March 2021 at 19:22 To: dev@iceberg.apache.org , rb...@netflix.com

Re: Question about Snappy compression format.

2021-03-05 Thread Ryan Blue
Russell is right. The property you're trying to set is a table property and needs to be set on the table. We don't currently support overriding arbitrary table properties in write options, mainly because we want to encourage people to set their configuration on the table instead of in jobs. That's

Re: Question about Snappy compression format.

2021-03-05 Thread Russell Spitzer
I believe those are currently only respected as table properties and not as "spark write" properties, although there is a case to be made that we should accept them there as well. You can alter your table so that it contains those properties and new files will be created with the compression you
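
As a minimal sketch of altering the table, the helper below builds a Spark SQL statement that sets a compression property on the table itself. Several things here are assumptions: the thread preview does not name the property, so `write.parquet.compression-codec` is assumed to be the one in question; the helper name `alter_table_compression_sql` and the table name `db.events` are hypothetical.

```python
def alter_table_compression_sql(table_name, codec="snappy"):
    """Build an ALTER TABLE statement that sets the Parquet compression codec.

    Assumption: 'write.parquet.compression-codec' is the table property
    being discussed; the thread preview does not name it explicitly.
    """
    return (
        f"ALTER TABLE {table_name} "
        f"SET TBLPROPERTIES ('write.parquet.compression-codec' = '{codec}')"
    )


# Hypothetical table name, for illustration only.
statement = alter_table_compression_sql("db.events")
# With an active Spark session, the statement could be run as: spark.sql(statement)
```

As described above, only files written after the property is set would use the new codec.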

Question about Snappy compression format.

2021-03-05 Thread Javier Sanchez Beltran
Hello Iceberg team! I have been researching Apache Iceberg to see how it would work in our environment. We are still trying things out. We would like to use the Parquet format with the SNAPPY compression type. I already tried changing these two properties to SNAPPY, but it didn’t work (https://iceberg.apa

Re: Secondary Indexes - Pluggable File Filter interface for Apache Iceberg

2021-03-05 Thread webdev.andrei
Hi all, I would like to attend the discussion. I'm very interested in it as I'm working with Miao's team on indexing. The PR for Iceberg support in Hyperspace referred to by Miao is my work. If needed, I can explain how Hyperspace works and what the plan is for Hyperspace in the near future. You