Do we support any table options passed through here? I thought we had separate options defined that use shorter names (like target-size).
On Fri, Mar 5, 2021 at 8:50 AM Russell Spitzer <russell.spit...@gmail.com> wrote:

> I think if we are going to have our write behavior work like that, we
> should probably switch to a whitelist of valid properties for Spark
> writes, so we can warn folks that some options won't actually do anything.
> I think the current behavior is a bit of a surprise; I also don't like
> silent options :)
>
> On Mar 5, 2021, at 10:47 AM, Ryan Blue <rb...@netflix.com.INVALID> wrote:
>
> Russell is right. The property you're trying to set is a table property
> and needs to be set on the table.
>
> We don't currently support overriding arbitrary table properties in write
> options, mainly because we want to encourage people to set their
> configuration on the table instead of in jobs. That's a best practice that
> I highly recommend so you don't need to configure every job that writes to
> the table, and so you can make changes and have them automatically take
> effect without recompiling your write job.
>
> On Fri, Mar 5, 2021 at 8:44 AM Russell Spitzer <russell.spit...@gmail.com> wrote:
>
>> I believe those are currently only respected as table properties and not
>> as "spark write" properties, although there is a case to be made that we
>> should accept them there as well. You can alter your table so that it
>> contains those properties, and new files will be created with the
>> compression you would like.
>>
>> On Mar 5, 2021, at 7:15 AM, Javier Sanchez Beltran <jabelt...@expediagroup.com.INVALID> wrote:
>>
>> Hello Iceberg team!
>>
>> I have been researching Apache Iceberg to see how it would work in our
>> environment. We are still trying things out. We would like to use the
>> Parquet format with SNAPPY compression.
>> I already tried changing these two properties to SNAPPY, but it didn't
>> work (https://iceberg.apache.org/configuration/):
>>
>>   write.avro.compression-codec:    Gzip -> SNAPPY
>>   write.parquet.compression-codec: Gzip -> SNAPPY
>>
>> In this way:
>>
>>   dataset
>>     .writeStream()
>>     .format("iceberg")
>>     .outputMode("append")
>>     .option("write.parquet.compression-codec", "SNAPPY")
>>     .option("write.avro.compression-codec", "SNAPPY")
>>     ...
>>     .start()
>>
>> Did I do something wrong? Or do we need to take care of implementing
>> this SNAPPY compression ourselves?
>>
>> Thank you in advance,
>> Javier.
>
> --
> Ryan Blue
> Software Engineer
> Netflix

--
Ryan Blue
Software Engineer
Netflix
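For anyone landing on this thread later: Russell's suggestion of setting the codec as a table property (rather than a write option) can be sketched in Spark SQL roughly as follows, where the table name db.sample is a placeholder for your own table:

  -- Set the compression codecs as table properties; newly written data
  -- files should then use snappy without per-job .option(...) calls.
  -- 'db.sample' is a placeholder table name.
  ALTER TABLE db.sample SET TBLPROPERTIES (
    'write.parquet.compression-codec' = 'snappy',
    'write.avro.compression-codec' = 'snappy'
  );

Setting it once on the table also matches the best practice Ryan describes: every job writing to the table picks up the change without being recompiled.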