I think they all have different names, and that's what I would be whitelisting, so any table options or the like would be rejected as invalid options.
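Roughly what I'm picturing, as a quick sketch (the class name, the option set, and the validate hook below are made up for illustration; this isn't the current Iceberg Spark code):

    import java.util.Locale;
    import java.util.Map;
    import java.util.Set;

    class SparkWriteOptionValidator {
      // Hypothetical whitelist of options the Spark writer actually honors.
      private static final Set<String> VALID_WRITE_OPTIONS = Set.of(
          "target-file-size-bytes",
          "check-nullability");

      // Reject anything else up front instead of silently ignoring it,
      // e.g. table properties like write.parquet.compression-codec.
      static void validate(Map<String, String> options) {
        for (String key : options.keySet()) {
          if (!VALID_WRITE_OPTIONS.contains(key.toLowerCase(Locale.ROOT))) {
            throw new IllegalArgumentException(String.format(
                "Invalid Spark write option: %s "
                    + "(table properties must be set on the table)", key));
          }
        }
      }
    }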
On Fri, Mar 5, 2021 at 10:54 AM Ryan Blue <rb...@netflix.com> wrote:

> Do we support any table options passed through here? I thought we had separate options defined that use shorter names (like target-size).
>
> On Fri, Mar 5, 2021 at 8:50 AM Russell Spitzer <russell.spit...@gmail.com> wrote:
>
>> I think if we are going to have our write behavior work like that, we should probably switch to whitelisting valid properties for Spark writes, so we can warn folks that some options won't actually do anything. I think the current behavior is a bit of a surprise; I also don't like silent options :)
>>
>> On Mar 5, 2021, at 10:47 AM, Ryan Blue <rb...@netflix.com.INVALID> wrote:
>>
>> Russell is right. The property you're trying to set is a table property and needs to be set on the table.
>>
>> We don't currently support overriding arbitrary table properties in write options, mainly because we want to encourage people to set their configuration on the table instead of in jobs. That's a best practice that I highly recommend: you don't need to configure every job that writes to the table, and you can make changes and have them automatically take effect without recompiling your write job.
>>
>> On Fri, Mar 5, 2021 at 8:44 AM Russell Spitzer <russell.spit...@gmail.com> wrote:
>>
>>> I believe those are currently only respected as table properties, not as "Spark write" properties, although there is a case to be made that we should accept them there as well. You can alter your table so that it contains those properties, and new files will be created with the compression you would like.
>>>
>>> On Mar 5, 2021, at 7:15 AM, Javier Sanchez Beltran <jabelt...@expediagroup.com.INVALID> wrote:
>>>
>>> Hello Iceberg team!
>>>
>>> I have been researching Apache Iceberg to see how it would work in our environment. We are still trying things out. We would like to use the Parquet format with SNAPPY compression.
>>>
>>> I already tried changing these two properties to SNAPPY, but it didn't work (https://iceberg.apache.org/configuration/):
>>>
>>> write.avro.compression-codec: Gzip -> SNAPPY
>>> write.parquet.compression-codec: Gzip -> SNAPPY
>>>
>>> Like this:
>>>
>>> dataset
>>>     .writeStream()
>>>     .format("iceberg")
>>>     .outputMode("append")
>>>     .option("write.parquet.compression-codec", "SNAPPY")
>>>     .option("write.avro.compression-codec", "SNAPPY")
>>>     …
>>>     .start()
>>>
>>> Did I do something wrong? Or do we need to take care of the SNAPPY compression implementation ourselves?
>>>
>>> Thank you in advance,
>>> Javier.
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>
> --
> Ryan Blue
> Software Engineer
> Netflix
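Concretely, the table-property fix Russell describes would look something like this (a sketch only; the table identifier local.db.events and the app name are placeholders):

    import org.apache.spark.sql.SparkSession;

    public class SetIcebergCompression {
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("set-iceberg-compression")
            .getOrCreate();

        // Set the codecs once on the table; subsequent writes (including a
        // streaming job like the one above) pick them up without per-job
        // options. "local.db.events" is a placeholder table identifier.
        spark.sql("ALTER TABLE local.db.events SET TBLPROPERTIES ("
            + "'write.parquet.compression-codec' = 'snappy', "
            + "'write.avro.compression-codec' = 'snappy')");

        spark.stop();
      }
    }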