GitHub user theelderbeever created a discussion: How do you write a single
parquet file with a specified compression?
As the title says: how do you write a single parquet file with a specific
configuration? The two `write_parquet` methods that exist take completely
different arguments and config options, and neither offers much to work with.
Additionally, all the examples for
DataFusion 49.0.2
```rust
let options = WriterProperties::builder()
    .set_compression(datafusion::parquet::basic::Compression::ZSTD(
        ZstdLevel::try_new(3)?,
    ))
    .build();
let write_options =
    DataFrameWriteOptions::new().with_single_file_output(true);

// This writes a single file but takes `TableParquetOptions`, which can't
// configure compression. The docs say this is tied to `ParquetWriterOptions`,
// but there is no way to convert between the two.
df.repartition(Partitioning::RoundRobinBatch(1))?
    .write_parquet("data/data.zstd.parquet", write_options, None)
    .await?;

// This accepts the `WriterProperties` but can't be configured to write a
// single file.
ctx.write_parquet(
    df.repartition(Partitioning::RoundRobinBatch(1))?
        .create_physical_plan()
        .await?,
    "data/data.zstd.parquet",
    Some(options),
)
.await?;
```
GitHub link: https://github.com/apache/datafusion/discussions/17578