m09526 opened a new issue, #15656:
URL: https://github.com/apache/datafusion/issues/15656

   ### Is your feature request related to a problem or challenge?
   
   When running a DataFusion query that writes a single >100 GiB Parquet output file to S3, the query failed with an error returned from S3 via ObjectStore.
   
   Investigation showed this stems from how DataFusion uses ObjectStore's [BufWriter](https://docs.rs/object_store/latest/object_store/buffered/struct.BufWriter.html) to upload files to remote stores. BufWriter uses a [default buffer size](https://docs.rs/object_store/latest/src/object_store/buffered.rs.html#252) of 10 MiB. ObjectStore's AWS S3 implementation uses the "multipart upload" API, which uploads object data in chunks, is recommended for objects larger than 100 MiB, and [supports up to 10,000 parts](https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html#:~:text=When%20uploading%20a%20part%2C%20you%20must%20specify%20a%20part%20number%20in%20addition%20to%20the%20upload%20ID.%20You%20can%20choose%20any%20part%20number%20between%201%20and%2010%2C000).
   
   With the default buffer size, 10,000 parts × 10 MiB ≈ 100 GiB maximum object size.
   
   BufWriter supports changing the buffer size using the 
[with_capacity](https://docs.rs/object_store/latest/object_store/buffered/struct.BufWriter.html#method.with_capacity)
 function.
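   
   For illustration, a minimal sketch of raising the capacity. This uses object_store's in-memory store in place of the real S3 store, and the path and payload are placeholders:
   
   ```rust
   use std::sync::Arc;
   
   use object_store::buffered::BufWriter;
   use object_store::memory::InMemory;
   use object_store::path::Path;
   use object_store::ObjectStore;
   use tokio::io::AsyncWriteExt;
   
   #[tokio::main]
   async fn main() -> std::io::Result<()> {
       // In-memory store stands in for the real AmazonS3 store here.
       let store: Arc<dyn ObjectStore> = Arc::new(InMemory::new());
       let path = Path::from("output/large.parquet"); // placeholder path
   
       // 100 MiB parts instead of the 10 MiB default:
       // 10,000 parts x 100 MiB allows objects up to ~1 TiB.
       let mut writer = BufWriter::with_capacity(store, path, 100 * 1024 * 1024);
       writer.write_all(b"...file bytes...").await?; // placeholder payload
       writer.shutdown().await?; // flushes and completes the multipart upload
       Ok(())
   }
   ```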
   
   ### Describe the solution you'd like
   
   Add a configuration option to DataFusion, probably an execution option: an `Option<usize>` that specifies a larger upload buffer size so that larger result files can be written to S3. This option would then be available via `TaskContext`. BufWriter is used in:
   * datafusion/datasource/src/write/mod.rs, in `create_writer`
   * datafusion/datasource-csv/src/source.rs
   * datafusion/datasource-json/src/source.rs
   * datafusion/datasource-parquet/src/writer.rs
   * datafusion/datasource-parquet/src/file_format.rs
   
   The relevant `TaskContext` is directly available in most of the files where the BufWriter is created, and `create_writer` is only called from places where a `TaskContext` is in scope. Since `create_writer` is public, we could add a new `create_writer_with_size` function that accepts a buffer size parameter; a sketch follows below.
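   
   A rough sketch of the core of such a helper. The function name and the `Option<usize>` plumbing are placeholders, not the final API:
   
   ```rust
   use std::sync::Arc;
   
   use object_store::buffered::BufWriter;
   use object_store::path::Path;
   use object_store::ObjectStore;
   
   /// Hypothetical helper (names are placeholders): build a BufWriter
   /// honouring an optional buffer size, e.g. read from the proposed
   /// execution option via `TaskContext`. `None` keeps the 10 MiB default.
   fn writer_with_optional_capacity(
       store: Arc<dyn ObjectStore>,
       location: &Path,
       buffer_size: Option<usize>,
   ) -> BufWriter {
       match buffer_size {
           Some(capacity) => BufWriter::with_capacity(store, location.clone(), capacity),
           None => BufWriter::new(store, location.clone()),
       }
   }
   ```
   
   `create_writer_with_size` would presumably wrap this and then apply the same `FileCompressionType` handling that `create_writer` performs today.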
   
   ### Describe alternatives you've considered
   
   Limit queries so they never produce more than 100 GiB of Parquet output, splitting larger queries into smaller workloads. This is not viable for our use case.
   
   Create an ObjectStore wrapper that implements extra buffering behaviour and register it with DataFusion before running a query.
   
   ### Additional context
   
   _No response_

