alamb commented on code in PR #16342: URL: https://github.com/apache/datafusion/pull/16342#discussion_r2138987979
########## datafusion/core/src/datasource/file_format/csv.rs: ########## @@ -795,6 +796,25 @@ mod tests { Ok(()) } + #[tokio::test] + async fn test_csv_write_empty_file() -> Result<()> { Review Comment: Can you please also add a test of what happens if we are writing into a directory (aka a partitioned file)? Like a test that targets a path of `/foo` In this case, I think I would expect that no files are created when the data stream is empty ########## datafusion/datasource/src/file_sink_config.rs: ########## @@ -77,13 +79,32 @@ pub trait FileSink: DataSink { .runtime_env() .object_store(&config.object_store_url)?; let (demux_task, file_stream_rx) = start_demuxer_task(config, data, context); - self.spawn_writer_tasks_and_join( - context, - demux_task, - file_stream_rx, - object_store, - ) - .await + let mut num_rows = self + .spawn_writer_tasks_and_join( + context, + demux_task, + file_stream_rx, + Arc::clone(&object_store), + ) + .await?; + if num_rows == 0 { Review Comment: I think it would help here to add some comments with context about what it is doing. For example, something like ```suggestion // If no rows were written, then no files are output either. In this case // send an empty record batch through to ensure the output file is generated if num_rows == 0 { ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org