brunal commented on code in PR #16342:
URL: https://github.com/apache/datafusion/pull/16342#discussion_r2185262228


##########
datafusion/datasource/src/file_sink_config.rs:
##########
@@ -77,13 +79,34 @@ pub trait FileSink: DataSink {
             .runtime_env()
             .object_store(&config.object_store_url)?;
         let (demux_task, file_stream_rx) = start_demuxer_task(config, data, context);
-        self.spawn_writer_tasks_and_join(
-            context,
-            demux_task,
-            file_stream_rx,
-            object_store,
-        )
-        .await
+        let mut num_rows = self
+            .spawn_writer_tasks_and_join(
+                context,
+                demux_task,
+                file_stream_rx,
+                Arc::clone(&object_store),
+            )
+            .await?;
+        if num_rows == 0 {
+            // If no rows were written, then no files are output either.

Review Comment:
   You say no rows => no file was created.
   
   But then you say: write an empty RecordBatch => ensure a file gets created.
   
   Except an empty RecordBatch has no rows (at least when written to a Parquet file).
   
   These two statements contradict each other.
   
   In practice, this PR caused a regression: we can no longer write an empty RecordBatch to Parquet, because the code here tries to write it a second time and we get an error.
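   
   To make the contradiction concrete, here is a minimal sketch in plain Rust. It is not DataFusion code: `WriteOutcome`, `write_batches`, `needs_fallback_by_rows` and `needs_fallback_by_files` are hypothetical names, and the model assumes the writer creates one file as soon as it receives any batch, even a batch with zero rows. Under that assumption, a row count of 0 cannot tell "no batch arrived" apart from "an empty batch was already written".

```rust
// Hypothetical model of the sink, for illustration only (not DataFusion's API).
#[derive(Debug, PartialEq)]
struct WriteOutcome {
    files_written: usize,
    rows_written: u64,
}

// Each slice element is the row count of one incoming batch.
// Assumption: a file is created whenever at least one batch arrives,
// even if that batch has zero rows.
fn write_batches(batch_row_counts: &[u64]) -> WriteOutcome {
    WriteOutcome {
        files_written: if batch_row_counts.is_empty() { 0 } else { 1 },
        rows_written: batch_row_counts.iter().sum(),
    }
}

// Check modeled on the diff: treat "0 rows" as "no file exists yet".
fn needs_fallback_by_rows(outcome: &WriteOutcome) -> bool {
    outcome.rows_written == 0
}

// Alternative check: only fall back when no file was actually created.
fn needs_fallback_by_files(outcome: &WriteOutcome) -> bool {
    outcome.files_written == 0
}
```

   In this model, `write_batches(&[0])` already produces a file, yet `needs_fallback_by_rows` still returns true for it, which corresponds to the second write described above. A check based on whether a file was created does not have that problem.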



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

