sergiimk opened a new issue, #13323:
URL: https://github.com/apache/datafusion/issues/13323

   ### Describe the bug
   
   Consider a snippet like this:
   ```rust
   df.write_parquet(
     "dir/data",
     DataFrameWriteOptions::new().with_single_file_output(true),
     None
   ).await
   ```
   Before v43 this would write a single file called `data`, but in v43 it 
creates `data` as a directory containing one or more randomly named files.
   
   This seems to be related to #13079 (cc @dhegberg), which added an 
extension-based heuristic.
   
   I see this as a regression: single-file output is requested explicitly, 
so I don't want a heuristic to be applied.
   
   We are using Parquet files with a content-addressable file system and our 
files don't have extensions.
   
   ### To Reproduce
   
   See above
   
   ### Expected behavior
   
   Considering the introduction of the extension-based heuristic, I would 
expect the following behavior:
   - `with_single_file_output` is not called (`single_file_output == None`) - 
apply the heuristic
   - `with_single_file_output(true)` - produce a single file at the exact path 
specified
   - `with_single_file_output(false)` - create a directory under the specified 
path if it doesn't exist and write one or many files
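   
   The three cases above could be sketched roughly as follows. This is only an 
illustration, not DataFusion's actual code: `OutputMode` and 
`resolve_output_mode` are hypothetical names, and the heuristic is assumed to 
be "the path has a file extension and does not end with `/`", which may differ 
from what #13079 implements.
   
   ```rust
   use std::path::Path;
   
   /// How the writer should treat the target path (hypothetical type).
   #[derive(Debug, PartialEq, Eq)]
   enum OutputMode {
       /// Write exactly one file at the given path.
       SingleFile,
       /// Treat the path as a directory and write one or more files into it.
       Directory,
   }
   
   /// Hypothetical helper: an explicit setting always wins, and the
   /// heuristic is consulted only when nothing was specified.
   fn resolve_output_mode(path: &str, single_file_output: Option<bool>) -> OutputMode {
       match single_file_output {
           // Explicit request: single file at the exact path, no heuristic.
           Some(true) => OutputMode::SingleFile,
           // Explicit request: directory containing one or many files.
           Some(false) => OutputMode::Directory,
           // Not specified: fall back to the (assumed) extension-based heuristic.
           None => {
               let has_extension = Path::new(path).extension().is_some();
               if has_extension && !path.ends_with('/') {
                   OutputMode::SingleFile
               } else {
                   OutputMode::Directory
               }
           }
       }
   }
   ```
   
   Under this sketch, `resolve_output_mode("dir/data", Some(true))` yields 
`OutputMode::SingleFile` even though `data` has no extension, which is the 
behavior we rely on for content-addressable storage.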
   
   ### Additional context
   
   -


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

