Resending...

> On Jan 9, 2025, at 1:57 PM, Rozov, Vlad <vro...@amazon.com.INVALID> wrote:
> 
> Hi,
> 
> I see a difference in how "path" is handled by DataFrameWriter.save(path) and 
> DataStreamWriter.start(path) when a relative path (for example, 
> "test.parquet") is used to write Parquet files (this possibly applies to other 
> file formats as well). With DataFrameWriter, the path is resolved relative to 
> the current working directory of the driver, which is what I would expect. 
> With DataStreamWriter, only _spark_metadata is written to the directory 
> relative to the driver's current working directory; the Parquet files are 
> written to a directory relative to the executor's directory. Is this a bug 
> caused by the relative path being passed to the executor as is, or is the 
> behavior by design? In the latter case, what is the rationale?
> 
> I do understand that using a relative path is not the best option, especially 
> in distributed systems, but I think relative paths are still commonly used 
> for testing and prototyping (and in examples).
> 
> Thank you,
> 
> Vlad
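
To make the suspected mechanism concrete, here is a minimal sketch in plain Python. It does not use Spark and does not show what Spark actually does internally. It only illustrates how the same relative string can name two different locations when each process resolves it against its own working directory, and how resolving it once up front avoids that. The directory names and the resolve_in helper are hypothetical.

```python
import os
import tempfile


def resolve_in(cwd: str, path: str) -> str:
    """Resolve `path` the way a process whose working directory is `cwd` would."""
    # os.path.join discards `cwd` when `path` is already absolute.
    return os.path.normpath(os.path.join(cwd, path))


# Hypothetical stand-ins for the two working directories involved.
base = tempfile.gettempdir()
driver_cwd = os.path.join(base, "driver")
executor_cwd = os.path.join(base, "executor")

relative = "test.parquet"

# Passed through as is, the same string names two different directories.
on_driver = resolve_in(driver_cwd, relative)
on_executor = resolve_in(executor_cwd, relative)

# Resolved once against the driver's directory, it means the same thing everywhere.
absolute = resolve_in(driver_cwd, relative)
on_executor_fixed = resolve_in(executor_cwd, absolute)

print(on_driver)
print(on_executor)
print(on_executor_fixed)
```

If the behavior described above is indeed caused by the relative path reaching the executor unresolved, passing an absolute path to start(path) would be the obvious workaround for tests and prototypes.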
