Sean Owen created SPARK-17810:
---------------------------------

             Summary: Default spark.sql.warehouse.dir is relative to local FS but can resolve as HDFS path
                 Key: SPARK-17810
                 URL: https://issues.apache.org/jira/browse/SPARK-17810
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.0.1
            Reporter: Sean Owen
            Assignee: Sean Owen


Following SPARK-15899 and 
https://github.com/apache/spark/pull/13868#discussion_r82252372 we have a 
slightly different problem. 

The change removed the {{file:}} scheme from the default 
{{spark.sql.warehouse.dir}} as part of its fix, though the path is still 
clearly intended to be a local FS path and defaults to "spark-warehouse" in the 
user's home dir. However, when the default filesystem is HDFS, this scheme-less 
path is resolved as an HDFS path, where it almost surely doesn't exist. 
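
To illustrate the resolution behavior described above, here is a minimal Python sketch. It is a simplified model, not Hadoop's actual implementation: the function {{qualify}}, the host {{namenode:8020}} and the path {{/home/alice/spark-warehouse}} are all hypothetical names chosen for the example. It only shows the general rule that a path with no scheme inherits the scheme and authority of the default filesystem.

```python
from urllib.parse import urlsplit


def qualify(path: str, default_fs: str) -> str:
    """Simplified model: qualify an absolute path against a default filesystem URI.

    Hypothetical helper for illustration only; this is not Hadoop's code.
    """
    if urlsplit(path).scheme:
        # An explicit scheme (e.g. "file:") pins the path to that filesystem.
        return path
    fs = urlsplit(default_fs)
    # No scheme: the path inherits the default filesystem's scheme and authority.
    return f"{fs.scheme}://{fs.netloc}{path}"


# Scheme-less default, evaluated with a local default filesystem.
local = qualify("/home/alice/spark-warehouse", "file:///")
# -> "file:///home/alice/spark-warehouse"

# The same scheme-less default, evaluated with HDFS as the default filesystem.
on_hdfs = qualify("/home/alice/spark-warehouse", "hdfs://namenode:8020")
# -> "hdfs://namenode:8020/home/alice/spark-warehouse"

# A path with an explicit file: scheme is unaffected by the default filesystem.
pinned = qualify("file:/tmp/spark-warehouse", "hdfs://namenode:8020")
# -> "file:/tmp/spark-warehouse"
```

In this model, the same default value points at two different filesystems depending on the cluster configuration, which is the behavior reported here.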

This can be worked around by overriding {{spark.sql.warehouse.dir}} with a path 
like "file:/tmp/spark-warehouse", or with any valid HDFS path. However, that 
probably won't work on Windows (the original problem), and it still means the 
default fails to work for most HDFS use cases.
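
As a rough sketch of the workaround, the following Python snippet builds an explicit {{file:}} URI for a local warehouse directory using only the standard library. The directory names are assumptions for illustration. Whether Spark accepts the resulting value on Windows is exactly the open question above, so the snippet only shows how the URI strings are formed. Note that {{as_uri()}} emits the {{file:///}} form rather than the {{file:/}} form used above.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical local warehouse locations; adjust for the actual environment.
posix_dir = PurePosixPath("/tmp/spark-warehouse")
windows_dir = PureWindowsPath("C:/tmp/spark-warehouse")

# as_uri() requires an absolute path and returns a URI with an explicit file: scheme.
posix_uri = posix_dir.as_uri()      # "file:///tmp/spark-warehouse"
windows_uri = windows_dir.as_uri()  # "file:///C:/tmp/spark-warehouse"

# A value like this could then be passed as spark.sql.warehouse.dir,
# for example via --conf on the command line.
conf_arg = f"spark.sql.warehouse.dir={posix_uri}"
```

With the scheme spelled out, the setting no longer depends on which filesystem the cluster treats as its default.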

There's a related problem here: the docs say the default should be 
spark-warehouse relative to the current working dir, not the user home dir. We 
can adjust that.

PR coming shortly.



