[
https://issues.apache.org/jira/browse/SPARK-18817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15757998#comment-15757998
]
Felix Cheung commented on SPARK-18817:
--------------------------------------
Aside from changing the existing shipped behavior, several places in the
documentation describe this behavior and would become wrong, so they would need
to be updated.
More importantly, IMO, we would still have a feature that can be turned on (as
documented or suggested in the documentation) that causes files to be written
without the user explicitly agreeing to it (or understanding it). To me this
doesn't fully address the root of the issue; it merely side-steps it.
I've managed to track down the fix to move metastore_db and derby.log, though.
There are two separate switches to set, so it is doable from pure R; but I'd
recommend doing it in
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala#L116
in order to respect any existing value from hive-site.xml, if one is given.
How about we introduce something like spark.sql.default.derby.dir and fix it
that way?
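For reference, a minimal sketch of what the pure-R workaround could look like.
The two switches shown (derby.stream.error.file for derby.log and
javax.jdo.option.ConnectionURL for metastore_db) are my assumption of the
properties involved, and exact key handling may differ by Spark/Hive version;
this is a sketch, not a tested fix:

```r
library(SparkR)

# Assumption: derby.stream.error.file controls where derby.log is written,
# and javax.jdo.option.ConnectionURL controls where metastore_db is created.
derby_log <- file.path(tempdir(), "derby.log")
metastore <- file.path(tempdir(), "metastore_db")

sparkR.session(sparkConfig = list(
  # Keep the warehouse inside R's tempdir() as well.
  spark.sql.warehouse.dir = file.path(tempdir(), "spark-warehouse"),
  # Redirect derby.log via a JVM system property on the driver.
  spark.driver.extraJavaOptions =
    paste0("-Dderby.stream.error.file=", derby_log),
  # Point the Hive metastore's embedded Derby database at tempdir();
  # depending on the version this may need to go through hive-site.xml
  # or a spark.hadoop.-prefixed key instead.
  javax.jdo.option.ConnectionURL =
    paste0("jdbc:derby:;databaseName=", metastore, ";create=true")
))
```

Doing it in SparkHadoopUtil instead would let Spark set these only when
hive-site.xml has not already supplied a value.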
> Ensure nothing is written outside R's tempdir() by default
> ----------------------------------------------------------
>
> Key: SPARK-18817
> URL: https://issues.apache.org/jira/browse/SPARK-18817
> Project: Spark
> Issue Type: Sub-task
> Components: SparkR
> Reporter: Brendan Dwyer
> Priority: Critical
>
> Per CRAN policies
> https://cran.r-project.org/web/packages/policies.html
> {quote}
> - Packages should not write in the users’ home filespace, nor anywhere else
> on the file system apart from the R session’s temporary directory (or during
> installation in the location pointed to by TMPDIR: and such usage should be
> cleaned up). Installing into the system’s R installation (e.g., scripts to
> its bin directory) is not allowed.
> Limited exceptions may be allowed in interactive sessions if the package
> obtains confirmation from the user.
> - Packages should not modify the global environment (user’s workspace).
> {quote}
> Currently "spark-warehouse" gets created in the working directory when
> sparkR.session() is called.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)