[
https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15725542#comment-15725542
]
Michael Schmeißer commented on SPARK-650:
-----------------------------------------
Sure, it can be included in the closure, and this was also our first solution to
the problem. But if the application has many layers and you often need a
resource that requires info X to initialize, it soon gets very inconvenient,
because you have to pass X around a lot and pollute your APIs.
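
To illustrate the closure approach and why it gets inconvenient, here is a
minimal sketch in plain Java. All names (`InfoX`, `LayeredJob`, the `build*`
methods) are hypothetical, and `java.util.function.Function` stands in for the
Spark function interfaces; it only shows how X has to be threaded through
every layer so that the innermost function can capture it.

```java
import java.io.Serializable;
import java.util.function.Function;

// Hypothetical "info X" needed to initialize the resource; must be serializable
// so that it can travel inside the closure.
final class InfoX implements Serializable {
    private static final long serialVersionUID = 1L;
    final String url;
    InfoX(String url) { this.url = url; }
}

// Hypothetical layered application: only the innermost layer uses X,
// yet every layer above it has to accept and forward it.
final class LayeredJob {
    static Function<String, String> buildPipeline(InfoX x) {
        return buildStage(x);          // outer layer: forwards X only
    }

    static Function<String, String> buildStage(InfoX x) {
        return buildMapper(x);         // middle layer: forwards X only
    }

    static Function<String, String> buildMapper(InfoX x) {
        // X is captured in the closure and is serialized along with it.
        return (Function<String, String> & Serializable) s -> x.url + "/" + s;
    }
}
```

Every signature between the entry point and the function that actually needs
the resource now mentions X, which is the API pollution described above.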
Thus, our next solution was to create a base function class which takes X in
its constructor and makes sure that the resource is initialized on the executor
side if it wasn't before. The drawback of this solution is that the function
developer can forget to extend the function base class and then he may or may
not be able to access the resource depending on whether a function has run
before on the executor which performed the initialization. This is really
error-prone (it actually led to errors) and, even if done correctly, prevents
lambdas from being used for functions.
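
The base function class described above could look roughly like the following
sketch. It is plain Java and does not reproduce our actual code:
`ExecutorResource`, `ResourceAwareFunction` and `TagFunction` are hypothetical
names, and `java.util.function.Function` again stands in for the Spark function
interfaces. The resource is assumed to be a per-JVM singleton.

```java
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical executor-side resource, held once per JVM.
final class ExecutorResource {
    private static volatile ExecutorResource instance;
    static final AtomicInteger initCount = new AtomicInteger();

    private final String info;

    private ExecutorResource(String info) { this.info = info; }

    // Initializes the resource if this JVM has not done so yet; otherwise a no-op.
    static void ensureInitialized(String info) {
        if (instance == null) {
            synchronized (ExecutorResource.class) {
                if (instance == null) {
                    instance = new ExecutorResource(info);
                    initCount.incrementAndGet();
                }
            }
        }
    }

    // Fails if no function has triggered the initialization on this JVM yet.
    static ExecutorResource get() {
        ExecutorResource r = instance;
        if (r == null) {
            throw new IllegalStateException("resource not initialized on this JVM");
        }
        return r;
    }

    String tag(String input) { return info + ":" + input; }
}

// Base class that carries X and initializes the resource before each call.
abstract class ResourceAwareFunction<T, R> implements Function<T, R>, Serializable {
    private static final long serialVersionUID = 1L;
    private final String info;

    protected ResourceAwareFunction(String info) { this.info = info; }

    @Override
    public final R apply(T input) {
        ExecutorResource.ensureInitialized(info);
        return doApply(input);
    }

    protected abstract R doApply(T input);
}

// A function written "correctly", i.e. by extending the base class.
final class TagFunction extends ResourceAwareFunction<String, String> {
    private static final long serialVersionUID = 1L;

    TagFunction(String info) { super(info); }

    @Override
    protected String doApply(String input) {
        return ExecutorResource.get().tag(input);
    }
}
```

A lambda such as `s -> ExecutorResource.get().tag(s)` cannot extend the base
class. It throws if it is the first function to run in a JVM, and works only
if a `ResourceAwareFunction` happened to run there before, which is exactly the
"may or may not" behaviour mentioned above.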
As a result, we now use the "empty RDD" approach or piggy-back on the Spark
JavaSerializer. Both work fine and initialize the executor-side resource
properly on all executors. So, from a function developer's point of view that's
nice, but overall, the solution relies on Spark internals to work, which is why
I would rather have an explicit mechanism to perform such an initialization.
> Add a "setup hook" API for running initialization code on each executor
> -----------------------------------------------------------------------
>
> Key: SPARK-650
> URL: https://issues.apache.org/jira/browse/SPARK-650
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Reporter: Matei Zaharia
> Priority: Minor
>
> Would be useful to configure things like reporting libraries
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)