[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15581971#comment-15581971 ]
Michael Schmeißer commented on SPARK-650:
-----------------------------------------
I agree that static initialization would solve the problem for cases where
everything is known or can be loaded at class-loading time, e.g. from property
files in the artifact itself.
For situations like RecordReaders, it might also work, because they have an
initialize method where they get contextual information that could have been
enriched with the required values from the driver.
However, we also have other cases where information from the driver is needed.
Imagine the following case: We have a temporary directory in HDFS which is
determined by the Oozie workflow instance ID. The driver knows this
information, because it is provided by Oozie via main method arguments. The
executor needs this information as well, e.g. to load some data that is
required to initialize a static context. Then, the question arises: How does
the information get to the executor?
One option is to ship it with the function instance. That means the developer
of the function has to know that he must call an initialization method in every
function, or at least in the first function applied to an RDD (which he
probably cannot know, because he received the RDD from a different part of the
application). The other option is an explicit mechanism that runs before any
developer function executes on an executor, which leads me back to the
"empty RDD" workaround.
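To make the first option concrete, here is a minimal sketch in plain Python.
It uses no Spark API at all. StaticContext, WithInit, tag_with_dir and the
directory value are hypothetical names invented for illustration; the wrapper
only stands in for a function object that would be serialized to an executor
together with the driver-side value.

```python
import threading


class StaticContext:
    """Per-process state, standing in for a static context on an executor."""

    _lock = threading.Lock()
    _temp_dir = None
    init_count = 0  # counts how often initialization actually ran

    @classmethod
    def ensure_initialized(cls, temp_dir):
        # Idempotent: only the first call in this process does the work.
        with cls._lock:
            if cls._temp_dir is None:
                cls._temp_dir = temp_dir
                cls.init_count += 1

    @classmethod
    def temp_dir(cls):
        if cls._temp_dir is None:
            raise RuntimeError("StaticContext used before initialization")
        return cls._temp_dir


class WithInit:
    """Wraps a user function and carries the driver-side value with it."""

    def __init__(self, temp_dir, func):
        self.temp_dir = temp_dir
        self.func = func

    def __call__(self, record):
        # Every wrapped function has to remember to do this first.
        StaticContext.ensure_initialized(self.temp_dir)
        return self.func(record)


def tag_with_dir(record):
    # A "developer function" that silently relies on the static context.
    return f"{StaticContext.temp_dir()}/{record}"


# Driver side: the value would come from the main method arguments.
temp_dir = "/tmp/wf-0000001"  # hypothetical, assumed for this sketch
wrapped = WithInit(temp_dir, tag_with_dir)
results = [wrapped(r) for r in ["a", "b"]]
```

The sketch also shows the weakness described above: tag_with_dir only works if
whoever passes it on remembers to wrap it, and nothing enforces that.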
> Add a "setup hook" API for running initialization code on each executor
> -----------------------------------------------------------------------
>
> Key: SPARK-650
> URL: https://issues.apache.org/jira/browse/SPARK-650
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Reporter: Matei Zaharia
> Priority: Minor
>
> Would be useful to configure things like reporting libraries