[
https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15714760#comment-15714760
]
Robert Neumann commented on SPARK-650:
--------------------------------------
Sean, I agree this is the essential question in this thread. If we get this
sorted out, then we are good and can achieve consensus on what to do with this
ticket.
A singleton "works" indeed. However, from a software engineering point of view
it is not nice. There exists a class of Spark Streaming jobs that requires
"setup -> do -> cleanup" semantics. The framework (in this case Spark
Streaming) should explicitly support these semantics through appropriate API
hooks. A singleton instead would hide these semantics: you would need to
implement some lazy code to check whether an HBase connection was already set
up or not, and the singleton would need to perform this check for every write
operation to HBase.
I do not think that application logic (the singleton within the Spark Streaming
job) is the right place to wire in the "setup -> do -> cleanup" pattern. It is
a generic pattern, and there exists a whole class of Spark Streaming jobs (not
just one specific streaming job) that are based on this pattern.
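To make the criticism concrete, here is a minimal sketch of the lazy-singleton workaround being described. This is hypothetical illustration code, not from Spark or HBase: ConnectionHolder and write_partition are invented names, and a plain object() stands in for a real HBase connection. Note the per-call existence check, and that there is no natural place for the "cleanup" step at all.

```python
class ConnectionHolder:
    """Hypothetical per-executor singleton holding a shared connection."""
    _conn = None

    @classmethod
    def get(cls):
        # Lazy check on every access -- exactly the boilerplate that an
        # explicit "setup" hook in the framework would make unnecessary.
        if cls._conn is None:
            cls._conn = object()  # stand-in for e.g. an HBase connection
        return cls._conn


def write_partition(records):
    """Hypothetical per-partition write function (e.g. for foreachPartition)."""
    conn = ConnectionHolder.get()  # must re-check setup on every partition
    for record in records:
        pass  # write each record via conn
    # No "cleanup" counterpart exists: the connection is never closed here,
    # because application logic cannot know when the executor shuts down.
```

With a framework-level "setup -> do -> cleanup" hook, the existence check and the dangling-connection problem would both disappear, since the framework would call setup once per executor and cleanup on shutdown.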
> Add a "setup hook" API for running initialization code on each executor
> -----------------------------------------------------------------------
>
> Key: SPARK-650
> URL: https://issues.apache.org/jira/browse/SPARK-650
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Reporter: Matei Zaharia
> Priority: Minor
>
> Would be useful to configure things like reporting libraries
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]