I don't. The hot-deploy shouldn't happen while there is a job running; at least in the REPL it wouldn't make much sense. It's a development-only feature to shorten the iterative coding cycle. In a production environment this is not enabled ... though there might be situations where it would be desirable. But I'm not handling that currently, as it's much more complex.
On Mon, Jun 6, 2016 at 4:16 PM, Reynold Xin <r...@databricks.com> wrote:

> Thanks for the email. How do you deal with in-memory state that references
> the classes? This can happen in both streaming and caching in RDD and
> temporary view creation in SQL.
>
> On Mon, Jun 6, 2016 at 3:40 PM, S. Kai Chen <sean.kai.c...@gmail.com> wrote:
>
>> Hi,
>>
>> We use spark-shell heavily for ad-hoc data analysis as well as iterative
>> development of the analytics code. A common workflow consists of the
>> following steps:
>>
>> 1. Write a small Scala module, assemble the fat jar
>> 2. Start spark-shell with the assembly jar file
>> 3. Try out some ideas in the shell, then capture the code back into
>>    the module
>> 4. Go back to step 1 and restart the shell
>>
>> This is very similar to what people do in web-app development. And the
>> pain point is similar: in web-app development, a lot of time is spent
>> waiting for new code to be deployed; here, a lot of time is spent waiting
>> for Spark to restart. Having the ability to hot-deploy code in the REPL
>> would help a lot, just as being able to hot-deploy in containers like Play,
>> or using JRebel, has boosted productivity tremendously.
>>
>> I do have code that works with the 1.5.2 release. Is this something
>> that's interesting enough to be included in Spark proper? If so, should I
>> create a Jira ticket or a GitHub PR against the master branch?
>>
>> Cheers,
>>
>> Kai
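For readers following along, a minimal sketch of the general idea being discussed (not the actual patch): hot-deploying in the JVM typically amounts to pointing a fresh classloader at the rebuilt assembly jar, so newly compiled classes can be loaded without restarting the REPL. The jar path and method name below are hypothetical, for illustration only.

```scala
import java.net.URLClassLoader

// Sketch only: load a class from a rebuilt jar via a new classloader.
// Classes not found in the jar are resolved through the parent loader,
// i.e. the existing REPL classpath (standard parent-delegation behavior).
def reload(jarPath: String, className: String): Class[_] = {
  val loader = new URLClassLoader(
    Array(new java.io.File(jarPath).toURI.toURL), // the (re)assembled jar
    getClass.getClassLoader                       // fall back to current classpath
  )
  loader.loadClass(className)
}
```

Note that this is exactly where the in-memory-state question above bites: running jobs, cached RDDs, and temporary views still hold references to classes from the old loader, and those are not migrated by simply loading the new definitions.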