I should read my posts at least once to avoid so many typos. Hopefully you
are brave enough to read through.

Petr

On Mon, Sep 21, 2015 at 11:23 AM, Petr Novak <oss.mli...@gmail.com> wrote:

> I think you would have to persist events somehow if you don't want to miss
> them. I don't see any other option there: either in MQTT, if it is supported
> there, or by routing them through Kafka.
>
> There is a WriteAheadLog in Spark, but you would have to decouple MQTT
> stream reading and processing into 2 separate jobs so that you could upgrade
> the processing one, assuming the reading one would be stable (without
> changes) across versions. But it is problematic because there is no easy way
> to share DStreams between jobs - you would have to develop your own facility
> for it.
>
> Alternatively, the reading job could save MQTT events in their rawest form
> into files - to limit the need to change code - and then the processing job
> would work on top of them using Spark Streaming based on files. I think this
> is inefficient and can get quite complex if you would like to make it
> reliable.
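
The file-based hand-off described above can be sketched in plain Python,
without MQTT or Spark, just to show the shape of the idea. Everything here is
an assumption for illustration: the function names `spool_batch` and
`read_new_batches`, the file naming scheme, and the one-payload-per-line
layout (which only works if payloads contain no newline bytes). The reading
job writes each batch under a temporary name and renames it, so the
processing side never sees a half-written file.

```python
import os
import tempfile
import time
import uuid


def spool_batch(spool_dir, payloads):
    """Write one batch of raw payloads (bytes) to a new file, one per line.

    The data goes to a hidden temporary file first and is then renamed,
    so a reader only ever sees complete files.
    """
    os.makedirs(spool_dir, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(prefix=".tmp-", dir=spool_dir)
    with os.fdopen(fd, "wb") as f:
        for payload in payloads:
            f.write(payload + b"\n")
    # Timestamp keeps files roughly ordered; the uuid avoids name collisions.
    name = "batch-%020d-%s.raw" % (time.time_ns(), uuid.uuid4().hex)
    final_path = os.path.join(spool_dir, name)
    os.replace(tmp_path, final_path)  # atomic on the same filesystem
    return final_path


def read_new_batches(spool_dir, seen):
    """Return payloads from files not processed yet; updates `seen` in place."""
    payloads = []
    for name in sorted(os.listdir(spool_dir)):
        if name.startswith(".") or name in seen:
            continue  # skip in-progress temp files and already-read batches
        with open(os.path.join(spool_dir, name), "rb") as f:
            payloads.extend(line.rstrip(b"\n") for line in f)
        seen.add(name)
    return payloads
```

The processing side can be stopped, upgraded, and restarted while the reading
side keeps spooling; on restart it picks up whatever accumulated. Making
`seen` survive a restart (and cleaning up old files) is the part that gets
complex, and is left out of this sketch.
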
>
> Basically, either MQTT supports persistence (which I don't know) or there is
> Kafka for this use case.
>
> Another option would be, I think, to place observable streams with
> backpressure in between MQTT and Spark Streaming, as long as you could
> perform the upgrade before the buffers fill up.
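
The buffer limit mentioned above can be illustrated with a bounded queue from
the Python standard library. This is only a sketch of the constraint, not an
observable-stream implementation: `offer` and `drain` are hypothetical helper
names, and what the producer does when the buffer is full (block, drop, or
rely on the broker) is left open.

```python
import queue


def offer(buffer, event):
    """Try to enqueue an event; return False when the buffer is full."""
    try:
        buffer.put_nowait(event)
        return True
    except queue.Full:
        return False  # signal to the producer that it must slow down or drop


def drain(buffer):
    """Remove and return everything currently buffered, in arrival order."""
    events = []
    while True:
        try:
            events.append(buffer.get_nowait())
        except queue.Empty:
            return events
```

With a buffer of capacity N and an arrival rate of R events per second, the
consumer can be offline for roughly N / R seconds before `offer` starts
returning False, which is the window available for the upgrade.
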
>
> I'm sorry that it is not thought out well from my side, it is just a
> brainstorm but it might lead you somewhere.
>
> Regards,
> Petr
>
> On Mon, Sep 21, 2015 at 10:09 AM, Jeetendra Gangele <gangele...@gmail.com>
> wrote:
>
>> Hi All,
>>
>> I have a Spark Streaming application with a batch interval (10 ms) which is
>> reading the MQTT channel and dumping the data from MQTT to HDFS.
>>
>> So suppose I have to deploy a new application jar (with changes in the
>> Spark Streaming application): what is the best way to deploy? Currently I
>> am doing it as below:
>>
>> 1. Killing the running streaming app using yarn application -kill ID
>> 2. Then starting the application again
>>
>> The problem with the above approach is that, since we are not persisting
>> the events in MQTT, we will miss the events for the period of the deploy.
>>
>> How should I handle this case?
>>
>> regards
>> jeeetndra
>>
>
>
