Typo. I am talking about Spark only. Thanks, Oleg.
On Thursday, September 11, 2014, DuyHai Doan <doanduy...@gmail.com> wrote:

> Stupid question: do you really need both Storm & Spark? Can't you
> implement the Storm jobs in Spark? It will be operationally simpler to
> have fewer moving parts. I'm not saying that Storm is not the right fit;
> it may be totally suitable for some usages.
>
> But if you want to avoid the SPOF thing and don't want to bring in
> resource management frameworks, the Spark/Cassandra integration is an
> interesting alternative.
>
>
> On Wed, Sep 10, 2014 at 8:20 PM, Oleg Ruchovets <oruchov...@gmail.com> wrote:
>
>> Interesting things, actually:
>> We have Hadoop in our ecosystem. It has a single point of failure, and
>> I am not sure about inter-data-center replication.
>> The plan is to use Cassandra: no single point of failure, and there is
>> data center replication.
>> For aggregation/transformation we are using Spark. BUT storm requires
>> Mesos, which has a SINGLE POINT of failure (and it will require the same
>> maintenance as the secondary NameNode in Hadoop) :-) :-).
>>
>> Question: is there a way to have storage and processing without a
>> single point of failure, and with inter-data-center replication?
>>
>> Thanks,
>> Oleg
>>
>> On Thu, Sep 11, 2014 at 2:09 AM, DuyHai Doan <doanduy...@gmail.com> wrote:
>>
>>> "As far as I know, the Datastax connector uses thrift to connect Spark
>>> with Cassandra although thrift is already deprecated, could someone
>>> confirm this point?"
>>>
>>> --> The Scala connector is using the latest Java driver, so no, there
>>> is no Thrift there.
>>>
>>> For the Java version I'm not sure; I have not looked into it, but I
>>> think it also uses the new Java driver.
>>>
>>>
>>> On Wed, Sep 10, 2014 at 7:27 PM, Francisco Madrid-Salvador <pmad...@stratio.com> wrote:
>>>
>>>> Hi Oleg,
>>>>
>>>> Stratio Deep is just a library you must include in your Spark
>>>> deployment, so it doesn't guarantee any high availability at all. To
>>>> achieve HA you must use Mesos or any other 3rd-party resource manager.
>>>>
>>>> Stratio doesn't currently support PySpark, just Scala and Java.
>>>> Perhaps in the future...
>>>>
>>>> It should be ready for production use, but as always, please try it
>>>> first in a testing environment ;-)
>>>>
>>>> As far as I know, the Datastax connector uses Thrift to connect Spark
>>>> with Cassandra, although Thrift is already deprecated. Could someone
>>>> confirm this point?
>>>>
>>>> Paco