Got it.
Thank you so much for your detailed explanation.

Regards,
Shyam

On Thu, 12 Apr 2018, 17:37 Evelyn Smith, <u5015...@gmail.com> wrote:

> Cassandra tends to be used in a lot of web applications. Its loads are
> more natural and evenly distributed, like people logging on throughout the
> day, and the people operating it tend to be latency sensitive.
>
> Spark, on the other hand, will try to complete its tasks as quickly as
> possible. This might mean bulk reading from Cassandra at 10 times the
> usual operations load, but for only, say, 5 minutes every half hour (however
> long it takes to read in the data for a job, and whenever that job is run).
> In this case, during those 5 minutes your normal operations workload
> (customers) is going to experience a lot of latency.
>
> This even happens with streaming jobs. Every time Spark goes to interact
> with Cassandra it does so very quickly: it hammers it for reads and then
> does its own stuff until it needs to write things out. This might equate to
> intermittent latency spikes.
>
> In theory, you can throttle your reads and writes but I don’t know much
> about this and don’t see people actually doing it.
>
> Regards,
> Evelyn.
>
> On 12 Apr 2018, at 4:30 pm, sha p <shatestt...@gmail.com> wrote:
>
> Evelyn,
> Can you please elaborate on the statement below?
> Spark is notorious for causing latency spikes in Cassandra which is not
> great if you are sensitive to that.
>
>
> On Thu, 12 Apr 2018, 10:46 Evelyn Smith, <u5015...@gmail.com> wrote:
>
>> Are you building a search engine -> Solr
>> Are you building an analytics function -> Spark
>>
>> I feel they serve significantly different use cases. What are you
>> trying to build?
>>
>> If it’s analytics functionality that’s separate from your operations
>> functionality, I’d build it in its own DC. Spark is notorious for causing
>> latency spikes in Cassandra, which is not great if you are sensitive to
>> that.
>>
>> Regards,
>> Evelyn.
>>
>> On 12 Apr 2018, at 6:55 am, kooljava2 <koolja...@yahoo.com.INVALID>
>> wrote:
>>
>> Hello,
>>
>> We are exploring configuring Solr/Spark and wanted to get input on this.
>> 1) How do we decide which one to use?
>> 2) Do we run this on a DC where there is less workload?
>>
>> Any other suggestions or comments are appreciated.
>>
>> Thank you.
>>
>>
>>
>
