Got it.
Thank you so much for your detailed explanation.

Regards,
Shyam

On Thu, 12 Apr 2018, 17:37 Evelyn Smith, <u5015...@gmail.com> wrote:

> Cassandra tends to be used in a lot of web applications. Its loads are
> more natural and evenly distributed, like people logging on throughout the
> day. And people operating it tend to be latency sensitive.
>
> Spark, on the other hand, will try to complete its tasks as quickly as
> possible. This might mean bulk reading from Cassandra at 10 times the
> usual operations load, but for only, say, 5 minutes every half hour (however
> long it takes to read in the data for a job, and whenever that job is run).
> In this case, during those 5 minutes your normal operations work (customers)
> is going to experience a lot of latency.
>
> This even happens with streaming jobs: every time Spark goes to interact
> with Cassandra it does so very quickly, hammers it for reads, and then does
> its own stuff until it needs to write things out. This might equate to
> intermittent latency spikes.
>
> In theory, you can throttle your reads and writes, but I don’t know much
> about this and don’t see people actually doing it.
>
> Regards,
> Evelyn.
>
> On 12 Apr 2018, at 4:30 pm, sha p <shatestt...@gmail.com> wrote:
>
> Evelyn,
> Can you please elaborate on the below?
> Spark is notorious for causing latency spikes in Cassandra, which is not
> great if you are sensitive to that.
>
>
> On Thu, 12 Apr 2018, 10:46 Evelyn Smith, <u5015...@gmail.com> wrote:
>
>> Are you building a search engine -> Solr
>> Are you building an analytics function -> Spark
>>
>> I feel they are used in significantly different use cases. What are you
>> trying to build?
>>
>> If it’s an analytics functionality that’s separate from your operations
>> functionality, I’d build it in its own DC. Spark is notorious for causing
>> latency spikes in Cassandra, which is not great if you are sensitive to
>> that.
>>
>> Regards,
>> Evelyn.
>>
>> On 12 Apr 2018, at 6:55 am, kooljava2 <koolja...@yahoo.com.INVALID>
>> wrote:
>>
>> Hello,
>>
>> We are exploring configuring Solr/Spark and wanted to get input on this.
>> 1) How do we decide which one to use?
>> 2) Do we run this on a DC where there is less workload?
>>
>> Any other suggestions or comments are appreciated.
>>
>> Thank you.
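
On the throttling point raised in the thread: a minimal sketch of what throttling reads could look like on the client side, assuming a simple token bucket in Python. The names `TokenBucket`, `throttled_read`, and `read_one` are hypothetical and are not part of Spark, Cassandra, or the Spark Cassandra Connector. The connector has its own throughput settings, which are not shown here; this only illustrates the idea of trading a short burst for a steady, capped request rate.

```python
import time


class TokenBucket:
    """Hypothetical client-side throttle: at most `rate` operations per
    second on average, with bursts of up to `capacity` operations."""

    def __init__(self, rate, capacity, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum burst size
        self.tokens = float(capacity)    # start with a full bucket
        self.clock = clock               # injectable for testing
        self.sleep = sleep
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def acquire(self, n=1):
        """Block until n tokens are available, then consume them."""
        if n > self.capacity:
            raise ValueError("n exceeds bucket capacity")
        while True:
            self._refill()
            if self.tokens >= n:
                self.tokens -= n
                return
            # Wait just long enough for the missing tokens to accumulate.
            self.sleep((n - self.tokens) / self.rate)


def throttled_read(keys, read_one, bucket):
    """Call read_one(key) for each key, paced by the bucket.

    read_one stands in for whatever performs a single read against the
    database; it is an assumption of this sketch.
    """
    results = []
    for key in keys:
        bucket.acquire()
        results.append(read_one(key))
    return results
```

For example, `throttled_read(keys, read_one, TokenBucket(rate=100, capacity=20))` would issue at most 20 reads back to back and then settle at roughly 100 reads per second, so the job takes longer but is less likely to produce the latency spikes described above.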