I believe Elasticsearch has better support for scaling horizontally (by adding nodes) than Solr does. Some benchmarks I've looked at also show it performing better under high load.
I probably wouldn't run them both on the same node, or you might see poor performance as they compete for resources. What type of usage do you expect: mostly read, or mostly write?

On Wed, Apr 22, 2015 at 5:06 PM, Matthew Johnson <matt.john...@algomi.com> wrote:

> Hi Ali, Brian,
>
> Thanks for the suggestion – we have previously used Solr (SolrCloud for
> distribution) for a lot of other products, presumably this will do the same
> job as ElasticSearch? Or does ElasticSearch have specifically better
> integration with Cassandra or better support for aggregate queries?
>
> Would it be an ok architecture to have a Cassandra node and a Solr/ES
> instance on each box, so they scale together? Or is it better to have
> separate servers for storage and search?
>
> Cheers,
>
> Matt
>
> *From:* Brian O'Neill [mailto:boneil...@gmail.com] *On Behalf Of *Brian
> O'Neill
> *Sent:* 22 April 2015 12:56
> *To:* user@cassandra.apache.org
> *Subject:* Re: Adhoc querying in Cassandra?
>
> +1, I think many organizations (including ours) pair Elastic Search with
> Cassandra.
>
> Use Cassandra as your system of record, then index the data with ES.
>
> -brian
>
> ---
>
> *Brian O'Neill *
>
> Chief Technology Officer
>
> Health Market Science, a LexisNexis Company
>
> 215.588.6024 Mobile • @boneill42 <http://www.twitter.com/boneill42>
>
> This information transmitted in this email message is for the intended
> recipient only and may contain confidential and/or privileged material. If
> you received this email in error and are not the intended recipient, or the
> person responsible to deliver it to the intended recipient, please contact
> the sender at the email above and delete this email and any attachments and
> destroy any copies thereof. Any review, retransmission, dissemination,
> copying or other use of, or taking any action in reliance upon, this
> information by persons or entities other than the intended recipient is
> strictly prohibited.
>
> *From: *Ali Akhtar <ali.rac...@gmail.com>
> *Reply-To: *<user@cassandra.apache.org>
> *Date: *Wednesday, April 22, 2015 at 7:52 AM
> *To: *<user@cassandra.apache.org>
> *Subject: *Re: Adhoc querying in Cassandra?
>
> You might find it better to use elasticsearch for your aggregate queries
> and analytics. Cassandra is more of just a data store.
>
> On Apr 22, 2015 4:42 PM, "Matthew Johnson" <matt.john...@algomi.com>
> wrote:
>
> Hi all,
>
> Currently we are setting up a “big” data cluster, but we are only going to
> have a couple of servers to start with but we need to be able to scale out
> quickly when usage ramps up. Previously we have used Hadoop/HBase for our
> big data cluster, but since we are starting this one on only two nodes I
> think Cassandra will be a much better fit, as Hadoop and HBase really need
> at least 3 to achieve any sort of resilience (zookeeper quorum etc).
>
> My question is this:
>
> I have used Apache Phoenix as a JDBC layer on top of HBase, which allows
> me to issue ad-hoc SQL-style queries. (eg count the number of times users
> have clicked on a certain button after clicking a different button in the
> last 3 weeks etc). My understanding is that CQL does not support this style
> of adhoc aggregate querying out of the box. Is there a recommended way to
> do count, sum, average etc without writing client code (in my case Java)
> every time I want to run one? I have been looking at projects like Drill,
> Spark etc that could potentially sit on top of Cassandra but without
> actually setting everything up and testing them it is difficult to figure
> out what they would give us.
>
> Does anyone else interactively issue adhoc aggregate queries to Cassandra,
> and if so, what stack do you use?
>
> Thanks!
>
> Matt
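
To illustrate the simple end of what Matt asks about above (count, sum, average over the last 3 weeks), here is a rough sketch in Python of an Elasticsearch search body that returns only aggregates. The field names (`button_id`, `duration_ms`, `timestamp`) and the helper `build_aggregate_query` are made up for illustration and assume the Cassandra data has already been indexed with such a mapping. It does not cover the "clicked one button after another" condition, which needs more than a plain metric aggregation.

```python
import json


def build_aggregate_query(button_field, button_value, metric_field,
                          time_field="timestamp", since="now-3w"):
    """Build an Elasticsearch search body that returns aggregates only.

    Every field name here is a hypothetical placeholder; adapt them to
    the mapping of your own index.
    """
    return {
        # size 0: do not return the matching documents, only the aggregates
        "size": 0,
        "query": {
            "bool": {
                "must": [
                    # restrict to events for one button
                    {"term": {button_field: button_value}},
                    # restrict to the last three weeks using date math
                    {"range": {time_field: {"gte": since}}},
                ]
            }
        },
        "aggs": {
            # number of matching documents that have a value for metric_field
            "event_count": {"value_count": {"field": metric_field}},
            # sum of metric_field across matching documents
            "metric_sum": {"sum": {"field": metric_field}},
            # arithmetic mean of metric_field across matching documents
            "metric_avg": {"avg": {"field": metric_field}},
        },
    }


if __name__ == "__main__":
    body = build_aggregate_query("button_id", "checkout", "duration_ms")
    print(json.dumps(body, indent=2))
```

The resulting JSON would be sent to the `_search` endpoint of whichever index holds the events; changing the question then only means changing the request body, not writing new client code for each query.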