IN versus multiple asynchronous queries

2014-10-04 Thread Robert Wille
I have a table of small documents (less than 1K) that are often accessed together as a group. The group size is always less than 50. Which produces less load on the server: one query using an IN clause to get all 50 back together, or 50 concurrent queries? Which one is faster? Thanks, Robert

Cassandra + Solr

2014-10-04 Thread Robert Wille
I am architecting a solution for moving a large number of documents out of our MySQL database to C*. We use Solr to index these documents. I’ve recently become aware of a few different packages that integrate C* and Solr. At first blush, this seems like the perfect fit, as it would eliminate a c

Re: IN versus multiple asynchronous queries

2014-10-04 Thread DuyHai Doan
Definitely 50 concurrent queries, possibly in async mode. If you're using the IN clause with 50 values, the coordinator will block, waiting for 50 partitions to be fetched from different nodes (worst case = 50 nodes) before responding to the client. In addition to the very high latency, you'll put th
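
As a rough illustration of the "one async query per key" approach described in this reply, here is a minimal sketch. It assumes a session object shaped like the DataStax Python driver's `Session` (its `prepare` and `execute_async` methods, and `result()` on the returned future). The table `documents` and its columns `id` and `body` are hypothetical, since the original post does not give a schema.

```python
def fetch_group(session, doc_ids):
    """Fetch a group of documents with one async query per key
    instead of a single SELECT ... WHERE id IN (...)."""
    # Hypothetical schema: documents(id, body). Prepare once, reuse per key.
    select = session.prepare("SELECT id, body FROM documents WHERE id = ?")

    # Send every query before waiting on any of them, so the requests
    # are in flight at the same time rather than funnelled through a
    # single coordinator-side IN lookup.
    futures = [session.execute_async(select, (doc_id,)) for doc_id in doc_ids]

    rows = []
    for future in futures:
        # result() blocks until this one query finishes (or raises on error).
        rows.extend(future.result())
    return rows
```

With a real cluster this would be called as `fetch_group(session, ids)` where `session` comes from `Cluster(...).connect(keyspace)`. Whether it actually beats the IN form for groups of up to 50 depends on the cluster and data model, so it is worth measuring both.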

Re: Cassandra + Solr

2014-10-04 Thread Jack Krupansky
The "requirement" is only that the Lucene (Solr) index fit in system memory that the OS uses for file caching. SSDs are another matter. If somebody or some information source is telling you that the index must fit "in heap", please identify the source so that we can correct them. If the index