Won't the performance improve significantly if you increase the number of nodes, even on a commodity hardware profile?

On 5 Jul 2014 01:38, "Jens Rantil" <jens.ran...@tink.se> wrote:
> Hi Mike,
>
> To get subsecond performance on your queries using _any_ database
> you need to use proper indexing. Like Jeremy said, Solr will do this.
>
> If you'd like to try to solve this using Cassandra, you need to learn the
> difference between partition and clustering columns in your primary key,
> and understand that you need a clustering column to do any kind of range
> query.
>
> Also, COUNTs in Cassandra are generally fairly slow.
>
> Cheers,
> Jens
>
> --
> Sent from Mailbox <https://www.dropbox.com/mailbox>
>
> On Tue, Jun 24, 2014 at 10:09 AM, Mike Carter <jaloos...@gmail.com> wrote:
>
>> Hello!
>>
>> I'm a beginner in C* and I'm quite struggling with it.
>>
>> I'd like to measure the performance of some Cassandra range queries. The
>> idea is to execute multidimensional range queries on Cassandra. E.g. there
>> is a given table of 1 million rows with 10 columns and I'd like to execute
>> some queries like “select count(*) from testable where d=1 and v1<10 and v2
>> >20 and v3 <45 and v4>70 … allow filtering”. These kinds of queries are
>> very slow in C*, and as soon as the tables get bigger I get a read timeout,
>> probably caused by long scan operations.
>>
>> In further tests I'd like to extend the dimensions to more than 200
>> hundreds and the rows to 100 million, but currently I can't even handle
>> this small table. Should I reorganize the data, or is it impossible to
>> perform such high-dimensional queries on Cassandra?
>>
>> The setup:
>>
>> Cassandra is installed on a single node with 2 TB disk space and 180 GB
>> RAM.
>>
>> Connected to Test Cluster at localhost:9160.
>> [cqlsh 4.1.1 | Cassandra 2.0.7 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
>>
>> Keyspace:
>>
>> CREATE KEYSPACE test WITH replication = {
>>   'class': 'SimpleStrategy',
>>   'replication_factor': '1'
>> };
>>
>> Table:
>>
>> CREATE TABLE testc21 (
>>   key int,
>>   d int,
>>   v1 int,
>>   v10 int,
>>   v2 int,
>>   v3 int,
>>   v4 int,
>>   v5 int,
>>   v6 int,
>>   v7 int,
>>   v8 int,
>>   v9 int,
>>   PRIMARY KEY (key)
>> ) WITH
>>   bloom_filter_fp_chance=0.010000 AND
>>   caching='ROWS_ONLY' AND
>>   comment='' AND
>>   dclocal_read_repair_chance=0.000000 AND
>>   gc_grace_seconds=864000 AND
>>   index_interval=128 AND
>>   read_repair_chance=0.100000 AND
>>   replicate_on_write='true' AND
>>   populate_io_cache_on_flush='false' AND
>>   default_time_to_live=0 AND
>>   speculative_retry='99.0PERCENTILE' AND
>>   memtable_flush_period_in_ms=0 AND
>>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>>   compression={'sstable_compression': 'LZ4Compressor'};
>>
>> CREATE INDEX testc21_d_idx ON testc21 (d);
>>
>> select * from testc21 limit 10;
>>
>>  key    | d | v1 | v10 | v2 | v3 | v4  | v5 | v6 | v7 | v8 | v9
>> --------+---+----+-----+----+----+-----+----+----+----+----+-----
>>  302602 | 1 | 56 |  55 | 26 | 45 |  67 | 75 | 25 | 50 | 26 |  54
>>  531141 | 1 | 90 |  77 | 86 | 42 |  76 | 91 | 47 | 31 | 77 |  27
>>  693077 | 1 | 67 |  71 | 14 | 59 | 100 | 90 | 11 | 15 |  6 |  19
>>    4317 | 1 | 70 |  77 | 44 | 77 |  41 | 68 | 33 |  0 | 99 |  14
>>  927961 | 1 | 15 |  97 | 95 | 80 |  35 | 36 | 45 |  8 | 11 | 100
>>  313395 | 1 | 68 |  62 | 56 | 85 |  14 | 96 | 43 |  6 | 32 |   7
>>  368168 | 1 |  3 |  63 | 55 | 32 |  18 | 95 | 67 | 78 | 83 |  52
>>  671830 | 1 | 14 |  29 | 28 | 17 |  42 | 42 |  4 |  6 | 61 |  93
>>   62693 | 1 | 26 |  48 | 15 | 22 |  73 | 94 | 86 |  4 | 66 |  63
>>  488360 | 1 |  8 |  57 | 86 | 31 |  51 |  9 | 40 | 52 | 91 |  45
>>
>> Mike
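
To make the point about partition and clustering columns concrete, here is a minimal CQL sketch. The table name testc21_by_d and the choice of d as partition key with v1 as the first clustering column are assumptions for illustration only; nobody in the thread proposed this exact schema, and it has not been benchmarked.

```cql
-- Hypothetical variant of testc21: d is the partition key, v1 is the first
-- clustering column, and key is appended so that rows stay unique.
CREATE TABLE testc21_by_d (
  d int,
  v1 int,
  key int,
  v2 int,
  v3 int,
  v4 int,
  v5 int,
  v6 int,
  v7 int,
  v8 int,
  v9 int,
  v10 int,
  PRIMARY KEY ((d), v1, key)
);

-- Rows inside a partition are stored sorted by v1, so this range
-- restriction is a slice of one partition and needs no ALLOW FILTERING.
SELECT key, v1, v2, v3, v4
  FROM testc21_by_d
 WHERE d = 1 AND v1 < 10;
```

Only the range on v1 is served by the primary key here. The remaining predicates from the original query (v2, v3, v4, ...) are not covered by it and would still have to be applied some other way, for example in the client. Note also that in the sample output every row has d = 1, so with this layout all of those rows would end up in a single partition.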