Re: Cassandra read throughput with little/no caching.

2012-12-24 Thread James Masson

Hi Aaron,

On 23/12/12 20:18, aaron morton wrote:

> First, the unhelpful advice: I strongly suggest changing the data
> model so you do not have 100MB+ rows. They will make life harder.


I don't think we have 100MB+ rows. Column families, yes - but not rows.




> > Write request latency is about 900 microsecs, read request latency
> > is about 4000 microsecs.



> 4 milliseconds to drag 100 to 300 MB data off a SAN, through your
> network, into C* and out to the client does not sound terrible at first
> glance. Can you benchmark an individual request to get an idea of the
> throughput?


It's a large number of small requests: about 250 writes/sec and about 100
reads/sec. I might look at some tcpdumps to see what it's actually doing...


With a total volume of approx 400Mb, split over 3 nodes, it takes about
30 minutes to run through the complete data set. There's near-zero disk I/O
and disk wait, so it's definitely coming out of the Linux disk cache.


That works out at about 0.2Mb/sec in data crunching terms - and about 
0.6Mb/sec network I/O.
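
As a rough check of the figures above, the arithmetic can be sketched in a few lines of Python. This assumes "Mb" in this thread means megabytes and that the 400Mb total is read once per 30-minute cycle; the variable names are illustrative only.

```python
# Back-of-the-envelope check of the quoted throughput.
# Assumption: "Mb" in the thread means megabytes.
TOTAL_DATA_MB = 400        # approximate total data volume across the cluster
NODES = 3                  # data is split over three nodes
CYCLE_SECONDS = 30 * 60    # about 30 minutes per full pass over the data


def throughput_mb_per_sec(total_mb, seconds):
    """Average data rate for one full pass over the data set."""
    return total_mb / seconds


cluster_rate = throughput_mb_per_sec(TOTAL_DATA_MB, CYCLE_SECONDS)
per_node_rate = cluster_rate / NODES

print(f"cluster:  {cluster_rate:.2f} MB/s")   # cluster:  0.22 MB/s
print(f"per node: {per_node_rate:.3f} MB/s")  # per node: 0.074 MB/s
```

The cluster-wide result agrees with the "about 0.2Mb/sec" estimate.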




> I would recommend removing the SAN from the equation; Cassandra will run
> better with local disks. The SAN also introduces a single point of failure
> into a distributed system.


Understood about the SPoF, but that is negated by good SAN fabric design. I
think a local disk or two is going to find it hard to compete with an
FC-attached SAN with gigabytes of dedicated DRAM cache and SSD tiering.

This is all on VMware anyway, so there's no option of local disks.




> > but it's likely in the Linux disk cache, given the sizing of the
> > node/data/jvm.

> Are you sure that the local Linux machine is going to cache files stored
> on the SAN?


Yes. At the filesystem level, Linux doesn't care (and isn't aware) whether
the volume is 'local' or not; everything goes through the same caching
strategy. Again, because this is VMware, it appears as a 'local' disk
anyway.


In short, disk isn't the limiting factor here.

thanks

James M


Re: Cassandra read throughput with little/no caching.

2012-12-24 Thread James Masson



On 21/12/12 17:56, Yiming Sun wrote:

> James, you could experiment with Row cache, with off-heap JNA cache, and
> see if it helps.  My own experience with row cache was not good, and the
> OS cache seemed to be most useful, but in my case, our data space was
> big, over 10TB.  Your sequential access pattern certainly doesn't play
> well with LRU, but given the small data space you have, you may be able
> to fit the data from one column family entirely into the row cache.




I've done some experimenting today with JNA/row cache: an extra 500Mb of
heap, a 300Mb row cache, the latest JNA, and caching=ALL set in the schema
for all column families in this keyspace.


I'm getting an average 5% row cache hit rate, with no increase in Cassandra
throughput and increased disk read I/O, basically because I've
sacrificed Linux disk cache for the Cassandra row cache.
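
The poor fit between a sequential access pattern and LRU can be illustrated with a toy simulation. This is a minimal sketch of a plain LRU cache under a repeated sequential scan, not Cassandra's actual row cache implementation; the class, function names and sizes are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Plain LRU cache that only tracks keys and counts hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            # Hit: mark the key as most recently used.
            self.entries.move_to_end(key)
            self.hits += 1
        else:
            # Miss: load the key, evicting the least recently used if full.
            self.misses += 1
            self.entries[key] = True
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)


def hit_rate(num_rows, capacity, passes):
    """Scan rows 0..num_rows-1 in order, `passes` times, and return the hit rate."""
    cache = LRUCache(capacity)
    for _ in range(passes):
        for row in range(num_rows):
            cache.get(row)
    return cache.hits / (cache.hits + cache.misses)


# Cache holds 75% of the rows: every row is evicted just before it is needed again.
partial = hit_rate(num_rows=1000, capacity=750, passes=5)
# Cache holds all rows: only the first pass misses.
full = hit_rate(num_rows=1000, capacity=1000, passes=5)

print(partial)  # 0.0
print(full)     # 0.8
```

In this model a cache that is even slightly smaller than the scanned data set gets no hits at all, so a low hit rate is what one would expect unless the whole working set fits.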


Load average was 4 (2-CPU boxes) for the duration of the cycle, where it
was about 2 before, basically because of the disk I/O, I think.


So, I think I'll disable row caching again...

James M


Re: CQL3 Compound Primary Keys - Do I have the right idea?

2012-12-24 Thread Manu Zhang
>
> CREATE TABLE seen_ships (
>     day text,
>     time_seen timestamp,
>     shipname text,
>     PRIMARY KEY (day, time_seen)
> );


In CQL3, we could select all the columns with the same 'day' and same
'time_seen'.

Is it possible with cassandra-cli?


On Mon, Dec 24, 2012 at 6:54 AM, Tristan Seligmann
wrote:

> On Sun, Dec 23, 2012 at 9:25 PM, aaron morton 
> wrote:
> > In this example:
> >
> >  CREATE TABLE seen_ships (
> >      day text,
> >      time_seen timestamp,
> >      shipname text,
> >      PRIMARY KEY (day, time_seen)
> >  );
> > http://www.datastax.com/dev/blog/whats-new-in-cql-3-0
> >
> > * day is the internal row key
> > * there is only ONE internal column / cell, the shipname
> > * the internal column / cell "shipname" is a composite of the *value* of
> > time_seen. e.g. 
>
> Alternatively, if you want a composite partition key eg.
> , this functionality is implemented in
> https://issues.apache.org/jira/browse/CASSANDRA-4179 and I believe is
> available in Cassandra 1.2 as well[1].
>
> [1] I recently asked about this on SO:
>
> http://stackoverflow.com/questions/13938288/can-a-cassandra-cql3-column-family-have-a-composite-partition-key
> --
> mithrandi, i Ainil en-Balandor, a faer Ambar
>
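
The mapping described above for seen_ships (day as the internal row key, and the time_seen value forming part of the internal column name) can be sketched as a toy Python model. This is a simplification for illustration only, not Cassandra's storage engine or API; the function names and ship names are made up.

```python
from datetime import datetime

# Toy model: internal row key (day) -> {(time_seen value, CQL column name): value}
storage = {}


def insert_row(day, time_seen, shipname):
    """Store one CQL3 row as a single internal cell with a composite name."""
    partition = storage.setdefault(day, {})
    partition[(time_seen, "shipname")] = shipname


def select_by_day(day):
    """Return the CQL3 rows for one day, ordered by time_seen."""
    partition = storage.get(day, {})
    return [
        {"day": day, "time_seen": ts, column: value}
        for (ts, column), value in sorted(partition.items())
    ]


# Hypothetical sample data.
insert_row("2012-12-24", datetime(2012, 12, 24, 14, 0), "Nostromo")
insert_row("2012-12-24", datetime(2012, 12, 24, 9, 30), "Serenity")
insert_row("2012-12-25", datetime(2012, 12, 25, 8, 0), "Rocinante")

rows = select_by_day("2012-12-24")
for row in rows:
    print(row["time_seen"], row["shipname"])
```

Both sightings for 2012-12-24 live under one internal row key, and they come back ordered by the time_seen component of the composite name.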