Hi Aaron,

On 23/12/12 20:18, aaron morton wrote:
First, the non helpful advice, I strongly suggest changing the data
model so you do not have 100MB+ rows. They will make life harder.

I don't think we have 100MB+ rows. Column families, yes - but not rows.


Write request latency is about 900 microsecs, read request latency
is about 4000 microsecs.


4 milliseconds to drag 100 to 300 MB of data off a SAN, through your
network, into C* and out to the client does not sound terrible at first
glance. Can you benchmark an individual request to get an idea of the
throughput?

It's large numbers of small requests: about 250 writes/sec and about 100 reads/sec. I might look at some tcpdumps to see what it's actually doing...
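
For a rough per-request benchmark from the client side, something like
this would do (a minimal sketch - it assumes the DataStax Python driver
and a hypothetical keyspace/table/key list, not necessarily what we run):

    # Time individual reads from the client side; node IPs, keyspace,
    # table and keys below are placeholders.
    import time
    from cassandra.cluster import Cluster

    cluster = Cluster(['10.0.0.1', '10.0.0.2', '10.0.0.3'])
    session = cluster.connect('app')

    keys = ['k1', 'k2', 'k3']  # substitute a representative sample of real keys
    timings = []
    for key in keys:
        start = time.perf_counter()
        rows = session.execute('SELECT * FROM data WHERE key = %s', (key,))
        list(rows)  # force the full result set to be fetched
        timings.append(time.perf_counter() - start)

    timings.sort()
    print('median: %.1f ms' % (timings[len(timings) // 2] * 1000))
    print('max:    %.1f ms' % (timings[-1] * 1000))

    cluster.shutdown()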

With a total volume of approx 400MB, split over 3 nodes, it takes about 30 mins to run through the complete data set. There's near-zero disk I/O and disk wait. It's definitely coming out of the Linux disk cache.

That works out at about 0.2MB/sec in data-crunching terms, and about 0.6MB/sec of network I/O.
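
The back-of-the-envelope arithmetic, for reference (the replica count
mentioned below is my assumption, not a measured figure):

    # Sanity-check the throughput figures above.
    data_set_mb = 400.0      # total data volume, MB
    run_time_s = 30 * 60.0   # ~30 minutes to walk the full data set

    crunch_rate = data_set_mb / run_time_s
    print('data rate: %.2f MB/sec' % crunch_rate)   # ~0.22 MB/sec

    # The ~0.6 MB/sec of network I/O is roughly 3x that, which could be
    # consistent with requests touching 3 replicas (an assumption).
    print('network/data ratio: %.1f' % (0.6 / crunch_rate))  # ~2.7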


I would recommend removing the SAN from the equation; Cassandra will run
better with local disks. It also introduces a single point of failure
into a distributed system.

Understood about the SPoF, but that's mitigated by good SAN fabric design. I think a single local disk or two is going to find it hard to compete with an FC-attached SAN with GB of dedicated DRAM cache and SSD tiering.
This is all on VMware anyway, so there's no option of local disks.


but it's likely in the Linux disk cache, given the sizing of the
node/data/jvm.
Are you sure that the local Linux machine is going to cache files stored
on the SAN?

Yes. Linux doesn't care (and isn't aware) at the filesystem level whether the volume is 'local' or not; everything goes through the same caching strategy. Again, because this is VMware, it appears as a 'local' disk anyway.
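
If it's useful, a quick way to see that in action (a minimal sketch; the
SSTable path is a placeholder) is to read the same data file twice and
compare timings - the second pass comes out of the page cache whether the
block device behind the filesystem is local, SAN or a VMware vdisk:

    # Read the same file twice; the second pass should be served almost
    # entirely from the Linux page cache. Path below is a placeholder.
    # For a truly cold first pass, drop caches first (as root):
    #   echo 3 > /proc/sys/vm/drop_caches
    import time

    PATH = '/var/lib/cassandra/data/MyKeyspace/MyCF-hd-1-Data.db'

    def read_all(path):
        start = time.perf_counter()
        with open(path, 'rb') as f:
            while f.read(1024 * 1024):
                pass
        return time.perf_counter() - start

    print('first pass:  %.2f s' % read_all(PATH))
    print('second pass: %.2f s' % read_all(PATH))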

In short, disk isn't the limiting factor here.

thanks

James M
