Re: Sorting keys for batch reads to minimize seeks

Artur Kronenberg Fri, 18 Oct 2013 03:04:02 -0700

Hi,

Thanks for your reply. Our latency is currently 23.618 ms. However, I simply read that off one node just now while it wasn't under a load test. I should be able to get a better number after the next test run.


What is a good value for read latency?


On 18/10/13 08:31, Viktor Jevdokimov wrote:

The only thing you might gain is avoiding unnecessary network hops, and only if:
- you request sorted keys (by token) from the appropriate replica with 
ConsistencyLevel.ONE and "dynamic_snitch: false";
- the nodes have the same load;
- the replica is not doing GC, and GC pauses are much longer than internode 
communication.
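
As a rough illustration of "sorted keys (by token) from appropriate replica", the client-side grouping could be sketched as below. This is only a sketch under assumptions: md5_token is a stand-in modelled on RandomPartitioner-style MD5 tokens (a cluster using Murmur3Partitioner would need that partitioner's hash instead), and the ring list and group_keys_by_owner helper are hypothetical names, not part of Cassandra or Astyanax.

```python
import bisect
import hashlib
from collections import defaultdict


def md5_token(key: bytes) -> int:
    """Illustrative token function (assumption): absolute value of the
    MD5 digest read as a signed big-endian integer."""
    return abs(int.from_bytes(hashlib.md5(key).digest(), "big", signed=True))


def group_keys_by_owner(keys, ring):
    """Group keys by the node owning their token, sorted by token.

    ring is a hypothetical list of (token, node) pairs sorted by token;
    a node is assumed to own the range (previous_token, token].
    """
    tokens = [token for token, _ in ring]
    groups = defaultdict(list)
    for key in keys:
        tok = md5_token(key)
        # First ring token >= tok owns the key; wrap around past the end.
        idx = bisect.bisect_left(tokens, tok) % len(ring)
        groups[ring[idx][1]].append((tok, key))
    # Within each node's group, order the keys by token.
    return {node: [k for _, k in sorted(pairs)] for node, pairs in groups.items()}


if __name__ == "__main__":
    ring = [(2**125, "node-a"), (2**126, "node-b"), (2**127, "node-c")]
    keys = [("row-%d" % i).encode() for i in range(10)]
    for node, node_keys in group_keys_by_owner(keys, ring).items():
        print(node, node_keys)
```

Each group could then be sent to its owning node as a separate request; whether that actually saves anything depends on the conditions listed above.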

For a multi-key request, C* will do multiple single-key reads, except for 
range scan requests, where only the starting key and batch size are used in 
the request.

Consider a multi-key request slow by design, and try to model your 
data for low-latency single-key requests.

So, what latencies do you want to achieve?



Best regards / Pagarbiai

Viktor Jevdokimov
Senior Developer

Email: viktor.jevdoki...@adform.com
Phone: +370 5 212 3063
Fax: +370 5 261 0453

J. Jasinskio 16C,
LT-03163 Vilnius,
Lithuania



-----Original Message-----
From: Artur Kronenberg [mailto:artur.kronenb...@openmarket.com]
Sent: Thursday, October 17, 2013 7:40 PM
To: user@cassandra.apache.org
Subject: Sorting keys for batch reads to minimize seeks

Hi,

I am looking to somehow increase read performance on Cassandra. We are still 
playing with configurations, but I was wondering whether there are solutions in 
software that might help us speed up our reads.

E.g. one idea, not sure how sane it is, was to sort read batches by row key 
before submitting them to Cassandra. The idea is that row keys should be closer 
together on the physical disk and therefore this may minimize the number of 
random seeks we have to do when querying, say, 1000 entries from Cassandra. Does 
that make any sense?

Is there anything else that we can do in software to improve performance? Like 
specific batch sizes for reads? We are using the Astyanax library to access 
Cassandra.

Thanks!
