> > I'm confused: don't range queries such as the ones we've been
> > discussing require using an ordered partitioner?
>
> Alright, so distribution depends on your choice of token.

Ah yes, I get it now: with a naive ordered partitioner, the key is associated with the node whose token is numerically closest, and that is where the "master" replica is located. Yes?

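To check my understanding, here is a minimal sketch of how I picture the lookup. I am assuming the owner is the first node whose token is greater than or equal to the key, wrapping around the ring, rather than literally the nearest token. The function name, node names and tokens are made up for illustration, and this is not Cassandra's actual code.

```python
from bisect import bisect_left


def primary_node(ring, key):
    """Return the node that would hold the "master" replica for `key`.

    `ring` is a list of (token, node) pairs. With an order-preserving
    partitioner the key itself serves as its token (assumption).
    """
    ring = sorted(ring)
    tokens = [token for token, _ in ring]
    i = bisect_left(tokens, key)      # index of the first token >= key
    return ring[i % len(ring)][1]     # wrap around past the last token


# Hypothetical three-node ring with string tokens.
ring = [("g", "node1"), ("n", "node2"), ("t", "node3")]

print(primary_node(ring, "apple"))  # node1
print(primary_node(ring, "hello"))  # node2
print(primary_node(ring, "zebra"))  # node1 (wraps around)
```

If that picture is right, a range of consecutive keys lands on one node or on a few neighbouring nodes, which would explain why range queries need ordered tokens.
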
Now let's assume I am using super columns as {X} and columns as {timeFrame}. Over time each row will grow very large, because X can (very sparsely) go up to 2^28.
i) Does Cassandra load all columns every time it reads a row? Same question for a super column.
ii) Similarly, does it cache all columns in memory?

Now some orders of magnitude: let's say a row is about 20KB and the cluster is running smoothly on low-end servers. There are millions of rows per node.
i) If I were to only issue gets on the key, what order of magnitude can I expect to reach: 10/s, 100/s, 1,000/s or 10,000/s?
ii) If I were to issue a slice on just the keys, does Cassandra optimize the gets, or does it run every get on the server and then concatenate the results to send to the client?
iii) Is slicing on the columns going to improve the time to get the data on the server side, or does it just cut down on network traffic?

Thanks
Philippe