I have a 7-node setup with a replication factor of 1 and a read
consistency of 1. I have two column families: Messages, which stores
millions of rows with a UUID as the row key, and DateIndex, which stores
thousands of rows with a String as the row key. I perform 2 look-ups
for my queries:

1) Fetch the row from DateIndex that includes the date I'm looking
for. This returns 1,000 columns whose column names are the UUIDs of
the messages.
2) Do a multi-get (Hector client) using those 1,000 row keys I got
from the first query.
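
To make the access pattern concrete, here is a minimal in-memory sketch of the two look-ups. The two maps only stand in for the column families. The class name, method names, and the date key are hypothetical, and none of this is Hector API code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical in-memory stand-in for the two column families (not Hector code).
public class TwoStepLookup {
    // DateIndex: row key (String) -> column names (message UUIDs)
    static final Map<String, List<UUID>> dateIndex = new LinkedHashMap<>();
    // Messages: row key (UUID) -> message body
    static final Map<UUID, String> messages = new LinkedHashMap<>();

    // Query 1: read a single DateIndex row and return its column names.
    static List<UUID> fetchMessageIds(String dateKey) {
        List<UUID> ids = dateIndex.get(dateKey);
        return ids != null ? ids : Collections.<UUID>emptyList();
    }

    // Query 2: multi-get the Messages rows for the keys found by query 1.
    static Map<UUID, String> multiGet(List<UUID> keys) {
        Map<UUID, String> rows = new LinkedHashMap<>();
        for (UUID key : keys) {
            String body = messages.get(key);
            if (body != null) {
                rows.put(key, body);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Populate 1,000 messages and index them under one made-up date key.
        List<UUID> ids = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            UUID id = UUID.randomUUID();
            ids.add(id);
            messages.put(id, "message " + i);
        }
        dateIndex.put("example-date-key", ids);

        List<UUID> keys = fetchMessageIds("example-date-key");
        Map<UUID, String> rows = multiGet(keys);
        System.out.println(keys.size() + " keys, " + rows.size() + " rows");
    }
}
```

In the real cluster each of the 1,000 keys in step 2 may live on a different node, which is what the in-memory version hides.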

Query 1 is taking ~300ms to fetch 1,000 columns from a single row...
respectable. However, query 2 is taking over 50s to perform 1,000 row
look-ups! Also, when I scale down to 100 row look-ups for query 2, the
time scales down proportionally, to about 5s.
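
As a quick back-of-the-envelope check, dividing the times above by the number of rows gives roughly the same per-row cost in both runs. The sketch below uses the rounded figures from this post (50s, 5s, 300ms). The class and method names are only illustrative.

```java
// Rough per-row cost computed from the approximate timings quoted above.
public class PerRowLatency {
    // Average milliseconds spent per row (or per column) for one query.
    static double perRowMillis(long totalMillis, int rows) {
        return (double) totalMillis / rows;
    }

    public static void main(String[] args) {
        // Query 2 with 1,000 keys: about 50s in total
        System.out.println(perRowMillis(50_000, 1000)); // 50.0
        // Query 2 with 100 keys: about 5s in total
        System.out.println(perRowMillis(5_000, 100));   // 50.0
        // Query 1: about 300ms for 1,000 columns of a single row
        System.out.println(perRowMillis(300, 1000));    // 0.3
    }
}
```

So each row in the multi-get costs about 50ms regardless of batch size, versus about 0.3ms per column when reading the single index row.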

Am I doing something wrong here? It seems like taking 5s to look up
100 rows in a distributed hash table is way too slow.

Thoughts?

Bill-
