I have a 7-node setup with a replication factor of 1 and a read consistency of 1. I have two column families: Messages, which stores millions of rows with a UUID as the row key, and DateIndex, which stores thousands of rows with a String as the row key. I perform 2 look-ups for my queries:
1) Fetch the row from DateIndex that includes the date I'm looking for. This returns 1,000 columns, where the column names are the UUIDs of the messages.
2) Do a multi-get (Hector client) using the 1,000 row keys I got from the first query.

Query 1 takes ~300ms to fetch 1,000 columns from a single row... respectable. However, query 2 takes over 50s to perform 1,000 row look-ups! When I scale query 2 down to 100 row look-ups, the time scales down proportionally, to 5s.

Am I doing something wrong here? Taking 5s to look up 100 rows in a distributed hash table seems way too slow.

Thoughts?

Bill
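
P.S. To make the access pattern concrete, here is a minimal sketch of the two look-ups. It is not my actual code and makes no Hector or Cassandra calls: plain in-memory maps stand in for the DateIndex and Messages column families, and the class name, method names, and the date key are all made up for illustration. The last helper just works out the per-row cost from the timings above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for the two-step lookup. Plain maps replace the
// DateIndex and Messages column families; no Cassandra or Hector calls are made.
final class TwoStepLookup {

    // Query 1: read one DateIndex row; its column names are message UUIDs.
    static List<UUID> fetchIndexRow(Map<String, List<UUID>> dateIndex, String dateKey) {
        List<UUID> columns = dateIndex.get(dateKey);
        return columns == null ? new ArrayList<>() : new ArrayList<>(columns);
    }

    // Query 2: look up every message row by key (the multi-get step).
    static Map<UUID, String> multiGet(Map<UUID, String> messages, List<UUID> rowKeys) {
        Map<UUID, String> rows = new LinkedHashMap<>();
        for (UUID key : rowKeys) {
            String row = messages.get(key);
            if (row != null) {
                rows.put(key, row);
            }
        }
        return rows;
    }

    // Average cost per row, in milliseconds, for a batch of look-ups.
    static double perRowMillis(double totalMillis, int rowCount) {
        return totalMillis / rowCount;
    }

    public static void main(String[] args) {
        Map<UUID, String> messages = new LinkedHashMap<>();
        List<UUID> ids = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            UUID id = UUID.randomUUID();
            ids.add(id);
            messages.put(id, "message-" + i);
        }
        Map<String, List<UUID>> dateIndex = new LinkedHashMap<>();
        dateIndex.put("2011-01-01", ids); // placeholder date key, not real data

        List<UUID> keys = fetchIndexRow(dateIndex, "2011-01-01");
        Map<UUID, String> rows = multiGet(messages, keys);
        System.out.println(rows.size() + " rows fetched");

        // Timings reported above: ~50 s for 1,000 rows, ~5 s for 100 rows.
        System.out.println(perRowMillis(50_000, 1000) + " ms per row");
        System.out.println(perRowMillis(5_000, 100) + " ms per row");
    }
}
```

Both timings work out to roughly 50 ms per row, so the cost of query 2 looks linear in the number of keys rather than dominated by a fixed per-request overhead.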