Hi Robert,

Why do you need the actual text as a key? It sounds a bit unnatural, at least to me. Keep in mind that you cannot do "like" queries on keys in Cassandra. For performance, and to keep things more readable, I would hash the text and use the hash as the key.
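
Roughly what I mean, as a minimal sketch in Python. The table name `documents`, the column names, and the choice of SHA-256 are only assumptions for illustration; any stable digest would do. The full text stays in a regular column, so it is still there for your full table scan.

```python
import hashlib


def text_key(text: str) -> str:
    """Return a fixed-size hex digest of the text, usable as the partition key."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical schema (names are assumptions): the digest is the key,
# the original text is kept as an ordinary column.
CREATE_DOCUMENTS = """
CREATE TABLE IF NOT EXISTS documents (
    text_hash text PRIMARY KEY,
    body text
)
"""

# The key is always 64 hex characters, no matter how long the text is.
key = text_key("some document text")
```

The same text always gives the same key, so writing a document twice just overwrites the same row.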

You should also consider storing the keys (hashes) in a separate table, bucketed per day, hour, or something like that, so you can quickly get all keys for a time range. A query without the partition key may be very slow.
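
A sketch of the bucketing idea, again in Python and again with made-up names (`keys_by_hour`, `bucket`, `text_hash`) and an assumed one-hour bucket size. The bucket string is the partition key of the lookup table, so reading a time range becomes one query per bucket instead of a scan.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical lookup table (names are assumptions): one partition per hour,
# holding the hashes of all texts written in that hour.
CREATE_KEYS_BY_HOUR = """
CREATE TABLE IF NOT EXISTS keys_by_hour (
    bucket text,
    text_hash text,
    PRIMARY KEY (bucket, text_hash)
)
"""


def hour_bucket(ts: datetime) -> str:
    """Map a timezone-aware timestamp to its UTC hour bucket, e.g. '2016-04-11T23'."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H")


def buckets_between(start: datetime, end: datetime) -> list:
    """List every hour bucket that overlaps the range [start, end]."""
    current = start.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    end = end.astimezone(timezone.utc)
    buckets = []
    while current <= end:
        buckets.append(hour_bucket(current))
        current += timedelta(hours=1)
    return buckets
```

To fetch all keys for a range you would then run something like `SELECT text_hash FROM keys_by_hour WHERE bucket = ?` once for each bucket returned by `buckets_between`.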

Jan

On 11.04.2016 at 23:43, Robert Wille wrote:
I have a need to be able to use the text of a document as the primary key in a 
table. These texts are usually less than 1K, but can sometimes be 10’s of K’s 
in size. Would it be better to use a digest of the text as the key? I have a 
background process that will occasionally need to do a full table scan and 
retrieve all of the texts, so using the digest doesn’t eliminate the need to 
store the text. Anyway, is it better to keep primary keys small, or is C* okay 
with large primary keys?

Robert

