Hi Robert,
why do you need the actual text as a key? It sounds a bit unnatural, at
least to me. Keep in mind that you cannot do "like" queries on keys in
Cassandra. For performance and readability, I would prefer hashing your
text and using the hash as the key.
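
To illustrate the hashing idea, here is a minimal sketch in Python. The helper name text_key and the choice of SHA-256 are assumptions for the example, not something your schema requires; any stable digest would do. The full text would still be stored as a regular column next to the key.

```python
import hashlib


def text_key(text: str) -> str:
    """Return a fixed-size hex digest of the text, usable as a primary key.

    text_key is a hypothetical helper; SHA-256 is an assumed choice of digest.
    """
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# The key has the same length whether the document is tiny or tens of KB.
short_key = text_key("a short document")
long_key = text_key("x" * 50_000)
```

The key is always 64 hex characters, so its size no longer depends on the size of the document.
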
You should also consider storing the keys (hashes) in a separate table,
bucketed per day / hour or something like that, so you can quickly get
all keys for a time range. A query without the partition key may be
very slow.
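
One way to read the time bucket suggestion, as a rough sketch: the table name doc_keys_by_day, its columns, and the helper functions below are all made up for illustration, and a per-hour bucket would work the same way with a finer format string. The background scan would then query one partition per bucket instead of scanning without a partition key.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical index table: one partition per day, holding that day's digests.
INDEX_TABLE_CQL = """
CREATE TABLE doc_keys_by_day (
    day text,
    digest text,
    PRIMARY KEY (day, digest)
)
"""


def day_bucket(ts: datetime) -> str:
    """Map a timezone-aware timestamp to its UTC day bucket, e.g. '2016-04-11'."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d")


def buckets_between(start: datetime, end: datetime) -> list:
    """List every day bucket from start to end (inclusive), in order."""
    current = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    buckets = []
    while current <= last:
        buckets.append(current.isoformat())
        current += timedelta(days=1)
    return buckets
```

For a time range you would call buckets_between(start, end) and read each returned day as its own partition.
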
Jan
On 11.04.2016 at 23:43, Robert Wille wrote:
I have a need to be able to use the text of a document as the primary key in a
table. These texts are usually less than 1K, but can sometimes be 10’s of K’s
in size. Would it be better to use a digest of the text as the key? I have a
background process that will occasionally need to do a full table scan and
retrieve all of the texts, so using the digest doesn’t eliminate the need to
store the text. Anyway, is it better to keep primary keys small, or is C* okay
with large primary keys?
Robert