Hi all,

I need to store column values of various data types in a
single column family, i.e. I have column values that are integers,
others that are strings, and maybe more types later. All column names are
strings (so no comparator problem for me).
The thing is, I need to store unstructured data - I do not have fixed,
known-in-advance column names, so I cannot use a static map
for casting the values back to their original types on retrieval from
Cassandra.

My immediate naive thought is to simply prefix every column name with
the type the value needs to be cast back to.
For example, I'll do the following conversion on the columns of some key -
{'attr1': 'val1', 'attr2': 100}  ~> {'str_attr1': 'val1', 'int_attr2': '100'}
and only then send it to Cassandra. This way I know what type to cast
each value back to.
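
To make the idea concrete, here is a minimal sketch of the prefixing in
Python. The helper names (encode_columns, decode_columns) and the two
lookup tables are made up for illustration, it only covers int and str,
and it does not talk to Cassandra at all - it just shows the conversion
I would do on the client before a write and after a read.

```python
# Hypothetical client-side helpers; names and tables are illustrative only.
# Maps a Python type to (name prefix, function that turns the value into text).
_ENCODERS = {int: ("int", str), str: ("str", str)}
# Maps a name prefix back to the function that restores the original type.
_DECODERS = {"int": int, "str": str}


def encode_columns(columns):
    """Prefix each column name with its value's type and stringify the value."""
    encoded = {}
    for name, value in columns.items():
        # Raises KeyError for any type not listed in _ENCODERS.
        prefix, to_text = _ENCODERS[type(value)]
        encoded[f"{prefix}_{name}"] = to_text(value)
    return encoded


def decode_columns(stored):
    """Strip the type prefix and cast each value back to its original type."""
    decoded = {}
    for stored_name, text in stored.items():
        # Split on the first underscore only, so names may contain underscores.
        prefix, name = stored_name.split("_", 1)
        decoded[name] = _DECODERS[prefix](text)
    return decoded
```

With this, decode_columns(encode_columns(row)) gives back the original
row for the supported types, at the cost of one dictionary pass in each
direction.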

But all this casting back and forth on the client side seems to me to
be very bad for performance.
Another option is to split the columns across dedicated column families
with matching validation types - a column family for integer values,
one for strings, one for timestamps, etc.
But that does not seem very efficient either (and worse for any
rollback mechanism), since now I have to perform several get calls on
multiple CFs where before I had only one.
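
A rough sketch of what this second option means on the client side,
again in Python. The column family names (attrs_int, attrs_str) and the
helper names are hypothetical, and the actual Cassandra calls are left
out; the point is only that one logical row turns into one write per CF,
and a read has to merge one result per CF.

```python
from collections import defaultdict

# Hypothetical mapping from Python type to a typed column family name.
CF_BY_TYPE = {int: "attrs_int", str: "attrs_str"}


def split_by_type(columns):
    """Group one logical row into per-column-family rows, keyed by CF name."""
    per_cf = defaultdict(dict)
    for name, value in columns.items():
        # Raises KeyError for any type without a dedicated column family.
        per_cf[CF_BY_TYPE[type(value)]][name] = value
    return dict(per_cf)


def merge_rows(rows_per_cf):
    """Recombine the rows fetched from each column family into one row."""
    merged = {}
    for row in rows_per_cf.values():
        merged.update(row)
    return merged
```

Each key in the result of split_by_type would need its own insert, and
each typed CF its own get before merge_rows can rebuild the row.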

I thought perhaps someone has encountered a similar situation in the
past, and can offer some advice on the best course of action.

Thank you,
Osi
