You could serialize all of your supported data types yourself and prepend an 
identifier for the data type to the serialized bytes. Many serialization 
libraries are available, and a fairly thorough benchmark of them can be found 
here: https://github.com/eishay/jvm-serializers/wiki
This would not avoid the casting, but it should still perform better than 
serializing to strings as is done in your example.
Prepending the id to the values, rather than to the column names, seems better 
to me, because you can be sure that a new insertion into a field overwrites 
the correct column even if the type has changed.
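
To make the idea concrete, here is a minimal sketch of such a tagged encoding 
in Python. The one-byte tags and the names encode_value and decode_value are 
made up for illustration, and a real client would swap in whichever 
serialization library it prefers for the payload.

```python
import struct

# Hypothetical one-byte type tags; any stable mapping would do.
TAG_INT = b"\x01"
TAG_STR = b"\x02"


def encode_value(value):
    """Return the column value as bytes: one tag byte plus the payload."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it to keep tags unambiguous.
        raise TypeError("bool is not supported in this sketch")
    if isinstance(value, int):
        # 8-byte big-endian signed integer.
        return TAG_INT + struct.pack(">q", value)
    if isinstance(value, str):
        return TAG_STR + value.encode("utf-8")
    raise TypeError("unsupported type: %r" % type(value))


def decode_value(raw):
    """Read the tag byte and convert the payload back to its original type."""
    tag, payload = raw[:1], raw[1:]
    if tag == TAG_INT:
        return struct.unpack(">q", payload)[0]
    if tag == TAG_STR:
        return payload.decode("utf-8")
    raise ValueError("unknown type tag: %r" % tag)


# Column names stay untouched; only the values carry the type information.
row = {"attr1": "val1", "attr2": 100}
stored = {name: encode_value(value) for name, value in row.items()}
restored = {name: decode_value(raw) for name, raw in stored.items()}
```

Because the column name is still just 'attr2', writing a string to it later 
replaces the old integer value instead of leaving a stale column under a 
different name next to it.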

-----Original Message-----
From: osishkin osishkin [mailto:osish...@gmail.com] 
Sent: Sunday, July 3, 2011 13:52
To: user@cassandra.apache.org
Subject: Multi-type column values in single CF

Hi all,

I need to store column values that are of various data types in a
single column family, i.e. I have column values that are integers,
others that are strings, and maybe more later. All column names are
strings (no comparator problem for me).
The thing is I need to store unstructured data - I do not have fixed
and known-in-advance column names, so I cannot use a fixed static map
for casting the values back to their original type on retrieval from
Cassandra.

My immediate naive thought is to simply prefix every column name with
the type the value needs to be cast back to.
For example, I'll do the following conversion to the columns of some key -
{'attr1': 'val1','attr2': 100}  ~> {'str_attr1' : 'val1', 'int_attr2' : '100'}
and only then send it to Cassandra. This way I know what I should
cast it back to.

But all this casting back and forth on the client side seems to me to
be very bad for performance.
Another option is to split the columns across dedicated column families
with matching validation types - a column family for integer values,
one for strings, one for timestamps etc.
But that does not seem very efficient either (and worse for any
rollback mechanism), since now I have to perform several get calls on
multiple CFs where once I had only one.

I thought perhaps someone has encountered a similar situation in the
past, and can offer some advice on the best course of action.

Thank you,
Osi

