Hi! I have some data in a table created using thrift. In cassandra-cli, the 'show schema' output for this table is:

create column family Users
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'UTF8Type'
  and key_validation_class = 'LexicalUUIDType'
  and column_metadata = [
    {column_name : 'date_created', validation_class : LongType},
    {column_name : 'active', validation_class : IntegerType, index_name : 'Users_active_idx_1', index_type : 0},
    {column_name : 'email', validation_class : UTF8Type, index_name : 'Users_email_idx_1', index_type : 0},
    {column_name : 'username', validation_class : UTF8Type, index_name : 'Users_username_idx_1', index_type : 0},
    {column_name : 'default_account_id', validation_class : LexicalUUIDType}];

From cqlsh, it looks like this:

[cqlsh 4.1.1 | Cassandra 2.0.11 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh:test> describe table Users;

CREATE TABLE "Users" (
  key 'org.apache.cassandra.db.marshal.LexicalUUIDType',
  column1 ascii,
  active varint,
  date_created bigint,
  default_account_id 'org.apache.cassandra.db.marshal.LexicalUUIDType',
  email text,
  username text,
  value text,
  PRIMARY KEY ((key), column1)
) WITH COMPACT STORAGE;

CREATE INDEX Users_active_idx_12 ON "Users" (active);

CREATE INDEX Users_email_idx_12 ON "Users" (email);

CREATE INDEX Users_username_idx_12 ON "Users" (username);

Now, when I try to extract data from this table using cqlsh or the python-driver, I have no problems getting data for the columns which are actually UTF8, but for those where the validation class in column_metadata has been set to something else, there's trouble.

Example using the python driver:

-- snip --
In [8]: u = uuid.UUID("a6b07340-047c-4d4c-9a02-1b59eabf611c")

In [9]: sess.execute('SELECT column1,value from "Users" where key = %s and column1 = %s', [u, 'username'])
Out[9]: [Row(column1='username', value=u'uc6vf')]

In [10]: sess.execute('SELECT column1,value from "Users" where key = %s and column1 = %s', [u, 'date_created'])
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-10-d06f98a160e1> in <module>()
----> 1 sess.execute('SELECT column1,value from "Users" where key = %s and column1 = %s', [u, 'date_created'])

/home/forsberg/dev/virtualenvs/ospapi/local/lib/python2.7/site-packages/cassandra/cluster.pyc in execute(self, query, parameters, timeout, trace)
   1279         future = self.execute_async(query, parameters, trace)
   1280         try:
-> 1281             result = future.result(timeout)
   1282         finally:
   1283             if trace:

/home/forsberg/dev/virtualenvs/ospapi/local/lib/python2.7/site-packages/cassandra/cluster.pyc in result(self, timeout)
   2742             return PagedResult(self, self._final_result)
   2743         elif self._final_exception:
-> 2744             raise self._final_exception
   2745         else:
   2746             raise OperationTimedOut(errors=self._errors, last_host=self._current_host)

UnicodeDecodeError: 'utf8' codec can't decode byte 0xf3 in position 6: unexpected end of data
-- snap --

cqlsh gives me similar errors.

Can I tell the python driver to parse some column values as integers, or is this an unsupported case?

For sure this is an ugly table, but I have data in it, and I would like to avoid having to rewrite all my tools at once, so if I could support it from CQL that would be great.

Regards,
\EF
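
P.S. For reference, here is a minimal sketch of how the raw bytes behind the non-UTF8 columns could be decoded by hand. It assumes the undecoded bytes of "value" can be obtained somehow, which is exactly what the failing query above does not provide. It relies on LongType being an 8-byte big-endian signed integer and IntegerType being a variable-length big-endian two's-complement integer. The helper names decode_long and decode_varint are made up for illustration and are not part of the python driver.

```python
import struct


def decode_long(raw):
    """Decode an 8-byte big-endian signed integer (LongType / bigint)."""
    if len(raw) != 8:
        raise ValueError("expected 8 bytes, got %d" % len(raw))
    return struct.unpack(">q", raw)[0]


def decode_varint(raw):
    """Decode a big-endian two's-complement integer (IntegerType / varint)."""
    data = bytearray(raw)  # yields ints on both Python 2 and Python 3
    value = 0
    for byte in data:
        value = (value << 8) | byte
    # A set high bit in the first byte means the number is negative.
    if data and data[0] & 0x80:
        value -= 1 << (8 * len(data))
    return value


if __name__ == "__main__":
    # Round-trip a sample value instead of using real table data.
    sample = struct.pack(">q", 1416000000000)
    print(decode_long(sample))
    print(decode_varint(b"\x01"))
```

The 8-byte width would also be consistent with the error above, where the decoder gives up at position 6 of the value.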