I am trying to import a Counters column family (CF) from Cassandra into Hive. Below is the CREATE TABLE statement I use in Hive:

    DROP TABLE IF EXISTS Counters;
    create external table Counters(row_key string, column_name string, value string)
    STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
    WITH SERDEPROPERTIES (
      "cassandra.columns.mapping" = ":key,:column,:value",
      "cassandra.ks.name" = "BAMSchema",
      "cassandra.ks.repfactor" = "1",
      "cassandra.ks.strategy" = "org.apache.cassandra.locator.SimpleStrategy",
      "cassandra.cf.name" = "Counters",
      "cassandra.host" = "",
      "cassandra.port" = "9160",
      "cassandra.partitioner" = "org.apache.cassandra.dht.RandomPartitioner")
    TBLPROPERTIES (
      "cassandra.input.split.size" = "64000",
      "cassandra.range.size" = "1000",
      "cassandra.slice.predicate.size" = "1000");

The Counters CF is defined as:

    create column family Counters
      with comparator = UTF8Type
      and default_validation_class = CounterColumnType
      and replicate_on_write = true;

The row_key and column_name are imported correctly, but I am not able to get the counter value into Hive. Below is the output from Hive:

    hive> select * from Counters;
    OK
    213_debit_1326690000    1-sess_count         d
    213_debit_1326690000    1-total_db_time
    213_debit_1326690000    1-total_exec_time
    213_debit_1326690000    1-txn_count
    213_debit_1326690000    2-sess_count
    213_debit_1326690000    2-total_db_time
    213_debit_1326690000    2-total_exec_time
    213_debit_1326690000    2-txn_count
    Time taken: 0.263 seconds

Below is the output from Cassandra:

    [default@BAMSchema] list Counters;
    Using default limit of 100
    -------------------
    RowKey: 213_debit_1326690000
    => (counter=1-sess_count, value=100)
    => (counter=1-total_db_time, value=20)
    => (counter=1-total_exec_time, value=30)
    => (counter=1-txn_count, value=1)
    => (counter=2-sess_count, value=30)
    => (counter=2-total_db_time, value=30)
    => (counter=2-total_exec_time, value=30)
    => (counter=2-txn_count, value=1)

As you can see, instead of the counter values, Hive shows a junk letter "d" for the first column and nothing for the others when the data is read from Cassandra. What am I missing? Thanks for your help!
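
One observation that may be relevant: 100 is the ASCII code for the letter "d", and the only counter whose value is 100 (1-sess_count) is the one that shows "d" in Hive. The other values (20, 30, 1) are non-printable control codes, which would show up as blanks. The following is only a sketch of that hypothesis, not a description of what the storage handler actually does. It assumes the counter arrives as an 8-byte big-endian long and is then read as a string because the Hive column is declared as `string`. The helper names `counter_as_text` and `text_as_counter` are made up for illustration.

```python
import struct


def counter_as_text(value: int) -> str:
    """Pack a counter as an 8-byte big-endian long (assumed wire format),
    decode those bytes as text, and keep only the printable characters,
    roughly what a terminal would display."""
    raw = struct.pack(">q", value)
    text = raw.decode("utf-8", errors="replace")
    return "".join(ch for ch in text if ch.isprintable())


def text_as_counter(raw: bytes) -> int:
    """Interpret 8 raw bytes as a big-endian signed long again."""
    return struct.unpack(">q", raw)[0]


if __name__ == "__main__":
    # Counter values taken from the Cassandra output in the question.
    for value in (100, 20, 30, 1):
        print(value, repr(counter_as_text(value)))
```

For the four distinct values in the Cassandra listing this prints 'd' for 100 and an empty string for 20, 30, and 1, which matches the Hive output. If the hypothesis holds, the bytes would need to be decoded as a long rather than as a string somewhere between Cassandra and the Hive column.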