Hi,

Has anyone successfully used Hive with Cassandra Community? I am having
problems mapping the keyspace, and I am starting to wonder whether only DSE
supports this.


I am trying to use Hive 0.13 to access Cassandra 2.0.8 column families
created with CQL3.

Here is how I created my column families:

CREATE KEYSPACE IF NOT EXISTS Identification
  WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy',
  'DC1' : 2 };

USE Identification;

CREATE TABLE IF NOT EXISTS entitylookup (
  name varchar,
  value varchar,
  entity_id uuid,
  PRIMARY KEY ((name, value), entity_id))
WITH
    caching=all
;

I followed the instructions from the README of this project:
https://github.com/tuplejump/cash/tree/master/cassandra-handler

I built hive-cassandra-1.2.6.jar and copied it, along with
cassandra-all-1.2.6.jar and cassandra-thrift-1.2.6.jar, to the Hive lib folder.

Then I started Hive and ran the following:

CREATE EXTERNAL TABLE identification.entitylookup(name string, value
string, entity_id binary)
STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler'
WITH SERDEPROPERTIES("cql.primarykey" = "name, value",
"cassandra.host" = "localhost", "cassandra.port "= "9160")
TBLPROPERTIES ("cassandra.ks.name" = "identification",
"cassandra.ks.stratOptions"="'DC1':2",
"cassandra.ks.strategy"="NetworkTopologyStrategy");

Here is the output:

mvalle@mvalle:~/hadoop$ hive
14/05/30 12:02:02 INFO Configuration.deprecation: mapred.reduce.tasks
is deprecated. Instead, use mapreduce.job.reduces
14/05/30 12:02:02 INFO Configuration.deprecation:
mapred.min.split.size is deprecated. Instead, use
mapreduce.input.fileinputformat.split.minsize
14/05/30 12:02:02 INFO Configuration.deprecation:
mapred.reduce.tasks.speculative.execution is deprecated. Instead, use
mapreduce.reduce.speculative
14/05/30 12:02:02 INFO Configuration.deprecation:
mapred.min.split.size.per.node is deprecated. Instead, use
mapreduce.input.fileinputformat.split.minsize.per.node
14/05/30 12:02:02 INFO Configuration.deprecation:
mapred.input.dir.recursive is deprecated. Instead, use
mapreduce.input.fileinputformat.input.dir.recursive
14/05/30 12:02:02 INFO Configuration.deprecation:
mapred.min.split.size.per.rack is deprecated. Instead, use
mapreduce.input.fileinputformat.split.minsize.per.rack
14/05/30 12:02:02 INFO Configuration.deprecation:
mapred.max.split.size is deprecated. Instead, use
mapreduce.input.fileinputformat.split.maxsize
14/05/30 12:02:02 INFO Configuration.deprecation:
mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use
mapreduce.job.committer.setup.cleanup.needed

Logging initialized using configuration in
jar:file:/home/mvalle/hadoop/apache-hive-0.13.0-bin/lib/hive-common-0.13.0.jar!/hive-log4j.properties
OpenJDK 64-Bit Server VM warning: You have loaded library
/home/mvalle/hadoop/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which
might have disabled stack guard. The VM will try to fix the stack
guard now.
It's highly recommended that you fix the library with 'execstack -c
<libfile>', or link it with '-z noexecstack'.
hive> CREATE EXTERNAL TABLE identification.entitylookup(name string,
value string, entity_id binary)
    > STORED BY
'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' WITH
SERDEPROPERTIES("cql.primarykey" = "name, value", "cassandra.host" =
"ident.s1mbi0se.com", "cassandra.port "= "9160")
    > TBLPROPERTIES ("cassandra.ks.name" = "identification",
"cassandra.ks.stratOptions"="'DC1':2",
"cassandra.ks.strategy"="NetworkTopologyStrategy");
FAILED: SemanticException [Error 10072]: Database does not exist: identification

Question: how can I get more information about what is going wrong? I
tried the same Hive command using "Identification" (capital I), but got the
same result. Is it possible to access CQL3 column families in Cassandra
Community? It seems the keyspace has not been mapped, but I don't see how
to map it. In DSE, keyspaces are mapped automatically.


Best regards,
Marcelo.
