Hi,

I have been unable to set up Hive 0.12 with HBase 0.96.1.1. Hive and HBase each work on their own, but when I try to map a Hive table to an HBase table I get:

hive> CREATE TABLE foo(rowkey INT, name STRING)
    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,f:c1')
    > TBLPROPERTIES ('hbase.table.name' = 'foo');
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org/apache/hadoop/hbase/HBaseConfiguration

Could anyone suggest what I need to do to fix this?

I initially ran into the following problem:

Exception in thread "main" java.lang.IllegalArgumentException: Not a host:port pair: PBUF

and found several links (e.g. http://hank-ca.blogspot.com/2014/02/setup-install-hadoop-220-hbase-096-on.html, http://hortonworks.com/community/forums/topic/hive-and-hbase-integration/) suggesting that $HIVE_HOME/lib/hbase-0.94.6.1.jar should be deleted and replaced with HBase's jars. So I've added the following symlinks:

hadoop$ for filn in `find /usr/local/hive/lib -type l`; do ls -al $filn; done
lrwxrwxrwx 1 hadoop hadoop 41 Feb 11 18:03 /usr/local/hive/lib/htrace-core-2.01.jar -> /usr/local/hbase/lib/htrace-core-2.01.jar
lrwxrwxrwx 1 hadoop hadoop 54 Feb 11 18:03 /usr/local/hive/lib/hbase-server-0.96.1.1-hadoop2.jar -> /usr/local/hbase/lib/hbase-server-0.96.1.1-hadoop2.jar
lrwxrwxrwx 1 hadoop hadoop 54 Feb 11 18:03 /usr/local/hive/lib/hbase-client-0.96.1.1-hadoop2.jar -> /usr/local/hbase/lib/hbase-client-0.96.1.1-hadoop2.jar
lrwxrwxrwx 1 hadoop hadoop 56 Feb 11 18:03 /usr/local/hive/lib/hbase-protocol-0.96.1.1-hadoop2.jar -> /usr/local/hbase/lib/hbase-protocol-0.96.1.1-hadoop2.jar
lrwxrwxrwx 1 hadoop hadoop 40 Feb 11 18:04 /usr/local/hive/lib/zookeeper-3.4.5.jar -> /usr/local/hbase/lib/zookeeper-3.4.5.jar

The HBase table in question was created via:

hbase(main):009:0> create 'foo', 'cf'
0 row(s) in 0.4840 seconds

=> Hbase::Table - foo
hbase(main):010:0> put 'foo',2,'cf:name','Test'
0 row(s) in 0.0580 seconds

hbase(main):011:0> scan 'foo'
ROW                COLUMN+CELL
 2                 column=cf:name, timestamp=1392171457664, value=Test
1 row(s) in 0.0180 seconds
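
The DDLTask failure above ends with a bare class path (org/apache/hadoop/hbase/HBaseConfiguration), which usually means that class could not be loaded. One way to narrow this down is to check which jars in a lib directory actually contain it. The following is only a minimal sketch: the function name find_class_in_jars is hypothetical, and it relies on the fact that a jar (zip) file stores its entry names uncompressed, so a fixed-string grep over the raw file can find them.

```shell
# Hypothetical helper: print every jar in a directory that contains a class.
# Usage: find_class_in_jars <lib_dir> <fully.qualified.ClassName>
find_class_in_jars() {
    lib_dir=$1
    # Turn the dotted class name into the entry path used inside a jar.
    entry=$(printf '%s' "$2" | tr '.' '/').class
    for jar in "$lib_dir"/*.jar; do
        # Skip an unmatched glob and dangling symlinks.
        [ -f "$jar" ] || continue
        # Entry names are stored as plain bytes in the zip directory.
        if grep -q -F "$entry" "$jar"; then
            printf '%s\n' "$jar"
        fi
    done
}

# Compare what Hive can see with what HBase ships (paths taken from this post).
find_class_in_jars /usr/local/hive/lib org.apache.hadoop.hbase.HBaseConfiguration
find_class_in_jars /usr/local/hbase/lib org.apache.hadoop.hbase.HBaseConfiguration
```

If the first call prints nothing while the second prints a jar, that jar is a candidate for another symlink into /usr/local/hive/lib. Whether that alone resolves the error is an assumption that would need to be confirmed by re-running the CREATE TABLE statement.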

Version & config information:

hadoop$ hadoop version
Hadoop 2.2.0
Subversion Unknown -r Unknown
Compiled by jdraner on 2014-02-10T20:20Z
Compiled with protoc 2.5.0
From source with checksum 79e53ce7994d1628b240f09af91e1af4
This command was run using /usr/src/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar

hadoop$ hbase version
2014-02-11 18:34:48,932 INFO [main] util.VersionInfo: HBase 0.96.1.1-hadoop2
2014-02-11 18:34:48,933 INFO [main] util.VersionInfo: Subversion file:///home/jon/proj/hbase-svn/hbase-0.96.1.1 -r Unknown
2014-02-11 18:34:48,933 INFO [main] util.VersionInfo: Compiled by jon on Tue Dec 17 12:22:12 PST 2013

hadoop$ jps
17080 NodeManager
19045 HMaster
19370 HRegionServer
17704 NameNode
17981 DataNode
336 Jps
16822 ResourceManager
18313 SecondaryNameNode
18954 HQuorumPeer
32032 Main

hadoop$ cat /usr/local/hbase/conf/hbase-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>hbase.master</name>
    <value>localhost:60000</value>
    <description>The host and port that the HBase master runs at.</description>
  </property>
  <property>
    <name>hbase.regionserver.port</name>
    <value>60020</value>
    <description>The host and port that the HBase master runs at.</description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
    <description>Property from ZooKeeper's config zoo.cfg. The port at which the clients will connect. </description>
  </property>
  <property>
    <name>zookeeper.session.timeout</name>
    <value>1800000</value>
    <description>Session Time out.</description>
  </property>
  <property>
    <name>hbase.client.scanner.caching</name>
    <value>500</value>
  </property>
  <property>
    <name>hbase.regionserver.lease.period</name>
    <value>240000</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
</configuration>

hadoop$ tail -10 /usr/local/hive/conf/hive-env.sh
# Hive Configuration Directory can be controlled by:
# export HIVE_CONF_DIR=
# Folder containing extra ibraries required for hive compilation/execution can be controlled by:
# export HIVE_AUX_JARS_PATH=
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HIVE_HOME=/usr/local/hive
export HIVE_CONF_DIR=$HIVE_HOME/conf

hadoop$ tail -10 /usr/local/hbase/conf/hbase-env.sh
# For example:
# HBASE_ROOT_LOGGER=INFO,DRFA
# The reason for changing default to RFA is to avoid the boundary case of filling out disk space as
# DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context.
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HBASE_REGIONSERVERS=/usr/local/hbase/conf/regionservers
export HBASE_LOG_DIR=/tmp/hbase/logs
export HBASE_PID_DIR=/tmp/hbase/pid
export HBASE_MANAGES_ZK=true

hadoop$ cat /etc/issue.net
Ubuntu 12.04.4 LTS

hadoop$ uname -a
Linux host1 3.8.0-35-generic #52~precise1-Ubuntu SMP Thu Jan 30 17:24:40 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Thanks,
-Josh Draner