[ https://issues.apache.org/jira/browse/HIVE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152297#comment-13152297 ]
Ben West commented on HIVE-2590: -------------------------------- Thanks John. My username is "xodarap" (same as Jira). > HBase bulk load wiki page improvements > -------------------------------------- > > Key: HIVE-2590 > URL: https://issues.apache.org/jira/browse/HIVE-2590 > Project: Hive > Issue Type: Bug > Components: Documentation, HBase Handler > Reporter: Ben West > Assignee: Ben West > Priority: Minor > Labels: wiki > > Some suggestions on the page > https://cwiki.apache.org/confluence/display/Hive/HBaseBulkLoad which seems > kind of out of date: > 1. It seems like it's required that the number of reduce tasks in the "Sort > Data" phase be one more than the number of keys selected in the "Range > Partitioning" step, or else you get an error like this: > Caused by: java.lang.IllegalArgumentException: Can't read partitions file > at > org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:91) > ... 15 more > Caused by: java.io.IOException: Wrong number of partitions in keyset > at > org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:72) > ... 15 more > If so, it would be helpful if this was explicitly pointed out. > 2. It recommends that you should use the "loadtable" ruby script to put data > into hbase, but if you run this on newer versions of HBase (e.g. 0.90.3) it > errors: > DISABLED!!!! Use completebulkload instead. See tail of > http://hbase.apache.org/bulk-loads.html > The instructions should probably be changed to use completebulkload instead > of this script. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira