[ https://issues.apache.org/jira/browse/HIVE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085477#comment-13085477 ]
John Sichi commented on HIVE-2365:
----------------------------------

(Just realized I forgot to link the original doc where "as simple as this" is mentioned.)

https://cwiki.apache.org/confluence/display/Hive/HBaseBulkLoad

This issue pertains to INSERT of large amounts of data into HBase from Hive (not CREATE; I'll follow up separately in HIVE-2373).

The major challenges here are:

* automating the sampling needed for coming up with the range partitioning for the global sort
* extending Hive's INSERT to express the whole thing
* chaining together the sampling job with the actual load job and tying together the relevant bits such as temporary file locations (we've had success doing something similar via reentrant SQL for index load/query statements)
* making the load use the HBase bulk load API, which was added subsequent to the original Hive work

> SQL support for bulk load into HBase
> ------------------------------------
>
>                 Key: HIVE-2365
>                 URL: https://issues.apache.org/jira/browse/HIVE-2365
>             Project: Hive
>          Issue Type: Improvement
>          Components: HBase Handler
>            Reporter: John Sichi
>
> Support the "as simple as this" SQL for bulk load from Hive into HBase.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
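
For readers unfamiliar with the first challenge in the comment (sampling to derive the range partitioning for a global sort), here is a minimal sketch of the idea. It is not Hive or HBase code: the function names choose_split_points and partition_for are hypothetical, and it assumes row keys are plain strings compared lexicographically. It only illustrates how split keys taken from a sorted sample can route every key to an ordered partition.

```python
import bisect


def choose_split_points(sample_keys, num_partitions):
    """Pick num_partitions - 1 split keys from a sample of row keys.

    Hypothetical helper: split keys are taken at evenly spaced positions
    of the sorted sample, so each range should receive a similar share of
    rows if the sample is representative.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    ordered = sorted(sample_keys)
    if not ordered:
        return []
    return [
        ordered[i * len(ordered) // num_partitions]
        for i in range(1, num_partitions)
    ]


def partition_for(key, split_points):
    """Return the index of the range partition that should receive key.

    Keys below the first split go to partition 0; keys at or above the
    last split go to the final partition.
    """
    return bisect.bisect_right(split_points, key)


if __name__ == "__main__":
    sample = ["a", "b", "c", "d", "e", "f", "g", "h"]
    splits = choose_split_points(sample, 4)
    for key in ["a", "c", "h"]:
        print(key, partition_for(key, splits))
```

Because partition indexes increase with key order, sorting within each partition and concatenating the partitions in index order gives a globally sorted result. A skewed or duplicate-heavy sample can yield repeated split keys and therefore empty partitions, which is one reason automating the sampling step is not trivial.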