[ https://issues.apache.org/jira/browse/HIVE-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702769#comment-13702769 ]
Edward Capriolo commented on HIVE-3603:
---------------------------------------

I am +1. Is anyone not +1? The patch seems to enable scanner caching by setting the appropriate properties. Verifying that it works is going to be tricky, since we have no simple way of counting the RPC calls the underlying HBase client will make.

> Enable client-side caching for scans on HBase
> ---------------------------------------------
>
> Key: HIVE-3603
> URL: https://issues.apache.org/jira/browse/HIVE-3603
> Project: Hive
> Issue Type: Improvement
> Components: HBase Handler
> Reporter: Karthik Ranganathan
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3603.D7761.1.patch
>
>
> HBaseHandler sets up a TableInputFormat MR job against HBase to read data in.
> The underlying implementation (in HBaseHandler.java) makes an RPC call per
> row-key, which makes it very inefficient. Need to specify a client side cache
> size on the scan.
> Note that HBase currently only supports num-rows based caching (no way to
> specify a memory limit). Created HBASE-6770 to address this.
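
Since the RPC calls cannot easily be counted directly, a back-of-the-envelope estimate may help show why the cache size on the scan matters. The sketch below is not taken from the patch. It assumes a simplified model in which each fetch round trip returns up to `caching` rows, and it ignores scanner open/close calls and region boundaries. The class and method names (`ScanRpcEstimate`, `estimateFetchRpcs`) are hypothetical.

```java
// Hypothetical helper, not part of Hive or HBase: estimates how many
// fetch round trips a scan needs under a simplified model.
class ScanRpcEstimate {

    // Assumption: every fetch RPC returns up to `caching` rows.
    // Scanner open/close calls and region boundaries are ignored.
    static long estimateFetchRpcs(long rows, int caching) {
        if (rows < 0 || caching <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and caching must be > 0");
        }
        // Ceiling division without risking overflow on rows + caching.
        return rows / caching + (rows % caching == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        long rows = 1_000_000L;
        for (int caching : new int[] {1, 100, 1000}) {
            System.out.println("caching=" + caching + " -> "
                    + estimateFetchRpcs(rows, caching) + " fetch RPCs");
        }
    }
}
```

Under this model, one row per call costs as many round trips as there are rows, while a caching value of 100 cuts that by a factor of 100. In HBase client code the corresponding knob is typically `Scan.setCaching(int)` or the `hbase.client.scanner.caching` configuration property. Which properties the patch itself sets is not shown here.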