[
https://issues.apache.org/jira/browse/HBASE-15223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135991#comment-15135991
]
Jerry He commented on HBASE-15223:
----------------------------------
The main change in the patch is to make convertScanToString and
convertStringToScan in TableMapReduceUtil public so that external users can
call them.
Users don't need to be concerned with the internals of the conversion.
The other part of the patch just uses the Scan JSON in toString()
instead of the encoded string.
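
To illustrate the round trip these two methods give external callers, here is
a minimal stand-alone sketch. It does not use HBase at all: ScanSpec is a
hypothetical stand-in for Scan, java.util.Properties stands in for the Hadoop
Configuration, and Java serialization plus Base64 is only an assumed
placeholder for the real encoding, which callers should treat as opaque. The
property key is the one TableInputFormat.SCAN refers to.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;
import java.util.Properties;

// Stand-alone illustration; none of these types come from HBase.
class ScanPropertySketch {

    // Hypothetical stand-in for org.apache.hadoop.hbase.client.Scan.
    static final class ScanSpec implements Serializable {
        private static final long serialVersionUID = 1L;
        final String startRow;
        final String stopRow;
        final String filter;

        ScanSpec(String startRow, String stopRow, String filter) {
            this.startRow = startRow;
            this.stopRow = stopRow;
            this.filter = filter;
        }
    }

    // Configuration key that carries the encoded scan (TableInputFormat.SCAN).
    static final String SCAN_PROPERTY = "hbase.mapreduce.scan";

    // Placeholder encoding: serialize the object, then Base64 the bytes.
    static String convertScanToString(ScanSpec scan) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(scan);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Inverse of convertScanToString: Base64-decode, then deserialize.
    static ScanSpec convertStringToScan(String encoded)
            throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (ScanSpec) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Create and populate the scan, then store it as a string property.
        ScanSpec scan = new ScanSpec("row-000", "row-100", "prefix:row-0");
        Properties conf = new Properties();
        conf.setProperty(SCAN_PROPERTY, convertScanToString(scan));

        // The consumer side reads the property and rebuilds the scan.
        ScanSpec decoded = convertStringToScan(conf.getProperty(SCAN_PROPERTY));
        System.out.println(decoded.startRow + " .. " + decoded.stopRow
            + " filter=" + decoded.filter);
    }
}
```

With the patch applied, a Spark job would follow the same pattern with the
real classes: call TableMapReduceUtil.convertScanToString(scan), set the
result on the Configuration under TableInputFormat.SCAN, and pass that
Configuration to newAPIHadoopRDD.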
> Make convertScanToString public for Spark
> -----------------------------------------
>
> Key: HBASE-15223
> URL: https://issues.apache.org/jira/browse/HBASE-15223
> Project: HBase
> Issue Type: Improvement
> Reporter: Jerry He
> Assignee: Jerry He
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-15223-master.patch
>
>
> One way to access HBase from Spark is to use newAPIHadoopRDD, which can take
> TableInputFormat as the input format class. But we are not able to pass a
> Scan object to it, for example to set an HBase filter.
> In MR, the public API TableMapReduceUtil.initTableMapperJob() or an equivalent
> is used, which can take a Scan object. But this call cannot be used
> conveniently from Spark.
> We need to make TableMapReduceUtil.convertScanToString() public
> so that a Scan object can be created, populated, and then converted to the
> configuration property that Spark uses. The conversion methods are currently
> package private.