Re: newAPIHadoopRDD Mutiple scan result return from Hbase

2015-04-05 Thread Jeetendra Gangele
I am already using STARTROW and ENDROW in HBase from newAPIHadoopRDD. Can I do something similar with an RDD? Let's say I use a filter on the RDD to get only those records that match the same criteria given by STARTROW and STOPROW. Would that be much faster than querying HBase? On 6 April 2015 at 03:15, Ted Yu wr

Re: newAPIHadoopRDD Mutiple scan result return from Hbase

2015-04-05 Thread Ted Yu
bq. HBase scan operation like scan StartROW and EndROW in RDD? I don't think RDD supports the concept of a start row and an end row. In HBase, please take a look at the following methods of Scan: public Scan setStartRow(byte [] startRow) { public Scan setStopRow(byte [] stopRow) { Cheers On Sun, A
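
For reference, the range semantics behind those two Scan methods can be sketched in plain Java, with no HBase or Spark dependency. HBase compares row keys as unsigned bytes, the start row is inclusive, and the stop row is exclusive. The names below (RowRangeCheck, inRange, utf8) are made up for illustration and are not part of any library. Treating an empty array as "unbounded" is an assumption meant to mirror how Scan handles an empty start or stop row.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not an HBase or Spark class.
final class RowRangeCheck {

    // True when row lies in [start, stop) under unsigned lexicographic byte order.
    // An empty start or stop array means "no bound on that side" (assumption).
    static boolean inRange(byte[] row, byte[] start, byte[] stop) {
        boolean afterStart = start.length == 0 || Arrays.compareUnsigned(row, start) >= 0;
        boolean beforeStop = stop.length == 0 || Arrays.compareUnsigned(row, stop) < 0;
        return afterStart && beforeStop;
    }

    // Convenience for building byte[] keys from readable strings.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] start = utf8("row3");
        byte[] stop = utf8("row7");
        for (String key : new String[] {"row1", "row3", "row5", "row7"}) {
            System.out.println(key + " in range: " + inRange(utf8(key), start, stop));
        }
    }
}
```

A predicate like this is roughly what one would hand to an RDD filter over the row keys. Note that filtering in the RDD would still read the whole table out of HBase first, whereas a start/stop row on the Scan limits what the region servers return.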

Re: newAPIHadoopRDD Mutiple scan result return from Hbase

2015-04-05 Thread Jeetendra Gangele
I have a 2 GB HBase table where the data is stored as key and value (only one column per key), and the keys are unique. What I am thinking is to load the complete HBase table into an RDD and then do operations like scan in the RDD rather than in HBase. Can I do HBase scan operation like scan StartR

Re: newAPIHadoopRDD Mutiple scan result return from Hbase

2015-04-05 Thread Jeetendra Gangele
Sure I will check. On 6 April 2015 at 02:45, Ted Yu wrote: > You do need to apply the patch since 0.96 doesn't have this feature. > > For JavaSparkContext.newAPIHadoopRDD, can you check region server metrics > to see where the overhead might be (compared to creating scan and firing > query using

Re: newAPIHadoopRDD Mutiple scan result return from Hbase

2015-04-05 Thread Ted Yu
You do need to apply the patch since 0.96 doesn't have this feature. For JavaSparkContext.newAPIHadoopRDD, can you check region server metrics to see where the overhead might be (compared to creating scan and firing query using native client) ? Thanks On Sun, Apr 5, 2015 at 2:00 PM, Jeetendra Ga

Re: newAPIHadoopRDD Mutiple scan result return from Hbase

2015-04-05 Thread Jeetendra Gangele
That's true. I checked MultiRowRangeFilter and it serves my need. Do I need to apply the patch for this, since I am using HBase 0.96? Also, I found that JavaSparkContext.newAPIHadoopRDD is slow compared to creating a scan and firing the query directly. Is there any reason? On 6 Apri

Re: newAPIHadoopRDD Mutiple scan result return from Hbase

2015-04-05 Thread Ted Yu
Looks like MultiRowRangeFilter would serve your need. See HBASE-11144. HBase 1.1 would be released in May. You can also backport it to the HBase release you're using. On Sat, Apr 4, 2015 at 8:45 AM, Jeetendra Gangele wrote: > Here is my conf object passing first parameter of API. > but here I
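
To make the idea concrete, here is a minimal plain-Java sketch of what a multi-range row filter does conceptually: keep a row if its key falls inside any one of several [start, stop) ranges. This is not the HBASE-11144 implementation and does not use the HBase API. The names (MultiRangeCheck, Range, inAnyRange) are hypothetical, and the fixed inclusive-start, exclusive-stop convention is a simplification chosen to match Scan's start and stop rows.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the concept, not an HBase class.
final class MultiRangeCheck {

    // One [start, stop) key range; start is inclusive, stop is exclusive.
    static final class Range {
        final byte[] start;
        final byte[] stop;

        Range(String start, String stop) {
            this.start = start.getBytes(StandardCharsets.UTF_8);
            this.stop = stop.getBytes(StandardCharsets.UTF_8);
        }

        boolean contains(byte[] row) {
            // Unsigned lexicographic comparison, the order HBase uses for row keys.
            return Arrays.compareUnsigned(row, start) >= 0
                && Arrays.compareUnsigned(row, stop) < 0;
        }
    }

    // True when the row key falls inside at least one of the ranges.
    static boolean inAnyRange(byte[] row, List<Range> ranges) {
        for (Range r : ranges) {
            if (r.contains(row)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Range> ranges = Arrays.asList(new Range("a", "c"), new Range("m", "p"));
        for (String key : new String[] {"b", "c", "n", "z"}) {
            byte[] row = key.getBytes(StandardCharsets.UTF_8);
            System.out.println(key + " matches: " + inAnyRange(row, ranges));
        }
    }
}
```

The linear loop is fine for a handful of ranges such as the four criteria mentioned in this thread. A server-side filter has the further advantage that rows outside the ranges never leave the region server.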

Re: newAPIHadoopRDD Mutiple scan result return from Hbase

2015-04-04 Thread Jeetendra Gangele
Here is my conf object, passed as the first parameter of the API. But here I want to pass multiple scans, meaning I have 4 criteria for STARTROW and STOPROW on the same table. Using the code below I can get the result for one STARTROW and ENDROW. Configuration conf = DBConfiguration.getConf(); // int scannerTimeout = (i

newAPIHadoopRDD Mutiple scan result return from Hbase

2015-04-04 Thread Jeetendra Gangele
Hi All, Can we get the results of multiple scans from HBase through JavaSparkContext.newAPIHadoopRDD? This method's first parameter takes a configuration object, to which I have added a filter. But how can I query multiple scans from the same table while calling this API only once? Regards, Jeetendra