Thank you for the link, Anil, it was a good explanation indeed.
>It's not recommended to do put/deletes across
>region servers like this.
That was not my intention; I want to keep the region for the aggregates and
the aggregated values on the same server. I read in the link that you gave
me that I
We did the same but on the client side, without any issue
On Monday, August 26, 2013, Olle Mårtensson wrote:
> Hi,
>
> I have developed a coprocessor that extends BaseRegionObserver and
> implements the
> postPut method. The postPut method scans the columns of the row that the
> put was issu
Hi,
I am new to HBase, so a few noob questions.
So I created a table in HBase:
A quick scan gives me the following:
hbase(main):001:0> scan 'test'
ROW                    COLUMN+CELL
 row1                  column=cf:word, timestamp=1377
On Mon, Aug 26, 2013 at 7:27 AM, Olle Mårtensson
wrote:
> Hi,
>
> I have developed a coprocessor that extends BaseRegionObserver and
> implements the
> postPut method. The postPut method scans the columns of the row that the
> put was issued on and calculates an aggregate based on these valu
Hi,
I have developed a coprocessor that extends BaseRegionObserver and
implements the
postPut method. The postPut method scans the columns of the row that the
put was issued on and calculates an aggregate based on these values; when
this is done, a row in another table is updated with the agg
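A minimal sketch of the kind of observer described above, against the
0.94-era coprocessor API. This is not Olle's actual code: the table,
family, and qualifier names, and the assumption that cell values are
8-byte longs, are made up for illustration.

import java.io.IOException;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
import org.apache.hadoop.hbase.util.Bytes;

public class AggregatingObserver extends BaseRegionObserver {
  private static final byte[] FAMILY = Bytes.toBytes("cf");        // placeholder
  private static final byte[] AGG_TABLE = Bytes.toBytes("aggs");   // placeholder
  private static final byte[] AGG_QUAL = Bytes.toBytes("sum");     // placeholder

  @Override
  public void postPut(ObserverContext<RegionCoprocessorEnvironment> ctx,
      Put put, WALEdit edit, boolean writeToWAL) throws IOException {
    byte[] row = put.getRow();
    // Re-read the full row the Put landed on, from this region.
    Get get = new Get(row);
    get.addFamily(FAMILY);
    Result result = ctx.getEnvironment().getRegion().get(get, null); // no row lock
    long sum = 0;
    for (KeyValue kv : result.raw()) {
      sum += Bytes.toLong(kv.getValue());  // assumes 8-byte long values
    }
    // Update the aggregate row in the other table -- this is the
    // cross-table write discussed in this thread.
    HTableInterface aggTable = ctx.getEnvironment().getTable(AGG_TABLE);
    try {
      Put aggPut = new Put(row);
      aggPut.add(FAMILY, AGG_QUAL, Bytes.toBytes(sum));
      aggTable.put(aggPut);
    } finally {
      aggTable.close();
    }
  }
}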
bq. store particular row on a particular region server
Can you let us know your use case? Any single region server may go down,
due to various reasons. How do you plan to maintain row key distribution
after that?
Thanks
On Mon, Aug 26, 2013 at 3:52 AM, Vamshi Krishna wrote:
> Hi all,
>
Hi all,
The problem got solved by changing the value of the property below from a
local directory path to the hdfs:// path AND running Hadoop before I start
running my HBase.
hbase.rootdir
/home/biginfolabs/BILSftwrs/hbase-0.94.10/hbstmp/
Now, I see the data gets distributed acros
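For anyone hitting the same thing, the change amounts to something like
this in hbase-site.xml; the namenode host and port here are placeholders
for your own:

<property>
  <name>hbase.rootdir</name>
  <value>hdfs://namenode-host:9000/hbase</value>
</property>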
Hi all,
Is there any facility in HBase such that, in a task of storing 1000
rows on a cluster of 10 machines, with a specification like: the Nth row
should be stored on the (N % 1000)th region server?
In essence, how do we store a particular row on a particular region server?
(Can we specify which ro
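As far as I know, you can control which *region* a row lands in by
pre-splitting the table on chosen keys, but which *server* hosts a region
is decided by the master/balancer, so there is no supported way to pin a
row to a machine. A rough sketch of pre-splitting, with the table name,
family, and split points as placeholders:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitExample {
  public static void main(String[] args) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
    HTableDescriptor desc = new HTableDescriptor("mytable");  // placeholder
    desc.addFamily(new HColumnDescriptor("cf"));
    // 3 split points -> 4 regions; a row's key decides its region,
    // but the balancer decides which server hosts each region.
    byte[][] splits = {
        Bytes.toBytes("250"), Bytes.toBytes("500"), Bytes.toBytes("750") };
    admin.createTable(desc, splits);
    admin.close();
  }
}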
Awesome.. Thanks :) Now my map and reduce tasks are super fast.. Although,
the table I'll eventually be using has a region split of 25.. 4 on 5
machines and 5 on the master region node.. I don't know if that's enough
though..
But I'll look into this..
On Mon, Aug 26, 2013 at 2:55 PM, Ashwanth Kum
A 'table split' is a region split, and as you split regions and balance
them, you should see some parallelism in your M/R jobs.
Of course depending on your choice of row keys... YMMV.
HTH
-Mike
On Aug 26, 2013, at 2:16 AM, Pavan Sudheendra wrote:
> Hi all,
>
> How to make use of a Table
Ted, I guessed the problem could be due to having only a single ZooKeeper
server in hbase.zookeeper.quorum. So I have added the region server machine
as well, apart from the master. Now I don't see any such FAIL cases as
mentioned below (which was the case earlier).
Handling transition=RS_ZK_REGION_
FAILE
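For reference, listing more than one host in the quorum looks roughly like
this in hbase-site.xml; the hostnames are placeholders:

<property>
  <name>hbase.zookeeper.quorum</name>
  <value>master-host,regionserver-host</value>
</property>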
Just click on "Split"; that should be fine. It will pick a key in the
middle of each region and split it. Splits go like 1 -> 2 -> 4 -> 8
regions and so on. The # of regions for a table is something you should be
able to come up with given the # of region servers and the size of data that you are
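If you would rather not click through the UI, the same thing can be done
programmatically; a rough sketch against the 0.94 client API, with the
table name a placeholder:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class SplitExample {
  public static void main(String[] args) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
    // With no explicit split point, HBase picks the midpoint key of
    // each region, just like the "Split" button.
    admin.split("mytable");
    admin.close();
  }
}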
Furthermore, what can we do if a table has 25 online regions? Can we
safely set caching to a bigger number? Is a split necessary as well?
On Mon, Aug 26, 2013 at 2:42 PM, Pavan Sudheendra wrote:
> Hi Ashwanth, thanks for the reply..
>
> I went to the HBase Web UI and saw that my table had 1 Onl
Hi Ashwanth, thanks for the reply..
I went to the HBase Web UI and saw that my table had 1 online region. Can
you please guide me as to how to do the split on this table? I see the UI
asking for a region key and a split button... How many splits can I make
exactly? Can I give two different 'keys
Thanks Ted, it is explicitly mentioned in the limitations section but I seem
to have missed it... oh well.
It is an awesome filter.. great work by Alex, you and the team. Thanks to you
all.
Regards,
- kiru
Kiru Pakkirisamy | webcloudtech.wordpress.com
setCaching sets the value via the API; the other way is to set it in the
job configuration using the key "hbase.client.scanner.caching".
I just realized that, given you have just 1 region, caching wouldn't help
much in reducing the time. Splitting might be an ideal solution. Based on
your heap space
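Concretely, the two ways look something like this; the 1500 is just the
value quoted earlier in the thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;

public class CachingExample {
  public static void main(String[] args) {
    // Per-scan, via the API:
    Scan scan = new Scan();
    scan.setCaching(1500);
    scan.setCacheBlocks(false);  // don't churn the block cache in M/R

    // Job-wide, via the configuration key:
    Configuration conf = HBaseConfiguration.create();
    conf.setInt("hbase.client.scanner.caching", 1500);
  }
}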
Hi Ashwanth,
My caching is set to 1500..
scan.setCaching(1500);
scan.setCacheBlocks(false);
Can I set the number of splits via an API?
On Mon, Aug 26, 2013 at 2:22 PM, Ashwanth Kumar <
ashwanthku...@googlemail.com> wrote:
> To answer your question - Go to HBase Web UI and you can initiate a m
To answer your question: go to the HBase Web UI and you can initiate a manual
split on the table.
But before you do that, maybe you can try increasing your client caching
value (hbase.client.scanner.caching) in your job.
On Mon, Aug 26, 2013 at 2:09 PM, Pavan Sudheendra wrote:
> What is the inpu
Two nodes is insufficient. Default DFS replication is 3. That would be the
bare minimum just for kicking the tires IMO but is still a degenerate case.
In my opinion 5 is the lowest you should go. You shouldn't draw conclusions
from inadequate deploys.
On Friday, August 23, 2013, Vamshi Krishna wro
What is the input split of the HBase table in this job status?
map() completion: 0.0
reduce() completion: 0.0
Counters: 24
  File System Counters
    FILE: Number of bytes read=0
    FILE: Number of bytes written=216030
    FILE: Number of read operations=
That is right.
See
http://blog.sematext.com/2012/08/09/consider-using-fuzzyrowfilter-when-in-need-for-secondary-indexes-in-hbase/
On Aug 25, 2013, at 10:56 PM, Kiru Pakkirisamy
wrote:
> I am using FuzzyRowFilter with my coprocessors as it seems to give the best
> performance (even though I
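A minimal sketch of FuzzyRowFilter along the lines of the linked post; the
8-byte userId+actionId key layout here is assumed purely for illustration:

import java.util.Arrays;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FuzzyRowFilter;
import org.apache.hadoop.hbase.util.Pair;

public class FuzzyExample {
  public static Scan fuzzyScan() {
    // Key = 4-byte userId + 4-byte actionId; match any userId with
    // actionId = 1. In the mask, 0 = byte must match, 1 = byte is fuzzy.
    byte[] key  = {0, 0, 0, 0, 0, 0, 0, 1};
    byte[] mask = {1, 1, 1, 1, 0, 0, 0, 0};
    Scan scan = new Scan();
    scan.setFilter(new FuzzyRowFilter(
        Arrays.asList(new Pair<byte[], byte[]>(key, mask))));
    return scan;
  }
}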
Hi all,
How do I make use of a TableSplit or a Region Split? How is it used in
TableInputFormatBase#getSplits()?
I have 6 region servers across the cluster for the map-reduce task which I
am using. How do I leverage this so that the table is split across the
cluster and the map-reduce application
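Roughly: you don't call getSplits() yourself; TableInputFormat does, and it
produces about one split (one map task) per region, which is why a
single-region table gives you a single mapper. A sketch of the usual
wiring, with the table name and mapper as placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;

public class JobSetup {
  // Hypothetical mapper; one instance runs per region/split.
  static class MyMapper extends TableMapper<ImmutableBytesWritable, Result> {}

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "scan-test");
    job.setJarByClass(JobSetup.class);
    Scan scan = new Scan();
    scan.setCaching(500);
    scan.setCacheBlocks(false);
    TableMapReduceUtil.initTableMapperJob(
        "test", scan, MyMapper.class,      // "test" table is a placeholder
        ImmutableBytesWritable.class, Result.class, job);
    job.waitForCompletion(true);
  }
}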