Re: ACLs/Quotas for HBase structures

2014-12-19 Thread Ted Yu
Manoj: HBASE-8410 is under active development. If you have time, please go over the feature to see if it fits your need. Cheers On Dec 19, 2014, at 11:42 PM, Esteban Gutierrez wrote: > Hello Manoj, > > That's a very interesting requirement, unfortunately the existing HBase > directory struct

Re: ACLs/Quotas for HBase structures

2014-12-19 Thread Esteban Gutierrez
Hello Manoj, That's a very interesting requirement. Unfortunately, the existing HBase directory structure needs to be owned by the user that started HBase (usually the 'hbase' user), and HBase will handle all the permissions and ACL rules without exposing details from HDFS to the client API. Even if

ACLs/Quotas for HBase structures

2014-12-19 Thread Manoj Murumkar
Folks, We are trying to control space usage and manage security at the HBase namespace level. Think of it in terms of an RDBMS (database and superuser for a database). Is there a simple way to do this? This is what I have in mind. Does it make sense? - Space quotas: Namespace is managed under /ap

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Nick Dimiduk
Could be in an API-compatible way, though semantics would change, which is probably worse. Table keeps these methods. When setAutoFlush is used, a write buffer managed by the connection is created. If multiple Table instances for the same table call setWriteBufferSize(), perhaps the largest value wins. Writes

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Stack
On Fri, Dec 19, 2014 at 12:20 PM, Solomon Duskis wrote: > My first thought based on this discussion was that it would require moving > some methods (setAutoFlush() and setWriteBufferSize()) from Table to > Connection. That would be a breaking API change. > > This will mean a bunch of Table state

RE: HBase - bulk loading files

2014-12-19 Thread Rama Ramani
0.98.0.2.1.9.0-2196-hadoop2 Hadoop 2.4.0.2.1.9.0-2196 Subversion g...@github.com:hortonworks/hadoop-monarch.git -r cb50542bc92fb77dee52 No, the clusters were not taking additional load. Thanks, Rama > Date: Fri, 19 Dec 2014 13:50:30 -0800 > Subject: Re: HBase - bulk loading files > From: yuzhih...@gma

Re: HBase - bulk loading files

2014-12-19 Thread Ted Yu
Can you let us know the HBase and Hadoop versions you're using? Were the clusters taking load from other sources when ImportTsv was running? Cheers On Fri, Dec 19, 2014 at 1:43 PM, Rama Ramani wrote: > Hello, I am bulk loading a set of files (about 400MB each) with > "|" as the delim

HBase - bulk loading files

2014-12-19 Thread Rama Ramani
Hello, I am bulk loading a set of files (about 400MB each) with "|" as the delimiter using ImportTsv. It takes a long time for the 'map' job to complete on both a 4 node and a 16 node cluster. I tried the option to generate the output (providing -Dimporttsv.bulk.output) which took time i

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Solomon Duskis
My first thought based on this discussion was that it would require moving some methods (setAutoFlush() and setWriteBufferSize()) from Table to Connection. That would be a breaking API change. -Solomon On Fri, Dec 19, 2014 at 3:04 PM, Andrew Purtell wrote: > > I think it would be critical if we

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Andrew Purtell
I think it would be critical if we're contemplating something that requires a breaking API change. Do we have that here? I'm not sure. On Fri, Dec 19, 2014 at 12:02 PM, Solomon Duskis wrote: > > Is this critical to sort out before 1.0, or is fixing this a post-1.0 > enhancement? > > -Solomon > >

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Solomon Duskis
Is this critical to sort out before 1.0, or is fixing this a post-1.0 enhancement? -Solomon On Fri, Dec 19, 2014 at 2:19 PM, Andrew Purtell wrote: > > I don't like the dropped writes either. Just pointing out what we have now. > There is a gap no doubt. > > On Fri, Dec 19, 2014 at 11:16 AM, Nick

Re: Region Server Thread with a Single High Idle CPU

2014-12-19 Thread Esteban Gutierrez
Hi Jon, Do you see something interesting in the RS logs from KVM15 or the HBase Master? One possibility is that if there are no requests to META coming from the Thrift server or external clients, then it might be possible that one or many region servers for some reason are updating META too freque

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Andrew Purtell
I don't like the dropped writes either. Just pointing out what we have now. There is a gap, no doubt. On Fri, Dec 19, 2014 at 11:16 AM, Nick Dimiduk wrote: > > Thanks for the reminder about the Multiplexer, Andrew. It sort-of solves > this problem, but I think its semantics of dropping writes are n

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Nick Dimiduk
Thanks for the reminder about the Multiplexer, Andrew. It sort-of solves this problem, but I think its semantics of dropping writes are not desirable in the general case. Further, my understanding was that the new connection implementation is designed to handle this kind of use-case (hence cc'ing La

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Andrew Purtell
Aaron: Please post a copy of that feedback on the JIRA, pretty sure we will be having an improvement discussion there. On Fri, Dec 19, 2014 at 10:58 AM, Aaron Beppu wrote: > > Nick : Thanks, I've created an issue [1]. > > Pradeep : Yes, I have considered using that. However for the moment, we've

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Aaron Beppu
Nick: Thanks, I've created an issue [1]. Pradeep: Yes, I have considered using that. However, for the moment, we've set it out of scope, since our migration from 0.94 -> 0.98 is already a bit complicated, and we hoped to isolate these changes by not moving to the async client until after

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Andrew Purtell
I believe HTableMultiplexer[1] is meant to stand in for HTablePool for buffered writing. FWIW, I've not used it. 1: https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTableMultiplexer.html On Fri, Dec 19, 2014 at 9:00 AM, Nick Dimiduk wrote: > > Hi Aaron, > > Your analysis is spot

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Pradeep Gollakota
Hi Aaron, Just out of curiosity, have you considered using asynchbase? https://github.com/OpenTSDB/asynchbase On Fri, Dec 19, 2014 at 9:00 AM, Nick Dimiduk wrote: > Hi Aaron, > > Your analysis is spot on and I do not believe this is by design. I see the > write buffer is owned by the table, wh

Re: Region Server Thread with a Single High Idle CPU

2014-12-19 Thread uamadman
Yes, I tested the following by restarting the cluster and waiting approximately 5-10 minutes for its initial ramp up. There are no clients asking for data. In the following example KVM15 was randomly assigned to serve the META Table. root@KVM15:~# lsof -n | grep :60020- | sed 's/.*->//;s/:.*//' |

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Nick Dimiduk
Hi Aaron, Your analysis is spot on and I do not believe this is by design. I see the write buffer is owned by the table, while I would have expected there to be a buffer per table all managed by the connection. I suggest you raise a blocker ticket vs the 1.0.0 release that's just around the corner

Re: Spark-HBase connector

2014-12-19 Thread Mukesh Jha
Thanks Stack, looks promising, will give it a try. On Fri, Dec 19, 2014 at 3:28 AM, Stack wrote: > > On Tue, Dec 16, 2014 at 10:52 AM, Stack wrote: > > > > On Sun, Dec 14, 2014 at 10:49 PM, Mukesh Jha > > wrote: > >> > >> Hello Experts, > >> > >> I've come across multiple posts where users want