access permission denied on table creation

2012-06-21 Thread Tony Dean
Hi, I was finally able to get a secure cluster running and a client connecting to it successfully using Hadoop shell commands. Permissions worked as expected. Then I moved to the HBase shell in order to try to create a table and test its permissions based on different user access. Well, I can't

Re: When node is down

2012-06-21 Thread Michael Segel
Assuming that you have an Apache release (Apache, HW, Cloudera) ... (If MapR, replace the drive and you should be able to repair the cluster from the console. Node doesn't go down. ) Node goes down. 10 min later, cluster sees node down. Should then be able to replicate the missing blocks. Rep

When node is down

2012-06-21 Thread David Charle
What is the best practice to remove a node and add the same node back for hbase/hadoop? Currently, in our 10-node cluster, 2 nodes went down (bad disk, so the node is down as it's the root volume+data); we need to replace the disks and add them back. Any quick suggestions or pointers to doc for the right p

Re: HFile Performance

2012-06-21 Thread Elliott Clark
HFilePerformanceEvaluation is in the source tree hbase-server/src/test. I haven't played with it myself but it might help you. On Thu, Jun 21, 2012 at 3:13 PM, Jerry Lam wrote: > Hi HBase guru, > > I would like to benchmark HFile performance without other components in > HBase. I know that I ca

HFile Performance

2012-06-21 Thread Jerry Lam
Hi HBase guru, I would like to benchmark HFile performance without other components in HBase. I know that I can use HFile as any other file format in Hadoop IO. I wonder if there is a HFile benchmark available so I don't end up reinventing the wheel. Best Regards, Jerry

Re: TTL performance

2012-06-21 Thread Andrew Purtell
> 2012/6/21, Frédéric Fondement : > opt3. looks the nicest (only 3-4 tables to scan when reading), but won't my > daily major compact become crazy ? If you want more control over the major compaction process, for example to lessen the load on your production cluster to a constant background level
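A rough sketch of the kind of manual major-compaction control Purtell is alluding to, against the 0.92/0.94-era Java client; the "events" table name and the cron-style scheduling are assumptions, and the hbase.hregion.majorcompaction setting would normally live in hbase-site.xml rather than in code:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class ManualMajorCompact {
    public static void main(String[] args) throws Exception {
        // Server side, in hbase-site.xml: setting hbase.hregion.majorcompaction
        // to 0 disables the periodic major compactions so they can instead be
        // scheduled off-peak (e.g. from a nightly cron job running this class).
        Configuration conf = HBaseConfiguration.create();

        HBaseAdmin admin = new HBaseAdmin(conf);
        // Ask for a major compaction of one table; the call returns quickly
        // and the compaction proceeds asynchronously on the region servers.
        admin.majorCompact("events");   // "events" is a hypothetical table name
        admin.close();
    }
}
```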

Re: TTL performance

2012-06-21 Thread Jean-Marc Spaggiari
Hi Frédéric, Have you looked at http://hbase.apache.org/book/versions.html ? What you want to do, if I understand correctly, is already part of the HBase features... This: http://outerthought.org/blog/417-ot.html can be interesting too. JM 2012/6/21, Frédéric Fondement : > Hi all ! > > Before I

Re: Filesystem

2012-06-21 Thread Jean-Marc Spaggiari
Hi Frédéric, Ext4 journaling is there to recover files after a crash. If you are using Hadoop, this feature is already part of the application. So I will say yes, you can disable ext4 journaling for the datanodes. BUT NOT for the master... Also, in the book, for datanodes, if the drive you are usin

RE: RS unresponsive after series of deletes

2012-06-21 Thread Ted Tuttle
>Ted T: > Can you log a JIRA summarizing the issue ? https://issues.apache.org/jira/browse/HBASE-6254

Re: RS unresponsive after series of deletes

2012-06-21 Thread Ted Yu
Doug: Can you enhance the related part in the book w.r.t. usage of Delete.deleteColumn(family, qual) ? Basically we should warn users of potentially long processing time if there are many columns involved. Thanks On Thu, Jun 21, 2012 at 11:00 AM, Ted Tuttle wrote: > Working on the JIRA ticket now, btw.

RE: RS unresponsive after series of deletes

2012-06-21 Thread Ted Tuttle
Working on the JIRA ticket now, btw. > Ted: > Can you share what ts value was passed to Delete.deleteColumn(family, qual, ts) ? > Potentially, an insertion for the same (family, qual) immediately following the delete call may be masked by the above. We scan for KeyValues matching rows and c

Re: RS unresponsive after series of deletes

2012-06-21 Thread Ted Yu
Ted: Can you share what ts value was passed to Delete.deleteColumn(family, qual, ts) ? Potentially, an insertion for the same (family, qual) immediately following the delete call may be masked by the above. Cheers On Thu, Jun 21, 2012 at 7:02 AM, Ted Tuttle wrote: > Good hint, Ted > > By calling

Re: ConnectionLoss

2012-06-21 Thread Guilherme Vanz
Hi Yes, I added this property. I'll follow your advice. I'll configure HBase in pseudo-distributed mode and try again. If I have any trouble I'll shout for help... xD Thanks for your attention 2012/6/21 Mohammad Tariq > Hi Guilherme, > > First of all I would like to suggest you to use Hbase a

Single disk failure on single node causes 1 data node, 3 region servers to go down

2012-06-21 Thread Peter Naudus
Hello All, As this problem has both a Hadoop and HBase component, rather than posting the same message to both groups, I'm posting the datanode portion of this problem under the title of "Single disk failure (with HDFS-457 applied) causes data node to die" to the Hadoop users group. In our production

Filesystem

2012-06-21 Thread Frédéric Fondement
Hi ! I'm using ext4 as wisely advised here some (long ?) time ago and indeed we gained about 30% perf compared to ext3. Does someone know whether it is safe to disable ext4 journaling ? All the best, Frédéric.

Re: Addition to Apache HBase Reference Guide

2012-06-21 Thread Mohammad Tariq
Hi Ian, I'll go through the link and get myself familiar with the process and open a JIRA for the edits. And once the docs are ready I'll post the patch. Thanks for your support. Regards, Mohammad Tariq On Thu, Jun 21, 2012 at 8:40 PM, Ian Varley wrote: > Mohammad, > > Absolutely - that'

Re: Addition to Apache HBase Reference Guide

2012-06-21 Thread Ian Varley
Mohammad, Absolutely - that's exactly how open source projects grow! :) You can open up a JIRA for your suggested edits, and then either post your edits directly in the JIRA ticket, or, ideally, make the docs change yourself and post a patch (the reference guide is also part of the HBase source

Re: Timestamp as a key good practice?

2012-06-21 Thread Michael Segel
If you have a really small cluster... You can put your HMaster, JobTracker, Name Node, and ZooKeeper all on a single node. (Secondary too) Then you have Data Nodes that run DN, TT, and RS. That would solve any ZK RS problems. On Jun 21, 2012, at 6:43 AM, Jean-Marc Spaggiari wrote: > Hi Mike, H

Re: RS unresponsive after series of deletes

2012-06-21 Thread Ted Yu
Ted T: Can you log a JIRA summarizing the issue ? I feel HBase should provide better handling for cell deletion of very wide rows intrinsically - without the user tweaking timestamps. On Thu, Jun 21, 2012 at 7:02 AM, Ted Tuttle wrote: > Good hint, Ted > > By calling Delete.deleteColumn(family, qual,

TTL performance

2012-06-21 Thread Frédéric Fondement
Hi all ! Before I start, I'd like to have some feedback about TTL performance in HBase. My use case is the following. I constantly have data coming into the base (i.e. a write-intensive application). This data should be kept during a certain amount of time, either 3, 6, 12... months, dependi
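A minimal sketch of the per-family TTL approach being discussed, assuming the 0.92/0.94-era admin API; the table and family names ("measurements", "m3", "m6") are made up for illustration, with one family per retention period so HBase expires old cells on compaction:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateTtlTable {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        // Hypothetical table with one family per retention period, so data
        // written to "m3" expires after roughly 3 months, "m6" after 6, etc.
        HTableDescriptor table = new HTableDescriptor("measurements");

        HColumnDescriptor threeMonths = new HColumnDescriptor("m3");
        threeMonths.setTimeToLive(90 * 24 * 3600);   // TTL is given in seconds
        table.addFamily(threeMonths);

        HColumnDescriptor sixMonths = new HColumnDescriptor("m6");
        sixMonths.setTimeToLive(180 * 24 * 3600);
        table.addFamily(sixMonths);

        admin.createTable(table);
        admin.close();
    }
}
```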

RE: RS unresponsive after series of deletes

2012-06-21 Thread Ted Tuttle
Good hint, Ted By calling Delete.deleteColumn(family, qual, ts) instead of deleteColumn w/o timestamp, the time to delete row keys is reduced by 95%. I am going to experiment w/ limited batches of Deletes, too. Thanks everyone for help on this one. -Original Message- From: Ted Yu [mail
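For reference, the difference between the slow and fast delete paths described in this thread looks roughly like this (a sketch against the 0.92/0.94 client API; the table, row, and column names are hypothetical):

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class DeleteWithTimestamp {
    public static void main(String[] args) throws Exception {
        HTable table = new HTable(HBaseConfiguration.create(), "mytable"); // hypothetical table

        byte[] row = Bytes.toBytes("row-1");
        byte[] family = Bytes.toBytes("f");
        byte[] qual = Bytes.toBytes("q");
        long ts = 1340000000000L; // timestamp the application originally wrote

        Delete d = new Delete(row);
        // deleteColumn(family, qual) without a timestamp makes the server look
        // up the latest version of each column before placing the delete
        // marker, which is what made wide-row deletes so slow; passing the
        // known ts avoids that per-column lookup.
        d.deleteColumn(family, qual, ts);
        // Alternatively, deleteColumns(family, qual) drops all versions of the
        // column with a single marker and no lookup:
        // d.deleteColumns(family, qual);

        table.delete(d);
        table.close();
    }
}
```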

Addition to Apache HBase Reference Guide

2012-06-21 Thread Mohammad Tariq
Hello list, Many times I see newbies like me getting stuck while trying to configure HBase in pseudo-distributed mode using the Apache HBase Reference Guide. When I faced these problems I tried a few things which worked for me and a few of my friends. Is it possible to add these things i

Re: ConnectionLoss

2012-06-21 Thread Mohammad Tariq
Hi Guilherme, First of all I would suggest you use HBase at least in pseudo-distributed mode if you want to learn things properly, as in standalone mode it doesn't use HDFS. But this is just advice. Now coming back to your problem, have you added the "hbase.rootdir" property
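A rough sketch of what the client side looks like once the properties Tariq mentions are in place; the rootdir and quorum values below are placeholders for a pseudo-distributed setup, and in practice they belong in hbase-site.xml on the client's classpath rather than in code:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CheckConnection {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Placeholder values; normally picked up from hbase-site.xml.
        conf.set("hbase.rootdir", "hdfs://localhost:9000/hbase");
        conf.set("hbase.zookeeper.quorum", "localhost");

        // Throws (e.g. MasterNotRunningException or a ZooKeeper connection
        // error) if the client cannot reach the cluster.
        HBaseAdmin.checkHBaseAvailable(conf);
        System.out.println("HBase is reachable");
    }
}
```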

ConnectionLoss

2012-06-21 Thread Guilherme Vanz
Hello! I have started studying HBase and I'm having a problem when trying to execute some commands in the HBase shell. I configured my HBase in standalone mode. I started HBase without trouble; no exceptions were thrown. But when I tried to execute a command to create a table, for example, the follo

Re: Data locality in HBase

2012-06-21 Thread Michael Segel
While data locality is nice, you may see it becoming less of a bonus or issue. With Co-processors available, indexing becomes viable. So you may see things where within the M/R you process a row from table A, maybe hit an index to find a value in table B and then do some processing. There's

Re: performance of Get from MR Job

2012-06-21 Thread Michael Segel
I think the version issue is the killer factor here. Usually performing a simple get() where you are getting the latest version of the data on the row/cell occurs in some constant time k. This is constant regardless of the size of the cluster and should scale in a near linear curve. As JD C
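The version point can be illustrated with a small sketch (hypothetical table and row names, 0.92/0.94-era client API): a default Get returns only the newest version of each cell, while asking for many versions makes the server do extra work per row:

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class GetVersionsExample {
    public static void main(String[] args) throws Exception {
        HTable table = new HTable(HBaseConfiguration.create(), "mytable"); // hypothetical

        // Default Get: only the newest version of each cell, the cheap,
        // near-constant-time case described above.
        Get latest = new Get(Bytes.toBytes("row-1"));
        Result r1 = table.get(latest);

        // Requesting many versions makes the server walk further through the
        // store files for the row, so it no longer behaves like that case.
        Get history = new Get(Bytes.toBytes("row-1"));
        history.setMaxVersions(10);
        Result r2 = table.get(history);

        System.out.println(r1.size() + " vs " + r2.size());
        table.close();
    }
}
```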

Re: Timestamp as a key good practice?

2012-06-21 Thread Jean-Marc Spaggiari
Hi Mike, Hi Rob, Thanks for your replies and advice. Seems that now I'm due for some implementation. I'm reading Lars' book first and when I'm done I will start with the coding. I already have my Zookeeper/Hadoop/HBase running and based on the first pages I read, I already know it's not we

Re: Blocking Inserts

2012-06-21 Thread Martin Alig
Thank you for the suggestions. So I changed the setup and now have: 1 master running the NameNode, SecondaryNameNode, ZK and the HMaster; 7 slaves running DataNode and RegionServer; 2 clients to insert data. What I forgot in my first post is that sometimes the clients even get a SocketTimeOutException wh
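For what it's worth, the region-server settings most often involved when puts block are hbase.hstore.blockingStoreFiles and hbase.hregion.memstore.block.multiplier. The sketch below only illustrates the property names and their assumed 0.94-era defaults, not a recommended tuning; in practice they are set in hbase-site.xml on the region servers:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class WriteBlockingSettings {
    public static void main(String[] args) {
        // Shown programmatically only to name the knobs; these belong in
        // hbase-site.xml on the region servers.
        Configuration conf = HBaseConfiguration.create();

        // Writes pause when a store accumulates too many unflushed store files.
        conf.setInt("hbase.hstore.blockingStoreFiles", 7);

        // Writes block when a region's memstore grows past
        // flush-size * multiplier before it can be flushed.
        conf.setInt("hbase.hregion.memstore.block.multiplier", 2);

        System.out.println("blockingStoreFiles = "
                + conf.getInt("hbase.hstore.blockingStoreFiles", -1));
    }
}
```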

Re: Data locality in HBase

2012-06-21 Thread Lars George
Hi Ben, According to your fsck dump, the first copy is located on hadoop-143, which has all the blocks for the region. So if you check, I would assume that the region is currently open and served by hadoop-143, right? The TableInputFormat getSplits() will report that server to the MapReduce fra
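A minimal sketch of the MapReduce-over-HBase wiring Lars is describing, where TableInputFormat's getSplits() supplies the hosting region server as each split's location; the table name, job name, and row-counting mapper are made up for illustration:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class LocalityScanJob {

    // Trivial mapper that just counts rows, so each region gets scanned once.
    static class RowCounter extends TableMapper<NullWritable, NullWritable> {
        @Override
        protected void map(ImmutableBytesWritable key, Result value, Context ctx) {
            ctx.getCounter("scan", "rows").increment(1);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "locality-scan");
        job.setJarByClass(LocalityScanJob.class);

        Scan scan = new Scan();
        scan.setCacheBlocks(false); // usual advice for full-table MR scans

        // TableInputFormat builds one split per region and reports the hosting
        // region server as the split location; the scheduler then tries to run
        // each map task on that node, which is where locality pays off when
        // the region's blocks live on the same machine.
        TableMapReduceUtil.initTableMapperJob("mytable", scan, RowCounter.class,
                NullWritable.class, NullWritable.class, job);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(NullOutputFormat.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```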