Re: GC peaks during major compaction

2014-08-20 Thread lars hofhansl
You want to decrease your young gen (defaults to 40% of heap, which is *way* too big for HBase). I wrote the reasoning here: http://hadoop-hbase.blogspot.com/2014/03/hbase-gc-tuning-observations.html (Basically HBase produces a lot of day-to-day garbage that can be collected quickly. You do not
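Before shrinking the young generation, it can help to see how large it actually is on the JVM in question. The following is a minimal sketch, not taken from the thread: it uses the standard `java.lang.management` API and assumes that young-generation pools can be recognised by "Eden" or "Survivor" in their names, which holds for the common HotSpot collectors but is not guaranteed everywhere. The class name `YoungGenReport` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class YoungGenReport {

    // Assumption: HotSpot names young-gen pools "... Eden Space" / "... Survivor Space".
    static boolean isYoungPool(String poolName) {
        String n = poolName.toLowerCase();
        return n.contains("eden") || n.contains("survivor");
    }

    // Percentage of 'whole' taken by 'part'; 0 when the whole is unknown.
    static double percentOf(long part, long whole) {
        return whole <= 0 ? 0.0 : 100.0 * part / whole;
    }

    public static void main(String[] args) {
        long heapMax = Runtime.getRuntime().maxMemory();
        long youngCommitted = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null for an invalid pool
            if (usage != null && isYoungPool(pool.getName())) {
                youngCommitted += usage.getCommitted();
            }
        }
        System.out.printf("young gen committed: %d bytes (%.1f%% of max heap %d)%n",
                youngCommitted, percentOf(youngCommitted, heapMax), heapMax);
    }
}
```

This reports the currently committed size, which can differ from the configured maximum, so treat the number as a rough check rather than a tuning target.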

Re: performance of block cache

2014-08-20 Thread 牛兆捷
Hi Nick: Yes, I am interested in it. I will try first. Btw, this site (http://people.apache.org/~stack/bc/) also does a similar performance evaluation. You can have a look if you are interested. 2014-08-21 1:48 GMT+08:00 Nick Dimiduk : > Hi Zhaojie, > > I'm responsible for this particular

Re: Shout-out for Misty

2014-08-20 Thread tobe
Keep updating. Thanks very much! On Thu, Aug 21, 2014 at 9:28 AM, iain wright wrote: > As an admin constantly referring to these docs, Thank you! > > -- > Iain Wright > > This email message is confidential, intended only for the recipient(s) > named above and may contain information that is pri

Re: Shout-out for Misty

2014-08-20 Thread iain wright
As an admin constantly referring to these docs, Thank you! -- Iain Wright This email message is confidential, intended only for the recipient(s) named above and may contain information that is privileged, exempt from disclosure under applicable law. If you are not the intended recipient, do not

Re: Create custom filter on HBase 0.96.1.1-cdh5.0.1

2014-08-20 Thread Ted Yu
Can you show implementation for parseFrom(byte[]) - using pastebin ? If possible, seeing the code for whole class would help us understand better. Cheers On Wed, Aug 20, 2014 at 4:18 PM, gabriela montiel < gabriela.mont...@oracle.com> wrote: > Hi all, > > I have been working on migrating a cus

Re: Shout-out for Misty

2014-08-20 Thread Alex Newman
Hooray docs! On Aug 20, 2014 10:25 AM, "Esteban Gutierrez" wrote: > +1 thank you for all the hard work Misty! > > esteban. > > > -- > Cloudera, Inc. > > > > On Wed, Aug 20, 2014 at 10:20 AM, Andrew Purtell > wrote: > > > Huge +1 > > > > > > On Tue, Aug 19, 2014 at 10:53 PM, Nick Dimiduk > wrot

Create custom filter on HBase 0.96.1.1-cdh5.0.1

2014-08-20 Thread gabriela montiel
Hi all, I have been working on migrating a custom filter used in HBase 0.94 to make it work on HBase 0.96.1.1. This custom filter extends the FilterBase API and receives only two byte arrays. According to the documentation both toByteArray() and parseFrom(byte[]) should be implemented. After
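The `toByteArray()` / `parseFrom(byte[])` pair mentioned above has to round-trip the filter's state. The sketch below illustrates only that round-trip for two byte arrays, using plain JDK streams so it runs without HBase on the classpath. It is an assumption-laden illustration, not the poster's filter: the class name `TwoArrayFilterState` and the length-prefixed layout are hypothetical, a real filter would extend `FilterBase`, and HBase's own filters serialize with protobuf rather than this ad hoc format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical stand-in for the state of a custom filter that takes two byte arrays.
public class TwoArrayFilterState {
    private final byte[] first;
    private final byte[] second;

    public TwoArrayFilterState(byte[] first, byte[] second) {
        this.first = first.clone();
        this.second = second.clone();
    }

    public byte[] getFirst() { return first.clone(); }
    public byte[] getSecond() { return second.clone(); }

    // Serialize as: int length + bytes, twice (layout is an assumption of this sketch).
    public byte[] toByteArray() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(first.length);
            out.write(first);
            out.writeInt(second.length);
            out.write(second);
        }
        return buf.toByteArray();
    }

    // Static factory that must read exactly what toByteArray() wrote.
    public static TwoArrayFilterState parseFrom(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            byte[] first = new byte[in.readInt()];
            in.readFully(first);
            byte[] second = new byte[in.readInt()];
            in.readFully(second);
            return new TwoArrayFilterState(first, second);
        }
    }

    public static void main(String[] args) throws IOException {
        TwoArrayFilterState original =
                new TwoArrayFilterState(new byte[] {1, 2, 3}, new byte[] {9});
        TwoArrayFilterState copy = parseFrom(original.toByteArray());
        System.out.println(Arrays.equals(original.getFirst(), copy.getFirst())
                && Arrays.equals(original.getSecond(), copy.getSecond()));
    }
}
```

A round-trip check like the one in `main` is a cheap way to rule out a mismatch between the two methods before testing the filter on a region server.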

Re: Splitting an existing table with new keys.

2014-08-20 Thread Ted Yu
Good question. Method #2 works for now. Please watch HBASE-11608 which proposes to add synchronous split. Cheers On Wed, Aug 20, 2014 at 1:45 PM, Shahab Yunus wrote: > Thanks Ted. > > I also wanted know that from recommendation perspective is this approach > even safe or desirable or not. Or

Re: Splitting an existing table with new keys.

2014-08-20 Thread Shahab Yunus
Thanks Ted. I also wanted to know whether, from a recommendation perspective, this approach is safe or desirable, or whether it is some kind of HBase anti-pattern (splitting the same table before each bulk import). So I did try this and it works with and without existing data. Now one follow-up que

Re: hbase is not deleting the cell when a Put with a KeyValue, KeyValue.Type.Delete is submitted

2014-08-20 Thread Ted Yu
bq. Batch does not guarantee the order of the mutations sent over Did you get the above from javadoc of the method ? javadoc gives example of order between Get and Put. In your case, the Put and Delete are for the same row. Therefore they would be executed atomically. On Wed, Aug 20, 2014 at 7:4

Re: GC peaks during major compaction

2014-08-20 Thread Andrew Purtell
Cool, let me ping my former colleagues about that. On Wed, Aug 20, 2014 at 11:34 AM, Bryan Beaudreault < bbeaudrea...@hubspot.com> wrote: > That blog post is awesome, I hadn't seen it before. Eagerly looking > forward to parts 2 and 3. > > > > > On Wed, Aug 20, 2014 at 2:00 PM, Andrew Purtell

Re: GC peaks during major compaction

2014-08-20 Thread Bryan Beaudreault
That blog post is awesome, I hadn't seen it before. Eagerly looking forward to parts 2 and 3. On Wed, Aug 20, 2014 at 2:00 PM, Andrew Purtell wrote: > If using Java 7 and G1, you might want to look over: > > https://software.intel.com/en-us/blogs/2014/06/18/part-1-tuning-java-garbage-collect

Re: GC peaks during major compaction

2014-08-20 Thread Andrew Purtell
If using Java 7 and G1, you might want to look over: https://software.intel.com/en-us/blogs/2014/06/18/part-1-tuning-java-garbage-collection-for-hbase On Wed, Aug 20, 2014 at 8:26 AM, Jean-Marc Spaggiari < jean-m...@spaggiari.org> wrote: > I agree with Bryan. > > HBase start to have some GC diff

Re: performance of block cache

2014-08-20 Thread Nick Dimiduk
Hi Zhaojie, I'm responsible for this particular bit of work. One thing to note in these experiments is that I did not control explicitly for OS caching. I ran "warmup" workloads before collecting measurements, but because the amount of RAM on the machine is fixed, it's impact of OS cache is differ

Re: Development Connectivity Issues in "Stand-Alone" Mode

2014-08-20 Thread Sean Kennedy
This once helped me.. Add the following to /etc/hosts 127.0.0.1 localhost - Original Message - From: "Jean-Marc Spaggiari" To: "user" Sent: Monday, August 18, 2014 5:51:47 PM Subject: Re: Development Connectivity Issues in "Stand-Alone" Mode Hi VJ, Simply make sure your /et

Re: Shout-out for Misty

2014-08-20 Thread Esteban Gutierrez
+1 thank you for all the hard work Misty! esteban. -- Cloudera, Inc. On Wed, Aug 20, 2014 at 10:20 AM, Andrew Purtell wrote: > Huge +1 > > > On Tue, Aug 19, 2014 at 10:53 PM, Nick Dimiduk wrote: > > > Our docs are getting a lot of love lately, courtesy of one Misty > > Stanley-Jones. As s

Re: How to change MAX_FILES_PER_REGION_PER_FAMILY in LoadIncrementalHFiles?

2014-08-20 Thread Jerry Lam
Hi Matteo, Thank you for addressing the issue. For now, I will just set the variable in hbase-site.xml. Best Regards, Jerry On Wed, Aug 20, 2014 at 12:33 PM, Matteo Bertozzi wrote: > yeah sorry, just looked at the code and it is not initializing the tool > correctly to pickup the -D configur

Re: Shout-out for Misty

2014-08-20 Thread Andrew Purtell
Huge +1 On Tue, Aug 19, 2014 at 10:53 PM, Nick Dimiduk wrote: > Our docs are getting a lot of love lately, courtesy of one Misty > Stanley-Jones. As someone who joined this community by way of > documentation, I'd like to say: Thank you, Misty! > > -n > -- Best regards, - Andy Problems

Re: Hbase InputFormat for multi-row + column range, how to do it?

2014-08-20 Thread Jianshi Huang
I see and I'll try. Thanks Andrey! Jianshi On Wed, Aug 20, 2014 at 6:01 PM, Andrey Stepachev wrote: > Hi Jianshi. > > You can create your own. Just inherit from TableInputFormatBase or > TableInputFormat and add ColumnRangeFilter to scan (either construct your > own, or intercept setScan metho

Re: How to change MAX_FILES_PER_REGION_PER_FAMILY in LoadIncrementalHFiles?

2014-08-20 Thread Matteo Bertozzi
yeah sorry, just looked at the code and it is not initializing the tool correctly to pick up the -D configuration. Let me fix that; I've opened HBASE-11789. As you said, with the current code only the hbase-site.xml conf is used, so you need to set the property there. Matteo On Wed, Aug 20, 2014 a

Re: How to change MAX_FILES_PER_REGION_PER_FAMILY in LoadIncrementalHFiles?

2014-08-20 Thread Jerry Lam
Hi Matteo, Thank you for the info. I tried it but it doesn't seem to have any effect. Apparently the code in LoadIncrementalHFiles does not take anything other than variables from hbase-site.xml, which is unfortunate. We have more than 32 hfiles to bulkload. So this is really not working... Bes

Re: Hadoop 1.2.1 integration with Hbase 0.98.5

2014-08-20 Thread Jean-Marc Spaggiari
Can you try adding your host name with the local IP to your /etc/hosts file? Something like: 192.168.1.3 t430s Where t430s is your host name and 192.168.1.3 is your local IP. Restart all the processes after that... JM 2014-08-20 8:32 GMT-04:00 Ratnajit Devnath - ERS, HCL Tech < ratnaji
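One way to see whether the /etc/hosts change took effect is to ask the JVM what the local host name resolves to. This is a minimal sketch using only `java.net.InetAddress`; the class name `HostResolutionCheck` is hypothetical, and the assumption that a loopback result is the cause of the problem is just that, an assumption, since the original report is truncated.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostResolutionCheck {

    // True if the given IP literal is a loopback address (no DNS lookup for literals).
    static boolean isLoopback(String ipLiteral) throws UnknownHostException {
        return InetAddress.getByName(ipLiteral).isLoopbackAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Throws UnknownHostException if the host name cannot be resolved at all.
        InetAddress local = InetAddress.getLocalHost();
        System.out.println(local.getHostName() + " -> " + local.getHostAddress());
        if (local.isLoopbackAddress()) {
            System.out.println("Host name resolves to loopback; "
                    + "consider mapping it to the local IP in /etc/hosts.");
        }
    }
}
```

With the suggested entry in place, the program would be expected to print the host name next to the local IP rather than a 127.x address.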

RE: Hadoop 1.2.1 integration with Hbase 0.98.5

2014-08-20 Thread Ratnajit Devnath - ERS, HCL Tech
Hi All, I have installed Hadoop 1.2.1 in Fedora 20 Linux OS. I am using the single node configuration. Hadoop is up. [root@localhost hbase-0.98.5-hadoop1]# jps 6153 JobTracker 6272 TaskTracker 6061 SecondaryNameNode 8594 HQuorumPeer 5938 DataNode 5816 NameNode 8917 Jps Core-site.xml file -

Re: GC peaks during major compaction

2014-08-20 Thread Jean-Marc Spaggiari
I agree with Bryan. HBase starts to have some GC difficulties after 16GB. Depending on the kind of load you put on it, it might be fine up to a certain point. Seems that with your daily load, 30GB is fine. However, when you start to do the compactions, you start to see the GC issues. You can try G

Re: delete ".corrupt" folder?

2014-08-20 Thread Jean-Marc Spaggiari
They seem to be the logs as said before. But as you said too, too late now ;) We can not take one and look at it. Basically, when you ran out of space, most probably HBase failed to write the logs correctly so the files got corrupted, and got moved into this folder when they got replayed. Since it

Re: delete ".corrupt" folder?

2014-08-20 Thread Henning Blohm
Ah.. man.. sorry for the confusion: Just noted that the terminal was still open. Here's the output from the delete: $ hadoop fs -rmr /hbase/.corrupt/* Deleted hdfs://localhost:9000/hbase/.corrupt/localhost%3A60020.1406915392963 Deleted hdfs://localhost:9000/hbase/.corrupt/localhost%3A60020.14070

Re: delete ".corrupt" folder?

2014-08-20 Thread Henning Blohm
Too bad. I read this just now after I have already deleted those files. The folder was empty before that node ran into disk space trouble. Seems that nothing bad happened so far. Thanks, Henning On 08/20/2014 04:43 PM, Jean-Marc Spaggiari wrote: Can you list the files you have under this dire

Re: How to change MAX_FILES_PER_REGION_PER_FAMILY in LoadIncrementalHFiles?

2014-08-20 Thread Matteo Bertozzi
you should be able to use the -D option to set the new value LoadIncrementalHFiles -Dhbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily=NEW_VALUE Matteo On Wed, Aug 20, 2014 at 3:46 PM, Jerry Lam wrote: > Hi HBase users, > > I wonder if anyone knows how to make change to > the MAX_FILES

How to change MAX_FILES_PER_REGION_PER_FAMILY in LoadIncrementalHFiles?

2014-08-20 Thread Jerry Lam
Hi HBase users, I wonder if anyone knows how to make change to the MAX_FILES_PER_REGION_PER_FAMILY in LoadIncrementalHFiles? The default value is 32 which is quite small. HBase Version 0.98 Thank you, Jerry

RE: hbase is not deleting the cell when a Put with a KeyValue, KeyValue.Type.Delete is submitted

2014-08-20 Thread Armaselu, Cristian
Batch does not guarantee the order of the mutations sent over (Put/Delete,etc). We need an atomic change of a row. Cristian Armaselu Solution Architect Shared Technology Services 6021 Connection Drive Irving, TX 75039 carmas...@epsilon.com The information contained in this communication is confi

Re: delete ".corrupt" folder?

2014-08-20 Thread Jean-Marc Spaggiari
Can you list the files you have under this directory? Look at 9.6.5.3.1 in http://hbase.apache.org/book/regionserver.arch.html They might be corrupt log files that we can not replay. So it might be safe to remove them, but you might have some data loss there... JM 2014-08-20 10:29 GMT-04:00 Henning

Re: delete ".corrupt" folder?

2014-08-20 Thread Henning Blohm
Nobody? Well... I will try and see what happens... Thanks, Henning On 08/11/2014 09:28 PM, Henning Blohm wrote: Lately, on a single node test installation, I noticed that the Hadoop/Hbase folder /hbase/.corrupt got quite big (probably due to failed log splitting due to lack of disk space).

Re: performance of block cache

2014-08-20 Thread 牛兆捷
the complete blog link is: http://zh.hortonworks.com/blog/blockcache-showdown-hbase/ 2014-08-20 11:41 GMT+08:00 牛兆捷 : > Hi all: > > I saw some interesting results from Hortonworks blog (block cache > > ). > > In th

Re: hbase is not deleting the cell when a Put with a KeyValue, KeyValue.Type.Delete is submitted

2014-08-20 Thread Ted Yu
Can you use this API ? https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#batch(java.util.List,%20java.lang.Object[]) On Aug 20, 2014, at 5:41 AM, "Armaselu, Cristian" wrote: > Is not atomic. > A Put is atomic while a Put and a Delete are not. > > > Cristian Armaselu >

RE: hbase is not deleting the cell when a Put with a KeyValue, KeyValue.Type.Delete is submitted

2014-08-20 Thread Armaselu, Cristian
Is not atomic. A Put is atomic while a Put and a Delete are not. Cristian Armaselu Solution Architect Shared Technology Services 6021 Connection Drive Irving, TX 75039 carmas...@epsilon.com The information contained in this communication is confidential, and is intended only for the sole use o

Re: Shout-out for Misty

2014-08-20 Thread Ted Yu
+1 Thanks Misty On Aug 20, 2014, at 3:17 AM, Anoop John wrote: > Great work! Thanks a lot Misty... > > > -Anoop- > > On Wed, Aug 20, 2014 at 11:56 AM, ramkrishna vasudevan < > ramkrishna.s.vasude...@gmail.com> wrote: > >> Great job !! Keep it up.!!! >> >> Regards >> Ram >> >> >> On Wed,

Re: Shout-out for Misty

2014-08-20 Thread Anoop John
Great work! Thanks a lot Misty... -Anoop- On Wed, Aug 20, 2014 at 11:56 AM, ramkrishna vasudevan < ramkrishna.s.vasude...@gmail.com> wrote: > Great job !! Keep it up.!!! > > Regards > Ram > > > On Wed, Aug 20, 2014 at 11:49 AM, rajeshbabu chintaguntla < > rajeshbabu.chintagun...@huawei.com> wr

Re: GC peaks during major compaction

2014-08-20 Thread Bryan Beaudreault
You're gonna want to use java7 and the G1 collector with a heap that high. I'm surprised you aren't seeing other issues. It's possible that with the added CPU load from the compactions, the garbage collector is not able to keep up and must do a full clean. On Wednesday, August 20, 2014, yanivG w

Re: Hbase InputFormat for multi-row + column range, how to do it?

2014-08-20 Thread Andrey Stepachev
Hi Jianshi. You can create your own. Just inherit from TableInputFormatBase or TableInputFormat and add ColumnRangeFilter to scan (either construct your own, or intercept setScan method). Hope this helps. -- Andrey. On Wed, Aug 20, 2014 at 1:35 PM, Jianshi Huang wrote: > Hi, > > I know Table

Hbase InputFormat for multi-row + column range, how to do it?

2014-08-20 Thread Jianshi Huang
Hi, I know TableInputFormat and HFileInputFormat can both set ROW_START and ROW_END, but neither of them can set the column range (like what we do in ColumnRangeFilter). So how can I do column range in HBase InputFormat? Is there an implementation available? If not, how much effort do you think it t