Re: Full GC on client may lead to empty scan results

2015-07-30 Thread Sean Busbey
This sounds similar to HBASE-13262, but on versions that expressly have that fix in place. Mind putting up a jira with the problem reproduction? On Thu, Jul 30, 2015 at 1:13 PM, James Estes wrote: > All, > > If a full GC happens on the client when a scan is in progress, the scan can > be missin

fsck reports WALs corrupt in Hadoop 2.6.0

2015-07-30 Thread James Estes
All, With 0.98.12 on top of Hadoop 2.6.0, running: ./bin/hdfs fsck -openforwrite reports that the WALs are corrupt. Without -openforwrite, everything is fine. Each region server WAL shows "MISSING 1 blocks of total size 83 B". May be the same as https://issues.apache.org/jir

Full GC on client may lead to empty scan results

2015-07-30 Thread James Estes
All, If a full GC happens on the client when a scan is in progress, the scan can be missing rows. I have a test that repros this almost every time. The test runs against a local standalone server with 10g heap, using jdk1.7.0_45. The Test: - run with -Xmx1900m to restrict client heap - run with

Re: [DISCUSS] Split up the book again?

2015-07-30 Thread anil gupta
Good to know that there is already a JIRA for this. Thanks. On Thu, Jul 30, 2015 at 10:44 AM, Sean Busbey wrote: > The need to update the javadocs published on the site is a long standing > issue. Please keep discussion for it on its jira: > > https://issues.apache.org/jira/browse/HBASE-13140 > >

Re: [DISCUSS] Split up the book again?

2015-07-30 Thread Sean Busbey
I personally like using the single-page version and tended to do so when we had both by-chapter and whole book. IIRC, asciidoc can make the per-chapter version. I think there was some reasoning from Misty on not enabling it by default, but I can't find the jira/thread at the moment. What if inste

Re: [DISCUSS] Split up the book again?

2015-07-30 Thread Sean Busbey
The need to update the javadocs published on the site is a long standing issue. Please keep discussion for it on its jira: https://issues.apache.org/jira/browse/HBASE-13140 (there's also a workaround provided) On Thu, Jul 30, 2015 at 12:23 PM, anil gupta wrote: > http://hbase.apache.org/apidoc

Re: [DISCUSS] Split up the book again?

2015-07-30 Thread anil gupta
http://hbase.apache.org/apidocs/index.html The above link refers to the HBase 2.0 docs, and another link on our website refers to 0.94. So there is no way to reach 0.98, 1.0, or 1.1. On Thu, Jul 30, 2015 at 10:18 AM, anil gupta wrote: > Hi All, > > Since we are talking about HBase documentation. Is it

Re: [DISCUSS] Split up the book again?

2015-07-30 Thread anil gupta
Hi All, Since we are talking about HBase documentation: is it possible to have docs for specific versions? Right now, the JavaDocs refer to 0.94 or HBase 2.0. It's not convenient to look at 2.0 docs while working on 0.98 or 1.0. I hope this is not too difficult to accomplish. Apache Kafka, Ela

Re: [DISCUSS] Split up the book again?

2015-07-30 Thread Jean-Marc Spaggiari
+1 too. Even if it is cleaner and nicer, searching in it is a pain compared to before. On 2015-07-30 07:17, "Shane O'Donnell" wrote: > +1. > > One specific case where this is an issue is if you are entering the book > with an anchor link. If you try this, it appears to just hang. > > Shane O. > >

Re: Compaction after bulk-load

2015-07-30 Thread Laurent H
I think the number of regions on your RS doesn't matter if your key is a good one! Maybe check the documentation about the number of HFiles in each HRegion (there is some material about HFiles and minor compaction); this can affect your write/read speed. -- Laurent HATIER - Consultant Big

Re: Compaction after bulk-load

2015-07-30 Thread Laurent H
Yes, it's one region = one reducer = one HFile generated -- Laurent HATIER - Consultant Big Data & Business Intelligence at CapGemini fr.linkedin.com/pub/laurent-hatier/25/36b/a86/ 2015-07-30 17:07 GMT+02:00 Krishna : > There are 10 region serv

Re: Compaction after bulk-load

2015-07-30 Thread Krishna
There are 10 region servers & I can schedule compaction during the weekend when the write load is negligible. After reading the documentation, it's not clear how many HFiles are created once bulk-load finishes - is it one HFile per reducer? My question is, is it recommended to run major compaction after b

Re: [DISCUSS] Split up the book again?

2015-07-30 Thread Shane O'Donnell
+1. One specific case where this is an issue is if you are entering the book with an anchor link. If you try this, it appears to just hang. Shane O. On Thu, Jul 30, 2015 at 10:07 AM, Stack wrote: > On Thu, Jul 30, 2015 at 2:06 PM, Lars Francke > wrote: > > > While I like the new and better l

Re: [DISCUSS] Split up the book again?

2015-07-30 Thread Stack
On Thu, Jul 30, 2015 at 2:06 PM, Lars Francke wrote: > While I like the new and better layout of the book it is painful to use - > at least for me - because of its size. > > I've started to notice this too. It'd be sweet if it loaded more promptly. Thanks for starting the discussion. St.Ack

Re: Compaction after bulk-load

2015-07-30 Thread Laurent H
Major compaction is a very heavy operation. We use bulk loading and have scheduled one major compaction at 2 A.M., and it rocks! -- Laurent HATIER - Consultant Big Data & Business Intelligence at CapGemini fr.linkedin.com/pub/laurent-hatier/25/36b/a86/ 2015-0

Re: Compaction after bulk-load

2015-07-30 Thread Ted Yu
How many region servers do you have in the cluster? Would there be concurrent write load on the cluster if you choose to run major compaction? I ask this because the concurrent writes would be slowed down by the major compaction, and compacting 10 TB of data would take some time. Cheers On Wed,

Re: Compaction after bulk-load

2015-07-30 Thread 奥伯莱恩
Major compaction will clean up deleted data and merge HFiles into one file. Maybe it's not needed. At 2015-07-30 07:23:02, "Krishna" wrote: >Hi, > >I am planning to bulk-load about 10 TB of data to a table pre-split with >30 regions with max region file size configured to 10 GB. > >Is it recomme

[DISCUSS] Split up the book again?

2015-07-30 Thread Lars Francke
While I like the new and better layout of the book, it is painful to use - at least for me - because of its size. A few problems I'm hitting/seeing: * It takes Chrome almost 10 seconds to render the page until I can interact with it; this is not because of data transfer time, as this happens locally t

RegionServerObserver preMerge()/postMerge() methods are not getting invoked..

2015-07-30 Thread Mrudula Madiraju
Hi, I have a custom coprocessor associated with a table. This coprocessor extends BaseRegionServerObserver. I can see that it is loaded, and flow goes to the constructor, start() and stop() methods. When I invoke merge_region in the hbase shell on two regions of the table, I can see the m