I ran the same job on HBase over Hadoop: everything works like a charm.
I can draw two conclusions:
1. there is some bug in standalone mode
2. bad memory, but I don't think this is the case
(the disks are the same, the memory is the same, the machine is the same,
the workload is the same, but the result differs).
Later I'll try to write a test case.
2010/9/22 And
Very strange. With HBase over Hadoop there are no such checksum errors.
Very strange. I'll recheck on another big family.
2010/9/22 Andrey Stepachev :
> Thanks. Now i run the same job on
> hbase 0.89 over cloudera hadoop instead of standalone mode.
> May be here some bug in standalone mode, which pre
Thanks. Now I am running the same job on
HBase 0.89 over Cloudera Hadoop instead of standalone mode.
Maybe there is some bug in standalone mode which prevents
correct data from being written to disk. Later I'll check the memory.
Btw, the Linux is openSUSE 11.0, kernel 2.6.25.20-0.7-default, 64-bit.
2010/9/22 Ryan Rawson :
> So
So the client code looks good, hard to say what exactly is going on.
BTW I opened this JIRA:
https://issues.apache.org/jira/browse/HBASE-3029
To address the confusing exception in this case.
It's hard to say why you get that exception under load... some systems
have been known to give weird flak
One more note. This database was on 0.20.6 before. Then
I started 0.89 over it
(but the table with the wrong checksum was created in HBase 0.89).
2010/9/22 Andrey Stepachev :
> 2010/9/22 Ryan Rawson :
>> why are you using such expensive disks? raid + hdfs = lower
>> performance than non-raid.
>
> It was datab
2010/9/22 Ryan Rawson :
> why are you using such expensive disks? raid + hdfs = lower
> performance than non-raid.
It was a database server before we migrated to HBase. It was designed
for PostgreSQL. Now, with compression and the nature of HBase, our database
is 12 GB instead of 180 GB in pg.
So this server
why are you using such expensive disks? raid + hdfs = lower
performance than non-raid.
how's your RAM? how are your network switches? NICs? etc etc.
anything along the data path can introduce errors.
in this case we did the right thing and threw exceptions, but looks
like your client continues t
but yesterday HBase was 0.20.6 and the exceptions were different.
From my previous email:
----
I need to do a massive data rewrite in some family on a standalone server. I
get org.apache.hadoop.hbase.NotServingRegionException
or java.io.IOException: Region xxx closed when I write and read at the same time.
-
HP ProLiant, RAID 10 with 4 SAS 15k disks, Smart Array 6i, 2 CPUs / 4 cores.
2010/9/22 Ryan Rawson :
> generally checksum errors are due to hardware faults of one kind or another.
>
> what is your hardware like?
>
> On Wed, Sep 22, 2010 at 2:08 AM, Andrey Stepachev wrote:
>> But why it is bad? Split/compactio
generally checksum errors are due to hardware faults of one kind or another.
what is your hardware like?
On Wed, Sep 22, 2010 at 2:08 AM, Andrey Stepachev wrote:
> But why it is bad? Split/compaction? I make my own RetryResultIterator
> which reopen scanner on timeout. But what is best way to re
But why is it bad? Split/compaction? I made my own RetryResultIterator
which reopens the scanner on timeout. But what is the best way to reopen a scanner?
Can you point me to where I can find all these exceptions? Or maybe
there is already some sort of recoverable iterator?
2010/9/22 Ryan Rawson :
> ah ok i think
ah ok i think i get it... basically at this point your scanner is bad
and iterating on it again won't work. the scanner should probably
close itself so you get tons of additional exceptions but instead
we don't.
there is probably a better fix for this, i'll ponder
On Wed, Sep 22, 2010 at 1:5
very strange... looks like a bad block ended up in your scanner and
subsequent nexts were failing due to that short read.
did you have to kill the regionserver or did things recover and
continue normally?
-ryan
On Wed, Sep 22, 2010 at 1:37 AM, Andrey Stepachev wrote:
> Hi All.
>
> I get org.apa