cient load.
Added item #20 to http://wiki.apache.org/hadoop/Hbase/Troubleshooting
Sorry I didn't get to this sooner, Jack.
- Andy
--- On Wed, 3/30/11, Jack Levin wrote:
> From: Jack Levin
> Subject: Re: major hdfs issues
> To: user@hbase.apache.org
> Cc: "Suraj Varma
Thanks for updating the list Jack. I added a note to our 'book' on
nproc and referenced your email below (Will push the changes to the
website later).
Good stuff,
St.Ack
On Wed, Mar 30, 2011 at 7:31 PM, Jack Levin wrote:
Thanks to everyone chiming in to help me fix this issue... It has now
been resolved. JD and I spent some time looking at thread limits, and
apparently our userid 'hadoop' had its nproc limit set to the default
of 1024. This of course caused us to run out of threads every time
we were under load,
On Sun, Mar 13, 2011 at 1:33 PM, Jack Levin wrote:
We are running at 128000 for ulimit -n. I am pretty sure the culprit is the
thrift server: it opens up 20k threads under load and crashes all other
servers by taking away RAM.
Do you guys disable tcp cookies also? In regards to iptables, what is the
best way to disable it?
-Jack
On Sat, Mar 12, 20
You may also want to look at the value set for ulimit -u. It's
unlimited on many OSes, but RHEL6 in particular sets it way too low,
which will cause the "unable to create new native thread" error. What OS
are you running?
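To compare the limit with what a JVM is actually using, here is a minimal Java sketch. It assumes a Linux host where /proc/self/limits exists and has a "Max processes" row; the class and method names are made up for illustration. Keep in mind that nproc is counted per user across all of that user's processes, so the JVM's own thread count is only part of the picture.

```java
import java.lang.management.ManagementFactory;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not part of HBase or Hadoop.
class NprocLimitSketch {

    // Returns the soft limit column of the "Max processes" row, given text
    // in the layout used by /proc/<pid>/limits on Linux.
    static Optional<String> softMaxProcesses(String limitsText) {
        final String label = "Max processes";
        for (String line : limitsText.split("\n")) {
            if (line.startsWith(label)) {
                String[] fields = line.substring(label.length()).trim().split("\\s+");
                if (fields.length > 0 && !fields[0].isEmpty()) {
                    return Optional.of(fields[0]);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: Linux procfs is available; elsewhere this just reports "unknown".
        Path limits = Paths.get("/proc/self/limits");
        String soft = "unknown";
        if (Files.isReadable(limits)) {
            String text = new String(Files.readAllBytes(limits), StandardCharsets.UTF_8);
            soft = softMaxProcesses(text).orElse("unknown");
        }
        int liveThreads = ManagementFactory.getThreadMXBean().getThreadCount();
        System.out.println("soft nproc limit for this user: " + soft);
        System.out.println("live threads in this JVM only:  " + liveThreads);
    }
}
```

Run it as the same user that runs the daemons (here 'hadoop'), since the limit is per user.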
The conntrack error has to do with ip_conntrack, which is an iptables
module that keeps
Awesome, thanks... This is similar to mysql max-conn setting.
-Jack
On Sat, Mar 12, 2011 at 11:29 AM, Stack wrote:
I opened HBASE-3628 to expose the TThreadPoolServer options on the
command-line for thrift server.
St.Ack
On Sat, Mar 12, 2011 at 11:20 AM, Stack wrote:
Via Bryan (and J-D), by default we use the thread pool server from
Thrift (unless you choose the non-blocking option):
    LOG.info("starting HBase ThreadPool Thrift server on " +
        listenAddress + ":" + Integer.toString(listenPort));
    server = new TThreadPoolServer(processor, serverT
I don't see any bounding in the thrift code. Asking Bryan
St.Ack
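For illustration, this is roughly what a bound on worker threads looks like. It is a sketch built only on java.util.concurrent, not on Thrift's own API, and the class name, method names and pool sizes are made up. How the Thrift server would actually expose such a bound is the subject of HBASE-3628.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; this is not the HBase or Thrift implementation.
class BoundedWorkerPoolSketch {

    // A pool that hands each task straight to a thread (no queueing) and
    // refuses work once maxWorkers threads are busy.
    static ThreadPoolExecutor newBoundedPool(int minWorkers, int maxWorkers) {
        return new ThreadPoolExecutor(
                minWorkers, maxWorkers,
                60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>(),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Submits `requests` tasks that all block until the burst is over and
    // returns how many were rejected because the pool was at its bound.
    static int submitBurst(ThreadPoolExecutor pool, int requests) {
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await(); // stands in for a slow client call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++;
            }
        }
        release.countDown(); // let the accepted tasks finish
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newBoundedPool(2, 4);
        int rejected = submitBurst(pool, 10);
        System.out.println("largest pool size: " + pool.getLargestPoolSize());
        System.out.println("rejected requests: " + rejected);
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

The trade-off is that under a burst, clients get refused instead of the box running out of memory or threads, so the callers need to handle refusals or retries.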
On Sat, Mar 12, 2011 at 10:04 AM, Jack Levin wrote:
So our problem is this: when we restart a region server, or it goes
down, hbase slows down. While we send super high frequency thrift
calls from our PHP front-end app, we actually spawn 2+ threads on
thrift, and what this does is destroy all memory on the boxes and
cause DNs just to shut
d
>> to:java.lang.OutOfMemoryError: unable to create new native thread
This indicates that you are oversubscribed on your RAM to the extent
that the JVM doesn't have any space to create native threads (which
are allocated outside of the JVM heap.)
You may actually have to _reduce_ your heap sizes t
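A back-of-the-envelope sketch of why thread count matters here, in Java. The 20,000 figure is the thread count reported for the thrift server in this thread; the 1024 KB stack size is an assumption, since the real default depends on the JVM, the platform and any -Xss setting. The result is an upper bound on stack space reserved outside the heap, not a measurement of resident memory.

```java
// Hypothetical estimate; the numbers in main are illustrative inputs.
class ThreadStackEstimate {

    // Bytes of native stack reserved by `threads` threads of `stackKb` KB each.
    static long stackBytes(long threads, long stackKb) {
        return threads * stackKb * 1024L;
    }

    public static void main(String[] args) {
        long threads = 20_000L; // thread count mentioned in this thread
        long stackKb = 1_024L;  // assumed per-thread stack size (see -Xss)
        long bytes = stackBytes(threads, stackKb);
        double gib = bytes / (1024.0 * 1024.0 * 1024.0);
        System.out.printf("%d threads x %d KB = %d bytes (about %.1f GiB)%n",
                threads, stackKb, bytes, gib);
    }
}
```

Under those assumptions the stacks alone could reserve about 19.5 GiB, all of it outside the configured heap.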
I am also noticing the following errors:
2011-03-11 17:52:00,376 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
10.103.7.3:50010, storageID=DS-824332190-10.103.7.3-50010-1290043658438,
infoPort=50075, ipcPort=50020):DataXceiveServer: Exiting due
to:java.lang.OutOfMemoryEr
Looks like a datanode went down. InterruptedException is what Java
uses to interrupt IO in threads; it's similar to the EINTR errno. That
means the actual source of the abort is higher up...
So back to how InterruptedException works... at some point a thread in
the JVM decides that the VM should a
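A minimal Java sketch of that mechanism: one thread asks another to stop by calling interrupt(), and the blocked thread sees an InterruptedException. Thread.sleep stands in for the blocking work here, and the class and method names are made up for illustration. The point is that the exception shows up in the interrupted thread, while the reason for the interrupt lives in whoever called interrupt().

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo, unrelated to the actual DataNode code.
class InterruptSketch {

    // Starts a worker that blocks, interrupts it from the calling thread,
    // and reports whether the worker observed an InterruptedException.
    static boolean runAndInterrupt() throws InterruptedException {
        AtomicBoolean sawInterrupt = new AtomicBoolean(false);
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000); // stands in for blocking work
            } catch (InterruptedException e) {
                // The worker only learns that it was interrupted, not why.
                sawInterrupt.set(true);
            }
        });
        worker.start();
        worker.interrupt(); // the "higher up" decision to abort
        worker.join();
        return sawInterrupt.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker saw InterruptedException: " + runAndInterrupt());
    }
}
```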
http://pastebin.com/ZmsyvcVc Here is the regionserver log, they all have
similar stuff,
On Thu, Mar 10, 2011 at 11:34 AM, Stack wrote:
What's in the regionserver logs? Please put up regionserver and
datanode excerpts.
Thanks Jack,
St.Ack
On Thu, Mar 10, 2011 at 10:31 AM, Jack Levin wrote:
> All was well, until this happened:
>
> http://pastebin.com/iM1niwrS
>
> and all regionservers went down. Is this an xciever issue?
>
>
> dfs.dat