Thanks for the reply, Josh. I am running 3 ZooKeeper servers.
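For reference, the majority-quorum arithmetic behind the "how many are you running" question can be sketched as below. The class and method names are illustrative only and are not part of ZooKeeper or Accumulo. The sketch assumes plain majority quorums, which is the default ZooKeeper behaviour.

```java
public class QuorumMath {
    /** Smallest number of servers that forms a strict majority of the ensemble. */
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** Number of servers that can be down while a majority is still reachable. */
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        int n = 3; // ensemble size mentioned in this thread
        System.out.println("ensemble=" + n
                + " quorum=" + quorumSize(n)
                + " toleratedFailures=" + toleratedFailures(n));
    }
}
```

By this arithmetic a 3-server ensemble should still have a quorum with one server down, so losing a single ZooKeeper server would not by itself be expected to make the cluster unreachable.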
On 02/24/2016 10:29 PM, Josh Elser wrote:
ZooKeeper is a funny system. This kind of ConnectionLossException is a
normal "state" that a ZooKeeper client can enter. We handle this
condition in Accumulo, retrying the operation (in this case, a
`create()`), after the client can reconnect to the ZooKeeper servers
in the background.
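As a rough illustration of the retry pattern described above (not Accumulo's actual code), here is a minimal sketch using only the JDK. `ConnectionLossStandIn` is a hypothetical stand-in for `KeeperException.ConnectionLossException`, and the `retry` helper, its attempt limit, and its backoff values are assumptions made for the example.

```java
import java.util.concurrent.Callable;

public class RetryOnConnectionLoss {
    /** Hypothetical stand-in for ZooKeeper's ConnectionLossException. */
    static class ConnectionLossStandIn extends Exception {
        ConnectionLossStandIn(String message) {
            super(message);
        }
    }

    /**
     * Runs op and retries it when it fails with a connection loss,
     * sleeping between attempts and doubling the sleep each time.
     * Any other exception, or a connection loss on the last attempt,
     * is rethrown to the caller.
     */
    static <T> T retry(Callable<T> op, int maxAttempts, long initialBackoffMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long backoffMs = initialBackoffMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return op.call();
            } catch (ConnectionLossStandIn e) {
                if (attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(backoffMs); // give the client time to reconnect
                backoffMs *= 2;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated operation: fails twice with a connection loss, then succeeds.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new ConnectionLossStandIn("simulated connection loss");
            }
            return "created";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

One caveat with this pattern: after a connection loss the client cannot tell whether the original request reached the server, so a real retry of a `create()` also has to cope with the node already existing.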
ConnectionLossExceptions can be indicative of over-saturation of your
nodes. A ZooKeeper client might lose its connection because it is
starved for CPU time. It can also indicate that the ZooKeeper servers
might be starved for resources.
* Check the ZooKeeper server logs for any errors about dropped
connections (maxClientCnxns)
* Make sure your servers running Accumulo are not running at 100%
total CPU usage and that there is free memory (no swapping).
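The first check above could be scripted along these lines. This is only a sketch: the marker string `Too many connections from` is what ZooKeeper 3.4.x servers log, to the best of my recollection, when a client address exceeds maxClientCnxns, so please verify the wording against your own version. The class name and the log path argument are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class ZkLogScan {
    // Assumed wording of the server-side warning; verify for your ZooKeeper version.
    static final String MARKER = "Too many connections from";

    /** Counts the log lines that contain the connection-limit warning. */
    static long countConnectionLimitWarnings(List<String> lines) {
        return lines.stream().filter(line -> line.contains(MARKER)).count();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java ZkLogScan <path-to-zookeeper-server-log>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        System.out.println(countConnectionLimitWarnings(lines) + " connection-limit warnings");
    }
}
```

A non-zero count would suggest that the servers are refusing connections because of the per-address limit rather than because of resource starvation.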
ACCUMULO-3336 is about a different ZooKeeper error condition called a
"session loss". This is when the entire ZooKeeper session needs to be
torn down and recreated. This only happens after prolonged pauses in
the client JVM, or when the ZooKeeper servers actively drop your
connections due to their internal configuration (maxClientCnxns). The
stack trace you copied is not a session loss error.
Are you saying that when a ZooKeeper server dies, you cannot use
Accumulo? How many are you running?
mohit.kaushik wrote:
Sent too early...
There is another exception I am getting frequently with ZooKeeper, which
is a bigger problem.
ACCUMULO-3336 <https://issues.apache.org/jira/browse/ACCUMULO-3336> says
it is still unresolved.
Saw (possibly) transient exception communicating with ZooKeeper
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /accumulo/f8708e0d-9238-41f5-b948-8f435fd01207/gc/lock
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
    at org.apache.accumulo.fate.zookeeper.ZooReader.getStatus(ZooReader.java:132)
    at org.apache.accumulo.fate.zookeeper.ZooLock.process(ZooLock.java:383)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
And the worst case is that whenever a ZooKeeper server goes down, the
cluster becomes unreachable for the time being; the ingest process halts
until it restarts. What do you suggest? I need to resolve these problems.
I do not want the ingest process to ever stop.
Thanks
Mohit kaushik
On 02/22/2016 12:06 PM, mohit.kaushik wrote:
I am facing the exception given below continuously. The count keeps
increasing every second (the current value is around 3000 on one server),
and I can see the exception on all 3 tablet servers.
ACCUMULO-2420 <https://issues.apache.org/jira/browse/ACCUMULO-2420>
says that this exception occurs when a client closes a connection
before a scan completes. But the connection is not closed: every thread
uses a common connection object to ingest and query. What, then, could
cause this exception?
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:197)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
    at org.apache.thrift.transport.TNonblockingSocket.read(TNonblockingSocket.java:141)
    at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.internalRead(AbstractNonblockingServer.java:537)
    at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.read(AbstractNonblockingServer.java:338)
    at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleRead(AbstractNonblockingServer.java:203)
    at org.apache.accumulo.server.rpc.CustomNonBlockingServer$SelectAcceptThread.select(CustomNonBlockingServer.java:228)
    at org.apache.accumulo.server.rpc.CustomNonBlockingServer$SelectAcceptThread.run(CustomNonBlockingServer.java:184)
Regards
Mohit kaushik