We have about a 30 node cluster running the latest C* 1.2 series DSE.  One 
datacenter uses VNodes and the other datacenter has VNodes Disabled (because it 
is running DSE-Seearch)

We have been replacing nodes in the VNode datacenter with faster ones and we 
have yet to have a successful decommission.  Every time we attempt to 
decommission a node we get an “Operation Timed Out” error and the decommission 
fails.  We keep retrying it and sometimes it will work and other times we will 
just give up and force the node removal.  It seems though, that all the data 
has streamed out of the node before the decommission fails.

What exactly does it need to read before leaving that would cause this?  We 
also have noticed that in several nodes after the removal that there are ghost 
entries for the removed node in the system.peers table and this doesn’t get 
removed until we restart Cassandra on that node.

Also, we have noticed that running repairs with VNodes is considerably slower. 
Is this a misconfiguration? Or is it expected that VNodes repairs will be slow?


Here is the stack trace from the decommission failure:

Exception in thread "main" java.lang.RuntimeException: 
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - 
received only 0 responses.
        at 
org.apache.cassandra.db.HintedHandOffManager.getHintsSlice(HintedHandOffManager.java:578)
        at 
org.apache.cassandra.db.HintedHandOffManager.listEndpointsPendingHints(HintedHandOffManager.java:528)
        at 
org.apache.cassandra.service.StorageService.streamHints(StorageService.java:2925)
        at 
org.apache.cassandra.service.StorageService.unbootstrap(StorageService.java:2905)
        at 
org.apache.cassandra.service.StorageService.decommission(StorageService.java:2866)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
        at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
        at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
        at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
        at 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
        at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
        at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
        at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
        at 
com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
        at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
        at 
javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
        at 
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
        at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
        at 
javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
        at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
        at sun.rmi.transport.Transport$1.run(Transport.java:177)
        at sun.rmi.transport.Transport$1.run(Transport.java:174)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
        at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
        at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
        at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation 
timed out - received only 0 responses.
        at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:105)
        at 
org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1213)
        at 
org.apache.cassandra.db.HintedHandOffManager.getHintsSlice(HintedHandOffManager.java:573)
        ... 39 more




Reply via email to