keith-turner commented on PR #5771:
URL: https://github.com/apache/accumulo/pull/5771#issuecomment-3140787431

   I ran all ITs matching `*Compact*IT`; all passed except for one 
metrics-related IT.  However, that failure seems to be a preexisting issue.  
Going to look into it and maybe open another PR.
   
   Manually tested the timeout by creating a table w/ 10K tablets, starting a 
compaction, and then repeatedly SIGSTOPing the tserver for brief periods until 
I saw the coordinator stuck trying to get a job from the tserver.  The 
coordinator did eventually time out and carry on.
   
   ```
   2025-07-31T16:56:32,378 [coordinator.CompactionCoordinator] WARN : Error from tserver localhost:9997 while trying to reserve compaction, trying next tserver
   org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: 240000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/localhost:57318 remote=/localhost:9997]
           at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:171) ~[libthrift-0.17.0.jar:0.17.0]
           at org.apache.thrift.transport.TTransport.readAll(TTransport.java:100) ~[libthrift-0.17.0.jar:0.17.0]
           at org.apache.thrift.transport.layered.TFramedTransport.readFrame(TFramedTransport.java:132) ~[libthrift-0.17.0.jar:0.17.0]
           at org.apache.thrift.transport.layered.TFramedTransport.read(TFramedTransport.java:100) ~[libthrift-0.17.0.jar:0.17.0]
           at org.apache.thrift.transport.TTransport.readAll(TTransport.java:100) ~[libthrift-0.17.0.jar:0.17.0]
           at org.apache.accumulo.core.clientImpl.ThriftTransportPool$CachedTTransport.readAll(ThriftTransportPool.java:701) ~[accumulo-core-2.1.4-SNAPSHOT.jar:2.1.4-SNAPSHOT]
           at org.apache.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:622) ~[libthrift-0.17.0.jar:0.17.0]
           at org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:479) ~[libthrift-0.17.0.jar:0.17.0]
           at org.apache.thrift.protocol.TProtocolDecorator.readMessageBegin(TProtocolDecorator.java:156) ~[libthrift-0.17.0.jar:0.17.0]
           at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79) ~[libthrift-0.17.0.jar:0.17.0]
           at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_reserveCompactionJob(TabletClientService.java:1002) ~[accumulo-core-2.1.4-SNAPSHOT.jar:2.1.4-SNAPSHOT]
           at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.reserveCompactionJob(TabletClientService.java:984) ~[accumulo-core-2.1.4-SNAPSHOT.jar:2.1.4-SNAPSHOT]
           at org.apache.accumulo.coordinator.CompactionCoordinator.getCompactionJob(CompactionCoordinator.java:568) ~[accumulo-compaction-coordinator-2.1.4-SNAPSHOT.jar:2.1.4-SNAPSHOT]
           at jdk.internal.reflect.GeneratedMethodAccessor1.invoke(Unknown Source) ~[?:?]
           at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
           at java.base/java.lang.reflect.Method.invoke(Method.java:569) ~[?:?]
           at org.apache.accumulo.core.trace.TraceUtil.lambda$wrapService$0(TraceUtil.java:204) ~[accumulo-core-2.1.4-SNAPSHOT.jar:2.1.4-SNAPSHOT]
           at jdk.proxy2/jdk.proxy2.$Proxy35.getCompactionJob(Unknown Source) ~[?:?]
           at org.apache.accumulo.core.compaction.thrift.CompactionCoordinatorService$Processor$getCompactionJob.getResult(CompactionCoordinatorService.java:666) ~[accumulo-core-2.1.4-SNAPSHOT.jar:2.1.4-SNAPSHOT]
           at org.apache.accumulo.core.compaction.thrift.CompactionCoordinatorService$Processor$getCompactionJob.getResult(CompactionCoordinatorService.java:643) ~[accumulo-core-2.1.4-SNAPSHOT.jar:2.1.4-SNAPSHOT]
           at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:40) ~[libthrift-0.17.0.jar:0.17.0]
           at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:40) ~[libthrift-0.17.0.jar:0.17.0]
           at org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:147) ~[libthrift-0.17.0.jar:0.17.0]
           at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:50) ~[accumulo-server-base-2.1.4-SNAPSHOT.jar:2.1.4-SNAPSHOT]
           at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:492) ~[libthrift-0.17.0.jar:0.17.0]
           at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:129) ~[accumulo-server-base-2.1.4-SNAPSHOT.jar:2.1.4-SNAPSHOT]
           at org.apache.thrift.server.Invocation.run(Invocation.java:18) ~[libthrift-0.17.0.jar:0.17.0]
           at org.apache.accumulo.core.trace.TraceWrappedRunnable.run(TraceWrappedRunnable.java:52) ~[accumulo-core-2.1.4-SNAPSHOT.jar:2.1.4-SNAPSHOT]
           at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
           at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
           at org.apache.accumulo.core.trace.TraceWrappedRunnable.run(TraceWrappedRunnable.java:52) ~[accumulo-core-2.1.4-SNAPSHOT.jar:2.1.4-SNAPSHOT]
           at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
   Caused by: java.net.SocketTimeoutException: 240000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/localhost:57318 remote=/localhost:9997]
           at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:163) ~[hadoop-client-api-3.4.0.jar:?]
           at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161) ~[hadoop-client-api-3.4.0.jar:?]
           at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131) ~[hadoop-client-api-3.4.0.jar:?]
           at java.base/java.io.FilterInputStream.read(FilterInputStream.java:132) ~[?:?]
           at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:244) ~[?:?]
           at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:284) ~[?:?]
           at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:343) ~[?:?]
           at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:169) ~[libthrift-0.17.0.jar:0.17.0]
           ... 31 more
   ```
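
For readers unfamiliar with the behavior in the warning above (a reserve call times out, the error is logged, and the next tserver is tried), here is a minimal Python sketch of that pattern. It is not the Accumulo code: `reserve_from_first_responsive`, the `reserve` callable, and the second server address are hypothetical names used only for illustration.

```python
import socket


def reserve_from_first_responsive(servers, reserve, log=print):
    """Return (server, job) from the first server that answers, else None.

    `reserve` is a hypothetical callable standing in for the RPC; it is
    expected to raise socket.timeout (an OSError) when the server hangs.
    """
    for server in servers:
        try:
            return server, reserve(server)
        except OSError as e:
            # Mirror the log line: record the failure and move on.
            log(f"Error from tserver {server} while trying to reserve "
                f"compaction, trying next tserver: {e}")
    return None


if __name__ == "__main__":
    def fake_reserve(server):
        # Simulate the paused tserver from the manual test.
        if server == "localhost:9997":
            raise socket.timeout("240000 millis timeout")
        return {"job": "example"}

    # localhost:9998 is an assumed second server, not one from the log.
    print(reserve_from_first_responsive(
        ["localhost:9997", "localhost:9998"], fake_reserve))
```

The point of the sketch is only that a bounded wait lets the caller give up on a hung server and continue, which is what the timeout in the trace allowed the coordinator to do.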

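The pause/resume step of the manual test described earlier could be scripted roughly as follows. This is a sketch under assumptions, not the procedure actually used: `pause_repeatedly`, the round count, and the durations are hypothetical, and a `sleep` process stands in for the tserver. It relies on POSIX signals, so it will not work on Windows.

```python
import os
import signal
import subprocess
import time


def pause_repeatedly(pid, rounds, pause_secs, run_secs):
    """Briefly freeze and resume a process `rounds` times.

    Returns the number of pause/resume cycles completed.
    """
    done = 0
    for _ in range(rounds):
        os.kill(pid, signal.SIGSTOP)  # freeze: process stops servicing requests
        time.sleep(pause_secs)
        os.kill(pid, signal.SIGCONT)  # resume so it can make progress again
        time.sleep(run_secs)
        done += 1
    return done


if __name__ == "__main__":
    # Stand-in target; in the manual test this would be the tserver's pid.
    target = subprocess.Popen(["sleep", "30"])
    try:
        cycles = pause_repeatedly(target.pid, rounds=3,
                                  pause_secs=0.2, run_secs=0.2)
        print(f"completed {cycles} pause/resume cycles")
    finally:
        target.kill()
        target.wait()
```

In practice the loop would run until the coordinator is observed waiting on the paused server, rather than for a fixed number of rounds.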
