Bill Farner created AURORA-100:
----------------------------------

Summary: Thrift connection appears to keep the scheduler from shutting down
Key: AURORA-100
URL: https://issues.apache.org/jira/browse/AURORA-100
Project: Aurora
Issue Type: Bug
Components: Scheduler
Reporter: Bill Farner
Priority: Minor

This originally cropped up when we were using thrift 0.5.0, so the code sample below will be stale:

Looking at TThreadPoolServer source, the behavior here makes sense. We use the default ExecutorService created by TThreadPoolServer, which uses non-daemon threads.

Here's TThreadPoolServer's serve loop:
{code}
public void serve() {
  try {
    serverTransport_.listen();
  } catch (TTransportException ttx) {
    LOGGER.error("Error occurred during listening.", ttx);
    return;
  }

  // Run the preServe event
  if (eventHandler_ != null) {
    eventHandler_.preServe();
  }

  stopped_ = false;
  setServing(true);
  while (!stopped_) {
    int failureCount = 0;
    try {
      TTransport client = serverTransport_.accept();
      WorkerProcess wp = new WorkerProcess(client);
      executorService_.execute(wp);
    } catch (TTransportException ttx) {
      if (!stopped_) {
        ++failureCount;
        LOGGER.warn("Transport error occurred during acceptance of message.", ttx);
      }
    }
  }

  executorService_.shutdown();

  // Loop until awaitTermination finally does return without a interrupted
  // exception. If we don't do this, then we'll shut down prematurely. We want
  // to let the executorService clear it's task queue, closing client sockets
  // appropriately.
  long timeoutMS = stopTimeoutUnit.toMillis(stopTimeoutVal);
  long now = System.currentTimeMillis();
  while (timeoutMS >= 0) {
    try {
      executorService_.awaitTermination(timeoutMS, TimeUnit.MILLISECONDS);
      break;
    } catch (InterruptedException ix) {
      long newnow = System.currentTimeMillis();
      timeoutMS -= (newnow - now);
      now = newnow;
    }
  }
  setServing(false);
}
{code}

The important bit, near the end, is that they never invoke executorService_.shutdownNow, which would terminate active connections. This is likely a deliberate design choice, and thrift 0.6.0+ allows callers to provide their own ExecutorService, which would give us some more control here.
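
For illustration, here is a minimal sketch of the kind of ExecutorService we might supply ourselves. It uses only java.util.concurrent. The class DaemonPoolSketch, its methods, and the "thrift-worker" thread name are hypothetical, and blockingTask() is just a stand-in for a worker holding a connection open. How the pool gets handed to TThreadPoolServer depends on the thrift version and is not shown. The idea is twofold: daemon worker threads cannot keep the JVM alive on their own, and a stop routine can fall back to shutdownNow when the graceful shutdown times out.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Thrift or Aurora.
class DaemonPoolSketch {

  // Builds worker threads flagged as daemons so they cannot keep the JVM alive.
  static ThreadFactory daemonFactory(final String prefix) {
    final AtomicInteger count = new AtomicInteger();
    return new ThreadFactory() {
      @Override
      public Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + "-" + count.incrementAndGet());
        t.setDaemon(true);
        return t;
      }
    };
  }

  // Cached pool backed by daemon threads; the thread name is an arbitrary placeholder.
  static ExecutorService newDaemonPool() {
    return Executors.newCachedThreadPool(daemonFactory("thrift-worker"));
  }

  // Stand-in for a worker serving a long-lived connection: blocks until interrupted.
  static Runnable blockingTask() {
    return new Runnable() {
      @Override
      public void run() {
        try {
          Thread.sleep(Long.MAX_VALUE);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    };
  }

  // Tries a graceful shutdown() first, then shutdownNow() if workers are still
  // busy after the timeout. Returns true if the pool terminated.
  static boolean stop(ExecutorService pool, long timeout, TimeUnit unit) {
    pool.shutdown();
    try {
      if (pool.awaitTermination(timeout, unit)) {
        return true;
      }
      pool.shutdownNow(); // interrupts workers that are still running
      return pool.awaitTermination(timeout, unit);
    } catch (InterruptedException e) {
      pool.shutdownNow();
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    ExecutorService pool = newDaemonPool();
    pool.execute(blockingTask());
    System.out.println("terminated: " + stop(pool, 200, TimeUnit.MILLISECONDS));
  }
}
```

One caveat: shutdownNow only interrupts the worker threads, and a thread blocked in a plain socket read may not respond to interruption. The simulated task above does respond, so a real connection handler could behave differently. That is part of why the sketch marks the threads as daemons rather than relying on shutdownNow alone.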