Hi,
(Cassandra 1.0.8)
I stumbled on a piece of code in Memtable that looks like it could hang a
thread forever:
    public void flushAndSignal(final CountDownLatch latch, ExecutorService writer, final ReplayPosition context)
    {
        writer.execute(new WrappedRunnable()
        {
            public void runMayThrow() throws IOException
            {
                cfs.flushLock.lock();
                try
                {
                    if (!cfs.isDropped())
                    {
                        SSTableReader sstable = writeSortedContents(context);
                        cfs.replaceFlushed(Memtable.this, sstable);
                    }
                }
                finally
                {
                    cfs.flushLock.unlock();
                }
                latch.countDown();
            }
        });
    }
If writeSortedContents throws an IOException, latch.countDown() is never
called. Wouldn't it be better to move latch.countDown() into the finally
block? We've had IOExceptions in writeSortedContents while taking a
snapshot, and a thread has now been hung for 4 days as a result. It is
stuck in ColumnFamilyStore.forceBlockingFlush waiting on future.get(),
because the latch.await() in ColumnFamilyStore.maybeSwitchMemtable never
returns.
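
To illustrate the change I have in mind, here is a minimal standalone
sketch. It is not the Cassandra code: the class FlushLatchSketch, its
static flushLock, and the always-failing writeSortedContents() are
hypothetical stand-ins I made up so the example runs on its own. It only
shows that with countDown() in the finally block the waiter is released
even when the write fails.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class FlushLatchSketch
{
    // Hypothetical stand-in for cfs.flushLock.
    static final ReentrantLock flushLock = new ReentrantLock();

    // Hypothetical stand-in for writeSortedContents(context); always fails.
    static void writeSortedContents() throws IOException
    {
        throw new IOException("simulated flush failure");
    }

    static void flushAndSignal(final CountDownLatch latch, ExecutorService writer)
    {
        writer.execute(new Runnable()
        {
            public void run()
            {
                flushLock.lock();
                try
                {
                    writeSortedContents();
                }
                catch (IOException e)
                {
                    // Simplification: this sketch only logs the failure.
                    System.err.println("flush failed: " + e.getMessage());
                }
                finally
                {
                    flushLock.unlock();
                    // Runs on success and on failure, so waiters are always released.
                    latch.countDown();
                }
            }
        });
    }

    // Returns true if the latch was released within the timeout.
    static boolean demo()
    {
        ExecutorService writer = Executors.newSingleThreadExecutor();
        CountDownLatch latch = new CountDownLatch(1);
        try
        {
            flushAndSignal(latch, writer);
            return latch.await(5, TimeUnit.SECONDS);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            return false;
        }
        finally
        {
            writer.shutdown();
        }
    }

    public static void main(String[] args)
    {
        System.out.println("latch released: " + demo());
    }
}
```

One caveat with this approach: once the latch is released on failure as
well, a waiter can no longer tell from the latch alone that the flush
succeeded, so whether it is safe depends on what the code after
latch.await() assumes.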
Regards
--
Mikael Wikblom
Software Architect
SiteVision AB
019-217058
mikael.wikb...@sitevision.se
http://www.sitevision.se