Hi David, Edouard,

Depending on your data model for event_data, you might want to consider upgrading so you can use DTCS (C* 2.0.11+).
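Once you are on a Cassandra version that actually ships DTCS, the switch itself is just a table property change. Here is a minimal sketch with the DataStax Java driver (the contact point and the DTCS option values are placeholders to tune; the keyspace/table name is taken from the compaction warning in your logs):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class SwitchToDtcs {
    public static void main(String[] args) {
        // Contact point is a placeholder; point it at one of your nodes.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        try {
            Session session = cluster.connect();
            // Keyspace/table come from the compaction warning below;
            // the DTCS option values are only starting points to tune.
            session.execute(
                "ALTER TABLE rgsupv.event_data WITH compaction = {"
                + " 'class': 'DateTieredCompactionStrategy',"
                + " 'base_time_seconds': '3600',"
                + " 'max_sstable_age_days': '30' }");
        } finally {
            cluster.close();
        }
    }
}

The same ALTER TABLE statement can of course be run straight from cqlsh; the driver is only used here for illustration.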
Basically, if those tombstones are due to a constant TTL and this is a time series, it could be a real improvement. See:

https://labs.spotify.com/2014/12/18/date-tiered-compaction/
http://www.datastax.com/dev/blog/datetieredcompactionstrategy

I am not sure this is related to your problem, but having 8904 tombstones read at once is pretty bad. Also, you might want to paginate your queries a bit, since it looks like you retrieve a lot of data at once (see the driver sketch after your quoted message below). Meanwhile, if you are using STCS, you can consider performing a major compaction on a regular basis (taking the downsides of major compaction into consideration).

C*heers,

Alain

2015-06-12 15:08 GMT+02:00 David CHARBONNIER <david.charbonn...@rgsystem.com>:

> Hi,
>
> We're using Cassandra 2.0.8.39 through DataStax Enterprise 4.5.1 and we're
> experiencing issues with the OpsCenter (version 5.1.3) Repair Service.
> When the Repair Service is running, we can see repairs timing out on a few
> ranges in OpsCenter's event log viewer. See the screenshot attached.
>
> On our Cassandra nodes, we can see a lot of these messages in the
> cassandra/system.log log file while a timeout shows up in OpsCenter:
>
> ERROR [Native-Transport-Requests:3372] 2015-06-12 02:22:33,231 ErrorMessage.java (line 222) Unexpected exception during request
> java.io.IOException: Connection reset by peer
>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>         at sun.nio.ch.SocketDispatcher.read(Unknown Source)
>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
>         at sun.nio.ch.IOUtil.read(Unknown Source)
>         at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
>         at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64)
>         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
>         at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
>         at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>         at java.lang.Thread.run(Unknown Source)
>
> You'll find attached an extract of the system.log file with some more
> information.
>
> Do you have any idea of what's happening?
>
> We suspect the timeouts happen because we have some tables with many
> tombstones, and a warning is sometimes triggered. We have edited the
> configuration so that queries only warn but still run until they
> encounter 1,000,000 tombstones.
>
> During a compaction, we also get warning messages telling us that we have
> a lot of tombstones:
>
> WARN [CompactionExecutor:1584] 2015-06-11 19:22:24,904 SliceQueryFilter.java (line 225) Read 8640 live and 8904 tombstoned cells in rgsupv.event_data (see tombstone_warn_threshold). 10000 columns was requested, slices=[-], delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}
>
> Do you think it's related to our first problem?
>
> Our cluster is configured as follows:
> - 8 nodes with Debian 7.8 x64
> - 16 GB of memory and 4 CPUs
> - 2 HDDs: one for the system, the other for the data directory
>
> Best regards,
>
> *David CHARBONNIER*
> Sysadmin
> T : +33 411 934 200
> david.charbonn...@rgsystem.com
> ZAC Aéroport
> 125 Impasse Adam Smith
> 34470 Pérols - France
> *www.rgsystem.com* <http://www.rgsystem.com/>
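Regarding the pagination point above: with the native protocol and a recent 2.x DataStax Java driver, setting a fetch size lets the driver pull rows back page by page instead of in one huge response. A rough sketch (the partition key "device_id", the literal value and the fetch size are invented for illustration; adapt them to your schema):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class PagedEventRead {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        try {
            Session session = cluster.connect();
            // "device_id" is an invented partition key; adjust to your schema.
            Statement stmt = new SimpleStatement(
                    "SELECT * FROM rgsupv.event_data WHERE device_id = 42");
            // Ask for 500 rows per page instead of the whole slice at once.
            stmt.setFetchSize(500);
            ResultSet rs = session.execute(stmt);
            for (Row row : rs) {
                // Further pages are fetched transparently while iterating.
                System.out.println(row);
            }
        } finally {
            cluster.close();
        }
    }
}

A smaller fetch size trades a few extra round trips for much smaller individual responses, which usually helps when wide rows or lots of tombstones are involved.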