Truncate causing subsequent timeout on KeyIterator?
Hi,

I'm running a bunch of integration tests using an embedded cassandra instance via the Cassandra Maven Plugin v1.0.0-1, using Hector v1.0-5. I've got an issue where one of the tests is using a StringKeyIterator to iterate over all the keys in a CF, but it gets TimedOutExceptions every time when trying to communicate with Cassandra; all the other tests using the same (Spring-wired) keyspace behave fine (stack trace below). A previous test is calling a cluster.truncate() to ensure an empty CF before each test, and it's this that seems to cause the problem - at least, commenting it out causes the other test to run fine.

Any ideas on what could be causing this? Both tests are using the same instance of Keyspace, autowired via Spring, and the same instance of Cluster in the same way. No exceptions are being thrown by the truncate operation - it completes successfully and does its job.

Thanks,

Conan

Stack trace:

[2012-09-26 18:59:53,002] [WARN ] [main] [m.p.c.c.HConnectionManager] Could not fullfill request on this host CassandraClient
[2012-09-26 18:59:53,003] [WARN ] [main] [m.p.c.c.HConnectionManager] Exception:
me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:35) ~[hector-core-1.0-5.jar:na]
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$3.execute(KeyspaceServiceImpl.java:163) ~[hector-core-1.0-5.jar:na]
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$3.execute(KeyspaceServiceImpl.java:145) ~[hector-core-1.0-5.jar:na]
    at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:103) ~[hector-core-1.0-5.jar:na]
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:258) ~[hector-core-1.0-5.jar:na]
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131) [hector-core-1.0-5.jar:na]
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getRangeSlices(KeyspaceServiceImpl.java:167) [hector-core-1.0-5.jar:na]
    at me.prettyprint.cassandra.model.thrift.ThriftRangeSlicesQuery$1.doInKeyspace(ThriftRangeSlicesQuery.java:66) [hector-core-1.0-5.jar:na]
    at me.prettyprint.cassandra.model.thrift.ThriftRangeSlicesQuery$1.doInKeyspace(ThriftRangeSlicesQuery.java:62) [hector-core-1.0-5.jar:na]
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20) [hector-core-1.0-5.jar:na]
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85) [hector-core-1.0-5.jar:na]
    at me.prettyprint.cassandra.model.thrift.ThriftRangeSlicesQuery.execute(ThriftRangeSlicesQuery.java:61) [hector-core-1.0-5.jar:na]
    at me.prettyprint.cassandra.service.KeyIterator.runQuery(KeyIterator.java:102) [hector-core-1.0-5.jar:na]
    ..
Caused by: org.apache.cassandra.thrift.TimedOutException: null
    at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12270) ~[cassandra-thrift-1.1.0.jar:1.1.0]
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78) ~[libthrift-0.7.0.jar:0.7.0]
    at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:683) ~[cassandra-thrift-1.1.0.jar:1.1.0]
    at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:667) ~[cassandra-thrift-1.1.0.jar:1.1.0]
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$3.execute(KeyspaceServiceImpl.java:151) ~[hector-core-1.0-5.jar:na]
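The message above does not establish why the range query times out after a truncate. One way to check whether the timeout is only transient is to retry the first query after the truncate with a growing pause. The sketch below is a diagnostic aid under that assumption, not a known fix; `RetryAfterTruncate` and `retry` are hypothetical names and are not part of Hector or Cassandra.

```java
import java.util.concurrent.Callable;

// Hypothetical helper for illustration; not part of Hector or Cassandra.
final class RetryAfterTruncate {

    // Runs op up to maxAttempts times, doubling the pause after each failure.
    // Rethrows the last exception if every attempt fails.
    static <T> T retry(Callable<T> op, int maxAttempts, long initialPauseMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long pause = initialPauseMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e; // e.g. a timeout from the first query after a truncate
                if (attempt < maxAttempts) {
                    Thread.sleep(pause);
                    pause *= 2;
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulated operation that fails twice before succeeding.
        final int[] calls = {0};
        String result = retry(new Callable<String>() {
            public String call() {
                if (++calls[0] < 3) {
                    throw new RuntimeException("simulated timeout");
                }
                return "ok";
            }
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts"); // prints "ok after 3 attempts"
    }
}
```

In a test, the `Callable` would wrap whatever code drives the key iteration. If the query succeeds on a later attempt, the timeout is probably a side effect of the truncate still settling; if it never succeeds, the cause lies elsewhere.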
Failing to delete commitlog at startup/shutdown (Windows)
Hi,

I'm experiencing a problem running a suite of integration tests on Windows 7, using Cassandra 1.0.9 and Java 1.6.0_31. A new cassandra instance is spun up for each test class and shut down afterwards, using the Maven Failsafe plugin. The problem is that the Commitlog file seems to be kept open, and so subsequent test classes fail to delete it. Here is the stack trace:

java.io.IOException: Failed to delete D:\amee.realtime.api\server\engine\tmp\var\lib\cassandra\commitlog\CommitLog-1335190398587.log
    at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:54)
    at org.apache.cassandra.io.util.FileUtils.deleteRecursive(FileUtils.java:220)
    at org.apache.cassandra.io.util.FileUtils.deleteRecursive(FileUtils.java:216)
    ...

I've tried to delete the file when shutting down Cassandra and before firing up a new one. I've tried setting the failsafe plugin's forkMode to both "once" and "always", so that it fires up a new JVM for each test or a single JVM for all tests; the results are similar. Debugging through the code takes me right down to the native method call in the windows filesystem class in the JVM, and an access denied error is returned; I'm also unable to delete it manually through Windows Explorer or a terminal window at that point (with the JVM suspended), and running Process Explorer indicates that a Java process has a handle open to that file.

I've read a number of posts and mails mentioning this problem and there is a JIRA saying a similar problem is fixed (https://issues.apache.org/jira/browse/CASSANDRA-1348). I've tried a number of things to clean up the Commitlog file after each test is complete, and have followed the recommendations made here (I'm also using Hector's EmbeddedServerHelper to start/stop Cassandra):

http://stackoverflow.com/questions/7944287/how-to-cleanup-embedded-cassandra-after-unittest

Does anyone have any ideas on how to avoid this issue? I don't have any way of knowing what it is that's holding onto this file other than a Java process.

Thanks!

Conan
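For the cleanup step described above, a small helper that retries the recursive delete and reports exactly which paths refused to go can help tell a briefly held lock apart from a leaked handle. This is only a sketch: `CleanupHelper` and its methods are hypothetical names, and retrying cannot succeed while a process genuinely still holds the file open, which is what Process Explorer suggests here.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical test-cleanup helper; not part of Cassandra or Hector.
final class CleanupHelper {

    // Deletes f and everything beneath it; returns every path that could not be deleted.
    static List<File> deleteRecursive(File f) {
        List<File> failed = new ArrayList<File>();
        File[] children = f.listFiles(); // null when f is a regular file or does not exist
        if (children != null) {
            for (File child : children) {
                failed.addAll(deleteRecursive(child));
            }
        }
        if (f.exists() && !f.delete()) {
            failed.add(f);
        }
        return failed;
    }

    // Retries the recursive delete, pausing between attempts, until it succeeds
    // or the attempts run out. Returns the paths still present after the last try.
    static List<File> deleteWithRetry(File root, int attempts, long pauseMillis)
            throws InterruptedException {
        List<File> failed = deleteRecursive(root);
        for (int i = 1; i < attempts && !failed.isEmpty(); i++) {
            Thread.sleep(pauseMillis);
            failed = deleteRecursive(root);
        }
        return failed;
    }
}
```

Calling `deleteWithRetry(commitlogDir, 5, 200)` from the test teardown and logging the returned list shows whether the same CommitLog file fails on every attempt. If it does, the handle is being held for the life of the JVM and the cleanup has to happen in a different process.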
Re: Failing to delete commitlog at startup/shutdown (Windows)
Hi Steve,

Thanks for your reply, sorry for the delay in getting back to you. We're actually doing something very similar already, using Hector's EmbeddedServerHelper (it's basically the same, maybe it came from the same code). Unfortunately whilst writing this our internet went down and I sometimes need to develop offline anyway, so using an external Cassandra instance isn't really an option. I've had a try using the maven-cassandra-plugin and don't seem to be having the problem any more, plus it's a neater solution anyway.

Conan

On 23 April 2012 15:51, Steve Neely wrote:

> We used a modified version of Ran's embedded Cassandra for a while:
> http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/
> which worked well for us. You have way more control over that.
>
> Recently, we switched to having a single Cassandra installation that runs
> all the time. Kind of like you'd treat a regular relational DB. Just fire
> up Cassandra, leave it running and point your tests at that instance. Seems
> like starting up your data store every time you execute integration tests
> will slow them down and isn't really helpful.
>
> BTW, you may want to scrub the test data out of Cassandra when your test
> suite finishes.
>
> -- Steve
Re: Keyspace lost after restart
Sorry, forgot to mention we're running Cassandra 1.1.

Conan

On 8 May 2012 17:51, Conan Cook wrote:

> Hi Cassandra Folk,
>
> We've experienced a problem a couple of times where Cassandra nodes lose a
> keyspace after a restart. We've restarted 2 out of 3 nodes, and they have
> both experienced this problem; clearly we're doing something wrong, but
> don't know what. The data files are all still there, as before, but the
> node can't see the keyspace (we only have one). The nodetool still says
> that each one is responsible for 33% of the keys, but the disk usage has
> dropped to a tiny amount on the nodes that we've restarted. I saw this:
>
> http://mail-archives.apache.org/mod_mbox/cassandra-user/201202.mbox/%3c4f3582e7.20...@conga.com%3E
>
> Seems to be exactly our problem, but we have not modified the
> cassandra.yaml - we have overwritten it through an automated process, and
> that happened just before restarting, but the contents did not change.
>
> Any ideas as to what might cause this, or how the keyspace can be restored
> (like I say, the data is all still in the data directory)?
>
> We're running in AWS.
>
> Thanks,
>
> Conan
Re: Keyspace lost after restart
Hi Aaron,

Thanks for getting back to me! Yes, I believe our keyspace was created prior to 1.1, and I think I also understand why you're asking that, having found this:

https://issues.apache.org/jira/browse/CASSANDRA-4219

Here's our startup log:

https://gist.github.com/2654155

There isn't much in there of interest however. It may well be the case that we created our keyspace, dropped it, then created it again. The dev responsible for setting it up is ill today, but I'll get back to you tomorrow with exact details of how it was originally created and whether we did definitely drop and re-create it.

Ta,

Conan

On 10 May 2012 11:43, aaron morton wrote:

> Was this a schema that was created prior to 1.1?
>
> What process are you using to create the schema?
>
> Can you share the logs from system startup? Up until it logs "Listening
> for thrift clients". (If they are long, please link to them.)
>
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
Re: [1.1] Can't create column
Just stumbled upon this by chance, is it related?

https://issues.apache.org/jira/browse/CASSANDRA-2497

Conan

On 6 May 2012 13:19, cyril auburtin wrote:

> So it's the comparator? I ask because I tried without the single quotes on
> column_name and got the same error.
> thanks
>
> 2012/5/6 Pierre Chalamet
>
>> create column family Post with comparator=UTF8Type and
>> column_metadata=[{column_name : user, validation_class : UTF8Type}] and
>> comment='bla';
>>
>> - Pierre
>>
>> From: cyril auburtin [mailto:cyril.aubur...@gmail.com]
>> Sent: Sunday, 6 May 2012 13:10
>> To: user@cassandra.apache.org
>> Subject: [1.1] Can't create column
>>
>> [default@ks] create column family Post with column_type = 'Standard' and
>> column_metadata = [{column_name: 'user', validation_class: 'UTF8Type'},
>> {column_name: 'type', validation_class: 'UTF8Type'}] and comment = 'bla';
>>
>> java.lang.RuntimeException:
>> org.apache.cassandra.db.marshal.MarshalException: cannot parse 'user' as
>> hex bytes
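The "cannot parse 'user' as hex bytes" message is consistent with the column family falling back to a bytes comparator, in which case the CLI appears to expect column names written as hex rather than as text. That reading is an assumption based on the error text and on Pierre's suggestion to set comparator=UTF8Type; the thread does not confirm it. The sketch below only shows what the two column names look like as hex; `HexColumnName` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of cassandra-cli.
final class HexColumnName {

    // Encodes the UTF-8 bytes of name as lowercase hex, two digits per byte.
    static String toHex(String name) {
        StringBuilder sb = new StringBuilder();
        for (byte b : name.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex("user")); // prints 75736572
        System.out.println(toHex("type")); // prints 74797065
    }
}
```

Declaring the comparator as UTF8Type, as in Pierre's statement, avoids having to write names in this form at all.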
Re: Keyspace lost after restart
Hi,

OK we're pretty sure we dropped and re-created the keyspace before restarting the Cassandra nodes during some testing (we've been migrating to a new cluster). The keyspace was created via the cli:

create keyspace m7
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {us-east: 3}
  and durable_writes = true;

I'm pretty confident that it's a result of the issue I spotted before:

https://issues.apache.org/jira/browse/CASSANDRA-4219

Does anyone know whether this also affected versions before 1.1.0? If not then we can just roll back until there's a fix; we're not using our cluster in production so we can afford to just bin it all and load it again. +1 for this being a major issue though, the fact that you can't see it until you restart a node makes it quite dangerous, and that node is lost when it occurs (I also haven't been able to restore the schema in any way).

Thanks very much,

Conan
Re: Keyspace lost after restart
Hi Jeff,

Great! We'll roll back for now, thanks for letting me know.

Conan

On 11 May 2012 10:18, Jeff Williams wrote:

> Conan,
>
> Good to see I'm not alone in this! I just set up a fresh test cluster. I
> first did a fresh install of 1.1.0 and was able to replicate the issue. I
> then did a fresh install using 1.0.10 and didn't see the issue. So it looks
> like rolling back to 1.0.10 could be the answer for now.
>
> Jeff
Re: Migrating from a windows cluster to a linux cluster.
Hi,

We were trying to do a similar kind of migration (to a new cluster, no downtime) in order to remove a legacy OrderedPartitioner limitation. In the end we were allowed enough downtime to migrate, but originally we were proposing a similar solution based around deploying an update to the application to write to two clusters simultaneously, and a background copy of older data in some way.

I'd love to hear how the migration went, and whether there were any (un)expected hurdles along the way!

Thanks,

Conan

On 24 May 2012 23:56, Rob Coli wrote:

> On Thu, May 24, 2012 at 12:44 PM, Steve Neely wrote:
> > It also seems like a dark deployment of your new cluster is a great method
> > for testing the Linux-based systems before switching your mission critical
> > traffic over. Monitor them for a while with real traffic and you can have
> > confidence that they'll function correctly when you perform the switchover.
>
> FWIW, I would love to see graphs which show their compared performance
> under identical write load and then show the cut-over point for reads
> between the two clusters. My hypothesis is that your linux cluster
> will magically be much more performant/less loaded due to many
> linux-specific optimizations in Cassandra, but I'd dig seeing this
> illustrated in an apples to apples sense with real app traffic.
>
> =Rob
>
> --
> =Robert Coli
> AIM>ALK - rc...@palominodb.com
> YAHOO - rcoli.palominob
> SKYPE - rcoli_palominodb
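The approach discussed in this message (write to both clusters, copy older data in the background, then cut reads over) can be sketched roughly as follows. Everything here is hypothetical: `KeyValueStore` stands in for whatever client wrapper the application uses and is not a Hector or Cassandra API, and the in-memory store exists only so the sketch runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a dual-write migration; not a Hector or Cassandra API.
final class DualWriteSketch {

    // Minimal stand-in for the application's data access layer.
    interface KeyValueStore {
        void put(String key, String value);
        String get(String key);
    }

    // In-memory store used only to make the sketch runnable.
    static final class InMemoryStore implements KeyValueStore {
        private final Map<String, String> data = new HashMap<String, String>();
        public synchronized void put(String key, String value) { data.put(key, value); }
        public synchronized String get(String key) { return data.get(key); }
        synchronized Map<String, String> snapshot() { return new HashMap<String, String>(data); }
    }

    // Sends every write to both clusters; reads come from the old cluster until cut-over.
    static final class DualWriteStore implements KeyValueStore {
        private final KeyValueStore oldCluster;
        private final KeyValueStore newCluster;
        private volatile boolean readFromNew = false;

        DualWriteStore(KeyValueStore oldCluster, KeyValueStore newCluster) {
            this.oldCluster = oldCluster;
            this.newCluster = newCluster;
        }

        public void put(String key, String value) {
            oldCluster.put(key, value);
            newCluster.put(key, value);
        }

        public String get(String key) {
            return (readFromNew ? newCluster : oldCluster).get(key);
        }

        void cutOverReads() { readFromNew = true; }
    }

    // Background copy: only fills keys the new cluster lacks, so values already
    // dual-written are not overwritten by older data. Returns the number copied.
    static int backfill(Map<String, String> oldData, KeyValueStore newCluster) {
        int copied = 0;
        for (Map.Entry<String, String> e : oldData.entrySet()) {
            if (newCluster.get(e.getKey()) == null) {
                newCluster.put(e.getKey(), e.getValue());
                copied++;
            }
        }
        return copied;
    }
}
```

The check-then-put in `backfill` is not safe against a write arriving between the two calls. A real migration would more likely copy rows with their original write timestamps so the store can resolve conflicts itself, but that is beyond this sketch.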