Truncate causing subsequent timeout on KeyIterator?

2012-09-26 Thread Conan Cook
Hi,

I'm running a bunch of integration tests using an embedded Cassandra
instance via the Cassandra Maven Plugin v1.0.0-1, using Hector v1.0-5.
 I've got an issue where one of the tests uses a StringKeyIterator to
iterate over all the keys in a CF, but it gets a TimedOutException every
time it tries to communicate with Cassandra; all the other tests using
the same (Spring-wired) keyspace behave fine (stack trace below).  A
previous test calls cluster.truncate() to ensure an empty CF before each
test, and it's this that seems to cause the problem - at least,
commenting it out lets the other test run fine.

Any ideas on what could be causing this?  Both tests are using the same
instance of Keyspace, autowired via Spring, and the same instance of
Cluster in the same way.  No exceptions are being thrown by the truncate
operation - it completes successfully and does its job.
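
To make the setup concrete, the two tests boil down to something like the
sketch below.  This is illustrative only: the cluster/keyspace/CF names and
host:port are placeholders, and it uses Hector's KeyIterator directly
(assuming its (Keyspace, column family, Serializer) constructor) where our
real tests go through a StringKeyIterator and Spring wiring.

    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.cassandra.service.KeyIterator;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.factory.HFactory;

    public class TruncateThenIterate {

        public static void main(String[] args) {
            // Shared by both tests (Spring-wired in our case).
            Cluster cluster = HFactory.getOrCreateCluster("TestCluster", "localhost:9160");
            Keyspace keyspace = HFactory.createKeyspace("TestKeyspace", cluster);

            // What the earlier test does to guarantee an empty CF.
            cluster.truncate("TestKeyspace", "MyColumnFamily");

            // What the failing test does next: iterate every key in the CF.
            // For us this is the call that ends in HTimedOutException.
            KeyIterator<String> keys =
                    new KeyIterator<String>(keyspace, "MyColumnFamily", StringSerializer.get());
            for (String key : keys) {
                System.out.println(key);
            }
        }
    }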

Thanks,


Conan

Stack trace:

[2012-09-26 18:59:53,002] [WARN ] [main] [m.p.c.c.HConnectionManager] Could
not fullfill request on this host CassandraClient
[2012-09-26 18:59:53,003] [WARN ] [main] [m.p.c.c.HConnectionManager]
Exception:
me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
at
me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:35)
~[hector-core-1.0-5.jar:na]
at
me.prettyprint.cassandra.service.KeyspaceServiceImpl$3.execute(KeyspaceServiceImpl.java:163)
~[hector-core-1.0-5.jar:na]
at
me.prettyprint.cassandra.service.KeyspaceServiceImpl$3.execute(KeyspaceServiceImpl.java:145)
~[hector-core-1.0-5.jar:na]
at
me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:103)
~[hector-core-1.0-5.jar:na]
at
me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:258)
~[hector-core-1.0-5.jar:na]
at
me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
[hector-core-1.0-5.jar:na]
at
me.prettyprint.cassandra.service.KeyspaceServiceImpl.getRangeSlices(KeyspaceServiceImpl.java:167)
[hector-core-1.0-5.jar:na]
at
me.prettyprint.cassandra.model.thrift.ThriftRangeSlicesQuery$1.doInKeyspace(ThriftRangeSlicesQuery.java:66)
[hector-core-1.0-5.jar:na]
at
me.prettyprint.cassandra.model.thrift.ThriftRangeSlicesQuery$1.doInKeyspace(ThriftRangeSlicesQuery.java:62)
[hector-core-1.0-5.jar:na]
at
me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
[hector-core-1.0-5.jar:na]
at
me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
[hector-core-1.0-5.jar:na]
at
me.prettyprint.cassandra.model.thrift.ThriftRangeSlicesQuery.execute(ThriftRangeSlicesQuery.java:61)
[hector-core-1.0-5.jar:na]
at
me.prettyprint.cassandra.service.KeyIterator.runQuery(KeyIterator.java:102)
[hector-core-1.0-5.jar:na]

..

Caused by: org.apache.cassandra.thrift.TimedOutException: null
at
org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12270)
~[cassandra-thrift-1.1.0.jar:1.1.0]
at
org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
~[libthrift-0.7.0.jar:0.7.0]
at
org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:683)
~[cassandra-thrift-1.1.0.jar:1.1.0]
at
org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:667)
~[cassandra-thrift-1.1.0.jar:1.1.0]
at
me.prettyprint.cassandra.service.KeyspaceServiceImpl$3.execute(KeyspaceServiceImpl.java:151)
~[hector-core-1.0-5.jar:na]


Failing to delete commitlog at startup/shutdown (Windows)

2012-04-23 Thread Conan Cook
Hi,

I'm experiencing a problem running a suite of integration tests on Windows
7, using Cassandra 1.0.9 and Java 1.6.0_31.  A new Cassandra instance is
spun up for each test class and shut down afterwards, using the Maven
Failsafe plugin.  The problem is that the Commitlog file seems to be kept
open, and so subsequent test classes fail to delete it.  Here is the stack
trace:

java.io.IOException: Failed to delete
D:\amee.realtime.api\server\engine\tmp\var\lib\cassandra\commitlog\CommitLog-1335190398587.log
at
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:54)
 at
org.apache.cassandra.io.util.FileUtils.deleteRecursive(FileUtils.java:220)
at
org.apache.cassandra.io.util.FileUtils.deleteRecursive(FileUtils.java:216)
...

I've tried to delete the file when shutting down Cassandra and before
firing up a new one.  I've tried setting the Failsafe plugin's forkMode to
both "once" and "always", so that it fires up a new JVM for each test or a
single JVM for all tests; the results are similar.  Debugging through the
code takes me right down to the native method call in the Windows
filesystem class in the JVM, and an access denied error is returned; I'm
also unable to delete it manually through Windows Explorer or a terminal
window at that point (with the JVM suspended), and running Process Explorer
indicates that a Java process has a handle open to that file.
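
For what it's worth, the cleanup I've been attempting between tests amounts
to deleting the commitlog files by hand, roughly along the lines of this
sketch (illustrative only - the path, retry count and sleep are placeholders,
and the retry loop is just one way of phrasing it, not our exact code):

    import java.io.File;

    public class CommitLogCleaner {

        // Try to delete the embedded instance's commitlog files, retrying a
        // few times in case the JVM is still releasing the file handle.
        public static void deleteCommitLog(File commitLogDir) throws InterruptedException {
            File[] logs = commitLogDir.listFiles();
            if (logs == null) {
                return; // directory missing or unreadable - nothing to do
            }
            for (File log : logs) {
                boolean deleted = false;
                for (int attempt = 0; attempt < 10 && !deleted; attempt++) {
                    deleted = log.delete();
                    if (!deleted) {
                        Thread.sleep(500); // give the handle a chance to be released
                    }
                }
                if (!deleted) {
                    // This is the failure we keep hitting on Windows 7.
                    System.err.println("Could not delete " + log.getAbsolutePath());
                }
            }
        }

        public static void main(String[] args) throws InterruptedException {
            deleteCommitLog(new File("tmp/var/lib/cassandra/commitlog"));
        }
    }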

I've read a number of posts and mails mentioning this problem and there is
a JIRA saying a similar problem is fixed (
https://issues.apache.org/jira/browse/CASSANDRA-1348).  I've tried a number
of things to clean up the Commitlog file after each test is complete, and
have followed the recommendations made here (I'm also using Hector's
EmbeddedServerHelper to start/stop Cassandra):
http://stackoverflow.com/questions/7944287/how-to-cleanup-embedded-cassandra-after-unittest

Does anyone have any ideas on how to avoid this issue?  I don't have any
way of knowing what it is that's holding onto this file other than a Java
process.

Thanks!


Conan


Re: Failing to delete commitlog at startup/shutdown (Windows)

2012-05-08 Thread Conan Cook
Hi Steve,

Thanks for your reply, and sorry for the delay in getting back to you.
We're actually doing something very similar already, using Hector's
EmbeddedServerHelper (it's basically the same, maybe it came from the same
code).  Unfortunately, whilst I was writing this our internet went down, and
I sometimes need to develop offline anyway, so using an external Cassandra
instance isn't really an option.

I've tried the maven-cassandra-plugin and don't seem to be hitting the
problem any more, plus it's a neater solution anyway.

Conan

On 23 April 2012 15:51, Steve Neely  wrote:

> We used a modified version of Ran's embedded Cassandra for a while:
> http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/
> which worked well for us. You have way more control over that.
>
> Recently, we switched to having a single Cassandra installation that runs
> all the time. Kind of like you'd treat a regular relational DB. Just fire
> up Cassandra, leave it running and point your tests at that instance. Seems
> like starting up your data store every time you execute integration tests
> will slow them down and isn't really helpful.
>
> BTW, you may want to scrub the test data out of Cassandra when your test
> suite finishes.
>
> -- Steve
>
>
>
> On Mon, Apr 23, 2012 at 8:41 AM, Conan Cook  wrote:
>
>> Hi,
>>
>> I'm experiencing a problem running a suite of integration tests on
>> Windows 7, using Cassandra 1.0.9 and Java 1.6.0_31.  A new Cassandra
>> instance is spun up for each test class and shut down afterwards, using the
>> Maven Failsafe plugin.  The problem is that the Commitlog file seems to be
>> kept open, and so subsequent test classes fail to delete it.  Here is the
>> stack trace:
>>
>> java.io.IOException: Failed to delete
>> D:\amee.realtime.api\server\engine\tmp\var\lib\cassandra\commitlog\CommitLog-1335190398587.log
>> at
>> org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:54)
>>  at
>> org.apache.cassandra.io.util.FileUtils.deleteRecursive(FileUtils.java:220)
>> at
>> org.apache.cassandra.io.util.FileUtils.deleteRecursive(FileUtils.java:216)
>> ...
>>
>> I've tried to delete the file when shutting down Cassandra and before
>> firing up a new one.  I've tried setting the Failsafe plugin's forkMode to
>> both "once" and "always", so that it fires up a new JVM for each test or a
>> single JVM for all tests; the results are similar.  Debugging through the
>> code takes me right down to the native method call in the Windows
>> filesystem class in the JVM, and an access denied error is returned; I'm
>> also unable to delete it manually through Windows Explorer or a terminal
>> window at that point (with the JVM suspended), and running Process Explorer
>> indicates that a Java process has a handle open to that file.
>>
>> I've read a number of posts and mails mentioning this problem and there
>> is a JIRA saying a similar problem is fixed (
>> https://issues.apache.org/jira/browse/CASSANDRA-1348).  I've tried a
>> number of things to clean up the Commitlog file after each test is
>> complete, and have followed the recommendations made here (I'm also using
>> Hector's EmbeddedServerHelper to start/stop Cassandra):
>> http://stackoverflow.com/questions/7944287/how-to-cleanup-embedded-cassandra-after-unittest
>>
>> Does anyone have any ideas on how to avoid this issue?  I don't have any
>> way of knowing what it is that's holding onto this file other than a Java
>> process.
>>
>> Thanks!
>>
>>
>> Conan
>>
>>
>


Re: Keyspace lost after restart

2012-05-09 Thread Conan Cook
Sorry, forgot to mention we're running Cassandra 1.1.

Conan

On 8 May 2012 17:51, Conan Cook  wrote:

> Hi Cassandra Folk,
>
> We've experienced a problem a couple of times where Cassandra nodes lose a
> keyspace after a restart.  We've restarted 2 out of 3 nodes, and they have
> both experienced this problem; clearly we're doing something wrong, but
> don't know what.  The data files are all still there, as before, but the
> node can't see the keyspace (we only have one).  The nodetool still says
> that each one is responsible for 33% of the keys, but the disk usage has
> dropped to a tiny amount on the nodes that we've restarted.  I saw this:
>
>
> http://mail-archives.apache.org/mod_mbox/cassandra-user/201202.mbox/%3c4f3582e7.20...@conga.com%3E
>
> Seems to be exactly our problem, but we have not modified the
> cassandra.yaml - we have overwritten it through an automated process, and
> that happened just before restarting, but the contents did not change.
>
> Any ideas as to what might cause this, or how the keyspace can be restored
> (like I say, the data is all still in the data directory)?
>
> We're running in AWS.
>
> Thanks,
>
>
> Conan
>


Re: Keyspace lost after restart

2012-05-10 Thread Conan Cook
Hi Aaron,

Thanks for getting back to me!  Yes, I believe our keyspace was created
prior to 1.1, and I think I also understand why you're asking that, having
found this:

https://issues.apache.org/jira/browse/CASSANDRA-4219

Here's our startup log:

https://gist.github.com/2654155

There isn't much of interest in there, however.  It may well be the case
that we created our keyspace, dropped it, then created it again.  The dev
responsible for setting it up is ill today, but I'll get back to you
tomorrow with exact details of how it was originally created and whether we
did definitely drop and re-create it.

Ta,

Conan


On 10 May 2012 11:43, aaron morton  wrote:

> Was this a schema that was created prior to 1.1 ?
>
> What process are you using to create the schema ?
>
> Can you share the logs from system startup ? Up until it logs "Listening
> for thrift clients". (if they are long please link to them)
>
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 10/05/2012, at 1:04 AM, Conan Cook wrote:
>
> Sorry, forgot to mention we're running Cassandra 1.1.
>
> Conan
>
> On 8 May 2012 17:51, Conan Cook  wrote:
>
>> Hi Cassandra Folk,
>>
>> We've experienced a problem a couple of times where Cassandra nodes lose
>> a keyspace after a restart.  We've restarted 2 out of 3 nodes, and they
>> have both experienced this problem; clearly we're doing something wrong,
>> but don't know what.  The data files are all still there, as before, but
>> the node can't see the keyspace (we only have one).  The nodetool still
>> says that each one is responsible for 33% of the keys, but the disk usage
>> has dropped to a tiny amount on the nodes that we've restarted.  I saw this:
>>
>>
>> http://mail-archives.apache.org/mod_mbox/cassandra-user/201202.mbox/%3c4f3582e7.20...@conga.com%3E
>>
>> Seems to be exactly our problem, but we have not modified the
>> cassandra.yaml - we have overwritten it through an automated process, and
>> that happened just before restarting, but the contents did not change.
>>
>> Any ideas as to what might cause this, or how the keyspace can be
>> restored (like I say, the data is all still in the data directory)?
>>
>> We're running in AWS.
>>
>> Thanks,
>>
>>
>> Conan
>>
>
>
>


Re: [1.1] Can't create column

2012-05-10 Thread Conan Cook
Just stumbled upon this by chance, is it related?

https://issues.apache.org/jira/browse/CASSANDRA-2497

Conan

On 6 May 2012 13:19, cyril auburtin  wrote:

> So it's the comparator? Because I tried without the single quotes on
> column_name and got the same error.
> Thanks
>
>
> 2012/5/6 Pierre Chalamet 
>
>> create column family Post with comparator=UTF8Type and
>> column_metadata=[{column_name : user, validation_class : UTF8Type}] and
>> comment='bla';
>>
>> - Pierre
>>
>> *From:* cyril auburtin [mailto:cyril.aubur...@gmail.com]
>> *Sent:* Sunday 6 May 2012 13:10
>> *To:* user@cassandra.apache.org
>> *Subject:* [1.1] Can't create column
>>
>> [default@ks] create column family Post with column_type = 'Standard' and
>> column_metadata = [{column_name: 'user', validation_class: 'UTF8Type'},
>> {column_name: 'type', validation_class: 'UTF8Type'}] and comment = 'bla';
>> 
>>
>> java.lang.RuntimeException:
>> org.apache.cassandra.db.marshal.MarshalException: cannot parse 'user' as
>> hex bytes
>>
>
>


Re: Keyspace lost after restart

2012-05-11 Thread Conan Cook
Hi,

OK we're pretty sure we dropped and re-created the keyspace before
restarting the Cassandra nodes during some testing (we've been migrating to
a new cluster).  The keyspace was created via the CLI:

create keyspace m7
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {us-east: 3}
  and durable_writes = true;


I'm pretty confident that it's a result of the issue I spotted before:

https://issues.apache.org/jira/browse/CASSANDRA-4219

Does anyone know whether this also affected versions before 1.1.0?  If not
then we can just roll back until there's a fix; we're not using our cluster
in production so we can afford to just bin it all and load it again.  +1
for this being a major issue though: the fact that you can't see it until
you restart a node makes it quite dangerous, and that node is lost when it
occurs (I also haven't been able to restore the schema in any way).

Thanks very much,


Conan



On 10 May 2012 17:15, Conan Cook  wrote:

> Hi Aaron,
>
> Thanks for getting back to me!  Yes, I believe our keyspace was created
> prior to 1.1, and I think I also understand why you're asking that, having
> found this:
>
> https://issues.apache.org/jira/browse/CASSANDRA-4219
>
> Here's our startup log:
>
> https://gist.github.com/2654155
>
> There isn't much in there of interest however.  It may well be the case
> that we created our keyspace, dropped it, then created it again.  The dev
> responsible for setting it up is ill today, but I'll get back to you
> tomorrow with exact details of how it was originally created and whether we
> did definitely drop and re-create it.
>
> Ta,
>
> Conan
>
>
> On 10 May 2012 11:43, aaron morton  wrote:
>
>> Was this a schema that was created prior to 1.1 ?
>>
>> What process are you using to create the schema ?
>>
>> Can you share the logs from system startup ? Up until it logs "Listening
>> for thrift clients". (if they are long please link to them)
>>
>> Cheers
>>
>>   -
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 10/05/2012, at 1:04 AM, Conan Cook wrote:
>>
>> Sorry, forgot to mention we're running Cassandra 1.1.
>>
>> Conan
>>
>> On 8 May 2012 17:51, Conan Cook  wrote:
>>
>>> Hi Cassandra Folk,
>>>
>>> We've experienced a problem a couple of times where Cassandra nodes lose
>>> a keyspace after a restart.  We've restarted 2 out of 3 nodes, and they
>>> have both experienced this problem; clearly we're doing something wrong,
>>> but don't know what.  The data files are all still there, as before, but
>>> the node can't see the keyspace (we only have one).  The nodetool still
>>> says that each one is responsible for 33% of the keys, but the disk usage
>>> has dropped to a tiny amount on the nodes that we've restarted.  I saw this:
>>>
>>>
>>> http://mail-archives.apache.org/mod_mbox/cassandra-user/201202.mbox/%3c4f3582e7.20...@conga.com%3E
>>>
>>> Seems to be exactly our problem, but we have not modified the
>>> cassandra.yaml - we have overwritten it through an automated process, and
>>> that happened just before restarting, but the contents did not change.
>>>
>>> Any ideas as to what might cause this, or how the keyspace can be
>>> restored (like I say, the data is all still in the data directory)?
>>>
>>> We're running in AWS.
>>>
>>> Thanks,
>>>
>>>
>>> Conan
>>>
>>
>>
>>
>


Re: Keyspace lost after restart

2012-05-11 Thread Conan Cook
Hi Jeff,

Great!  We'll roll back for now, thanks for letting me know.

Conan

On 11 May 2012 10:18, Jeff Williams  wrote:

> Conan,
>
> Good to see I'm not alone in this! I just set up a fresh test cluster. I
> first did a fresh install of 1.1.0 and was able to replicate the issue. I
> then did a fresh install using 1.0.10 and didn't see the issue. So it looks
> like rolling back to 1.0.10 could be the answer for now.
>
> Jeff
>
> On May 11, 2012, at 10:40 AM, Conan Cook wrote:
>
> Hi,
>
> OK we're pretty sure we dropped and re-created the keyspace before
> restarting the Cassandra nodes during some testing (we've been migrating to
> a new cluster).  The keyspace was created via the cli:
>
>
> create keyspace m7
>
>   with placement_strategy = 'NetworkTopologyStrategy'
>
>   and strategy_options = {us-east: 3}
>
>   and durable_writes = true;
>
>
> I'm pretty confident that it's a result of the issue I spotted before:
>
> https://issues.apache.org/jira/browse/CASSANDRA-4219
>
> Does anyone know whether this also affected versions before 1.1.0?  If not
> then we can just roll back until there's a fix; we're not using our cluster
> in production so we can afford to just bin it all and load it again.  +1
> for this being a major issue though: the fact that you can't see it until
> you restart a node makes it quite dangerous, and that node is lost when it
> occurs (I also haven't been able to restore the schema in any way).
>
> Thanks very much,
>
>
> Conan
>
>
>
> On 10 May 2012 17:15, Conan Cook  wrote:
>
>> Hi Aaron,
>>
>> Thanks for getting back to me!  Yes, I believe our keyspace was created
>> prior to 1.1, and I think I also understand why you're asking that, having
>> found this:
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-4219
>>
>> Here's our startup log:
>>
>> https://gist.github.com/2654155
>>
>> There isn't much in there of interest however.  It may well be the case
>> that we created our keyspace, dropped it, then created it again.  The dev
>> responsible for setting it up is ill today, but I'll get back to you
>> tomorrow with exact details of how it was originally created and whether we
>> did definitely drop and re-create it.
>>
>> Ta,
>>
>> Conan
>>
>>
>> On 10 May 2012 11:43, aaron morton  wrote:
>>
>>> Was this a schema that was created prior to 1.1 ?
>>>
>>> What process are you using to create the schema ?
>>>
>>> Can you share the logs from system startup ? Up until it logs "Listening
>>> for thrift clients". (if they are long please link to them)
>>>
>>> Cheers
>>>
>>>   -
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 10/05/2012, at 1:04 AM, Conan Cook wrote:
>>>
>>> Sorry, forgot to mention we're running Cassandra 1.1.
>>>
>>> Conan
>>>
>>> On 8 May 2012 17:51, Conan Cook  wrote:
>>>
>>>> Hi Cassandra Folk,
>>>>
>>>> We've experienced a problem a couple of times where Cassandra nodes
>>>> lose a keyspace after a restart.  We've restarted 2 out of 3 nodes, and
>>>> they have both experienced this problem; clearly we're doing something
>>>> wrong, but don't know what.  The data files are all still there, as before,
>>>> but the node can't see the keyspace (we only have one).  The nodetool
>>>> still says that each one is responsible for 33% of the keys, but the disk
>>>> usage has dropped to a tiny amount on the nodes that we've restarted.  I
>>>> saw this:
>>>>
>>>>
>>>> http://mail-archives.apache.org/mod_mbox/cassandra-user/201202.mbox/%3c4f3582e7.20...@conga.com%3E
>>>>
>>>> Seems to be exactly our problem, but we have not modified the
>>>> cassandra.yaml - we have overwritten it through an automated process, and
>>>> that happened just before restarting, but the contents did not change.
>>>>
>>>> Any ideas as to what might cause this, or how the keyspace can be
>>>> restored (like I say, the data is all still in the data directory)?
>>>>
>>>> We're running in AWS.
>>>>
>>>> Thanks,
>>>>
>>>>
>>>> Conan
>>>>
>>>
>>>
>>>
>>
>
>


Re: Migrating from a windows cluster to a linux cluster.

2012-05-29 Thread Conan Cook
Hi,

We were trying to do a similar kind of migration (to a new cluster, no
downtime) in order to remove a legacy OrderedPartitioner limitation.  In
the end we were allowed enough downtime to migrate, but originally we
proposed a similar solution: deploy an update to the application that
writes to both clusters simultaneously, plus a background copy of the
older data.
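
We never ended up building the dual-write piece, but the idea was roughly
the sketch below (Hector, with placeholder cluster/keyspace/CF names; error
handling and the back-fill job are glossed over entirely):

    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.mutation.Mutator;

    public class DualWriteDao {

        private final Keyspace oldKeyspace;
        private final Keyspace newKeyspace;

        public DualWriteDao() {
            Cluster oldCluster = HFactory.getOrCreateCluster("OldCluster", "old-host:9160");
            Cluster newCluster = HFactory.getOrCreateCluster("NewCluster", "new-host:9160");
            oldKeyspace = HFactory.createKeyspace("MyKeyspace", oldCluster);
            newKeyspace = HFactory.createKeyspace("MyKeyspace", newCluster);
        }

        // Send every live write to both clusters so the new one keeps up with
        // current traffic while older rows are back-filled separately.
        public void write(String rowKey, String columnName, String value) {
            StringSerializer ss = StringSerializer.get();

            Mutator<String> oldMutator = HFactory.createMutator(oldKeyspace, ss);
            oldMutator.insert(rowKey, "MyColumnFamily",
                    HFactory.createStringColumn(columnName, value));

            Mutator<String> newMutator = HFactory.createMutator(newKeyspace, ss);
            newMutator.insert(rowKey, "MyColumnFamily",
                    HFactory.createStringColumn(columnName, value));
        }
    }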

I'd love to hear how the migration went, and whether there were any
(un)expected hurdles along the way!

Thanks,


Conan

On 24 May 2012 23:56, Rob Coli  wrote:

> On Thu, May 24, 2012 at 12:44 PM, Steve Neely  wrote:
> > It also seems like a dark deployment of your new cluster is a great
> > method for testing the Linux-based systems before switching your mission
> > critical traffic over. Monitor them for a while with real traffic and you
> > can have confidence that they'll function correctly when you perform the
> > switchover.
>
> FWIW, I would love to see graphs comparing their performance
> under identical write load and then showing the cut-over point for reads
> between the two clusters. My hypothesis is that your Linux cluster
> will magically be much more performant/less loaded due to many
> Linux-specific optimizations in Cassandra, but I'd dig seeing this
> illustrated in an apples to apples sense with real app traffic.
>
> =Rob
>
> --
> =Robert Coli
> AIM&GTALK - rc...@palominodb.com
> YAHOO - rcoli.palominob
> SKYPE - rcoli_palominodb
>