Re: repair hangs

2013-03-14 Thread Dane Miller
On Thu, Mar 14, 2013 at 6:34 AM, aaron morton wrote: >> 1. is this a nodetool bug? is there any way to propagate the >> java.io.IOException back to nodetool? > The repair continues to work even if nodetool fails, it's a server side thing. > >> 2. network problems on EC2, I'm shocked! are there r

Re: Backup solution

2013-03-14 Thread Jabbar Azam
Hello, If the live data centre disappears restoring the data from the backup is going to take ages especially if the data is going from one data centre to another, unless you have a high bandwidth connection between data centres or you have a small amount of data. Jabbar Azam On 14 Mar 2013 14:31
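A rough way to put numbers on "ages": restore time is the data size divided by the usable bandwidth between the two data centres. The sketch below is only back-of-the-envelope arithmetic. The function name, the 70% link efficiency and the example sizes are assumptions for illustration, not figures from this thread.

```python
def restore_hours(data_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to copy data_gb gigabytes over a link_mbps link.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable (protocol overhead, other traffic on the link).
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("expected data_gb >= 0, link_mbps > 0, 0 < efficiency <= 1")
    total_bits = data_gb * 8e9                   # decimal gigabytes -> bits
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return total_bits / usable_bits_per_second / 3600


if __name__ == "__main__":
    # Hypothetical example: 2 TB of backup data over 100 Mbit/s and 1 Gbit/s links.
    for mbps in (100, 1000):
        print(f"{mbps} Mbit/s: {restore_hours(2000, mbps):.1f} hours")
```

Under these assumptions a tenfold faster link cuts the restore time tenfold, which is why the bandwidth between data centres dominates the answer.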

Cassandra Platform Support

2013-03-14 Thread Rick Cole
We are looking to use Cassandra for our persistent store. However, we have requirements to support multiple platforms including Windows Server, Linux, HP-UX, AIX and Solaris. I have searched the web and literally find nothing about Cassandra running on some of these platforms. I don't even find

Iterating over all keys in column family with LongType key

2013-03-14 Thread sj.climber
Hi, I have two column families, one keyed by UTF8Type (let's call it StringFamily), the other by LongType (LongFamily). Both are stored in a keyspace using the RandomPartitioner. I need to iterate over all columns in both families, although order does not matter. Iterating over StringFamily is

Re: 13k pending compaction tasks but ZERO running?

2013-03-14 Thread Wei Zhu
No problem. Back to the old trick: if it doesn't work, restart :) From: "Hiller, Dean" To: "user@cassandra.apache.org"; Wei Zhu Sent: Thursday, March 14, 2013 9:53 AM Subject: Re: 13k pending compaction tasks but ZERO running? Ah, you are a lifesaver. I was so

Re: 13k pending compaction tasks but ZERO running?

2013-03-14 Thread Hiller, Dean
Ah, you are a lifesaver. I was so used to keeping the nodes always up. That worked. It is finally taking effect. Thanks, Dean On 3/14/13 10:47 AM, "Wei Zhu" wrote: >Did you restart the node? As I can tell compactions start a few minutes >after restarting. Did you see a file called $CFName.j

Re: 13k pending compaction tasks but ZERO running?

2013-03-14 Thread Wei Zhu
Did you restart the node? As far as I can tell, compactions start a few minutes after restarting. Did you see a file called $CFName.json ($CFName is your cf name) in your data directory? -Wei - Original Message - From: "Dean Hiller" To: user@cassandra.apache.org Sent: Thursday, March 14, 201

Re: About the heap

2013-03-14 Thread Michael Theroux
Hi Aaron, If you have the chance, could you expand on m1.xlarge being the much better choice? We are going to need to make a choice of expanding from a 12 node -> 24 node cluster using .large instances, vs. upgrading all instances to m1.xlarge, soon and the justifications would be helpful (alt

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Alain RODRIGUEZ
"I feel your pain" => It is quite big :). This morning I had 6 nodes. I tried to upgrade one, had it decommissioned, tried to replace it with another on 1.1.6, and failed; now I am trying to restart one of the 5 remaining nodes and it fails to restart... Now I have 4 nodes instead of 6 and I am taking heavy load...

Re: 13k pending compaction tasks but ZERO running?

2013-03-14 Thread Hiller, Dean
Duh me. I forgot to mention I ran "nodetool compact " and it was done in 45 seconds and I still had a 36G file (darn, I am usually better about putting the detail in my emails... I thought I added that). I also went into JMX and then did it there. I ran it again and it took 15 seconds. My script i

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Alain RODRIGUEZ
From output.log with debug enabled I have more information. ... DEBUG 15:29:58,885 collecting 0 of 2147483647: users:bloom_filter_fp_chance:false:8@1363254809666000 DEBUG 15:29:58,885 collecting 1 of 2147483647: users:caching:false:9@1363254809666000 DEBUG 15:29:58,886 collecting 2 of 2147483647:

Re: 13k pending compaction tasks but ZERO running?

2013-03-14 Thread Michael Theroux
One more warning (which I'm sure you know, but in case others see this): nodetool compact does a major compaction for STCS, and is, in general, not recommended for STCS. I only ran it on the tables we've converted to LCS. -Mike On Mar 14, 2013, at 11:26 AM, Michael Theroux wrote: > Hi Dean, > >

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Hiller, Dean
Did you try restoring the snapshots you took and downgrading to 1.1.6 temporarily to get the node back online? That typically works fine. I feel your pain. We are still waiting on 12 more nodes and until then we are barely trying to make our cluster stay up and it is pretty much nearly maxed

Re: 13k pending compaction tasks but ZERO running?

2013-03-14 Thread Michael Theroux
Hi Dean, I saw the same behavior when we switched from STCS to LCS on a couple of our tables. Not sure why it doesn't proceed immediately (I pinged the list, but didn't get any feedback). However, running nodetool compact got things moving for me. -Mike On Mar 14, 2013, at 10:44 AM, Hille

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Michal Michalski
"on my workstation with a < 0.01% sample of production" Is there a simple way of getting that ? On "Cassandra level"? Nope. I just had to prepare this data "manually" using software we develop on very small input. I understand that it might not be so easy in all the use cases, as it was in

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Alain RODRIGUEZ
@Dean "It is expensive?" I was talking about a full-time QA environment equal or similar to a prod env. I didn't think about using a temp QA, and you are right, I should have. "And sorry for not providing the detail on the rolling restart not working….my bad" No problem, my point was just to

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Michal Michalski
We have no such environment. It is expensive, we can't afford this for now. We do have a QA cluster, but before even trying the 1.1.0 -> 1.1.9 / 1.2.1 upgrade on it (we were a bit undecided about the version ;-) ), I did some experiments using ccm ( https://github.com/pcmanus/ccm ) on my workst

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Hiller, Dean
It is expensive?……personally, sorry, I don't really buy that since I spent less than 400 bucks on 100 servers at amazon to play with for 1 or 2 hours or maybe it was 8 hours…I can't remember AND you can use small instances for a test like this. You can write EC2 scripts to startup a QA system f

13k pending compaction tasks but ZERO running?

2013-03-14 Thread Hiller, Dean
How do I get my node to run through the 13k pending compaction tasks? I had to use iptables to take the node out of the ring for now, and it is my only node still on STCS. In cassandra-cli, it shows LCS, but on disk I see a 36 GB file (i.e. it must still be STCS). How can I get the 13k pending t

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Alain RODRIGUEZ
@Aaron "You can try to reset the cluster ring state by doing a rolling restart passing -Dcassandra.load_ring_state=false as a JVM param in cassandra-env.sh" Now my node can't restart properly. The restart stalls and the last logged message is: INFO [SSTableBatchOpen:1] 2013-03-14 14:36:09,813 SSTableReade

Backup solution

2013-03-14 Thread Rene Kochen
Hi all, Is the following a good backup solution? Create two data-centers: - A live data-center with multiple nodes (commodity hardware). Clients connect to this cluster with LOCAL_QUORUM. - A backup data-center with 1 node (with fast SSDs). Clients do not connect to this cluster. Cluster only us

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Hiller, Dean
You should really be testing this stuff in QA. We had the exact same issue from 1.1.4 to 1.2.2. In QA, we decided we could take an outage so we tested taking every node down, upgrading every node and bringing the cluster back online. This worked perfectly so we rolled it into production….prod

Re: About the heap

2013-03-14 Thread Hiller, Dean
Oh, and one other way to lower your RAM is to scale out….add more machines. Since bloomfilters use up a lot of memory, doubling your cluster can significantly reduce your RAM usage. We have switched to LCS but are being forced to double our cluster as well which reduces RAM quite a bit. Thoug
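To see why scaling out helps, here is a minimal sketch using the textbook bloom filter sizing formula, bits = -n * ln(p) / (ln 2)^2. It is not Cassandra's exact sizing code, and the row count, replication factor and false-positive chance below are hypothetical values chosen only to show the trend.

```python
import math


def bloom_filter_bytes(keys: int, fp_chance: float) -> float:
    """Textbook size in bytes of an optimally sized bloom filter for `keys` entries."""
    if keys < 0 or not 0 < fp_chance < 1:
        raise ValueError("expected keys >= 0 and 0 < fp_chance < 1")
    bits = -keys * math.log(fp_chance) / (math.log(2) ** 2)
    return bits / 8


def per_node_bloom_bytes(total_rows: int, replication_factor: int,
                         nodes: int, fp_chance: float) -> float:
    """Approximate per-node bloom filter memory, assuming evenly spread rows."""
    if nodes <= 0 or replication_factor <= 0:
        raise ValueError("expected nodes > 0 and replication_factor > 0")
    keys_per_node = total_rows * replication_factor // nodes
    return bloom_filter_bytes(keys_per_node, fp_chance)


if __name__ == "__main__":
    # Hypothetical cluster: 2 billion rows, RF 3, 1% false-positive chance.
    for nodes in (12, 24):
        mib = per_node_bloom_bytes(2_000_000_000, 3, nodes, 0.01) / 2**20
        print(f"{nodes} nodes: about {mib:.0f} MiB of bloom filter per node")
```

Because the filter size is linear in the number of keys, doubling the node count halves the per-node bloom filter memory under these assumptions. Raising the false-positive chance shrinks it further, at the cost of more unnecessary disk reads.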

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread aaron morton
> ERROR [RequestResponseStage:2] 2013-03-14 09:53:25,750 > AbstractCassandraDaemon.java (line 135) Exception in thread > Thread[RequestResponseStage:2,5,main] > java.io.IOError: java.io.EOFException > at > org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.ja

Re: About the heap

2013-03-14 Thread Alain RODRIGUEZ
"Using half as many m1.xlarge is the way to go." OK, good to know. Are you getting too much GC or running OOM ? GC, it is always GC; I never had an OOM as far as I remember. "Are you using the default GC configuration ?" Yes, as I don't know a lot about it and think default should be fine. Is ca

Re: Row cache off-heap ?

2013-03-14 Thread aaron morton
> Should I raise a ticket since we are at least 3 having this issue from what I > saw in the mailing list ? Sure, if you can come up with steps to reproduce the problem. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

Re: repair hangs

2013-03-14 Thread aaron morton
> 1. is this a nodetool bug? is there any way to propagate the > java.io.IOException back to nodetool? The repair continues to work even if nodetool fails, it's a server side thing. > 2. network problems on EC2, I'm shocked! are there recommended > network settings for EC2? Streaming does not p

Re: About the heap

2013-03-14 Thread aaron morton
> Because of this I have an unstable cluster and have no other choice than use > Amazon EC2 xLarge instances when we would rather use twice more EC2 Large > nodes. m1.xlarge is a MUCH better choice than m1.large. You get more ram and better IO and less steal. Using half as many m1.xlarge is the

Re: Accessing timestamp of a cassandra column Using CQL3

2013-03-14 Thread Gabriel Ciuloaica
The documentation is here: http://www.datastax.com/docs/1.2/cql_cli/using/writetime Cheers, Gabi On 3/14/13 3:19 PM, Haithem Jarraya wrote: nice! Thanks Gabi! On 14 March 2013 13:14, Gabriel Ciuloaica wrote: You can. Ex: select writetime(avatar) from a

Re: Accessing timestamp of a cassandra column Using CQL3

2013-03-14 Thread Haithem Jarraya
nice! Thanks Gabi! On 14 March 2013 13:14, Gabriel Ciuloaica wrote: > You can. > > Ex: > select writetime(avatar) from avatars where id=1; > > Br, > Gabi > > On 3/14/13 3:12 PM, aaron morton wrote: > > I'm not sure you can. > > Sylvain / Michael? Is this possible? > > Cheers > >

Re: Pig / Map Reduce on Cassandra

2013-03-14 Thread aaron morton
Did the example work as it was presented in the README.txt ? Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 13/03/2013, at 11:31 AM, cscetbon@orange.com wrote: > Finally I've found the answer in CassandraStorag

Re: Accessing timestamp of a cassandra column Using CQL3

2013-03-14 Thread Gabriel Ciuloaica
You can. Ex: select writetime(avatar) from avatars where id=1; Br, Gabi On 3/14/13 3:12 PM, aaron morton wrote: I'm not sure you can. Sylvain / Michael? Is this possible? Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.co
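The value returned by writetime() is the raw write timestamp of the column. By convention this is microseconds since the Unix epoch, unless the client supplied its own timestamps on write. Assuming that convention holds, here is a minimal sketch for turning it into a readable UTC time. The helper name is hypothetical, and the sample value is the timestamp visible in the debug log quoted elsewhere in this archive.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def writetime_to_datetime(micros: int) -> datetime:
    """Convert a write timestamp in microseconds since the epoch to a UTC datetime.

    Assumes the timestamp follows the microsecond convention; a client that
    writes with custom timestamps may use a different unit.
    """
    return UNIX_EPOCH + timedelta(microseconds=micros)


if __name__ == "__main__":
    # Timestamp taken from the "users:caching:false:9@1363254809666000" log line.
    print(writetime_to_datetime(1363254809666000).isoformat())
```

Integer microseconds are added to the epoch directly rather than converted through a float, so the sub-second part is not lost to rounding.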

Re: Accessing timestamp of a cassandra column Using CQL3

2013-03-14 Thread aaron morton
I'm not sure you can. Sylvain / Michael? Is this possible? Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 13/03/2013, at 10:19 AM, Haithem Jarraya wrote: > Hi there, > > I am wondering if it's possible to acces

Re: commitlog -deleted keyspaces.

2013-03-14 Thread aaron morton
It might be this https://issues.apache.org/jira/browse/CASSANDRA-4201 As dean said, if you drain the node during shut down you can delete the logs. If you are paranoid just move them out of the way. > Another thing I have noticed is that upon restarts, the old keyspaces that > were deleted re-

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Alain RODRIGUEZ
We have it set to 0.0.0.0 but anyway, as I said before, I don't think our problem comes from this bug. 2013/3/14 Michal Michalski > > It will happen if your rpc_address is set to 0.0.0.0. >> > > Ops, it's not what I meant ;-) > It will happen, if your rpc_address is set to IP that is not defined

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Michal Michalski
It will happen if your rpc_address is set to 0.0.0.0. Oops, that's not what I meant ;-) It will happen if your rpc_address is set to an IP that is not defined in your cluster's config (e.g. in cassandra-topology.properties for PropertyFileSnitch) M. M. W dniu 14.03.2013 13:03, Alain RODRIGU

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Alain RODRIGUEZ
Well, it seems I have nothing like this when I run grep "Unknown host" /var/log/cassandra/system.log. This issue was reported in 1.2.1 and committed to the trunk. It may have been fixed in 1.2.2 even if I can't see the release version from the jira nor can I see it in the changelog. Thanks again

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Michal Michalski
Just to make it clear: This bug will occur on a single-DC configuration too. In our case it resulted in an exception like this at the very end of node startup: ERROR [WRITE-/] 2013-02-27 12:14:55,433 CassandraDaemon.java (line 133) Exception in thread Thread[WRITE-/,5,main] java.lang.RuntimeExcep

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Alain RODRIGUEZ
Thanks for this pointer but I don't think this is the source of our problem since we use 1 data center and Ec2Snitch. 2013/3/14 Jean-Armel Luce > Hi Alain, > > Maybe it is due to https://issues.apache.org/jira/browse/CASSANDRA-5299 > > A patch is provided with this ticket. > > Regards. > > Jea

Re: Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Jean-Armel Luce
Hi Alain, Maybe it is due to https://issues.apache.org/jira/browse/CASSANDRA-5299 A patch is provided with this ticket. Regards. Jean Armel 2013/3/14 Alain RODRIGUEZ > Hi > > We just tried to migrate our production cluster from C* 1.1.6 to 1.2.2. > > This has been a disaster. I just switch o

Failed migration from 1.1.6 to 1.2.2

2013-03-14 Thread Alain RODRIGUEZ
Hi, We just tried to migrate our production cluster from C* 1.1.6 to 1.2.2. This has been a disaster. I just switched one node to 1.2.2, updated its configuration (cassandra.yaml / cassandra-env.sh) and restarted it. It resulted in errors on all the 5 remaining 1.1.6 nodes : ERROR [RequestResponseSta

Re: Row cache off-heap ?

2013-03-14 Thread Alain RODRIGUEZ
Thanks, I'll let you know when I do so. But any idea about the increase in heap usage if everything seems to be well configured? Should I raise a ticket since at least 3 of us are having this issue from what I saw in the mailing list ? 2013/3/14 aaron morton > > No, I didn't. I used the nodetool s