Re: Cassandra + Hadoop - 2 Task attempts with million of rows

2013-04-23 Thread Shamim
Hello Aaron, We have built our new cluster from scratch with version 1.2, using the Murmur3 partitioner. We are not using vnodes at all. Actually the log is clean, with nothing serious; we are now investigating the logs and will post soon if we find something criminal >>> Our cluster is evenly partitioned (Murmur3Partitio

Re: Ec2Snitch to Ec2MultiRegionSnitch

2013-04-23 Thread Alain RODRIGUEZ
"If you are only using one Availability Zone per region then you have only one rack per DC and the NetworkTopologyStrategy will do the right thing." So you mean this part doesn't need more testing? This will work for sure? Have you already done it yourself? "Because you are going to replicate your

Re: readable (not hex encoded) column names using sstable2json

2013-04-23 Thread aaron morton
What is the CF definition? What are the errors you are getting? > We're trying to move data over to another cluster but this prevents us from > doing so. Is there a reason you are converting the SSTables to JSON? You could just copy the sstables. Cheers - Aaron Morton Freelanc

Re: Ec2Snitch to Ec2MultiRegionSnitch

2013-04-23 Thread aaron morton
> You are advising me to test it, what would be a good way of testing it (I can > use AWS EC2 instances if needed) ? If you are only using one Availability Zone per region then you have only one rack per DC and the NetworkTopologyStrategy will do the right thing. > Why ? I mean we have maybe only

Re: How to find total number of rows in Cassandra database?

2013-04-23 Thread aaron morton
cassandra-cli has some good online help. There are no features to count rows, as cassandra does not count them, but if it's only 1,000, try using list; You can also see an estimate of the number of rows by using nodetool cfstats. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand

Re: 'sstableloader' is not recognized as an internal or external command,

2013-04-23 Thread aaron morton
> Is sstableloader supported on Windows? Looking at the source, it seems to be > a unix shell file? Yup. If you would like to put together an sstableloader.bat file, use the sstablekeys.bat file as a template but use org.apache.cassandra.tools.BulkLoader and the CASSANDRA_MAIN If you can get it w

Re: Unable to drop secondary index

2013-04-23 Thread aaron morton
That sounds horrible. The log messages seem fine to me. It's handling eventually updating the secondary indexes. Good luck. - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 23/04/2013, at 6:34 PM, Michal Michalski wrote

Re: Insert into column which is of DateType

2013-04-23 Thread aaron morton
Have you tried the Astyanax example using the Date override? https://github.com/Netflix/astyanax/wiki/Writing-data http://netflix.github.io/astyanax/javadoc/com/netflix/astyanax/ColumnMutation.html#putValue(java.util.Date, java.lang.Integer) Cheers - Aaron Morton Freelance Ca

Re: Datastax Java Driver connection issue

2013-04-23 Thread aaron morton
> Just for clarification, why is it necessary to set the server rpc address to > 127.0.0.1? It's not necessary for it to be 127.0.0.1. But it is necessary for the server to be listening for client connections (the rpc_address) on the same interface / IP you are trying to connect to. In your ca
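A quick way to check the point above from the client machine is to try a plain TCP connection to the address and port the client will use. This is only a minimal Python sketch: the function name is_listening is made up for illustration, and the port in the usage note (9042, the usual native transport port) is an assumption to be replaced with whatever your cassandra.yaml sets.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    This only shows that something accepts connections on that
    interface and port; it does not prove that it is Cassandra.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts and timeouts.
        return False
```

For example, is_listening("127.0.0.1", 9042) and is_listening("<server LAN IP>", 9042) can give different answers on the same machine when the server is bound to only one of the two interfaces.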

Re: com.datastax.driver.core.exceptions.InvalidQueryException using Datastax Java driver

2013-04-23 Thread aaron morton
> Can I insert into Column Family (that I created from CLI mode) using Datastax > Java driver or not with Cassandra 1.2.3? No. Create your table using CQL 3 via cqlsh. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

Re: Cassandra + Hadoop - 2 Task attempts with million of rows

2013-04-23 Thread aaron morton
>> Our cluster is evenly partitioned (Murmur3Partitioner) Murmur3Partitioner is only available in 1.2 and changing partitioners is not supported. Did you change from Random Partitioner under 1.1? Are you using virtual nodes in your 1.2 cluster ? >> We have roughly 97million rows in our cluster.

Re: loading all rows from cassandra using multiple (python) clients in parallel

2013-04-23 Thread aaron morton
> > EDIT: works after switching to testing against the latest version of the > cassandra database (doh!), and also updating the syntax per notes below: http://stackoverflow.com/questions/16137944/loading-all-rows-from-cassandra-using-multiple-python-clients-in-parallel Is this still a problem?

Re: Building SSTables using SSTableSimpleUnsortedWriter (v. 1.2.3)

2013-04-23 Thread aaron morton
You should be able to call CompositeType.getInstance(List<AbstractType<?>> types) to construct a CompositeType with the appropriate components. Then call CompositeType.decompose() with a list of the values for the key, that will get you a byte buffer. Cheers - Aaron Morton Freelance Cassandra
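To make the resulting byte buffer less opaque, here is a rough Python sketch of the layout a composite value is assumed to have: for each component, a 2-byte big-endian length, the component's raw bytes, then one end-of-component byte (0 for a plain value). This is not the Java API mentioned above; encode_composite is a hypothetical helper and the layout is stated as an assumption, so check it against the CompositeType source for your version before relying on it.

```python
import struct


def encode_composite(components, end_of_component=0):
    """Concatenate already-serialized components into one composite value.

    Assumed layout per component:
      <2-byte big-endian length> <raw bytes> <1 end-of-component byte>
    """
    out = bytearray()
    for value in components:
        # struct.pack raises struct.error if a component exceeds 65535 bytes.
        out += struct.pack(">H", len(value))
        out += value
        out.append(end_of_component)
    return bytes(out)
```

Each component must already be serialized by its own type (for example UTF-8 bytes for a text component) before it is passed in.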

Re: How to make compaction run faster?

2013-04-23 Thread Hiller, Dean
I assume you are trying to maximize your PER node write throughput? If you are not trying to determine per node throughput, just add more nodes so your nodes can keep up. That is the easiest way. Finding that sweet spot of per node write throughput will take some doing. If compaction can't keep up, the rea

Re: How to make compaction run faster?

2013-04-23 Thread Jay Svc
Thanks Aaron, The parameters I tried above were set one at a time. Based on what I observed, the core problem is whether compaction can catch up with the write speed. I have gone up to 30,000 to 35,000 writes per second. I do not see the number of writes as much of an issue either. I see compaction is not

Re: move data from Cassandra 1.1.6 to 1.2.4

2013-04-23 Thread Hiller, Dean
But 1.1.4 does not have Vnodes, right? In that case, I would baby step it, doing the upgrade to 1.2.4 first on the old nodes, and after that is done then adding the new nodes in, and after that is done then decommissioning the old nodes…finally I would convert to vnodes….and I would try that all

Re: move data from Cassandra 1.1.6 to 1.2.4

2013-04-23 Thread Wei Zhu
Hi Dean, It's a bit different case for us. We will have a set of new machines to replace the old ones and we want to migrate those data over. I would imagine to do something like * Let new nodes (with VNodes) join the cluster * decommission the old nodes. (Without VNodes) Thanks.

Re: move data from Cassandra 1.1.6 to 1.2.4

2013-04-23 Thread Hiller, Dean
We went from 1.1.4 to 1.2.2. In QA the rolling restart failed, but in both production and QA, bringing down the whole cluster, upgrading every node and then bringing it back up worked fine. We left ours at RandomPartitioner and had LCS as well. We did not convert to Vnodes at all. Don't know if it hel

move data from Cassandra 1.1.6 to 1.2.4

2013-04-23 Thread Wei Zhu
Hi, We are trying to upgrade from 1.1.6 to 1.2.4, it's not really a live upgrade. We are going to retire the old hardware and bring in a set of new hardware for 1.2.4.  For old cluster, we have 5 nodes with RF = 3, total of 1TB data. For new cluster, we will have 10 nodes with RF = 3. We will use

Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-23 Thread Hiller, Dean
Nice, Thanks, Dean From: Sylvain Lebresne <sylv...@datastax.com> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org> Date: Tuesday, April 23, 2013 11:31 AM To: "user@cassandra.apache.org" mailto:user@

Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-23 Thread Sylvain Lebresne
On Tue, Apr 23, 2013 at 6:02 PM, Hiller, Dean wrote: > Out of curiosity, why did cassandra choose to re-invent the wheel instead > of using something like google protobuf which spans multiple languages? I see it as a step better than thrift since it is really only defining > message format and h

Re: Advice on memory warning

2013-04-23 Thread Haithem Jarraya
We are facing a similar issue, and we are not able to keep the ring stable. We are using C* 1.2.3 on CentOS 6, 32GB RAM, 8GB heap, 6 nodes. The total data is ~84GB (which is relatively small for C* to handle, with an RF of 3). Our application is read-heavy, we see the GC complaints in all nodes, I cop

Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-23 Thread Edward Capriolo
Cassandra has a non-thrift protocol called the "native protocol" aka "cql binary protocol" http://www.datastax.com/docs/1.2/cql_cli/cql_binary_protocol It has its own port, with its own protocol, and it does not have thrift methods. In my opinion, switching from the thrift to the native protocol

Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-23 Thread Hiller, Dean
Out of curiosity, why did cassandra choose to re-invent the wheel instead of using something like google protobuf which spans multiple languages? I see it as a step better than thrift since it is really only defining message format and has all sorts of goodies with it. I think you only need to

Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-23 Thread Stuart Broad
Aha - got it. Thanks for everyone's help. I think I will stick with the prepare/execute CQL (with the InvalidRequestException check) for now. I will take a look at the driver you mentioned though. Cheers, Stuart On Tue, Apr 23, 2013 at 4:55 PM, Sylvain Lebresne wrote: > When we speak of "bi

Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-23 Thread Sylvain Lebresne
When we speak of "binary protocol", we talk about the protocol introduced in Cassandra 1.2 that is an alternative to thrift for CQL3. It's a custom, binary protocol that has no link to thrift whatsoever. That protocol is defined by the document here: https://git-wip-us.apache.org/repos/asf?p=ca

Re: Advice on memory warning

2013-04-23 Thread Ralph Goers
We are using DSE, which I believe is also 1.1.9. We have basically had a non-usable cluster for months due to this error. In our case, once it starts doing this it starts flushing sstables to disk and eventually fills up the disk to the point where it can't compact. If we catch it soon enough

Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-23 Thread Stuart Broad
Hi Edward, My understanding was that thrift supports a number of protocols (binary being one of them). I don't understand what switching to "binary protocol" but not using thrift means. Can you point me to any code examples? Regards, Stuart On Tue, Apr 23, 2013 at 4:21 PM, Edward Capriolo wr

Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-23 Thread Edward Capriolo
Having to catch the exception and parse it is a bit ugly, however this is close to what someone might do with an SQLException to determine if the error was transient etc. If there is an error code it is possible that it could be added as an optional property of the InvalidRequestException in futur

Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-23 Thread Stuart Broad
Hi Edward, Thanks for your reply - I was already using the prepare/execute cql methods that you suggested. My problem is that these methods 'mask' the PreparedQueryNotFoundException as an InvalidRequestException. At present I catch the InvalidRequestException (when cassandra has been re-started)
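The catch-and-re-prepare workaround described here can be sketched roughly as follows. This is a Python illustration with stand-ins, not the real Thrift bindings: InvalidRequestException is redefined locally, and the client methods prepare and execute_prepared are hypothetical names for whatever prepare/execute calls your client exposes. The message check assumes the server text has the form "Prepared query with ID ... not found", as quoted elsewhere in this thread.

```python
class InvalidRequestException(Exception):
    """Local stand-in for the Thrift exception; 'why' holds the server message."""

    def __init__(self, why):
        super().__init__(why)
        self.why = why


def is_prepared_not_found(exc):
    # Assumption: the server message looks like
    # "Prepared query with ID <n> not found ...".
    return "Prepared query with ID" in exc.why and "not found" in exc.why


class PreparedCache:
    """Caches statement IDs and re-prepares once if the server forgot them."""

    def __init__(self, client):
        self.client = client  # hypothetical client with prepare/execute_prepared
        self.ids = {}         # CQL text -> statement ID

    def execute(self, cql, values):
        if cql not in self.ids:
            self.ids[cql] = self.client.prepare(cql)
        try:
            return self.client.execute_prepared(self.ids[cql], values)
        except InvalidRequestException as exc:
            if not is_prepared_not_found(exc):
                raise  # a genuine invalid request, not a lost statement
            # The server was probably restarted: prepare again, retry once.
            self.ids[cql] = self.client.prepare(cql)
            return self.client.execute_prepared(self.ids[cql], values)
```

Retrying only once keeps a persistent failure from looping forever; any other InvalidRequestException is passed straight through to the caller.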

Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-23 Thread Edward Capriolo
Thrift has a prepare_cql call which returns an ID. Then it has an execute_cql call which takes the id and a map or variable bindings. On Tue, Apr 23, 2013 at 10:29 AM, Stuart Broad wrote: > Hi all, > > I just realised that the binary protocol is the low-level thrift api that > I was originall

Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-23 Thread Stuart Broad
Hi all, I just realised that the binary protocol is the low-level thrift api that I was originally using (Cassandra.Client>> get / insert ...). How can a prepared statement be called through the thrift api (i.e. not the cql methods)? Cheers, Stuart On Tue, Apr 23, 2013 at 11:48 AM, Stuart Bro

Re: Plans for CQL3 (non-compact storage) table support in Cassandra's Pig support

2013-04-23 Thread cscetbon.ext
+1 We're also waiting for this bugfix :( -- Cyril SCETBON On Apr 23, 2013, at 2:42 PM, Ondřej Černoš <cern...@gmail.com> wrote: Hi all, is there someone on this list knowledgeable enough about the plans for support on non-compact storage tables (https://issues.apache.org/jira/browse/CA

Plans for CQL3 (non-compact storage) table support in Cassandra's Pig support

2013-04-23 Thread Ondřej Černoš
Hi all, is there someone on this list knowledgeable enough about the plans for support on non-compact storage tables ( https://issues.apache.org/jira/browse/CASSANDRA-5234) in Cassandra's Pig support? Currently Pig cannot be used with Cassandra 1.2 and CQL3-only tables and this hurts a lot (I found

Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-23 Thread Stuart Broad
Hi Sylvain, Thanks for your response. I am handling the 'PreparedQueryNotFoundException' more for the case of a cassandra re-start (rather than expecting to build 10 statements). I am not familiar with the binary protocol - which class/methods should I look at? Regards, Stuart On Tue, A

Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-23 Thread Sylvain Lebresne
In thrift, a lot of exceptions (like PreparedQueryNotFoundException) are simply returned as InvalidRequestException. The reason for that was a mix of not wanting to change the thrift API too much and because we didn't know how to return a lot of different exceptions with thrift without making it hor

Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-23 Thread Stuart Broad
Hi Sorin, The PreparedQueryNotFoundException is not thrown from Cassandra.Client>>execute_prepared_cql3_query method. I created some prepared statements and then re-started cassandra and received the following exception: InvalidRequestException(why: Prepared query with ID 1124421588 not found (e

readable (not hex encoded) column names using sstable2json

2013-04-23 Thread Hans Melgers
Hello, Using Cassandra 1.0.7 sstable2json on some tables I get readable column names. This leads to problems (java.lang.NumberFormatException: Non-hex characters in) when importing later. We're trying to move data over to another cluster but this prevents us from doing so. Could it have to do wit

RE: ordered partitioner

2013-04-23 Thread Desimpel, Ignace
I got into problems on starting a new database. That starts up ok. Then I add a keyspace and it goes wrong. The error came from DefsTable.mergeSchema (working on version 1.2.1). It starts by fetching the current keyspace and stores it in a map of DecoratedKey elements. Then it applies the mutati

Re: Ec2Snitch to Ec2MultiRegionSnitch

2013-04-23 Thread Alain RODRIGUEZ
Hi, this advice is very welcome. @Dane, about the rack awareness, we use only one rack per DC, so I guess using Ec2MultiRegionSnitch will do just fine and it doesn't need any configuration. Does that seem right to you? If we are someday interested in multiple racks I will make sure to use them properl