Re: seed faq

2011-04-20 Thread Narendra Sharma
Here are some more details that might help: 1. You are right that seeds are consulted on startup to learn about the ring. 2. It is a good idea to have more than one seed. A seed is not a SPoF. Remember that gossip also provides eventual consistency. So if a seed is missing, the new node may not have the correct

Re: CQL in future 8.0 cassandra will work as I'm expecting ?

2011-04-20 Thread Constantin Teodorescu
Thank you very much Ellis, I heard about Brisk two weeks ago and I'm already checking the DataStax web site twice a day waiting for Brisk to come. It seems that it will be a good solution for us. One more question please: the Brisk way of operation need to transfer intermediate data from Cassandra st

Re: Limitations on number of secondary indexes

2011-04-20 Thread aaron morton
AFAIK the practical limit comes from creating a CF for each secondary index that has the same memtable settings as the containing CF. And the extra time it takes to maintain the index for each indexed column, sample the files when starting up, hold their bloom filters and index samples, keep the

Re: CQL in future 8.0 cassandra will work as I'm expecting ?

2011-04-20 Thread Jonathan Ellis
You want to run map/reduce jobs for your use case. You can already do this with Cassandra (http://wiki.apache.org/cassandra/HadoopSupport), and DataStax is introducing Brisk soon to make it easier: http://www.datastax.com/products/brisk On Wed, Apr 20, 2011 at 9:36 PM, Jonathan Ellis wrote: > CQL

Re: Multi-DC Deployment

2011-04-20 Thread Terje Marthinussen
Sure, the update queue could just as well replicate problems, but the queue would be a lot simpler than cassandra and it would not modify already acknowledged data like, for instance, compaction or read-repair/hint deliveries may. There is a fair bit of re-writing/re-assembling of data even tho

Re: CQL in future 8.0 cassandra will work as I'm expecting ?

2011-04-20 Thread Jonathan Ellis
CQL changes the API, that is all. On Wed, Apr 20, 2011 at 5:40 PM, Constantin Teodorescu wrote: > My use case is as follows: we are using in 70% of the jobs information > retrieval using keys, column names and ranges and up to now, what we have > tested suits our need. > However, the rest of 30%

seed faq

2011-04-20 Thread Maki Watanabe
I wrote a self-answered FAQ on seeds after reading the wiki and code. If I have misunderstood something, please point it out to me. == What are seeds? == Seeds, or seed nodes, are the nodes which new nodes refer to on bootstrap to learn ring information. When you add a new node to the ring, you need to specify at

Re: Multi-DC Deployment

2011-04-20 Thread Adrian Cockcroft
Queues replicate bad data just as well as anything else. The biggest source of bad data is broken app code... You will still need to implement a reconciliation/repair checker, as queues have their own failure modes when they get backed up. We have also looked at using queues to bounce data between

Re: system_* consistency level?

2011-04-20 Thread William Oberman
That was the trick. Thanks! On Apr 20, 2011, at 6:05 PM, Jonathan Ellis wrote: > See the comments for describe_schema_versions. > > On Wed, Apr 20, 2011 at 4:59 PM, William Oberman > wrote: >> Hi, >> >> My unit tests started failing once I upgraded from a single node cassandra >> cluster to a

Re: Multi-DC Deployment

2011-04-20 Thread Terje Marthinussen
Assuming that you generally put an API on top of this, delivering to two or more systems then boils down to a message queue issue or some similar mechanism which handles secure delivery of messages. Maybe not trivial, but there are many products that can help you with this, and it is a lot easier t

CQL in future 8.0 cassandra will work as I'm expecting ?

2011-04-20 Thread Constantin Teodorescu
My use case is as follows: about 70% of our jobs retrieve information using keys, column names and ranges, and so far what we have tested suits our needs. However, the remaining 30% of the jobs involve a full sequential scan of all records in the database. I found some web pages describin

Re: Cannot find row when using 3 indices for search, able to find it using only 2

2011-04-20 Thread Constantin Teodorescu
Thank you, I'll wait for the 0.7.5 release to ship and test it again! So far I'm satisfied with Cassandra; we are evaluating it for migrating our PostgreSQL solution to a mixed [couchdb + bigcouch + cassandra] architecture! Best regards, Teo On Thu, Apr 21, 2011 at 1:15 AM, Jo

Re: Cannot find row when using 3 indices for search, able to find it using only 2

2011-04-20 Thread Jonathan Ellis
sounds like https://issues.apache.org/jira/browse/CASSANDRA-2347 On Wed, Apr 20, 2011 at 5:10 PM, Constantin Teodorescu wrote: > Cassandra 0.7.4 on 4 nodes Linux Ubuntu 10.10 i386 , 32 bit > root@bigcouch-106:/etc/cassandra# nodetool -h 172.16.1.106 ring > Address         Status State   Load    

Re: Ec2 Stress Results

2011-04-20 Thread Jonathan Ellis
A few months ago I was seeing 12k writes/s on a single EC2 XL. So something is wrong. My first suspicion is that your client node may be the bottleneck. On Wed, Apr 20, 2011 at 2:56 PM, Alex Araujo wrote: > Does anyone have any Ec2 benchmarks/experiences they can share?  I am trying > to get a s

Cannot find row when using 3 indices for search, able to find it using only 2

2011-04-20 Thread Constantin Teodorescu
Cassandra 0.7.4 on 4 nodes Linux Ubuntu 10.10 i386 , 32 bit root@bigcouch-106:/etc/cassandra# nodetool -h 172.16.1.106 ring Address Status State Load Owns Token 172.16.1.104 Up Normal 1.8 GB 22.33% 4778396862879243066278530647513341098 172.16.1.8 Up

Re: system_* consistency level?

2011-04-20 Thread Jonathan Ellis
See the comments for describe_schema_versions. On Wed, Apr 20, 2011 at 4:59 PM, William Oberman wrote: > Hi, > > My unit tests started failing once I upgraded from a single node cassandra > cluster to a full "N" node cluster (I'm starting with 4).  I had a few > various bugs, mostly due to forget
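The schema-agreement check that describe_schema_versions makes possible can be sketched in Java. This is a minimal sketch under stated assumptions, not the poster's code: it assumes the Thrift call returns a map from schema version to the hosts reporting it, with unreachable hosts grouped under the key "UNREACHABLE". The class and method names (SchemaAgreement, agreed) are hypothetical, and the map is built by hand so that the example runs without a cluster.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SchemaAgreement {
    // Assumed key under which unreachable hosts are grouped.
    static final String UNREACHABLE = "UNREACHABLE";

    /** Returns true when all reachable hosts report exactly one schema version. */
    static boolean agreed(Map<String, List<String>> versions) {
        int liveVersions = 0;
        for (String version : versions.keySet()) {
            if (!UNREACHABLE.equals(version)) {
                liveVersions++;
            }
        }
        return liveVersions == 1;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the map a client would receive.
        Map<String, List<String>> versions = new HashMap<String, List<String>>();
        versions.put("version-a", Arrays.asList("10.0.0.1", "10.0.0.2"));
        versions.put("version-b", Arrays.asList("10.0.0.3"));
        System.out.println("schema agreed: " + agreed(versions));
    }
}
```

A test that issues a system_* call could poll such a check, sleeping briefly between attempts, before it reads or writes the new column family.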

Re: Internal error processing get_range_slices

2011-04-20 Thread Jonathan Ellis
"internal error" means an error on the server. check the server log for the stacktrace. On Wed, Apr 20, 2011 at 11:54 AM, Renato Bacelar da Silveira < renat...@indabamobile.co.za> wrote: > Hi all > > I am just augmenting the information on the following error: > > -- error --

system_* consistency level?

2011-04-20 Thread William Oberman
Hi, My unit tests started failing once I upgraded from a single node cassandra cluster to a full "N" node cluster (I'm starting with 4). I had a few various bugs, mostly due to forgetting to read/write at a quorum level in places I needed stronger consistency guarantees. But, I kept getting rand
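The quorum arithmetic behind "read/write at a quorum level" can be sketched in Java. This is an illustration only: the class and method names (QuorumMath, quorum, overlaps) are hypothetical, and the replication factor of 3 is an assumption, not a value taken from the thread. A quorum is a majority of the replicas, and a read is guaranteed to touch at least one replica that acknowledged the latest write when the read and write replica counts sum to more than the replication factor.

```java
public class QuorumMath {
    /** Majority of replicas for a given replication factor. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True when every read set must intersect every write set (R + W > RF). */
    static boolean overlaps(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3; // assumed replication factor, for illustration only
        int q = quorum(rf);
        System.out.println("quorum for RF=" + rf + ": " + q);
        System.out.println("QUORUM read + QUORUM write overlap: " + overlaps(q, q, rf));
        System.out.println("ONE read + ONE write overlap: " + overlaps(1, 1, rf));
    }
}
```

Reading and writing at ONE on a multi-node cluster gives no such overlap, which is consistent with tests that pass on a single node and fail intermittently on four.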

Re: Multi-DC Deployment

2011-04-20 Thread Adrian Cockcroft
Hi Terje, If you feed data to two rings, you will get inconsistency drift as an update to one succeeds and to the other fails from time to time. You would have to build your own read repair. This all starts to look like "I don't trust Cassandra code to work, so I will write my own buggy one off ve

Ec2 Stress Results

2011-04-20 Thread Alex Araujo
Does anyone have any Ec2 benchmarks/experiences they can share? I am trying to get a sense for what to expect from a production cluster on Ec2 so that I can compare my application's performance against a sane baseline. What I have done so far is: 1. Launched a 4 node cluster of m1.xlarge inst

Re: cluster IP question and Jconsole?

2011-04-20 Thread Tyler Hobbs
See the first entry in http://wiki.apache.org/cassandra/JmxGotchas On Wed, Apr 20, 2011 at 9:54 AM, tinhuty he wrote: > Maki, > > Yes you are right, 8081 is mx4j port, the JMX_PORT is 8001 in the > cassandra-env.sh. > > in the cassandra Linux server itself, I can run this successfully: > nodetoo

Re: NotSerializableException of FutureTask

2011-04-20 Thread Jonathan Ellis
You must be using an old Cassandra and/or nodetool; current nodetool calls forceBlockingFlush which does not try to return a Future over JMX. On Wed, Apr 20, 2011 at 9:38 AM, Desimpel, Ignace wrote: > Using own JMX java code and when using the NodeTool I get the following > exception when calling

Re: Ec2Snitch + NetworkTopologyStrategy if only in one region?

2011-04-20 Thread William Oberman
Also for the new users like me, don't assume DC1 is a keyword like I did. A working example of a keyspace in EC2 is: create keyspace test with replication_factor=3 and strategy_options = [{us-east:3}] and placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy'; For a single DC

Re: cluster IP question and Jconsole?

2011-04-20 Thread tinhuty he
Maki, Yes you are right, 8081 is the mx4j port, the JMX_PORT is 8001 in the cassandra-env.sh. On the cassandra Linux server itself, I can run this successfully: nodetool -host x -p 8001 ring x is the actual IP address. However, when I run the same command on another windows machine (which

NotSerializableException of FutureTask

2011-04-20 Thread Desimpel, Ignace
Using my own JMX Java code, and also when using NodeTool, I get the following exception when calling the forceFlush function. But it seems that the flushing itself is started although the exception occurred. Any idea? (running jdk 1.6, 64 bits) Ignace 2011-04-20 16:23:45 INFO ColumnFamilyStor

RE: Question about AbstractType class

2011-04-20 Thread Desimpel, Ignace
Thanks Sylvain. Your answer already helped me out a lot! I was using a ByteBuffer.get function that is changing the ByteBuffer's position. And I got all kinds of strange effects and exceptions I didn't get in 0.6.x. Changed that code and all problems are gone... Many thanks!! Ignace -Origina
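The position problem described here can be illustrated with a small Java sketch. It is not the poster's comparator and not Cassandra's AbstractType API. The class and method names (PositionSafeCompare, compareUnsigned) are hypothetical and the byte-wise unsigned ordering is an assumption chosen for the example. The point is that the absolute ByteBuffer.get(int) leaves the position untouched, whereas the relative get() advances it and so changes what later readers of the same buffer see.

```java
import java.nio.ByteBuffer;

public class PositionSafeCompare {
    /** Compares the remaining bytes of two buffers as unsigned values without moving either position. */
    static int compareUnsigned(ByteBuffer a, ByteBuffer b) {
        int aStart = a.position();
        int bStart = b.position();
        int common = Math.min(a.remaining(), b.remaining());
        for (int i = 0; i < common; i++) {
            int x = a.get(aStart + i) & 0xFF; // absolute get: position is not changed
            int y = b.get(bStart + i) & 0xFF;
            if (x != y) {
                return x < y ? -1 : 1;
            }
        }
        // All shared bytes are equal, so the shorter buffer sorts first.
        return Integer.compare(a.remaining(), b.remaining());
    }

    public static void main(String[] args) {
        // The content starts at position 2 of a larger backing array.
        ByteBuffer a = ByteBuffer.wrap(new byte[] {9, 9, 1, 2, 3}, 2, 3);
        ByteBuffer b = ByteBuffer.wrap(new byte[] {1, 2, 4});
        System.out.println("compare result: " + compareUnsigned(a, b));
        System.out.println("position of a after compare: " + a.position());
    }
}
```

When relative reads are more convenient, calling duplicate() on the buffer first and reading from the copy also keeps the caller's position intact.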

Re: Question about AbstractType class

2011-04-20 Thread Sylvain Lebresne
On Wed, Apr 20, 2011 at 3:06 PM, Desimpel, Ignace wrote: > As said above, the remaining bytes won't (always) be the actual bytes. Sorry I answered a bit quickly, I meant to say that the actual bytes won't (always) be the full backing array. That is, we never guarantee that BB.arrayOffset() == 0, no

RE: Different result after restart

2011-04-20 Thread Desimpel, Ignace
Aaron, Already found out what the problem was. I was using an AbstractType comparator for a column family. That code was changing the given ByteBuffer position and was not supposed to do that (Hinted by Sylvain Lebresne !). Anyway, after correcting that problem I got back the results as before

RE: Question about AbstractType class

2011-04-20 Thread Desimpel, Ignace
-Original Message- From: Sylvain Lebresne [mailto:sylv...@datastax.com] Sent: Wednesday, April 20, 2011 2:07 PM To: user@cassandra.apache.org Subject: Re: Question about AbstractType class On Wed, Apr 20, 2011 at 1:35 PM, Desimpel, Ignace wrote: > Cassandra version 0.7.4 > > > > Hi, >

RE: Different result after restart

2011-04-20 Thread Desimpel, Ignace
I'm using the org.apache.cassandra.thrift.CassandraDaemon implementation. I have done the same with version 0.6.x but now modified the code for version 0.7.4. I could restart without problem in 0.6.x. I get (did not add them all) the following messages : (Keyspace is 'SearchSpace', CF names li

Re: Question about AbstractType class

2011-04-20 Thread Sylvain Lebresne
On Wed, Apr 20, 2011 at 1:35 PM, Desimpel, Ignace wrote: > Cassandra version 0.7.4 > > > > Hi, > > > > I created my own java class as an extension of the AbstractType class. But > I’m not sure about the following items related to the compare function : > > # The remaining bytes of the buffer somet

Re: Different result after restart

2011-04-20 Thread aaron morton
Checking the simple things first, are you using the o.a.c.service.EmbeddedCassandraService or the o.a.c.EmbeddedServer in the unit test directory ? The latter deletes data, but it does not sound like you are using it. When the server starts it should read any commit logs, roll them forward and

Question about AbstractType class

2011-04-20 Thread Desimpel, Ignace
Cassandra version 0.7.4 Hi, I created my own java class as an extension of the AbstractType class. But I'm not sure about the following items related to the compare function : # The remaining bytes of the buffer sometimes is zero during thrift get_slice execution, however I never store any

Re: Tombstones and memtable_operations

2011-04-20 Thread Héctor Izquierdo Seliva
On Wed, 20-04-2011 at 23:00 +1200, aaron morton wrote: > Looks like a bug, I've added a patch > here https://issues.apache.org/jira/browse/CASSANDRA-2519 > > > Aaron > That was fast! Thanks Aaron

Different result after restart

2011-04-20 Thread Desimpel, Ignace
Cassandra version 0.7.4 Hi, I'm storing (no deletion) in a small test some records to an embedded Cassandra instance. Then I connect using Thrift and I can retrieve the data as expected. Then I restart the server with the embedded Cassandra, reconnect using Thrift but now the same quer

Re: Tombstones and memtable_operations

2011-04-20 Thread aaron morton
Looks like a bug, I've added a patch here https://issues.apache.org/jira/browse/CASSANDRA-2519 Aaron On 20 Apr 2011, at 13:15, aaron morton wrote: > That's what I was looking for, thanks. > > At first glance the behaviour looks inconsistent, we count the number of > columns in the delete muta

Re: pig + hadoop

2011-04-20 Thread pob
My mistake, ignore my last post. 2011/4/20 pob > Hi, > > everything works fine with cassandra 0.7.5, but when I tried with 0.7.3 > other errors showed up, yet the task finished successfully, which is strange. > > > 2011-04-20 11:45:40,674 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from att

Re: pig + hadoop

2011-04-20 Thread pob
Hi, everything works fine with cassandra 0.7.5, but when I tried with 0.7.3 other errors showed up, yet the task finished successfully, which is strange. 2011-04-20 11:45:40,674 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201104201139_0004_m_00_3: Error: java.lang.ClassNot

Re: pig + hadoop

2011-04-20 Thread pob
Hi, that was the problem! Thanks, you should put that in your documentation. Thanks for the help! Best, P 2011/4/20 Jeremy Hanna > Just as an example: > > >cassandra.thrift.address >10.12.34.56 > > >cassandra.thrift.port >9160 > > >cassandra.partitioner.cl