Re: i have one mistake in Cassandra.java when i build it

2010-04-06 Thread 叶江
Thanks, just as Ellis said, I rebuilt Thrift. Thanks for your help. 2010/4/6 Jonathan Ellis > This means you rebuilt the Thrift code with an old compiler. > > If you look in lib/ the thrift jar is tagged with the svn revision we > built with. Thrift has frequent regressions, so using that sam

Re: A question of 'referential integrity'...

2010-04-06 Thread Tatu Saloranta
On Tue, Apr 6, 2010 at 2:12 PM, Steve wrote: ... > Should I assume that it isn't common practice to write updates > atomically in-real time, and batch process them 'off-line' to increase > the atomic granularity?  It seems an obvious strategy... possibly one > for which an implementation might use

Re: Net::Cassandra::Easy deletion failed

2010-04-06 Thread Mike Gallamore
On 04/06/2010 01:36 PM, Ted Zlatanov wrote: On Tue, 06 Apr 2010 13:24:45 -0700 Mike Gallamore wrote: MG> Thanks for the reply. The newest version of the module I see on CPAN MG> is 0.08b. I actually had 0.07 installed and am using 0.6beta3 for MG> cassandra. Is there somewhere else I should

Re: A question of 'referential integrity'...

2010-04-06 Thread Steve
On 06/04/2010 21:40, Benjamin Black wrote: > I suggest the reasons you list (which are certainly great reasons!) > are also the reasons there is no referential integrity or transaction > support. Quite. I'm not trying to make recommendations for how Cassandra should be changed to be more like a

Re: Consistent counters (was: Memcached protocol)

2010-04-06 Thread Jonathan Ellis
On Tue, Apr 6, 2010 at 2:27 PM, Paul Prescod wrote: > I *believe* that the key messages of those blog posts were: >  1. Using distributed vector clocks is easy once they are implemented. >  2. Implementing distributed vector clocks is hard on the datastore vendor. >  3. If you have long-term netwo

Re: How do vector clocks and conflicts work?

2010-04-06 Thread Mike Malone
On Tue, Apr 6, 2010 at 11:03 AM, Tatu Saloranta wrote: > On Tue, Apr 6, 2010 at 8:45 AM, Mike Malone wrote: > >> As long as the conflict resolver knows that two writers each tried to > >> increment, then it can increment twice. The conflict resolver must know > >> about the semantics of "incremen

Re: A question of 'referential integrity'...

2010-04-06 Thread Benjamin Black
I suggest the reasons you list (which are certainly great reasons!) are also the reasons there is no referential integrity or transaction support. It seems the common practice of using a system like Zookeeper for the synchronization parts alongside Cassandra would be applicable here. Have you inv

Re: Net::Cassandra::Easy deletion failed

2010-04-06 Thread Ted Zlatanov
On Tue, 06 Apr 2010 13:24:45 -0700 Mike Gallamore wrote: MG> Thanks for the reply. The newest version of the module I see on CPAN MG> is 0.08b. I actually had 0.07 installed and am using 0.6beta3 for MG> cassandra. Is there somewhere else I should look for the 0.09 version MG> of the module? I'

Re: Net::Cassandra::Easy deletion failed

2010-04-06 Thread Mike Gallamore
Thanks for the reply. The newest version of the module I see on CPAN is 0.08b. I actually had 0.07 installed and am using 0.6beta3 for cassandra. Is there somewhere else I should look for the 0.09 version of the module? I'll also upgrade to the release candidate version of Cassandra and see if

Re: Net::Cassandra::Easy deletion failed

2010-04-06 Thread Ted Zlatanov
On Tue, 06 Apr 2010 11:07:03 -0700 Mike Gallamore wrote: MG> Seems to be internal to java/cassandra itself. MG> I have some tests and I want to make sure that I have a "clean slate" MG> each time I run the test. Clean as far as my code cares is that MG> "value" is not defined. I'm running "bin

problem with Net::Cassandra::Easy deleting columns

2010-04-06 Thread Mike Gallamore
Hello I tried to post this earlier but something seems to have gone wrong with sending the message. I have a test perl script that I'm using to test the behaviour of some of my existing code. It is important that the values start in a clean state at the beginning of the tests, as I'm incrementing

Consistent counters (was: Memcached protocol)

2010-04-06 Thread Paul Prescod
I *believe* that the key messages of those blog posts were: 1. Using distributed vector clocks is easy once they are implemented. 2. Implementing distributed vector clocks is hard on the datastore vendor. 3. If you have long-term network partitions you're kind of screwed (which is probably tr

Re: How do vector clocks and conflicts work?

2010-04-06 Thread gabriele renzi
On Tue, Apr 6, 2010 at 9:11 AM, Paul Prescod wrote: > This may be the blind leading the blind... > On Mon, Apr 5, 2010 at 11:54 PM, Tatu Saloranta > wrote: >>... > >> >> I think the key is that this is not automatic -- there is no general >> mechanism for aggregating distinct modifications. Point

Re: odd problem retrieving binary values using C++

2010-04-06 Thread Jonathan Ellis
Glad it's working now! :) On Tue, Apr 6, 2010 at 12:14 PM, Chris Beaumont wrote: > mmmh...  well... wasn't long before I figured out the problem sits between > the chair and the keyboard!!! > > I had a bad case of copy/paste dealing with super-columns and multiple > rows in the actual code (origi

Re: A question of 'referential integrity'...

2010-04-06 Thread Steve
On 06/04/2010 18:53, Tatu Saloranta wrote: >> I've read all about QUORUM, and it is generally useful, but as far as I >> can tell, it can't give me a transaction... >> > Correct. Only individual operations are atomic, and ordering of > insertions is not guaranteed. > As I thought. > I think

Re: A question of 'referential integrity'...

2010-04-06 Thread Steve
On 06/04/2010 18:50, Benjamin Black wrote: > I'm finding this exchange very confusing. What exactly about > Cassandra 'looks absolutely ideal' to you for your project? The write > performance, the symmetric, peer to peer architecture, etc? > Reasons I like Cassandra for this project: * C

Net::Cassandra::Easy deletion failed

2010-04-06 Thread Mike Gallamore
Seems to be internal to java/cassandra itself. I have some tests and I want to make sure that I have a "clean slate" each time I run the test. Clean as far as my code cares is that "value" is not defined. I'm running "bin/cassandra -f" with the default install/options. So at the beginning of

Re: How do vector clocks and conflicts work?

2010-04-06 Thread Tatu Saloranta
On Tue, Apr 6, 2010 at 8:45 AM, Mike Malone wrote: >> As long as the conflict resolver knows that two writers each tried to >> increment, then it can increment twice. The conflict resolver must know >> about the semantics of "increment" or "decrement" or "string append" or >> "binary patch" or wha

Re: Overwhelming a cluster with writes?

2010-04-06 Thread Tatu Saloranta
On Tue, Apr 6, 2010 at 8:17 AM, Jonathan Ellis wrote: > On Tue, Apr 6, 2010 at 2:13 AM, Ilya Maykov wrote: >> That does sound similar. It's possible that the difference I'm seeing >> between ConsistencyLevel.ZERO and ConsistencyLevel.ALL is simply due >> to the fact that using ALL slows down the

Re: Heap sudden jump during import

2010-04-06 Thread Tatu Saloranta
On Tue, Apr 6, 2010 at 12:15 AM, JKnight JKnight wrote: > When import, all data in json file will load in memory. So that, you can not > import large data. > You need to export large sstable file to many small json files, and run > import. Why would you ever read the whole file in memory? JSON is
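The point about not loading the whole file can be sketched with the standard library alone. This is a minimal illustration, not how the Cassandra import tool works: it assumes the input is a sequence of whitespace-separated JSON objects or arrays (an assumption; the real export format may differ), and `iter_json_values` is a hypothetical helper name.

```python
import io
import json

def iter_json_values(stream, chunk_size=65536):
    """Yield JSON objects/arrays one at a time from a text stream.

    Only the current, not-yet-complete value is held in memory.
    Assumes top-level values are objects or arrays separated by whitespace.
    """
    decoder = json.JSONDecoder()
    buf = ''
    while True:
        chunk = stream.read(chunk_size)
        buf += chunk
        while True:
            buf = buf.lstrip()  # raw_decode does not skip leading whitespace
            if not buf:
                break
            try:
                value, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                break  # value is incomplete; read more input
            yield value
            buf = buf[end:]
        if not chunk:  # end of stream
            if buf.strip():
                raise ValueError('stream ended inside a JSON value')
            return

# Tiny chunk size to show that values spanning several reads still parse.
sample = io.StringIO('{"a": 1}\n{"b": [2, 3]}\n')
values = list(iter_json_values(sample, chunk_size=4))
```

Each value is handed to the caller as soon as it is complete, so memory use is bounded by the largest single value rather than by the file size.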

Re: A question of 'referential integrity'...

2010-04-06 Thread Tatu Saloranta
On Tue, Apr 6, 2010 at 10:12 AM, Steve wrote: > On 06/04/2010 15:26, Eric Evans wrote: ... > I've read all about QUORUM, and it is generally useful, but as far as I > can tell, it can't give me a transaction... Correct. Only individual operations are atomic, and ordering of insertions is not guar

Re: A question of 'referential integrity'...

2010-04-06 Thread Benjamin Black
I'm finding this exchange very confusing. What exactly about Cassandra 'looks absolutely ideal' to you for your project? The write performance, the symmetric, peer to peer architecture, etc? b

Re: how to store list data in Apache Cassandra?

2010-04-06 Thread Tatu Saloranta
On Tue, Apr 6, 2010 at 8:06 AM, Shuge Lee wrote: >>     'girls': pickle.dumps(['java', 'actionscript', 'python']) > > I think this is a really bad idea, I can't do any search if using Pickle. Just to be sure: are you thinking of traditional queries, lookups by values (find entries that have certa

Re: if cassandra isn't ideal for keep track of counts, how does digg count diggs?

2010-04-06 Thread Chris Goffinet
http://issues.apache.org/jira/browse/CASSANDRA-704 http://issues.apache.org/jira/browse/CASSANDRA-721 We have our own internal codebase of Cassandra at Digg. But we are using those above patches until we have the vector clock work cleaned up; that patch will also go to JIRA. Most likely the vecto

Re: if cassandra isn't ideal for keep track of counts, how does digg count diggs?

2010-04-06 Thread S Ahmed
Chris, When you say patch, does that mean for Cassandra or your own internal codebase? Sounds interesting, thanks! On Tue, Apr 6, 2010 at 12:54 PM, Chris Goffinet wrote: > That's not true. We have been using the ZooKeeper work we posted on JIRA. > That's what we are using internally and have been

Re: odd problem retrieving binary values using C++

2010-04-06 Thread Chris Beaumont
mmmh... well... wasn't long before I figured out the problem sits between the chair and the keyboard!!! I had a bad case of copy/paste dealing with super-columns and multiple rows in the actual code (original post was wayyy stripped). Everything is fine and returning the proper buffer size (

Re: A question of 'referential integrity'...

2010-04-06 Thread Steve
On 06/04/2010 15:26, Eric Evans wrote: > On Tue, 2010-04-06 at 12:00 +0100, Steve wrote: > >> First, I apologise sending this to the 'dev' mailing list - I couldn't >> find one for Cassandra users - and also for the basic nature of my >> questions... >> > user@cassandra.apache.org, (follow-

Re: cassandra data viewer?

2010-04-06 Thread AJ Chen
That looks good. Is there a similar Cassandra tool in Java? On Mon, Apr 5, 2010 at 5:59 PM, selam wrote: > look at chiton on github. > > On Tue, Apr 6, 2010 at 3:06 AM, AJ Chen wrote: > > Is there a generic GUI tool for viewing cassandra datastore? being able > to > > view and edit data from a

Re: if cassandra isn't ideal for keep track of counts, how does digg count diggs?

2010-04-06 Thread Chris Goffinet
That's not true. We have been using the ZooKeeper work we posted on JIRA. That's what we are using internally and have been for months. We are now just wrapping up our vector clocks + distributed counter patch so we can begin transitioning away from the ZooKeeper approach because there are proble

Re: if cassandra isn't ideal for keep track of counts, how does digg count diggs?

2010-04-06 Thread S Ahmed
Is it just the counters they are using mysql/postgresql for or also the list of stories? e.g. get me the top stories in category x. On Tue, Apr 6, 2010 at 12:50 PM, Ryan King wrote: > They don't use cassandra for it yet. > > -ryan > > On Tue, Apr 6, 2010 at 9:00 AM, S Ahmed wrote: > > From wha

Re: if cassandra isn't ideal for keep track of counts, how does digg count diggs?

2010-04-06 Thread Ryan King
They don't use Cassandra for it yet. -ryan On Tue, Apr 6, 2010 at 9:00 AM, S Ahmed wrote: > From what I read in another thread, Cassandra isn't 'ideal' > for keeping track of counts. > For example, I would understand this to mean keeping track of which stories > were dugg. > If thi

odd problem retrieving binary values using C++

2010-04-06 Thread Chris Beaumont
Hi all... I am having a pretty tough time retrieving binary values out of my DB... I am using cassandra 0.5.1 on Centos 5.4 with java 1.6.0-19 Here is the simple test I am trying to run in C++ /* snip initialization */ { transport->open(); ColumnPath new_col; new_col.__isset.colum

if cassandra isn't ideal for keep track of counts, how does digg count diggs?

2010-04-06 Thread S Ahmed
From what I read in another thread, Cassandra isn't 'ideal' for keeping track of counts. For example, I would understand this to mean keeping track of which stories were dugg. If this is true, how would a site like digg keep track of the 'dugg' counter? Also, I am assuming with ev

Re: Memcached protocol?

2010-04-06 Thread Jonathan Ellis
On Mon, Apr 5, 2010 at 6:48 PM, Tatu Saloranta wrote: > I would think that there is also possibility of losing some > increments, or perhaps getting duplicate increments? > It is not just isolation but also correctness that is hard to maintain > but correctness also. This can be more easily worked

Re: How do vector clocks and conflicts work?

2010-04-06 Thread Mike Malone
> > As long as the conflict resolver knows that two writers each tried to > increment, then it can increment twice. The conflict resolver must know > about the semantics of "increment" or "decrement" or "string append" or > "binary patch" or whatever other merge strategy you choose. You'll register
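The idea that a resolver must know the semantics of the operation can be illustrated with a small sketch. Nothing here is Cassandra's API: `merge_increments`, `merge_last_write_wins`, and the `MERGE_STRATEGIES` registry are hypothetical names, and the model assumes each conflicting sibling was derived from one shared ancestor value.

```python
def merge_increments(ancestor, siblings):
    """Re-apply every writer's delta on top of the common ancestor."""
    return ancestor + sum(value - ancestor for value in siblings)

def merge_last_write_wins(ancestor, siblings):
    """Semantics-free fallback: keep one sibling, silently dropping the others."""
    return siblings[-1]

# Hypothetical registry mapping a merge strategy name to its resolver.
MERGE_STRATEGIES = {
    'increment': merge_increments,
    'last_write_wins': merge_last_write_wins,
}

def resolve(strategy, ancestor, siblings):
    """Resolve conflicting sibling values with the named strategy."""
    return MERGE_STRATEGIES[strategy](ancestor, siblings)

# Two writers each incremented a counter that stood at 10.
counter = resolve('increment', 10, [11, 11])
naive = resolve('last_write_wins', 10, [11, 11])
```

With the increment-aware resolver both increments survive, while the semantics-free fallback keeps only one of them.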

Re: Overwhelming a cluster with writes?

2010-04-06 Thread Jonathan Ellis
On Tue, Apr 6, 2010 at 2:13 AM, Ilya Maykov wrote: > That does sound similar. It's possible that the difference I'm seeing > between ConsistencyLevel.ZERO and ConsistencyLevel.ALL is simply due > to the fact that using ALL slows down the writers enough that the GC > can keep up. No, it's mostly d

Re: how to store list data in Apache Cassandra?

2010-04-06 Thread Shuge Lee
> 'girls': pickle.dumps(['java', 'actionscript', 'python']) I think this is a really bad idea; I can't do any search if using Pickle. > use a SuperColumnFamily sounds nice :) I'm trying to handle it with the following: value = { 'name': 'Lee Li', 'age': '21', 'girls': { 'java': '

Re: A question of 'referential integrity'...

2010-04-06 Thread Eric Evans
On Tue, 2010-04-06 at 12:00 +0100, Steve wrote: > First, I apologise sending this to the 'dev' mailing list - I couldn't > find one for Cassandra users - and also for the basic nature of my > questions... user@cassandra.apache.org, (follow-ups there). > I'm trying to get my head around the possib

Re: i have one mistake in Cassandra.java when i build it

2010-04-06 Thread Jonathan Ellis
This means you rebuilt the Thrift code with an old compiler. If you look in lib/ the thrift jar is tagged with the svn revision we built with. Thrift has frequent regressions, so using that same revision is the best way to avoid unpleasant surprises. On Tue, Apr 6, 2010 at 4:34 AM, 叶江 wrote: >

Re: i have one mistake in Cassandra.java when i build it

2010-04-06 Thread Sylvain Lebresne
That's a bit short of a description, but quite possibly you have a mismatch somehow between your compiled thrift java bindings and cassandra itself. Try to do an ant clean followed by ant gen-thrift-java followed by ant to rebuild everything. -- Sylvain On Tue, Apr 6, 2010 at 11:34 AM, 叶江

i have one mistake in Cassandra.java when i build it

2010-04-06 Thread 叶江
Hi: I want to run some experiments on Cassandra using Java, but when I write the client, the error "cannot convert int to ConsistencyLevel" appears. How can I solve this? Thanks very much.

Re: how to store list data in Apache Cassandra?

2010-04-06 Thread David Strauss
Another option is to use a SuperColumnFamily, but that extends the depth of all such values to be arrays. The "name" and "age" columns would therefore also need to be SuperColumns -- just with a single sub-column each. Like many things in Cassandra, the preferred storage method depends on your app
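The uniform depth described above can be pictured with plain Python dicts. This is only a sketch of the data shape, not a pycassa call: `to_super_row`, the empty-string placeholder values, and reusing the column name as the single sub-column name are all assumptions made for illustration.

```python
def to_super_row(user):
    """Wrap every value in a dict of sub-columns so all entries have the same depth."""
    row = {}
    for name, value in user.items():
        if isinstance(value, (list, tuple)):
            # List items become sub-column names; the values are placeholders.
            row[name] = {str(item): '' for item in value}
        else:
            # Scalars become a super column holding a single sub-column.
            row[name] = {name: value}
    return row

user = {'name': 'lee', 'age': '21', 'girls': ['java', 'actionscript', 'python']}
super_row = to_super_row(user)
```

Storing list items as sub-column names keeps each item individually addressable, at the cost of wrapping the scalar columns as well.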

Re: Overwhelming a cluster with writes?

2010-04-06 Thread Jake Luciani
Hi Ilya, You will always blow up if you use consistency level ZERO to write gigs of data. The safe minimum for writes is ONE. ZERO is meant for small, non-batched writes. Also look into the batch_mutate call to write lots of data at once, in a series of chunks. This helps save on network
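Writing "in a series of chunks" can be sketched independently of any client library. In this minimal example `send_batch` is a hypothetical callable standing in for whatever batch write the client exposes, and the default chunk size of 500 is an arbitrary assumption, not a recommendation from the thread.

```python
from itertools import islice

def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk

def write_in_chunks(rows, send_batch, size=500):
    """Send rows in bounded batches; returns the number of batches sent."""
    sent = 0
    for chunk in chunked(rows, size):
        send_batch(chunk)  # hypothetical: one network round trip per chunk
        sent += 1
    return sent

# Stand-in for a real client: collect the batches in a list.
batches = []
batch_count = write_in_chunks(range(5), batches.append, size=2)
```

Bounding the batch size keeps each request small enough for the server to absorb while still amortising the per-request network cost.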

Re: how to store list data in Apache Cassandra?

2010-04-06 Thread Michael Pearson
Column Families are keyed attribute/value pairs; your 'girls' column will need to be serialised on save and deserialised on load so that it can be treated as your intended array. Pickle will do this for you (http://docs.python.org/library/pickle.html), e.g.: import pycassa import pickle client =
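The serialise-on-save, deserialise-on-load round trip described above can be sketched with the standard library alone. The plain dict `row` stands in for a stored row, and `encode_row` and `decode_row` are hypothetical helpers; no pycassa calls are shown because the original snippet is cut off before them.

```python
import pickle

def encode_row(user):
    """Pickle list values to bytes so each column holds a single flat value."""
    return {
        name: pickle.dumps(value) if isinstance(value, list) else value
        for name, value in user.items()
    }

def decode_row(row, list_columns):
    """Unpickle the named columns back into Python lists."""
    return {
        name: pickle.loads(value) if name in list_columns else value
        for name, value in row.items()
    }

user = {'name': 'lee', 'age': '21', 'girls': ['java', 'actionscript', 'python']}
row = encode_row(user)                      # what would be saved
restored = decode_row(row, {'girls'})       # what the application sees on load
```

The stored value is opaque bytes, which is why the thread notes that a pickled column cannot be searched by its contents.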

how to store list data in Apache Cassandra?

2010-04-06 Thread Shuge Lee
Dear friends: how can I store list data in Apache Cassandra? For example: user['lee'] = { 'name': 'lee', 'age': '21', 'girls': ['java', 'actionscript', 'python'], } Notice the key `girls`. I am using pycassa (a Python library for Cassandra) import pycassa client = pycassa.connect() cf = pycassa.Co

Re: Memcached protocol?

2010-04-06 Thread gabriele renzi
On Tue, Apr 6, 2010 at 2:10 AM, Paul Prescod wrote: > On Mon, Apr 5, 2010 at 4:48 PM, Tatu Saloranta wrote: >> ... >> >> I would think that there is also possibility of losing some >> increments, or perhaps getting duplicate increments? > > I believe that with vector clocks in Cassandra 0.7 you w

Re: Overwhelming a cluster with writes?

2010-04-06 Thread Benjamin Black
OK, cool, looking forward to your results. On Tue, Apr 6, 2010 at 12:18 AM, Ilya Maykov wrote: > Right, I meant 4GB heap vs. the standard 1GB. And all other options in > cassandra.in.sh at their defaults. > > Sorry I am a bit new to JVM tuning, and very new to Cassandra :) > > -- Ilya > > On Tue,

Re: Overwhelming a cluster with writes?

2010-04-06 Thread Ilya Maykov
Right, I meant 4GB heap vs. the standard 1GB. And all other options in cassandra.in.sh at their defaults. Sorry I am a bit new to JVM tuning, and very new to Cassandra :) -- Ilya On Tue, Apr 6, 2010 at 12:16 AM, Benjamin Black wrote: > I am specifically suggesting you NOT use a heap that large

Re: Overwhelming a cluster with writes?

2010-04-06 Thread Benjamin Black
I am specifically suggesting you NOT use a heap that large with your 8GB machines. Please test with 4GB first. On Tue, Apr 6, 2010 at 12:13 AM, Ilya Maykov wrote: > That does sound similar. It's possible that the difference I'm seeing > between ConsistencyLevel.ZERO and ConsistencyLevel.ALL is s

Re: Heap sudden jump during import

2010-04-06 Thread JKnight JKnight
During import, all data in the JSON file is loaded into memory, so you cannot import large data sets. You need to export a large sstable file to many small JSON files and then run the import. On Mon, Apr 5, 2010 at 5:26 PM, Jonathan Ellis wrote: > Usually sudden heap jumps involve compacting large rows. > > 0.

Re: Overwhelming a cluster with writes?

2010-04-06 Thread Ilya Maykov
That does sound similar. It's possible that the difference I'm seeing between ConsistencyLevel.ZERO and ConsistencyLevel.ALL is simply due to the fact that using ALL slows down the writers enough that the GC can keep up. I could do a test with multiple clients writing at ALL in parallel tomorrow. I

How do vector clocks and conflicts work?

2010-04-06 Thread Paul Prescod
This may be the blind leading the blind... On Mon, Apr 5, 2010 at 11:54 PM, Tatu Saloranta wrote: >... > I think the key is that this is not automatic -- there is no general > mechanism for aggregating distinct modifications. Point being that you > could choose one amongst right answers, but not

Re: Flush Commit Log

2010-04-06 Thread JKnight JKnight
Yes, no problem with my live Cassandra server. Thanks, Jonathan. On Mon, Apr 5, 2010 at 11:19 PM, Jonathan Ellis wrote: > On Mon, Apr 5, 2010 at 9:11 PM, JKnight JKnight > wrote: > > Thanks Jonathan, > > > > When I run "nodeprobe flush" with parameter -host is Cassandra server > setup > > on m

Re: Overwhelming a cluster with writes?

2010-04-06 Thread Rob Coli
On 4/5/10 11:48 PM, Ilya Maykov wrote: No, the disks on all nodes have about 750GB free space. Also as mentioned in my follow-up email, writing with ConsistencyLevel.ALL makes the slowdowns / crashes go away. I am not sure if the above is consistent with the cause of #896, but the other sympto