Re: quorum calculation seems to depend on previous selected nodes

2011-01-18 Thread Stephen Connolly
On 18 January 2011 07:15, Samuel Benz  wrote:
> On 01/17/2011 09:28 PM, Jonathan Ellis wrote:
>> On Mon, Jan 17, 2011 at 2:10 PM, Samuel Benz  wrote:
> Case1:
> If 'TEST' was previous stored on Node1, Node2, Node3 -> The update will
> succeed.
>
> Case2:
> If 'TEST' was previous stored on Node2, Node3, Node4 -> The update will
> not work.

 If you have RF=2 then it will be stored on 2 nodes, not 3.  I think
 this is the source of the confusion.

>>>
>>> I checked the existence of the row on the different servers with
>>> sstablekeys after flushing. So I saw three copies of every key in the
>>> cluster.
>>
>> If you want to be guaranteed to be able to read with two nodes down
>> and RF=3, you have to read at CL.ONE, since if the two nodes that are
>> down are replicas of the data you are reading (as in the 2nd case
>> here) Cassandra will be unable to achieve quorum (quorum of 3 is 2
>> live nodes).
>>
>
> Now it seems clear to me. Thanks!
>
> I was confused by the fact that: "live nodes" != "replica live nodes"
>
> Correct me if I'm wrong, but even in a cluster with 1000 nodes and RF=3,
> if I shut down the wrong two nodes, I have the same problem as in my
> mini cluster.

Correct

>
>
> --
> Sam
>
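The arithmetic behind Jonathan's answer can be sketched in a few lines (my own illustration, not Cassandra source; taking nodes 3 and 4 as the stopped pair is an assumption - the thread only implies two specific nodes were down):

```python
def quorum(rf):
    # QUORUM needs a majority of the RF replicas of the key: rf // 2 + 1
    return rf // 2 + 1

def can_achieve_quorum(replicas, down):
    # only replicas of the key count as "live nodes" for this operation,
    # no matter how many other nodes the cluster has
    live = [node for node in replicas if node not in down]
    return len(live) >= quorum(len(replicas))

assert quorum(3) == 2  # quorum of 3 is 2 live replicas

# Suppose nodes 3 and 4 were the ones stopped:
print(can_achieve_quorum({1, 2, 3}, down={3, 4}))  # Case 1: True
print(can_achieve_quorum({2, 3, 4}, down={3, 4}))  # Case 2: False
```

The same check explains the 1000-node case: availability at QUORUM depends only on which of the key's three replicas are up, not on cluster size.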


RE: Super CF or two CFs?

2011-01-18 Thread Steven Mac

Some of the fields are indeed written in one shot, but others (such as label 
and categories) are added later, so I think the question still stands.

Hugo.

From: dri...@gmail.com
Date: Mon, 17 Jan 2011 18:47:28 -0600
Subject: Re: Super CF or two CFs?
To: user@cassandra.apache.org

On Mon, Jan 17, 2011 at 5:12 PM, Steven Mac  wrote:

I guess I was maybe trying to simplify the question too much. In reality I do
not have one volatile part, but multiple ones (say all trading data of the day).
Each would be a supercolumn identified by the time slot, with the individual
fields as subcolumns.



If you're always going to write these attributes in one shot, then just 
serialize them and use a simple CF, there's no need for a SCF.
-Brandon

  

Re: Tombstone lifespan after multiple deletions

2011-01-18 Thread David Boxenhorn
Thanks. In other words, before I delete something, I should check to see
whether it exists as a live row in the first place.

On Tue, Jan 18, 2011 at 9:24 AM, Ryan King  wrote:

> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn 
> wrote:
> > If I delete a row, and later on delete it again, before GCGraceSeconds
> has
> > elapsed, does the tombstone live longer?
>
> Each delete is a new tombstone, which should answer your question.
>
> -ryan
>
> > In other words, if I have the following scenario:
> >
> > GCGraceSeconds = 10 days
> > On day 1 I delete a row
> > On day 5 I delete the row again
> >
> > Will the tombstone be removed on day 10 or day 15?
> >
>
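A toy timeline of Ryan's answer (my own illustration, not Cassandra code): each delete writes an independent tombstone, and each tombstone becomes purgeable GCGraceSeconds after its own timestamp:

```python
GC_GRACE_DAYS = 10

deletes = [1, 5]  # delete on day 1, delete the same row again on day 5
# each delete is a new tombstone; each expires relative to its own timestamp
purgeable = [day + GC_GRACE_DAYS for day in deletes]
print(purgeable)  # [11, 15]
# the second delete does not extend the first tombstone's life: the day-1
# tombstone is removable from day 11, the day-5 tombstone from day 15
```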


Cassandra/Hadoop only write few columns

2011-01-18 Thread Trung Tran
Hi,

I'm working on ColumnFamilyOutputFormat and for some reason my reduce
class does not write all columns to Cassandra. I tried to modify
mapreduce.output.columnfamilyoutputformat.batch.threshold with some
different values (1, 8, etc.) but nothing changes.

What I have in my reduce class is:

ArrayList<Mutation> a = new ArrayList<Mutation>();

a.add(getMutation(colNam1, val1));
a.add(getMutation(colNam2, val2));
a.add(getMutation(colNam2, val2)); // ...etc

context.write(key, a);

Only 2 columns are written into Cassandra, and no error is logged on
either the Hadoop or the Cassandra side. Any help is appreciated.

Thanks,
Trung.


Re: Super CF or two CFs?

2011-01-18 Thread Aaron Morton
With regard to overwrites, and assuming you always want to get all the data for
a stock ticker: any read on the volatile data will potentially touch many
sstables. That IO is unavoidable, so we may as well read as many cols as
possible at the same time. Whereas if you split the data into two CFs you would
incur all the IO for the volatile data plus IO for the non-volatile, and have
to make two calls. (Or use different keys and make a multiget_slice call; the
IO argument still stands.)

Thanks to compaction, less volatile data (say cols that are written once a day,
week or month) will tend to accrete into fewer sstables. To that end it may
make sense to schedule compactions to run after weekly bulk operations. Also
take a look at the per-CF compaction thresholds.

I'd recommend trying one standard CF (with the quotes packed as suggested) to
start with; run some tests and let us know how you go. There are some small
penalties to using super CFs; see the limitations page on the wiki.

Hope that helps.
Aaron
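The IO argument above can be put into numbers with a toy model (my own simplification: assume each read pays one seek per sstable holding a fragment of the requested row):

```python
def io_one_cf(volatile_frags, stable_frags):
    # volatile and stable cols share a row: one call reads all fragments
    calls = 1
    seeks = volatile_frags + stable_frags
    return calls, seeks

def io_two_cfs(volatile_frags, stable_frags):
    # split into two CFs: the same fragments must still be read,
    # but it now takes a call per CF
    calls = 2
    seeks = volatile_frags + stable_frags
    return calls, seeks

print(io_one_cf(4, 1))   # (1, 5)
print(io_two_cfs(4, 1))  # (2, 5): same seeks, one extra round trip
```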



On 18/01/2011, at 9:29 PM, Steven Mac  wrote:

> Some of the fields are indeed written in one shot, but others (such as label 
> and categories) are added later, so I think the question still stands.
> 
> Hugo.
> 
> From: dri...@gmail.com
> Date: Mon, 17 Jan 2011 18:47:28 -0600
> Subject: Re: Super CF or two CFs?
> To: user@cassandra.apache.org
> 
> On Mon, Jan 17, 2011 at 5:12 PM, Steven Mac  wrote:
> I guess I was maybe trying to simplify the question too much. In reality I do 
> not have one volatile part, but multiple ones (say all trading data of day). 
> Each would be a supercolumn identified by the time slot, with the individual 
> fields as subcolumns.
> 
> If you're always going to write these attributes in one shot, then just 
> serialize them and use a simple CF, there's no need for a SCF.
> 
> -Brandon


Is there a concept of a session

2011-01-18 Thread indika kumara
Hi All,

Is there a concept of a session? I would like to log-in(authenticate) one
time into the Cassandra, and then subsequently access the Cassandra without
authenticating again.

Thanks,

Indika


Re: Is there a concept of a session

2011-01-18 Thread Aaron Morton
Yes, the client should maintain its connection to the cluster. The connection
holds the login credentials and the keyspace to use.

This is normally managed by the client, which one are you using?

Aaron
On 18/01/2011, at 9:58 PM, indika kumara  wrote:

> Hi All,
> 
> Is there a concept of a session? I would like to log-in(authenticate) one 
> time into the Cassandra, and then subsequently access the Cassandra without 
> authenticating again.  
> 
> Thanks,
> 
> Indika


Re: balancing load

2011-01-18 Thread Karl Hiramoto

On 17/01/2011 19:27, Edward Capriolo wrote:

cfstats is reporting you have an 8GB Row! I think you could be writing
all your data to a few keys.
You're right - my n00b fault. I was writing everything to one key; the
problem was I had Offer['id'][$UID] = value.

It made it easy before to do a "count Offer['id']" in the CLI.
I was doing that for months, but only recently going from one node to
five and adding a million times more data exposed the issue.


Now everything balances well.

Thanks all.
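For anyone who lands on the same mistake: a whole row lives on the replica set of its single key, so writing everything under one key loads one replica set no matter how many nodes you add. A toy placement sketch (hash-modulo placement here stands in for the real token ring):

```python
import hashlib

def node_for(key, n_nodes=5):
    # toy RandomPartitioner: place each row by a hash of its key
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_nodes

# everything under one key -> one node carries the whole multi-GB row
print(len({node_for("id")}))                                   # 1
# one row per offer -> rows spread across the cluster
print(len({node_for(f"offer-{uid}") for uid in range(1000)}))  # 5
```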



Re: What is be the best possible client option available to a PHP developer for implementing an application ready for production environments ?

2011-01-18 Thread Dave Gardner
I can't comment on phpcassa directly, but we use Cassandra plus PHP in
production without any difficulties. We are happy with the
performance.

Most of the information we needed to get started we found here:

https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP

This includes details on how to compile the native PHP C Extension for
Thrift. We use a bespoke client which wraps the Thrift interface.

You may be better off with a higher-level client, although when we were
starting out there was less of a push away from using Thrift directly. I
found using Thrift useful, as you gain an appreciation for what calls
Cassandra actually supports. One potential advantage of using a higher-level
client is that it may protect you from the frequent Thrift interface
changes which currently seem to accompany every major release.

Dave




On Tuesday, 18 January 2011, Tyler Hobbs  wrote:
>
> 1.)  Is it developed to the level needed to support all the
> necessary features to take full advantage of Cassandra?
>
> Yes.  There aren't some of the niceties of pycassa yet, but you can do 
> everything that Cassandra offers with it.
>
>
> 2. )  Is it used in production by anyone ?
>
> Yes, I've talked to a few people at least who are using it in production.  It 
> tends to play a limited role instead of a central one, though.
>
>
> 3. )  What are its limitations?
>
> Being written in PHP.  Seriously.  The lack of universal 64bit integer 
> support can be problematic if you don't have a fully 64bit system.  PHP is 
> fairly slow.  PHP makes a few other things less easy to do.  If you're doing 
> some pretty lightweight interaction with Cassandra through PHP, these might 
> not be a problem for you.
>
> - Tyler
>
>

-- 
*Dave Gardner*
Technical Architect

*Imagini Europe Limited*
7 Moor Street, London W1D 5NB

+44 20 7734 7033
skype: daveg79
dave.gard...@imagini.net
http://www.visualdna.com

Imagini Europe Limited, Company number 5565112 (England
and Wales), Registered address: c/o Bird & Bird,
90 Fetter Lane, London, EC4A 1EQ, United Kingdom


Re: Cassandra/Hadoop only write few columns

2011-01-18 Thread Aaron Morton
May just be your example code, but you are repeating colNam2. Can you log the
mutation list before you write it and confirm you have unique column names?

Can you turn up the logging to DEBUG for the hadoop job and the Cassandra 
cluster to see what's happening?

Aaron

On 18/01/2011, at 9:40 PM, Trung Tran  wrote:

> Hi,
> 
> I'm working on ColumnFamilyOutputFormat and for some reason my reduce
> class does not write all columns to Cassandra. I tried to modify
> mapreduce.output.columnfamilyoutputformat.batch.threshold with some
> different values (1, 8, etc.) but nothing changes.
> 
> What I have in my reduce class is:
> 
> ArrayList<Mutation> a = new ArrayList<Mutation>();
> 
> a.add(getMutation(colNam1, val1));
> a.add(getMutation(colNam2, val2));
> a.add(getMutation(colNam2, val2)); // ...etc
> 
> context.write(key, a);
> 
> Only 2 columns are written in to cassandra, and no error log is found
> on both hadoop and cassandra log. Any help is appreciated.
> 
> Thanks,
> Trung.


Re: Cassandra/Hadoop only write few columns

2011-01-18 Thread Trung Tran
It was a typo in my example code in this email. I logged the list to
make sure that everything was correct before triggering the write.

Will try to enable debug on both cassandra and hadoop next.

Thanks,
Trung.

On Tue, Jan 18, 2011 at 1:21 AM, Aaron Morton  wrote:
> May just be your example code, but you are repeating colNam2. Can you log
> the mutation list before you write it and confirm you have unique column
> names?
>
> Can you turn up the logging to DEBUG for the hadoop job and the Cassandra 
> cluster to see what's happening?
>
> Aaron
>
> On 18/01/2011, at 9:40 PM, Trung Tran  wrote:
>
>> Hi,
>>
>> I'm working on ColumnFamilyOutputFormat and for some reasons my reduce
>> class does not write all columns to cassandra. I tried to modify
>> mapreduce.output.columnfamilyoutputformat.batch.threshold with some
>> different values (1, 8, .. etc) but no thing changes.
>>
>> What i'm having in my reduce class is :
>>
>> ArrayList a = new ArrayList();
>>
>> a.add(getMutation(colNam1, val1));
>> a.add(getMutation(colNam2, val2));
>> a.add(getMutation(colNam2, val2)); ...etc
>>
>> context.write(key,a);
>>
>> Only 2 columns are written in to cassandra, and no error log is found
>> on both hadoop and cassandra log. Any help is appreciated.
>>
>> Thanks,
>> Trung.
>


Re: Tombstone lifespan after multiple deletions

2011-01-18 Thread Aaron Morton
AFAIK that's not necessary; there is no need to worry about previous deletes.
You can delete stuff that does not even exist: neither batch_mutate nor remove
is going to throw an error.

All the columns that were (roughly speaking) present at your first deletion
will be available for GC at the end of the first tombstone's life. Same for the
second.

Say you were to write a col between the two deletes with the same name as one
present at the start. The first version of the col is available for GC after
tombstone 1, and the second after tombstone 2.

Hope that helps
Aaron
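Aaron's column-version point as a toy timeline (my own model; "covering" below means the earliest tombstone whose timestamp is at or after the column version's, which is the shadowing rule he describes):

```python
GC_GRACE_DAYS = 10
tombstones = [1, 5]    # row deleted on day 1 and again on day 5
col_versions = [0, 3]  # a column written on day 0, rewritten on day 3

def covering_tombstone(col_day, tombstones):
    # a delete shadows column versions written at or before it
    later = [t for t in tombstones if t >= col_day]
    return min(later) if later else None

for day in col_versions:
    t = covering_tombstone(day, tombstones)
    print(f"version from day {day}: GC-able after day {t + GC_GRACE_DAYS}")
# version from day 0: GC-able after day 11  (covered by tombstone 1)
# version from day 3: GC-able after day 15  (covered by tombstone 2)
```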
On 18/01/2011, at 9:37 PM, David Boxenhorn  wrote:

> Thanks. In other words, before I delete something, I should check to see 
> whether it exists as a live row in the first place. 
> 
> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King  wrote:
> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn  wrote:
> > If I delete a row, and later on delete it again, before GCGraceSeconds has
> > elapsed, does the tombstone live longer?
> 
> Each delete is a new tombstone, which should answer your question.
> 
> -ryan
> 
> > In other words, if I have the following scenario:
> >
> > GCGraceSeconds = 10 days
> > On day 1 I delete a row
> > On day 5 I delete the row again
> >
> > Will the tombstone be removed on day 10 or day 15?
> >
> 


Re: Is there a concept of a session

2011-01-18 Thread indika kumara
Hi Aaron,

Thank you very much.

I am going to use the Hector client library. There is a method for creating
a connection to a cluster in that library. But, inside the source code, I
noticed that it calls the 'login' method each time. Is there a server-side
session?

Thanks,

Indika

On Tue, Jan 18, 2011 at 3:07 PM, Aaron Morton wrote:

> Yes, the client should maintain its connection to the cluster. The
> connection holds the login credentials and the keyspace to use.
>
> This is normally managed by the client, which one are you using?
>
> Aaron
> On 18/01/2011, at 9:58 PM, indika kumara  wrote:
>
> > Hi All,
> >
> > Is there a concept of a session? I would like to log-in(authenticate) one
> time into the Cassandra, and then subsequently access the Cassandra without
> authenticating again.
> >
> > Thanks,
> >
> > Indika
>


Re: Is there a concept of a session

2011-01-18 Thread Aaron Morton
I'm just going to assume Hector is doing the right thing, and you probably can 
as well :)

Have you checked out the documentation here ?
http://www.riptano.com/sites/default/files/hector-v2-client-doc.pdf

(Also, yes, the session is server-side; each connection has a thread on the
server it connects to.)

Aaron
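A toy model of the connection-scoped session described above (every name here is invented for illustration - this is not the Thrift or Hector API):

```python
class Connection:
    """Toy model: credentials and keyspace are state of one TCP connection."""

    def __init__(self):
        self.user = None
        self.keyspace = None

    def login(self, user, password):
        # authenticate once per connection; the server remembers it
        self.user = user

    def set_keyspace(self, keyspace):
        self.keyspace = keyspace

    def get(self, key):
        if self.user is None:
            raise PermissionError("this connection never logged in")
        return (self.keyspace, key)  # placeholder for a real read

conn = Connection()        # one TCP connection = one "session"
conn.login("app", "secret")
conn.set_keyspace("Keyspace1")
for k in ("a", "b", "c"):  # many requests reuse the same login
    conn.get(k)
```

If the client pools several connections (as Hector does), each pooled connection has to log in once, which would explain 'login' appearing on every connection-creation path.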

On 18/01/2011, at 10:40 PM, indika kumara  wrote:

> Hi Aaron,
> 
> Thank you very much.
> 
> I am going to use the hector client library. There is a method for creating a 
> connection for a cluster in that library. But, inside the source code, I 
> noticed that each time it calls 'login' method. Is there a server-side 
> session?
> 
> Thanks,
> 
> Indika
> 
> On Tue, Jan 18, 2011 at 3:07 PM, Aaron Morton  wrote:
> Yes, the client should maintain its connection to the cluster. The
> connection holds the login credentials and the keyspace to use.
> 
> This is normally managed by the client, which one are you using?
> 
> Aaron
> On 18/01/2011, at 9:58 PM, indika kumara  wrote:
> 
> > Hi All,
> >
> > Is there a concept of a session? I would like to log-in(authenticate) one 
> > time into the Cassandra, and then subsequently access the Cassandra without 
> > authenticating again.
> >
> > Thanks,
> >
> > Indika
> 


RE: Super CF or two CFs?

2011-01-18 Thread Steven Mac

Thanks for the answer. It gives me the insight I was looking for.

However, I'm also a bit confused, as your first paragraph seems to indicate that
using a SCF is better, whereas the last sentence states just the opposite. Do I
interpret correctly that this is because of the compactions that put all
non-volatile data together in one sstable, leading to a compact sstable if the
non-volatile data is put into a separate CF? Can this then be generalised into
a rule of thumb to separate non-volatile data from volatile data into separate
CFs, or am I going too far?

I will definitely be trying out both suggestions and will post my findings.

Hugo.

Subject: Re: Super CF or two CFs?
From: aa...@thelastpickle.com
Date: Tue, 18 Jan 2011 21:54:25 +1300
To: user@cassandra.apache.org

With regard to overwrites, and assuming you always want to get all the data for
a stock ticker: any read on the volatile data will potentially touch many
sstables. That IO is unavoidable, so we may as well read as many cols as
possible at the same time. Whereas if you split the data into two CFs you would
incur all the IO for the volatile data plus IO for the non-volatile, and have
to make two calls. (Or use different keys and make a multiget_slice call; the
IO argument still stands.)
Thanks to compaction, less volatile data (say cols that are written once a day,
week or month) will tend to accrete into fewer sstables. To that end it may
make sense to schedule compactions to run after weekly bulk operations. Also
take a look at the per-CF compaction thresholds.
I'd recommend trying one standard CF (with the quotes packed as suggested) to
start with; run some tests and let us know how you go. There are some small
penalties to using super CFs; see the limitations page on the wiki.
Hope that helps.
Aaron


On 18/01/2011, at 9:29 PM, Steven Mac  wrote:


Some of the fields are indeed written in one shot, but others (such as label 
and categories) are added later, so I think the question still stands.

Hugo.

From: dri...@gmail.com
Date: Mon, 17 Jan 2011 18:47:28 -0600
Subject: Re: Super CF or two CFs?
To: user@cassandra.apache.org

On Mon, Jan 17, 2011 at 5:12 PM, Steven Mac  wrote:
I guess I was maybe trying to simplify the question too much. In reality I do 
not have one volatile part, but multiple ones (say all trading data of day). 
Each would be a supercolumn identified by the time slot, with the individual 
fields as subcolumns.



If you're always going to write these attributes in one shot, then just 
serialize them and use a simple CF, there's no need for a SCF.
-Brandon


Re: Is there a concept of a session

2011-01-18 Thread indika kumara
Thanks Aaron... Hector cannot use strategies such as cookies for
maintaining a session, so it has to make the authentication call each time?
In the Cassandra server, I see 'ThreadLocal'. Does it keep the
session information? How long is a session alive? Does the connection
mean a TCP connection? Is it a persistent connection - sending and receiving
multiple requests/responses?

Thanks,

Indika

On Tue, Jan 18, 2011 at 3:48 PM, Aaron Morton wrote:

> I'm just going to assume Hector is doing the right thing, and you probably
> can as well :)
>
> Have you checked out the documentation here ?
> http://www.riptano.com/sites/default/files/hector-v2-client-doc.pdf
>
> (also yes the session is server side, each connection has a thread on the
> server it connects to)
>
> Aaron
>
> On 18/01/2011, at 10:40 PM, indika kumara  wrote:
>
> Hi Aaron,
>
> Thank you very much.
>
> I am going to use the hector client library. There is a method for creating
> a connection for a cluster in that library. But, inside the source code, I
> noticed that each time it calls 'login' method. Is there a server-side
> session?
>
> Thanks,
>
> Indika
>
> On Tue, Jan 18, 2011 at 3:07 PM, Aaron Morton < 
> aa...@thelastpickle.com> wrote:
>
>> Yes, the client should maintain its connection to the cluster. The
>> connection holds the login credentials and the keyspace to use.
>>
>> This is normally managed by the client, which one are you using?
>>
>> Aaron
>> On 18/01/2011, at 9:58 PM, indika kumara < 
>> indika.k...@gmail.com> wrote:
>>
>> > Hi All,
>> >
>> > Is there a concept of a session? I would like to log-in(authenticate)
>> one time into the Cassandra, and then subsequently access the Cassandra
>> without authenticating again.
>> >
>> > Thanks,
>> >
>> > Indika
>>
>
>


Java client

2011-01-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
What is the most commonly used Java client library? Which is the most
mature/feature complete?
Noble


Re: Java client

2011-01-18 Thread sharanabasava raddi
I think it's Hector...

2011/1/18 Noble Paul നോബിള്‍ नोब्ळ् 

> What is the most commonly used Java client library? Which is the most
> mature/feature complete?
> Noble
>


Re: Super CF or two CFs?

2011-01-18 Thread Aaron Morton
Sorry, I was not suggesting a super CF is better in the first para; I think it
applies to any CF.

The role of compaction is to (among other things) reduce the number of SSTables
for each CF. The logical endpoint of this process would be a single file for
each CF, giving the lowest possible IO. The volatility of your data (overwrites
and new columns for a row) fights against this process. In reality it will not
get to that endstate; even in the best case I think it will only go down to 3
sstables. See http://wiki.apache.org/cassandra/MemtableSSTable

If you do have some data that is highly volatile, and you have performance
problems, then changing compaction thresholds is a recommended approach I
think. See the comments in cassandra.yaml.

My argument is for you to keep data in one CF if you want to read it together.
As always, store the data to serve the read requests. Do some tests and see
where your bottlenecks may be for your HW and usage. I may be wrong.

IMHO in this discussion a Super or Standard CF will make little performance
difference, other than the super CF limitations mentioned.

Aaron

On 18/01/2011, at 11:14 PM, Steven Mac  wrote:

> Thanks for the answer. It provides me the insight I'm looking for.
> 
> However, I'm also a bit confused as your first paragraph seems to indicate 
> that using a SCF is better, whereas the last sentence states just the 
> opposite. Do I interpret correctly that this is because of the compactions 
> that put all non-volatile data together in one sstable, leading to compact 
> sstable if the non-volatile data is put into a separate CF? Can this then be 
> generalised into a rule of thumb to separate non-volatile data from volatile 
> data into separate CFs, or am I going too far then?
> 
> I will definitely be trying out both suggestions and post my findings.
> 
> Hugo.
> 
> Subject: Re: Super CF or two CFs?
> From: aa...@thelastpickle.com
> Date: Tue, 18 Jan 2011 21:54:25 +1300
> To: user@cassandra.apache.org
> 
> With regard to overwrites, and assuming you always want to get all the data
> for a stock ticker: any read on the volatile data will potentially touch many
> sstables. That IO is unavoidable, so we may as well read as many cols as
> possible at the same time. Whereas if you split the data into two CFs you
> would incur all the IO for the volatile data plus IO for the non-volatile,
> and have to make two calls. (Or use different keys and make a multiget_slice
> call; the IO argument still stands.)
> 
> Thanks to compaction, less volatile data (say cols that are written once a
> day, week or month) will tend to accrete into fewer sstables. To that end
> it may make sense to schedule compactions to run after weekly bulk
> operations. Also take a look at the per-CF compaction thresholds.
> 
> I'd recommend trying one standard CF (with the quotes packed as suggested) to
> start with; run some tests and let us know how you go. There are some small
> penalties to using super CFs; see the limitations page on the wiki.
> 
> Hope that helps.
> Aaron
> 
> 
> 
> On 18/01/2011, at 9:29 PM, Steven Mac  wrote:
> 
> Some of the fields are indeed written in one shot, but others (such as label 
> and categories) are added later, so I think the question still stands.
> 
> Hugo.
> 
> From: dri...@gmail.com
> Date: Mon, 17 Jan 2011 18:47:28 -0600
> Subject: Re: Super CF or two CFs?
> To: user@cassandra.apache.org
> 
> On Mon, Jan 17, 2011 at 5:12 PM, Steven Mac  wrote:
> I guess I was maybe trying to simplify the question too much. In reality I do 
> not have one volatile part, but multiple ones (say all trading data of day). 
> Each would be a supercolumn identified by the time slot, with the individual 
> fields as subcolumns.
> 
> If you're always going to write these attributes in one shot, then just 
> serialize them and use a simple CF, there's no need for a SCF.
> 
> -Brandon


cassandra-cli: where a and b (works) vs. where b and a (doesn't)

2011-01-18 Thread Timo Nentwig
I put a secondary index on rc (IntegerType) and user_agent (AsciiType).

Don't understand this behaviour at all, can somebody explain?

[default@tracking] get crawler where user_agent=foo and rc=200;

0 Row Returned.
[default@tracking] get crawler where rc=200 and user_agent=foo;   
---
RowKey: -??>2
=> (column=rc, value=200, timestamp=1295347760933000)
=> (column=url, value=http://www/0, timestamp=1295347760933000)
=> (column=user_agent, value=foo, timestamp=1295347760915000)

1 Row Returned.
[default@tracking] get crawler where rc>199 and user_agent=foo;

0 Row Returned.
[default@tracking] get crawler where user_agent=foo; 
---
RowKey: -??>7
=> (column=rc, value=207, timestamp=1295347760935000)
=> (column=url, value=http://www/8, timestamp=1295347760933000)
=> (column=user_agent, value=foo, timestamp=1295347760917000)
---
RowKey: -??>8
=> (column=rc, value=209, timestamp=1295347760935000)
=> (column=url, value=http://www/9, timestamp=1295347760933000)
=> (column=user_agent, value=foo, timestamp=1295347760916000)
---
RowKey: -??>5
=> (column=rc, value=201, timestamp=1295347760937000)
=> (column=url, value=http://www/2, timestamp=1295347760933000)
=> (column=user_agent, value=foo, timestamp=1295347760916000)
---
RowKey: -??>6
=> (column=rc, value=205, timestamp=1295347760935000)
=> (column=url, value=http://www/5, timestamp=1295347760933000)
=> (column=user_agent, value=foo, timestamp=1295347760917000)
---
RowKey: -??>2
=> (column=rc, value=200, timestamp=1295347760933000)
=> (column=url, value=http://www/0, timestamp=1295347760933000)
=> (column=user_agent, value=foo, timestamp=1295347760915000)

5 Rows Returned.



Re: Java client

2011-01-18 Thread Daniel Lundin
Hector is excellent.

https://github.com/rantav/hector
http://www.datastax.com/sites/default/files/hector-v2-client-doc.pdf

2011/1/18 Noble Paul നോബിള്‍  नोब्ळ् :
> What is the most commonly used Java client library? Which is the most
> mature/feature complete?
> Noble


Re: Is there a concept of a session

2011-01-18 Thread Aaron Morton
There are no cookies in Thrift.

All connection state is managed by the server. It's a TCP connection; multiple
requests are sent over it, and it stays around as long as the client wants it
to.

Try the Hector mailing list for details on its implementation.
Aaron
On 18/01/2011, at 11:15 PM, indika kumara  wrote:

> Thanks Aaron...  Hector cannot uses strategies such as cookies for 
> maintaining session, so it has to make the authentication call each time?  In 
> the Cassandra server, I see 'ThreadLocal'.  It keeps the session 
> information?  How long is a session alive?  Does the connection means a TCP 
> connection?  is it a persistent connection - send and receive multiple 
> requests/responses?
> 
> Thanks,
> 
> Indika
> 
> On Tue, Jan 18, 2011 at 3:48 PM, Aaron Morton  wrote:
> I'm just going to assume Hector is doing the right thing, and you probably 
> can as well :)
> 
> Have you checked out the documentation here ?
> http://www.riptano.com/sites/default/files/hector-v2-client-doc.pdf
> 
> (also yes the session is server side, each connection has a thread on the 
> server it connects to)
> 
> Aaron
> 
> On 18/01/2011, at 10:40 PM, indika kumara  wrote:
> 
>> Hi Aaron,
>> 
>> Thank you very much.
>> 
>> I am going to use the hector client library. There is a method for creating 
>> a connection for a cluster in that library. But, inside the source code, I 
>> noticed that each time it calls 'login' method. Is there a server-side 
>> session?
>> 
>> Thanks,
>> 
>> Indika
>> 
>> On Tue, Jan 18, 2011 at 3:07 PM, Aaron Morton  
>> wrote:
>> Yes, the client should maintain its connection to the cluster. The
>> connection holds the login credentials and the keyspace to use.
>> 
>> This is normally managed by the client, which one are you using?
>> 
>> Aaron
>> On 18/01/2011, at 9:58 PM, indika kumara  wrote:
>> 
>> > Hi All,
>> >
>> > Is there a concept of a session? I would like to log-in(authenticate) one 
>> > time into the Cassandra, and then subsequently access the Cassandra 
>> > without authenticating again.
>> >
>> > Thanks,
>> >
>> > Indika
>> 
> 


Re: Java client

2011-01-18 Thread Aaron Morton
http://wiki.apache.org/cassandra/ClientOptions
Hector


On 18/01/2011, at 11:48 PM, Noble Paul നോബിള്‍  नोब्ळ् 
wrote:

> What is the most commonly used Java client library? Which is the most
> mature/feature complete?
> Noble


Re: Is there a concept of a session

2011-01-18 Thread indika kumara
Thanks Aaron. I will look into the codebase.

Thanks,

Indika

On Tue, Jan 18, 2011 at 4:55 PM, Aaron Morton wrote:

> There are no cookies in thrift.
>
> All connection state is managed by the server. It's a tcp connection.
> Multiple request are sent over it,it stays around as long as the client
> wants it to.
>
> Try the Hector mailing list for details on its implementation.
> Aaron
>
> On 18/01/2011, at 11:15 PM, indika kumara  wrote:
>
> Thanks Aaron...  Hector cannot uses strategies such as cookies for
> maintaining session, so it has to make the authentication call each time?
> In the Cassandra server, I see 'ThreadLocal'.  It keeps the
> session information?  How long is a session alive?  Does the connection
> means a TCP connection?  is it a persistent connection - send and receive
> multiple requests/responses?
>
> Thanks,
>
> Indika
>
> On Tue, Jan 18, 2011 at 3:48 PM, Aaron Morton < 
> aa...@thelastpickle.com> wrote:
>
>> I'm just going to assume Hector is doing the right thing, and you probably
>> can as well :)
>>
>> Have you checked out the documentation here ?
>> 
>> http://www.riptano.com/sites/default/files/hector-v2-client-doc.pdf
>>
>> (also yes the session is server side, each connection has a thread on the
>> server it connects to)
>>
>> Aaron
>>
>> On 18/01/2011, at 10:40 PM, indika kumara < 
>> indika.k...@gmail.com> wrote:
>>
>> Hi Aaron,
>>
>> Thank you very much.
>>
>> I am going to use the hector client library. There is a method for
>> creating a connection for a cluster in that library. But, inside the source
>> code, I noticed that each time it calls 'login' method. Is there a
>> server-side session?
>>
>> Thanks,
>>
>> Indika
>>
>> On Tue, Jan 18, 2011 at 3:07 PM, Aaron Morton < 
>> 
>> aa...@thelastpickle.com> wrote:
>>
>>> Yes, the client should maintain its connection to the cluster. The
>>> connection holds the login credentials and the keyspace to use.
>>>
>>> This is normally managed by the client, which one are you using?
>>>
>>> Aaron
>>> On 18/01/2011, at 9:58 PM, indika kumara < 
>>> 
>>> indika.k...@gmail.com> wrote:
>>>
>>> > Hi All,
>>> >
>>> > Is there a concept of a session? I would like to log-in(authenticate)
>>> one time into the Cassandra, and then subsequently access the Cassandra
>>> without authenticating again.
>>> >
>>> > Thanks,
>>> >
>>> > Indika
>>>
>>
>>
>


Re: cassandra-cli: where a and b (works) vs. where b and a (doesn't)

2011-01-18 Thread Aaron Morton
Does wrapping foo in single quotes help?
Also, does this help 
http://www.datastax.com/blog/whats-new-cassandra-07-secondary-indexes

Aaron

On 18/01/2011, at 11:54 PM, Timo Nentwig  wrote:

> I put a secondary index on rc (IntegerType) and user_agent (AsciiType).
> 
> Don't understand this behaviour at all, can somebody explain?
> 
> [default@tracking] get crawler where user_agent=foo and rc=200;
> 
> 0 Row Returned.
> [default@tracking] get crawler where rc=200 and user_agent=foo;   
> ---
> RowKey: -??>2
> => (column=rc, value=200, timestamp=1295347760933000)
> => (column=url, value=http://www/0, timestamp=1295347760933000)
> => (column=user_agent, value=foo, timestamp=1295347760915000)
> 
> 1 Row Returned.
> [default@tracking] get crawler where rc>199 and user_agent=foo;
> 
> 0 Row Returned.
> [default@tracking] get crawler where user_agent=foo; 
> ---
> RowKey: -??>7
> => (column=rc, value=207, timestamp=1295347760935000)
> => (column=url, value=http://www/8, timestamp=1295347760933000)
> => (column=user_agent, value=foo, timestamp=1295347760917000)
> ---
> RowKey: -??>8
> => (column=rc, value=209, timestamp=1295347760935000)
> => (column=url, value=http://www/9, timestamp=1295347760933000)
> => (column=user_agent, value=foo, timestamp=1295347760916000)
> ---
> RowKey: -??>5
> => (column=rc, value=201, timestamp=1295347760937000)
> => (column=url, value=http://www/2, timestamp=1295347760933000)
> => (column=user_agent, value=foo, timestamp=1295347760916000)
> ---
> RowKey: -??>6
> => (column=rc, value=205, timestamp=1295347760935000)
> => (column=url, value=http://www/5, timestamp=1295347760933000)
> => (column=user_agent, value=foo, timestamp=1295347760917000)
> ---
> RowKey: -??>2
> => (column=rc, value=200, timestamp=1295347760933000)
> => (column=url, value=http://www/0, timestamp=1295347760933000)
> => (column=user_agent, value=foo, timestamp=1295347760915000)
> 
> 5 Rows Returned.
> 


RE: Super CF or two CFs?

2011-01-18 Thread Steven Mac

Thanks for the clarification.

Hugo & Steven.

Subject: Re: Super CF or two CFs?
From: aa...@thelastpickle.com
Date: Tue, 18 Jan 2011 23:51:25 +1300
To: user@cassandra.apache.org

Sorry, I was not suggesting a super CF is better in the first para; I think it 
applies to any CF.
The role of compaction is to (among other things) reduce the number of SSTables 
for each CF. The logical endpoint of this process would be a single file for 
each CF, giving the lowest possible IO. The volatility of your data (overwrites 
and new columns for a row) fights against this process. In reality it will not 
get to that end state. Even in the best case I think it will only go down to 3 
sstables. See http://wiki.apache.org/cassandra/MemtableSSTable

If you do have some data that is highly volatile, and you have performance 
problems, then changing the compaction thresholds is a recommended approach I 
think. See the comments in cassandra.yaml.
My argument is for you to keep data in one CF if you want to read it together. 
As always, store the data to serve the read requests. Do some tests and see 
where your bottlenecks may be for your HW and usage. I may be wrong.
IMHO in this discussion Super or Standard CF will make little performance 
difference, other than the super CF limitations mentioned.
Aaron
On 18/01/2011, at 11:14 PM, Steven Mac  wrote:


Thanks for the answer. It provides me the insight I'm looking for.

However, I'm also a bit confused, as your first paragraph seems to indicate that 
using a SCF is better, whereas the last sentence states just the opposite. Do I 
interpret correctly that this is because of the compactions that put all 
non-volatile data together in one sstable, leading to a compact sstable if the 
non-volatile data is put into a separate CF? Can this then be generalised into 
a rule of thumb to separate non-volatile data from volatile data into separate 
CFs, or am I going too far then?

I will definitely be trying out both suggestions and post my findings.

Hugo.

Subject: Re: Super CF or two CFs?
From: aa...@thelastpickle.com
Date: Tue, 18 Jan 2011 21:54:25 +1300
To: user@cassandra.apache.org

With regard to overwrites, and assuming you always want to get all the data for 
a stock ticker: any read on the volatile data will potentially touch many 
sstables. This IO is unavoidable to read this data, so we may as well read as 
many cols as possible at this time. Whereas if you split the data into two CFs 
you would incur all the IO for the volatile data plus IO for the non-volatile, 
and have to make two calls. (Or use different keys and make a multiget_slice 
call; the IO argument still stands.)
Thanks to compaction, less volatile data, say cols that are written once a day, 
week or month, will tend to accrete into fewer sstables. To that end it may 
make sense to schedule compactions to run after weekly bulk operations. Also 
take a look at the per-CF compaction thresholds.
I'd recommend trying one standard CF (with the quotes packed as suggested) to 
start with, run some tests and let us know how you go. There are some small 
penalties to using super CFs; see the limitations page on the wiki.
Hope that helps. Aaron


On 18/01/2011, at 9:29 PM, Steven Mac  wrote:


Some of the fields are indeed written in one shot, but others (such as label 
and categories) are added later, so I think the question still stands.

Hugo.

From: dri...@gmail.com
Date: Mon, 17 Jan 2011 18:47:28 -0600
Subject: Re: Super CF or two CFs?
To: user@cassandra.apache.org

On Mon, Jan 17, 2011 at 5:12 PM, Steven Mac  wrote:







I guess I was maybe trying to simplify the question too much. In reality I do 
not have one volatile part, but multiple ones (say all trading data of day). 
Each would be a supercolumn identified by the time slot, with the individual 
fields as subcolumns.



If you're always going to write these attributes in one shot, then just 
serialize them and use a simple CF, there's no need for a SCF.
-Brandon

  
  
  

Re: cassandra-cli: where a and b (works) vs. where b and a (doesn't)

2011-01-18 Thread Timo Nentwig

On Jan 18, 2011, at 12:02, Aaron Morton wrote:

> Does wrapping foo in single quotes help?

No.

> Also, does this help 
> http://www.datastax.com/blog/whats-new-cassandra-07-secondary-indexes

Actually this doesn't even compile because addGtExpression expects a String 
type (?!).

StringSerializer ss = StringSerializer.get();
IndexedSlicesQuery<String, String, String> indexedSlicesQuery = 
HFactory.createIndexedSlicesQuery(keyspace, ss, ss, ss);
indexedSlicesQuery.setColumnNames("full_name", "birth_date", "state");
indexedSlicesQuery.addGtExpression("birth_date", 1970L);
indexedSlicesQuery.addEqualsExpression("state", "UT");
indexedSlicesQuery.setColumnFamily("users");
indexedSlicesQuery.setStartKey("");
QueryResult<OrderedRows<String, String, String>> result = 
indexedSlicesQuery.execute();

> Aaron
> 
> On 18/01/2011, at 11:54 PM, Timo Nentwig  wrote:
> 
>> I put a secondary index on rc (IntegerType) and user_agent (AsciiType).
>> 
>> Don't understand this behaviour at all, can somebody explain?
>> 
>> [default@tracking] get crawler where user_agent=foo and rc=200;
>> 
>> 0 Row Returned.
>> [default@tracking] get crawler where rc=200 and user_agent=foo;   
>> ---
>> RowKey: -??>2
>> => (column=rc, value=200, timestamp=1295347760933000)
>> => (column=url, value=http://www/0, timestamp=1295347760933000)
>> => (column=user_agent, value=foo, timestamp=1295347760915000)
>> 
>> 1 Row Returned.
>> [default@tracking] get crawler where rc>199 and user_agent=foo;
>> 
>> 0 Row Returned.
>> [default@tracking] get crawler where user_agent=foo; 
>> ---
>> RowKey: -??>7
>> => (column=rc, value=207, timestamp=1295347760935000)
>> => (column=url, value=http://www/8, timestamp=1295347760933000)
>> => (column=user_agent, value=foo, timestamp=1295347760917000)
>> ---
>> RowKey: -??>8
>> => (column=rc, value=209, timestamp=1295347760935000)
>> => (column=url, value=http://www/9, timestamp=1295347760933000)
>> => (column=user_agent, value=foo, timestamp=1295347760916000)
>> ---
>> RowKey: -??>5
>> => (column=rc, value=201, timestamp=1295347760937000)
>> => (column=url, value=http://www/2, timestamp=1295347760933000)
>> => (column=user_agent, value=foo, timestamp=1295347760916000)
>> ---
>> RowKey: -??>6
>> => (column=rc, value=205, timestamp=1295347760935000)
>> => (column=url, value=http://www/5, timestamp=1295347760933000)
>> => (column=user_agent, value=foo, timestamp=1295347760917000)
>> ---
>> RowKey: -??>2
>> => (column=rc, value=200, timestamp=1295347760933000)
>> => (column=url, value=http://www/0, timestamp=1295347760933000)
>> => (column=user_agent, value=foo, timestamp=1295347760915000)
>> 
>> 5 Rows Returned.
>> 



Re: Multi-tenancy, and authentication and authorization

2011-01-18 Thread indika kumara
Moving to user list

On Tue, Jan 18, 2011 at 4:05 PM, Aaron Morton wrote:

> Have a read about JVM heap sizing here
> http://wiki.apache.org/cassandra/MemtableThresholds
>
> If you let people create keyspaces with a mouse click you will soon run out
> of memory.
>
> I use Cassandra to provide a self service "storage service" at my
> organisation. All virtual databases operate in the same Cassandra keyspace
> (which does not change), and I use namespaces in the keys to separate
> things. Take a look at how amazon S3 works, it may give you some ideas.
>
> If you want to continue the discussion, let's move this to the user list.
>
> A
>
>
> On 17/01/2011, at 7:44 PM, indika kumara  wrote:
>
> > Hi Stu,
> >
> > In our app, we would like to offer Cassandra 'as-is' to tenants. In that
> > case, each tenant should be able to create Keyspaces as needed. Based on
> the
> > authorization, I expect to implement it. In my view, the implementation
> > options are as follows.
> >
> > 1) The name of a keyspace would be 'the actual keyspace name' + 'tenant
> ID'
> >
> > 2) The name of a keyspace would not be changed, but the name of a column
> > family would be the 'the actual column family name' + 'tenant ID'.  It is
> > needed to keep a separate mapping for keyspace vs tenants.
> >
> > 3) The name of a keyspace or a column family would not be changed, but the
> > name of a column would be 'the actual column name' + 'tenant ID'. It is
> > needed to keep separate mappings for keyspace vs tenants and column
> family
> > vs tenants
> >
> > Could you please give your opinions on the above three options?  if there
> > are any issue regarding above approaches and if those issues can be
> solved,
> > I would love to contribute on that.
> >
> > Thanks,
> >
> > Indika
> >
> >
> > On Fri, Jan 7, 2011 at 11:22 AM, Stu Hood  wrote:
> >
> >>> (1) has the problem of multiple memtables (a large amount just isn't
> >> viable
> >> There are some very straightforward solutions to this particular
> problem: I
> >> wouldn't rule out running with a very large number of
> >> keyspace/columnfamilies given some minor changes.
> >>
> >> As Brandon said, some of the folks that were working on multi-tenancy
> for
> >> Cassandra are no longer focused on it. But the code that was generated
> >> during our efforts is very much available, and is unlikely to have gone
> >> stale. Would love to talk about this with you.
> >>
> >> Thanks,
> >> Stu
> >>
> >> On Thu, Jan 6, 2011 at 8:08 PM, indika kumara 
> >> wrote:
> >>
> >>> Thank you very much Brandon!
> >>>
> >>> On Fri, Jan 7, 2011 at 12:40 AM, Brandon Williams 
> >>> wrote:
> >>>
>  On Thu, Jan 6, 2011 at 12:33 PM, indika kumara  > wrote:
> 
> > Hi Brandon,
> >
> > I would like your feedback on my two ideas for implementing multi
> >>> tenancy
> > with the existing implementation.  Would those be possible to
> >>> implement?
> >
> > Thanks,
> >
> > Indika
> >
> >> Two vague ideas: (1) qualified keyspaces (by the tenant domain)
> >>> (2)
> > multiple Cassandra storage configurations in a single node (one per
> > tenant).
> > For both options, the resource hierarchy would be /cassandra/
> > //keyspaces//
> >
> 
>  (1) has the problem of multiple memtables (a large amount just isn't
> >>> viable
>  right now.)  (2) more or less has the same problem, but in JVM
> >> instances.
> 
>  I would suggest a) not trying to offer cassandra itself, and instead
> >>> build
>  a
>  service that uses cassandra under the hood, and b) splitting up
> tenants
> >>> in
>  this layer.
> 
>  -Brandon
> 
> >>>
> >>
>


Re: Multi-tenancy, and authentication and authorization

2011-01-18 Thread indika kumara
Hi Aaron,

I appreciate your help. I am a newbie to Cassandra - just began to study the
code-base.

Do you suggest the following approach?

*1) No changes to either keyspace names or column family names, but the
row key would be 'the actual row key' + 'tenant ID'. It is needed to keep
separate mappings for keyspace vs tenants and column family vs tenants (can
be a form of authorization).*

2) *keep a keyspace per tenant yet expose virtually as many keyspaces.*

3)* A single keyspace for all tenants *

What do you mean by 'use namespaces in the keys'?  Can a key be a QName?

Thanks,

Indika


On Tue, Jan 18, 2011 at 5:26 PM, indika kumara wrote:

> Moving to user list
>
>
> On Tue, Jan 18, 2011 at 4:05 PM, Aaron Morton wrote:
>
>> Have a read about JVM heap sizing here
>> http://wiki.apache.org/cassandra/MemtableThresholds
>>
>> If you let people create keyspaces with a mouse click you will soon run
>> out of memory.
>>
>> I use Cassandra to provide a self service "storage service" at my
>> organization. All virtual databases operate in the same Cassandra keyspace
>> (which does not change), and I use namespaces in the keys to separate
>> things. Take a look at how amazon S3 works, it may give you some ideas.
>>
>> If you want to continue the discussion, let's move this to the user list.
>>
>> A
>>
>>
>> On 17/01/2011, at 7:44 PM, indika kumara  wrote:
>>
>> > Hi Stu,
>> >
>> > In our app, we would like to offer Cassandra 'as-is' to tenants. In
>> that
>> > case, each tenant should be able to create Keyspaces as needed. Based on
>> the
>> > authorization, I expect to implement it. In my view, the implementation
>> > options are as follows.
>> >
>> > 1) The name of a keyspace would be 'the actual keyspace name' + 'tenant
>> ID'
>> >
>> > 2) The name of a keyspace would not be changed, but the name of a column
>> > family would be the 'the actual column family name' + 'tenant ID'.  It
>> is
>> > needed to keep a separate mapping for keyspace vs tenants.
>> >
>> > 3) The name of a keyspace or a column family would not be changed, but
>> the
>> > name of a column would be 'the actual column name' + 'tenant ID'. It is
>> > needed to keep separate mappings for keyspace vs tenants and column
>> family
>> > vs tenants
>> >
>> > Could you please give your opinions on the above three options?  if
>> there
>> > are any issue regarding above approaches and if those issues can be
>> solved,
>> > I would love to contribute on that.
>> >
>> > Thanks,
>> >
>> > Indika
>> >
>> >
>> > On Fri, Jan 7, 2011 at 11:22 AM, Stu Hood  wrote:
>> >
>> >>> (1) has the problem of multiple memtables (a large amount just isn't
>> >> viable
>> >> There are some very straightforward solutions to this particular
>> problem: I
>> >> wouldn't rule out running with a very large number of
>> >> keyspace/columnfamilies given some minor changes.
>> >>
>> >> As Brandon said, some of the folks that were working on multi-tenancy
>> for
>> >> Cassandra are no longer focused on it. But the code that was generated
>> >> during our efforts is very much available, and is unlikely to have gone
>> >> stale. Would love to talk about this with you.
>> >>
>> >> Thanks,
>> >> Stu
>> >>
>> >> On Thu, Jan 6, 2011 at 8:08 PM, indika kumara 
>> >> wrote:
>> >>
>> >>> Thank you very much Brandon!
>> >>>
>> >>> On Fri, Jan 7, 2011 at 12:40 AM, Brandon Williams 
>> >>> wrote:
>> >>>
>>  On Thu, Jan 6, 2011 at 12:33 PM, indika kumara <
>> indika.k...@gmail.com
>> > wrote:
>> 
>> > Hi Brandon,
>> >
>> > I would like your feedback on my two ideas for implementing multi
>> >>> tenancy
>> > with the existing implementation.  Would those be possible to
>> >>> implement?
>> >
>> > Thanks,
>> >
>> > Indika
>> >
>> >> Two vague ideas: (1) qualified keyspaces (by the tenant domain)
>> >>> (2)
>> > multiple Cassandra storage configurations in a single node (one per
>> > tenant).
>> > For both options, the resource hierarchy would be /cassandra/
>> > //keyspaces//
>> >
>> 
>>  (1) has the problem of multiple memtables (a large amount just isn't
>> >>> viable
>>  right now.)  (2) more or less has the same problem, but in JVM
>> >> instances.
>> 
>>  I would suggest a) not trying to offer cassandra itself, and instead
>> >>> build
>>  a
>>  service that uses cassandra under the hood, and b) splitting up
>> tenants
>> >>> in
>>  this layer.
>> 
>>  -Brandon
>> 
>> >>>
>> >>
>>
>
>
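The "namespaces in the keys" approach Aaron describes can be sketched in a few lines: all tenants share one keyspace and column family, and every row key is prefixed with the tenant ID. The helper names and the '/' separator below are illustrative assumptions, not a Cassandra or Hector API.

```java
public class TenantKeys {
    private static final char SEP = '/';

    /** Builds the physical row key stored in the shared keyspace. */
    static String makeKey(String tenantId, String rowKey) {
        if (tenantId.indexOf(SEP) >= 0) {
            // The separator must not appear in tenant IDs, or parsing breaks.
            throw new IllegalArgumentException("tenant id must not contain '" + SEP + "'");
        }
        return tenantId + SEP + rowKey;
    }

    /** Recovers the tenant from a namespaced key (e.g. for authorization checks). */
    static String tenantOf(String namespacedKey) {
        return namespacedKey.substring(0, namespacedKey.indexOf(SEP));
    }

    public static void main(String[] args) {
        String key = makeKey("acme", "user:42");
        System.out.println(key);           // acme/user:42
        System.out.println(tenantOf(key)); // acme
    }
}
```

One design consequence: with an order-preserving partitioner a tenant's rows sort together, so per-tenant range scans become prefix scans; with the random partitioner the prefix still isolates tenants logically, but their rows are scattered across the ring.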


Re: Java client

2011-01-18 Thread Jools
We moved over to Hector when we went to Cassandra 0.7, it was a painless and
worthwhile experience.

> What is the most commonly used java client library? Which is the the most
> mature/feature complete?
>

--Jools


Re: Java client

2011-01-18 Thread Alois Bělaška
Definitely Pelops https://github.com/s7/scale7-pelops

2011/1/18 Noble Paul നോബിള്‍ नोब्ळ् 

> What is the most commonly used java client library? Which is the the most
> mature/feature complete?
> Noble
>


Re: Tombstone lifespan after multiple deletions

2011-01-18 Thread David Boxenhorn
Thanks, Aaron, but I'm not 100% clear.

My situation is this: My use case spins off rows (not columns) that I no
longer need and want to delete. It is possible that these rows were never
created in the first place, or were already deleted. This is a very large
cleanup task that normally deletes a lot of rows, and the last thing that I
want to do is create tombstones for rows that didn't exist in the first
place, or lengthen the life on disk of tombstones of rows that are already
deleted.

So the question is: before I delete, do I have to retrieve the row to see if
it exists in the first place?



On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton wrote:

> AFAIK that's not necessary, there is no need to worry about previous
> deletes. You can delete stuff that does not even exist, neither batch_mutate
> or remove are going to throw an error.
>
> All the columns that were (roughly speaking) present at your first deletion
> will be available for GC at the end of the first tombstones life. Same for
> the second.
>
> Say you were to write a col between the two deletes with the same name as
> one present at the start. The first version of the col is avail for GC after
> tombstone 1, and the second after tombstone 2.
>
> Hope that helps
> Aaron
>
> On 18/01/2011, at 9:37 PM, David Boxenhorn  wrote:
>
> Thanks. In other words, before I delete something, I should check to see
> whether it exists as a live row in the first place.
>
> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King < 
> r...@twitter.com> wrote:
>
>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn < 
>> da...@lookin2.com> wrote:
>> > If I delete a row, and later on delete it again, before GCGraceSeconds
>> has
>> > elapsed, does the tombstone live longer?
>>
>> Each delete is a new tombstone, which should answer your question.
>>
>> -ryan
>>
>> > In other words, if I have the following scenario:
>> >
>> > GCGraceSeconds = 10 days
>> > On day 1 I delete a row
>> > On day 5 I delete the row again
>> >
>> > Will the tombstone be removed on day 10 or day 15?
>> >
>>
>
>


Re: Tombstone lifespan after multiple deletions

2011-01-18 Thread Sylvain Lebresne
On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn  wrote:
> Thanks, Aaron, but I'm not 100% clear.
>
> My situation is this: My use case spins off rows (not columns) that I no
> longer need and want to delete. It is possible that these rows were never
> created in the first place, or were already deleted. This is a very large
> cleanup task that normally deletes a lot of rows, and the last thing that I
> want to do is create tombstones for rows that didn't exist in the first
> place, or lengthen the life on disk of tombstones of rows that are already
> deleted.
>
> So the question is: before I delete, do I have to retrieve the row to see if
> it exists in the first place?

Yes, in your situation you do.

>
>
>
> On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton 
> wrote:
>>
>> AFAIK that's not necessary, there is no need to worry about previous
>> deletes. You can delete stuff that does not even exist, neither batch_mutate
>> or remove are going to throw an error.
>> All the columns that were (roughly speaking) present at your first
>> deletion will be available for GC at the end of the first tombstones life.
>> Same for the second.
>> Say you were to write a col between the two deletes with the same name as
>> one present at the start. The first version of the col is avail for GC after
>> tombstone 1, and the second after tombstone 2.
>> Hope that helps
>> Aaron
>> On 18/01/2011, at 9:37 PM, David Boxenhorn  wrote:
>>
>> Thanks. In other words, before I delete something, I should check to see
>> whether it exists as a live row in the first place.
>>
>> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King  wrote:
>>>
>>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn 
>>> wrote:
>>> > If I delete a row, and later on delete it again, before GCGraceSeconds
>>> > has
>>> > elapsed, does the tombstone live longer?
>>>
>>> Each delete is a new tombstone, which should answer your question.
>>>
>>> -ryan
>>>
>>> > In other words, if I have the following scenario:
>>> >
>>> > GCGraceSeconds = 10 days
>>> > On day 1 I delete a row
>>> > On day 5 I delete the row again
>>> >
>>> > Will the tombstone be removed on day 10 or day 15?
>>> >
>>
>
>


Re: Tombstone lifespan after multiple deletions

2011-01-18 Thread David Boxenhorn
Thanks.

On Tue, Jan 18, 2011 at 3:55 PM, Sylvain Lebresne wrote:

> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn 
> wrote:
> > Thanks, Aaron, but I'm not 100% clear.
> >
> > My situation is this: My use case spins off rows (not columns) that I no
> > longer need and want to delete. It is possible that these rows were never
> > created in the first place, or were already deleted. This is a very large
> > cleanup task that normally deletes a lot of rows, and the last thing that
> I
> > want to do is create tombstones for rows that didn't exist in the first
> > place, or lengthen the life on disk of tombstones of rows that are
> already
> > deleted.
> >
> > So the question is: before I delete, do I have to retrieve the row to see
> if
> > it exists in the first place?
>
> Yes, in your situation you do.
>
> >
> >
> >
> > On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton 
> > wrote:
> >>
> >> AFAIK that's not necessary, there is no need to worry about previous
> >> deletes. You can delete stuff that does not even exist, neither
> batch_mutate
> >> or remove are going to throw an error.
> >> All the columns that were (roughly speaking) present at your first
> >> deletion will be available for GC at the end of the first tombstones
> life.
> >> Same for the second.
> >> Say you were to write a col between the two deletes with the same name
> as
> >> one present at the start. The first version of the col is avail for GC
> after
> >> tombstone 1, and the second after tombstone 2.
> >> Hope that helps
> >> Aaron
> >> On 18/01/2011, at 9:37 PM, David Boxenhorn  wrote:
> >>
> >> Thanks. In other words, before I delete something, I should check to see
> >> whether it exists as a live row in the first place.
> >>
> >> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King  wrote:
> >>>
> >>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn 
> >>> wrote:
> >>> > If I delete a row, and later on delete it again, before
> GCGraceSeconds
> >>> > has
> >>> > elapsed, does the tombstone live longer?
> >>>
> >>> Each delete is a new tombstone, which should answer your question.
> >>>
> >>> -ryan
> >>>
> >>> > In other words, if I have the following scenario:
> >>> >
> >>> > GCGraceSeconds = 10 days
> >>> > On day 1 I delete a row
> >>> > On day 5 I delete the row again
> >>> >
> >>> > Will the tombstone be removed on day 10 or day 15?
> >>> >
> >>
> >
> >
>
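The behaviour Ryan describes (each delete writes its own tombstone, aged from its own deletion time) can be modelled in a couple of lines. This is a toy sketch of the timing, not Cassandra internals:

```java
public class TombstoneSketch {
    /** Day on which a tombstone written on deleteDay becomes eligible for GC. */
    static int purgeDay(int deleteDay, int gcGraceDays) {
        // Each tombstone's lifetime is measured from its own deletion time;
        // a later re-delete does not extend an earlier tombstone's life.
        return deleteDay + gcGraceDays;
    }

    public static void main(String[] args) {
        int gcGraceDays = 10;
        // Delete on day 1 and again on day 5: two independent tombstones.
        System.out.println(purgeDay(1, gcGraceDays)); // 11
        System.out.println(purgeDay(5, gcGraceDays)); // 15
    }
}
```

So data shadowed by the first delete is purgeable once the first tombstone expires, while the second tombstone lives out its own GCGraceSeconds independently.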


Re: What is the best possible client option available to a PHP developer for implementing an application ready for production environments?

2011-01-18 Thread Ertio Lew
I think, in that case, we might need to go with a full Java implementation
and live with Hector, as we do not find any other better option.

@Dave: Thanks for the links, but we would much prefer not to go with a raw
Thrift implementation because of the frequently changing API and other
complexities there.

Also, we would not like to lock ourselves into an implementation in a
language with a client option that has limitations that we can bear
now but not necessarily in the future.

If anybody else has a better solution to this please let me know.

Thank you all.
Ertio Lew


On Tue, Jan 18, 2011 at 2:49 PM, Dave Gardner  wrote:
> I can't comment of phpcassa directly, but we use Cassandra plus PHP in
> production without any difficulties. We are happy with the
> performance.
>
> Most of the information we needed to get started we found here:
>
> https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP
>
> This includes details on how to compile the native PHP C Extension for
> Thrift. We use a bespoke client which wraps the Thrift interface.
>
> You may be better off with a higher level client, although when we were
> starting out there was less of a push away from Thrift directly. I
> found using Thrift useful as you gain an appreciation for what calls
> Cassandra actually supports. One potential advantage of using a higher
> level client is that it may protect you from the frequent Thrift
> interface changes which currently seem to accompany every major
> release.
>
> Dave
>
>
>
>
> On Tuesday, 18 January 2011, Tyler Hobbs  wrote:
>>
>> 1. )  Is it developed to the level needed to support all the
>> necessary features to take full advantage of Cassandra?
>>
>> Yes.  There aren't some of the niceties of pycassa yet, but you can do 
>> everything that Cassandra offers with it.
>>
>>
>> 2. )  Is it used in production by anyone ?
>>
>> Yes, I've talked to a few people at least who are using it in production.  
>> It tends to play a limited role instead of a central one, though.
>>
>>
>> 3. )  What are its limitations?
>>
>> Being written in PHP.  Seriously.  The lack of universal 64bit integer 
>> support can be problematic if you don't have a fully 64bit system.  PHP is 
>> fairly slow.  PHP makes a few other things less easy to do.  If you're doing 
>> some pretty lightweight interaction with Cassandra through PHP, these might 
>> not be a problem for you.
>>
>> - Tyler
>>
>>
>
> --
> *Dave Gardner*
> Technical Architect
>


Re: Question re: the use of multiple ColumnFamilies

2011-01-18 Thread Andy Burgess
Sorry for the delayed reply, but thanks very much - this pointed me at 
the exact problem. I found that the queue size here was equal to the 
number of configured DataFileDirectories, so a good test was to lie to 
Cassandra and claim that there were more DataFileDirectories than I 
needed. Interestingly, it still only ever wrote to the first configured 
DataFileDirectory, but it certainly eliminated the problem, which I 
think means that for my use case at least, it will be good enough to 
patch Cassandra to introduce more control of the queue size.


On 08/01/11 18:20, Peter Schuller wrote:

[multiple active cf:s, often triggering flush at the same time]


Can anyone confirm whether or not this behaviour is expected, and
suggest anything that I could do about it? This is on 0.6.6, by the way.
Patched with time-to-live code, if that makes a difference.

I looked at the code (trunk though, not 0.6.6) and was a bit
surprised. There seems to be a single shared (static) executor for the
sorting and writing stages of memtable flushing (so far so good). But
what I didn't expect was that they seem to have a work queue of a size
equal to the concurrency.

In the case of the writer, the concurrency is the
memtable_flush_writers option (not available in 0.6.6). For the
sorter, it is the number of CPU cores on the system. This makes sense
for the concurrency aspect.

If my understanding is correct and I am not missing something else,
this means that for multiple column families you do indeed need to
expect to have this problem. The more column families the greater the
probability.

What I expected to find was to see that each cf would be guaranteed to
have at least one memtable in queue before writes would block for that
cf.

Assuming the same holds true in your case on 0.6.6 (it looks to be so
on the 0.6 branch by quick examination), I would have to assume that
either one of the following is true:

(1) You have more cf:s actively written to than the number of CPU
cores on your machine so that you're waiting on flushSorter.
   or
(2) Your write speed is overall higher than what can be sustained by
an sstable writer.

If you are willing to patch Cassandra and do the appropriate testing,
and are fine with the implications on heap size, you should be able to
work around this by adjusting the size of the work queues for the
flushSorter and flushWriter in ColumnFamilyStore.java.

Note that I did not test this, so proceed with caution if you do.

It will definitely mean that you will eat more heap space if you
submit writes to the cluster faster than they are processed. So in
particular if you're relying on backpressure mechanisms to avoid
causing problems when you do non-rate-limited writes to the cluster,
results are probably negative.

I'll file a bug about this to (1) elicit feedback if I'm wrong, and
(2) to fix it.



--
Andy Burgess
Principal Development Engineer
Application Delivery
WorldPay Ltd.
270-289 Science Park, Milton Road
Cambridge, CB4 0WE, United Kingdom (Depot Code: 024)
Office: +44 (0)1223 706 779| Mobile: +44 (0)7909 534 940
andy.burg...@worldpay.com


Re: Question re: the use of multiple ColumnFamilies

2011-01-18 Thread Peter Schuller
> Sorry for the delayed reply, but thanks very much - this pointed me at the
> exact problem. I found that the queue size here was equal to the number of
> configured DataFileDirectories, so a good test was to lie to Cassandra and
> claim that there were more DataFileDirectories than I needed. Interestingly,
> it still only ever wrote to the first configured DataFileDirectory, but it
> certainly eliminated the problem, which I think means that for my use case
> at least, it will be good enough to patch Cassandra to introduce more
> control of the queue size.

Based on your use case as you originally stated it (some cf:s that got
written at a slow pace and just happened to flush at the same time),
that should be enough.

(If you have some CF:s being written to faster than they are flushed,
there would still be potential for one CF to hog the flush writers
unfairly.)

-- 
/ Peter Schuller
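For reference, widening the flush queue by listing extra data directories (the workaround described above) is just a cassandra.yaml change. The paths below are examples only; as observed in the thread, only the first directory may actually receive data:

```yaml
# cassandra.yaml -- each listed directory adds a slot to the flush queue,
# even if only the first one ever receives SSTables (as observed above)
data_file_directories:
    - /var/lib/cassandra/data
    - /var/lib/cassandra/data2
    - /var/lib/cassandra/data3
```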


Re: cassandra-cli: where a and b (works) vs. where b and a (doesn't)

2011-01-18 Thread Timo Nentwig

On Jan 18, 2011, at 12:05, Timo Nentwig wrote:

> 
> On Jan 18, 2011, at 12:02, Aaron Morton wrote:
> 
>> Does wrapping foo in single quotes help?
> 
> No.
> 
>> Also, does this help 
>> http://www.datastax.com/blog/whats-new-cassandra-07-secondary-indexes
> 
> Actually this doesn't even compile because addGtExpression expects a String 
> type (?!).

This works as expected: 

 .addInsertion(now, cf, createColumn("rc", "a", SS, 
StringSerializer.get())).execute();

while this doesn't:

 .addInsertion(now, cf, createColumn("rc", 97, SS, 
IntegerSerializer.get())).execute();

The only difference is that the IntegerSerializer pads the byte array with 
zeros. Shouldn't matter (?). But it does.
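The byte-level mismatch can be seen with plain JDK code (no Hector involved). This is only an illustration of why byte-wise comparison of a string encoding and a fixed-width integer encoding of the "same" value fails; it is not Hector's exact serializer output:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        // "a" as ASCII is the single byte 97 ...
        byte[] asString = "a".getBytes(StandardCharsets.US_ASCII);
        // ... while a fixed-width int encoding of 97 carries zero padding,
        // so the two encodings are not byte-identical even though both
        // "mean" 97 to a human.
        byte[] asFixedWidthInt = ByteBuffer.allocate(4).putInt(97).array();

        System.out.println(Arrays.toString(asString));
        System.out.println(Arrays.toString(asFixedWidthInt));
        System.out.println(Arrays.equals(asString, asFixedWidthInt));
    }
}
```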

I dumped both versions to JSON and reimported them[*]. Same behavior. Then I 
manually removed the trailing six zeros from the IntegerSerializer version and 
retried. Same behavior.

[*] BTW when reimporting the JSON data the secondary indices are not being 
recreated. I had to remove the system keyspace and reimport the schema in order 
to trigger that...

> StringSerializer ss = StringSerializer.get();
> IndexedSlicesQuery<String, String, String> indexedSlicesQuery = 
> HFactory.createIndexedSlicesQuery(keyspace, ss, ss, ss);
> indexedSlicesQuery.setColumnNames("full_name", "birth_date", "state");
> indexedSlicesQuery.addGtExpression("birth_date", 1970L);
> indexedSlicesQuery.addEqualsExpression("state", "UT");
> indexedSlicesQuery.setColumnFamily("users");
> indexedSlicesQuery.setStartKey("");
> QueryResult<OrderedRows<String, String, String>> result = 
> indexedSlicesQuery.execute();
> 
>> Aaron
>> 
>> On 18/01/2011, at 11:54 PM, Timo Nentwig  wrote:
>> 
>>> I put a secondary index on rc (IntegerType) and user_agent (AsciiType).
>>> 
>>> Don't understand this bevahiour at all, can somebody explain?
>>> 
>>> [default@tracking] get crawler where user_agent=foo and rc=200;
>>> 
>>> 0 Row Returned.
>>> [default@tracking] get crawler where rc=200 and user_agent=foo;   
>>> ---
>>> RowKey: -??>2
>>> => (column=rc, value=200, timestamp=1295347760933000)
>>> => (column=url, value=http://www/0, timestamp=1295347760933000)
>>> => (column=user_agent, value=foo, timestamp=1295347760915000)
>>> 
>>> 1 Row Returned.
>>> [default@tracking] get crawler where rc>199 and user_agent=foo;
>>> 
>>> 0 Row Returned.
>>> [default@tracking] get crawler where user_agent=foo; 
>>> ---
>>> RowKey: -??>7
>>> => (column=rc, value=207, timestamp=1295347760935000)
>>> => (column=url, value=http://www/8, timestamp=1295347760933000)
>>> => (column=user_agent, value=foo, timestamp=1295347760917000)
>>> ---
>>> RowKey: -??>8
>>> => (column=rc, value=209, timestamp=1295347760935000)
>>> => (column=url, value=http://www/9, timestamp=1295347760933000)
>>> => (column=user_agent, value=foo, timestamp=1295347760916000)
>>> ---
>>> RowKey: -??>5
>>> => (column=rc, value=201, timestamp=1295347760937000)
>>> => (column=url, value=http://www/2, timestamp=1295347760933000)
>>> => (column=user_agent, value=foo, timestamp=1295347760916000)
>>> ---
>>> RowKey: -??>6
>>> => (column=rc, value=205, timestamp=1295347760935000)
>>> => (column=url, value=http://www/5, timestamp=1295347760933000)
>>> => (column=user_agent, value=foo, timestamp=1295347760917000)
>>> ---
>>> RowKey: -??>2
>>> => (column=rc, value=200, timestamp=1295347760933000)
>>> => (column=url, value=http://www/0, timestamp=1295347760933000)
>>> => (column=user_agent, value=foo, timestamp=1295347760915000)
>>> 
>>> 5 Rows Returned.
>>> 
> 



Re: Multi-tenancy, and authentication and authorization

2011-01-18 Thread indika kumara
Hi Aaron,

I read some articles about Cassandra, and now understand a little bit
about the trade-offs.

I feel the goal should be to optimize memory as well as performance. I have
to consider the number of column families, the columns per family, the
number of rows, the memtable’s threshold, and so on. I also have to consider
how to maximize resource sharing among tenants. However, I feel that a
keyspace should be able to be configured based on the tenant’s class (e.g.
replication factor). From what I have read, I feel that the issue is not
the number of keyspaces, but the number of CFs, the number of rows in a CF,
the number of columns, the size of the data in a column, and so on. Am I
correct? I appreciate your opinion.

What would be the suitable approach? A keyspace per tenant (there would be a
limit on the tenants per Cassandra cluster) or a keyspace for all tenants?

I still would love to expose Cassandra ‘as-is’ to a tenant virtually, yet
with acceptable memory consumption and performance.

Thanks,

Indika


Re: cassandra-cli: where a and b (works) vs. where b and a (doesn't)

2011-01-18 Thread Nate McCall
When doing mixed types on slicing operations, you should use
ByteArraySerializer and handle the conversions by hand.

We have an issue open for making this more graceful.
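A minimal sketch of "handling the conversion by hand": encode the int to bytes yourself and use the same bytes for both the insert and the index expression, so a bytes-based serializer passes identical values through on both paths. The helper below is hypothetical, plain JDK:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class Main {
    // Encode ints one way, everywhere: the stored column value and the
    // index-clause value then compare byte-for-byte on the server.
    static byte[] encodeInt(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    public static void main(String[] args) {
        byte[] stored = encodeInt(200);   // what you would write to the column
        byte[] queried = encodeInt(200);  // what you would pass to the index clause
        System.out.println(Arrays.equals(stored, queried));
        // The encoding also round-trips, so reads can decode it back.
        System.out.println(ByteBuffer.wrap(stored).getInt());
    }
}
```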

On Tue, Jan 18, 2011 at 10:07 AM, Timo Nentwig  wrote:
>
> On Jan 18, 2011, at 12:05, Timo Nentwig wrote:
>
>>
>> On Jan 18, 2011, at 12:02, Aaron Morton wrote:
>>
>>> Does wrapping foo in single quotes help?
>>
>> No.
>>
>>> Also, does this help 
>>> http://www.datastax.com/blog/whats-new-cassandra-07-secondary-indexes
>>
>> Actually this doesn't even compile because addGtExpression expects a String 
>> type (?!).
>
> This works as expected:
>
>  .addInsertion(now, cf, createColumn("rc", "a", SS, 
> StringSerializer.get())).execute();
>
> while this doesn't:
>
>  .addInsertion(now, cf, createColumn("rc", 97, SS, 
> IntegerSerializer.get())).execute();
>
> The only difference is that the IntegerSerializer pads the byte array with 
> zeros. Shouldn't matter (?). But it does.
>
> I dumped both versions to JSON and reimported them[*]. Same behavior. Then I 
> manually removed the trailing six zeros from the IntegerSerializer version 
> and retried. Same behavior.
>
> [*] BTW when reimporting the JSON data the secondary indices are not being 
> recreated. I had to remove the system keyspace and reimport the schema in 
> order to trigger that...
>
>> StringSerializer ss = StringSerializer.get();
>> IndexedSlicesQuery<String, String, String> indexedSlicesQuery = 
>> HFactory.createIndexedSlicesQuery(keyspace, ss, ss, ss);
>> indexedSlicesQuery.setColumnNames("full_name", "birth_date", "state");
>> indexedSlicesQuery.addGtExpression("birth_date", 1970L);
>> indexedSlicesQuery.addEqualsExpression("state", "UT");
>> indexedSlicesQuery.setColumnFamily("users");
>> indexedSlicesQuery.setStartKey("");
>> QueryResult<OrderedRows<String, String, String>> result = 
>> indexedSlicesQuery.execute();
>>
>>> Aaron
>>>
>>> On 18/01/2011, at 11:54 PM, Timo Nentwig  wrote:
>>>
 I put a secondary index on rc (IntegerType) and user_agent (AsciiType).

 Don't understand this bevahiour at all, can somebody explain?

 [default@tracking] get crawler where user_agent=foo and rc=200;

 0 Row Returned.
 [default@tracking] get crawler where rc=200 and user_agent=foo;
 ---
 RowKey: -??>2
 => (column=rc, value=200, timestamp=1295347760933000)
 => (column=url, value=http://www/0, timestamp=1295347760933000)
 => (column=user_agent, value=foo, timestamp=1295347760915000)

 1 Row Returned.
 [default@tracking] get crawler where rc>199 and user_agent=foo;

 0 Row Returned.
 [default@tracking] get crawler where user_agent=foo;
 ---
 RowKey: -??>7
 => (column=rc, value=207, timestamp=1295347760935000)
 => (column=url, value=http://www/8, timestamp=1295347760933000)
 => (column=user_agent, value=foo, timestamp=1295347760917000)
 ---
 RowKey: -??>8
 => (column=rc, value=209, timestamp=1295347760935000)
 => (column=url, value=http://www/9, timestamp=1295347760933000)
 => (column=user_agent, value=foo, timestamp=1295347760916000)
 ---
 RowKey: -??>5
 => (column=rc, value=201, timestamp=1295347760937000)
 => (column=url, value=http://www/2, timestamp=1295347760933000)
 => (column=user_agent, value=foo, timestamp=1295347760916000)
 ---
 RowKey: -??>6
 => (column=rc, value=205, timestamp=1295347760935000)
 => (column=url, value=http://www/5, timestamp=1295347760933000)
 => (column=user_agent, value=foo, timestamp=1295347760917000)
 ---
 RowKey: -??>2
 => (column=rc, value=200, timestamp=1295347760933000)
 => (column=url, value=http://www/0, timestamp=1295347760933000)
 => (column=user_agent, value=foo, timestamp=1295347760915000)

 5 Rows Returned.

>>
>
>


RE: please help with multiget

2011-01-18 Thread Shu Zhang
Well, maybe making a batch-get is not any more efficient on the server side, but 
without it you can get bottlenecked on client-server connections and client 
resources. If the number of requests you want to batch is on the order of the 
connections in your pool, then yes, making gets in parallel is as good or maybe 
better. But what if you want to batch thousands of requests?

The server side I can scale out; I would want to get my requests there without 
needing to wait for connections on my client to free up.

I just don't really understand the reasoning for designing multiget_slice the 
way it is. I still think if you're going to have a batch-get request 
(multiget_slice), you should be able to add to the batch a reasonable number of 
ANY corresponding non-batch get requests. And you can't do that... Plus, it's 
not symmetrical to batch_mutate. Is there a good reason for that?

From: Brandon Williams [dri...@gmail.com]
Sent: Monday, January 17, 2011 5:09 PM
To: user@cassandra.apache.org
Cc: hector-us...@googlegroups.com
Subject: Re: please help with multiget

On Mon, Jan 17, 2011 at 6:53 PM, Shu Zhang <szh...@mediosystems.com> wrote:
Here's the method declaration for quick reference:
map<string, list<ColumnOrSuperColumn>> multiget_slice(string keyspace, 
list<string> keys, ColumnParent column_parent, SlicePredicate predicate, 
ConsistencyLevel consistency_level)

It looks like you must have the same SlicePredicate for every key in your batch 
retrieval, so what are you suppose to do when you need to retrieve different 
columns for different keys?

Issue multiple gets in parallel yourself.  Keep in mind that multiget is not an 
optimization, in fact, it can work against you when one key exceeds the rpc 
timeout, because you get nothing back.

-Brandon
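A rough shape for "issue multiple gets in parallel yourself": run each single-key get on a thread pool and collect the futures. fetchRow here is a stand-in for whatever single-key read your client exposes, not a real Hector or Thrift call:

```java
import java.util.*;
import java.util.concurrent.*;

public class Main {
    // Hypothetical stand-in for a single-key get against Cassandra;
    // a real client call would go here instead.
    static String fetchRow(String key) {
        return "row:" + key;
    }

    public static void main(String[] args) throws Exception {
        List<String> keys = Arrays.asList("a", "b", "c");
        ExecutorService pool = Executors.newFixedThreadPool(3);

        // Submit one get per key; each can use its own slice predicate.
        List<Future<String>> futures = new ArrayList<>();
        for (String k : keys) {
            futures.add(pool.submit(() -> fetchRow(k)));
        }
        // Collect results in submission order; a slow key only delays
        // its own future, not the whole batch.
        for (Future<String> f : futures) {
            System.out.println(f.get());
        }
        pool.shutdown();
    }
}
```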


Re: Multi-tenancy, and authentication and authorization

2011-01-18 Thread Ed Anuff
Hi Indika, I've done a lot of work using the keyspace per tenant model, and
I'm seeing big problems with the memory consumption, even though it's
certainly the most clean way to implement it.  Luckily, before I used the
keyspace per tenant approach, I'd implemented my system using a single
keyspace approach and can still revert back to that.  The rest of the stuff
for multi-tenancy on the wiki is largely irrelevant, but the keyspace issue
is a big concern at the moment.

Ed

On Tue, Jan 18, 2011 at 9:40 AM, indika kumara wrote:

> Hi Aaron,
>
> I read some articles about the Cassandra, and now understand a little bit
> about trade-offs.
>
> I feel the goal should be to optimize memory as well as performance. I have
> to consider the number of column families, the columns per a family, the
> number of rows, the memtable’s threshold, and so on. I also have to consider
> how to maximize resource sharing among tenants. However, I feel that a
> keyspace should be able to be configured based on the tenant’s class (e.g
> replication factor). As per some resources, I feel that the issue is not
> in the number of keyspaces, but with the number of CF, the number of the
> rows in a CF, the numbers of columns, the size of the data in a column, and
> so on. Am I correct? I appreciate your opinion.
>
> What would be the suitable approach? A keyspace per tenant (there would be
> a limit on the tenants per a Cassandra cluster) or a keyspace for all
> tenant.
>
> I still would love to expose the Cassandra ‘as-is’ to a tenant virtually
> yet with acceptable memory consumption and performance.
>
> Thanks,
>
> Indika
>
>


changing the replication level on the fly

2011-01-18 Thread Jeremy Stribling

Hi,

I've noticed in the new Cassandra 0.7.0 release that if I have a 
keyspace with a replication level of 2, but only one Cassandra node, I 
cannot insert anything into the system.  Likely this was a bug in the 
old release I was using (0.6.8 -- is there a JIRA describing this 
problem?).  However, this is a problem for our application, as we don't 
want to have to predefine the number of nodes, but rather start with one 
node, and add nodes as needed.


Ideally, we could start our system with one node, and be able to insert 
data just on that one node.  Then, when a second node is added, we can 
start using that node to store replicas for the keyspace.  I know that 
0.7.0 has a new operation for updating keyspace properties like 
replication level, but in the documentation there is some mention about 
having to run manual repair operations after using it.  My question is: 
what happens if we do not run these repair operations?


Here's what I'd like to do:
1) Start with a single node with autobootstrap=false and replication 
level=1.
2) Later, start a second node with autobootstrap=true and join it to the 
first.
3) The application detects that there are now two nodes, and issues the 
command to pump up the replication level to 2.
4) If it ever drops back down to one node, it will turn the replication 
level down again.


If we do not do a repair, will all hell break loose, or will it just be 
the case that data inserted when there was only one node will continue 
to be unreplicated, but data inserted when there were two nodes will 
have two replicas?  Thanks,


Jeremy



Re: changing the replication level on the fly

2011-01-18 Thread Edward Capriolo
On Tue, Jan 18, 2011 at 2:14 PM, Jeremy Stribling  wrote:
> Hi,
>
> I've noticed in the new Cassandra 0.7.0 release that if I have a keyspace
> with a replication level of 2, but only one Cassandra node, I cannot insert
> anything into the system.  Likely this was a bug in the old release I was
> using (0.6.8 -- is there a JIRA describing this problem?).  However, this is
> a problem for our application, as we don't want to have to predefine the
> number of nodes, but rather start with one node, and add nodes as needed.
>
> Ideally, we could start our system with one node, and be able to insert data
> just on that one node.  Then, when a second node is added, we can start
> using that node to store replicas for the keyspace.  I know that 0.7.0 has a
> new operation for updating keyspace properties like replication level, but
> in the documentation there is some mention about having to run manual repair
> operations after using it.  My question is: what happens if we do not run
> these repair operations?
>
> Here's what I'd like to do:
> 1) Start with a single node with autobootstrap=false and replication
> level=1.
> 2) Later, start a second node with autobootstrap=true and join it to the
> first.
> 3) The application detects that there are now two nodes, and issues the
> command to pump up the replication level to 2.
> 4) If it ever drops back down to one node, it will turn the replication
> level down again.
>
> If we do not do a repair, will all hell break loose, or will it just be the
> case that data inserted when there was only one node will continue to be
> unreplicated, but data inserted when there were two nodes will have two
> replicas?  Thanks,
>
> Jeremy
>
>

If you up your replication Factor and do not repair this is what happens:

READ.QUORUM -> This is safe. Over time all entries that are read will
be fixed through read repair. Reads will return correct data.
BUT data never read will never be copied to the new node.
READ.ONE -> 50% of your reads will return correct data. 50% of your
reads will return NO data the first time (depending on which server your
read hits). Then they will be read repaired. The second read will return
the correct data.

You can extrapolate the complications caused by this if you add 10
or 15 nodes over time. You are never really sure if the data from the
first node got replicated to the second, or if the second got replicated
to the third. Brain hurting... CAP is complicated enough...
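For reference, the RF bump itself, followed by the repair the thread discusses, would look roughly like this with the 0.7-era cassandra-cli and nodetool (keyspace name is a placeholder; exact syntax may vary by minor version):

```
[default@unknown] update keyspace MyKeyspace with replication_factor = 2;

$ nodetool -h <node> repair MyKeyspace
```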


Re: changing the replication level on the fly

2011-01-18 Thread Jeremy Stribling



On 01/18/2011 11:36 AM, Edward Capriolo wrote:

On Tue, Jan 18, 2011 at 2:14 PM, Jeremy Stribling  wrote:
   

Hi,

I've noticed in the new Cassandra 0.7.0 release that if I have a keyspace
with a replication level of 2, but only one Cassandra node, I cannot insert
anything into the system.  Likely this was a bug in the old release I was
using (0.6.8 -- is there a JIRA describing this problem?).  However, this is
a problem for our application, as we don't want to have to predefine the
number of nodes, but rather start with one node, and add nodes as needed.

Ideally, we could start our system with one node, and be able to insert data
just on that one node.  Then, when a second node is added, we can start
using that node to store replicas for the keyspace.  I know that 0.7.0 has a
new operation for updating keyspace properties like replication level, but
in the documentation there is some mention about having to run manual repair
operations after using it.  My question is: what happens if we do not run
these repair operations?

Here's what I'd like to do:
1) Start with a single node with autobootstrap=false and replication
level=1.
2) Later, start a second node with autobootstrap=true and join it to the
first.
3) The application detects that there are now two nodes, and issues the
command to pump up the replication level to 2.
4) If it ever drops back down to one node, it will turn the replication
level down again.

If we do not do a repair, will all hell break loose, or will it just be the
case that data inserted when there was only one node will continue to be
unreplicated, but data inserted when there were two nodes will have two
replicas?  Thanks,

Jeremy


 

If you up your replication Factor and do not repair this is what happens:

READ.QUORUM ->  This is safe. Over time all entries that are read will
be fixed through read repair. Reads will return correct data.
BUT data never read will never be copied to the new node.
READ.ONE ->  50% of your reads will return correct data. 50% of your
Reads will return NO data the first time (based on the server your
read hits). Then they will be read repaired. Second read will return
the correct data.

You can extrapolate the complications caused by this if you add 10
or 15 nodes over time. You are never really sure if the data from the
first node got replicated to the second, or if the second got replicated
to the third. Brain hurting... CAP is complicated enough...
   


Thanks.  Are you referring only to data that was written at replication 
factor 1, or any data?


Re: Multi-tenancy, and authentication and authorization

2011-01-18 Thread Jeremy Hanna
Feel free to use that wiki page or another wiki page to collaborate on more 
pressing multi tenant issues.  The wiki is editable by all.  The MultiTenant 
page was meant as a launching point for tracking progress on things we could 
think of wrt MT.

Obviously the memtable problem is the largest concern at this point.  If you 
have any ideas wrt that and want to collaborate on how to address that, perhaps 
even in a way that would get accepted in core cassandra, feel free to propose 
solutions in a jira ticket or on the list.

A caveat to getting things into core cassandra - make sure anything you do is 
considerate of single-tenant cassandra.  If possible, make things pluggable and 
optional.  The round robin request scheduler is an example.  The functionality 
is there but you have to enable it.  If it can't be made pluggable/optional, 
you can get good feedback from the community about proposed solutions in core 
Cassandra (like for the memtable issue in particular).

Anyway, just wanted to chime in with 2 cents about that page (since I created 
it and was helping maintain it before getting pulled off onto other projects).
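The round robin request scheduler mentioned above is an example of that opt-in approach; enabling it is a cassandra.yaml change, roughly (0.7-era settings, values illustrative):

```yaml
# Schedule incoming client requests fairly across keyspaces
# (off by default; single-tenant clusters are unaffected)
request_scheduler: org.apache.cassandra.scheduler.RoundRobinScheduler
request_scheduler_id: keyspace
request_scheduler_options:
    throttle_limit: 80
```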

On Jan 18, 2011, at 1:12 PM, Ed Anuff wrote:

> Hi Indika, I've done a lot of work using the keyspace per tenant model, and 
> I'm seeing big problems with the memory consumption, even though it's 
> certainly the most clean way to implement it.  Luckily, before I used the 
> keyspace per tenant approach, I'd implemented my system using a single 
> keyspace approach and can still revert back to that.  The rest of the stuff 
> for multi-tenancy on the wiki is largely irrelevant, but the keyspace issue 
> is a big concern at the moment.
> 
> Ed
> 
> On Tue, Jan 18, 2011 at 9:40 AM, indika kumara  wrote:
> Hi Aaron,
> 
> I read some articles about the Cassandra, and now understand a little bit 
> about trade-offs.
> 
> I feel the goal should be to optimize memory as well as performance. I have 
> to consider the number of column families, the columns per a family, the 
> number of rows, the memtable’s threshold, and so on. I also have to consider 
> how to maximize resource sharing among tenants. However, I feel that a 
> keyspace should be able to be configured based on the tenant’s class (e.g 
> replication factor). As per some resources, I feel that the issue is not in 
> the number of keyspaces, but with the number of CF, the number of the rows in 
> a CF, the numbers of columns, the size of the data in a column, and so on. Am 
> I correct? I appreciate your opinion. 
> 
> What would be the suitable approach? A keyspace per tenant (there would be a 
> limit on the tenants per a Cassandra cluster) or a keyspace for all tenant.
> 
> I still would love to expose the Cassandra ‘as-is’ to a tenant virtually yet 
> with acceptable memory consumption and performance.
> 
> Thanks,
> 
> Indika
> 
> 



Re: Multi-tenancy, and authentication and authorization

2011-01-18 Thread Ed Anuff
Hi Jeremy, thanks, I was really coming at it from the question of whether
keyspaces were a functional basis for multitenancy in Cassandra.  I think
the MT issues discussed on the wiki page are the , but I'd like to get a
better understanding of the core issue of keyspaces and then try to get that
onto the page as maybe the first section.

Ed

On Tue, Jan 18, 2011 at 11:42 AM, Jeremy Hanna
wrote:

> Feel free to use that wiki page or another wiki page to collaborate on more
> pressing multi tenant issues.  The wiki is editable by all.  The MultiTenant
> page was meant as a launching point for tracking progress on things we could
> think of wrt MT.
>
> Obviously the memtable problem is the largest concern at this point.  If
> you have any ideas wrt that and want to collaborate on how to address that,
> perhaps even in a way that would get accepted in core cassandra, feel free
> to propose solutions in a jira ticket or on the list.
>
> A caveat to getting things into core cassandra - make sure anything you do
> is considerate of single-tenant cassandra.  If possible, make things
> pluggable and optional.  The round robin request scheduler is an example.
>  The functionality is there but you have to enable it.  If it can't be made
> pluggable/optional, you can get good feedback from the community about
> proposed solutions in core Cassandra (like for the memtable issue in
> particular).
>
> Anyway, just wanted to chime in with 2 cents about that page (since I
> created it and was helping maintain it before getting pulled off onto other
> projects).
>
> On Jan 18, 2011, at 1:12 PM, Ed Anuff wrote:
>
> > Hi Indika, I've done a lot of work using the keyspace per tenant model,
> and I'm seeing big problems with the memory consumption, even though it's
> certainly the most clean way to implement it.  Luckily, before I used the
> keyspace per tenant approach, I'd implemented my system using a single
> keyspace approach and can still revert back to that.  The rest of the stuff
> for multi-tenancy on the wiki is largely irrelevant, but the keyspace issue
> is a big concern at the moment.
> >
> > Ed
> >
> > On Tue, Jan 18, 2011 at 9:40 AM, indika kumara 
> wrote:
> > Hi Aaron,
> >
> > I read some articles about the Cassandra, and now understand a little bit
> about trade-offs.
> >
> > I feel the goal should be to optimize memory as well as performance. I
> have to consider the number of column families, the columns per a family,
> the number of rows, the memtable’s threshold, and so on. I also have to
> consider how to maximize resource sharing among tenants. However, I feel
> that a keyspace should be able to be configured based on the tenant’s class
> (e.g replication factor). As per some resources, I feel that the issue is
> not in the number of keyspaces, but with the number of CF, the number of the
> rows in a CF, the numbers of columns, the size of the data in a column, and
> so on. Am I correct? I appreciate your opinion.
> >
> > What would be the suitable approach? A keyspace per tenant (there would
> be a limit on the tenants per a Cassandra cluster) or a keyspace for all
> tenant.
> >
> > I still would love to expose the Cassandra ‘as-is’ to a tenant virtually
> yet with acceptable memory consumption and performance.
> >
> > Thanks,
> >
> > Indika
> >
> >
>
>


Re: Tombstone lifespan after multiple deletions

2011-01-18 Thread Aaron Morton
Sylvain,

Just to check my knowledge. Is this only the case if the delete is sent without 
a super column or predicate? What about a delete for a specific column that did 
not exist?

Thanks
Aaron 
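For reference, the pattern David converges on in the thread quoted below (read first; only delete rows that actually exist, so no-op deletes don't plant fresh tombstones) reduces to this toy form, with a Map standing in for the Cassandra client:

```java
import java.util.*;

public class Main {
    public static void main(String[] args) {
        // Toy stand-in for the store: only "row1" exists.
        Map<String, String> store = new HashMap<>();
        store.put("row1", "data");

        int tombstones = 0;
        for (String key : Arrays.asList("row1", "row2")) {
            if (store.containsKey(key)) {  // read before delete
                store.remove(key);         // real delete -> one tombstone
                tombstones++;
            }
            // nonexistent row: skip the delete, so no tombstone is created
            // and no existing tombstone has its life extended
        }
        System.out.println(tombstones);
    }
}
```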

On 19/01/2011, at 2:58 AM, David Boxenhorn  wrote:

> Thanks. 
> 
> On Tue, Jan 18, 2011 at 3:55 PM, Sylvain Lebresne  wrote:
> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn  wrote:
> > Thanks, Aaron, but I'm not 100% clear.
> >
> > My situation is this: My use case spins off rows (not columns) that I no
> > longer need and want to delete. It is possible that these rows were never
> > created in the first place, or were already deleted. This is a very large
> > cleanup task that normally deletes a lot of rows, and the last thing that I
> > want to do is create tombstones for rows that didn't exist in the first
> > place, or lengthen the life on disk of tombstones of rows that are already
> > deleted.
> >
> > So the question is: before I delete, do I have to retrieve the row to see if
> > it exists in the first place?
> 
> Yes, in your situation you do.
> 
> >
> >
> >
> > On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton 
> > wrote:
> >>
> >> AFAIK that's not necessary, there is no need to worry about previous
> >> deletes. You can delete stuff that does not even exist, neither 
> >> batch_mutate
> >> or remove are going to throw an error.
> >> All the columns that were (roughly speaking) present at your first
> >> deletion will be available for GC at the end of the first tombstones life.
> >> Same for the second.
> >> Say you were to write a col between the two deletes with the same name as
> >> one present at the start. The first version of the col is avail for GC 
> >> after
> >> tombstone 1, and the second after tombstone 2.
> >> Hope that helps
> >> Aaron
> >> On 18/01/2011, at 9:37 PM, David Boxenhorn  wrote:
> >>
> >> Thanks. In other words, before I delete something, I should check to see
> >> whether it exists as a live row in the first place.
> >>
> >> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King  wrote:
> >>>
> >>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn 
> >>> wrote:
> >>> > If I delete a row, and later on delete it again, before GCGraceSeconds
> >>> > has
> >>> > elapsed, does the tombstone live longer?
> >>>
> >>> Each delete is a new tombstone, which should answer your question.
> >>>
> >>> -ryan
> >>>
> >>> > In other words, if I have the following scenario:
> >>> >
> >>> > GCGraceSeconds = 10 days
> >>> > On day 1 I delete a row
> >>> > On day 5 I delete the row again
> >>> >
> >>> > Will the tombstone be removed on day 10 or day 15?
> >>> >
> >>
> >
> >
> 


Re: please help with multiget

2011-01-18 Thread Aaron Morton
I think the general approach is to denormalise data to remove the need for 
complicated semantics when reading. 

Aaron

On 19/01/2011, at 7:57 AM, Shu Zhang  wrote:

> Well, maybe making a batch-get is not  anymore efficient on the server side 
> but without it, you can get bottlenecked on client-server connections and 
> client resources. If the number of requests you want to batch is on the order 
> of connections in your pool, then yes, making gets in parallel is as good or 
> maybe better. But what if you want to batch thousands of requests?
> 
> The server I can scale out, I would want to get my requests there without 
> needing to wait for connections on my client to free up.
> 
> I just don't really understand the reasoning for designing muliget_slice the 
> way it is. I still think if you're gonna have a batch-get request 
> (multiget_slice), you should be able to add to the batch a reasonable number 
> of ANY corresponding non-batch get requests. And you can't do that... Plus, 
> it's not symmetrical to the batch-mutate. Is there a good reason for that?
> 
> From: Brandon Williams [dri...@gmail.com]
> Sent: Monday, January 17, 2011 5:09 PM
> To: user@cassandra.apache.org
> Cc: hector-us...@googlegroups.com
> Subject: Re: please help with multiget
> 
> On Mon, Jan 17, 2011 at 6:53 PM, Shu Zhang <szh...@mediosystems.com> wrote:
> Here's the method declaration for quick reference:
> map<string, list<ColumnOrSuperColumn>> multiget_slice(string keyspace, 
> list<string> keys, ColumnParent column_parent, SlicePredicate predicate, 
> ConsistencyLevel consistency_level)
> 
> It looks like you must have the same SlicePredicate for every key in your 
> batch retrieval, so what are you suppose to do when you need to retrieve 
> different columns for different keys?
> 
> Issue multiple gets in parallel yourself.  Keep in mind that multiget is not 
> an optimization, in fact, it can work against you when one key exceeds the 
> rpc timeout, because you get nothing back.
> 
> -Brandon


Re: Multi-tenancy, and authentication and authorization

2011-01-18 Thread Aaron Morton
As everyone says, the issue is not with keyspaces directly, as they are just
containers. It's the CFs in the keyspace, but let's just say keyspace because
it's easier.

As things stand, if you allow point and click creation for keyspaces you will 
hand over control of the memory requirements to the users. This will be a bad 
thing. E.g. Lots of cf's will get created and you will run out of memory, or 
cf's will get created with huge Memtable settings and you will run out of 
memory, or caches will get set huge and you get the picture. One badly behaving 
keyspace or column family can take down a node / cluster.

IMHO currently the best way to share a Cassandra cluster is through some sort 
of application layer that uses a static keyspace. Others have a better 
understanding of the internals and may have ideas about how this could change 
in the future.

Aaron
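A back-of-envelope view of why per-tenant keyspaces hand memory control to the users: worst-case memtable heap grows roughly with tenants × CFs per tenant × per-CF memtable threshold. The numbers below are purely illustrative, not measurements:

```java
public class Main {
    public static void main(String[] args) {
        // Illustrative inputs only: heap that could be pinned by memtables
        // if every tenant gets its own keyspace with dedicated CFs.
        int tenants = 100;
        int cfsPerTenant = 5;
        int memtableThroughputMb = 64; // assumed per-CF memtable threshold

        long worstCaseMb = (long) tenants * cfsPerTenant * memtableThroughputMb;
        System.out.println(worstCaseMb + " MB");
    }
}
```

One badly sized keyspace in that product is enough to push a node past its heap, which is the point made above.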

On 19/01/2011, at 9:07 AM, Ed Anuff  wrote:

> Hi Jeremy, thanks, I was really coming at it from the question of whether 
> keyspaces were a functional basis for multitenancy in Cassandra.  I think the 
> MT issues discussed on the wiki page are the , but I'd like to get a better 
> understanding of the core issue of keyspaces and then try to get that onto 
> the page as maybe the first section.
> 
> Ed
> 
> On Tue, Jan 18, 2011 at 11:42 AM, Jeremy Hanna  
> wrote:
> Feel free to use that wiki page or another wiki page to collaborate on more 
> pressing multi tenant issues.  The wiki is editable by all.  The MultiTenant 
> page was meant as a launching point for tracking progress on things we could 
> think of wrt MT.
> 
> Obviously the memtable problem is the largest concern at this point.  If you 
> have any ideas wrt that and want to collaborate on how to address that, 
> perhaps even in a way that would get accepted in core cassandra, feel free to 
> propose solutions in a jira ticket or on the list.
> 
> A caveat to getting things into core cassandra - make sure anything you do is 
> considerate of single-tenant cassandra.  If possible, make things pluggable 
> and optional.  The round robin request scheduler is an example.  The 
> functionality is there but you have to enable it.  If it can't be made 
> pluggable/optional, you can get good feedback from the community about 
> proposed solutions in core Cassandra (like for the memtable issue in 
> particular).
> 
> Anyway, just wanted to chime in with 2 cents about that page (since I created 
> it and was helping maintain it before getting pulled off onto other projects).
> 
> On Jan 18, 2011, at 1:12 PM, Ed Anuff wrote:
> 
> > Hi Indika, I've done a lot of work using the keyspace per tenant model, and 
> > I'm seeing big problems with the memory consumption, even though it's 
> > certainly the most clean way to implement it.  Luckily, before I used the 
> > keyspace per tenant approach, I'd implemented my system using a single 
> > keyspace approach and can still revert back to that.  The rest of the stuff 
> > for multi-tenancy on the wiki is largely irrelevant, but the keyspace issue 
> > is a big concern at the moment.
> >
> > Ed
> >
> > On Tue, Jan 18, 2011 at 9:40 AM, indika kumara  
> > wrote:
> > Hi Aaron,
> >
> > I read some articles about the Cassandra, and now understand a little bit 
> > about trade-offs.
> >
> > I feel the goal should be to optimize memory as well as performance. I have 
> > to consider the number of column families, the columns per a family, the 
> > number of rows, the memtable’s threshold, and so on. I also have to 
> > consider how to maximize resource sharing among tenants. However, I feel 
> > that a keyspace should be able to be configured based on the tenant’s class 
> > (e.g replication factor). As per some resources, I feel that the issue is 
> > not in the number of keyspaces, but with the number of CF, the number of 
> > the rows in a CF, the numbers of columns, the size of the data in a column, 
> > and so on. Am I correct? I appreciate your opinion.
> >
> > What would be the suitable approach? A keyspace per tenant (there would be 
> > a limit on the tenants per a Cassandra cluster) or a keyspace for all 
> > tenant.
> >
> > I still would love to expose the Cassandra ‘as-is’ to a tenant virtually 
> > yet with acceptable memory consumption and performance.
> >
> > Thanks,
> >
> > Indika
> >
> >
> 
> 


Re: Multi-tenancy, and authentication and authorization

2011-01-18 Thread Stephen Connolly
I would imagine it to be somewhat easy to implement this via a thrift
wrapper so that each tenant is connecting to the proxy thrift server that
masks the fact that there are multiple tenants... or is that how people are
thinking about this?

- Stephen

---
Sent from my Android phone, so random spelling mistakes, random nonsense
words and other nonsense are a direct result of using swype to type on the
screen
On 18 Jan 2011 21:20, "Aaron Morton"  wrote:
> As everyone says, it's not issues with the Keyspace directly as they are
just a container. It's the CF's in the keyspace, but let's just say keyspace
cause it's easier.
>
> As things stand, if you allow point and click creation for keyspaces you
will hand over control of the memory requirements to the users. This will be
a bad thing. E.g. Lots of cf's will get created and you will run out of
memory, or cf's will get created with huge Memtable settings and you will
run out of memory, or caches will get set huge and you get the picture. One
badly behaving keyspace or column family can take down a node / cluster.
>
> IMHO currently the best way to share a Cassandra cluster is through some
sort of application layer that uses as static keyspace. Others have a better
understanding of the internals and may have ideas about how this could
change in the future.
>
> Aaron
>
> On 19/01/2011, at 9:07 AM, Ed Anuff  wrote:
>
>> Hi Jeremy, thanks, I was really coming at it from the question of whether
keyspaces were a functional basis for multitenancy in Cassandra. I think the
MT issues discussed on the wiki page are the , but I'd like to get a better
understanding of the core issue of keyspaces and then try to get that onto
the page as maybe the first section.
>>
>> Ed
>>
>> On Tue, Jan 18, 2011 at 11:42 AM, Jeremy Hanna <
jeremy.hanna1...@gmail.com> wrote:
>> Feel free to use that wiki page or another wiki page to collaborate on
more pressing multi tenant issues. The wiki is editable by all. The
MultiTenant page was meant as a launching point for tracking progress on
things we could think of wrt MT.
>>
>> Obviously the memtable problem is the largest concern at this point. If
you have any ideas wrt that and want to collaborate on how to address that,
perhaps even in a way that would get accepted in core cassandra, feel free
to propose solutions in a jira ticket or on the list.
>>
>> A caveat to getting things into core cassandra - make sure anything you
do is considerate of single-tenant cassandra. If possible, make things
pluggable and optional. The round robin request scheduler is an example. The
functionality is there but you have to enable it. If it can't be made
pluggable/optional, you can get good feedback from the community about
proposed solutions in core Cassandra (like for the memtable issue in
particular).
>>
>> Anyway, just wanted to chime in with 2 cents about that page (since I
created it and was helping maintain it before getting pulled off onto other
projects).
>>
>> On Jan 18, 2011, at 1:12 PM, Ed Anuff wrote:
>>
>> > Hi Indika, I've done a lot of work using the keyspace per tenant model,
and I'm seeing big problems with the memory consumption, even though it's
certainly the most clean way to implement it. Luckily, before I used the
keyspace per tenant approach, I'd implemented my system using a single
keyspace approach and can still revert back to that. The rest of the stuff
for multi-tenancy on the wiki is largely irrelevant, but the keyspace issue
is a big concern at the moment.
>> >
>> > Ed
>> >
>> > On Tue, Jan 18, 2011 at 9:40 AM, indika kumara 
wrote:
>> > Hi Aaron,
>> >
>> > I read some articles about the Cassandra, and now understand a little
bit about trade-offs.
>> >
>> > I feel the goal should be to optimize memory as well as performance. I
have to consider the number of column families, the columns per a family,
the number of rows, the memtable’s threshold, and so on. I also have to
consider how to maximize resource sharing among tenants. However, I feel
that a keyspace should be able to be configured based on the tenant’s class
(e.g replication factor). As per some resources, I feel that the issue is
not in the number of keyspaces, but with the number of CF, the number of the
rows in a CF, the numbers of columns, the size of the data in a column, and
so on. Am I correct? I appreciate your opinion.
>> >
>> > What would be the suitable approach? A keyspace per tenant (there would
be a limit on the tenants per a Cassandra cluster) or a keyspace for all
tenant.
>> >
>> > I still would love to expose the Cassandra ‘as-is’ to a tenant
virtually yet with acceptable memory consumption and performance.
>> >
>> > Thanks,
>> >
>> > Indika
>> >
>> >
>>
>>


RE: please help with multiget

2011-01-18 Thread Shu Zhang
Well, I don't think what I'm describing is complicated semantics. I think I've 
described general batch operation design and something that is symmetrical to the 
batch_mutate method already on the Cassandra API. You are right, I can solve 
the problem with further denormalization, and the approach of making individual 
gets in parallel as described by Brandon will work too. I'll be doing one of 
these for now. But I think neither is as efficient, and I guess I'm still not 
sure why the multiget is designed the way it is.

The problem with denormalization is you gotta make multiple row writes in place 
of one, adding load to the server, adding required physical space and losing 
atomicity on write operations. I know writes are cheap in cassandra, and you 
can catch failed writes and retry so these problems are not major, but it still 
seems clear that having a batch-get that works appropriately is at least a 
little better... 

From: Aaron Morton [aa...@thelastpickle.com]
Sent: Tuesday, January 18, 2011 12:55 PM
To: user@cassandra.apache.org
Subject: Re: please help with multiget

I think the general approach is to denormalise data to remove the need for 
complicated semantics when reading.

Aaron

On 19/01/2011, at 7:57 AM, Shu Zhang  wrote:

> Well, maybe making a batch-get is not  anymore efficient on the server side 
> but without it, you can get bottlenecked on client-server connections and 
> client resources. If the number of requests you want to batch is on the order 
> of connections in your pool, then yes, making gets in parallel is as good or 
> maybe better. But what if you want to batch thousands of requests?
>
> The server I can scale out, I would want to get my requests there without 
> needing to wait for connections on my client to free up.
>
> I just don't really understand the reasoning for designing muliget_slice the 
> way it is. I still think if you're gonna have a batch-get request 
> (multiget_slice), you should be able to add to the batch a reasonable number 
> of ANY corresponding non-batch get requests. And you can't do that... Plus, 
> it's not symmetrical to the batch-mutate. Is there a good reason for that?
> 
> From: Brandon Williams [dri...@gmail.com]
> Sent: Monday, January 17, 2011 5:09 PM
> To: user@cassandra.apache.org
> Cc: hector-us...@googlegroups.com
> Subject: Re: please help with multiget
>
> On Mon, Jan 17, 2011 at 6:53 PM, Shu Zhang <szh...@mediosystems.com> wrote:
> Here's the method declaration for quick reference:
> map<string, list<ColumnOrSuperColumn>> multiget_slice(string keyspace, 
> list<string> keys, ColumnParent column_parent, SlicePredicate predicate, 
> ConsistencyLevel consistency_level)
>
> It looks like you must have the same SlicePredicate for every key in your 
> batch retrieval, so what are you suppose to do when you need to retrieve 
> different columns for different keys?
>
> Issue multiple gets in parallel yourself.  Keep in mind that multiget is not 
> an optimization, in fact, it can work against you when one key exceeds the 
> rpc timeout, because you get nothing back.
>
> -Brandon
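Brandon's suggestion (issuing the individual gets in parallel from the client) can be sketched as follows. This is an illustrative sketch only, not tied to any real Cassandra client library; the `get_slice` stub stands in for whatever single-key call your client exposes, and note that each key can carry its own column list, which is exactly what multiget_slice's single SlicePredicate disallows:

```python
from concurrent.futures import ThreadPoolExecutor

def get_slice(key, columns):
    # Stub for a real single-key client call; it just echoes the
    # columns that were requested so the shape of the result is clear.
    return {c: "value-of-" + c for c in columns}

def parallel_gets(requests, workers=8):
    # requests maps key -> list of column names. Unlike multiget_slice,
    # which forces one SlicePredicate across every key, each key here
    # can ask for a different set of columns.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {key: pool.submit(get_slice, key, cols)
                   for key, cols in requests.items()}
        return {key: f.result() for key, f in futures.items()}

result = parallel_gets({"row1": ["a", "b"], "row2": ["c"]})
```

A failed or slow key only costs you that one future, rather than timing out the whole batch.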


Re: Multi-tenancy, and authentication and authorization

2011-01-18 Thread Aaron Morton
I've used an S3 style data model with a REST interface (varnish > nginx > tornado > cassandra), users do not see anything remotely cassandra like.

Aaron

On 19 Jan, 2011, at 10:27 AM, Stephen Connolly  wrote:

I would imagine it to be somewhat easy to implement this via a thrift wrapper so that each tenant is connecting to the proxy thrift server that masks the fact that there are multiple tenants... or is that how people are thinking about this

- Stephen
---
Sent from my Android phone, so random spelling mistakes, random nonsense words and other nonsense are a direct result of using swype to type on the screen
On 18 Jan 2011 21:20, "Aaron Morton"  wrote:
> As everyone says, it's not issues with the Keyspace directly as they are just a container. It's the CF's in the keyspace, but let's just say keyspace cause it's easier.
>
> As things stand, if you allow point and click creation for keyspaces you will hand over control of the memory requirements to the users. This will be a bad thing. E.g. Lots of cf's will get created and you will run out of memory, or cf's will get created with huge Memtable settings and you will run out of memory, or caches will get set huge and you get the picture. One badly behaving keyspace or column family can take down a node / cluster.
>
> IMHO currently the best way to share a Cassandra cluster is through some sort of application layer that uses a static keyspace. Others have a better understanding of the internals and may have ideas about how this could change in the future.
>
> Aaron
>
> On 19/01/2011, at 9:07 AM, Ed Anuff  wrote:
>
>> Hi Jeremy, thanks, I was really coming at it from the question of whether keyspaces were a functional basis for multitenancy in Cassandra. I think the MT issues discussed on the wiki page are the , but I'd like to get a better understanding of the core issue of keyspaces and then try to get that onto the page as maybe the first section.
>>
>> Ed
>>
>> On Tue, Jan 18, 2011 at 11:42 AM, Jeremy Hanna  wrote:
>> Feel free to use that wiki page or another wiki page to collaborate on more pressing multi tenant issues. The wiki is editable by all. The MultiTenant page was meant as a launching point for tracking progress on things we could think of wrt MT.
>>
>> Obviously the memtable problem is the largest concern at this point. If you have any ideas wrt that and want to collaborate on how to address that, perhaps even in a way that would get accepted in core cassandra, feel free to propose solutions in a jira ticket or on the list.
>>
>> A caveat to getting things into core cassandra - make sure anything you do is considerate of single-tenant cassandra. If possible, make things pluggable and optional. The round robin request scheduler is an example. The functionality is there but you have to enable it. If it can't be made pluggable/optional, you can get good feedback from the community about proposed solutions in core Cassandra (like for the memtable issue in particular).
>>
>> Anyway, just wanted to chime in with 2 cents about that page (since I created it and was helping maintain it before getting pulled off onto other projects).
>>
>> On Jan 18, 2011, at 1:12 PM, Ed Anuff wrote:
>>
>> > Hi Indika, I've done a lot of work using the keyspace per tenant model, and I'm seeing big problems with the memory consumption, even though it's certainly the most clean way to implement it. Luckily, before I used the keyspace per tenant approach, I'd implemented my system using a single keyspace approach and can still revert back to that. The rest of the stuff for multi-tenancy on the wiki is largely irrelevant, but the keyspace issue is a big concern at the moment.
>> >
>> > Ed
>> >
>> > On Tue, Jan 18, 2011 at 9:40 AM, indika kumara  wrote:
>> > Hi Aaron,
>> >
>> > I read some articles about the Cassandra, and now understand a little bit about trade-offs.
>> >
>> > I feel the goal should be to optimize memory as well as performance. I have to consider the number of column families, the columns per a family, the number of rows, the memtable’s threshold, and so on. I also have to consider how to maximize resource sharing among tenants. However, I feel that a keyspace should be able to be configured based on the tenant’s class (e.g replication factor). As per some resources, I feel that the issue is not in the number of keyspaces, but with the number of CF, the number of the rows in a CF, the numbers of columns, the size of the data in a column, and so on. Am I correct? I appreciate your opinion.
>> >
>> > What would be the suitable approach? A keyspace per tenant (there would be a limit on the tenants per a Cassandra cluster) or a keyspace for all tenant.
>> >
>> > I still would love to expose the Cassandra ‘as-is’ to a tenant virtually yet with acceptable memory consumption and performance.
>> >
>> > Thanks,
>> >
>> > Indika


Re: please help with multiget

2011-01-18 Thread Edward Capriolo
On Tue, Jan 18, 2011 at 4:29 PM, Shu Zhang  wrote:
> Well, I don't think what I'm describing is complicated semantics. I think 
> I've described general batch operation design and something that is 
> symmetrical the batch_mutate method already on the Cassandra API. You are 
> right, I can solve the problem with further denormalization, and the approach 
> of making individual gets in parallel as described by Brandon will work too. 
> I'll be doing one of these for now. But I think neither is as efficient, and 
> I guess I'm still not sure why the multiget is designed the way it is.
>
> The problem with denormalization is you gotta make multiple row writes in 
> place of one, adding load to the server, adding required physical space and 
> losing atomicity on write operations. I know writes are cheap in cassandra, 
> and you can catch failed writes and retry so these problems are not major, 
> but it still seems clear that having a batch-get that works appropriately is 
> a least a little better...
> 
> From: Aaron Morton [aa...@thelastpickle.com]
> Sent: Tuesday, January 18, 2011 12:55 PM
> To: user@cassandra.apache.org
> Subject: Re: please help with multiget
>
> I think the general approach is to denormalise data to remove the need for 
> complicated semantics when reading.
>
> Aaron
>
> On 19/01/2011, at 7:57 AM, Shu Zhang  wrote:
>
>> Well, maybe making a batch-get is not  anymore efficient on the server side 
>> but without it, you can get bottlenecked on client-server connections and 
>> client resources. If the number of requests you want to batch is on the 
>> order of connections in your pool, then yes, making gets in parallel is as 
>> good or maybe better. But what if you want to batch thousands of requests?
>>
>> The server I can scale out, I would want to get my requests there without 
>> needing to wait for connections on my client to free up.
>>
>> I just don't really understand the reasoning for designing muliget_slice the 
>> way it is. I still think if you're gonna have a batch-get request 
>> (multiget_slice), you should be able to add to the batch a reasonable number 
>> of ANY corresponding non-batch get requests. And you can't do that... Plus, 
>> it's not symmetrical to the batch-mutate. Is there a good reason for that?
>> 
>> From: Brandon Williams [dri...@gmail.com]
>> Sent: Monday, January 17, 2011 5:09 PM
>> To: user@cassandra.apache.org
>> Cc: hector-us...@googlegroups.com
>> Subject: Re: please help with multiget
>>
>> On Mon, Jan 17, 2011 at 6:53 PM, Shu Zhang <szh...@mediosystems.com> wrote:
>> Here's the method declaration for quick reference:
>> map<string, list<ColumnOrSuperColumn>> multiget_slice(string keyspace, 
>> list<string> keys, ColumnParent column_parent, SlicePredicate predicate, 
>> ConsistencyLevel consistency_level)
>>
>> It looks like you must have the same SlicePredicate for every key in your 
>> batch retrieval, so what are you suppose to do when you need to retrieve 
>> different columns for different keys?
>>
>> Issue multiple gets in parallel yourself.  Keep in mind that multiget is not 
>> an optimization, in fact, it can work against you when one key exceeds the 
>> rpc timeout, because you get nothing back.
>>
>> -Brandon
>

multiget_slice is very useful IMHO. In my testing, the roundtrip time
for 1000 get requests all being acked individually is much higher than the
roundtrip time for 200 multiget_slice calls grouping 5 keys at a time. For
anyone that needs that type of access they are in good shape.

I was also theorizing that a CF using RowCache with a very, very high
read rate would benefit from "pooling" a bunch of reads together with
multiget.

I do agree that the first time I looked at the multiget_slice
signature I realized I could not do many of the things I was expecting
from a multi-get.
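One middle ground between per-key gets and a single shared predicate, assuming the column sets repeat across many keys, is to bucket keys by their predicate on the client and issue one multiget_slice per bucket. A sketch of the bucketing step (plain Python; the actual client calls are omitted):

```python
from collections import defaultdict

def group_by_predicate(requests):
    # requests maps key -> iterable of column names. Keys that share
    # an identical column set can be served by a single multiget_slice
    # call, since that call takes one SlicePredicate for all its keys.
    groups = defaultdict(list)
    for key, columns in requests.items():
        groups[tuple(sorted(columns))].append(key)
    return dict(groups)

batches = group_by_predicate({
    "r1": ["a", "b"],
    "r2": ["b", "a"],
    "r3": ["c"],
})
# Two batches: r1 and r2 share the ("a", "b") predicate, r3 gets its own.
```

Each resulting batch can then be sent as one multiget_slice, keeping the roundtrip savings described above without forcing every key onto the same predicate.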


json2sstable NPE

2011-01-18 Thread ruslan usifov
Hello

I have problem when use json2sstable (in cassandra 0.7). When i invoke:

json2sstable -K test -c test
D:\apache-cassandra-0.7.0\bin\test-e-1-Data.json
F:\cassandra\test\test\test-e-1-Data.db

I got NPE:

 WARN 01:31:38,750 Schema definitions were defined both locally and in
cassandra.yaml. Definitions in cassandra.yaml were ignored.
Exception in thread "main" java.lang.NullPointerException
at org.apache.cassandra.db.ColumnFamily.create(ColumnFamily.java:68)
at org.apache.cassandra.db.ColumnFamily.create(ColumnFamily.java:62)
at
org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:174)
at
org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:251)


Re: json2sstable NPE

2011-01-18 Thread Aaron Morton
Thats odd, the line before line 68 has an assertion that should have kicked in. Are you on the release version of 0.7.0? Does the "test" CF exist in the keyspace "test" in your cluster?

Aaron

On 19 Jan, 2011, at 11:37 AM, ruslan usifov  wrote:

Hello

I have problem when use json2sstable (in cassandra 0.7). When i invoke:

json2sstable -K test -c test D:\apache-cassandra-0.7.0\bin\test-e-1-Data.json F:\cassandra\test\test\test-e-1-Data.db

I got NPE:

 WARN 01:31:38,750 Schema definitions were defined both locally and in cassandra.yaml. Definitions in cassandra.yaml were ignored.
Exception in thread "main" java.lang.NullPointerException
    at org.apache.cassandra.db.ColumnFamily.create(ColumnFamily.java:68)
    at org.apache.cassandra.db.ColumnFamily.create(ColumnFamily.java:62)
    at org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:174)
    at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:251)


Re: json2sstable NPE

2011-01-18 Thread ruslan usifov
> Thats odd, the line before line 68 has an assertion that should have kicked
> in. Are you on the release version of 0.7.0 ?

Yes i use release downloaded from official site

> Does the "test" CF exist in the keyspace "test" in your cluster ?

no it doesn't exists


Re: json2sstable NPE

2011-01-18 Thread Aaron Morton
AFAIK the CF must exist. Create it and try again.

A

On 19 Jan, 2011, at 12:03 PM, ruslan usifov  wrote:

> Thats odd, the line before line 68 has an assertion that should have kicked in. Are you on the release version of 0.7.0 ?

Yes i use release downloaded from official site

> Does the "test" CF exist in the keyspace "test" in your cluster ?

no it doesn't exists


Re: Java client

2011-01-18 Thread Jason Pell
Pelops is a nice lib.  I found it very easy to use and the developers
are very responsive to requests for information and/or bugs, etc.
I have not tried hector

On Tue, Jan 18, 2011 at 11:11 PM, Alois Bělaška  wrote:
> Definitelly Pelops https://github.com/s7/scale7-pelops
>
> 2011/1/18 Noble Paul നോബിള്‍ नोब्ळ् 
>>
>> What is the most commonly used java client library? Which is the the most
>> mature/feature complete?
>> Noble
>


Re: Tombstone lifespan after multiple deletions

2011-01-18 Thread Germán Kondolf
Maybe it could be taken into account when the compaction is executed,
if I only have a consecutive list of uninterrupted tombstones it could
only care about the first. It sounds like the-way-it-should-be, maybe
as a part of the "row-reduce" process.

Is it feasible? Looking into the CASSANDRA-1074 sounds like it should.

//GK
http://twitter.com/germanklf
http://code.google.com/p/seide/

On Tue, Jan 18, 2011 at 10:55 AM, Sylvain Lebresne  wrote:
> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn  wrote:
>> Thanks, Aaron, but I'm not 100% clear.
>>
>> My situation is this: My use case spins off rows (not columns) that I no
>> longer need and want to delete. It is possible that these rows were never
>> created in the first place, or were already deleted. This is a very large
>> cleanup task that normally deletes a lot of rows, and the last thing that I
>> want to do is create tombstones for rows that didn't exist in the first
>> place, or lengthen the life on disk of tombstones of rows that are already
>> deleted.
>>
>> So the question is: before I delete, do I have to retrieve the row to see if
>> it exists in the first place?
>
> Yes, in your situation you do.
>
>>
>>
>>
>> On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton 
>> wrote:
>>>
>>> AFAIK that's not necessary, there is no need to worry about previous
>>> deletes. You can delete stuff that does not even exist, neither batch_mutate
>>> or remove are going to throw an error.
>>> All the columns that were (roughly speaking) present at your first
>>> deletion will be available for GC at the end of the first tombstones life.
>>> Same for the second.
>>> Say you were to write a col between the two deletes with the same name as
>>> one present at the start. The first version of the col is avail for GC after
>>> tombstone 1, and the second after tombstone 2.
>>> Hope that helps
>>> Aaron
>>> On 18/01/2011, at 9:37 PM, David Boxenhorn  wrote:
>>>
>>> Thanks. In other words, before I delete something, I should check to see
>>> whether it exists as a live row in the first place.
>>>
>>> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King  wrote:

 On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn 
 wrote:
 > If I delete a row, and later on delete it again, before GCGraceSeconds
 > has
 > elapsed, does the tombstone live longer?

 Each delete is a new tombstone, which should answer your question.

 -ryan

 > In other words, if I have the following scenario:
 >
 > GCGraceSeconds = 10 days
 > On day 1 I delete a row
 > On day 5 I delete the row again
 >
 > Will the tombstone be removed on day 10 or day 15?
 >
>>>
>>
>>
>


Re: Tombstone lifespan after multiple deletions

2011-01-18 Thread Jonathan Ellis
If you mean that multiple tombstones for the same row or column should
be merged into a single one at compaction time, then yes, that is what
happens.

On Tue, Jan 18, 2011 at 7:53 PM, Germán Kondolf
 wrote:
> Maybe it could be taken into account when the compaction is executed,
> if I only have a consecutive list of uninterrupted tombstones it could
> only care about the first. It sounds like the-way-it-should-be, maybe
> as a part of the "row-reduce" process.
>
> Is it feasible? Looking into the CASSANDRA-1074 sounds like it should.
>
> //GK
> http://twitter.com/germanklf
> http://code.google.com/p/seide/
>
> On Tue, Jan 18, 2011 at 10:55 AM, Sylvain Lebresne  
> wrote:
>> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn  wrote:
>>> Thanks, Aaron, but I'm not 100% clear.
>>>
>>> My situation is this: My use case spins off rows (not columns) that I no
>>> longer need and want to delete. It is possible that these rows were never
>>> created in the first place, or were already deleted. This is a very large
>>> cleanup task that normally deletes a lot of rows, and the last thing that I
>>> want to do is create tombstones for rows that didn't exist in the first
>>> place, or lengthen the life on disk of tombstones of rows that are already
>>> deleted.
>>>
>>> So the question is: before I delete, do I have to retrieve the row to see if
>>> it exists in the first place?
>>
>> Yes, in your situation you do.
>>
>>>
>>>
>>>
>>> On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton 
>>> wrote:

 AFAIK that's not necessary, there is no need to worry about previous
 deletes. You can delete stuff that does not even exist, neither 
 batch_mutate
 or remove are going to throw an error.
 All the columns that were (roughly speaking) present at your first
 deletion will be available for GC at the end of the first tombstones life.
 Same for the second.
 Say you were to write a col between the two deletes with the same name as
 one present at the start. The first version of the col is avail for GC 
 after
 tombstone 1, and the second after tombstone 2.
 Hope that helps
 Aaron
 On 18/01/2011, at 9:37 PM, David Boxenhorn  wrote:

 Thanks. In other words, before I delete something, I should check to see
 whether it exists as a live row in the first place.

 On Tue, Jan 18, 2011 at 9:24 AM, Ryan King  wrote:
>
> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn 
> wrote:
> > If I delete a row, and later on delete it again, before GCGraceSeconds
> > has
> > elapsed, does the tombstone live longer?
>
> Each delete is a new tombstone, which should answer your question.
>
> -ryan
>
> > In other words, if I have the following scenario:
> >
> > GCGraceSeconds = 10 days
> > On day 1 I delete a row
> > On day 5 I delete the row again
> >
> > Will the tombstone be removed on day 10 or day 15?
> >

>>>
>>>
>>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
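A toy model of the merge Jonathan describes, reducing each column to a simplified (timestamp, is_tombstone) pair: reconciliation keeps only the newest version of each column, so stacked tombstones collapse into one, and a surviving tombstone can be purged once it is older than GCGraceSeconds. The real code tracks a separate local deletion time for the grace-period check; this sketch ignores that detail and is not the actual implementation:

```python
def compact_column(versions, gc_grace, now):
    # versions: list of (timestamp, is_tombstone) entries for one column
    # name, as a compaction might see them across several sstables.
    ts, dead = max(versions)              # newest write wins on merge
    if dead and ts < now - gc_grace:      # tombstone past its grace period
        return None                       # purged entirely
    return (ts, dead)                     # one merged version is kept

# Two stacked tombstones merge into one...
assert compact_column([(1, True), (5, True)], gc_grace=10, now=12) == (5, True)
# ...and once GCGraceSeconds has passed, nothing is kept at all.
assert compact_column([(1, True), (5, True)], gc_grace=10, now=20) is None
# A newer live write shadows any earlier tombstones.
assert compact_column([(1, True), (5, False)], gc_grace=10, now=20) == (5, False)
```

This also illustrates David's question upthread: deleting an already-deleted (or never-written) row just adds another version that collapses away at the next compaction.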


Re: Tombstone lifespan after multiple deletions

2011-01-18 Thread Zhu Han
If the tombstone is older than the row or column inserted later, is the
tombstone skipped entirely after compaction?

best regards,
hanzhu


On Wed, Jan 19, 2011 at 11:16 AM, Jonathan Ellis  wrote:

> If you mean that multiple tombstones for the same row or column should
> be merged into a single one at compaction time, then yes, that is what
> happens.
>
> On Tue, Jan 18, 2011 at 7:53 PM, Germán Kondolf
>  wrote:
> > Maybe it could be taken into account when the compaction is executed,
> > if I only have a consecutive list of uninterrupted tombstones it could
> > only care about the first. It sounds like the-way-it-should-be, maybe
> > as a part of the "row-reduce" process.
> >
> > Is it feasible? Looking into the CASSANDRA-1074 sounds like it should.
> >
> > //GK
> > http://twitter.com/germanklf
> > http://code.google.com/p/seide/
> >
> > On Tue, Jan 18, 2011 at 10:55 AM, Sylvain Lebresne 
> wrote:
> >> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn 
> wrote:
> >>> Thanks, Aaron, but I'm not 100% clear.
> >>>
> >>> My situation is this: My use case spins off rows (not columns) that I
> no
> >>> longer need and want to delete. It is possible that these rows were
> never
> >>> created in the first place, or were already deleted. This is a very
> large
> >>> cleanup task that normally deletes a lot of rows, and the last thing
> that I
> >>> want to do is create tombstones for rows that didn't exist in the first
> >>> place, or lengthen the life on disk of tombstones of rows that are
> already
> >>> deleted.
> >>>
> >>> So the question is: before I delete, do I have to retrieve the row to
> see if
> >>> it exists in the first place?
> >>
> >> Yes, in your situation you do.
> >>
> >>>
> >>>
> >>>
> >>> On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton <aa...@thelastpickle.com> wrote:
> 
>  AFAIK that's not necessary, there is no need to worry about previous
>  deletes. You can delete stuff that does not even exist, neither
> batch_mutate
>  or remove are going to throw an error.
>  All the columns that were (roughly speaking) present at your first
>  deletion will be available for GC at the end of the first tombstones
> life.
>  Same for the second.
>  Say you were to write a col between the two deletes with the same name
> as
>  one present at the start. The first version of the col is avail for GC
> after
>  tombstone 1, and the second after tombstone 2.
>  Hope that helps
>  Aaron
>  On 18/01/2011, at 9:37 PM, David Boxenhorn  wrote:
> 
>  Thanks. In other words, before I delete something, I should check to
> see
>  whether it exists as a live row in the first place.
> 
>  On Tue, Jan 18, 2011 at 9:24 AM, Ryan King  wrote:
> >
> > On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn 
> > wrote:
> > > If I delete a row, and later on delete it again, before
> GCGraceSeconds
> > > has
> > > elapsed, does the tombstone live longer?
> >
> > Each delete is a new tombstone, which should answer your question.
> >
> > -ryan
> >
> > > In other words, if I have the following scenario:
> > >
> > > GCGraceSeconds = 10 days
> > > On day 1 I delete a row
> > > On day 5 I delete the row again
> > >
> > > Will the tombstone be removed on day 10 or day 15?
> > >
> 
> >>>
> >>>
> >>
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>


Re: Tombstone lifespan after multiple deletions

2011-01-18 Thread Zhu Han
I'm not clear here.  Are you worried about the later inserted tombstone
prevents the whole row from being reclaimed and the storage space can not be
freed?

To my knowledge,  after major compaction,  only  the row key and tombstone
are kept. Is it a big deal?

best regards,
hanzhu


On Tue, Jan 18, 2011 at 9:41 PM, David Boxenhorn  wrote:

> Thanks, Aaron, but I'm not 100% clear.
>
> My situation is this: My use case spins off rows (not columns) that I no
> longer need and want to delete. It is possible that these rows were never
> created in the first place, or were already deleted. This is a very large
> cleanup task that normally deletes a lot of rows, and the last thing that I
> want to do is create tombstones for rows that didn't exist in the first
> place, or lengthen the life on disk of tombstones of rows that are already
> deleted.
>
> So the question is: before I delete, do I have to retrieve the row to see
> if it exists in the first place?
>
>
>
> On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton wrote:
>
>> AFAIK that's not necessary, there is no need to worry about previous
>> deletes. You can delete stuff that does not even exist, neither batch_mutate
>> or remove are going to throw an error.
>>
>> All the columns that were (roughly speaking) present at your first
>> deletion will be available for GC at the end of the first tombstones life.
>> Same for the second.
>>
>> Say you were to write a col between the two deletes with the same name as
>> one present at the start. The first version of the col is avail for GC after
>> tombstone 1, and the second after tombstone 2.
>>
>> Hope that helps
>> Aaron
>>
>> On 18/01/2011, at 9:37 PM, David Boxenhorn  wrote:
>>
>> Thanks. In other words, before I delete something, I should check to see
>> whether it exists as a live row in the first place.
>>
>> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King < 
>> r...@twitter.com> wrote:
>>
>>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn < 
>>> da...@lookin2.com> wrote:
>>> > If I delete a row, and later on delete it again, before GCGraceSeconds
>>> has
>>> > elapsed, does the tombstone live longer?
>>>
>>> Each delete is a new tombstone, which should answer your question.
>>>
>>> -ryan
>>>
>>> > In other words, if I have the following scenario:
>>> >
>>> > GCGraceSeconds = 10 days
>>> > On day 1 I delete a row
>>> > On day 5 I delete the row again
>>> >
>>> > Will the tombstone be removed on day 10 or day 15?
>>> >
>>>
>>
>>
>


Re: Tombstone lifespan after multiple deletions

2011-01-18 Thread Germán Kondolf
Yes, that's what I meant, but correct me if I'm wrong: when a deletion comes
after another deletion for the same row or column, does the gc-before count
against the last one?

Maybe, knowing that all the subsequent versions of a deletion are deletions
too, it could count the first timestamp against gc-grace-seconds when reducing
& compacting.

// Germán Kondolf
http://twitter.com/germanklf
http://code.google.com/p/seide/
// @i4

On 19/01/2011, at 00:16, Jonathan Ellis  wrote:

> If you mean that multiple tombstones for the same row or column should
> be merged into a single one at compaction time, then yes, that is what
> happens.
> 
> On Tue, Jan 18, 2011 at 7:53 PM, Germán Kondolf
>  wrote:
>> Maybe it could be taken into account when the compaction is executed,
>> if I only have a consecutive list of uninterrupted tombstones it could
>> only care about the first. It sounds like the-way-it-should-be, maybe
>> as a part of the "row-reduce" process.
>> 
>> Is it feasible? Looking into the CASSANDRA-1074 sounds like it should.
>> 
>> //GK
>> http://twitter.com/germanklf
>> http://code.google.com/p/seide/
>> 
>> On Tue, Jan 18, 2011 at 10:55 AM, Sylvain Lebresne  
>> wrote:
>>> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn  wrote:
 Thanks, Aaron, but I'm not 100% clear.
 
 My situation is this: My use case spins off rows (not columns) that I no
 longer need and want to delete. It is possible that these rows were never
 created in the first place, or were already deleted. This is a very large
 cleanup task that normally deletes a lot of rows, and the last thing that I
 want to do is create tombstones for rows that didn't exist in the first
 place, or lengthen the life on disk of tombstones of rows that are already
 deleted.
 
 So the question is: before I delete, do I have to retrieve the row to see 
 if
 it exists in the first place?
>>> 
>>> Yes, in your situation you do.
>>> 
 
 
 
 On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton 
 wrote:
> 
> AFAIK that's not necessary, there is no need to worry about previous
> deletes. You can delete stuff that does not even exist, neither 
> batch_mutate
> or remove are going to throw an error.
> All the columns that were (roughly speaking) present at your first
> deletion will be available for GC at the end of the first tombstones life.
> Same for the second.
> Say you were to write a col between the two deletes with the same name as
> one present at the start. The first version of the col is avail for GC 
> after
> tombstone 1, and the second after tombstone 2.
> Hope that helps
> Aaron
> On 18/01/2011, at 9:37 PM, David Boxenhorn  wrote:
> 
> Thanks. In other words, before I delete something, I should check to see
> whether it exists as a live row in the first place.
> 
> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King  wrote:
>> 
>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn 
>> wrote:
>>> If I delete a row, and later on delete it again, before GCGraceSeconds
>>> has
>>> elapsed, does the tombstone live longer?
>> 
>> Each delete is a new tombstone, which should answer your question.
>> 
>> -ryan
>> 
>>> In other words, if I have the following scenario:
>>> 
>>> GCGraceSeconds = 10 days
>>> On day 1 I delete a row
>>> On day 5 I delete the row again
>>> 
>>> Will the tombstone be removed on day 10 or day 15?
>>> 
> 
 
 
>>> 
>> 
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
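Sylvain's answer above ("yes, in your situation you do") can be sketched with a small in-memory model. This is illustrative only, not Cassandra code: it contrasts a blind delete, which always records a tombstone, with a read-before-delete guard, which avoids creating markers for rows that do not exist.

```python
class ToyStore:
    def __init__(self):
        self.rows = {}        # key -> value for live rows
        self.tombstones = {}  # key -> timestamp of the newest deletion

    def delete(self, key, ts):
        # Cassandra-style delete: never errors, always records a marker
        self.rows.pop(key, None)
        self.tombstones[key] = max(ts, self.tombstones.get(key, ts))

    def delete_if_present(self, key, ts):
        # Guarded delete: read first, only write a tombstone for live rows
        if key in self.rows:
            self.delete(key, ts)

store = ToyStore()
store.rows["a"] = "value"
store.delete_if_present("a", ts=1)  # live row: tombstone written
store.delete_if_present("b", ts=1)  # absent row: nothing written
print(sorted(store.tombstones))     # ['a']
```

Note the trade-off: the extra read costs a round trip per key, and it is not atomic with the delete, so a concurrent writer can still insert the row between the read and the delete.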



Re: Tombstone lifespan after multiple deletions

2011-01-18 Thread Zhu Han
On Wed, Jan 19, 2011 at 11:35 AM, Germán Kondolf
wrote:

> Yes, that's what I meant, but correct me if I'm wrong, when a deletion
> comes after another deletion for the same row or column will the gc-before
> count against the last one, isn't it?
>
> IIRC, after compaction, even if the row key is not wiped, all the columns are
replaced by the youngest tombstone. I do not understand very clearly the
benefit of wiping out the whole row as early as possible.


>
> Maybe knowing that all the subsequent versions of a deletion are deletions
> too, it could take the first timestamp against the gc-grace-seconds when is
> reducing & compacting.
>
> // Germán Kondolf
> http://twitter.com/germanklf
> http://code.google.com/p/seide/
> // @i4
>
> On 19/01/2011, at 00:16, Jonathan Ellis  wrote:
>
> > If you mean that multiple tombstones for the same row or column should
> > be merged into a single one at compaction time, then yes, that is what
> > happens.
> >
> > On Tue, Jan 18, 2011 at 7:53 PM, Germán Kondolf
> >  wrote:
> >> Maybe it could be taken into account when the compaction is executed,
> >> if I only have a consecutive list of uninterrupted tombstones it could
> >> only care about the first. It sounds like the-way-it-should-be, maybe
> >> as a part of the "row-reduce" process.
> >>
> >> Is it feasible? Looking into the CASSANDRA-1074 sounds like it should.
> >>
> >> //GK
> >> http://twitter.com/germanklf
> >> http://code.google.com/p/seide/
> >>
> >> On Tue, Jan 18, 2011 at 10:55 AM, Sylvain Lebresne 
> wrote:
> >>> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn 
> wrote:
>  Thanks, Aaron, but I'm not 100% clear.
> 
>  My situation is this: My use case spins off rows (not columns) that I
> no
>  longer need and want to delete. It is possible that these rows were
> never
>  created in the first place, or were already deleted. This is a very
> large
>  cleanup task that normally deletes a lot of rows, and the last thing
> that I
>  want to do is create tombstones for rows that didn't exist in the
> first
>  place, or lengthen the life on disk of tombstones of rows that are
> already
>  deleted.
> 
>  So the question is: before I delete, do I have to retrieve the row to
> see if
>  it exists in the first place?
> >>>
> >>> Yes, in your situation you do.
> >>>
> 
> 
> 
>  On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton <
> aa...@thelastpickle.com>
>  wrote:
> >
> > AFAIK that's not necessary, there is no need to worry about previous
> > deletes. You can delete stuff that does not even exist, neither
> batch_mutate
> > or remove are going to throw an error.
> > All the columns that were (roughly speaking) present at your first
> > deletion will be available for GC at the end of the first tombstones
> life.
> > Same for the second.
> > Say you were to write a col between the two deletes with the same
> name as
> > one present at the start. The first version of the col is avail for
> GC after
> > tombstone 1, and the second after tombstone 2.
> > Hope that helps
> > Aaron
> > On 18/01/2011, at 9:37 PM, David Boxenhorn 
> wrote:
> >
> > Thanks. In other words, before I delete something, I should check to
> see
> > whether it exists as a live row in the first place.
> >
> > On Tue, Jan 18, 2011 at 9:24 AM, Ryan King  wrote:
> >>
> >> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn  >
> >> wrote:
> >>> If I delete a row, and later on delete it again, before
> GCGraceSeconds
> >>> has
> >>> elapsed, does the tombstone live longer?
> >>
> >> Each delete is a new tombstone, which should answer your question.
> >>
> >> -ryan
> >>
> >>> In other words, if I have the following scenario:
> >>>
> >>> GCGraceSeconds = 10 days
> >>> On day 1 I delete a row
> >>> On day 5 I delete the row again
> >>>
> >>> Will the tombstone be removed on day 10 or day 15?
> >>>
> >
> 
> 
> >>>
> >>
> >
> >
> >
> > --
> > Jonathan Ellis
> > Project Chair, Apache Cassandra
> > co-founder of Riptano, the source for professional Cassandra support
> > http://riptano.com
>
>


about the hector client

2011-01-18 Thread raoyixuan (Shandy)
Can you tell me the exact steps to create a keyspace with the hector client?



华为技术有限公司 Huawei Technologies Co., Ltd.[Company_logo]




Phone: 28358610
Mobile: 13425182943
Email: raoyix...@huawei.com
地址:深圳市龙岗区坂田华为基地 邮编:518129
Huawei Technologies Co., Ltd.
Bantian, Longgang District,Shenzhen 518129, P.R.China
http://www.huawei.com

本邮件及其附件含有华为公司的保密信息,仅限于发送给上面地址中列出的个人或群组。禁
止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、或散发)本邮件中
的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本邮件!
This e-mail and its attachments contain confidential information from HUAWEI, 
which
is intended only for the person or entity whose address is listed above. Any 
use of the
information contained herein in any way (including, but not limited to, total 
or partial
disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify 
the sender by
phone or email immediately and delete it!


Re: about the hector client

2011-01-18 Thread Aaron Morton
Try the hector user group for help on how to use the client: http://groups.google.com/group/hector-users

You can also create a keyspace in a Cassandra cluster via the cassandra-cli command line interface. Take a look at the tool's online help if you're interested.

Aaron

On 19 Jan, 2011, at 05:00 PM, "raoyixuan (Shandy)"  wrote:







Can you tell me the exactly steps to create a keyspace by hector client?
 
 
 
 





Re: about the hector client

2011-01-18 Thread Jeremy Hanna
Definitely get involved with that google group, but some examples are found 
here:
https://github.com/zznate/hector-examples/blob/master/src/main/java/com/riptano/cassandra/hector/example/SchemaManipulation.java

On Jan 18, 2011, at 10:17 PM, Aaron Morton wrote:

> Try the hector user group for help on how to use the client 
> http://groups.google.com/group/hector-users
> 
> You can also create a keyspace in a cassandra cluster via the cassandra-cli 
> command line interface Take a look at the tools online help if you're 
> interested. 
> 
> Aaron
> 
> On 19 Jan, 2011,at 05:00 PM, "raoyixuan (Shandy)"  
> wrote:
> 
>> Can you tell me the exactly steps to create a keyspace by hector client?
>>  
>>  
>>  
>>  



Re: about the hector client

2011-01-18 Thread Jonathan Ellis
Most often, you will define schema with the cli.  Programmatic schema
definition is "advanced" in Cassandra, just as in relational
databases.

On Tue, Jan 18, 2011 at 10:19 PM, Jeremy Hanna
 wrote:
> Definitely get involved with that google group, but some examples are found 
> here:
> https://github.com/zznate/hector-examples/blob/master/src/main/java/com/riptano/cassandra/hector/example/SchemaManipulation.java
>
> On Jan 18, 2011, at 10:17 PM, Aaron Morton wrote:
>
>> Try the hector user group for help on how to use the client 
>> http://groups.google.com/group/hector-users
>>
>> You can also create a keyspace in a cassandra cluster via the cassandra-cli 
>> command line interface Take a look at the tools online help if you're 
>> interested.
>>
>> Aaron
>>
>> On 19 Jan, 2011,at 05:00 PM, "raoyixuan (Shandy)"  
>> wrote:
>>
>>> Can you tell me the exactly steps to create a keyspace by hector client?
>>>
>>>
>>>
>>>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
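Since, as Jonathan notes, schema is most often defined with the cli, here is a minimal cassandra-cli session sketch. This uses 0.7-era syntax; the keyspace and column family names are made up, and exact options vary by version, so check the tool's `help;` output:

```
connect localhost/9160;
create keyspace MyKeyspace with replication_factor = 1;
use MyKeyspace;
create column family MyData with comparator = UTF8Type;
```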


Re: about the hector client

2011-01-18 Thread Aaron Morton
OK if I add a link to https://github.com/zznate/hector-examples to the wiki page for clients, http://wiki.apache.org/cassandra/ClientOptions ?

A

On 19 Jan, 2011, at 05:22 PM, Jonathan Ellis  wrote:

Most often, you will define schema with the cli.  Programmatic schema
definition is "advanced" in Cassandra, just as in relational
databases.

On Tue, Jan 18, 2011 at 10:19 PM, Jeremy Hanna
 wrote:
> Definitely get involved with that google group, but some examples are found here:
> https://github.com/zznate/hector-examples/blob/master/src/main/java/com/riptano/cassandra/hector/example/SchemaManipulation.java
>
> On Jan 18, 2011, at 10:17 PM, Aaron Morton wrote:
>
>> Try the hector user group for help on how to use the client http://groups.google.com/group/hector-users
>>
>> You can also create a keyspace in a cassandra cluster via the cassandra-cli command line interface Take a look at the tools online help if you're interested.
>>
>> Aaron
>>
>> On 19 Jan, 2011,at 05:00 PM, "raoyixuan (Shandy)"  wrote:
>>
>>> Can you tell me the exactly steps to create a keyspace by hector client?
>>>
>>>
>>>
>>>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


RE: about the hector client

2011-01-18 Thread raoyixuan (Shandy)
The URL is unavailable.

From: Aaron Morton [mailto:aa...@thelastpickle.com]
Sent: Wednesday, January 19, 2011 12:17 PM
To: user@cassandra.apache.org
Subject: Re: about the hector client

Try the hector user group for help on how to use the client 
http://groups.google.com/group/hector-users

You can also create a keyspace in a cassandra cluster via the cassandra-cli 
command line interface Take a look at the tools online help if you're 
interested.

Aaron

On 19 Jan, 2011, at 05:00 PM, "raoyixuan (Shandy)"  wrote:
Can you tell me the exactly steps to create a keyspace by hector client?






Re: about the hector client

2011-01-18 Thread Ashish
Working fine for me. Can you please try again?

thanks
ashish

On Wed, Jan 19, 2011 at 11:42 AM, raoyixuan (Shandy)
 wrote:
> The url is unavailable
>
>
>
> From: Aaron Morton [mailto:aa...@thelastpickle.com]
> Sent: Wednesday, January 19, 2011 12:17 PM
> To: user@cassandra.apache.org
> Subject: Re: about the hector client
>
>
>
> Try the hector user group for help on how to use the
> client http://groups.google.com/group/hector-users
>
>
>
> You can also create a keyspace in a cassandra cluster via the cassandra-cli
> command line interface Take a look at the tools online help if you're
> interested.
>
>
>
> Aaron
>
> On 19 Jan, 2011,at 05:00 PM, "raoyixuan (Shandy)" 
> wrote:
>
> Can you tell me the exactly steps to create a keyspace by hector client?
>
>
>
>
>
>
>


RE: about the hector client

2011-01-18 Thread raoyixuan (Shandy)
I will try it again, thank you.

-Original Message-
From: Ashish [mailto:paliwalash...@gmail.com] 
Sent: Wednesday, January 19, 2011 2:16 PM
To: user@cassandra.apache.org
Subject: Re: about the hector client

Working fine for me. Can you pls try again.

thanks
ashish

On Wed, Jan 19, 2011 at 11:42 AM, raoyixuan (Shandy)
 wrote:
> The url is unavailable
>
>
>
> From: Aaron Morton [mailto:aa...@thelastpickle.com]
> Sent: Wednesday, January 19, 2011 12:17 PM
> To: user@cassandra.apache.org
> Subject: Re: about the hector client
>
>
>
> Try the hector user group for help on how to use the
> client http://groups.google.com/group/hector-users
>
>
>
> You can also create a keyspace in a cassandra cluster via the cassandra-cli
> command line interface Take a look at the tools online help if you're
> interested.
>
>
>
> Aaron
>
> On 19 Jan, 2011,at 05:00 PM, "raoyixuan (Shandy)" 
> wrote:
>
> Can you tell me the exactly steps to create a keyspace by hector client?
>
>
>
>
>
>
>


Re: Java client

2011-01-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
Thanks everyone. I guess, I should go with hector
On 18 Jan 2011 17:41, "Alois Bělaška"  wrote:
> Definitely Pelops https://github.com/s7/scale7-pelops
>
> 2011/1/18 Noble Paul നോബിള്‍ नोब्ळ् 
>
> >> What is the most commonly used Java client library? Which is the most
>> mature/feature complete?
>> Noble
>>


Keys must be written in ascending order

2011-01-18 Thread David King
I'm upgrading a 0.6 cluster to 0.7 in a testing environment. In cleaning up
one of the nodes I get the exception below. Googling around seems to reveal 
people having trouble with it caused by too-small heap sizes but that doesn't 
look to be what's going on here. Am I missing something obvious?

$ time ./cassandra-0.7/bin/nodetool -h cassa7test01 cleanup
Error occured while cleaning up keyspace keyspace
java.util.concurrent.ExecutionException: java.io.IOException: Keys must be 
written in ascending order.
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
at java.util.concurrent.FutureTask.get(FutureTask.java:111)
at 
org.apache.cassandra.db.CompactionManager.performCleanup(CompactionManager.java:180)
at 
org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:909)
at 
org.apache.cassandra.service.StorageService.forceTableCleanup(StorageService.java:1127)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:111)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:45)
at 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:226)
at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:251)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:857)
at 
com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:795)
at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1449)
at 
javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:90)
at 
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1284)
at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1382)
at 
javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:807)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
at sun.rmi.transport.Transport$1.run(Transport.java:177)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
Caused by: java.io.IOException: Keys must be written in ascending order.
at 
org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:107)
at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:124)
at 
org.apache.cassandra.db.CompactionManager.doCleanupCompaction(CompactionManager.java:411)
at 
org.apache.cassandra.db.CompactionManager.access$400(CompactionManager.java:54)
at 
org.apache.cassandra.db.CompactionManager$2.call(CompactionManager.java:171)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
... 3 more

real	14m27.895s
user	0m0.670s
sys 0m0.200s
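The invariant behind the "Keys must be written in ascending order" error can be sketched as a toy model. This is a simplification, not Cassandra's implementation: the real `SSTableWriter.beforeAppend` compares DecoratedKeys in the partitioner's token order, but the shape of the check is the same.

```python
class ToySSTableWriter:
    def __init__(self):
        self.last_key = None

    def append(self, key):
        # Each appended key must sort strictly after the previous one,
        # because the SSTable index assumes sorted on-disk order.
        if self.last_key is not None and key <= self.last_key:
            raise IOError("Keys must be written in ascending order.")
        self.last_key = key

w = ToySSTableWriter()
w.append("apple")
w.append("banana")
try:
    w.append("apple")  # out of order: rejected
except IOError as e:
    print(e)  # Keys must be written in ascending order.
```

During cleanup, the node rewrites its SSTables through this kind of writer, so hitting the error there suggests the input keys were not in the order the writer expected rather than a heap-size problem.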




Re: Multi-tenancy, and authentication and authorization

2011-01-18 Thread David Boxenhorn
I think tuning of Cassandra is overly complex, and even with a single tenant
you can run into problems with too many CFs.

Right now there is a one-to-one mapping between memtables and SSTables.
Instead of that, would it be possible to have one giant memtable for each
Cassandra instance, with partial flushing to SSTs?

It seems to me like a single memtable would make it MUCH easier to tune
Cassandra, since the decision whether to (partially) flush the memtable to
disk could be made on a node-wide basis, based on the resources you really
have, instead of the guess-work that we are forced to do today.
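The node-wide flush decision proposed above can be sketched as a toy policy: instead of per-CF memtable thresholds, a single memory budget for the node triggers flushes, largest memtable first. This is purely illustrative (the budget, names, and policy are made up, not Cassandra behavior).

```python
MEMORY_BUDGET = 100  # illustrative node-wide budget, in arbitrary units

def flush_decision(memtable_sizes):
    """Given {cf_name: bytes_in_memory}, return the CFs to flush,
    largest first, until total usage fits the node-wide budget."""
    to_flush = []
    total = sum(memtable_sizes.values())
    remaining = dict(memtable_sizes)
    while total > MEMORY_BUDGET and remaining:
        cf = max(remaining, key=remaining.get)  # flush the biggest memtable
        to_flush.append(cf)
        total -= remaining.pop(cf)
    return to_flush

sizes = {"users": 60, "events": 45, "tiny": 5}
print(flush_decision(sizes))  # ['users'] -- flushing the biggest CF is enough
```

The appeal of such a policy is exactly the point made above: the flush decision is driven by actual node-wide memory pressure rather than per-CF guesses, so small, rarely written CFs never force tiny SSTables onto disk.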