questions about consistency

2010-04-21 Thread Даниел Симеонов
Hello,
   I am pretty new to Cassandra and I have some questions; they may seem
trivial, but I am still quite new to the subject. The first is about the lack
of a compareAndSet() operation: as I understand it, this is not currently
supported in Cassandra. Do you know of use cases that really require such an
operation, and how do they currently work around its absence?
The second topic I'd like to discuss a little more is read repair. As I
understand it, it is driven by the timestamps supplied by the client
application servers. Since computer clocks diverge (they require
synchronization algorithms running regularly), there should be a time frame
during which the order of client requests written to the database is not
guaranteed. Do you have real-world experience with this? Is this similar to
causal consistency ( http://en.wikipedia.org/wiki/Causal_consistency )? What
happens if two application servers try to update the same data and supply one
and the same timestamp (it could happen, although rarely)? What if they try to
update several columns in a batch operation this way; is there a chance that
the column values could be intermixed between the two update requests?
I have one last question, about the consistency level ALL: do you know of real
use cases where it is required (instead of QUORUM), and why (for both reads
and writes)?
Thank you very much for your help to better understand 'Cassandra'!
Best regards, Daniel.


problem with get_key_range in cassandra 0.4.1

2010-04-21 Thread ROGER PUIG GANZA
Hi all.
I'm benchmarking several NoSQL datastores and I'm going nuts with Cassandra.
The version of Cassandra we are using is 0.4.1; I know 0.4.1 is a bit outdated,
but my implementation is built against that version.


The thing is that every time the test runs, I need to reset the data inside the
datastore to try different workloads.
That is easy on MySQL with a TRUNCATE statement.
I need to delete all columns in each columnFamily. What I do is get all the
keys with get_key_range, iterate over the list, and remove them, like this:

List<String> keys = client.get_key_range(keyspace, columnFamily, min_value,
        max_value, count, defaultConsistencyLevel);
Iterator<String> iterator = keys.iterator();
String key;
while (iterator.hasNext()) {
    key = iterator.next();
    client.remove(keyspace, key, path, time, defaultConsistencyLevel);
}

The problem is: client.get_key_range throws "Internal error processing
get_key_range".

My questions are: is there any workaround? Am I missing any configuration
setting which could help me?


Thank you very much in advance
Roger Puig Ganza



Re: questions about consistency

2010-04-21 Thread Paul Prescod
I'm not an expert, so take what I say with a grain of salt.

2010/4/21 Даниел Симеонов :
> Hello,
>    I am pretty new to Cassandra and I have some questions, they may seem
> trivial, but still I am pretty new to the subject. First is about the lack
> of a compareAndSet() operation, as I understood it is not supported
> currently in Cassandra, do you know of use cases which really require such
> operations and how these use cases currently workaround this .

I think your question is paradoxical. If the use case really requires
the operation then there is no workaround by definition. The existence
of the workaround implies that the use case really did not require the
operation.

Anyhow, vector clocks are probably relevant to this question and your next one.

> Second topic I'd like to discuss a little bit more is about the read repair,
> as I understand is that it is being done by the timestamps supplied by the
> client application servers. Since computer clocks (which requires
> synchronization algorithms working regularly) diverge there should be a time
> frame during which the order of the client request written to the database
> is not guaranteed, do you have real world experiences with this? Is this
> similar to the casual consistency (
> http://en.wikipedia.org/wiki/Causal_consistency ) .What happens if two
> application servers try to update the same data and supply one and the same
> timestamp (it could happen although rarely), what if they try to update
> several columns in batch operation this way, is there a chance that the
> column value could be intermixed between the two update requests?

All of this is changing with vector clocks in Cassandra 0.7.

https://issues.apache.org/jira/browse/CASSANDRA-580
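
For readers unfamiliar with the idea behind CASSANDRA-580, here is a minimal,
illustrative vector clock sketch (not Cassandra's design): each writer gets its
own counter, and one version supersedes another only if all of its counters are
at least as large; otherwise the two updates are concurrent and have to be
reconciled explicitly.

// Illustrative only -- a toy vector clock, not Cassandra's implementation.
import java.util.HashMap;
import java.util.Map;

class VectorClock {
    private final Map<String, Long> counters = new HashMap<String, Long>();

    // Record one more write from the given writer.
    void tick(String writerId) {
        Long current = counters.get(writerId);
        counters.put(writerId, current == null ? 1L : current + 1L);
    }

    // True if this clock has seen at least everything 'other' has seen.
    boolean supersedes(VectorClock other) {
        for (Map.Entry<String, Long> entry : other.counters.entrySet()) {
            Long mine = counters.get(entry.getKey());
            if (mine == null || mine < entry.getValue()) {
                return false;
            }
        }
        return true;
    }
}

// If neither a.supersedes(b) nor b.supersedes(a), the updates were concurrent:
// exactly the case a wall-clock timestamp tie cannot express, and the case
// compare-and-set style logic has to resolve explicitly.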

> I have one last question about the consistency level ALL, do you know of
> real use cases where it is required (instead of QUORUM) and why (both read
> and write)?

It would be required when your business rules do not allow any client
to read an old value. For example, if it would be illegal to provide an
obsolete stock price.
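
For reference, the arithmetic behind that distinction (assuming replication
factor N; the exact setup is not stated in this thread):

    quorum = floor(N/2) + 1            e.g. N = 3  ->  quorum = 2
    quorum read + quorum write:  2 * (floor(N/2) + 1)  >  N

so the replicas consulted by any QUORUM read overlap the replicas that
acknowledged any QUORUM write, and the read returns the newer value. What ALL
adds is that every replica must respond (and therefore be up), not a stronger
freshness guarantee when both reads and writes already use QUORUM.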

> Thank you very much for your help to better understand 'Cassandra'!
> Best regards, Daniel.
>


Re: questions about consistency

2010-04-21 Thread Даниел Симеонов
Hi Paul,
   about the last answer, I still need some more clarification: as I
understand it, if QUORUM is used, then reads don't get old values either?
Or am I wrong?
Thank you very much!
Best regards, Daniel.

2010/4/21 Paul Prescod 

> I'm not an expert, so take what I say with a grain of salt.
>
> 2010/4/21 Даниел Симеонов :
> > Hello,
> >I am pretty new to Cassandra and I have some questions, they may seem
> > trivial, but still I am pretty new to the subject. First is about the
> lack
> > of a compareAndSet() operation, as I understood it is not supported
> > currently in Cassandra, do you know of use cases which really require
> such
> > operations and how these use cases currently workaround this .
>
> I think your question is paradoxical. If the use case really requires
> the operation then there is no workaround by definition. The existence
> of the workaround implies that the use case really did not require the
> operation.
>
> Anyhow, vector clocks are probably relevant to this question and your next
> one.
>
> > Second topic I'd like to discuss a little bit more is about the read
> repair,
> > as I understand is that it is being done by the timestamps supplied by
> the
> > client application servers. Since computer clocks (which requires
> > synchronization algorithms working regularly) diverge there should be a
> time
> > frame during which the order of the client request written to the
> database
> > is not guaranteed, do you have real world experiences with this? Is this
> > similar to the casual consistency (
> > http://en.wikipedia.org/wiki/Causal_consistency ) .What happens if two
> > application servers try to update the same data and supply one and the
> same
> > timestamp (it could happen although rarely), what if they try to update
> > several columns in batch operation this way, is there a chance that the
> > column value could be intermixed between the two update requests?
>
> All of this is changing with vector clocks in Cassandra 0.7.
>
> https://issues.apache.org/jira/browse/CASSANDRA-580
>
> > I have one last question about the consistency level ALL, do you know of
> > real use cases where it is required (instead of QUORUM) and why (both
> read
> > and write)?
>
> It would be required when your business rules do not allow any client
> to read the old value. For example if it would be illegal to provide
> an obsolete stock value.
>
> > Thank you very much for your help to better understand 'Cassandra'!
> > Best regards, Daniel.
> >
>


Cassandra tuning for running test on a desktop

2010-04-21 Thread Nicolas Labrot
Hello,

For my first message, I would like to thank the Cassandra contributors for
their great work.

I have a parameter issue with Cassandra (I hope it's just a parameter
issue). I'm using Cassandra 0.6.1 with the Hector client on my desktop. It's a
simple dual core with 4GB of RAM on WinXP. I have kept the default JVM
options inside cassandra.bat (Xmx1G).

I'm trying to insert 3 million SCs with 6 columns each inside 1 CF (named
Super1). The insertion gets to 1 million SCs (without slowdown) and then
Cassandra crashes because of an OOM. (I store an average of 100 bytes per SC,
with a max of 10kB.)
I have aggressively decreased all the memory parameters without any
regard for consistency (my config is here [1]); the cache is turned off,
but Cassandra still goes OOM. I have attached the last lines of the Cassandra
log [2].

What can I do to fix my issue? Is there another solution than increasing
the Xmx?

Thanks for your help,

Nicolas





[1]
  

  

org.apache.cassandra.locator.RackUnawareStrategy
  1

org.apache.cassandra.locator.EndPointSnitch

  
  32

  auto
  64
  64
  16
  4
  64

  16
  32
  0.01
  0.01
  60
  4
  8



[2]
 INFO 13:36:41,062 Super1 has reached its threshold; switching in a fresh
Memtable at
CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
position=5417524)
 INFO 13:36:41,062 Enqueuing flush of Memtable(Super1)@15385755
 INFO 13:36:41,062 Writing Memtable(Super1)@15385755
 INFO 13:36:42,062 Completed flushing
d:\cassandra\data\Keyspace1\Super1-711-Data.db
 INFO 13:36:45,781 Super1 has reached its threshold; switching in a fresh
Memtable at
CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
position=6065637)
 INFO 13:36:45,781 Enqueuing flush of Memtable(Super1)@15578910
 INFO 13:36:45,796 Writing Memtable(Super1)@15578910
 INFO 13:36:46,109 Completed flushing
d:\cassandra\data\Keyspace1\Super1-712-Data.db
 INFO 13:36:54,296 GC for ConcurrentMarkSweep: 7149 ms, 58337240 reclaimed
leaving 922392600 used; max is 1174208512
 INFO 13:36:54,593 Super1 has reached its threshold; switching in a fresh
Memtable at
CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
position=6722241)
 INFO 13:36:54,593 Enqueuing flush of Memtable(Super1)@24468872
 INFO 13:36:54,593 Writing Memtable(Super1)@24468872
 INFO 13:36:55,421 Completed flushing
d:\cassandra\data\Keyspace1\Super1-713-Data.dbjava.lang.OutOfMemoryError:
Java heap space
 INFO 13:37:08,281 GC for ConcurrentMarkSweep: 5561 ms, 9432 reclaimed
leaving 971904520 used; max is 1174208512


Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Mark Greene
Try increasing Xmx. 1G is probably not enough for the amount of inserts
you are doing.
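
For what it's worth, that change is just the -Xmx value in the JVM options line
of cassandra.bat (the exact layout of the line may differ in your copy), for
example:

    before:  ... -Xmx1G ...
    after:   ... -Xmx3G ...

Keep the maximum heap comfortably below physical RAM, otherwise the OS starts
swapping and performance collapses.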

On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot  wrote:

> Hello,
>
> For my first message I will first thanks Cassandra contributors for their
> great works.
>
> I have a parameter issue with Cassandra (I hope it's just a parameter
> issue). I'm using Cassandra 6.0.1 with Hector client on my desktop. It's a
> simple dual core with 4GB of RAM on WinXP. I have keep the default JVM
> option inside cassandra.bat (Xmx1G)
>
> I'm trying to insert 3 millions of SC with 6 Columns each inside 1 CF
> (named Super1). The insertion go to 1 millions of SC (without slowdown) and
> Cassandra crash because of an OOM. (I store an average of 100 bytes per SC
> with a max of 10kB).
> I have aggressively decreased all the memories parameters without any
> respect to the consistency (My config is here [1]), the cache is turn off
> but Cassandra still go to OOM. I have joined the last line of the Cassandra
> life [2].
>
> What can I do to fix my issue ?  Is there another solution than increasing
> the Xmx ?
>
> Thanks for your help,
>
> Nicolas
>
>
>
>
>
> [1]
>   
> 
>ColumnType="Super"
> CompareWith="BytesType"
> CompareSubcolumnsWith="BytesType" />
>
> org.apache.cassandra.locator.RackUnawareStrategy
>   1
>
> org.apache.cassandra.locator.EndPointSnitch
> 
>   
>   32
>
>   auto
>   64
>   64
>   16
>   4
>   64
>
>   16
>   32
>   0.01
>   0.01
>   60
>   4
>   8
> 
>
>
> [2]
>  INFO 13:36:41,062 Super1 has reached its threshold; switching in a fresh
> Memtable at
> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
> position=5417524)
>  INFO 13:36:41,062 Enqueuing flush of Memtable(Super1)@15385755
>  INFO 13:36:41,062 Writing Memtable(Super1)@15385755
>  INFO 13:36:42,062 Completed flushing
> d:\cassandra\data\Keyspace1\Super1-711-Data.db
>  INFO 13:36:45,781 Super1 has reached its threshold; switching in a fresh
> Memtable at
> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
> position=6065637)
>  INFO 13:36:45,781 Enqueuing flush of Memtable(Super1)@15578910
>  INFO 13:36:45,796 Writing Memtable(Super1)@15578910
>  INFO 13:36:46,109 Completed flushing
> d:\cassandra\data\Keyspace1\Super1-712-Data.db
>  INFO 13:36:54,296 GC for ConcurrentMarkSweep: 7149 ms, 58337240 reclaimed
> leaving 922392600 used; max is 1174208512
>  INFO 13:36:54,593 Super1 has reached its threshold; switching in a fresh
> Memtable at
> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
> position=6722241)
>  INFO 13:36:54,593 Enqueuing flush of Memtable(Super1)@24468872
>  INFO 13:36:54,593 Writing Memtable(Super1)@24468872
>  INFO 13:36:55,421 Completed flushing
> d:\cassandra\data\Keyspace1\Super1-713-Data.dbjava.lang.OutOfMemoryError:
> Java heap space
>  INFO 13:37:08,281 GC for ConcurrentMarkSweep: 5561 ms, 9432 reclaimed
> leaving 971904520 used; max is 1174208512
>
>


Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Nicolas Labrot
I have tried 1400M, and Cassandra OOMs too.

Is there another solution? My data isn't very big.

It seems to happen during the merge (compaction) of the db files.


On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene  wrote:

> Trying increasing Xmx. 1G is probably not enough for the amount of inserts
> you are doing.
>
>
> On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot  wrote:
>
>> Hello,
>>
>> For my first message I will first thanks Cassandra contributors for their
>> great works.
>>
>> I have a parameter issue with Cassandra (I hope it's just a parameter
>> issue). I'm using Cassandra 6.0.1 with Hector client on my desktop. It's a
>> simple dual core with 4GB of RAM on WinXP. I have keep the default JVM
>> option inside cassandra.bat (Xmx1G)
>>
>> I'm trying to insert 3 millions of SC with 6 Columns each inside 1 CF
>> (named Super1). The insertion go to 1 millions of SC (without slowdown) and
>> Cassandra crash because of an OOM. (I store an average of 100 bytes per SC
>> with a max of 10kB).
>> I have aggressively decreased all the memories parameters without any
>> respect to the consistency (My config is here [1]), the cache is turn off
>> but Cassandra still go to OOM. I have joined the last line of the Cassandra
>> life [2].
>>
>> What can I do to fix my issue ?  Is there another solution than increasing
>> the Xmx ?
>>
>> Thanks for your help,
>>
>> Nicolas
>>
>>
>>
>>
>>
>> [1]
>>   
>> 
>>   > ColumnType="Super"
>> CompareWith="BytesType"
>> CompareSubcolumnsWith="BytesType" />
>>
>> org.apache.cassandra.locator.RackUnawareStrategy
>>   1
>>
>> org.apache.cassandra.locator.EndPointSnitch
>> 
>>   
>>   32
>>
>>   auto
>>   64
>>   64
>>   16
>>   4
>>   64
>>
>>   16
>>   32
>>   0.01
>>   0.01
>>   60
>>   4
>>   8
>> 
>>
>>
>> [2]
>>  INFO 13:36:41,062 Super1 has reached its threshold; switching in a fresh
>> Memtable at
>> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
>> position=5417524)
>>  INFO 13:36:41,062 Enqueuing flush of Memtable(Super1)@15385755
>>  INFO 13:36:41,062 Writing Memtable(Super1)@15385755
>>  INFO 13:36:42,062 Completed flushing
>> d:\cassandra\data\Keyspace1\Super1-711-Data.db
>>  INFO 13:36:45,781 Super1 has reached its threshold; switching in a fresh
>> Memtable at
>> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
>> position=6065637)
>>  INFO 13:36:45,781 Enqueuing flush of Memtable(Super1)@15578910
>>  INFO 13:36:45,796 Writing Memtable(Super1)@15578910
>>  INFO 13:36:46,109 Completed flushing
>> d:\cassandra\data\Keyspace1\Super1-712-Data.db
>>  INFO 13:36:54,296 GC for ConcurrentMarkSweep: 7149 ms, 58337240 reclaimed
>> leaving 922392600 used; max is 1174208512
>>  INFO 13:36:54,593 Super1 has reached its threshold; switching in a fresh
>> Memtable at
>> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
>> position=6722241)
>>  INFO 13:36:54,593 Enqueuing flush of Memtable(Super1)@24468872
>>  INFO 13:36:54,593 Writing Memtable(Super1)@24468872
>>  INFO 13:36:55,421 Completed flushing
>> d:\cassandra\data\Keyspace1\Super1-713-Data.dbjava.lang.OutOfMemoryError:
>> Java heap space
>>  INFO 13:37:08,281 GC for ConcurrentMarkSweep: 5561 ms, 9432 reclaimed
>> leaving 971904520 used; max is 1174208512
>>
>>
>


RE: problem with get_key_range in cassandra 0.4.1

2010-04-21 Thread Mark Jones
Stop the program, wipe the data dir and commit logs, and start the program
again; that's what I'm doing.

I even made a script that does it, so it's just a one-line command.
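
For anyone who prefers to do that reset from Java rather than a shell script,
here is a minimal sketch. The paths are placeholders, not anything from this
thread -- point them at whatever data and commit log directories your
storage-conf.xml uses, and stop Cassandra before running it.

import java.io.File;

public class ResetCassandra {

    // Recursively delete a directory tree (File.delete() only removes
    // empty directories and plain files).
    private static void deleteRecursively(File f) {
        File[] children = f.listFiles();
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        f.delete();
    }

    public static void main(String[] args) {
        deleteRecursively(new File("/var/lib/cassandra/data"));
        deleteRecursively(new File("/var/lib/cassandra/commitlog"));
    }
}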

From: ROGER PUIG GANZA [mailto:rp...@tid.es]
Sent: Wednesday, April 21, 2010 5:20 AM
To: cassandra-u...@incubator.apache.org
Subject: problem with get_key_range in cassandra 0.4.1

Hi all.
I'm benchmarking several  nosql datastores and I'm going nuts with Cassandra.
The version of Cassandra we are using is 0.4.1 I know 0.4.1 is a bit outdated 
but my implementation is done with that version.


The thing is that every time the test runs, I need to reset the data inside the 
datastore to try with different workloads.
Easy on mysql with a trunc sentence.
I need to delete all columns on each columnFamily. What I do is getting all the 
keys with get_key_range and iterating over the list  and removing them.
Like that:

List keys = client.get_key_range(keyspace, columnFamily, min_value, 
max_value, count, defaultConsistencyLevel);
Iterator iterator = keys.iterator();
String key;
while (iterator.hasNext()) {
  key = iterator.next();
  client.remove(keyspace, key, path, time, 
defaultConsistencyLevel);
}

The problem is: client.get_key_range throws Internal error processing 
get_key_range

My questions are: is there any workaround?
Am I missing any configuration 
setting which could help me?


Thank you very much in advance
Roger Puig Ganza



RE: Cassandra tuning for running test on a desktop

2010-04-21 Thread Mark Jones
On my 4GB machine I'm giving it 3GB and having no trouble with 60+ million 500 
byte columns

From: Nicolas Labrot [mailto:nith...@gmail.com]
Sent: Wednesday, April 21, 2010 7:47 AM
To: user@cassandra.apache.org
Subject: Re: Cassandra tuning for running test on a desktop

I have try 1400M, and Cassandra OOM too.

Is there another solution ? My data isn't very big.

It seems that is the merge of the db

On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene 
mailto:green...@gmail.com>> wrote:
Trying increasing Xmx. 1G is probably not enough for the amount of inserts you 
are doing.

On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot 
mailto:nith...@gmail.com>> wrote:
Hello,

For my first message I will first thanks Cassandra contributors for their great 
works.

I have a parameter issue with Cassandra (I hope it's just a parameter issue). 
I'm using Cassandra 6.0.1 with Hector client on my desktop. It's a simple dual 
core with 4GB of RAM on WinXP. I have keep the default JVM option inside 
cassandra.bat (Xmx1G)

I'm trying to insert 3 millions of SC with 6 Columns each inside 1 CF (named 
Super1). The insertion go to 1 millions of SC (without slowdown) and Cassandra 
crash because of an OOM. (I store an average of 100 bytes per SC with a max of 
10kB).
I have aggressively decreased all the memories parameters without any respect 
to the consistency (My config is here [1]), the cache is turn off but Cassandra 
still go to OOM. I have joined the last line of the Cassandra life [2].

What can I do to fix my issue ?  Is there another solution than increasing the 
Xmx ?

Thanks for your help,

Nicolas





[1]
  

  
  
org.apache.cassandra.locator.RackUnawareStrategy
  1
  
org.apache.cassandra.locator.EndPointSnitch

  
  32

  auto
  64
  64
  16
  4
  64

  16
  32
  0.01
  0.01
  60
  4
  8



[2]
 INFO 13:36:41,062 Super1 has reached its threshold; switching in a fresh 
Memtable at 
CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log', 
position=5417524)
 INFO 13:36:41,062 Enqueuing flush of Memtable(Super1)@15385755
 INFO 13:36:41,062 Writing Memtable(Super1)@15385755
 INFO 13:36:42,062 Completed flushing 
d:\cassandra\data\Keyspace1\Super1-711-Data.db
 INFO 13:36:45,781 Super1 has reached its threshold; switching in a fresh 
Memtable at 
CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log', 
position=6065637)
 INFO 13:36:45,781 Enqueuing flush of Memtable(Super1)@15578910
 INFO 13:36:45,796 Writing Memtable(Super1)@15578910
 INFO 13:36:46,109 Completed flushing 
d:\cassandra\data\Keyspace1\Super1-712-Data.db
 INFO 13:36:54,296 GC for ConcurrentMarkSweep: 7149 ms, 58337240 reclaimed 
leaving 922392600 used; max is 1174208512
 INFO 13:36:54,593 Super1 has reached its threshold; switching in a fresh 
Memtable at 
CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log', 
position=6722241)
 INFO 13:36:54,593 Enqueuing flush of Memtable(Super1)@24468872
 INFO 13:36:54,593 Writing Memtable(Super1)@24468872
 INFO 13:36:55,421 Completed flushing 
d:\cassandra\data\Keyspace1\Super1-713-Data.dbjava.lang.OutOfMemoryError: Java 
heap space
 INFO 13:37:08,281 GC for ConcurrentMarkSweep: 5561 ms, 9432 reclaimed leaving 
971904520 used; max is 1174208512




Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Nicolas Labrot
So does that mean the RAM needed is proportional to the data handled?

Or does Cassandra need a minimum amount of RAM when the dataset is big?

I must confess this OOM behaviour is strange.


On Wed, Apr 21, 2010 at 2:54 PM, Mark Jones  wrote:

>  On my 4GB machine I’m giving it 3GB and having no trouble with 60+
> million 500 byte columns
>
>
>
> *From:* Nicolas Labrot [mailto:nith...@gmail.com]
> *Sent:* Wednesday, April 21, 2010 7:47 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Cassandra tuning for running test on a desktop
>
>
>
> I have try 1400M, and Cassandra OOM too.
>
> Is there another solution ? My data isn't very big.
>
> It seems that is the merge of the db
>
>  On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene  wrote:
>
> Trying increasing Xmx. 1G is probably not enough for the amount of inserts
> you are doing.
>
>
>
> On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot  wrote:
>
> Hello,
>
> For my first message I will first thanks Cassandra contributors for their
> great works.
>
> I have a parameter issue with Cassandra (I hope it's just a parameter
> issue). I'm using Cassandra 6.0.1 with Hector client on my desktop. It's a
> simple dual core with 4GB of RAM on WinXP. I have keep the default JVM
> option inside cassandra.bat (Xmx1G)
>
> I'm trying to insert 3 millions of SC with 6 Columns each inside 1 CF
> (named Super1). The insertion go to 1 millions of SC (without slowdown) and
> Cassandra crash because of an OOM. (I store an average of 100 bytes per SC
> with a max of 10kB).
> I have aggressively decreased all the memories parameters without any
> respect to the consistency (My config is here [1]), the cache is turn off
> but Cassandra still go to OOM. I have joined the last line of the Cassandra
> life [2].
>
> What can I do to fix my issue ?  Is there another solution than increasing
> the Xmx ?
>
> Thanks for your help,
>
> Nicolas
>
>
>
>
>
> [1]
>   
> 
>ColumnType="Super"
> CompareWith="BytesType"
> CompareSubcolumnsWith="BytesType" />
>
> org.apache.cassandra.locator.RackUnawareStrategy
>   1
>
> org.apache.cassandra.locator.EndPointSnitch
> 
>   
>   32
>
>   auto
>   64
>   64
>   16
>   4
>   64
>
>   16
>   32
>   0.01
>   0.01
>   60
>   4
>   8
> 
>
>
> [2]
>  INFO 13:36:41,062 Super1 has reached its threshold; switching in a fresh
> Memtable at
> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
> position=5417524)
>  INFO 13:36:41,062 Enqueuing flush of Memtable(Super1)@15385755
>  INFO 13:36:41,062 Writing Memtable(Super1)@15385755
>  INFO 13:36:42,062 Completed flushing
> d:\cassandra\data\Keyspace1\Super1-711-Data.db
>  INFO 13:36:45,781 Super1 has reached its threshold; switching in a fresh
> Memtable at
> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
> position=6065637)
>  INFO 13:36:45,781 Enqueuing flush of Memtable(Super1)@15578910
>  INFO 13:36:45,796 Writing Memtable(Super1)@15578910
>  INFO 13:36:46,109 Completed flushing
> d:\cassandra\data\Keyspace1\Super1-712-Data.db
>  INFO 13:36:54,296 GC for ConcurrentMarkSweep: 7149 ms, 58337240 reclaimed
> leaving 922392600 used; max is 1174208512
>  INFO 13:36:54,593 Super1 has reached its threshold; switching in a fresh
> Memtable at
> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
> position=6722241)
>  INFO 13:36:54,593 Enqueuing flush of Memtable(Super1)@24468872
>  INFO 13:36:54,593 Writing Memtable(Super1)@24468872
>  INFO 13:36:55,421 Completed flushing
> d:\cassandra\data\Keyspace1\Super1-713-Data.dbjava.lang.OutOfMemoryError:
> Java heap space
>  INFO 13:37:08,281 GC for ConcurrentMarkSweep: 5561 ms, 9432 reclaimed
> leaving 971904520 used; max is 1174208512
>
>
>
>
>


Re: problem with get_key_range in cassandra 0.4.1

2010-04-21 Thread Jonathan Ellis
first, upgrade to 0.6.1.

second, the easiest way to wipe everything is at the fs level like Mark said.

On Wed, Apr 21, 2010 at 5:20 AM, ROGER PUIG GANZA  wrote:
> Hi all.
>
> I’m benchmarking several  nosql datastores and I’m going nuts with
> Cassandra.
>
> The version of Cassandra we are using is 0.4.1 I know 0.4.1 is a bit
> outdated but my implementation is done with that version.
>
>
>
>
>
> The thing is that every time the test runs, I need to reset the data inside
> the datastore to try with different workloads.
>
> Easy on mysql with a trunc sentence.
>
> I need to delete all columns on each columnFamily. What I do is getting all
> the keys with get_key_range and iterating over the list  and removing them.
>
> Like that:
>
>
>
> List keys = client.get_key_range(keyspace, columnFamily, min_value,
> max_value, count, defaultConsistencyLevel);
>
>     Iterator iterator = keys.iterator();
>
>     String key;
>
>     while (iterator.hasNext()) {
>
>   key = iterator.next();
>
>   client.remove(keyspace, key, path, time,
> defaultConsistencyLevel);
>
> }
>
>
>
> The problem is: client.get_key_range throws Internal error processing
> get_key_range
>
>
>
> My questions are: is there any workaround?
>
>     Am I missing any
> configuration setting which could help me?
>
>
>
>
>
> Thank you very much in advance
>
> Roger Puig Ganza
>
>


Cassandra's bad behavior on disk failure

2010-04-21 Thread Oleg Anastasjev
Hello,

I am testing how Cassandra behaves on single-node disk failures, to know what to
expect when things go bad.
I had a cluster of 4 Cassandra nodes, stress-loaded it with a client, and ran 2
tests:
1. emulated disk failure of the /data volume during a read-only stress test
2. emulated disk failure of the /commitlog volume during a write-intensive test

1. On the read test with the data volume down, a lot of
"org.apache.thrift.TApplicationException: Internal error processing get_slice"
errors were logged on the client side. The Cassandra server logged a lot of
IOExceptions reading every *.db file it has. The node continued to show as UP
in the ring.

OK, the behavior is not ideal, but it can still be worked around on the client
side by throwing out nodes as soon as a TApplicationException is received from
Cassandra.

2. The write test was much worse:
no exception was seen at the client and writes appeared to go through normally,
but PERIODIC-COMMIT-LOG-SYNCER failed to sync the commit logs, the node's heap
quickly filled up, and the node froze in a GC loop. Still, it continued to show
as UP in the ring.

This, I believe, is bad, because no quick workaround can be done on the client
side (no exceptions come from the failed node), and in a real system it will
lead to a dramatic slowdown of the whole cluster, because clients, not knowing
that the node is actually dead, will direct 1/4 of their requests to it and
time out.

I think more correct behavior here would be to halt the Cassandra server on any
disk I/O error, so clients can quickly detect this and fail over to healthy
servers.

What do you think?

Have you guys experienced disk failures in production, and how did it go?




Re: Big Data Workshop 4/23 was Re: Cassandra Hackathon in SF @ Digg - 04/22 6:30pm

2010-04-21 Thread Eric Evans
On Tue, 2010-04-20 at 17:28 -0700, Joseph Boyle wrote:
> We will have people from the Cassandra (including Stu Hood and Matt
> Pfeil) and other NoSQL communities as well as with broader Big Data
> interests, all available for discussion, and you can propose a session
> to learn about anything. 

Gary Dusbabek and myself will be there as well. Are there any others, or
is everyone conferenced-out? :)

-- 
Eric Evans
eev...@rackspace.com



Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Mark Greene
RAM doesn't necessarily need to be proportional, but I would say the number
of nodes does. You can't just throw a bazillion inserts at one node. This is
the main benefit of Cassandra: if you start hitting your capacity, you add
more machines and distribute the keys across them.

On Wed, Apr 21, 2010 at 9:07 AM, Nicolas Labrot  wrote:

> So does it means the RAM needed is proportionnal with the data handled ?
>
> Or Cassandra need a minimum amount or RAM when dataset is big?
>
> I must confess this OOM behaviour is strange.
>
>
> On Wed, Apr 21, 2010 at 2:54 PM, Mark Jones  wrote:
>
>>  On my 4GB machine I’m giving it 3GB and having no trouble with 60+
>> million 500 byte columns
>>
>>
>>
>> *From:* Nicolas Labrot [mailto:nith...@gmail.com]
>> *Sent:* Wednesday, April 21, 2010 7:47 AM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Cassandra tuning for running test on a desktop
>>
>>
>>
>> I have try 1400M, and Cassandra OOM too.
>>
>> Is there another solution ? My data isn't very big.
>>
>> It seems that is the merge of the db
>>
>>  On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene  wrote:
>>
>> Trying increasing Xmx. 1G is probably not enough for the amount of inserts
>> you are doing.
>>
>>
>>
>> On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot 
>> wrote:
>>
>> Hello,
>>
>> For my first message I will first thanks Cassandra contributors for their
>> great works.
>>
>> I have a parameter issue with Cassandra (I hope it's just a parameter
>> issue). I'm using Cassandra 6.0.1 with Hector client on my desktop. It's a
>> simple dual core with 4GB of RAM on WinXP. I have keep the default JVM
>> option inside cassandra.bat (Xmx1G)
>>
>> I'm trying to insert 3 millions of SC with 6 Columns each inside 1 CF
>> (named Super1). The insertion go to 1 millions of SC (without slowdown) and
>> Cassandra crash because of an OOM. (I store an average of 100 bytes per SC
>> with a max of 10kB).
>> I have aggressively decreased all the memories parameters without any
>> respect to the consistency (My config is here [1]), the cache is turn off
>> but Cassandra still go to OOM. I have joined the last line of the Cassandra
>> life [2].
>>
>> What can I do to fix my issue ?  Is there another solution than increasing
>> the Xmx ?
>>
>> Thanks for your help,
>>
>> Nicolas
>>
>>
>>
>>
>>
>> [1]
>>   
>> 
>>   > ColumnType="Super"
>> CompareWith="BytesType"
>> CompareSubcolumnsWith="BytesType" />
>>
>> org.apache.cassandra.locator.RackUnawareStrategy
>>   1
>>
>> org.apache.cassandra.locator.EndPointSnitch
>> 
>>   
>>   32
>>
>>   auto
>>   64
>>   64
>>   16
>>   4
>>   64
>>
>>   16
>>   32
>>   0.01
>>   0.01
>>   60
>>   4
>>   8
>> 
>>
>>
>> [2]
>>  INFO 13:36:41,062 Super1 has reached its threshold; switching in a fresh
>> Memtable at
>> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
>> position=5417524)
>>  INFO 13:36:41,062 Enqueuing flush of Memtable(Super1)@15385755
>>  INFO 13:36:41,062 Writing Memtable(Super1)@15385755
>>  INFO 13:36:42,062 Completed flushing
>> d:\cassandra\data\Keyspace1\Super1-711-Data.db
>>  INFO 13:36:45,781 Super1 has reached its threshold; switching in a fresh
>> Memtable at
>> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
>> position=6065637)
>>  INFO 13:36:45,781 Enqueuing flush of Memtable(Super1)@15578910
>>  INFO 13:36:45,796 Writing Memtable(Super1)@15578910
>>  INFO 13:36:46,109 Completed flushing
>> d:\cassandra\data\Keyspace1\Super1-712-Data.db
>>  INFO 13:36:54,296 GC for ConcurrentMarkSweep: 7149 ms, 58337240 reclaimed
>> leaving 922392600 used; max is 1174208512
>>  INFO 13:36:54,593 Super1 has reached its threshold; switching in a fresh
>> Memtable at
>> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
>> position=6722241)
>>  INFO 13:36:54,593 Enqueuing flush of Memtable(Super1)@24468872
>>  INFO 13:36:54,593 Writing Memtable(Super1)@24468872
>>  INFO 13:36:55,421 Completed flushing
>> d:\cassandra\data\Keyspace1\Super1-713-Data.dbjava.lang.OutOfMemoryError:
>> Java heap space
>>  INFO 13:37:08,281 GC for ConcurrentMarkSweep: 5561 ms, 9432 reclaimed
>> leaving 971904520 used; max is 1174208512
>>
>>
>>
>>
>>
>
>


Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Mark Greene
Hit send too early.

That being said, a lot of people running Cassandra in production are using
4-6GB max heaps on 8GB machines. I don't know if that helps, but hopefully it
gives you some perspective.

On Wed, Apr 21, 2010 at 10:39 AM, Mark Greene  wrote:

> RAM doesn't necessarily need to be proportional but I would say the number
> of nodes does. You can't just throw a bazillion inserts at one node. This is
> the main benefit of Cassandra is that if you start hitting your capacity,
> you add more machines and distribute the keys across more machines.
>
>
> On Wed, Apr 21, 2010 at 9:07 AM, Nicolas Labrot  wrote:
>
>> So does it means the RAM needed is proportionnal with the data handled ?
>>
>> Or Cassandra need a minimum amount or RAM when dataset is big?
>>
>> I must confess this OOM behaviour is strange.
>>
>>
>> On Wed, Apr 21, 2010 at 2:54 PM, Mark Jones  wrote:
>>
>>>  On my 4GB machine I’m giving it 3GB and having no trouble with 60+
>>> million 500 byte columns
>>>
>>>
>>>
>>> *From:* Nicolas Labrot [mailto:nith...@gmail.com]
>>> *Sent:* Wednesday, April 21, 2010 7:47 AM
>>> *To:* user@cassandra.apache.org
>>> *Subject:* Re: Cassandra tuning for running test on a desktop
>>>
>>>
>>>
>>> I have try 1400M, and Cassandra OOM too.
>>>
>>> Is there another solution ? My data isn't very big.
>>>
>>> It seems that is the merge of the db
>>>
>>>  On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene 
>>> wrote:
>>>
>>> Trying increasing Xmx. 1G is probably not enough for the amount of
>>> inserts you are doing.
>>>
>>>
>>>
>>> On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot 
>>> wrote:
>>>
>>> Hello,
>>>
>>> For my first message I will first thanks Cassandra contributors for their
>>> great works.
>>>
>>> I have a parameter issue with Cassandra (I hope it's just a parameter
>>> issue). I'm using Cassandra 6.0.1 with Hector client on my desktop. It's a
>>> simple dual core with 4GB of RAM on WinXP. I have keep the default JVM
>>> option inside cassandra.bat (Xmx1G)
>>>
>>> I'm trying to insert 3 millions of SC with 6 Columns each inside 1 CF
>>> (named Super1). The insertion go to 1 millions of SC (without slowdown) and
>>> Cassandra crash because of an OOM. (I store an average of 100 bytes per SC
>>> with a max of 10kB).
>>> I have aggressively decreased all the memories parameters without any
>>> respect to the consistency (My config is here [1]), the cache is turn off
>>> but Cassandra still go to OOM. I have joined the last line of the Cassandra
>>> life [2].
>>>
>>> What can I do to fix my issue ?  Is there another solution than
>>> increasing the Xmx ?
>>>
>>> Thanks for your help,
>>>
>>> Nicolas
>>>
>>>
>>>
>>>
>>>
>>> [1]
>>>   
>>> 
>>>   >> ColumnType="Super"
>>> CompareWith="BytesType"
>>> CompareSubcolumnsWith="BytesType" />
>>>
>>> org.apache.cassandra.locator.RackUnawareStrategy
>>>   1
>>>
>>> org.apache.cassandra.locator.EndPointSnitch
>>> 
>>>   
>>>   32
>>>
>>>   auto
>>>   64
>>>   64
>>>   16
>>>   4
>>>   64
>>>
>>>   16
>>>   32
>>>   0.01
>>>   0.01
>>>   60
>>>   4
>>>   8
>>> 
>>>
>>>
>>> [2]
>>>  INFO 13:36:41,062 Super1 has reached its threshold; switching in a fresh
>>> Memtable at
>>> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
>>> position=5417524)
>>>  INFO 13:36:41,062 Enqueuing flush of Memtable(Super1)@15385755
>>>  INFO 13:36:41,062 Writing Memtable(Super1)@15385755
>>>  INFO 13:36:42,062 Completed flushing
>>> d:\cassandra\data\Keyspace1\Super1-711-Data.db
>>>  INFO 13:36:45,781 Super1 has reached its threshold; switching in a fresh
>>> Memtable at
>>> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
>>> position=6065637)
>>>  INFO 13:36:45,781 Enqueuing flush of Memtable(Super1)@15578910
>>>  INFO 13:36:45,796 Writing Memtable(Super1)@15578910
>>>  INFO 13:36:46,109 Completed flushing
>>> d:\cassandra\data\Keyspace1\Super1-712-Data.db
>>>  INFO 13:36:54,296 GC for ConcurrentMarkSweep: 7149 ms, 58337240
>>> reclaimed leaving 922392600 used; max is 1174208512
>>>  INFO 13:36:54,593 Super1 has reached its threshold; switching in a fresh
>>> Memtable at
>>> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
>>> position=6722241)
>>>  INFO 13:36:54,593 Enqueuing flush of Memtable(Super1)@24468872
>>>  INFO 13:36:54,593 Writing Memtable(Super1)@24468872
>>>  INFO 13:36:55,421 Completed flushing
>>> d:\cassandra\data\Keyspace1\Super1-713-Data.dbjava.lang.OutOfMemoryError:
>>> Java heap space
>>>  INFO 13:37:08,281 GC for ConcurrentMarkSweep: 5561 ms, 9432 reclaimed
>>> leaving 971904520 used; max is 1174208512
>>>
>>>
>>>
>>>
>>>
>>
>>
>


Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Nicolas Labrot
Thanks Mark.

Cassandra is maybe too much for my need ;)


On Wed, Apr 21, 2010 at 4:45 PM, Mark Greene  wrote:

> Hit send to early
>
> That being said a lot of people running Cassandra in production are using
> 4-6GB max heaps on 8GB machines, don't know if that helps but hopefully
> gives you some perspective.
>
>
> On Wed, Apr 21, 2010 at 10:39 AM, Mark Greene  wrote:
>
>> RAM doesn't necessarily need to be proportional but I would say the number
>> of nodes does. You can't just throw a bazillion inserts at one node. This is
>> the main benefit of Cassandra is that if you start hitting your capacity,
>> you add more machines and distribute the keys across more machines.
>>
>>
>> On Wed, Apr 21, 2010 at 9:07 AM, Nicolas Labrot wrote:
>>
>>> So does it means the RAM needed is proportionnal with the data handled ?
>>>
>>> Or Cassandra need a minimum amount or RAM when dataset is big?
>>>
>>> I must confess this OOM behaviour is strange.
>>>
>>>
>>> On Wed, Apr 21, 2010 at 2:54 PM, Mark Jones wrote:
>>>
  On my 4GB machine I’m giving it 3GB and having no trouble with 60+
 million 500 byte columns



 *From:* Nicolas Labrot [mailto:nith...@gmail.com]
 *Sent:* Wednesday, April 21, 2010 7:47 AM
 *To:* user@cassandra.apache.org
 *Subject:* Re: Cassandra tuning for running test on a desktop



 I have try 1400M, and Cassandra OOM too.

 Is there another solution ? My data isn't very big.

 It seems that is the merge of the db

  On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene 
 wrote:

 Trying increasing Xmx. 1G is probably not enough for the amount of
 inserts you are doing.



 On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot 
 wrote:

 Hello,

 For my first message I will first thanks Cassandra contributors for
 their great works.

 I have a parameter issue with Cassandra (I hope it's just a parameter
 issue). I'm using Cassandra 6.0.1 with Hector client on my desktop. It's a
 simple dual core with 4GB of RAM on WinXP. I have keep the default JVM
 option inside cassandra.bat (Xmx1G)

 I'm trying to insert 3 millions of SC with 6 Columns each inside 1 CF
 (named Super1). The insertion go to 1 millions of SC (without slowdown) and
 Cassandra crash because of an OOM. (I store an average of 100 bytes per SC
 with a max of 10kB).
 I have aggressively decreased all the memories parameters without any
 respect to the consistency (My config is here [1]), the cache is turn off
 but Cassandra still go to OOM. I have joined the last line of the Cassandra
 life [2].

 What can I do to fix my issue ?  Is there another solution than
 increasing the Xmx ?

 Thanks for your help,

 Nicolas





 [1]
   
 
   >>> ColumnType="Super"
 CompareWith="BytesType"
 CompareSubcolumnsWith="BytesType" />

 org.apache.cassandra.locator.RackUnawareStrategy
   1

 org.apache.cassandra.locator.EndPointSnitch
 
   
   32

   auto
   64
   64
   16
   4
   64

   16
   32
   0.01
   0.01
   60
   4
   8
 


 [2]
  INFO 13:36:41,062 Super1 has reached its threshold; switching in a
 fresh Memtable at
 CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
 position=5417524)
  INFO 13:36:41,062 Enqueuing flush of Memtable(Super1)@15385755
  INFO 13:36:41,062 Writing Memtable(Super1)@15385755
  INFO 13:36:42,062 Completed flushing
 d:\cassandra\data\Keyspace1\Super1-711-Data.db
  INFO 13:36:45,781 Super1 has reached its threshold; switching in a
 fresh Memtable at
 CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
 position=6065637)
  INFO 13:36:45,781 Enqueuing flush of Memtable(Super1)@15578910
  INFO 13:36:45,796 Writing Memtable(Super1)@15578910
  INFO 13:36:46,109 Completed flushing
 d:\cassandra\data\Keyspace1\Super1-712-Data.db
  INFO 13:36:54,296 GC for ConcurrentMarkSweep: 7149 ms, 58337240
 reclaimed leaving 922392600 used; max is 1174208512
  INFO 13:36:54,593 Super1 has reached its threshold; switching in a
 fresh Memtable at
 CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
 position=6722241)
  INFO 13:36:54,593 Enqueuing flush of Memtable(Super1)@24468872
  INFO 13:36:54,593 Writing Memtable(Super1)@24468872
  INFO 13:36:55,421 Completed flushing
 d:\cassandra\data\Keyspace1\Super1-713-Data.dbjava.lang.OutOfMemoryError:
 Java heap space
  INFO 13:37:08,281 GC for ConcurrentMarkSweep: 5561 ms, 9432 reclaimed
 leaving 971904520 used; max is 1174208512





>>>
>>>
>>
>


Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Mark Greene
Maybe, maybe not. Presumably, if you are running an RDBMS with any reasonable
amount of traffic nowadays, it's sitting on a machine with at least 4-8GB of
memory.

On Wed, Apr 21, 2010 at 10:48 AM, Nicolas Labrot  wrote:

> Thanks Mark.
>
> Cassandra is maybe too much for my need ;)
>
>
>
> On Wed, Apr 21, 2010 at 4:45 PM, Mark Greene  wrote:
>
>> Hit send to early
>>
>> That being said a lot of people running Cassandra in production are using
>> 4-6GB max heaps on 8GB machines, don't know if that helps but hopefully
>> gives you some perspective.
>>
>>
>> On Wed, Apr 21, 2010 at 10:39 AM, Mark Greene  wrote:
>>
>>> RAM doesn't necessarily need to be proportional but I would say the
>>> number of nodes does. You can't just throw a bazillion inserts at one node.
>>> This is the main benefit of Cassandra is that if you start hitting your
>>> capacity, you add more machines and distribute the keys across more
>>> machines.
>>>
>>>
>>> On Wed, Apr 21, 2010 at 9:07 AM, Nicolas Labrot wrote:
>>>
 So does it means the RAM needed is proportionnal with the data handled ?

 Or Cassandra need a minimum amount or RAM when dataset is big?

 I must confess this OOM behaviour is strange.


 On Wed, Apr 21, 2010 at 2:54 PM, Mark Jones wrote:

>  On my 4GB machine I’m giving it 3GB and having no trouble with 60+
> million 500 byte columns
>
>
>
> *From:* Nicolas Labrot [mailto:nith...@gmail.com]
> *Sent:* Wednesday, April 21, 2010 7:47 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Cassandra tuning for running test on a desktop
>
>
>
> I have try 1400M, and Cassandra OOM too.
>
> Is there another solution ? My data isn't very big.
>
> It seems that is the merge of the db
>
>  On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene 
> wrote:
>
> Trying increasing Xmx. 1G is probably not enough for the amount of
> inserts you are doing.
>
>
>
> On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot 
> wrote:
>
> Hello,
>
> For my first message I will first thanks Cassandra contributors for
> their great works.
>
> I have a parameter issue with Cassandra (I hope it's just a parameter
> issue). I'm using Cassandra 6.0.1 with Hector client on my desktop. It's a
> simple dual core with 4GB of RAM on WinXP. I have keep the default JVM
> option inside cassandra.bat (Xmx1G)
>
> I'm trying to insert 3 millions of SC with 6 Columns each inside 1 CF
> (named Super1). The insertion go to 1 millions of SC (without slowdown) 
> and
> Cassandra crash because of an OOM. (I store an average of 100 bytes per SC
> with a max of 10kB).
> I have aggressively decreased all the memories parameters without any
> respect to the consistency (My config is here [1]), the cache is turn off
> but Cassandra still go to OOM. I have joined the last line of the 
> Cassandra
> life [2].
>
> What can I do to fix my issue ?  Is there another solution than
> increasing the Xmx ?
>
> Thanks for your help,
>
> Nicolas
>
>
>
>
>
> [1]
>   
> 
>    ColumnType="Super"
> CompareWith="BytesType"
> CompareSubcolumnsWith="BytesType" />
>
> org.apache.cassandra.locator.RackUnawareStrategy
>   1
>
> org.apache.cassandra.locator.EndPointSnitch
> 
>   
>   32
>
>   auto
>   64
>   64
>   16
>   4
>   64
>
>   16
>   32
>   0.01
>   0.01
>   60
>   4
>   8
> 
>
>
> [2]
>  INFO 13:36:41,062 Super1 has reached its threshold; switching in a
> fresh Memtable at
> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
> position=5417524)
>  INFO 13:36:41,062 Enqueuing flush of Memtable(Super1)@15385755
>  INFO 13:36:41,062 Writing Memtable(Super1)@15385755
>  INFO 13:36:42,062 Completed flushing
> d:\cassandra\data\Keyspace1\Super1-711-Data.db
>  INFO 13:36:45,781 Super1 has reached its threshold; switching in a
> fresh Memtable at
> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
> position=6065637)
>  INFO 13:36:45,781 Enqueuing flush of Memtable(Super1)@15578910
>  INFO 13:36:45,796 Writing Memtable(Super1)@15578910
>  INFO 13:36:46,109 Completed flushing
> d:\cassandra\data\Keyspace1\Super1-712-Data.db
>  INFO 13:36:54,296 GC for ConcurrentMarkSweep: 7149 ms, 58337240
> reclaimed leaving 922392600 used; max is 1174208512
>  INFO 13:36:54,593 Super1 has reached its threshold; switching in a
> fresh Memtable at
> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
> position=6722241)
>  INFO 13:36:54,593 Enqueuing flush of Memtable(Super1)@24

At what point does the cluster get faster than the individual nodes?

2010-04-21 Thread Mark Jones
I'm seeing a cluster of 4 (replication factor=2) to be barely faster overall
than the slowest node in the group.  When I run the 4 nodes individually, I see:

For inserts:
Two nodes @ 12000/second
1 node @ 9000/second
1 node @ 7000/second

For reads:
Abysmal, less than 1000/second (not range slices, individual lookups)  Disk 
util @ 88+%


How many nodes are required before you see a net positive gain on inserts and 
reads (QUORUM consistency on both)?
When I use my 2 fastest nodes as a pair, the thruput is around 9000 
inserts/second.

What is a good to excellent hardware config for Cassandra?  I have separate 
drives for data and commit log and 8GB in 3 machines (all dual core).  My 
fastest insert node has 4GB and a triple core processor.

I've run py_stress, and my C++ code beats it by several 1000 inserts/second 
toward the end of the runs, so I don't think it is my app, and I've removed the 
super columns per some suggestions yesterday.

When Cassandra is working, it performs well; the problem is that it frequently
slows down to < 50% of its peak and occasionally drops to 0 inserts/second,
which greatly reduces aggregate thruput.


Re: At what point does the cluster get faster than the individual nodes?

2010-04-21 Thread Jim R. Wilson
Hi Mark,

I'm a relative newcomer to Cassandra, but I believe the common
experience is that you start seeing gains after 5 nodes in a
column-oriented data store.  It may also depend on your usage pattern.

Others may know better - hope this helps!

-- Jim R. Wilson (jimbojw)

On Wed, Apr 21, 2010 at 11:28 AM, Mark Jones  wrote:
> I’m seeing a cluster of 4 (replication factor=2) to be about as slow overall
> as the barely faster than the slowest node in the group.  When I run the 4
> nodes individually, I see:
>
>
>
> For inserts:
>
> Two nodes @ 12000/second
>
> 1 node @ 9000/second
>
> 1 node @ 7000/second
>
>
>
> For reads:
>
> Abysmal, less than 1000/second (not range slices, individual lookups)  Disk
> util @ 88+%
>
>
>
>
>
> How many nodes are required before you see a net positive gain on inserts
> and reads (QUORUM consistency on both)?
>
> When I use my 2 fastest nodes as a pair, the thruput is around 9000
> inserts/second.
>
>
>
> What is a good to excellent hardware config for Cassandra?  I have separate
> drives for data and commit log and 8GB in 3 machines (all dual core).  My
> fastest insert node has 4GB and a triple core processor.
>
>
>
> I’ve run py_stress, and my C++ code beats it by several 1000 inserts/second
> toward the end of the runs, so I don’t think it is my app, and I’ve removed
> the super columns per some suggestions yesterday.
>
>
>
> When Cassandra is working, it performs well, the problem is that is
> frequently slows down to < 50% of its peaks and occasionally slows down to 0
> inserts/second which greatly reduces aggregate thruput.


Re: Clarification on Ring operations in Cassandra 0.5.1

2010-04-21 Thread Anthony Molinaro
Hi,

  I'm still curious whether I got the data movement right in the email below.
Anyone?  Also, does anyone know if I can scp the data directory from
a node I want to replace to a new machine?  The Cassandra streaming seems
much slower than scp.

-Anthony

On Mon, Apr 19, 2010 at 04:48:23PM -0700, Anthony Molinaro wrote:
> 
> On Mon, Apr 19, 2010 at 03:28:26PM -0500, Jonathan Ellis wrote:
> > > Can I then 'nodeprobe move ', and
> > > achieve the same as step 2 above?
> > 
> > You can't have two nodes with the same token in the ring at once.  So,
> > you can removetoken the old node first, then bootstrap the new one
> > (just specify InitialToken in the config to avoid having it guess
> > one), or you can make it a 3 step process (bootstrap, remove, move) to
> > avoid transferring so much data around.
> 
> So I'm still a little fuzzy for your 3 step case on why less data moves,
> but let me run through the two scenarios and see where we get.  Please
> correct me if I'm wrong on some point.
> 
> Let say I have 3 nodes with random partitioner and rack unaware strategy.
> Which means I have something like
> 
> Node  Size   Token  KeyRange (self + next in ring)
> ----  ----   -----  ------------------------------
> A     5 G     33    1 -> 66
> B     6 G     66    34 -> 0
> C     2 G      0    67 -> 33
> 
> Now lets say Node B is giving us some problems, so we want to replace it
> with another node D.
> 
> We've outlined 2 processes.
> 
> In the first process you recommend
> 
> 1. removetoken on node B
> 2. wait for data to move
> 3. add InitialToken of 66 and AutoBootstrap = true to node D storage-conf.xml
>then start it
> 4. wait for data to move
> 
> So when you do the removetoken, this will cause the following transfers
> at stage 2
>   Node A sends 34->66 to Node C
>   Node C sends 67->0  to Node A
> at stage 4
>   Node A sends 34->66 to Node D
>   Node C sends 67->0  to Node D
> 
> In the second process I assume you pick a token really close to another token?
> 
> 1. add InitialToken of 34 and AutoBootstrap to true to node D storage-conf.xml
>then start it
> 2. wait for data to move
> 3. removetoken on node B
> 4. wait for data to move
> 5. movetoken on node D to 66
> 6. wait for data to move
> 
> This results in the following moves
> at stage 2
>   Node A/B sends 33->34 to Node D (primary token range)
>   Node B sends 34->66 to Node D   (replica range)
> at stage 4
>   Node C sends 66->0 to Node D (replica range)
> at stage 6
>   No data movement as D already had 33->0
> 
> So seems like you move all the data twice for process 1 and only a small
> portion twice for process 2 (which is what you said, so hopefully I've
> outlined correctly what is happening).  Does all that sound right?
> 
> Once I've run bootstrap with the InitialToken value set in the config is
> it then ignored in subsequent restarts, and if so can I just remove it
> after that first time?
> 
> Thanks,
> 
> -Anthony
> 
> -- 
> 
> Anthony Molinaro   

-- 

Anthony Molinaro   


Re: Clarification on Ring operations in Cassandra 0.5.1

2010-04-21 Thread Jonathan Ellis
Yes, that looks right, where "token really close" means "slightly less
than" (any more than that would move it into a different node's range).

You can't really migrate via scp since only one node with a given
token can exist in the cluster at a time.

-Jonathan

On Wed, Apr 21, 2010 at 11:02 AM, Anthony Molinaro
 wrote:
> Hi,
>
>  I'm still curious if I got the data movement right in this email from
> before?  Anyone?  Also, anyone know if I can scp the data directory from
> a node I want to replace to a new machine?  The cassandra streaming seems
> much slower than scp.
>
> -Anthony
>
> On Mon, Apr 19, 2010 at 04:48:23PM -0700, Anthony Molinaro wrote:
>>
>> On Mon, Apr 19, 2010 at 03:28:26PM -0500, Jonathan Ellis wrote:
>> > > Can I then 'nodeprobe move ', and
>> > > achieve the same as step 2 above?
>> >
>> > You can't have two nodes with the same token in the ring at once.  So,
>> > you can removetoken the old node first, then bootstrap the new one
>> > (just specify InitialToken in the config to avoid having it guess
>> > one), or you can make it a 3 step process (bootstrap, remove, move) to
>> > avoid transferring so much data around.
>>
>> So I'm still a little fuzzy for your 3 step case on why less data moves,
>> but let me run through the two scenarios and see where we get.  Please
>> correct me if I'm wrong on some point.
>>
>> Let say I have 3 nodes with random partitioner and rack unaware strategy.
>> Which means I have something like
>>
>> Node  Size   Token  KeyRange (self + next in ring)
>> ----  ----   -----  ------------------------------
>> A     5 G      33    1 -> 66
>> B     6 G      66       34 -> 0
>> C     2 G       0          67 -> 33
>>
>> Now lets say Node B is giving us some problems, so we want to replace it
>> with another node D.
>>
>> We've outlined 2 processes.
>>
>> In the first process you recommend
>>
>> 1. removetoken on node B
>> 2. wait for data to move
>> 3. add InitialToken of 66 and AutoBootstrap = true to node D storage-conf.xml
>>    then start it
>> 4. wait for data to move
>>
>> So when you do the removetoken, this will cause the following transfers
>> at stage 2
>>   Node A sends 34->66 to Node C
>>   Node C sends 67->0  to Node A
>> at stage 4
>>   Node A sends 34->66 to Node D
>>   Node C sends 67->0  to Node D
>>
>> In the second process I assume you pick a token really close to another 
>> token?
>>
>> 1. add InitialToken of 34 and AutoBootstrap to true to node D 
>> storage-conf.xml
>>    then start it
>> 2. wait for data to move
>> 3. removetoken on node B
>> 4. wait for data to move
>> 5. movetoken on node D to 66
>> 6. wait for data to move
>>
>> This results in the following moves
>> at stage 2
>>   Node A/B sends 33->34 to Node D (primary token range)
>>   Node B sends 34->66 to Node D   (replica range)
>> at stage 4
>>   Node C sends 66->0 to Node D (replica range)
>> at stage 6
>>   No data movement as D already had 33->0
>>
>> So seems like you move all the data twice for process 1 and only a small
>> portion twice for process 2 (which is what you said, so hopefully I've
>> outlined correctly what is happening).  Does all that sound right?
>>
>> Once I've run bootstrap with the InitialToken value set in the config is
>> it then ignored in subsequent restarts, and if so can I just remove it
>> after that first time?
>>
>> Thanks,
>>
>> -Anthony
>>
>> --
>> 
>> Anthony Molinaro                           
>
> --
> 
> Anthony Molinaro                           
>


Re: At what point does the cluster get faster than the individual nodes?

2010-04-21 Thread Mike Gallamore
Some people might be able to answer this better than me. However: with 
quorum consistency you have to communicate with n/2 + 1 nodes, where n is 
the replication factor. So unless you are disk bound, your real expense 
is going to be all those extra network latencies. I'd expect that you'll 
see a relatively flat throughput per thread once you reach the point 
that you aren't disk or CPU bound. That said, the extra nodes mean you 
should be able to handle more threads/connections at the same throughput 
on each thread/connection. So a bigger cluster doesn't necessarily mean a 
single job goes faster, just that you can handle more jobs at the same 
time.
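
To put a number on the n/2 + 1 rule above, a tiny sketch (integer division;
the values are only examples):

    // Quorum size for a given replication factor, using integer division:
    // RF=1 -> 1, RF=2 -> 2, RF=3 -> 2, RF=5 -> 3.
    int replicationFactor = 3;                 // example value
    int quorum = replicationFactor / 2 + 1;    // = 2 for RF=3
    // Note that with RF=2 the quorum is 2, so every QUORUM read or write
    // has to touch both replicas.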

On 04/21/2010 08:28 AM, Mark Jones wrote:


I'm seeing a cluster of 4 (replication factor=2) to be about as slow 
overall as (at best barely faster than) the slowest node in the group.  When 
I run the 4 nodes individually, I see:


For inserts:

Two nodes @ 12000/second

1 node @ 9000/second

1 node @ 7000/second

For reads:

Abysmal, less than 1000/second (not range slices, individual lookups)  
Disk util @ 88+%


How many nodes are required before you see a net positive gain on 
inserts and reads (QUORUM consistency on both)?


When I use my 2 fastest nodes as a pair, the thruput is around 9000 
inserts/second.


What is a good to excellent hardware config for Cassandra?  I have 
separate drives for data and commit log and 8GB in 3 machines (all 
dual core).  My fastest insert node has 4GB and a triple core processor.


I've run py_stress, and my C++ code beats it by several 1000 
inserts/second toward the end of the runs, so I don't think it is my 
app, and I've removed the super columns per some suggestions yesterday.


When Cassandra is working, it performs well; the problem is that it 
frequently slows down to < 50% of its peaks and occasionally slows 
down to 0 inserts/second, which greatly reduces aggregate thruput.






Re: Clarification on Ring operations in Cassandra 0.5.1

2010-04-21 Thread Anthony Molinaro

On Wed, Apr 21, 2010 at 11:08:19AM -0500, Jonathan Ellis wrote:
> Yes, that looks right, where "token really close" means "slightly less
> than" (more than would move it into a different node's range).

Is it better to go slightly less than (say Token - 1), or slightly more than
the beginning of the range (PreviousTokenInRing + 1).  I was assuming the
latter in my earlier email, but you seem to be suggesting the former?

> You can't really migrate via scp since only one node with a given
> token can exist in the cluster at a time.

Right, I was mostly wondering if I could speed things up by scping the
sstables while the system was running (since they shouldn't be changing).
Then in quick succession removetoken and bootstrap with the old token.
Probably grasping at straws here :b

Thanks for the answers,

-Anthony

> On Wed, Apr 21, 2010 at 11:02 AM, Anthony Molinaro
>  wrote:
> > Hi,
> >
> >  I'm still curious if I got the data movement right in this email from
> > before?  Anyone?  Also, anyone know if I can scp the data directory from
> > a node I want to replace to a new machine?  The cassandra streaming seems
> > much slower than scp.
> >
> > -Anthony
> >
> > On Mon, Apr 19, 2010 at 04:48:23PM -0700, Anthony Molinaro wrote:
> >>
> >> On Mon, Apr 19, 2010 at 03:28:26PM -0500, Jonathan Ellis wrote:
> >> > > Can I then 'nodeprobe move ', and
> >> > > achieve the same as step 2 above?
> >> >
> >> > You can't have two nodes with the same token in the ring at once.  So,
> >> > you can removetoken the old node first, then bootstrap the new one
> >> > (just specify InitialToken in the config to avoid having it guess
> >> > one), or you can make it a 3 step process (bootstrap, remove, move) to
> >> > avoid transferring so much data around.
> >>
> >> So I'm still a little fuzzy for your 3 step case on why less data moves,
> >> but let me run through the two scenarios and see where we get.  Please
> >> correct me if I'm wrong on some point.
> >>
> >> Let say I have 3 nodes with random partitioner and rack unaware strategy.
> >> Which means I have something like
> >>
> >> Node  Size   Token  KeyRange (self + next in ring)
> >>      -  --
> >> A     5 G      33    1 -> 66
> >> B     6 G      66       34 -> 0
> >> C     2 G       0          67 -> 33
> >>
> >> Now lets say Node B is giving us some problems, so we want to replace it
> >> with another node D.
> >>
> >> We've outlined 2 processes.
> >>
> >> In the first process you recommend
> >>
> >> 1. removetoken on node B
> >> 2. wait for data to move
> >> 3. add InitialToken of 66 and AutoBootstrap = true to node D 
> >> storage-conf.xml
> >>    then start it
> >> 4. wait for data to move
> >>
> >> So when you do the removetoken, this will cause the following transfers
> >> at stage 2
> >>   Node A sends 34->66 to Node C
> >>   Node C sends 67->0  to Node A
> >> at stage 4
> >>   Node A sends 34->66 to Node D
> >>   Node C sends 67->0  to Node D
> >>
> >> In the second process I assume you pick a token really close to another 
> >> token?
> >>
> >> 1. add InitialToken of 34 and AutoBootstrap to true to node D 
> >> storage-conf.xml
> >>    then start it
> >> 2. wait for data to move
> >> 3. removetoken on node B
> >> 4. wait for data to move
> >> 5. movetoken on node D to 66
> >> 6. wait for data to move
> >>
> >> This results in the following moves
> >> at stage 2
> >>   Node A/B sends 33->34 to Node D (primary token range)
> >>   Node B sends 34->66 to Node D   (replica range)
> >> at stage 4
> >>   Node C sends 66->0 to Node D (replica range)
> >> at stage 6
> >>   No data movement as D already had 33->0
> >>
> >> So seems like you move all the data twice for process 1 and only a small
> >> portion twice for process 2 (which is what you said, so hopefully I've
> >> outlined correctly what is happening).  Does all that sound right?
> >>
> >> Once I've run bootstrap with the InitialToken value set in the config is
> >> it then ignored in subsequent restarts, and if so can I just remove it
> >> after that first time?
> >>
> >> Thanks,
> >>
> >> -Anthony
> >>
> >> --
> >> 
> >> Anthony Molinaro                           
> >
> > --
> > 
> > Anthony Molinaro                           
> >

-- 

Anthony Molinaro   


Re: At what point does the cluster get faster than the individual nodes?

2010-04-21 Thread Mark Greene
Right, it's a similar concept to DB sharding, where you spread the write load
around to different DB servers: it won't necessarily increase the throughput
of any one DB server, but it does increase the throughput of the cluster collectively.

On Wed, Apr 21, 2010 at 12:16 PM, Mike Gallamore <
mike.e.gallam...@googlemail.com> wrote:

>  Some people might be able to answer this better than me. However: with
> quorum consistency you have to communicate with n/2 + 1 where n is the
> replication factor nodes. So unless you are disk bound your real expense is
> going to be all those extra network latencies. I'd expect that you'll see a
> relatively flat throughput per thread once you reach the point that you
> aren't disk or CPU bound. That said the extra nodes mean if you should be
> able to handle more threads/connections at the same throughput on each
> thread/connection. So bigger cluster doesn't mean a single job goes faster
> necessarily, just that you can handle more jobs at the same time.
>
> On 04/21/2010 08:28 AM, Mark Jones wrote:
>
>  I’m seeing a cluster of 4 (replication factor=2) to be about as slow
> overall as the barely faster than the slowest node in the group.  When I run
> the 4 nodes individually, I see:
>
>
>
> For inserts:
>
> Two nodes @ 12000/second
>
> 1 node @ 9000/second
>
> 1 node @ 7000/second
>
>
>
> For reads:
>
> Abysmal, less than 1000/second (not range slices, individual lookups)  Disk
> util @ 88+%
>
>
>
>
>
> How many nodes are required before you see a net positive gain on inserts
> and reads (QUORUM consistency on both)?
>
> When I use my 2 fastest nodes as a pair, the thruput is around 9000
> inserts/second.
>
>
>
> What is a good to excellent hardware config for Cassandra?  I have separate
> drives for data and commit log and 8GB in 3 machines (all dual core).  My
> fastest insert node has 4GB and a triple core processor.
>
>
>
> I’ve run py_stress, and my C++ code beats it by several 1000 inserts/second
> toward the end of the runs, so I don’t think it is my app, and I’ve removed
> the super columns per some suggestions yesterday.
>
>
>
> When Cassandra is working, it performs well, the problem is that is
> frequently slows down to < 50% of its peaks and occasionally slows down to 0
> inserts/second which greatly reduces aggregate thruput.
>
>
>


Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Nicolas Labrot
I do not have a website ;)

I'm testing the viability of Cassandra to store XML documents and make fast
search queries. 4000 XML files (80MB of XML) stored with my data model (one
SC per XML node) produce 100 SC, which makes Cassandra go OOM with Xmx 1GB. By
contrast, an XML DB like eXist handles the 4000 XML docs without any problem
and with an acceptable amount of memory.

What I like about Cassandra is its simplicity and its scalability. eXist is
not able to scale with data; the only viable alternative is MarkLogic, which
costs an arm and a leg... :)

I will install Linux and buy some memory to continue my tests.

Could a Cassandra developer give me the technical reason for this OOM?





On Wed, Apr 21, 2010 at 5:13 PM, Mark Greene  wrote:

> Maybe, maybe not. Presumably if you are running a RDMS with any reasonable
> amount of traffic now a days, it's sitting on a machine with 4-8G of memory
> at least.
>
>
> On Wed, Apr 21, 2010 at 10:48 AM, Nicolas Labrot wrote:
>
>> Thanks Mark.
>>
>> Cassandra is maybe too much for my need ;)
>>
>>
>>
>> On Wed, Apr 21, 2010 at 4:45 PM, Mark Greene  wrote:
>>
>>> Hit send to early
>>>
>>> That being said a lot of people running Cassandra in production are using
>>> 4-6GB max heaps on 8GB machines, don't know if that helps but hopefully
>>> gives you some perspective.
>>>
>>>
>>> On Wed, Apr 21, 2010 at 10:39 AM, Mark Greene wrote:
>>>
 RAM doesn't necessarily need to be proportional but I would say the
 number of nodes does. You can't just throw a bazillion inserts at one node.
 This is the main benefit of Cassandra is that if you start hitting your
 capacity, you add more machines and distribute the keys across more
 machines.


 On Wed, Apr 21, 2010 at 9:07 AM, Nicolas Labrot wrote:

> So does it means the RAM needed is proportionnal with the data handled
> ?
>
> Or Cassandra need a minimum amount or RAM when dataset is big?
>
> I must confess this OOM behaviour is strange.
>
>
> On Wed, Apr 21, 2010 at 2:54 PM, Mark Jones wrote:
>
>>  On my 4GB machine I’m giving it 3GB and having no trouble with 60+
>> million 500 byte columns
>>
>>
>>
>> *From:* Nicolas Labrot [mailto:nith...@gmail.com]
>> *Sent:* Wednesday, April 21, 2010 7:47 AM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Cassandra tuning for running test on a desktop
>>
>>
>>
>> I have try 1400M, and Cassandra OOM too.
>>
>> Is there another solution ? My data isn't very big.
>>
>> It seems that is the merge of the db
>>
>>  On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene 
>> wrote:
>>
>> Trying increasing Xmx. 1G is probably not enough for the amount of
>> inserts you are doing.
>>
>>
>>
>> On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot 
>> wrote:
>>
>> Hello,
>>
>> For my first message I will first thanks Cassandra contributors for
>> their great works.
>>
>> I have a parameter issue with Cassandra (I hope it's just a parameter
>> issue). I'm using Cassandra 6.0.1 with Hector client on my desktop. It's 
>> a
>> simple dual core with 4GB of RAM on WinXP. I have keep the default JVM
>> option inside cassandra.bat (Xmx1G)
>>
>> I'm trying to insert 3 millions of SC with 6 Columns each inside 1 CF
>> (named Super1). The insertion go to 1 millions of SC (without slowdown) 
>> and
>> Cassandra crash because of an OOM. (I store an average of 100 bytes per 
>> SC
>> with a max of 10kB).
>> I have aggressively decreased all the memories parameters without any
>> respect to the consistency (My config is here [1]), the cache is turn off
>> but Cassandra still go to OOM. I have joined the last line of the 
>> Cassandra
>> life [2].
>>
>> What can I do to fix my issue ?  Is there another solution than
>> increasing the Xmx ?
>>
>> Thanks for your help,
>>
>> Nicolas
>>
>>
>>
>>
>>
>> [1]
>>   
>> 
>>   > ColumnType="Super"
>> CompareWith="BytesType"
>> CompareSubcolumnsWith="BytesType" />
>>
>> org.apache.cassandra.locator.RackUnawareStrategy
>>   1
>>
>> org.apache.cassandra.locator.EndPointSnitch
>> 
>>   
>>   32
>>
>>   auto
>>   64
>>   64
>>   16
>>   4
>>   64
>>
>>   16
>>   32
>>   0.01
>>   0.01
>>   60
>>   4
>>   8
>> 
>>
>>
>> [2]
>>  INFO 13:36:41,062 Super1 has reached its threshold; switching in a
>> fresh Memtable at
>> CommitLogContext(file='d:/cassandra/commitlog\CommitLog-1271849783703.log',
>> position=5417524)
>>  INFO 13:36:41,062 Enqueuing flush of Memtable(Super1)@15385755
>>  INFO 13:36:41,062 Writing Memtable(Super1)@15385755
>>  INFO

Import using cassandra 0.6.1

2010-04-21 Thread Sonny Heer
Currently running on a single node with intensive write operations.


After running for a while...

Client starts outputting:

TimedOutException()
at 
org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:12232)
at 
org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:670)
at 
org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:643)

Cassandra starts outputting:

 INFO 08:08:49,864 Cassandra starting up...
 INFO 08:18:09,782 GC for ParNew: 238 ms, 30728008 reclaimed leaving
220554976 used; max is 1190723584
 INFO 08:18:19,782 GC for ParNew: 231 ms, 30657944 reclaimed leaving
230245792 used; max is 1190723584
 INFO 08:18:39,782 GC for ParNew: 229 ms, 30567184 reclaimed leaving
250127792 used; max is 1190723584
 INFO 08:18:59,782 GC for ParNew: 358 ms, 46416776 reclaimed leaving
261657720 used; max is 1190723584
 INFO 08:19:09,782 GC for ParNew: 205 ms, 46331376 reclaimed leaving
273764040 used; max is 1190723584
 INFO 08:19:19,783 GC for ParNew: 335 ms, 46354968 reclaimed leaving
282912656 used; max is 1190723584
 INFO 08:19:30,064 GC for ParNew: 392 ms, 46403400 reclaimed leaving
294861824 used; max is 1190723584
 INFO 08:19:40,065 GC for ParNew: 326 ms, 4639 reclaimed leaving
304045640 used; max is 1190723584
 INFO 08:19:50,064 GC for ParNew: 256 ms, 46460824 reclaimed leaving
312964344 used; max is 1190723584
 INFO 08:20:00,065 GC for ParNew: 242 ms, 46357104 reclaimed leaving
324961320 used; max is 1190723584
 INFO 08:20:20,065 GC for ParNew: 336 ms, 46447216 reclaimed leaving
345874144 used; max is 1190723584
 INFO 08:20:27,357 Creating new commitlog segment
/var/lib/cassandra/commitlog/CommitLog-1271866827357.log
 INFO 08:20:40,065 GC for ParNew: 265 ms, 46509608 reclaimed leaving
366587984 used; max is 1190723584
 INFO 08:21:00,069 GC for ParNew: 321 ms, 46478736 reclaimed leaving
384059832 used; max is 1190723584
 INFO 08:21:10,069 GC for ParNew: 223 ms, 62235000 reclaimed leaving
383631432 used; max is 1190723584
 INFO 08:21:30,069 GC for ParNew: 291 ms, 62261104 reclaimed leaving
399697888 used; max is 1190723584
 INFO 08:21:50,069 GC for ParNew: 245 ms, 62275528 reclaimed leaving
415428952 used; max is 1190723584
 INFO 08:22:10,248 GC for ParNew: 384 ms, 62219264 reclaimed leaving
433542656 used; max is 1190723584
 INFO 08:22:30,248 GC for ParNew: 215 ms, 62363608 reclaimed leaving
452030560 used; max is 1190723584
 INFO 08:22:40,248 GC for ParNew: 318 ms, 62104552 reclaimed leaving
464013992 used; max is 1190723584
 INFO 08:22:51,039 GC for ParNew: 845 ms, 62218296 reclaimed leaving
471978840 used; max is 1190723584
 INFO 08:23:01,040 GC for ParNew: 474 ms, 62258080 reclaimed leaving
475912120 used; max is 1190723584
 INFO 08:23:11,040 GC for ParNew: 738 ms, 62265328 reclaimed leaving
483742344 used; max is 1190723584
 INFO 08:23:21,040 GC for ParNew: 306 ms, 62218648 reclaimed leaving
491761672 used; max is 1190723584
 INFO 08:23:41,041 GC for ParNew: 279 ms, 62187536 reclaimed leaving
507442800 used; max is 1190723584
 INFO 08:24:01,041 GC for ParNew: 557 ms, 62310784 reclaimed leaving
523028304 used; max is 1190723584
 INFO 08:24:11,041 GC for ParNew: 221 ms, 62268456 reclaimed leaving
530865568 used; max is 1190723584
 INFO 08:24:21,041 GC for ParNew: 334 ms, 62258720 reclaimed leaving
542690216 used; max is 1190723584
 INFO 08:24:31,042 GC for ParNew: 262 ms, 62218624 reclaimed leaving
550728232 used; max is 1190723584
 INFO 08:24:51,045 GC for ParNew: 640 ms, 62235952 reclaimed leaving
573981584 used; max is 1190723584
 INFO 08:25:01,045 GC for ParNew: 309 ms, 62138776 reclaimed leaving
563891472 used; max is 1190723584
 INFO 08:25:11,046 GC for ParNew: 242 ms, 62255952 reclaimed leaving
575756040 used; max is 1190723584
 INFO 08:25:21,047 GC for ParNew: 326 ms, 62264432 reclaimed leaving
583631432 used; max is 1190723584
 INFO 08:25:31,047 GC for ParNew: 591 ms, 62231816 reclaimed leaving
595405816 used; max is 1190723584
 INFO 08:25:41,048 GC for ParNew: 478 ms, 62186088 reclaimed leaving
603389432 used; max is 1190723584
 INFO 08:25:51,049 GC for ParNew: 409 ms, 62264832 reclaimed leaving
615150584 used; max is 1190723584
 INFO 08:26:01,049 GC for ParNew: 416 ms, 62189952 reclaimed leaving
623125104 used; max is 1190723584
 INFO 08:26:11,049 GC for ParNew: 430 ms, 62382056 reclaimed leaving
634661008 used; max is 1190723584
 INFO 08:26:21,094 GC for ParNew: 436 ms, 62319088 reclaimed leaving
646272840 used; max is 1190723584
 INFO 08:26:31,094 GC for ParNew: 404 ms, 62379896 reclaimed leaving
653978688 used; max is 1190723584
 INFO 08:26:41,094 GC for ParNew: 568 ms, 62407112 reclaimed leaving
665462760 used; max is 1190723584
 INFO 08:26:44,895 Creating new commitlog segment
/var/lib/cassandra/commitlog/CommitLog-1271867204895.log
 INFO 08:26:51,094 GC for ParNew: 682 ms, 62129816 reclaimed leaving
673423168 used; max is 1190723584
 INFO 08:27:01,094 GC for ParNew: 480 ms, 62284080 reclaimed leaving
6852776

Re: Clarification on Ring operations in Cassandra 0.5.1

2010-04-21 Thread Jonathan Ellis
On Wed, Apr 21, 2010 at 11:31 AM, Anthony Molinaro
 wrote:
>
> On Wed, Apr 21, 2010 at 11:08:19AM -0500, Jonathan Ellis wrote:
>> Yes, that looks right, where "token really close" means "slightly less
>> than" (more than would move it into a different node's range).
>
> Is it better to go slightly less than (say Token - 1), or slightly more than
> the beginning of the range (PreviousTokenInRing + 1).  I was assuming the
> latter in my earlier email, but you seem to be suggesting the former?

Right, the former.

> Right, I was mostly wondering if I could speed things up by scping the
> sstables while the system was running (since they shouldn't be changing).
> Then in quick succession removetoken and bootstrap with the old token.
> Probably grasping at straws here :b

Nope, bootstrap ignores any local data.

You could use scp-then-repair if you can tolerate slightly out of date
data being served by the new machine until the repair finishes.


Re: Just to be clear, cassandra is web framework agnostic b/c of Thrift?

2010-04-21 Thread Jonathan Ellis
There is a patch attached to
https://issues.apache.org/jira/browse/CASSANDRA-948 that needs
volunteers to test.

On Sun, Apr 18, 2010 at 11:13 PM, Mark Greene  wrote:
> With the 0.6.0 release, the windows cassandra.bat file errors out. There's a
> bug filed for this already. There's a README or something similar in the
> install directory, that tells you the basic CLI operations and explains the
> basic data model.
>
> On Sun, Apr 18, 2010 at 11:23 PM, S Ahmed  wrote:
>>
>> Interesting, I'm just finding windows to be a pain, particular starting up
>> java apps. (I guess I just need to learn!)
>> How exactly would you startup Cassandra on a windows machine? i.e when the
>> server reboots, how will it run the java -jar cassandar ?
>>
>>
>> On Sun, Apr 18, 2010 at 7:35 PM, Joe Stump  wrote:
>>>
>>> On Apr 18, 2010, at 5:33 PM, S Ahmed wrote:
>>>
>>> Obviously if you run asp.net on windows, it is probably a VERY good idea
>>> to be running cassandra on a linux box.
>>>
>>> Actually, I'm not sure this is true. A few people have found Windows
>>> performs fairly well with Cassandra, if I recall correctly. Obviously, all
>>> of the testing and most of the bigger users are running on Linux though.
>>> --Joe
>
>


Re: restore with snapshot

2010-04-21 Thread Jonathan Ellis
On Mon, Apr 19, 2010 at 2:03 PM, Lee Parker  wrote:
> I am working on finalizing our backup and restore procedures for a cassandra
> cluster running on EC2. I understand based on the wiki that in order to
> replace a single node, I don't actually need to put data on that node.  I
> just need to bootstrap the new node into the cluster and it will get data
> from the other nodes.  However, would is speed up the process if that node
> already has the data from the node it is replacing?

No, that would speed up repair, but not bootstrap.  See the section on
failure recovery in http://wiki.apache.org/cassandra/Operations

>  Also, what do I do if
> the entire cluster goes down?  I am planning to snapshot the data each night
> for each node.  Should I save the system keyspace snapshots?

Yes, since that is where the token is stored.

>  Is it
> problematic to bring the cluster back up with new ips on each node, but the
> same tokens as before?

No.  Just make sure all the old instances are down, before bringing up
instances w/ new ips.

-Jonathan


Re: tcp CLOSE_WAIT bug

2010-04-21 Thread Jonathan Ellis
I'd like to get something besides "I'm seeing close wait but i have no
idea why" for a bug report, since most people aren't seeing that.

On Tue, Apr 20, 2010 at 9:33 AM, Ingram Chen  wrote:
> I trace IncomingStreamReader source and found that incoming socket comes
> from MessagingService$SocketThread.
> but there is no close() call on either accepted socket or socketChannel.
>
> Should I file a bug report ?
>
> On Tue, Apr 20, 2010 at 11:02, Ingram Chen  wrote:
>>
>> this happened after several hours of operations and both nodes are started
>> at the same time (clean start without any data). so it might not relate to
>> Bootstrap.
>>
>> In system.log I do not see any logs like "xxx node dead" or exceptions.
>> and both nodes in test are alive. they serve read/write well, too. Below
>> four connections between nodes are keep healthy from time to time.
>>
>> tcp    0  0 :::192.168.2.87:7000
>> :::192.168.2.88:58447   ESTABLISHED
>> tcp    0  0 :::192.168.2.87:54986
>> :::192.168.2.88:7000    ESTABLISHED
>> tcp    0  0 :::192.168.2.87:59138
>> :::192.168.2.88:7000    ESTABLISHED
>> tcp    0  0 :::192.168.2.87:7000
>> :::192.168.2.88:39074   ESTABLISHED
>>
>> so connections end in CLOSE_WAIT should be newly created. (for streaming
>> ?) This seems related to streaming issues we suffered recently:
>> http://n2.nabble.com/busy-thread-on-IncomingStreamReader-td4908640.html
>>
>> I would like add some debug codes around opening and closing of socket to
>> find out what happend.
>>
>> Could you give me some hint, about what classes I should take look ?
>>
>>
>> On Tue, Apr 20, 2010 at 04:47, Jonathan Ellis  wrote:
>>>
>>> Is this after doing a bootstrap or other streaming operation?  Or did
>>> a node go down?
>>>
>>> The internal sockets are supposed to remain open, otherwise.
>>>
>>> On Mon, Apr 19, 2010 at 10:56 AM, Ingram Chen 
>>> wrote:
>>> > Thank your information.
>>> >
>>> > We do use connection pools with thrift client and ThriftAdress is on
>>> > port
>>> > 9160.
>>> >
>>> > Those problematic connections we found are all in port 7000, which is
>>> > internal communications port between
>>> > nodes. I guess this related to StreamingService.
>>> >
>>> > On Mon, Apr 19, 2010 at 23:46, Brandon Williams 
>>> > wrote:
>>> >>
>>> >> On Mon, Apr 19, 2010 at 10:27 AM, Ingram Chen 
>>> >> wrote:
>>> >>>
>>> >>> Hi all,
>>> >>>
>>> >>>     We have observed several connections between nodes in CLOSE_WAIT
>>> >>> after several hours of operation:
>>> >>
>>> >> This is symptomatic of not pooling your client connections correctly.
>>> >>  Be
>>> >> sure you're using one connection per thread, not one connection per
>>> >> operation.
>>> >> -Brandon
>>> >
>>> >
>>> > --
>>> > Ingram Chen
>>> > online share order: http://dinbendon.net
>>> > blog: http://www.javaworld.com.tw/roller/page/ingramchen
>>> >
>>
>>
>>
>> --
>> Ingram Chen
>> online share order: http://dinbendon.net
>> blog: http://www.javaworld.com.tw/roller/page/ingramchen
>
>
>
> --
> Ingram Chen
> online share order: http://dinbendon.net
> blog: http://www.javaworld.com.tw/roller/page/ingramchen
>


Re: Delete row

2010-04-21 Thread Jonathan Ellis
You can serialize any RowMutation for BMT but if all you're doing is
deleting rows why bother with BMT?  It is not significantly more
efficient than Thrift for that.
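
For the plain Thrift route, a minimal sketch against the 0.6 API (the keyspace
and column family names are made up here, and an already-connected
Cassandra.Client is assumed); a ColumnPath that only sets the column family
marks the whole row deleted:

    ColumnPath path = new ColumnPath();
    path.setColumn_family("Standard1");   // hypothetical CF; no super_column/column set => whole row
    long timestamp = System.currentTimeMillis() * 1000;   // microseconds by convention
    client.remove("Keyspace1", "key-to-delete", path, timestamp, ConsistencyLevel.QUORUM);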

On Tue, Apr 20, 2010 at 12:47 PM, Sonny Heer  wrote:
> How do i delete a row using BMT method?
>
> Do I simply do a mutate with column delete flag set to true?  Thanks.
>


Cassandra data model for financial data

2010-04-21 Thread Steve Lihn
Hi,
I am new to Cassandra. I would like to use Cassandra to store financial data
(time series) and have a question on the data model design.

The example here is the daily stock data. This would be a column family
called dailyStockData. The row key is the stock ticker.
Everyday there are attributes like closingPrice, volume, sharesOutstanding,
etc. that need to be stored. There seems to be two ways to model it:

Design 1: Each attribute is a super column. Therefore each date is a column.
So we have:

AAPL -> closingPrice -> { '2010-04-13' : 242, '2010-04-14': 245 }
AAPL -> volume -> { '2010-04-13' : 10.9m, '2010-04-14': 14.4m }
etc.

Design 2: Each date is a super column. Therefore each attribute is a column.
So we have:

AAPL -> '2010-04-13' -> { closingPrice -> 242, volume -> 10.9m }
AAPL -> '2010-04-14' -> {closingPrice -> 245, volume -> 14.4m }
etc.

The date column / superColumn will need an Order Preserving Partitioner since
we are going to do a lot of range queries. Examples are:
Query 1: Give me the data between date1 and date2 for a set of tickers (say,
the 100 tickers in QQQ).
Query 2: More often than not, the query is: Give me the data for the max
available dates (for each ticker) between date1 and date2 in a set of
tickers.
(Since not every day is traded, and we only want the most recent data, given
a range of dates.)

My questions are:
a. Is there any technical reason to prefer (or must choose) one rather than
the other between Design 1 and Design 2 ?
b. Are both queries possible (and comparable in speed) for the chosen design
?

Thanks,
Steve


Re: Modelling assets and user permissions

2010-04-21 Thread Jonathan Ellis
If you want to look up "what permissions does user X have on asset Y",
then I would model that as a row keyed by userid, containing
supercolumns named by asset ids, and containing subcolumns of the
permissions granted.
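
As a rough sketch of that layout with the 0.6 Thrift insert call (the keyspace
and column family names, the ids and the already-connected client below are
illustrative assumptions, and checked exceptions are elided):

    // Row key = user id, super column = asset id, subcolumn = permission name, empty value.
    String userId = "user-123";                        // hypothetical ids
    String assetId = "asset-9f2c";
    ColumnPath path = new ColumnPath();
    path.setColumn_family("Permissions");              // a CF declared with ColumnType="Super"
    path.setSuper_column(assetId.getBytes("UTF-8"));
    path.setColumn("read".getBytes("UTF-8"));          // the permission being granted
    client.insert("Keyspace1", userId, path, new byte[0],
                  System.currentTimeMillis() * 1000, ConsistencyLevel.QUORUM);

Checking or revoking a single permission is then a get or remove on the same
path, and concurrent grants of different permissions land in different
subcolumns, so they do not clobber each other the way whole-value updates would.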

On Mon, Apr 19, 2010 at 12:03 PM, tsuraan  wrote:
> Suppose I have a CF that holds some sort of assets that some users of
> my program have access to, and that some do not.  In SQL-ish terms it
> would look something like this:
>
> TABLE Assets (
>  asset_id serial primary key,
>  ...
> );
>
> TABLE Users (
>  user_id serial primary key,
>  user_name text
> );
>
> TABLE Permissions (
>  asset_id integer references(Assets),
>  user_id integer references(Users)
> )
>
> Now, I can generate UUIDs for my asset keys without any trouble, so
> the serial that I have in my pseudo-SQL Assets table isn't a problem.
> My problem is that I can't see a good way to model the relationship
> between user ids and assets.  I see one way to do this, which has
> problems, and I think I sort of see a second way.
>
> The obvious way to do it is have the Assets CF have a SuperColumn that
> somehow enumerates the users allowed to see it, so when retrieving a
> specific Asset I can retrieve the users list and ensure that the user
> doing the request is allowed to see it.  This has quite a few
> problems.  The foremost is that Cassandra doesn't appear to have much
> for conflict resolution (at least I can't find any docs on it), so if
> two processes try to add permissions to the same Asset, it looks like
> one process will win and I have no idea what happens to the loser.
> Another problem is that Cassandra's SuperColumns don't appear to be
> ideal for storing lists of things; they store maps, which isn't a
> terrible problem, but it feels like a bit of a mismatch in my design.
> A SuperColumn mapping from user_ids to an empty byte array seems like
> it should work pretty efficiently for checking whether a user has
> permissions on an Asset, but it also seems pretty evil.
>
> The other idea that I have is a seperate CF for AssetPermissions that
> somehow stores pairs of asset_ids and user_names.  I don't know what
> I'd use for a key in that situation, so I haven't really gotten too
> far in seeing what else is broken with that idea.  I think it would
> get around the race condition, but I don't know how to do it, and I'm
> not sure how efficient it could be.
>
> What do people normally use in this situation?  I assume it's a pretty
> common problem, but I haven't see it in the various data modelling
> examples on the Wiki.
>


Re: Cassandra 0.5.1 restarts slow

2010-04-21 Thread Jonathan Ellis
[moving to u...@]

0.6 fixes the problem of commitlog replay running faster than it can flush.

as for why it backs up in the first place before the restart, you can
either (a) throttle writes [set your timeout lower, make your clients
back off temporarily when it gets a timeoutexception] or (b) add
capacity.  (b) is recommended.

https://issues.apache.org/jira/browse/CASSANDRA-685 will mitigate this
but there is still no substitute for adding capacity to match demand.
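
On the client side, "back off temporarily" can be as simple as catching
TimedOutException and retrying after a pause; a rough sketch against the 0.6
Thrift API (the insert arguments, retry count and sleep times are placeholders,
and checked-exception plumbing is elided):

    int attempts = 0;
    while (true) {
        try {
            client.insert(keyspace, key, path, value, timestamp, ConsistencyLevel.QUORUM);
            break;                                 // write accepted
        } catch (TimedOutException toe) {
            if (++attempts >= 5) throw toe;        // give up after a few tries
            Thread.sleep(1000L * attempts);        // linear back-off takes pressure off the node
        }
    }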

On Tue, Apr 20, 2010 at 4:57 PM, Anthony Molinaro
 wrote:
> Hi,
>
>  I have a cassandra cluster where a couple things are happening.  Every
> once in a while a node will start to get backed up.  Checking tpstats I
> see a very large value for ROW-MUTATION-STAGE.  Sometimes it will be able
> to clear it if I give it enough time, other times the vm OOMs.  With some
> nodes I also see this happen during restarts, I'll restart and have to
> wait 6-12 hours for the node to not be marked as 'Down'.
> I've seen
> http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
> and ended up with the following settings.
>
> KeysCachedFraction            : 0.01
> MemtableSizeInMB              : 100
> MemtableObjectCountInMillions : 0.5
> Heap                          : -Xmx5G
>
> I only have 2 CFs in this instance and entries are small so in most cases
> I hit MemtableObjectCountInMillions first and total MemtableSizeInMB is
> about 60MB-120MB for the 2 CFs combined.
>
> Anyone have any pointers on where to look next?  These are m1.large EC2
> instances (I want to move to xlarge to get more memory, but haven't yet
> gotten clarification on the best process for node replacement, per my
> other thread).
>
> Thanks,
>
> -Anthony
>
> --
> 
> Anthony Molinaro                           
>


Re: Cassandra's bad behavior on disk failure

2010-04-21 Thread Jonathan Ellis
We have a ticket open for this:
https://issues.apache.org/jira/browse/CASSANDRA-809

Ideally I think we'd like to leave the node up to serve reads if a
disk is erroring out on writes but is still readable.  In my experience
this is very common when a disk first begins to fail, as well as in
the "disk is full" case where there is nothing actually wrong with the
disk per se.

On Wed, Apr 21, 2010 at 9:08 AM, Oleg Anastasjev  wrote:
> Hello,
>
> I am testing how cassandra behaves on single node disk failures to know what 
> to
> expect when things go bad.
> I had a cluster of 4 cassandra nodes, stress loaded it with client and made 2
> tests:
> 1. emulated disk failure of /data volume on read only stress test
> 2. emulated disk failure of /commitlog volumn on write intensive test
>
> 1. On read test with data volume down, a lot of
> "org.apache.thrift.TApplicationException: Internal error processing get_slice"
> was logged at client side. On cassandra server logged alot of IOExceptions
> reading every *.db file it has. Node continued to show as UP in ring.
>
> OK, the behavior is not ideal, but still can be worked around at client side,
> throwing out nodes as soon as TApplicationException is received from 
> cassandra.
>
> 2. Much worse was with write test:
> No exception was seen at client, writes are going through normally, but
> PERIODIC-COMMIT-LOG-SYNCER failed to sync commit logs, heap of node quickly
> became full and node freezed in GC loop. Still, it continued to show as UP in
> ring.
>
> This, i believe, is bad, because no quick workaround could be done at client
> side (no exceptions are coming from failed node) and in real system will lead 
> to
> dramatic slow down of the whole cluster, because clients, not knowing, that 
> node
> is actually dead, will direct 1/4th of requests to it and timeout.
>
> I think that more correct behavior here could be halting cassandra server on 
> any
> disk IO error, so clients can quickly detect this and failover to healthy
> servers.
>
> What do you think ?
>
> Did you guys experienced disk failure in production and how was it ?
>
>
>


Re: Cassandra data model for financial data

2010-04-21 Thread JKnight JKnight
I know Cassandra is very flexible.
a. Because a super_column cannot hold a large number of columns (its
subcolumns are not indexed and are read as a whole), you should not use design 1.
b. Maybe for each query you will have to separate the data into its own ColumnFamily.

On Wed, Apr 21, 2010 at 1:17 PM, Steve Lihn  wrote:

> Hi,
> I am new to Cassandra. I would like to use Cassandra to store financial
> data (time series). Have question on the data model design.
>
> The example here is the daily stock data. This would be a column family
> called dailyStockData. The raw key is stock ticker.
> Everyday there are attributes like closingPrice, volume, sharesOutstanding,
> etc. that need to be stored. There seems to be two ways to model it:
>
> Design 1: Each attribute is a super column. Therefore each date is a
> column. So we have:
>
> AAPL -> closingPrice -> { '2010-04-13' : 242, '2010-04-14': 245 }
> AAPL -> volume -> { '2010-04-13' : 10.9m, '2010-04-14': 14.4m }
> etc.
>
> Design 2: Each date is a super column. Therefore each attribute is a
> column. So we have:
>
> AAPL -> '2010-04-13' -> { closingPrice -> 242, volume -> 10.9m }
> AAPL -> '2010-04-14' -> {closingPrice -> 245, volume -> 14.4m }
> etc.
>
> The date column / superColumn will need Order Perserving Partitioner since
> we are going to do a lot of range queries. Examples are:
> Query 1: Give me the data between date1 and date2 for a set of tickers
> (say, the 100 tickers in QQQ).
> Query 2: More often than not, the query is: Give me the data for the max
> available dates (for each ticker) between date1 and date2 in a set of
> tickers.
> (Since not every day is traded, and we only want the most recent data,
> given a range of dates.)
>
> My questions are:
> a. Is there any technical reason to prefer (or must choose) one rather than
> the other between Design 1 and Design 2 ?
> b. Are both queries possible (and comparable in speed) for the chosen
> design ?
>
> Thanks,
> Steve
>
>
>
>
>
>
>


-- 
Best regards,
JKnight


Re: Cassandra 0.5.1 restarts slow

2010-04-21 Thread Anthony Molinaro

On Wed, Apr 21, 2010 at 12:21:31PM -0500, Jonathan Ellis wrote:
> [moving to u...@]
> 
> 0.6 fixes replaying faster than it can flush.

Yeah, I noticed some of those fixes, and will probably take the leap into
0.6 if I can keep my cluster running (it's not doing too bad, I do about
400K reads and 250K writes per minute spread over 23 nodes), however some
of the m1.large instances get into this backed up state frequently. 
So I need to keep the cluster running first.

> as for why it backs up in the first place before the restart, you can
> either (a) throttle writes [set your timeout lower, make your clients
> back off temporarily when it gets a timeoutexception]

What timeout is this?  Something in the thrift API or a cassandra
configuration?

> or (b) add capacity.  (b) is recommended.

Yeah, I've been doing that, adding xlarge instances with raid0 disks, which
work better, but I keep running into issues with the old instances, which
hold up this work.  I'll keep chugging along and hopefully get things
sorted.

-Anthony

> 
> https://issues.apache.org/jira/browse/CASSANDRA-685 will mitigate this
> but there is still no substitute for adding capacity to match demand.
> 
> On Tue, Apr 20, 2010 at 4:57 PM, Anthony Molinaro
>  wrote:
> > Hi,
> >
> >  I have a cassandra cluster where a couple things are happening.  Every
> > once in a while a node will start to get backed up.  Checking tpstats I
> > see a very large value for ROW-MUTATION-STAGE.  Sometimes it will be able
> > to clear it if I give it enough time, other times the vm OOMs.  With some
> > nodes I also see this happen during restarts, I'll restart and have to
> > wait 6-12 hours for the node to not be marked as 'Down'.
> > I've seen
> > http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
> > and ended up with the following settings.
> >
> > KeysCachedFraction            : 0.01
> > MemtableSizeInMB              : 100
> > MemtableObjectCountInMillions : 0.5
> > Heap                          : -Xmx5G
> >
> > I only have 2 CFs in this instance and entries are small so in most cases
> > I hit MemtableObjectCountInMillions first and total MemtableSizeInMB is
> > about 60MB-120MB for the 2 CFs combined.
> >
> > Anyone have any pointers on where to look next?  These are m1.large EC2
> > instances (I want to move to xlarge to get more memory, but haven't yet
> > gotten clarification on the best process for node replacement, per my
> > other thread).
> >
> > Thanks,
> >
> > -Anthony
> >
> > --
> > 
> > Anthony Molinaro                           
> >

-- 

Anthony Molinaro   


Re: Just to be clear, cassandra is web framework agnostic b/c of Thrift?

2010-04-21 Thread Mark Greene
I'll try to test this out tonight.

On Wed, Apr 21, 2010 at 1:07 PM, Jonathan Ellis  wrote:

> There is a patch attached to
> https://issues.apache.org/jira/browse/CASSANDRA-948 that needs
> volunteers to test.
>
> On Sun, Apr 18, 2010 at 11:13 PM, Mark Greene  wrote:
> > With the 0.6.0 release, the windows cassandra.bat file errors out.
> There's a
> > bug filed for this already. There's a README or something similar in the
> > install directory, that tells you the basic CLI operations and explains
> the
> > basic data model.
> >
> > On Sun, Apr 18, 2010 at 11:23 PM, S Ahmed  wrote:
> >>
> >> Interesting, I'm just finding windows to be a pain, particular starting
> up
> >> java apps. (I guess I just need to learn!)
> >> How exactly would you startup Cassandra on a windows machine? i.e when
> the
> >> server reboots, how will it run the java -jar cassandar ?
> >>
> >>
> >> On Sun, Apr 18, 2010 at 7:35 PM, Joe Stump  wrote:
> >>>
> >>> On Apr 18, 2010, at 5:33 PM, S Ahmed wrote:
> >>>
> >>> Obviously if you run asp.net on windows, it is probably a VERY good
> idea
> >>> to be running cassandra on a linux box.
> >>>
> >>> Actually, I'm not sure this is true. A few people have found Windows
> >>> performs fairly well with Cassandra, if I recall correctly. Obviously,
> all
> >>> of the testing and most of the bigger users are running on Linux
> though.
> >>> --Joe
> >
> >
>


Re: Cassandra 0.5.1 restarts slow

2010-04-21 Thread Jonathan Ellis
On Wed, Apr 21, 2010 at 12:45 PM, Anthony Molinaro
 wrote:
>> as for why it backs up in the first place before the restart, you can
>> either (a) throttle writes [set your timeout lower, make your clients
>> back off temporarily when it gets a timeoutexception]
>
> What timeout is this?  Something in the thrift API or a cassandra
> configuration?

the latter.  iirc it is "RPCTimeout"


Re: Import using cassandra 0.6.1

2010-04-21 Thread Jonathan Ellis
http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts

On Wed, Apr 21, 2010 at 12:02 PM, Sonny Heer  wrote:
> Currently running on a single node with intensive write operations.
>
>
> After running for a while...
>
> Client starts outputting:
>
> TimedOutException()
>        at 
> org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:12232)
>        at 
> org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:670)
>        at 
> org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:643)
>
> Cassandra starts outputting:
>
>  INFO 08:08:49,864 Cassandra starting up...
>  INFO 08:18:09,782 GC for ParNew: 238 ms, 30728008 reclaimed leaving
> 220554976 used; max is 1190723584
>  INFO 08:18:19,782 GC for ParNew: 231 ms, 30657944 reclaimed leaving
> 230245792 used; max is 1190723584
>  INFO 08:18:39,782 GC for ParNew: 229 ms, 30567184 reclaimed leaving
> 250127792 used; max is 1190723584
>  INFO 08:18:59,782 GC for ParNew: 358 ms, 46416776 reclaimed leaving
> 261657720 used; max is 1190723584
>  INFO 08:19:09,782 GC for ParNew: 205 ms, 46331376 reclaimed leaving
> 273764040 used; max is 1190723584
>  INFO 08:19:19,783 GC for ParNew: 335 ms, 46354968 reclaimed leaving
> 282912656 used; max is 1190723584
>  INFO 08:19:30,064 GC for ParNew: 392 ms, 46403400 reclaimed leaving
> 294861824 used; max is 1190723584
>  INFO 08:19:40,065 GC for ParNew: 326 ms, 4639 reclaimed leaving
> 304045640 used; max is 1190723584
>  INFO 08:19:50,064 GC for ParNew: 256 ms, 46460824 reclaimed leaving
> 312964344 used; max is 1190723584
>  INFO 08:20:00,065 GC for ParNew: 242 ms, 46357104 reclaimed leaving
> 324961320 used; max is 1190723584
>  INFO 08:20:20,065 GC for ParNew: 336 ms, 46447216 reclaimed leaving
> 345874144 used; max is 1190723584
>  INFO 08:20:27,357 Creating new commitlog segment
> /var/lib/cassandra/commitlog/CommitLog-1271866827357.log
>  INFO 08:20:40,065 GC for ParNew: 265 ms, 46509608 reclaimed leaving
> 366587984 used; max is 1190723584
>  INFO 08:21:00,069 GC for ParNew: 321 ms, 46478736 reclaimed leaving
> 384059832 used; max is 1190723584
>  INFO 08:21:10,069 GC for ParNew: 223 ms, 62235000 reclaimed leaving
> 383631432 used; max is 1190723584
>  INFO 08:21:30,069 GC for ParNew: 291 ms, 62261104 reclaimed leaving
> 399697888 used; max is 1190723584
>  INFO 08:21:50,069 GC for ParNew: 245 ms, 62275528 reclaimed leaving
> 415428952 used; max is 1190723584
>  INFO 08:22:10,248 GC for ParNew: 384 ms, 62219264 reclaimed leaving
> 433542656 used; max is 1190723584
>  INFO 08:22:30,248 GC for ParNew: 215 ms, 62363608 reclaimed leaving
> 452030560 used; max is 1190723584
>  INFO 08:22:40,248 GC for ParNew: 318 ms, 62104552 reclaimed leaving
> 464013992 used; max is 1190723584
>  INFO 08:22:51,039 GC for ParNew: 845 ms, 62218296 reclaimed leaving
> 471978840 used; max is 1190723584
>  INFO 08:23:01,040 GC for ParNew: 474 ms, 62258080 reclaimed leaving
> 475912120 used; max is 1190723584
>  INFO 08:23:11,040 GC for ParNew: 738 ms, 62265328 reclaimed leaving
> 483742344 used; max is 1190723584
>  INFO 08:23:21,040 GC for ParNew: 306 ms, 62218648 reclaimed leaving
> 491761672 used; max is 1190723584
>  INFO 08:23:41,041 GC for ParNew: 279 ms, 62187536 reclaimed leaving
> 507442800 used; max is 1190723584
>  INFO 08:24:01,041 GC for ParNew: 557 ms, 62310784 reclaimed leaving
> 523028304 used; max is 1190723584
>  INFO 08:24:11,041 GC for ParNew: 221 ms, 62268456 reclaimed leaving
> 530865568 used; max is 1190723584
>  INFO 08:24:21,041 GC for ParNew: 334 ms, 62258720 reclaimed leaving
> 542690216 used; max is 1190723584
>  INFO 08:24:31,042 GC for ParNew: 262 ms, 62218624 reclaimed leaving
> 550728232 used; max is 1190723584
>  INFO 08:24:51,045 GC for ParNew: 640 ms, 62235952 reclaimed leaving
> 573981584 used; max is 1190723584
>  INFO 08:25:01,045 GC for ParNew: 309 ms, 62138776 reclaimed leaving
> 563891472 used; max is 1190723584
>  INFO 08:25:11,046 GC for ParNew: 242 ms, 62255952 reclaimed leaving
> 575756040 used; max is 1190723584
>  INFO 08:25:21,047 GC for ParNew: 326 ms, 62264432 reclaimed leaving
> 583631432 used; max is 1190723584
>  INFO 08:25:31,047 GC for ParNew: 591 ms, 62231816 reclaimed leaving
> 595405816 used; max is 1190723584
>  INFO 08:25:41,048 GC for ParNew: 478 ms, 62186088 reclaimed leaving
> 603389432 used; max is 1190723584
>  INFO 08:25:51,049 GC for ParNew: 409 ms, 62264832 reclaimed leaving
> 615150584 used; max is 1190723584
>  INFO 08:26:01,049 GC for ParNew: 416 ms, 62189952 reclaimed leaving
> 623125104 used; max is 1190723584
>  INFO 08:26:11,049 GC for ParNew: 430 ms, 62382056 reclaimed leaving
> 634661008 used; max is 1190723584
>  INFO 08:26:21,094 GC for ParNew: 436 ms, 62319088 reclaimed leaving
> 646272840 used; max is 1190723584
>  INFO 08:26:31,094 GC for ParNew: 404 ms, 62379896 reclaimed leaving
> 653978688 used; max is 1190723584
>  INFO 08:26:41,094 GC for ParNew: 568 ms, 62407112 reclaimed leaving
> 6654627

Should I use Cassandra for general purpose DB?

2010-04-21 Thread Soichi Hayashi
Hi.

So, I am interested in using Cassandra not because of large amount of data,
but because of following reasons.

1) It's easy to administrate and handle fail-over (and scale, of course)
2) Easy to write an application that makes sense to developers (developers are
fully in control of how data is orchestrated - indexed, queried, etc.)
3) Easy to expand an application to some extent - as long as the changes only
apply to adding/removing columns (not column families)

Are these good enough reasons to start experimenting with Cassandra as a
general purpose data store? Or does Cassandra, or any NoSQL solution, really
make no sense if you don't have or expect to have TBs of data?

For bullet 3) above: if I have 100 nodes that run Cassandra, and I want to
add a new table (i.e. ColumnFamily), does that mean I have to update storage.xml
on all 100 nodes and restart them? For example, if a user wants me to add the
capability to sort "stuff" in ways that I haven't supported yet, I might
have to do the following.

1. Create a new ColumnFamily that orders "stuff" based on a new foreign key
currently stored inside one of the columns of "stuff".
2. Populate this new ColumnFamily based on all "stuff" records that currently
exist.
3. Update the application to access this new ColumnFamily for the new sort
options.
4. Update the application so that every time "stuff" is added or removed, it
also updates this new ColumnFamily.
5. Update the storage.xml on ALL nodes in the cluster and restart them!

If I use a regular DB, I only have to do step 3. Does this mean that, unless I
have some *very* stable application where no such user requirement could arise,
I should stick to using a regular DB? If that is the case, Cassandra only
makes sense in the special case where the size of the data simply does not work
for a regular DB (meaning: if data size is not an issue, stick to a regular DB).

Thanks,
Soichi


Re: Cassandra data model for financial data

2010-04-21 Thread Miguel Verde
On Wed, Apr 21, 2010 at 12:17 PM, Steve Lihn  wrote:

> [...]



> Design 1: Each attribute is a super column. Therefore each date is a
> column. So we have:
>
> AAPL -> closingPrice -> { '2010-04-13' : 242, '2010-04-14': 245 }
> AAPL -> volume -> { '2010-04-13' : 10.9m, '2010-04-14': 14.4m }
> etc.
>
I would suggest not using this design, as each query involving an attribute
will pull all dates for that attribute into memory on the server.  i.e.
getting the closingPrice for AAPL on '2010-04-13' would pull all closing
prices for AAPL across all dates into memory.


>
> Design 2: Each date is a super column. Therefore each attribute is a
> column. So we have:
>
> AAPL -> '2010-04-13' -> { closingPrice -> 242, volume -> 10.9m }
> AAPL -> '2010-04-14' -> {closingPrice -> 245, volume -> 14.4m }
> etc.
>
> The date column / superColumn will need Order Perserving Partitioner since
> we are going to do a lot of range queries.


Partitioners split up keys between nodes; the partitioner you use has no
effect on your ability to query columns within a row.


> Examples are:
> Query 1: Give me the data between date1 and date2 for a set of tickers
> (say, the 100 tickers in QQQ).
>
You could use http://wiki.apache.org/cassandra/API#multiget_slice for this.


> Query 2: More often than not, the query is: Give me the data for the max
> available dates (for each ticker) between date1 and date2 in a set of
> tickers.
> (Since not every day is traded, and we only want the most recent data,
> given a range of dates.)
>
A http://wiki.apache.org/cassandra/API#SliceRange allows you to specify
limits and ordering for columns you are slicing.
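
To make that concrete, a sketch of Query 2 under Design 2 with the 0.6 Thrift
API (the keyspace name, consistency level and connected client are assumptions;
it also assumes the super column comparator sorts the date strings lexically,
e.g. UTF8Type):

    // Super columns are named by 'YYYY-MM-DD' strings, so a reversed slice of
    // count 1 returns the most recent traded day within [date1, date2] per ticker.
    SliceRange range = new SliceRange();
    range.setStart("2010-04-14".getBytes("UTF-8"));    // date2 first, because the slice is reversed
    range.setFinish("2010-04-01".getBytes("UTF-8"));   // date1
    range.setReversed(true);
    range.setCount(1);

    SlicePredicate predicate = new SlicePredicate();
    predicate.setSlice_range(range);

    ColumnParent parent = new ColumnParent();
    parent.setColumn_family("dailyStockData");

    Map<String, List<ColumnOrSuperColumn>> latest = client.multiget_slice(
            "Keyspace1", Arrays.asList("AAPL", "GOOG"), parent, predicate,
            ConsistencyLevel.QUORUM);

Query 1 is the same call with reversed set to false and a larger count.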


Re: Should I use Cassandra for general purpose DB?

2010-04-21 Thread Miguel Verde
On Wed, Apr 21, 2010 at 12:56 PM, Soichi Hayashi wrote:

> So, I am interested in using Cassandra not because of large amount of data,
> but because of following reasons.
>
> 1) It's easy to administrate and handle fail-over (and scale, of course)
> 2) Easy to write an application that makes sense to developers (Developers'
> fully in control of how data is orchestrated - indexed, queried, etc..)
> 3) Easy to expand an application to some extend - as long as changes only
> applies to adding /removing new column (not column family..)
>
> Are these good enough reasons to start experimenting with Cassandra as a
> general purpose data store? Or Cassandra, or any NOSQL solution really makes
> no sense if you don't have or expect to have TB of data?
>
You don't need a good reason to experiment, go for it!  Those are all
accurate points in Cassandra's favor. There are many potential arguments
about actually adopting such a solution for production use, but personally
if I didn't have or foresee scalability or availability problems Cassandra
would not be my choice.


> For bullet 3) above.. If I have 100 nodes that runs Cassandra, and want to
> add a new table (..ColumnFamily) does that mean I have to update storage.xml
> on all 100 nodes and restart them?


Currently, yes.  You can do a rolling restart, so the cluster remains up the
whole time, but the nodes would need to be restarted.  However, 0.7 will
include https://issues.apache.org/jira/browse/CASSANDRA-44 (live schema
updates), and this problem will finally go away.


Re: Cassandra 0.5.1 restarts slow

2010-04-21 Thread Anthony Molinaro
On Wed, Apr 21, 2010 at 12:52:32PM -0500, Jonathan Ellis wrote:
> On Wed, Apr 21, 2010 at 12:45 PM, Anthony Molinaro
>  wrote:
> >> as for why it backs up in the first place before the restart, you can
> >> either (a) throttle writes [set your timeout lower, make your clients
> >> back off temporarily when it gets a timeoutexception]
> >
> > What timeout is this?  Something in the thrift API or a cassandra
> > configuration?
> 
> the latter.  iirc it is "RPCTimeout"

Interesting, in the config I see

  <RpcTimeoutInMillis>5000</RpcTimeoutInMillis>

So I thought that timeout was for inter-node communication not the thrift
API, but I see how you probably consider both inter-node traffic and
thrift traffic as clients.  Does this RPC Timeout apply to both?

Somewhat off-topic but relating to timeouts, is there any plans to tune
the timeouts for Gossip nodes?  EC2 network is horribly flakey, and I
often see node go Dead, the come back a few seconds later, so just
wondering if there's a way to tune the check to occur less frequently?

-Anthony

-- 

Anthony Molinaro   


Re: Cassandra 0.5.1 restarts slow

2010-04-21 Thread Jonathan Ellis
On Wed, Apr 21, 2010 at 1:11 PM, Anthony Molinaro
 wrote:
> Interesting, in the config I see
>
>   <RpcTimeoutInMillis>5000</RpcTimeoutInMillis>
>
> So I thought that timeout was for inter-node communication not the thrift
> API, but I see how you probably consider both inter-node traffic and
> thrift traffic as clients.  Does this RPC Timeout apply to both?

rpctimeout applies to internal messages but if an operation times out
at that level a Thrift exception will be passed to the client.

> Somewhat off-topic but relating to timeouts, is there any plans to tune
> the timeouts for Gossip nodes?  EC2 network is horribly flakey, and I
> often see node go Dead, the come back a few seconds later, so just
> wondering if there's a way to tune the check to occur less frequently?

increase failuredetector.phiConvictThreshold.


Re: Clarification on Ring operations in Cassandra 0.5.1

2010-04-21 Thread Anthony Molinaro

On Wed, Apr 21, 2010 at 12:05:07PM -0500, Jonathan Ellis wrote:
> On Wed, Apr 21, 2010 at 11:31 AM, Anthony Molinaro
>  wrote:
> >
> > On Wed, Apr 21, 2010 at 11:08:19AM -0500, Jonathan Ellis wrote:
> >> Yes, that looks right, where "token really close" means "slightly less
> >> than" (more than would move it into a different node's range).
> >
> > Is it better to go slightly less than (say Token - 1), or slightly more than
> > the beginning of the range (PreviousTokenInRing + 1).  I was assuming the
> > latter in my earlier email, but you seem to be suggesting the former?
> 
> Right, the former.

So why is Token - 1 better?  Doesn't that result in more data movement
than PreviousTokenInRing + 1?

> > Right, I was mostly wondering if I could speed things up by scping the
> > sstables while the system was running (since they shouldn't be changing).
> > Then in quick succession removetoken and bootstrap with the old token.
> > Probably grasping at straws here :b
> 
> Nope, bootstrap ignores any local data.
> 
> You could use scp-then-repair if you can tolerate slightly out of date
> data being served by the new machine until the repair finishes.

So with scp-then-repair, what would my config look like?  Would I specify
the InitialToken as the same as the old token, but have AutoBootstrap
set to false?  I guess this is interesting to me because I could do something
where I migrate my data on a running server to an attached ebs, then
after it's synced, detach and re-attach to the new machine.

Anyway, thanks for discussing the possibilities,

-Anthony

-- 

Anthony Molinaro   


Re: Cassandra 0.5.1 restarts slow

2010-04-21 Thread Anthony Molinaro
On Wed, Apr 21, 2010 at 01:24:45PM -0500, Jonathan Ellis wrote:
> On Wed, Apr 21, 2010 at 1:11 PM, Anthony Molinaro
>  wrote:
> > Interesting, in the config I see
> >
> >   <RpcTimeoutInMillis>5000</RpcTimeoutInMillis>
> >
> > So I thought that timeout was for inter-node communication not the thrift
> > API, but I see how you probably consider both inter-node traffic and
> > thrift traffic as clients.  Does this RPC Timeout apply to both?
> 
> rpctimeout applies to internal messages but if an operation times out
> at that level a Thrift exception will be passed to the client.

Ahh, I see, basically percolates back up the call chain.

> > Somewhat off-topic but relating to timeouts, is there any plans to tune
> > the timeouts for Gossip nodes?  EC2 network is horribly flakey, and I
> > often see node go Dead, the come back a few seconds later, so just
> > wondering if there's a way to tune the check to occur less frequently?
> 
> increase failuredetector.phiConvictThreshold.

Is that a property? (ie, do I set it with -Dfailuredetector.phiConvictThreshold)
What is the unit?  Are there other super secret properties that might
be useful for tuning?

Thanks,

-Anthony

-- 

Anthony Molinaro   


Problem using get_range_slices

2010-04-21 Thread Guilherme Kaster
I've encountered a problem on Cassandra 0.6 while using get_range_slices.
I use RP, and when I use get_range_slices the keys are not returned in an
"ordered" manner; that means the last key in the list is not always the
"greatest" key in the list, so I started getting repetitions and ONCE entered
an infinite loop because the last key in the list was the same start key
I used (keys are inclusive). So I dug a little into the Cassandra code, found
where keys are converted to tokens in RP (FBUtilities.hash), and started
using tokens, which are not inclusive, starting with token 0. That solved
one of my problems, but the keys were still not returned in ascending
order. I converted them to tokens, but the tokens are also not in order. So I
now use the "greatest" token (converted from key) found in the returned list
as the "new" start token for the next iteration of get_range_slices.

Is that correct? Has anyone else had the same problem?

-- 
Guilherme L. Kaster
Auspex Desenvolvimento de Negócios em Tecnologia Ltda.


Re: Problem using get_range_slices

2010-04-21 Thread Jonathan Ellis
On Wed, Apr 21, 2010 at 2:19 PM, Guilherme Kaster
 wrote:
> I've encountered a problem on cassandra 0.6 while using get_ranged_slices.
> I use RP and when I use get_range_slices the keys are not returned in an
> "ordered" maner, that means the last key on the list not always the
> "greater" key in the list, so I started getting repetitions and ONCE entered
> in an infinite loop because the last key on the list was the same start key
> I used (keys are inclusive). So a dig a little in cassandra code and found
> where keys are converted to tokens in RP (FBUtilities.hash) and started
> using tokens, which are not inclusive, starting with token 0. That solved
> one of my problems. But the keys where still not returned ascending in
> order. I converted them to tokens but the tokens are also not in order. So I
> now use as the "new" start token of the next iteration of get_range_slices
> the "greater" token (converted from key) found in the returned list.
> Is that correct? Has anyone else had the same problem?

Right, everything is working as designed.  If you want keys ordered
you have to use OPP.

You can use "start with the key that was last in the previous
iteration" with keys, you don't have to drop down to tokens for that.

-Jonathan
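
A minimal sketch of that paging pattern, with a hypothetical fetchKeys(startKey, count)
wrapper standing in for the actual get_range_slices call (the Thrift types and
connection setup are omitted): each page starts at the last key of the previous page,
that repeated key is dropped, and iteration stops once a page comes back short.

    import java.util.ArrayList;
    import java.util.List;

    public class KeyPager
    {
        // Hypothetical wrapper around get_range_slices: returns up to 'count' keys
        // starting at 'startKey', in the cluster's random-but-consistent key order.
        public interface KeySource
        {
            List<String> fetchKeys(String startKey, int count) throws Exception;
        }

        // Collect every key by paging: reuse the last key of each page as the start
        // of the next request, and drop that repeated key from every later page.
        public static List<String> allKeys(KeySource source, int pageSize) throws Exception
        {
            List<String> result = new ArrayList<String>();
            String start = "";                           // empty start key = beginning of the range
            while (true)
            {
                List<String> page = source.fetchKeys(start, pageSize);
                if (page.isEmpty())
                    break;
                int from = result.isEmpty() ? 0 : 1;     // skip the key we already have
                for (int i = from; i < page.size(); i++)
                    result.add(page.get(i));
                if (page.size() < pageSize)              // short page: we reached the end
                    break;
                start = page.get(page.size() - 1);
            }
            return result;
        }
    }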


CassandraLimitations

2010-04-21 Thread Bill de hOra
http://wiki.apache.org/cassandra/CassandraLimitations has good coverage 
on the limits around columns.


Are there any design (or practical) limits to the number of rows a 
keyspace can have?


Bill


Re: CassandraLimitations

2010-04-21 Thread Jonathan Ellis
No.

On Wed, Apr 21, 2010 at 2:58 PM, Bill de hOra  wrote:
> http://wiki.apache.org/cassandra/CassandraLimitations has good coverage on
> the limits around columns.
>
> Are there are design (or practical) limits to the number of rows a keyspace
> can have?
>
> Bill
>


Re: CassandraLimitations

2010-04-21 Thread Mark Greene
Hey Bill,

Are you asking if there are limits in the context of a single node or a ring
of nodes?

On Wed, Apr 21, 2010 at 3:58 PM, Bill de hOra  wrote:

> http://wiki.apache.org/cassandra/CassandraLimitations has good coverage on
> the limits around columns.
>
> Are there are design (or practical) limits to the number of rows a keyspace
> can have?
>
> Bill
>


unsubscribe

2010-04-21 Thread Jennifer Huynh
Does anyone know how to unsubscribe from the mailing list? I tried emailing the
server, user-unsubcr...@cassandra.apache.org, and had no luck.

Thanks in advance!!!





Re: Import using cassandra 0.6.1

2010-04-21 Thread Sonny Heer
Note: I'm using the Thrift API to insert.  The commitLog directory
continues to grow.  The heap size continues to grow as well.

I decreased MemtableSizeInMB, but noticed no change.  Any idea
what is causing this, and/or what property I need to tweak to
alleviate this?  What is the "insert threshold"?

I moved to a more powerful node as well; it still ended up failing,
just after a longer period.

On Wed, Apr 21, 2010 at 10:53 AM, Jonathan Ellis  wrote:
> http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
>
> On Wed, Apr 21, 2010 at 12:02 PM, Sonny Heer  wrote:
>> Currently running on a single node with intensive write operations.
>>
>>
>> After running for a while...
>>
>> Client starts outputting:
>>
>> TimedOutException()
>>        at 
>> org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:12232)
>>        at 
>> org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:670)
>>        at 
>> org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:643)
>>
>> Cassandra starts outputting:
>>
>>  INFO 08:08:49,864 Cassandra starting up...
>>  INFO 08:18:09,782 GC for ParNew: 238 ms, 30728008 reclaimed leaving
>> 220554976 used; max is 1190723584
>>  INFO 08:18:19,782 GC for ParNew: 231 ms, 30657944 reclaimed leaving
>> 230245792 used; max is 1190723584
>>  INFO 08:18:39,782 GC for ParNew: 229 ms, 30567184 reclaimed leaving
>> 250127792 used; max is 1190723584
>>  INFO 08:18:59,782 GC for ParNew: 358 ms, 46416776 reclaimed leaving
>> 261657720 used; max is 1190723584
>>  INFO 08:19:09,782 GC for ParNew: 205 ms, 46331376 reclaimed leaving
>> 273764040 used; max is 1190723584
>>  INFO 08:19:19,783 GC for ParNew: 335 ms, 46354968 reclaimed leaving
>> 282912656 used; max is 1190723584
>>  INFO 08:19:30,064 GC for ParNew: 392 ms, 46403400 reclaimed leaving
>> 294861824 used; max is 1190723584
>>  INFO 08:19:40,065 GC for ParNew: 326 ms, 4639 reclaimed leaving
>> 304045640 used; max is 1190723584
>>  INFO 08:19:50,064 GC for ParNew: 256 ms, 46460824 reclaimed leaving
>> 312964344 used; max is 1190723584
>>  INFO 08:20:00,065 GC for ParNew: 242 ms, 46357104 reclaimed leaving
>> 324961320 used; max is 1190723584
>>  INFO 08:20:20,065 GC for ParNew: 336 ms, 46447216 reclaimed leaving
>> 345874144 used; max is 1190723584
>>  INFO 08:20:27,357 Creating new commitlog segment
>> /var/lib/cassandra/commitlog/CommitLog-1271866827357.log
>>  INFO 08:20:40,065 GC for ParNew: 265 ms, 46509608 reclaimed leaving
>> 366587984 used; max is 1190723584
>>  INFO 08:21:00,069 GC for ParNew: 321 ms, 46478736 reclaimed leaving
>> 384059832 used; max is 1190723584
>>  INFO 08:21:10,069 GC for ParNew: 223 ms, 62235000 reclaimed leaving
>> 383631432 used; max is 1190723584
>>  INFO 08:21:30,069 GC for ParNew: 291 ms, 62261104 reclaimed leaving
>> 399697888 used; max is 1190723584
>>  INFO 08:21:50,069 GC for ParNew: 245 ms, 62275528 reclaimed leaving
>> 415428952 used; max is 1190723584
>>  INFO 08:22:10,248 GC for ParNew: 384 ms, 62219264 reclaimed leaving
>> 433542656 used; max is 1190723584
>>  INFO 08:22:30,248 GC for ParNew: 215 ms, 62363608 reclaimed leaving
>> 452030560 used; max is 1190723584
>>  INFO 08:22:40,248 GC for ParNew: 318 ms, 62104552 reclaimed leaving
>> 464013992 used; max is 1190723584
>>  INFO 08:22:51,039 GC for ParNew: 845 ms, 62218296 reclaimed leaving
>> 471978840 used; max is 1190723584
>>  INFO 08:23:01,040 GC for ParNew: 474 ms, 62258080 reclaimed leaving
>> 475912120 used; max is 1190723584
>>  INFO 08:23:11,040 GC for ParNew: 738 ms, 62265328 reclaimed leaving
>> 483742344 used; max is 1190723584
>>  INFO 08:23:21,040 GC for ParNew: 306 ms, 62218648 reclaimed leaving
>> 491761672 used; max is 1190723584
>>  INFO 08:23:41,041 GC for ParNew: 279 ms, 62187536 reclaimed leaving
>> 507442800 used; max is 1190723584
>>  INFO 08:24:01,041 GC for ParNew: 557 ms, 62310784 reclaimed leaving
>> 523028304 used; max is 1190723584
>>  INFO 08:24:11,041 GC for ParNew: 221 ms, 62268456 reclaimed leaving
>> 530865568 used; max is 1190723584
>>  INFO 08:24:21,041 GC for ParNew: 334 ms, 62258720 reclaimed leaving
>> 542690216 used; max is 1190723584
>>  INFO 08:24:31,042 GC for ParNew: 262 ms, 62218624 reclaimed leaving
>> 550728232 used; max is 1190723584
>>  INFO 08:24:51,045 GC for ParNew: 640 ms, 62235952 reclaimed leaving
>> 573981584 used; max is 1190723584
>>  INFO 08:25:01,045 GC for ParNew: 309 ms, 62138776 reclaimed leaving
>> 563891472 used; max is 1190723584
>>  INFO 08:25:11,046 GC for ParNew: 242 ms, 62255952 reclaimed leaving
>> 575756040 used; max is 1190723584
>>  INFO 08:25:21,047 GC for ParNew: 326 ms, 62264432 reclaimed leaving
>> 583631432 used; max is 1190723584
>>  INFO 08:25:31,047 GC for ParNew: 591 ms, 62231816 reclaimed leaving
>> 595405816 used; max is 1190723584
>>  INFO 08:25:41,048 GC for ParNew: 478 ms, 62186088 reclaimed leaving
>> 603389432 used; max is 1190723584
>>  INFO 08:25:51,049 GC for ParNew: 409 ms, 62264832 recla

Re: Import using cassandra 0.6.1

2010-04-21 Thread Jonathan Ellis
you need to figure out where the memory is going.  check tpstats, if
the pending ops are large somewhere that means you're just generating
insert ops faster than it can handle.
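
This does not address the memory growth itself, but here is a hedged sketch of how a
bulk-loading client can treat the TimedOutException above as back-pressure rather than
retrying immediately; insertOnce() is a hypothetical stand-in for the real Thrift
insert call, and the backoff numbers are only illustrative.

    import java.util.concurrent.TimeUnit;

    public class BackoffInsert
    {
        // Hypothetical single insert; in practice this would be the Thrift
        // client.insert(...) call that throws TimedOutException under load.
        public interface InsertOp
        {
            void insertOnce() throws Exception;
        }

        // Treat a timeout as back-pressure: wait, then retry, doubling the pause
        // each time instead of hammering the node with the next insert right away.
        public static void insertWithBackoff(InsertOp op, int maxRetries) throws Exception
        {
            long sleepMillis = 100;                               // assumed starting backoff
            for (int attempt = 0; ; attempt++)
            {
                try
                {
                    op.insertOnce();
                    return;
                }
                catch (Exception timedOut)                        // e.g. TimedOutException
                {
                    if (attempt >= maxRetries)
                        throw timedOut;
                    TimeUnit.MILLISECONDS.sleep(sleepMillis);
                    sleepMillis = Math.min(sleepMillis * 2, 5000); // cap the backoff
                }
            }
        }
    }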

On Wed, Apr 21, 2010 at 4:07 PM, Sonny Heer  wrote:
> note: I'm using the Thrift API to insert.  The commitLog directory
> continues to grow.  The heap size continues to grow as well.
>
> I decreased MemtableSizeInMB size, but noticed no changes.  Any idea
> what is causing this, and/or what property i need to tweek to
> alleviate this?  What is the "insert threshold"?
>
> I moved to a more powerful node as well, it still ended up failing
> just after a longer period.
>
> On Wed, Apr 21, 2010 at 10:53 AM, Jonathan Ellis  wrote:
>> http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
>>
>> On Wed, Apr 21, 2010 at 12:02 PM, Sonny Heer  wrote:
>>> Currently running on a single node with intensive write operations.
>>>
>>>
>>> After running for a while...
>>>
>>> Client starts outputting:
>>>
>>> TimedOutException()
>>>        at 
>>> org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:12232)
>>>        at 
>>> org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:670)
>>>        at 
>>> org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:643)
>>>
>>> Cassandra starts outputting:
>>>
>>>  INFO 08:08:49,864 Cassandra starting up...
>>>  INFO 08:18:09,782 GC for ParNew: 238 ms, 30728008 reclaimed leaving
>>> 220554976 used; max is 1190723584
>>>  INFO 08:18:19,782 GC for ParNew: 231 ms, 30657944 reclaimed leaving
>>> 230245792 used; max is 1190723584
>>>  INFO 08:18:39,782 GC for ParNew: 229 ms, 30567184 reclaimed leaving
>>> 250127792 used; max is 1190723584
>>>  INFO 08:18:59,782 GC for ParNew: 358 ms, 46416776 reclaimed leaving
>>> 261657720 used; max is 1190723584
>>>  INFO 08:19:09,782 GC for ParNew: 205 ms, 46331376 reclaimed leaving
>>> 273764040 used; max is 1190723584
>>>  INFO 08:19:19,783 GC for ParNew: 335 ms, 46354968 reclaimed leaving
>>> 282912656 used; max is 1190723584
>>>  INFO 08:19:30,064 GC for ParNew: 392 ms, 46403400 reclaimed leaving
>>> 294861824 used; max is 1190723584
>>>  INFO 08:19:40,065 GC for ParNew: 326 ms, 4639 reclaimed leaving
>>> 304045640 used; max is 1190723584
>>>  INFO 08:19:50,064 GC for ParNew: 256 ms, 46460824 reclaimed leaving
>>> 312964344 used; max is 1190723584
>>>  INFO 08:20:00,065 GC for ParNew: 242 ms, 46357104 reclaimed leaving
>>> 324961320 used; max is 1190723584
>>>  INFO 08:20:20,065 GC for ParNew: 336 ms, 46447216 reclaimed leaving
>>> 345874144 used; max is 1190723584
>>>  INFO 08:20:27,357 Creating new commitlog segment
>>> /var/lib/cassandra/commitlog/CommitLog-1271866827357.log
>>>  INFO 08:20:40,065 GC for ParNew: 265 ms, 46509608 reclaimed leaving
>>> 366587984 used; max is 1190723584
>>>  INFO 08:21:00,069 GC for ParNew: 321 ms, 46478736 reclaimed leaving
>>> 384059832 used; max is 1190723584
>>>  INFO 08:21:10,069 GC for ParNew: 223 ms, 62235000 reclaimed leaving
>>> 383631432 used; max is 1190723584
>>>  INFO 08:21:30,069 GC for ParNew: 291 ms, 62261104 reclaimed leaving
>>> 399697888 used; max is 1190723584
>>>  INFO 08:21:50,069 GC for ParNew: 245 ms, 62275528 reclaimed leaving
>>> 415428952 used; max is 1190723584
>>>  INFO 08:22:10,248 GC for ParNew: 384 ms, 62219264 reclaimed leaving
>>> 433542656 used; max is 1190723584
>>>  INFO 08:22:30,248 GC for ParNew: 215 ms, 62363608 reclaimed leaving
>>> 452030560 used; max is 1190723584
>>>  INFO 08:22:40,248 GC for ParNew: 318 ms, 62104552 reclaimed leaving
>>> 464013992 used; max is 1190723584
>>>  INFO 08:22:51,039 GC for ParNew: 845 ms, 62218296 reclaimed leaving
>>> 471978840 used; max is 1190723584
>>>  INFO 08:23:01,040 GC for ParNew: 474 ms, 62258080 reclaimed leaving
>>> 475912120 used; max is 1190723584
>>>  INFO 08:23:11,040 GC for ParNew: 738 ms, 62265328 reclaimed leaving
>>> 483742344 used; max is 1190723584
>>>  INFO 08:23:21,040 GC for ParNew: 306 ms, 62218648 reclaimed leaving
>>> 491761672 used; max is 1190723584
>>>  INFO 08:23:41,041 GC for ParNew: 279 ms, 62187536 reclaimed leaving
>>> 507442800 used; max is 1190723584
>>>  INFO 08:24:01,041 GC for ParNew: 557 ms, 62310784 reclaimed leaving
>>> 523028304 used; max is 1190723584
>>>  INFO 08:24:11,041 GC for ParNew: 221 ms, 62268456 reclaimed leaving
>>> 530865568 used; max is 1190723584
>>>  INFO 08:24:21,041 GC for ParNew: 334 ms, 62258720 reclaimed leaving
>>> 542690216 used; max is 1190723584
>>>  INFO 08:24:31,042 GC for ParNew: 262 ms, 62218624 reclaimed leaving
>>> 550728232 used; max is 1190723584
>>>  INFO 08:24:51,045 GC for ParNew: 640 ms, 62235952 reclaimed leaving
>>> 573981584 used; max is 1190723584
>>>  INFO 08:25:01,045 GC for ParNew: 309 ms, 62138776 reclaimed leaving
>>> 563891472 used; max is 1190723584
>>>  INFO 08:25:11,046 GC for ParNew: 242 ms, 62255952 reclaimed leaving
>>> 575756040 used; max is 1190723584
>>>  INFO 08:25:21,047 GC for ParNew: 326 ms, 622644

Re: unsubscribe

2010-04-21 Thread Jeremy Dunck
You have a typo: user-unsubscr...@cassandra.apache.org, not
user-unsubcr...@cassandra.apache.org.

:-)

On Wed, Apr 21, 2010 at 3:55 PM, Jennifer Huynh
 wrote:
> Anyone know how to unsubscribe to the mailing list? I tried emailing the
> server, user-unsubcr...@cassandra.apache.org, and had no luck.
>
> Thanks in advance!!!
>
>
>
>


security, firewall level only?

2010-04-21 Thread S Ahmed
Is security for remote clients connecting to a Cassandra node handled
purely at the hardware/firewall level?

I.e., there is no username/password as in MySQL/SQL Server, correct?

Or are there permissions at the column family level, per user?


Re: Import using cassandra 0.6.1

2010-04-21 Thread Sonny Heer
They are showing up as completed?  Is this correct:


Pool NameActive   Pending  Completed
STREAM-STAGE  0 0  0
RESPONSE-STAGE0 0  0
ROW-READ-STAGE0 0 517446
LB-OPERATIONS 0 0  0
MESSAGE-DESERIALIZER-POOL 0 0  0
GMFD  0 0  0
LB-TARGET 0 0  0
CONSISTENCY-MANAGER   0 0  0
ROW-MUTATION-STAGE0 01353622
MESSAGE-STREAMING-POOL0 0  0
LOAD-BALANCER-STAGE   0 0  0
FLUSH-SORTER-POOL 0 0  0
MEMTABLE-POST-FLUSHER 0 0  0
FLUSH-WRITER-POOL 0 0  0
AE-SERVICE-STAGE  0 0  0


On Wed, Apr 21, 2010 at 2:09 PM, Jonathan Ellis  wrote:
> you need to figure out where the memory is going.  check tpstats, if
> the pending ops are large somewhere that means you're just generating
> insert ops faster than it can handle.
>
> On Wed, Apr 21, 2010 at 4:07 PM, Sonny Heer  wrote:
>> note: I'm using the Thrift API to insert.  The commitLog directory
>> continues to grow.  The heap size continues to grow as well.
>>
>> I decreased MemtableSizeInMB size, but noticed no changes.  Any idea
>> what is causing this, and/or what property i need to tweek to
>> alleviate this?  What is the "insert threshold"?
>>
>> I moved to a more powerful node as well, it still ended up failing
>> just after a longer period.
>>
>> On Wed, Apr 21, 2010 at 10:53 AM, Jonathan Ellis  wrote:
>>> http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
>>>
>>> On Wed, Apr 21, 2010 at 12:02 PM, Sonny Heer  wrote:
 Currently running on a single node with intensive write operations.


 After running for a while...

 Client starts outputting:

 TimedOutException()
        at 
 org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:12232)
        at 
 org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:670)
        at 
 org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:643)

 Cassandra starts outputting:

  INFO 08:08:49,864 Cassandra starting up...
  INFO 08:18:09,782 GC for ParNew: 238 ms, 30728008 reclaimed leaving
 220554976 used; max is 1190723584
  INFO 08:18:19,782 GC for ParNew: 231 ms, 30657944 reclaimed leaving
 230245792 used; max is 1190723584
  INFO 08:18:39,782 GC for ParNew: 229 ms, 30567184 reclaimed leaving
 250127792 used; max is 1190723584
  INFO 08:18:59,782 GC for ParNew: 358 ms, 46416776 reclaimed leaving
 261657720 used; max is 1190723584
  INFO 08:19:09,782 GC for ParNew: 205 ms, 46331376 reclaimed leaving
 273764040 used; max is 1190723584
  INFO 08:19:19,783 GC for ParNew: 335 ms, 46354968 reclaimed leaving
 282912656 used; max is 1190723584
  INFO 08:19:30,064 GC for ParNew: 392 ms, 46403400 reclaimed leaving
 294861824 used; max is 1190723584
  INFO 08:19:40,065 GC for ParNew: 326 ms, 4639 reclaimed leaving
 304045640 used; max is 1190723584
  INFO 08:19:50,064 GC for ParNew: 256 ms, 46460824 reclaimed leaving
 312964344 used; max is 1190723584
  INFO 08:20:00,065 GC for ParNew: 242 ms, 46357104 reclaimed leaving
 324961320 used; max is 1190723584
  INFO 08:20:20,065 GC for ParNew: 336 ms, 46447216 reclaimed leaving
 345874144 used; max is 1190723584
  INFO 08:20:27,357 Creating new commitlog segment
 /var/lib/cassandra/commitlog/CommitLog-1271866827357.log
  INFO 08:20:40,065 GC for ParNew: 265 ms, 46509608 reclaimed leaving
 366587984 used; max is 1190723584
  INFO 08:21:00,069 GC for ParNew: 321 ms, 46478736 reclaimed leaving
 384059832 used; max is 1190723584
  INFO 08:21:10,069 GC for ParNew: 223 ms, 62235000 reclaimed leaving
 383631432 used; max is 1190723584
  INFO 08:21:30,069 GC for ParNew: 291 ms, 62261104 reclaimed leaving
 399697888 used; max is 1190723584
  INFO 08:21:50,069 GC for ParNew: 245 ms, 62275528 reclaimed leaving
 415428952 used; max is 1190723584
  INFO 08:22:10,248 GC for ParNew: 384 ms, 62219264 reclaimed leaving
 433542656 used; max is 1190723584
  INFO 08:22:30,248 GC for ParNew: 215 ms, 62363608 reclaimed leaving
 452030560 used; max is 1190723584
  INFO 08:22:40,248 GC for ParNew: 318 ms, 62104552 reclaimed leaving
 464013992 used; max is 1190723584
  INFO 08:22:51,039 GC for ParNew: 845 ms, 62218296 reclaimed leaving
 471978840 used; max is 1190723584
  INFO 08:23:01,040 GC for ParNew: 474 ms, 62258080 reclaimed leaving
 475912120 used; max is 

Re: Import using cassandra 0.6.1

2010-04-21 Thread Jonathan Ellis
then that's not the problem.

are you writing large rows that OOM during compaction?

On Wed, Apr 21, 2010 at 4:34 PM, Sonny Heer  wrote:
> They are showing up as completed?  Is this correct:
>
>
> Pool Name                    Active   Pending      Completed
> STREAM-STAGE                      0         0              0
> RESPONSE-STAGE                    0         0              0
> ROW-READ-STAGE                    0         0         517446
> LB-OPERATIONS                     0         0              0
> MESSAGE-DESERIALIZER-POOL         0         0              0
> GMFD                              0         0              0
> LB-TARGET                         0         0              0
> CONSISTENCY-MANAGER               0         0              0
> ROW-MUTATION-STAGE                0         0        1353622
> MESSAGE-STREAMING-POOL            0         0              0
> LOAD-BALANCER-STAGE               0         0              0
> FLUSH-SORTER-POOL                 0         0              0
> MEMTABLE-POST-FLUSHER             0         0              0
> FLUSH-WRITER-POOL                 0         0              0
> AE-SERVICE-STAGE                  0         0              0
>
>
> On Wed, Apr 21, 2010 at 2:09 PM, Jonathan Ellis  wrote:
>> you need to figure out where the memory is going.  check tpstats, if
>> the pending ops are large somewhere that means you're just generating
>> insert ops faster than it can handle.
>>
>> On Wed, Apr 21, 2010 at 4:07 PM, Sonny Heer  wrote:
>>> note: I'm using the Thrift API to insert.  The commitLog directory
>>> continues to grow.  The heap size continues to grow as well.
>>>
>>> I decreased MemtableSizeInMB size, but noticed no changes.  Any idea
>>> what is causing this, and/or what property i need to tweek to
>>> alleviate this?  What is the "insert threshold"?
>>>
>>> I moved to a more powerful node as well, it still ended up failing
>>> just after a longer period.
>>>
>>> On Wed, Apr 21, 2010 at 10:53 AM, Jonathan Ellis  wrote:
 http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts

 On Wed, Apr 21, 2010 at 12:02 PM, Sonny Heer  wrote:
> Currently running on a single node with intensive write operations.
>
>
> After running for a while...
>
> Client starts outputting:
>
> TimedOutException()
>        at 
> org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:12232)
>        at 
> org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:670)
>        at 
> org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:643)
>
> Cassandra starts outputting:
>
>  INFO 08:08:49,864 Cassandra starting up...
>  INFO 08:18:09,782 GC for ParNew: 238 ms, 30728008 reclaimed leaving
> 220554976 used; max is 1190723584
>  INFO 08:18:19,782 GC for ParNew: 231 ms, 30657944 reclaimed leaving
> 230245792 used; max is 1190723584
>  INFO 08:18:39,782 GC for ParNew: 229 ms, 30567184 reclaimed leaving
> 250127792 used; max is 1190723584
>  INFO 08:18:59,782 GC for ParNew: 358 ms, 46416776 reclaimed leaving
> 261657720 used; max is 1190723584
>  INFO 08:19:09,782 GC for ParNew: 205 ms, 46331376 reclaimed leaving
> 273764040 used; max is 1190723584
>  INFO 08:19:19,783 GC for ParNew: 335 ms, 46354968 reclaimed leaving
> 282912656 used; max is 1190723584
>  INFO 08:19:30,064 GC for ParNew: 392 ms, 46403400 reclaimed leaving
> 294861824 used; max is 1190723584
>  INFO 08:19:40,065 GC for ParNew: 326 ms, 4639 reclaimed leaving
> 304045640 used; max is 1190723584
>  INFO 08:19:50,064 GC for ParNew: 256 ms, 46460824 reclaimed leaving
> 312964344 used; max is 1190723584
>  INFO 08:20:00,065 GC for ParNew: 242 ms, 46357104 reclaimed leaving
> 324961320 used; max is 1190723584
>  INFO 08:20:20,065 GC for ParNew: 336 ms, 46447216 reclaimed leaving
> 345874144 used; max is 1190723584
>  INFO 08:20:27,357 Creating new commitlog segment
> /var/lib/cassandra/commitlog/CommitLog-1271866827357.log
>  INFO 08:20:40,065 GC for ParNew: 265 ms, 46509608 reclaimed leaving
> 366587984 used; max is 1190723584
>  INFO 08:21:00,069 GC for ParNew: 321 ms, 46478736 reclaimed leaving
> 384059832 used; max is 1190723584
>  INFO 08:21:10,069 GC for ParNew: 223 ms, 62235000 reclaimed leaving
> 383631432 used; max is 1190723584
>  INFO 08:21:30,069 GC for ParNew: 291 ms, 62261104 reclaimed leaving
> 399697888 used; max is 1190723584
>  INFO 08:21:50,069 GC for ParNew: 245 ms, 62275528 reclaimed leaving
> 415428952 used; max is 1190723584
>  INFO 08:22:10,248 GC for ParNew: 384 ms, 62219264 reclaimed leaving
> 433542656 used; max is 1190723584
>  INFO 08:22:30,248 GC for ParNew: 215 ms, 62363608 reclaimed leaving
> 452030560 used; max is 1190723584
>  INFO 08:22:40,248 GC for ParNew: 318 ms, 62104552 reclaimed leaving

Re: Import using cassandra 0.6.1

2010-04-21 Thread Sonny Heer
What does OOM stand for?

For a given insert the size is small (meaning a single insert
operation only has about a sentence of data), although as the insert
process continues, the columns under a given row key could potentially
grow to be large.  Is that what you mean?

An operation entails:
Read
Insert row  (e.g. rowkey: FOO columnName: BLAH value: 10)
Delete a row
Insert different row (e.g. rowkey: FOO columnName: 010|Blah value: 10)

Millions of these operations are performed in sequence as files are
read from a directory source.

On Wed, Apr 21, 2010 at 2:37 PM, Jonathan Ellis  wrote:
> then that's not the problem.
>
> are you writing large rows that OOM during compaction?
>
> On Wed, Apr 21, 2010 at 4:34 PM, Sonny Heer  wrote:
>> They are showing up as completed?  Is this correct:
>>
>>
>> Pool Name                    Active   Pending      Completed
>> STREAM-STAGE                      0         0              0
>> RESPONSE-STAGE                    0         0              0
>> ROW-READ-STAGE                    0         0         517446
>> LB-OPERATIONS                     0         0              0
>> MESSAGE-DESERIALIZER-POOL         0         0              0
>> GMFD                              0         0              0
>> LB-TARGET                         0         0              0
>> CONSISTENCY-MANAGER               0         0              0
>> ROW-MUTATION-STAGE                0         0        1353622
>> MESSAGE-STREAMING-POOL            0         0              0
>> LOAD-BALANCER-STAGE               0         0              0
>> FLUSH-SORTER-POOL                 0         0              0
>> MEMTABLE-POST-FLUSHER             0         0              0
>> FLUSH-WRITER-POOL                 0         0              0
>> AE-SERVICE-STAGE                  0         0              0
>>
>>
>> On Wed, Apr 21, 2010 at 2:09 PM, Jonathan Ellis  wrote:
>>> you need to figure out where the memory is going.  check tpstats, if
>>> the pending ops are large somewhere that means you're just generating
>>> insert ops faster than it can handle.
>>>
>>> On Wed, Apr 21, 2010 at 4:07 PM, Sonny Heer  wrote:
 note: I'm using the Thrift API to insert.  The commitLog directory
 continues to grow.  The heap size continues to grow as well.

 I decreased MemtableSizeInMB size, but noticed no changes.  Any idea
 what is causing this, and/or what property i need to tweek to
 alleviate this?  What is the "insert threshold"?

 I moved to a more powerful node as well, it still ended up failing
 just after a longer period.

 On Wed, Apr 21, 2010 at 10:53 AM, Jonathan Ellis  wrote:
> http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
>
> On Wed, Apr 21, 2010 at 12:02 PM, Sonny Heer  wrote:
>> Currently running on a single node with intensive write operations.
>>
>>
>> After running for a while...
>>
>> Client starts outputting:
>>
>> TimedOutException()
>>        at 
>> org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:12232)
>>        at 
>> org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:670)
>>        at 
>> org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:643)
>>
>> Cassandra starts outputting:
>>
>>  INFO 08:08:49,864 Cassandra starting up...
>>  INFO 08:18:09,782 GC for ParNew: 238 ms, 30728008 reclaimed leaving
>> 220554976 used; max is 1190723584
>>  INFO 08:18:19,782 GC for ParNew: 231 ms, 30657944 reclaimed leaving
>> 230245792 used; max is 1190723584
>>  INFO 08:18:39,782 GC for ParNew: 229 ms, 30567184 reclaimed leaving
>> 250127792 used; max is 1190723584
>>  INFO 08:18:59,782 GC for ParNew: 358 ms, 46416776 reclaimed leaving
>> 261657720 used; max is 1190723584
>>  INFO 08:19:09,782 GC for ParNew: 205 ms, 46331376 reclaimed leaving
>> 273764040 used; max is 1190723584
>>  INFO 08:19:19,783 GC for ParNew: 335 ms, 46354968 reclaimed leaving
>> 282912656 used; max is 1190723584
>>  INFO 08:19:30,064 GC for ParNew: 392 ms, 46403400 reclaimed leaving
>> 294861824 used; max is 1190723584
>>  INFO 08:19:40,065 GC for ParNew: 326 ms, 4639 reclaimed leaving
>> 304045640 used; max is 1190723584
>>  INFO 08:19:50,064 GC for ParNew: 256 ms, 46460824 reclaimed leaving
>> 312964344 used; max is 1190723584
>>  INFO 08:20:00,065 GC for ParNew: 242 ms, 46357104 reclaimed leaving
>> 324961320 used; max is 1190723584
>>  INFO 08:20:20,065 GC for ParNew: 336 ms, 46447216 reclaimed leaving
>> 345874144 used; max is 1190723584
>>  INFO 08:20:27,357 Creating new commitlog segment
>> /var/lib/cassandra/commitlog/CommitLog-1271866827357.log
>>  INFO 08:20:40,065 GC for ParNew: 265 ms, 46509608 reclaimed leaving
>> 366587984 used; max is 1190723584
>>  INFO 08:21:00,069 GC for ParNew: 321 ms, 46478736 reclaimed le

Re: Import using cassandra 0.6.1

2010-04-21 Thread Jonathan Ellis
On Wed, Apr 21, 2010 at 5:05 PM, Sonny Heer  wrote:
> What does OOM stand for?

out of memory

> for a given insert the size is small (meaning the a single insert
> operation only has about a sentence of data)  although as the insert
> process continues, the columns under a given row key could potentially
> grow to be large.  Is that what you mean?

yes.

look in your log for warnings about row size during compaction.


Re: Import using cassandra 0.6.1

2010-04-21 Thread Sonny Heer
What I mean by "as data is processed" is that the columns under a row
will grow in Cassandra over time, but my client is never writing a large
amount of column data under a given row in any single operation...

Any idea what's going on here?

On Wed, Apr 21, 2010 at 3:05 PM, Sonny Heer  wrote:
> What does OOM stand for?
>
> for a given insert the size is small (meaning the a single insert
> operation only has about a sentence of data)  although as the insert
> process continues, the columns under a given row key could potentially
> grow to be large.  Is that what you mean?
>
> An operation entails:
> Read
> Insert row  (IE: rowkey: FOO columnName: BLAH value: 10)
> Delete a row
> Insert different row (IE: rowkey: FOO columnName: 010|Blah value: 10)
>
> millions of these operations are performed in sequence as files are
> read from a directory source.
>
> On Wed, Apr 21, 2010 at 2:37 PM, Jonathan Ellis  wrote:
>> then that's not the problem.
>>
>> are you writing large rows that OOM during compaction?
>>
>> On Wed, Apr 21, 2010 at 4:34 PM, Sonny Heer  wrote:
>>> They are showing up as completed?  Is this correct:
>>>
>>>
>>> Pool Name                    Active   Pending      Completed
>>> STREAM-STAGE                      0         0              0
>>> RESPONSE-STAGE                    0         0              0
>>> ROW-READ-STAGE                    0         0         517446
>>> LB-OPERATIONS                     0         0              0
>>> MESSAGE-DESERIALIZER-POOL         0         0              0
>>> GMFD                              0         0              0
>>> LB-TARGET                         0         0              0
>>> CONSISTENCY-MANAGER               0         0              0
>>> ROW-MUTATION-STAGE                0         0        1353622
>>> MESSAGE-STREAMING-POOL            0         0              0
>>> LOAD-BALANCER-STAGE               0         0              0
>>> FLUSH-SORTER-POOL                 0         0              0
>>> MEMTABLE-POST-FLUSHER             0         0              0
>>> FLUSH-WRITER-POOL                 0         0              0
>>> AE-SERVICE-STAGE                  0         0              0
>>>
>>>
>>> On Wed, Apr 21, 2010 at 2:09 PM, Jonathan Ellis  wrote:
 you need to figure out where the memory is going.  check tpstats, if
 the pending ops are large somewhere that means you're just generating
 insert ops faster than it can handle.

 On Wed, Apr 21, 2010 at 4:07 PM, Sonny Heer  wrote:
> note: I'm using the Thrift API to insert.  The commitLog directory
> continues to grow.  The heap size continues to grow as well.
>
> I decreased MemtableSizeInMB size, but noticed no changes.  Any idea
> what is causing this, and/or what property i need to tweek to
> alleviate this?  What is the "insert threshold"?
>
> I moved to a more powerful node as well, it still ended up failing
> just after a longer period.
>
> On Wed, Apr 21, 2010 at 10:53 AM, Jonathan Ellis  
> wrote:
>> http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
>>
>> On Wed, Apr 21, 2010 at 12:02 PM, Sonny Heer  wrote:
>>> Currently running on a single node with intensive write operations.
>>>
>>>
>>> After running for a while...
>>>
>>> Client starts outputting:
>>>
>>> TimedOutException()
>>>        at 
>>> org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:12232)
>>>        at 
>>> org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:670)
>>>        at 
>>> org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:643)
>>>
>>> Cassandra starts outputting:
>>>
>>>  INFO 08:08:49,864 Cassandra starting up...
>>>  INFO 08:18:09,782 GC for ParNew: 238 ms, 30728008 reclaimed leaving
>>> 220554976 used; max is 1190723584
>>>  INFO 08:18:19,782 GC for ParNew: 231 ms, 30657944 reclaimed leaving
>>> 230245792 used; max is 1190723584
>>>  INFO 08:18:39,782 GC for ParNew: 229 ms, 30567184 reclaimed leaving
>>> 250127792 used; max is 1190723584
>>>  INFO 08:18:59,782 GC for ParNew: 358 ms, 46416776 reclaimed leaving
>>> 261657720 used; max is 1190723584
>>>  INFO 08:19:09,782 GC for ParNew: 205 ms, 46331376 reclaimed leaving
>>> 273764040 used; max is 1190723584
>>>  INFO 08:19:19,783 GC for ParNew: 335 ms, 46354968 reclaimed leaving
>>> 282912656 used; max is 1190723584
>>>  INFO 08:19:30,064 GC for ParNew: 392 ms, 46403400 reclaimed leaving
>>> 294861824 used; max is 1190723584
>>>  INFO 08:19:40,065 GC for ParNew: 326 ms, 4639 reclaimed leaving
>>> 304045640 used; max is 1190723584
>>>  INFO 08:19:50,064 GC for ParNew: 256 ms, 46460824 reclaimed leaving
>>> 312964344 used; max is 1190723584
>>>  INFO 08:20:00,065 GC for ParNew: 242 ms, 46357104 reclaimed leaving
>>> 324961320 used; max is 1190723584
>>>  INFO 08:20:20,065 GC for ParNew: 336 ms, 46447216 recla

Re: Import using cassandra 0.6.1

2010-04-21 Thread Sonny Heer
Gotcha.  No, I don't see anything particularly interesting in the log.
Do I need to turn on a higher logging level in log4j?

here it is after i killed the client:




 INFO [main] 2010-04-21 14:25:52,166 DatabaseDescriptor.java (line
229) Auto DiskAccessMode determined to be standard
 INFO [main] 2010-04-21 14:25:52,416 SystemTable.java (line 139) Saved
Token not found. Using LXzz63jDw6A1wZGz
 INFO [main] 2010-04-21 14:25:52,416 SystemTable.java (line 145) Saved
ClusterName not found. Using NGram Cluster
 INFO [main] 2010-04-21 14:25:52,416 CommitLogSegment.java (line 50)
Creating new commitlog segment
/var/lib/cassandra/commitlog\CommitLog-1271885152416.log
 INFO [main] 2010-04-21 14:25:52,448 StorageService.java (line 317)
Starting up server gossip
 INFO [main] 2010-04-21 14:25:52,495 CassandraDaemon.java (line 108)
Binding thrift service to localhost/127.0.0.1:9160
 INFO [main] 2010-04-21 14:25:52,495 CassandraDaemon.java (line 148)
Cassandra starting up...
 INFO [COMMIT-LOG-WRITER] 2010-04-21 14:30:44,248
CommitLogSegment.java (line 50) Creating new commitlog segment
/var/lib/cassandra/commitlog\CommitLog-1271885444248.log
 INFO [COMMIT-LOG-WRITER] 2010-04-21 14:34:37,267
CommitLogSegment.java (line 50) Creating new commitlog segment
/var/lib/cassandra/commitlog\CommitLog-1271885677267.log
 INFO [COMMIT-LOG-WRITER] 2010-04-21 14:39:08,723
CommitLogSegment.java (line 50) Creating new commitlog segment
/var/lib/cassandra/commitlog\CommitLog-1271885948723.log
 INFO [GC inspection] 2010-04-21 14:41:48,757 GCInspector.java (line
110) GC for ConcurrentMarkSweep: 11665 ms, 34488656 reclaimed leaving
1051534024 used; max is 1174208512
 INFO [GC inspection] 2010-04-21 14:42:00,913 GCInspector.java (line
110) GC for ConcurrentMarkSweep: 9219 ms, 31272728 reclaimed leaving
1053427416 used; max is 1174208512
 INFO [GC inspection] 2010-04-21 14:42:13,054 GCInspector.java (line
110) GC for ConcurrentMarkSweep: 9498 ms, 31116952 reclaimed leaving
1055163128 used; max is 1174208512
 INFO [GC inspection] 2010-04-21 14:42:25,195 GCInspector.java (line
110) GC for ConcurrentMarkSweep: 10318 ms, 29605880 reclaimed leaving
1056674184 used; max is 1174208512
 INFO [GC inspection] 2010-04-21 14:42:37,289 GCInspector.java (line
110) GC for ConcurrentMarkSweep: 10051 ms, 28492128 reclaimed leaving
1057795048 used; max is 1174208512
 INFO [GC inspection] 2010-04-21 14:42:49,398 GCInspector.java (line
110) GC for ConcurrentMarkSweep: 9904 ms, 27056840 reclaimed leaving
1059227696 used; max is 1174208512
 INFO [GC inspection] 2010-04-21 14:43:01,336 GCInspector.java (line
110) GC for ConcurrentMarkSweep: 9865 ms, 25652224 reclaimed leaving
1060635120 used; max is 1174208512
 INFO [GC inspection] 2010-04-21 14:43:12,977 GCInspector.java (line
110) GC for ConcurrentMarkSweep: 11515 ms, 25644312 reclaimed leaving
1060642984 used; max is 1174208512
 INFO [GC inspection] 2010-04-21 14:43:23,758 GCInspector.java (line
110) GC for ConcurrentMarkSweep: 10754 ms, 25640552 reclaimed leaving
1060645864 used; max is 1174208512
 INFO [GC inspection] 2010-04-21 14:43:35,446 GCInspector.java (line
110) GC for ConcurrentMarkSweep: 11602 ms, 25642952 reclaimed leaving
1060647384 used; max is 1174208512
 INFO [GC inspection] 2010-04-21 14:43:47,790 GCInspector.java (line
110) GC for ConcurrentMarkSweep: 12313 ms, 25638616 reclaimed leaving
1060647272 used; max is 1174208512


On Wed, Apr 21, 2010 at 3:16 PM, Jonathan Ellis  wrote:
> On Wed, Apr 21, 2010 at 5:05 PM, Sonny Heer  wrote:
>> What does OOM stand for?
>
> out of memory
>
>> for a given insert the size is small (meaning the a single insert
>> operation only has about a sentence of data)  although as the insert
>> process continues, the columns under a given row key could potentially
>> grow to be large.  Is that what you mean?
>
> yes.
>
> look in your log for warnings about row size during compaction.
>


April Seattle Hadoop/Scalability/NoSQL Meetup: Cassandra, Science, More!

2010-04-21 Thread Bradford Stephens
Hey there! Wanted to let you all know about our next meetup, April
28th. We've got a killer new venue thanks to Amazon.

Check out the details at the link:
http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/calendar/13072272/

Our Speakers this month:
1. Nick Dimiduk, Drawn to Scale: Intro to Hadoop, HBase, and NoSQL
2. Benjamin Black: Intro to Cassandra
3. Adam Jacob, CTO, OpsCode: Chef and Cluster Management
4. Sarah Killcoyne, Systems Biology: Big Data in Science

We've had great success in the past, and are growing quickly!
Including guests from LinkedIn, Amazon, Cloudant, 10gen/MongoDB, and
more.

Our format is flexible: We usually have speakers who talk for ~20
minutes each and then do Q+A, plus lightning talks, dicussion, and
then social time.

There'll be beer afterwards, of course! Fierabend, 422 Yale Ave N

Meetup Location:
Amazon HQ, Von Vorst Building, 426 Terry Ave N., Seattle, WA 98109-5210

Hope to see you there! And we're always open to suggestions.

-- 
Bradford Stephens,
Founder, Drawn to Scale
drawntoscalehq.com
727.697.7528

http://www.drawntoscalehq.com --  The intuitive, cloud-scale data
solution. Process, store, query, search, and serve all your data.

http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


Re: CassandraLimitations

2010-04-21 Thread Bill de hOra

> Are you asking if there are limits in the context
> of a single node or a
> ring of nodes?

A ring, but across a few (3+) datacenters.

Bill

Mark Greene wrote:

Hey Bill,

Are you asking if there are limits in the context of a single node or a 
ring of nodes?


On Wed, Apr 21, 2010 at 3:58 PM, Bill de hOra wrote:


http://wiki.apache.org/cassandra/CassandraLimitations has good
coverage on the limits around columns.

Are there are design (or practical) limits to the number of rows a
keyspace can have?

Bill






Re: CassandraLimitations

2010-04-21 Thread Bill de hOra

Sweet.

Bill

Jonathan Ellis wrote:

No.

On Wed, Apr 21, 2010 at 2:58 PM, Bill de hOra  wrote:

http://wiki.apache.org/cassandra/CassandraLimitations has good coverage on
the limits around columns.

Are there are design (or practical) limits to the number of rows a keyspace
can have?

Bill





Re: questions about consistency

2010-04-21 Thread Masood Mortazavi
Hi Daniel,

For a general theoretical understanding, try reading some of the papers on
eventual consistency by Werner Vogels.

Reading the SOSP '07 Dynamo paper would also help with some of the
theoretical foundations and academic references.

To get even further into it, try reading Replication Techniques in
Distributed Systems by Abdelsalam Helal, Abdelsalam Heddaya, and Bharat
Bhargava (
http://www.amazon.com/Replication-Techniques-Distributed-Advances-Database/dp/0792398009/ref=sr_1_12?ie=UTF8&s=books&qid=1271891223&sr=8-12)

Regards,
m.
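
The QUORUM question in the quoted thread below comes down to overlap arithmetic:
with replication factor N, a write at consistency W and a later read at consistency R
must share at least one replica whenever R + W > N, so a quorum read after a
successful quorum write cannot return only stale values. A small sketch of the check
(the numbers are illustrative):

    public class QuorumOverlap
    {
        public static void main(String[] args)
        {
            int n = 3;                // replication factor
            int w = n / 2 + 1;        // QUORUM write: 2 of 3 replicas acknowledge
            int r = n / 2 + 1;        // QUORUM read:  2 of 3 replicas answer
            // Any two groups of 2 replicas out of 3 share at least one member, so the
            // read always reaches a replica that holds the latest quorum-written value.
            System.out.println("overlap guaranteed: " + (r + w > n));
        }
    }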


2010/4/21 Даниел Симеонов 

> Hi Paul,
>about the last answer I still need some more clarifications, as I
> understand it if QUORUM is used, then reads doesn't get old values either?
> Or am I wrong?
> Thank you very much!
> Best regards, Daniel.
>
> 2010/4/21 Paul Prescod 
>
> I'm not an expert, so take what I say with a grain of salt.
>>
>> 2010/4/21 Даниел Симеонов :
>> > Hello,
>> >I am pretty new to Cassandra and I have some questions, they may seem
>> > trivial, but still I am pretty new to the subject. First is about the
>> lack
>> > of a compareAndSet() operation, as I understood it is not supported
>> > currently in Cassandra, do you know of use cases which really require
>> such
>> > operations and how these use cases currently workaround this .
>>
>> I think your question is paradoxical. If the use case really requires
>> the operation then there is no workaround by definition. The existence
>> of the workaround implies that the use case really did not require the
>> operation.
>>
>> Anyhow, vector clocks are probably relevant to this question and your next
>> one.
>>
>> > Second topic I'd like to discuss a little bit more is about the read
>> repair,
>> > as I understand is that it is being done by the timestamps supplied by
>> the
>> > client application servers. Since computer clocks (which requires
>> > synchronization algorithms working regularly) diverge there should be a
>> time
>> > frame during which the order of the client request written to the
>> database
>> > is not guaranteed, do you have real world experiences with this? Is this
>> > similar to the casual consistency (
>> > http://en.wikipedia.org/wiki/Causal_consistency ) .What happens if two
>> > application servers try to update the same data and supply one and the
>> same
>> > timestamp (it could happen although rarely), what if they try to update
>> > several columns in batch operation this way, is there a chance that the
>> > column value could be intermixed between the two update requests?
>>
>> All of this is changing with vector clocks in Cassandra 0.7.
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-580
>>
>> > I have one last question about the consistency level ALL, do you know of
>> > real use cases where it is required (instead of QUORUM) and why (both
>> read
>> > and write)?
>>
>> It would be required when your business rules do not allow any client
>> to read the old value. For example if it would be illegal to provide
>> an obsolete stock value.
>>
>> > Thank you very much for your help to better understand 'Cassandra'!
>> > Best regards, Daniel.
>> >
>>
>
>


RandomPartitioner doubts

2010-04-21 Thread Lucas Di Pentima
Hello,

I'm using Cassandra 0.6.1 and the Ruby library. I did some tests on my one-node
development installation using the get_range method to scan the whole CF.

What I want to verify is whether a CF with RandomPartitioner can be scanned with
get_range by fetching a fixed number of keys at a time until all have been requested.
I know the keys are stored in random order, but is that randomness fixed? Or will I
get something different every time I ask for a range, even if I pass the :start
parameter?

As I said above, I did this test with a 1-node installation and I seem to get
the same (random) order on every call, but I don't know whether this is a design
feature or an incidental characteristic that is likely to change in the
future, or whether the behaviour is different with N-node clusters.

Thanks in advance!
--
Lucas Di Pentima - Santa Fe, Argentina
Jabber: lu...@di-pentima.com.ar
MSN: ldipent...@hotmail.com






Re: TException: Error: TSocket: timed out reading 1024 bytes from 10.1.1.27:9160

2010-04-21 Thread Ken Sandney
I've tried the patch on https://issues.apache.org/jira/browse/THRIFT-347 ,
but still got this error:

PHP Fatal error:  Uncaught exception 'TException' with message 'TSocket:
> timed out reading 1024 bytes from 10.0.0.169:9160' in
> /home/phpcassa/include/thrift/transport/TSocket.php:266
> Stack trace:
> #0 /home/phpcassa/include/thrift/transport/TBufferedTransport.php(126):
> TSocket->read(1024)
> #1 [internal function]: TBufferedTransport->read(8192)
> #2 /home/phpcassa/include/thrift/packages/cassandra/Cassandra.php(642):
> thrift_protocol_read_binary(Object(TBinaryProtocolAccelerated),
> 'cassandra_Cassa...', false)
> #3 /home/phpcassa/include/thrift/packages/cassandra/Cassandra.php(615):
> CassandraClient->recv_batch_insert()
> #4 /home/phpcassa/include/phpcassa.php(197):
> CassandraClient->batch_insert('Pujia', '38426', Array, 0)
> #5 /home/phpcassa/test1.php(46): CassandraCF->insert('38426', Array)
> #6 {main}
>   thrown in /home/phpcassa/include/thrift/transport/TSocket.php on line 266
>


Re: RandomPartitioner doubts

2010-04-21 Thread Jonathan Ellis
For each "page" of results, start with the key that was last in the
previous iteration, and you will get all the keys back.  The order is
random but consistent.

On Wed, Apr 21, 2010 at 7:55 PM, Lucas Di Pentima
 wrote:
> Hello,
>
> I'm using Cassandra 0.6.1 and ruby's library. I did some tests on my one-node 
> development installation about using get_range method to scan the whole CF.
>
> What I want to prove is if a CF with RandomPartitioner can be used with 
> get_range getting a fixed number of keys at a time, until all are requested. 
> I know the keys are saved in random order, but is that randomness fixed? or 
> every time I ask for a range I'll get something different even if I pass the 
> :start parameter?
>
> As I said above, I did this test with a 1-node installation and I seem to get 
> the same (random) order every call I do, but I don't know if this is a design 
> feature or some collateral characteristic that is likely to change in the 
> future, or even the behaviour is different with N-node clusters.
>
> Thanks in advance!
> --
> Lucas Di Pentima - Santa Fe, Argentina
> Jabber: lu...@di-pentima.com.ar
> MSN: ldipent...@hotmail.com
>
>
>
>
>


Re: tcp CLOSE_WAIT bug

2010-04-21 Thread Ingram Chen
I agree with your point. I patched the code and logged more information to find
out the real cause.

Here is the code snippet I think may be the cause:

IncomingTcpConnection:

    public void run()
    {
        while (true)
        {
            try
            {
                MessagingService.validateMagic(input.readInt());
                int header = input.readInt();
                int type = MessagingService.getBits(header, 1, 2);
                boolean isStream = MessagingService.getBits(header, 3, 1) == 1;
                int version = MessagingService.getBits(header, 15, 8);

                if (isStream)
                {
                    new IncomingStreamReader(socket.getChannel()).read();
                }
                else
                {
                    int size = input.readInt();
                    byte[] contentBytes = new byte[size];
                    input.readFully(contentBytes);
                    MessagingService.getDeserializationExecutor().submit(
                        new MessageDeserializationTask(new ByteArrayInputStream(contentBytes)));
                }
            }
            catch (EOFException e)
            {
                if (logger.isTraceEnabled())
                    logger.trace("eof reading from socket; closing", e);
                break;
            }
            catch (IOException e)
            {
                if (logger.isDebugEnabled())
                    logger.debug("error reading from socket; closing", e);
                break;
            }
        }
    }

Under normal conditions the while loop terminates after input.readInt() throws
EOFException, but it exits without calling socket.close(). What I do is wrap the
whole while block inside a try { ... } finally { socket.close(); }
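
A minimal, self-contained sketch of that change, not the actual patch: the message
handling is reduced to a placeholder, and the only point is that the accepted socket
is closed on every exit path, so the peer's close no longer leaves the connection
stuck in CLOSE_WAIT on this side.

    import java.io.DataInputStream;
    import java.io.EOFException;
    import java.io.IOException;
    import java.net.Socket;

    public class ReadLoopSketch
    {
        // Read messages until the peer closes the connection; whatever ends the
        // loop, the socket is closed, which is the step the current run() skips.
        public static void readUntilClosed(Socket socket) throws IOException
        {
            DataInputStream input = new DataInputStream(socket.getInputStream());
            try
            {
                while (true)
                {
                    int magic = input.readInt();   // placeholder for the real message handling
                    // ... validate 'magic', read the header, dispatch the message ...
                }
            }
            catch (EOFException e)
            {
                // peer closed the connection; fall through to the cleanup below
            }
            finally
            {
                socket.close();                    // the call missing from IncomingTcpConnection
            }
        }
    }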


On Thu, Apr 22, 2010 at 01:14, Jonathan Ellis  wrote:

> I'd like to get something besides "I'm seeing close wait but i have no
> idea why" for a bug report, since most people aren't seeing that.
>
> On Tue, Apr 20, 2010 at 9:33 AM, Ingram Chen  wrote:
> > I trace IncomingStreamReader source and found that incoming socket comes
> > from MessagingService$SocketThread.
> > but there is no close() call on either accepted socket or socketChannel.
> >
> > Should I file a bug report ?
> >
> > On Tue, Apr 20, 2010 at 11:02, Ingram Chen  wrote:
> >>
> >> this happened after several hours of operations and both nodes are
> started
> >> at the same time (clean start without any data). so it might not relate
> to
> >> Bootstrap.
> >>
> >> In system.log I do not see any logs like "xxx node dead" or exceptions.
> >> and both nodes in test are alive. they serve read/write well, too. Below
> >> four connections between nodes are keep healthy from time to time.
> >>
> >> tcp0  0 :::192.168.2.87:7000
> >> :::192.168.2.88:58447   ESTABLISHED
> >> tcp0  0 :::192.168.2.87:54986
> >> :::192.168.2.88:7000ESTABLISHED
> >> tcp0  0 :::192.168.2.87:59138
> >> :::192.168.2.88:7000ESTABLISHED
> >> tcp0  0 :::192.168.2.87:7000
> >> :::192.168.2.88:39074   ESTABLISHED
> >>
> >> so connections end in CLOSE_WAIT should be newly created. (for streaming
> >> ?) This seems related to streaming issues we suffered recently:
> >> http://n2.nabble.com/busy-thread-on-IncomingStreamReader-td4908640.html
> >>
> >> I would like add some debug codes around opening and closing of socket
> to
> >> find out what happend.
> >>
> >> Could you give me some hint, about what classes I should take look ?
> >>
> >>
> >> On Tue, Apr 20, 2010 at 04:47, Jonathan Ellis 
> wrote:
> >>>
> >>> Is this after doing a bootstrap or other streaming operation?  Or did
> >>> a node go down?
> >>>
> >>> The internal sockets are supposed to remain open, otherwise.
> >>>
> >>> On Mon, Apr 19, 2010 at 10:56 AM, Ingram Chen 
> >>> wrote:
> >>> > Thank your information.
> >>> >
> >>> > We do use connection pools with thrift client and ThriftAdress is on
> >>> > port
> >>> > 9160.
> >>> >
> >>> > Those problematic connections we found are all in port 7000, which is
> >>> > internal communications port between
> >>> > nodes. I guess this related to StreamingService.
> >>> >
> >>> > On Mon, Apr 19, 2010 at 23:46, Brandon Williams 
> >>> > wrote:
> >>> >>
> >>> >> On Mon, Apr 19, 2010 at 10:27 AM, Ingram Chen  >
> >>> >> wrote:
> >>> >>>
> >>> >>> Hi all,
> >>> >>>
> >>> >>> We have observed several connections between nodes in
> CLOSE_WAIT
> >>> >>> after several hours of operation:
> >>> >>
> >>> >> This is symptomatic of not pooling your client connections
> correctly.
> >>> >>  Be
> >>> >> sure you're using one connection per thread, not one connection per
> >>> >> operation.
> >>> >> -Brandon
> >>> >
> >>> >
> >>> > --
> >>> > Ingram Chen
> >>> > online share order: http://dinbendon.net
> >>> > blog: http://www.javaworld.com.tw/roller/page/ingramchen
> >>> >
> >>
> >>
> >>
> >> --
> >> Ingram Chen
> >> online share order: http://dinbendon.net
> >> blog: http://www.javaworld.com.tw/roller

Re: tcp CLOSE_WAIT bug

2010-04-21 Thread Jonathan Ellis
But those connections aren't supposed to ever terminate unless a node
dies or is partitioned.  So if we "fix" it by adding a socket.close I
worry that we're covering up something more important.

On Wed, Apr 21, 2010 at 8:53 PM, Ingram Chen  wrote:
> I agree your point. I patch the code and log more informations to find out
> the real cause.
>
> Here is the code snip I think may be the cause:
>
> IncomingTcpConnection:
>
>     public void run()
>     {
>     while (true)
>     {
>     try
>     {
>     MessagingService.validateMagic(input.readInt());
>     int header = input.readInt();
>     int type = MessagingService.getBits(header, 1, 2);
>     boolean isStream = MessagingService.getBits(header, 3, 1) ==
> 1;
>     int version = MessagingService.getBits(header, 15, 8);
>
>     if (isStream)
>     {
>     new IncomingStreamReader(socket.getChannel()).read();
>     }
>     else
>     {
>     int size = input.readInt();
>     byte[] contentBytes = new byte[size];
>     input.readFully(contentBytes);
>     MessagingService.getDeserializationExecutor().submit(new
> MessageDeserializationTask(new ByteArrayInputStream(contentBytes)));
>     }
>     }
>     catch (EOFException e)
>     {
>     if (logger.isTraceEnabled())
>     logger.trace("eof reading from socket; closing", e);
>     break;
>     }
>     catch (IOException e)
>     {
>     if (logger.isDebugEnabled())
>     logger.debug("error reading from socket; closing", e);
>     break;
>     }
>     }
>     }
>
> In normal condition, while loop is terminated after input.readInt() throw
> EOFException. but it quits without socket.close(). what I do is wrap whole
> while block inside a try { ... } finally {socket.close();}
>
>
> On Thu, Apr 22, 2010 at 01:14, Jonathan Ellis  wrote:
>>
>> I'd like to get something besides "I'm seeing close wait but i have no
>> idea why" for a bug report, since most people aren't seeing that.
>>
>> On Tue, Apr 20, 2010 at 9:33 AM, Ingram Chen  wrote:
>> > I trace IncomingStreamReader source and found that incoming socket comes
>> > from MessagingService$SocketThread.
>> > but there is no close() call on either accepted socket or socketChannel.
>> >
>> > Should I file a bug report ?
>> >
>> > On Tue, Apr 20, 2010 at 11:02, Ingram Chen  wrote:
>> >>
>> >> this happened after several hours of operations and both nodes are
>> >> started
>> >> at the same time (clean start without any data). so it might not relate
>> >> to
>> >> Bootstrap.
>> >>
>> >> In system.log I do not see any logs like "xxx node dead" or exceptions.
>> >> and both nodes in test are alive. they serve read/write well, too.
>> >> Below
>> >> four connections between nodes are keep healthy from time to time.
>> >>
>> >> tcp    0  0 :::192.168.2.87:7000
>> >> :::192.168.2.88:58447   ESTABLISHED
>> >> tcp    0  0 :::192.168.2.87:54986
>> >> :::192.168.2.88:7000    ESTABLISHED
>> >> tcp    0  0 :::192.168.2.87:59138
>> >> :::192.168.2.88:7000    ESTABLISHED
>> >> tcp    0  0 :::192.168.2.87:7000
>> >> :::192.168.2.88:39074   ESTABLISHED
>> >>
>> >> so connections end in CLOSE_WAIT should be newly created. (for
>> >> streaming
>> >> ?) This seems related to streaming issues we suffered recently:
>> >> http://n2.nabble.com/busy-thread-on-IncomingStreamReader-td4908640.html
>> >>
>> >> I would like add some debug codes around opening and closing of socket
>> >> to
>> >> find out what happend.
>> >>
>> >> Could you give me some hint, about what classes I should take look ?
>> >>
>> >>
>> >> On Tue, Apr 20, 2010 at 04:47, Jonathan Ellis 
>> >> wrote:
>> >>>
>> >>> Is this after doing a bootstrap or other streaming operation?  Or did
>> >>> a node go down?
>> >>>
>> >>> The internal sockets are supposed to remain open, otherwise.
>> >>>
>> >>> On Mon, Apr 19, 2010 at 10:56 AM, Ingram Chen 
>> >>> wrote:
>> >>> > Thank your information.
>> >>> >
>> >>> > We do use connection pools with thrift client and ThriftAdress is on
>> >>> > port
>> >>> > 9160.
>> >>> >
>> >>> > Those problematic connections we found are all in port 7000, which
>> >>> > is
>> >>> > internal communications port between
>> >>> > nodes. I guess this related to StreamingService.
>> >>> >
>> >>> > On Mon, Apr 19, 2010 at 23:46, Brandon Williams 
>> >>> > wrote:
>> >>> >>
>> >>> >> On Mon, Apr 19, 2010 at 10:27 AM, Ingram Chen
>> >>> >> 
>> >>> >> wrote:
>> >>> >>>
>> >>> >>> Hi all,
>> >>> >>>
>> >>> >>>     We have observed several connections between nodes in
>> >>> >>> CLOSE_WAIT
>> >>> >>> after several hours of operation:
>> >>> >>
>> >>> >> This is symptomatic of n

Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Stu Hood
Nicolas,

Were all of those super column writes going to the same row? 
http://wiki.apache.org/cassandra/CassandraLimitations

Thanks,
Stu
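
If they are, one common workaround is to spread the super columns across many row keys
instead of one ever-growing row, since before 0.7 an entire row has to fit in memory
during compaction. A hedged sketch of the idea; the bucket count, key format, and
helper name are purely illustrative, and the actual Hector/Thrift write call is
unchanged and not shown:

    public class RowBucketing
    {
        private static final int BUCKETS = 64;    // assumption: tune to the data volume

        // Derive a row key that spreads one document's super columns over many rows,
        // so no single row has to be read back whole during compaction.
        public static String bucketedRowKey(String docId, String nodeId)
        {
            int bucket = (nodeId.hashCode() & 0x7fffffff) % BUCKETS;
            return docId + ":" + bucket;           // e.g. "doc42:17"
        }
    }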

-Original Message-
From: "Nicolas Labrot" 
Sent: Wednesday, April 21, 2010 11:54am
To: user@cassandra.apache.org
Subject: Re: Cassandra tuning for running test on a desktop

I don't have a website ;)

I'm testing the viability of Cassandra for storing XML documents and making fast
search queries. 4000 XML files (80MB of XML) created with my data model (one
SC per XML node) 100 SC, which makes Cassandra go OOM with Xmx 1GB. By
contrast, an XML DB like eXist handles 4000 XML docs without any problem and with
an acceptable amount of memory.

What I like about Cassandra is its simplicity and its scalability. eXist is
not able to scale with data; the only viable alternative is MarkLogic, which
costs an arm and a leg... :)

I will install Linux and buy some memory to continue my tests.

Could a Cassandra developer give me the technical reason for this OOM?





On Wed, Apr 21, 2010 at 5:13 PM, Mark Greene  wrote:

> Maybe, maybe not. Presumably if you are running a RDMS with any reasonable
> amount of traffic now a days, it's sitting on a machine with 4-8G of memory
> at least.
>
>
> On Wed, Apr 21, 2010 at 10:48 AM, Nicolas Labrot wrote:
>
>> Thanks Mark.
>>
>> Cassandra is maybe too much for my need ;)
>>
>>
>>
>> On Wed, Apr 21, 2010 at 4:45 PM, Mark Greene  wrote:
>>
>>> Hit send to early
>>>
>>> That being said a lot of people running Cassandra in production are using
>>> 4-6GB max heaps on 8GB machines, don't know if that helps but hopefully
>>> gives you some perspective.
>>>
>>>
>>> On Wed, Apr 21, 2010 at 10:39 AM, Mark Greene wrote:
>>>
 RAM doesn't necessarily need to be proportional but I would say the
 number of nodes does. You can't just throw a bazillion inserts at one node.
 This is the main benefit of Cassandra is that if you start hitting your
 capacity, you add more machines and distribute the keys across more
 machines.


 On Wed, Apr 21, 2010 at 9:07 AM, Nicolas Labrot wrote:

> So does it means the RAM needed is proportionnal with the data handled
> ?
>
> Or Cassandra need a minimum amount or RAM when dataset is big?
>
> I must confess this OOM behaviour is strange.
>
>
> On Wed, Apr 21, 2010 at 2:54 PM, Mark Jones wrote:
>
>>  On my 4GB machine I’m giving it 3GB and having no trouble with 60+
>> million 500 byte columns
>>
>>
>>
>> *From:* Nicolas Labrot [mailto:nith...@gmail.com]
>> *Sent:* Wednesday, April 21, 2010 7:47 AM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Cassandra tuning for running test on a desktop
>>
>>
>>
>> I have try 1400M, and Cassandra OOM too.
>>
>> Is there another solution ? My data isn't very big.
>>
>> It seems that is the merge of the db
>>
>>  On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene 
>> wrote:
>>
>> Trying increasing Xmx. 1G is probably not enough for the amount of
>> inserts you are doing.
>>
>>
>>
>> On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot 
>> wrote:
>>
>> Hello,
>>
>> For my first message I will first thank the Cassandra contributors for
>> their great work.
>>
>> I have a parameter issue with Cassandra (I hope it's just a parameter
>> issue). I'm using Cassandra 0.6.1 with the Hector client on my desktop. It's
>> a simple dual core with 4GB of RAM on WinXP. I have kept the default JVM
>> options inside cassandra.bat (Xmx1G).
>>
>> I'm trying to insert 3 million SCs with 6 columns each inside 1 CF (named
>> Super1). The insertion gets to 1 million SCs (without slowdown) and then
>> Cassandra crashes because of an OOM. (I store an average of 100 bytes per
>> SC, with a max of 10kB.)
>> I have aggressively decreased all the memory parameters without any regard
>> for consistency (my config is here [1]); the cache is turned off but
>> Cassandra still goes OOM. I have attached the last lines of the Cassandra
>> log [2].
>>
>> What can I do to fix my issue? Is there another solution other than
>> increasing the Xmx?
>>
>> Thanks for your help,
>>
>> Nicolas
>>
>>
>>
>>
>>
>> [1] storage-conf.xml excerpt (most XML tag names were stripped by the mail
>> archive; only the values survive):
>>
>>   <ColumnFamily Name="Super1" ColumnType="Super"
>>                 CompareWith="BytesType"
>>                 CompareSubcolumnsWith="BytesType" />
>>
>>   <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
>>   <ReplicationFactor>1</ReplicationFactor>
>>   <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
>>
>>   remaining (unnamed) memory/buffer settings: 32, auto, 64, 64, 16, 4, 64,
>>   16, 32, 0.01, 0.01, 60, 4, 8
>>
>>
>> [2]
>>  INFO 13:36:41,062 Sup

Re: tcp CLOSE_WAIT bug

2010-04-21 Thread Ingram Chen
Ah! That's right.

I checked OutboundTcpConnection and it only does closeSocket() after something
goes wrong. I will add more logging in OutboundTcpConnection to see what
actually happens.

Thanks for your help.
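
For reference, a minimal sketch of the try/finally wrapper I described in my
previous mail (quoted below). The socket, input and logger fields are the
existing ones in IncomingTcpConnection; the logging and close details here are
only illustrative, not a tested patch:

    public void run()
    {
        try
        {
            while (true)
            {
                try
                {
                    // ... existing magic/header check, stream handling and
                    // deserialization logic, unchanged ...
                }
                catch (EOFException e)
                {
                    if (logger.isTraceEnabled())
                        logger.trace("eof reading from socket; closing", e);
                    break;
                }
                catch (IOException e)
                {
                    if (logger.isDebugEnabled())
                        logger.debug("error reading from socket; closing", e);
                    break;
                }
            }
        }
        finally
        {
            // release the accepted socket on every exit path of the loop, so the
            // peer's FIN is acknowledged instead of leaving the connection in
            // CLOSE_WAIT
            try
            {
                socket.close();
            }
            catch (IOException e)
            {
                if (logger.isDebugEnabled())
                    logger.debug("error closing socket", e);
            }
        }
    }

The only point is that socket.close() runs no matter how the loop exits.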



On Thu, Apr 22, 2010 at 10:03, Jonathan Ellis  wrote:

> But those connections aren't supposed to ever terminate unless a node
> dies or is partitioned.  So if we "fix" it by adding a socket.close I
> worry that we're covering up something more important.
>
> On Wed, Apr 21, 2010 at 8:53 PM, Ingram Chen  wrote:
> > I agree with your point. I patched the code and added more logging to find
> > out the real cause.
> >
> > Here is the code snippet I think may be the cause:
> >
> > IncomingTcpConnection:
> >
> > public void run()
> > {
> >     while (true)
> >     {
> >         try
> >         {
> >             MessagingService.validateMagic(input.readInt());
> >             int header = input.readInt();
> >             int type = MessagingService.getBits(header, 1, 2);
> >             boolean isStream = MessagingService.getBits(header, 3, 1) == 1;
> >             int version = MessagingService.getBits(header, 15, 8);
> >
> >             if (isStream)
> >             {
> >                 new IncomingStreamReader(socket.getChannel()).read();
> >             }
> >             else
> >             {
> >                 int size = input.readInt();
> >                 byte[] contentBytes = new byte[size];
> >                 input.readFully(contentBytes);
> >                 MessagingService.getDeserializationExecutor().submit(new
> >                     MessageDeserializationTask(new ByteArrayInputStream(contentBytes)));
> >             }
> >         }
> >         catch (EOFException e)
> >         {
> >             if (logger.isTraceEnabled())
> >                 logger.trace("eof reading from socket; closing", e);
> >             break;
> >         }
> >         catch (IOException e)
> >         {
> >             if (logger.isDebugEnabled())
> >                 logger.debug("error reading from socket; closing", e);
> >             break;
> >         }
> >     }
> > }
> >
> > Under normal conditions the while loop is terminated after input.readInt()
> > throws EOFException, but it quits without socket.close(). What I do is wrap
> > the whole while block inside a try { ... } finally { socket.close(); }
> >
> >
> > On Thu, Apr 22, 2010 at 01:14, Jonathan Ellis  wrote:
> >>
> >> I'd like to get something besides "I'm seeing CLOSE_WAIT but I have no
> >> idea why" for a bug report, since most people aren't seeing that.
> >>
> >> On Tue, Apr 20, 2010 at 9:33 AM, Ingram Chen 
> wrote:
> >> > I traced the IncomingStreamReader source and found that the incoming
> >> > socket comes from MessagingService$SocketThread, but there is no close()
> >> > call on either the accepted socket or the socketChannel.
> >> >
> >> > Should I file a bug report?
> >> >
> >> > On Tue, Apr 20, 2010 at 11:02, Ingram Chen 
> wrote:
> >> >>
> >> >> This happened after several hours of operation, and both nodes were
> >> >> started at the same time (a clean start without any data), so it might
> >> >> not be related to Bootstrap.
> >> >>
> >> >> In system.log I do not see any messages like "xxx node dead" or any
> >> >> exceptions, and both nodes in the test are alive. They serve reads and
> >> >> writes well, too. The four connections between the nodes below remain
> >> >> healthy over time.
> >> >>
> >> >> tcp0  0 :::192.168.2.87:7000
> >> >> :::192.168.2.88:58447   ESTABLISHED
> >> >> tcp0  0 :::192.168.2.87:54986
> >> >> :::192.168.2.88:7000ESTABLISHED
> >> >> tcp0  0 :::192.168.2.87:59138
> >> >> :::192.168.2.88:7000ESTABLISHED
> >> >> tcp0  0 :::192.168.2.87:7000
> >> >> :::192.168.2.88:39074   ESTABLISHED
> >> >>
> >> >> So the connections ending up in CLOSE_WAIT should be newly created ones
> >> >> (for streaming?). This seems related to the streaming issues we suffered
> >> >> recently:
> >> >>
> >> >> http://n2.nabble.com/busy-thread-on-IncomingStreamReader-td4908640.html
> >> >>
> >> >> I would like to add some debug code around the opening and closing of
> >> >> sockets to find out what happened.
> >> >>
> >> >> Could you give me a hint about which classes I should take a look at?
> >> >>
> >> >>
> >> >> On Tue, Apr 20, 2010 at 04:47, Jonathan Ellis 
> >> >> wrote:
> >> >>>
> >> >>> Is this after doing a bootstrap or other streaming operation?  Or did
> >> >>> a node go down?
> >> >>>
> >> >>> The internal sockets are supposed to remain open, otherwise.
> >> >>>
> >> >>> On Mon, Apr 19, 2010 at 10:56 AM, Ingram Chen  >
> >> >>> wrote:
> >> >>> > Thanks for the information.
> >> >>> >
> >> >>> > We do use connection pools with the Thrift client, and ThriftAddress
> >> >>> > is on port 9160.
> >> >>> >
> >> >>> > Those problematic connections we found are all on port 7000, which
> >> 

RE: security, firewall level only?

2010-04-21 Thread Stu Hood
It isn't very well documented apparently, but if you are using 0.6, you can 
look at the 'Authenticator' property in the default config for an explanation 
of how to authenticate users.

With the SimpleAuthenticator implementation, there are properties files that 
define your users and passwords, and there is a thrift method 'login' that must 
be called before any other operations.
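
To make that concrete, here is a rough sketch of the moving parts as I
understand them; the user name, password, keyspace and file path below are
made-up examples, and the exact Thrift signatures may differ slightly between
releases. On the server side you point the config at SimpleAuthenticator and,
if I read the code right, at a users file named by the passwd.properties
system property:

    <!-- storage-conf.xml -->
    <Authenticator>org.apache.cassandra.auth.SimpleAuthenticator</Authenticator>

    # conf/passwd.properties (username=password)
    jsmith=havebadpass

From a Java Thrift client, the login call then looks roughly like this:

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.cassandra.thrift.AuthenticationRequest;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    public class LoginExample
    {
        public static void main(String[] args) throws Exception
        {
            TTransport transport = new TSocket("localhost", 9160);
            Cassandra.Client client =
                new Cassandra.Client(new TBinaryProtocol(transport));
            transport.open();

            // credentials map keyed by "username"/"password", which is what
            // SimpleAuthenticator expects
            Map<String, String> credentials = new HashMap<String, String>();
            credentials.put("username", "jsmith");
            credentials.put("password", "havebadpass");

            // login must be called before any other operation on this connection
            client.login("Keyspace1", new AuthenticationRequest(credentials));

            transport.close();
        }
    }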

I'll add this to the wiki.

-Original Message-
From: "S Ahmed" 
Sent: Wednesday, April 21, 2010 4:19pm
To: user@cassandra.apache.org
Subject: security, firewall level only?

Is security for remote clients connecting to a Cassandra node done purely at
the hardware/firewall level?

I.e. there is no username/password like in MySQL/SQL Server, correct?

Or are there permissions at the column family level per user?




PHP client crashed if a column value > 8192 bytes

2010-04-21 Thread Ken Sandney
I am using PHP as a client to talk to the Cassandra server, but I found that
if any column value is > 8192 bytes, the client crashes with the following error:

PHP Fatal error:  Uncaught exception 'TException' with message 'TSocket:
> timed out reading 1024 bytes from 10.0.0.177:9160' in
> /home/phpcassa/include/thrift/transport/TSocket.php:264
> Stack trace:
> #0 /home/phpcassa/include/thrift/transport/TBufferedTransport.php(126):
> TSocket->read(1024)
> #1 [internal function]: TBufferedTransport->read(8192)
> #2 /home/phpcassa/include/thrift/packages/cassandra/Cassandra.php(642):
> thrift_protocol_read_binary(Object(TBinaryProtocolAccelerated),
> 'cassandra_Cassa...', false)
> #3 /home/phpcassa/include/thrift/packages/cassandra/Cassandra.php(615):
> CassandraClient->recv_batch_insert()
> #4 /home/phpcassa/include/phpcassa.php(197):
> CassandraClient->batch_insert('Keyspace1', '38246', Array, 1)
> #5 /home/phpcassa/test1.php(51): CassandraCF->insert('38246', Array)
> #6 {main}
>   thrown in /home/phpcassa/include/thrift/transport/TSocket.php on line 264
>

Any idea about this?


Re: PHP client crashed if a column value > 8192 bytes

2010-04-21 Thread Ken Sandney
After many attempts I found that this error occurs only when using the PHP
thrift_protocol extension. I don't know if there are any parameters I could
adjust for this issue. By the way, without the extension the speed is
obviously slower.

On Thu, Apr 22, 2010 at 12:01 PM, Ken Sandney  wrote:

> I am using PHP as a client to talk to the Cassandra server, but I found that
> if any column value is > 8192 bytes, the client crashes with the following error:
>
> PHP Fatal error:  Uncaught exception 'TException' with message 'TSocket:
>> timed out reading 1024 bytes from 10.0.0.177:9160' in
>> /home/phpcassa/include/thrift/transport/TSocket.php:264
>> Stack trace:
>> #0 /home/phpcassa/include/thrift/transport/TBufferedTransport.php(126):
>> TSocket->read(1024)
>> #1 [internal function]: TBufferedTransport->read(8192)
>> #2 /home/phpcassa/include/thrift/packages/cassandra/Cassandra.php(642):
>> thrift_protocol_read_binary(Object(TBinaryProtocolAccelerated),
>> 'cassandra_Cassa...', false)
>> #3 /home/phpcassa/include/thrift/packages/cassandra/Cassandra.php(615):
>> CassandraClient->recv_batch_insert()
>> #4 /home/phpcassa/include/phpcassa.php(197):
>> CassandraClient->batch_insert('Keyspace1', '38246', Array, 1)
>> #5 /home/phpcassa/test1.php(51): CassandraCF->insert('38246', Array)
>> #6 {main}
>>   thrown in /home/phpcassa/include/thrift/transport/TSocket.php on line
>> 264
>>
>
> Any idea about this?
>


Re: Cassandra's bad behavior on disk failure

2010-04-21 Thread Oleg Anastasjev
> 
> Ideally I think we'd like to leave the node up to serve reads, if a
> disk is erroring out on writes but still read-able.  In my experience
> this is very common when a disk first begins to fail, as well as in
> the "disk is full" case where there is nothing actually wrong with the
> disk per se.

This depends on the hardware/drivers in use as well as on which part is
failing. On some failures the disk just disappears completely (controller
failures, SAN links, etc.).
The easiest way to bring a node to the operations team's attention is to shut
it down; then people have to deal with it anyway.
Furthermore, a single node shutdown should not hurt the cluster's performance
much in production; everyone plans capacity so as to survive a single node
failure.



New user asking for advice on database design

2010-04-21 Thread David Boxenhorn
Hi guys! I'm brand new to Cassandra, and I'm working on a database design.
I don't necessarily know all the advantages/limitations of Cassandra, so I'm
not sure that I'm doing it right...

It seems to me that I can divide my database into two parts:

1. The (mostly) normal data, where every piece of data appears only once (I
say "mostly" because I think I need reverse indexes for deletes... and once
they're there, other things).

2. The indexes, which I use for queries (a rough sketch of what I mean is
below).
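
To make the two parts concrete, here is a rough sketch of what I have in mind,
written against the 0.6 Thrift API. The keyspace, column family and key names
(Keyspace1, Items, ItemsByTag, ReverseIndex, item-42) are only illustrative,
the exact Thrift signatures may vary a bit between releases, and I may well
have the idiom wrong, which is part of what I'm asking:

    import java.io.UnsupportedEncodingException;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.Column;
    import org.apache.cassandra.thrift.ColumnOrSuperColumn;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    public class IndexSketch
    {
        private static ColumnOrSuperColumn column(String name, String value, long ts)
            throws UnsupportedEncodingException
        {
            ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
            cosc.setColumn(new Column(name.getBytes("UTF-8"), value.getBytes("UTF-8"), ts));
            return cosc;
        }

        public static void main(String[] args) throws Exception
        {
            TTransport transport = new TSocket("localhost", 9160);
            Cassandra.Client client =
                new Cassandra.Client(new TBinaryProtocol(transport));
            transport.open();

            long ts = System.currentTimeMillis() * 1000;

            // 1. the "normal" data: one row per item, every piece of data stored once
            Map<String, List<ColumnOrSuperColumn>> itemRow =
                new HashMap<String, List<ColumnOrSuperColumn>>();
            itemRow.put("Items", Arrays.asList(column("title", "My first item", ts),
                                               column("tag", "books", ts)));
            client.batch_insert("Keyspace1", "item-42", itemRow, ConsistencyLevel.QUORUM);

            // 2. a query index: row key is the tag, column names are the item keys
            Map<String, List<ColumnOrSuperColumn>> indexRow =
                new HashMap<String, List<ColumnOrSuperColumn>>();
            indexRow.put("ItemsByTag", Arrays.asList(column("item-42", "", ts)));
            client.batch_insert("Keyspace1", "books", indexRow, ConsistencyLevel.QUORUM);

            // and the reverse index: for each item, remember which index rows point
            // at it, so a delete can clean them up later
            Map<String, List<ColumnOrSuperColumn>> reverseRow =
                new HashMap<String, List<ColumnOrSuperColumn>>();
            reverseRow.put("ReverseIndex", Arrays.asList(column("ItemsByTag:books", "", ts)));
            client.batch_insert("Keyspace1", "item-42", reverseRow, ConsistencyLevel.QUORUM);

            transport.close();
        }
    }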

Questions:

1. Is the above a good architecture?
2. Would there be an advantage to putting the two parts of the database in
different keyspaces? I expect the indexes to change every once in a while as
my querying needs progress, but the normal database won't change unless I
made a mistake.

Any other advice?