Re: Cassandra OOM crash while mapping commitlog

2012-08-12 Thread Robin Verlangen
Hmm, is this issue caused by some 1.x version? It never occurred for us before.
On 11 Aug 2012 22:36, "Tyler Hobbs" wrote:

> We've seen something similar when running on a 32bit JVM, so make sure
> you're using the latest 64bit Java 6 JVM.
>
> On Sat, Aug 11, 2012 at 11:59 AM, Robin Verlangen  wrote:
>
>> Hi there,
>>
>> I currently see Cassandra crash every couple of days. I run a 3 node
>> cluster on version 1.1.2. Does anyone have a clue why it crashes? I
>> couldn't find a fix for it in a newer release. Is this an actual bug or
>> did I do something wrong?
>>
>> Thank you in advance for your time.
>>
>> Last 100 log lines before crash:
>>
>> * INFO [FlushWriter:39] 2012-08-11 12:51:00,933 Memtable.java (line 307)
>> Completed flushing
>> /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-hd-7-Data.db
>> (10778171 bytes) for commitlog position
>> ReplayPosition(segmentId=2831860362157183, position=89962041)*
>> * INFO [OptionalTasks:1] 2012-08-11 13:12:30,940 MeteredFlusher.java
>> (line 62) flushing high-traffic column family CFS(Keyspace='CloudPelican',
>> ColumnFamily='wordevents') (estimated 74393593 bytes)*
>> * INFO [OptionalTasks:1] 2012-08-11 13:12:30,941 ColumnFamilyStore.java
>> (line 643) Enqueuing flush of Memtable-wordevents@32552383(22883734/74393593
>> serialized/live bytes, 227279 ops)*
>> * INFO [FlushWriter:40] 2012-08-11 13:12:30,941 Memtable.java (line 266)
>> Writing Memtable-wordevents@32552383(22883734/74393593 serialized/live
>> bytes, 227279 ops)*
>> * INFO [FlushWriter:40] 2012-08-11 13:12:31,703 Memtable.java (line 307)
>> Completed flushing
>> /var/lib/cassandra/data/CloudPelican/wordevents/CloudPelican-wordevents-hd-158-Data.db
>> (11800327 bytes) for commitlog position
>> ReplayPosition(segmentId=2831860362157183, position=116934579)*
>> * INFO [MemoryMeter:1] 2012-08-11 14:01:36,942 Memtable.java (line 213)
>> CFS(Keyspace='OpsCenter', ColumnFamily='rollups7200') liveRatio is
>> 6.158919689235077 (just-counted was 4.408341190092955).  calculation took
>> 100ms for 16409 columns*
>> * INFO [CompactionExecutor:88] 2012-08-11 14:08:27,875
>> AutoSavingCache.java (line 262) Saved KeyCache (38164 items) in 70 ms*
>> * INFO [OptionalTasks:1] 2012-08-11 14:18:37,519 MeteredFlusher.java
>> (line 62) flushing high-traffic column family CFS(Keyspace='CloudPelican',
>> ColumnFamily='wordevents') (estimated 74346493 bytes)*
>> * INFO [OptionalTasks:1] 2012-08-11 14:18:37,519 ColumnFamilyStore.java
>> (line 643) Enqueuing flush of Memtable-wordevents@10789879(22869246/74346493
>> serialized/live bytes, 226341 ops)*
>> * INFO [FlushWriter:41] 2012-08-11 14:18:37,520 Memtable.java (line 266)
>> Writing Memtable-wordevents@10789879(22869246/74346493 serialized/live
>> bytes, 226341 ops)*
>> * INFO [FlushWriter:41] 2012-08-11 14:18:38,288 Memtable.java (line 307)
>> Completed flushing
>> /var/lib/cassandra/data/CloudPelican/wordevents/CloudPelican-wordevents-hd-159-Data.db
>> (11796722 bytes) for commitlog position
>> ReplayPosition(segmentId=2838466681767183, position=67094743)*
>> * WARN [MemoryMeter:1] 2012-08-11 14:21:55,676 Memtable.java (line 197)
>> setting live ratio to minimum of 1.0 instead of 0.45760196307363504*
>> * INFO [MemoryMeter:1] 2012-08-11 14:21:55,676 Memtable.java (line 213)
>> CFS(Keyspace='Wupa', ColumnFamily='PageViewsHost') liveRatio is
>> 1.0421914932457101 (just-counted was 1.0).  calculation took 2ms for 175
>> columns*
>> * INFO [MemoryMeter:1] 2012-08-11 14:33:20,916 Memtable.java (line 213)
>> CFS(Keyspace='OpsCenter', ColumnFamily='rollups60') liveRatio is
>> 4.067582667928898 (just-counted was 4.031462910772899).  calculation took
>> 711ms for 169224 columns*
>> * INFO [OptionalTasks:1] 2012-08-11 14:59:20,909 MeteredFlusher.java
>> (line 62) flushing high-traffic column family CFS(Keyspace='OpsCenter',
>> ColumnFamily='pdps') (estimated 74395427 bytes)*
>> * INFO [OptionalTasks:1] 2012-08-11 14:59:20,909 ColumnFamilyStore.java
>> (line 643) Enqueuing flush of Memtable-pdps@30500189(9222554/74395427
>> serialized/live bytes, 214478 ops)*
>> * INFO [FlushWriter:42] 2012-08-11 14:59:20,910 Memtable.java (line 266)
>> Writing Memtable-pdps@30500189(9222554/74395427 serialized/live bytes,
>> 214478 ops)*
>> * INFO [FlushWriter:42] 2012-08-11 14:59:21,420 Memtable.java (line 307)
>> Completed flushing
>> /var/lib/cassandra/data/OpsCenter/pdps/OpsCenter-pdps-hd-11351-Data.db
>> (6928124 bytes) for commitlog position
>> ReplayPosition(segmentId=2838466681767183, position=117115966)*
>> * INFO [MemoryMeter:1] 2012-08-11 14:59:31,138 Memtable.java (line 213)
>> CFS(Keyspace='OpsCenter', ColumnFamily='pdps') liveRatio is
>> 14.460953759840738 (just-counted was 14.460953759840738).  calculation took
>> 28ms for 878 columns*
>> * INFO [OptionalTasks:1] 2012-08-11 15:25:41,366 MeteredFlusher.java
>> (line 62) flushing high-traffic column family CFS(Keyspace='CloudPelican',
>> ColumnFamily='wordevents') (estimated 74974061 bytes)*
>> * INFO [Option

DSE solr HA

2012-08-12 Thread Mohit Anchlia
Going through this page and it looks like indexes are stored locally
http://www.datastax.com/dev/blog/cassandra-with-solr-integration-details .
My question is what happens if one of the solr nodes crashes? Is the data
indexed again on those nodes?

Also, if RF > 1 then is the same data being indexed on all RF nodes or is
that RF only for document replication?


Loading data on-demand in Cassandra

2012-08-12 Thread Oliver Plohmann

Hello,

I'm looking into Cassandra to see whether it would be a good fit for my
company. I searched the Internet, looked through the FAQs, etc., but there
are still a few open questions. I hope I'm not bothering anybody with the
usual beginner questions ...


Is there a way to load data on demand in Cassandra? For the time being, we
cannot afford to build a cluster that holds our 700 GB SQL database in RAM,
so we need to be able to load data on demand from our relational database.
Can this be done in Cassandra? There would also need to be a way to unload
data in order to reclaim RAM. It would be nice if it were possible to
register for an asynchronous notification when a value changes. Can this
be done?


Thanks for any answers.
Regards, Oliver



Re: Loading data on-demand in Cassandra

2012-08-12 Thread Dave Brosius
When data is first written, it stays in memory (in a memtable) until that
memory is flushed to disk. Once the data is only on disk, it stays there
until a read for that row key/column is requested, so in essence Cassandra
always loads on demand.


Currently there is no support for async notifications of changes.




Re: Cassandra OOM crash while mapping commitlog

2012-08-12 Thread Holger Hoffstaette
On Sun, 12 Aug 2012 13:36:42 +0200, Robin Verlangen wrote:

> Hmm, is this issue caused by some 1.x version? It never occurred for us before.

This bug was introduced in 1.1.0 and has been fixed in 1.1.3, where the
closed/recycled segments are now closed & unmapped properly. The default
sizes are also smaller.
Of course the question remains why an append-only commitlog needs to be
mmap'ed in the first place, especially for writing..

-h
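
A rough way to check whether a node is affected, offered as a sketch rather
than an official diagnostic: on Linux, each memory-mapped file appears as a
line in /proc/<pid>/maps, so counting the lines that point into the commitlog
directory shows how many segment mappings are still held. The function name
count_commitlog_maps, its arguments, and the use of the default directory
/var/lib/cassandra/commitlog are assumptions for illustration.

```shell
# Count memory-mapped commitlog segments listed in a /proc/<pid>/maps style file.
# Assumptions for illustration: $1 is the path of the maps file, $2 is the
# commitlog directory (falls back to /var/lib/cassandra/commitlog).
count_commitlog_maps() {
    maps_file=$1
    commitlog_dir=${2:-/var/lib/cassandra/commitlog}
    # -c prints the number of matching lines; -F treats the directory as a
    # literal string rather than a regular expression.
    grep -cF "$commitlog_dir/" "$maps_file"
}
```

It could be called as count_commitlog_maps "/proc/$CASSANDRA_PID/maps", where
CASSANDRA_PID is a hypothetical variable holding the server's process id. A
count that keeps growing between checks would be consistent with segments not
being unmapped.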




[gem] does "disconnect!" work properly?

2012-08-12 Thread Satoshi Yamada
hi,
I wonder whether the disconnect! method in the cassandra gem works properly,
because the code below does not raise an exception.
-
client = Cassandra.new('pool', host_ip)
ret = client.get(:db, 'test', key, option_one)
p ret
client.disconnect!
# second read after disconnect!; I expected this to raise, but it does not
ret = client.get(:db, 'test', key, option_one)
p ret
-
I use gem cassandra 0.14.0: http://rubygems.org/gems/cassandra/versions/0.14.0
thanks in advance,
satoshi

Re: Loading data on-demand in Cassandra

2012-08-12 Thread Oliver Plohmann
Thanks Dave. Does anybody know of a distributed in-memory system that can do 
this and that supports structured data (e.g. tables)? 

/Oliver

On 12.08.2012 at 21:39, Dave Brosius wrote:

> When data is first written it does remain in memory until that memory is 
> flushed. After the data is only on disk, it remains there until a read for 
> that row-key/column is requested so in essence it's always load on demand.
> 
> Currently there is no support for async notifications of changes.
> 
> 
> 
> On 08/12/2012 03:24 PM, Oliver Plohmann wrote:
>> 
>> Hello,
>> 
>> I'm looking a bit into Cassandra to see whether it would be something to go 
>> with for my company. I searched through the Internet, looked through the 
>> FAQs, etc. but there are still a few open questions. Hope I don't bother 
>> anybody with the usual beginner questions ...
>> 
>> Is there a way to do load-on-demand of data in Cassandra? For the time 
>> being, we cannot afford to build up a cluster that holds our 700 GB 
>> SQL-Database in RAM. So we need to be able to load data on-demand from our 
>> relational database. Can this be done in Cassandra? Then there also needs to 
>> be a way to unload data in order to reclaim RAM space. Would be nice if it 
>> were possible to register for an asynchronous notification in case some 
>> value was changed. Can this be done?
>> 
>> Thanks for any answers.
>> Regards, Oliver
>> 
> 


Problem with cassandra startup on Linux

2012-08-12 Thread Dwight Smith
Installed 1.1.3 on my Linux cluster. The JVM_OPTS were truncated due to
a script error in cassandra-env.sh. An invalid token is reported in the
following line:

   startswith () [ "${1#$2}" != "$1" ]
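
A likely cause, stated as an assumption rather than a confirmed diagnosis: the
POSIX shell grammar defines a function body as a compound command, and a bare
[ ... ] test is a simple command, so some shells reject the line above. A
minimal sketch of a workaround is to wrap the test in braces. The JVM_VERSION
value below is illustrative only.

```shell
# Same prefix test as the quoted line, but with the body wrapped in { ...; }
# so that it is a compound command.
startswith() {
    # ${1#$2} removes the prefix $2 from $1; the result differs from $1
    # only when $1 actually starts with $2.
    [ "${1#$2}" != "$1" ]
}

# Usage sketch with an illustrative version string.
JVM_VERSION="1.7.0_05"
if startswith "$JVM_VERSION" '1.7.'; then
    echo "version string starts with 1.7."
fi
```

The braces change only how the shell parses the definition; the comparison
itself is unchanged.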

 



Re: Cassandra OOM crash while mapping commitlog

2012-08-12 Thread Robin Verlangen
@Tyler: We were already running most of our machines on a 64-bit JVM (Sun,
not OpenJDK). Those also crashed.

@Holger: Good to hear that. I'll schedule an update for our Cassandra
cluster.

Thank you both for your time.

2012/8/13 Holger Hoffstaette 

> On Sun, 12 Aug 2012 13:36:42 +0200, Robin Verlangen wrote:
>
> > Hmm, is this issue caused by some 1.x version? It never occurred for us before.
>
> This bug was introduced in 1.1.0 and has been fixed in 1.1.3, where the
> closed/recycled segments are now closed & unmapped properly. The default
> sizes are also smaller.
> Of course the question remains why an append-only commitlog needs to be
> mmap'ed in the first place, especially for writing..
>
> -h
>
>
>


-- 
With kind regards,

Robin Verlangen
Software engineer
W http://www.robinverlangen.nl
E ro...@us2.nl
