Key_Cache @ Row_Cache

2011-07-13 Thread Nilabja Banerjee
Hi All,

Can you give me a bit of an idea of how key_cache and row_cache affect the
performance of Cassandra? How do these work in different scenarios depending
upon the data size?

Thank You
Nilabja Banerjee


Re: Key_Cache @ Row_Cache

2011-07-13 Thread 魏金仙
row_cache caches a whole row; key_cache caches the key and the row's location.
Thus, if a request hits the row_cache, the result can be returned without a
disk seek. If it hits the key_cache, the result can be obtained after one disk
seek. Without a key_cache or row_cache hit, Cassandra has to check the index
file for the record location.
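
To make the lookup order concrete, here is a tiny, self-contained model of the
read path described above. It is only an illustration: the maps and method
names are simplified stand-ins, not Cassandra's internals.

import java.util.HashMap;
import java.util.Map;

// Illustrative model: rowCache maps a key to a fully cached row;
// keyCache maps a key to the row's offset in the data file.
class ReadPathSketch {
    final Map<String, String> rowCache = new HashMap<String, String>();
    final Map<String, Long> keyCache = new HashMap<String, Long>();

    String read(String key) {
        String row = rowCache.get(key);
        if (row != null) return row;          // row cache hit: no disk seek
        Long position = keyCache.get(key);
        if (position == null)
            position = indexLookup(key);      // both caches miss: consult the index file
        return readFromDisk(position);        // then one seek into the data file
    }

    long indexLookup(String key) { return 0L; }        // stand-in for the index-file lookup
    String readFromDisk(long position) { return ""; }  // stand-in for the data-file read
}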


Your caching strategy should therefore be tuned in accordance with a few
factors:

• Consider your queries, and use the cache type that best fits your queries.

• Consider the ratio of your heap size to your cache size, and do not allow the
cache to overwhelm your heap.

• Consider the size of your rows against the size of your keys. Typically keys
will be much smaller than entire rows.

If your column family gets far more reads than writes, then setting this number
very high will needlessly consume considerable server resources. If your column
family has a lower ratio of reads to writes, but has rows with lots of data in
them (hundreds of columns), then you'll need to do some math before setting
this number very high. And unless you have certain rows that get hit a lot and
others that get hit very little, you're not going to see much of a boost here.

At 2011-07-13 15:16:10,"Nilabja Banerjee"  wrote:
Hi All,

Can you give me a bit of an idea of how key_cache and row_cache affect the
performance of Cassandra? How do these work in different scenarios depending
upon the data size?

Thank You
Nilabja Banerjee


insert a super column

2011-07-13 Thread 魏金仙
insert(key, column_path, column, consistency_level) can only insert a standard 
column.
Is batch_mutate the only API to insert a super column?


And also, can someone tell me why batch_insert and multi_get were removed in
version 0.7.4?

R: Re: Re: Re: AntiEntropy?

2011-07-13 Thread cbert...@libero.it
Thanks for the confirmation, Peter.
In the company I work for, I have suggested many times to run repair at least
once every 10 days (GCGraceSeconds is set to approximately 10 days in our
config) -- but this book has been used against me :-) I will ask to run repair
ASAP.

>Original Message
>From: peter.schul...@infidyne.com
>Date: 13/07/2011 5.07
>To: , "cbert...@libero.it"
>Subject: Re: Re: Re: AntiEntropy?
>
>> To be sure that I didn't misunderstand (English is not my mother tongue) 
here
>> is what the entire "repair paragraph" says ...
>
>Read it, I maintain my position - the book is wrong or at the very
>least strongly misleading.
>
>You *definitely* need to run nodetool repair periodically for the
>reasons documented in the link I sent before, unless you have specific
>reasons not to and know what you're doing.
>
>-- 
>/ Peter Schuller
>




Re: insert a super column

2011-07-13 Thread yulinyen

For batch_insert, I think you could use batch_mutate instead.

For multi_get, I think you could use multiget_slice instead.

Boris

On , 魏金仙 wrote:
insert(key, column_path, column, consistency_level) can only insert a
standard column. Is batch_mutate the only API to insert a super column?



And also, can someone tell me why batch_insert and multi_get were removed in
version 0.7.4?






Re: Re: Re: AntiEntropy?

2011-07-13 Thread Maki Watanabe
I'll write a FAQ for this topic :-)

maki

2011/7/13 Peter Schuller :
>> To be sure that I didn't misunderstand (English is not my mother tongue) here
>> is what the entire "repair paragraph" says ...
>
> Read it, I maintain my position - the book is wrong or at the very
> least strongly misleading.
>
> You *definitely* need to run nodetool repair periodically for the
> reasons documented in the link I sent before, unless you have specific
> reasons not to and know what you're doing.
>
> --
> / Peter Schuller
>



-- 
w3m


Re: Key_Cache @ Row_Cache

2011-07-13 Thread samal
>
> Can you give me a bit of an idea of how key_cache and row_cache affect the
> performance of Cassandra? How do these work in different scenarios depending
> upon the data size?
>
While reading, if the row cache is enabled, Cassandra checks the row cache
first, then the key cache, memtables, and disk.

row_cache stores whole rows in memory and needs tuning; a lower value is
generally preferred.
key_cache stores only the key and the row's location in memory; a higher value
is preferred.

If a row is frequently read it is good to cache it, but row size matters: large
rows can eat too much memory.
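
To make the tuning concrete, here is a minimal sketch of adjusting the
per-column-family cache sizes programmatically through the 0.7/0.8-era Thrift
API. Treat it as a sketch under assumptions: the CfDef fields (key_cache_size,
row_cache_size) and the system_update_column_family call are from that era's
interface, and the keyspace/column family names are made up.

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.CfDef;
import org.apache.cassandra.thrift.KsDef;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class CacheTuning {
    public static void main(String[] args) throws Exception {
        TTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("Keyspace1");

        // Fetch the current definitions, bump the caches, and push the change back.
        KsDef ks = client.describe_keyspace("Keyspace1");
        for (CfDef cf : ks.getCf_defs()) {
            if (cf.getName().equals("Standard1")) {
                cf.setKey_cache_size(200000); // keys are small: cache many
                cf.setRow_cache_size(1000);   // whole rows are not: keep this low
                client.system_update_column_family(cf);
            }
        }
        transport.close();
    }
}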

Also this may help
http://www.datastax.com/docs/0.8/operations/cache_tuning#configuring-key-and-row-caches

/Samal


Why do Digest Queries return hash instead of timestamp?

2011-07-13 Thread David Boxenhorn
I just saw this

http://wiki.apache.org/cassandra/DigestQueries

and I was wondering why it returns a hash of the data. Wouldn't it be better
and easier to return the timestamp? You don't really care what the data is,
you only care whether it is more or less recent than another piece of data.


Re: Why do Digest Queries return hash instead of timestamp?

2011-07-13 Thread Boris Yen
I guess it is because the timestamp does not guarantee data consistency, but
hash does.

Boris

On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn  wrote:

> I just saw this
>
> http://wiki.apache.org/cassandra/DigestQueries
>
> and I was wondering why it returns a hash of the data. Wouldn't it be
> better and easier to return the timestamp? You don't really care what the
> data is, you only care whether it is more or less recent than another piece
> of data.
>


Re: Why do Digest Queries return hash instead of timestamp?

2011-07-13 Thread David Boxenhorn
If you have two pieces of data that are different but have the same
timestamp, how can you resolve consistency?

This is a pathological situation to begin with, why should you waste effort
to (not) solve it?

On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen  wrote:

> I guess it is because the timestamp does not guarantee data consistency,
> but hash does.
>
> Boris
>
>
> On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn wrote:
>
>> I just saw this
>>
>> http://wiki.apache.org/cassandra/DigestQueries
>>
>> and I was wondering why it returns a hash of the data. Wouldn't it be
>> better and easier to return the timestamp? You don't really care what the
>> data is, you only care whether it is more or less recent than another piece
>> of data.
>>
>
>


Re: Why do Digest Queries return hash instead of timestamp?

2011-07-13 Thread Boris Yen
I can only say that the "data" does matter; that is why the developers use a
hash instead of a timestamp. If the hash value that comes from another node is
not a match, a read repair is performed so that the correct data can be
returned.

On Wed, Jul 13, 2011 at 5:08 PM, David Boxenhorn  wrote:

> If you have two pieces of data that are different but have the same
> timestamp, how can you resolve consistency?
>
> This is a pathological situation to begin with, why should you waste effort
> to (not) solve it?
>
> On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen  wrote:
>
>> I guess it is because the timestamp does not guarantee data consistency,
>> but hash does.
>>
>> Boris
>>
>>
>> On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn wrote:
>>
>>> I just saw this
>>>
>>> http://wiki.apache.org/cassandra/DigestQueries
>>>
>>> and I was wondering why it returns a hash of the data. Wouldn't it be
>>> better and easier to return the timestamp? You don't really care what the
>>> data is, you only care whether it is more or less recent than another piece
>>> of data.
>>>
>>
>>
>


Re: Why do Digest Queries return hash instead of timestamp?

2011-07-13 Thread David Boxenhorn
How would you know which data is correct, if they both have the same
timestamp?

On Wed, Jul 13, 2011 at 12:40 PM, Boris Yen  wrote:

> I can only say that the "data" does matter; that is why the developers use a
> hash instead of a timestamp. If the hash value that comes from another node is
> not a match, a read repair is performed so that the correct data can be
> returned.
>
>
> On Wed, Jul 13, 2011 at 5:08 PM, David Boxenhorn wrote:
>
>> If you have two pieces of data that are different but have the same
>> timestamp, how can you resolve consistency?
>>
>> This is a pathological situation to begin with, why should you waste
>> effort to (not) solve it?
>>
>> On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen  wrote:
>>
>>> I guess it is because the timestamp does not guarantee data consistency,
>>> but hash does.
>>>
>>> Boris
>>>
>>>
>>> On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn wrote:
>>>
 I just saw this

 http://wiki.apache.org/cassandra/DigestQueries

 and I was wondering why it returns a hash of the data. Wouldn't it be
 better and easier to return the timestamp? You don't really care what the
 data is, you only care whether it is more or less recent than another piece
 of data.

>>>
>>>
>>
>


Re: Why do Digest Queries return hash instead of timestamp?

2011-07-13 Thread Boris Yen
For a specific column, if there are two versions with the same timestamp,
the value of the column is used to break the tie.

if v1.value().compareTo(v2.value()) < 0, it means that v2 wins.
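
For reference, the whole rule fits in a few lines. A simplified sketch of that
reconciliation, modeled on the behavior described above (deletion handling
omitted; this is not Cassandra's actual class):

import java.nio.ByteBuffer;

// Newer timestamp wins; on a timestamp tie, the lexically greater value wins.
final class VersionedColumn {
    final long timestamp;
    final ByteBuffer value;

    VersionedColumn(long timestamp, ByteBuffer value) {
        this.timestamp = timestamp;
        this.value = value;
    }

    VersionedColumn reconcile(VersionedColumn other) {
        if (timestamp < other.timestamp) return other;
        if (timestamp > other.timestamp) return this;
        return value.compareTo(other.value) < 0 ? other : this; // tie: compare values
    }
}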

On Wed, Jul 13, 2011 at 7:13 PM, David Boxenhorn  wrote:

> How would you know which data is correct, if they both have the same
> timestamp?
>
> On Wed, Jul 13, 2011 at 12:40 PM, Boris Yen  wrote:
>
>> I can only say that the "data" does matter; that is why the developers use a
>> hash instead of a timestamp. If the hash value that comes from another node is
>> not a match, a read repair is performed so that the correct data can be
>> returned.
>>
>>
>> On Wed, Jul 13, 2011 at 5:08 PM, David Boxenhorn wrote:
>>
>>> If you have two pieces of data that are different but have the same
>>> timestamp, how can you resolve consistency?
>>>
>>> This is a pathological situation to begin with, why should you waste
>>> effort to (not) solve it?
>>>
>>> On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen  wrote:
>>>
 I guess it is because the timestamp does not guarantee data consistency,
 but hash does.

 Boris


 On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn wrote:

> I just saw this
>
> http://wiki.apache.org/cassandra/DigestQueries
>
> and I was wondering why it returns a hash of the data. Wouldn't it be
> better and easier to return the timestamp? You don't really care what the
> data is, you only care whether it is more or less recent than another 
> piece
> of data.
>


>>>
>>
>


RE: sstabletojson

2011-07-13 Thread Stephen Pope
 Perfect, thanks!

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Tuesday, July 12, 2011 5:53 PM
To: user@cassandra.apache.org
Subject: Re: sstabletojson

You can upgrade to 0.8.1 to fix this. :)

On Tue, Jul 12, 2011 at 1:03 PM, Stephen Pope  wrote:
>  Hey there. I'm trying to convert one of my sstables to json, but it doesn't 
> appear to be escaping quotes. As a result, I've got a line in my resulting 
> json like this:
>
> "3230303930373139313734303236efbfbf3331313733": [["6d6573736167655f6964", 
> ""<66AA9165386616028BD3FECF893BBAC204347F3BAF@CONFLICT,6.HUSHEDFIRE.COM>"", 
> 634447747524175316]],
>
>  Attempting to convert this json back into an sstable results in:
>
> C:\cassandra\apache-cassandra-0.8.0\bin>json2sstable.bat -K BIM -c TransactionLogs json.dat out.db
>
> org.codehaus.jackson.JsonParseException: Unexpected character ('<' (code 60)): was expecting comma to separate ARRAY entries
>  at [Source: json.dat; line: 31175, column: 299]
>        at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
>        at org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
>        at org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonParserBase.java:565)
>        at org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:128)
>        at org.codehaus.jackson.map.deser.UntypedObjectDeserializer.mapArray(UntypedObjectDeserializer.java:81)
>        at org.codehaus.jackson.map.deser.UntypedObjectDeserializer.deserialize(UntypedObjectDeserializer.java:62)
>        at org.codehaus.jackson.map.deser.UntypedObjectDeserializer.mapArray(UntypedObjectDeserializer.java:82)
>        at org.codehaus.jackson.map.deser.UntypedObjectDeserializer.deserialize(UntypedObjectDeserializer.java:62)
>        at org.codehaus.jackson.map.deser.MapDeserializer._readAndBind(MapDeserializer.java:197)
>        at org.codehaus.jackson.map.deser.MapDeserializer.deserialize(MapDeserializer.java:145)
>        at org.codehaus.jackson.map.deser.MapDeserializer.deserialize(MapDeserializer.java:23)
>        at org.codehaus.jackson.map.ObjectMapper._readValue(ObjectMapper.java:1261)
>        at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:517)
>        at org.codehaus.jackson.JsonParser.readValueAs(JsonParser.java:897)
>        at org.apache.cassandra.tools.SSTableImport.importUnsorted(SSTableImport.java:263)
>        at org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:252)
>        at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476)
>
>
>  Is there anything I can do with my data to fix this?
>
>  Cheers,
>  Steve
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


BulkLoader

2011-07-13 Thread Stephen Pope
 I'm trying to figure out how to use the BulkLoader, and it looks like there's 
no way to run it against a local machine, because of this:

Set<InetAddress> hosts = Gossiper.instance.getLiveMembers();
hosts.remove(FBUtilities.getLocalAddress());
if (hosts.isEmpty())
    throw new IllegalStateException("Cannot load any sstable, no live member found in the cluster");

 Is this intended behavior? May I ask why? We'd like to be able to run it 
against the local machine.

 Cheers,
 Steve


Re: Survey: Cassandra/JVM Resident Set Size increase

2011-07-13 Thread Konstantin Naryshkin
Do you mean that it is using all of the available heap? That is the expected 
behavior of most long running Java applications. The JVM will not GC until it 
needs memory (or you explicitly ask it to) and will only free up a bit of 
memory at a time. That is very good behavior from a performance stand point 
since frequent, large GCs would make your application very unresponsive. It 
also makes Java applications take up all the memory you give them.
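
A tiny, self-contained demonstration of that behavior, for anyone who wants to
see it locally (illustrative only; run with a small -Xmx to make the numbers
obvious):

// Allocate garbage, drop the references, and note that "used" memory stays
// high until a collection actually runs (here forced with System.gc()).
public class GcDemo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        byte[][] junk = new byte[64][];
        for (int i = 0; i < 64; i++)
            junk[i] = new byte[1 << 20];  // roughly 64 MB of soon-to-be garbage
        junk = null;                      // unreachable, but not yet collected
        System.out.printf("before gc: used = %d MB%n",
                (rt.totalMemory() - rt.freeMemory()) >> 20);
        System.gc();                      // the "explicitly ask" case mentioned above
        System.out.printf("after  gc: used = %d MB%n",
                (rt.totalMemory() - rt.freeMemory()) >> 20);
    }
}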

- Original Message -
From: "Sasha Dolgy" 
To: user@cassandra.apache.org
Sent: Tuesday, July 12, 2011 10:23:02 PM
Subject: Re: Survey: Cassandra/JVM Resident Set Size increase

I'll post more tomorrow ... However, we set up one node in a single-node
cluster and have left it with no data... reviewing memory consumption
graphs... it increased daily until it gobbled (highly technical term) all
memory... the system is now running just below 100% memory usage... which I
find peculiar, seeing that it is doing nothing... with no data and no peers.
On Jul 12, 2011 3:29 PM, "Chris Burroughs" 
wrote:
> ### Preamble
>
> There have been several reports on the mailing list of the JVM running
> Cassandra using "too much" memory. That is, the resident set size is
>>>(max java heap size + mmaped segments) and continues to grow until the
> process swaps, kernel oom killer comes along, or performance just
> degrades too far due to the lack of space for the page cache. It has
> been unclear from these reports if there is a pattern. My hope here is
> that by comparing JVM versions, OS versions, JVM configuration etc., we
> will find something. Thank you everyone for your time.
>
>
> Some example reports:
> - http://www.mail-archive.com/user@cassandra.apache.org/msg09279.html
> -
>
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Very-high-memory-utilization-not-caused-by-mmap-on-sstables-td5840777.html
> - https://issues.apache.org/jira/browse/CASSANDRA-2868
> -
>
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/OOM-or-what-settings-to-use-on-AWS-large-td6504060.html
> -
>
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-memory-problem-td6545642.html
>
> For reference theories include (in no particular order):
> - memory fragmentation
> - JVM bug
> - OS/glibc bug
> - direct memory
> - swap induced fragmentation
> - some other bad interaction of cassandra/jdk/jvm/os/nio-insanity.
>
> ### Survey
>
> 1. Do you think you are experiencing this problem?
>
> 2. Why? (This is a good time to share a graph like
> http://www.twitpic.com/5fdabn or
> http://img24.imageshack.us/img24/1754/cassandrarss.png)
>
> 2. Are you using mmap? (If yes, be sure to have read
> http://wiki.apache.org/cassandra/FAQ#mmap , and explain how you have
> used pmap [or another tool] to rule out mmap and top deceiving you.)
>
> 3. Are you using JNA? Was mlockall successful (it's in the logs on
startup)?
>
> 4. Is swap enabled? Are you swapping?
>
> 5. What version of Apache Cassandra are you using?
>
> 6. What is the earliest version of Apache Cassandra you recall seeing
> this problem with?
>
> 7. Have you tried the patch from CASSANDRA-2654 ?
>
> 8. What jvm and version are you using?
>
> 9. What OS and version are you using?
>
> 10. What are your jvm flags?
>
> 11. Have you tried limiting direct memory (-XX:MaxDirectMemorySize)
>
> 12. Can you characterise how much GC your cluster is doing?
>
> 13. Approximately how many read/writes per unit time is your cluster
> doing (per node or the whole cluster)?
>
> 14. How are you column families configured (key cache size, row cache
> size, etc.)?
>


Re: insert a super column

2011-07-13 Thread Konstantin Naryshkin
A ColumnPath can contain a super column, so you should be fine inserting into a
super column family (in fact, I do that). Quoting cassandra.thrift:

struct ColumnPath {
3: required string column_family,
4: optional binary super_column,
5: optional binary column,
}
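
For what it's worth, here is a minimal sketch of such an insert against the
0.7/0.8-era Thrift Java bindings, where insert takes a ColumnParent whose
optional super_column field addresses the super column. The keyspace, column
family, and names below are made up, and exact signatures vary between
versions:

import java.nio.ByteBuffer;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class SuperColumnInsert {
    public static void main(String[] args) throws Exception {
        TTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("Keyspace1");

        // Address the super column through the parent's optional super_column field.
        ColumnParent parent = new ColumnParent("Super1");
        parent.setSuper_column(ByteBuffer.wrap("sc1".getBytes("UTF-8")));

        Column column = new Column(); // no-arg constructor plus setters works across versions
        column.setName(ByteBuffer.wrap("subcol".getBytes("UTF-8")));
        column.setValue(ByteBuffer.wrap("value".getBytes("UTF-8")));
        column.setTimestamp(System.currentTimeMillis() * 1000); // microseconds by convention

        client.insert(ByteBuffer.wrap("row1".getBytes("UTF-8")), parent,
                      column, ConsistencyLevel.ONE);
        transport.close();
    }
}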

- Original Message -
From: "魏金仙" 
To: "user" 
Sent: Wednesday, July 13, 2011 7:43:15 AM
Subject: insert a super column

insert(key, column_path, column, consistency_level) can only insert a standard 
column.
Is batch_mutate the only API to insert a super column?


and also can someone tell why batch_insert,multi_get is removed in version 
0.7.4?


Re: Why do Digest Queries return hash instead of timestamp?

2011-07-13 Thread David Boxenhorn
Is that the actual reason?

This seems like a big inefficiency to me. For those of us who don't worry
about this extreme edge case (that probably will NEVER happen in real life,
for most applications), is there a way to turn this off?

Or am I wrong about this making the operation MUCH more expensive?


On Wed, Jul 13, 2011 at 3:20 PM, Boris Yen  wrote:

> For a specific column, if there are two versions with the same timestamp,
> the value of the column is used to break the tie.
>
> if v1.value().compareTo(v2.value()) < 0, it means that v2 wins.
>
> On Wed, Jul 13, 2011 at 7:13 PM, David Boxenhorn wrote:
>
>> How would you know which data is correct, if they both have the same
>> timestamp?
>>
>> On Wed, Jul 13, 2011 at 12:40 PM, Boris Yen  wrote:
>>
>>> I can only say that the "data" does matter; that is why the developers use a
>>> hash instead of a timestamp. If the hash value that comes from another node is
>>> not a match, a read repair is performed so that the correct data can be
>>> returned.
>>>
>>>
>>> On Wed, Jul 13, 2011 at 5:08 PM, David Boxenhorn wrote:
>>>
 If you have two pieces of data that are different but have the same
 timestamp, how can you resolve consistency?

 This is a pathological situation to begin with, why should you waste
 effort to (not) solve it?

 On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen  wrote:

> I guess it is because the timestamp does not guarantee data
> consistency, but hash does.
>
> Boris
>
>
> On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn 
> wrote:
>
>> I just saw this
>>
>> http://wiki.apache.org/cassandra/DigestQueries
>>
>> and I was wondering why it returns a hash of the data. Wouldn't it be
>> better and easier to return the timestamp? You don't really care what the
>> data is, you only care whether it is more or less recent than another 
>> piece
>> of data.
>>
>
>

>>>
>>
>


How to remove/add node

2011-07-13 Thread Abdul Haq Shaik
Hi,

I have deleted the data, commitlog, and saved cache directories. I have
removed one of the nodes from the seeds in cassandra.yaml. When I tried to
use nodetool, it is still showing the removed node as up.

Thanks,

Abdul


RE: BulkLoader

2011-07-13 Thread Stephen Pope
 I think I've solved my own problem here. After generating the sstable using 
json2sstable it looks like I can simply copy the created sstable into my data 
directory.

 Can anyone think of any potential problems with doing it this way?

-Original Message-
From: Stephen Pope [mailto:stephen.p...@quest.com] 
Sent: Wednesday, July 13, 2011 9:32 AM
To: user@cassandra.apache.org
Subject: BulkLoader

 I'm trying to figure out how to use the BulkLoader, and it looks like there's 
no way to run it against a local machine, because of this:

Set<InetAddress> hosts = Gossiper.instance.getLiveMembers();
hosts.remove(FBUtilities.getLocalAddress());
if (hosts.isEmpty())
    throw new IllegalStateException("Cannot load any sstable, no live member found in the cluster");

 Is this intended behavior? May I ask why? We'd like to be able to run it 
against the local machine.

 Cheers,
 Steve


Re: AssertionError: No data found for NamesQueryFilter

2011-07-13 Thread Jonathan Ellis
This (https://issues.apache.org/jira/browse/CASSANDRA-2653) is fixed
in 0.7.7, which will be out soon.

On Tue, Jul 12, 2011 at 9:13 PM, Kyle Gibson
 wrote:
> Running version 0.7.6-2, recently upgraded from 0.7.3.
>
> I am get a time out exception when I run a particular
> get_indexed_slices, which results in the following error showing up on
> a few nodes:
>
> ERROR [ReadStage:16] 2011-07-12 23:01:31,424 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[ReadStage:16,5,main]
> java.lang.AssertionError: No data found for NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=18 cap=29],java.nio.HeapByteBuffer[pos=22 lim=28 cap=29]) in DecoratedKey(39222808797828327646767854834585383073, 464f2d47584f4c4833454a46384e4c54543341544c4339):QueryPath(columnFamilyName='subscriptions', superColumnName='null', columnName='null') (original filter NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=18 cap=29],java.nio.HeapByteBuffer[pos=22 lim=28 cap=29])) from expression 'XXX EQ YYY'
>        at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1603)
>        at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
>        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>        at java.lang.Thread.run(Unknown Source)
>
>
> Thanks
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Why do Digest Queries return hash instead of timestamp?

2011-07-13 Thread Jonathan Ellis
(1) the hash calculation is a small amount of CPU -- MD5 is
specifically designed to be efficient in this kind of situation
(2) we compute one hash per query, so for multiple columns the
advantage over timestamp-per-column gets large quickly.
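
A rough illustration of point (2): a single MD5 digest covers an arbitrary
number of columns, whereas timestamp comparison would be per-column. This only
sketches the idea (the column layout here is invented, not Cassandra's exact
digest input):

import java.nio.charset.Charset;
import java.security.MessageDigest;

public class RowDigest {
    // Fold every column's name, value, and timestamp into one 16-byte MD5,
    // so replicas can compare a whole query result with one small token.
    static byte[] digest(String[][] columns) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        Charset utf8 = Charset.forName("UTF-8");
        for (String[] col : columns) {        // col = {name, value, timestamp}
            md5.update(col[0].getBytes(utf8));
            md5.update(col[1].getBytes(utf8));
            md5.update(col[2].getBytes(utf8));
        }
        return md5.digest();
    }

    public static void main(String[] args) throws Exception {
        byte[] d = digest(new String[][] {
            { "name", "alice", "1310367600" },
            { "city", "paris", "1310367601" },
        });
        System.out.println(d.length + " bytes, regardless of column count"); // 16
    }
}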

On Wed, Jul 13, 2011 at 7:31 AM, David Boxenhorn  wrote:
> Is that the actual reason?
>
> This seems like a big inefficiency to me. For those of us who don't worry
> about this extreme edge case (that probably will NEVER happen in real life,
> for most applications), is there a way to turn this off?
>
> Or am I wrong about this making the operation MUCH more expensive?
>
>
> On Wed, Jul 13, 2011 at 3:20 PM, Boris Yen  wrote:
>>
>> For a specific column, if there are two versions with the same timestamp,
>> the value of the column is used to break the tie.
>> if v1.value().compareTo(v2.value()) < 0, it means that v2 wins.
>> On Wed, Jul 13, 2011 at 7:13 PM, David Boxenhorn 
>> wrote:
>>>
>>> How would you know which data is correct, if they both have the same
>>> timestamp?
>>>
>>> On Wed, Jul 13, 2011 at 12:40 PM, Boris Yen  wrote:

 I can only say that the "data" does matter; that is why the developers use a
 hash instead of a timestamp. If the hash value that comes from another node is
 not a match, a read repair is performed so that the correct data can be
 returned.

 On Wed, Jul 13, 2011 at 5:08 PM, David Boxenhorn 
 wrote:
>
> If you have two pieces of data that are different but have the same
> timestamp, how can you resolve consistency?
>
> This is a pathological situation to begin with, why should you waste
> effort to (not) solve it?
>
> On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen  wrote:
>>
>> I guess it is because the timestamp does not guarantee data
>> consistency, but hash does.
>> Boris
>>
>> On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn 
>> wrote:
>>>
>>> I just saw this
>>>
>>> http://wiki.apache.org/cassandra/DigestQueries
>>>
>>> and I was wondering why it returns a hash of the data. Wouldn't it be
>>> better and easier to return the timestamp? You don't really care what 
>>> the
>>> data is, you only care whether it is more or less recent than another 
>>> piece
>>> of data.
>>
>

>>>
>>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: BulkLoader

2011-07-13 Thread Jonathan Ellis
Sure, that will work fine with a single machine.  The advantage of
bulkloader is it handles splitting the sstable up and sending each
piece to the right place(s) when you have more than one.

On Wed, Jul 13, 2011 at 7:47 AM, Stephen Pope  wrote:
>  I think I've solved my own problem here. After generating the sstable using 
> json2sstable it looks like I can simply copy the created sstable into my data 
> directory.
>
>  Can anyone think of any potential problems with doing it this way?
>
> -Original Message-
> From: Stephen Pope [mailto:stephen.p...@quest.com]
> Sent: Wednesday, July 13, 2011 9:32 AM
> To: user@cassandra.apache.org
> Subject: BulkLoader
>
>  I'm trying to figure out how to use the BulkLoader, and it looks like 
> there's no way to run it against a local machine, because of this:
>
>                Set<InetAddress> hosts = Gossiper.instance.getLiveMembers();
>                hosts.remove(FBUtilities.getLocalAddress());
>                if (hosts.isEmpty())
>                    throw new IllegalStateException("Cannot load any sstable, no live member found in the cluster");
>
>  Is this intended behavior? May I ask why? We'd like to be able to run it 
> against the local machine.
>
>  Cheers,
>  Steve
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


One node down but it thinks its fine...

2011-07-13 Thread Ray Slakinski
One of our nodes, which happens to be the seed, thinks it's up and all the other
nodes are down. However, all the other nodes think the seed is down instead.
The logs for the seed node show everything is running as it should be. I've
tried restarting the node, turning on/off gossip and thrift, and nothing seems
to get the node to see the rest of its ring as up and running. I have also
tried restarting one of the other nodes, which had no effect on the situation.
Below are the ring outputs for the seed and one other node in the ring, plus a
ping to show that the seed can ping the other node.

# bin/nodetool -h 0.0.0.0 ring
Address Status State Load Owns Token 
 141784319550391026443072753096570088105 
127.0.0.1 Up Normal 4.61 GB 16.67% 0 
xx.xxx.30.210 Down Normal ? 16.67% 28356863910078205288614550619314017621 
xx.xx.90.87 Down Normal ? 16.67% 56713727820156410577229101238628035242 
xx.xx.22.236 Down Normal ? 16.67% 85070591730234615865843651857942052863 
xx.xx.97.96 Down Normal ? 16.67% 113427455640312821154458202477256070484 
xx.xxx.17.122 Down Normal ? 16.67% 141784319550391026443072753096570088105 


# ping xx.xxx.30.210
PING xx.xxx.30.210 (xx.xxx.30.210) 56(84) bytes of data.
64 bytes from xx.xxx.30.210: icmp_req=1 ttl=61 time=0.299 ms
64 bytes from xx.xxx.30.210: icmp_req=2 ttl=61 time=0.287 ms
^C
--- xx.xxx.30.210 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.287/0.293/0.299/0.006 ms


# bin/nodetool -h xx.xxx.30.210 ring
Address Status State Load Owns Token 
 141784319550391026443072753096570088105 
xx.xxx.23.40 Down Normal ? 16.67% 0 
xx.xxx.30.210 Up Normal 10.58 GB 16.67% 28356863910078205288614550619314017621 
xx.xx.90.87 Up Normal 10.47 GB 16.67% 56713727820156410577229101238628035242 
xx.xx.22.236 Up Normal 9.63 GB 16.67% 85070591730234615865843651857942052863 
xx.xx.97.96 Up Normal 10.68 GB 16.67% 113427455640312821154458202477256070484 
xx.xxx.17.122 Up Normal 10.18 GB 16.67% 141784319550391026443072753096570088105 

-- 
Ray Slakinski




RE: BulkLoader

2011-07-13 Thread Stephen Pope
 Fair enough. My original question stands then. :) 

 Why aren't you allowed to talk to a local installation using BulkLoader?

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Wednesday, July 13, 2011 11:06 AM
To: user@cassandra.apache.org
Subject: Re: BulkLoader

Sure, that will work fine with a single machine.  The advantage of
bulkloader is it handles splitting the sstable up and sending each
piece to the right place(s) when you have more than one.

On Wed, Jul 13, 2011 at 7:47 AM, Stephen Pope  wrote:
>  I think I've solved my own problem here. After generating the sstable using 
> json2sstable it looks like I can simply copy the created sstable into my data 
> directory.
>
>  Can anyone think of any potential problems with doing it this way?
>
> -Original Message-
> From: Stephen Pope [mailto:stephen.p...@quest.com]
> Sent: Wednesday, July 13, 2011 9:32 AM
> To: user@cassandra.apache.org
> Subject: BulkLoader
>
>  I'm trying to figure out how to use the BulkLoader, and it looks like 
> there's no way to run it against a local machine, because of this:
>
>                Set<InetAddress> hosts = Gossiper.instance.getLiveMembers();
>                hosts.remove(FBUtilities.getLocalAddress());
>                if (hosts.isEmpty())
>                    throw new IllegalStateException("Cannot load any sstable, no live member found in the cluster");
>
>  Is this intended behavior? May I ask why? We'd like to be able to run it 
> against the local machine.
>
>  Cheers,
>  Steve
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: BulkLoader

2011-07-13 Thread Jonathan Ellis
Because it's hooking directly into gossip, so the local instance it's
ignoring is the bulkloader process, not Cassandra.

You'd need to run the bulkloader from a different IP than Cassandra.

On Wed, Jul 13, 2011 at 8:22 AM, Stephen Pope  wrote:
>  Fair enough. My original question stands then. :)
>
>  Why aren't you allowed to talk to a local installation using BulkLoader?
>
> -Original Message-
> From: Jonathan Ellis [mailto:jbel...@gmail.com]
> Sent: Wednesday, July 13, 2011 11:06 AM
> To: user@cassandra.apache.org
> Subject: Re: BulkLoader
>
> Sure, that will work fine with a single machine.  The advantage of
> bulkloader is it handles splitting the sstable up and sending each
> piece to the right place(s) when you have more than one.
>
> On Wed, Jul 13, 2011 at 7:47 AM, Stephen Pope  wrote:
>>  I think I've solved my own problem here. After generating the sstable using 
>> json2sstable it looks like I can simply copy the created sstable into my 
>> data directory.
>>
>>  Can anyone think of any potential problems with doing it this way?
>>
>> -Original Message-
>> From: Stephen Pope [mailto:stephen.p...@quest.com]
>> Sent: Wednesday, July 13, 2011 9:32 AM
>> To: user@cassandra.apache.org
>> Subject: BulkLoader
>>
>>  I'm trying to figure out how to use the BulkLoader, and it looks like 
>> there's no way to run it against a local machine, because of this:
>>
>>                Set<InetAddress> hosts = Gossiper.instance.getLiveMembers();
>>                hosts.remove(FBUtilities.getLocalAddress());
>>                if (hosts.isEmpty())
>>                    throw new IllegalStateException("Cannot load any sstable, no live member found in the cluster");
>>
>>  Is this intended behavior? May I ask why? We'd like to be able to run it 
>> against the local machine.
>>
>>  Cheers,
>>  Steve
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


RE: BulkLoader

2011-07-13 Thread Stephen Pope
 Ahhh..ok. Thanks.

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Wednesday, July 13, 2011 11:35 AM
To: user@cassandra.apache.org
Subject: Re: BulkLoader

Because it's hooking directly into gossip, so the local instance it's
ignoring is the bulkloader process, not Cassandra.

You'd need to run the bulkloader from a different IP than Cassandra.

On Wed, Jul 13, 2011 at 8:22 AM, Stephen Pope  wrote:
>  Fair enough. My original question stands then. :)
>
>  Why aren't you allowed to talk to a local installation using BulkLoader?
>
> -Original Message-
> From: Jonathan Ellis [mailto:jbel...@gmail.com]
> Sent: Wednesday, July 13, 2011 11:06 AM
> To: user@cassandra.apache.org
> Subject: Re: BulkLoader
>
> Sure, that will work fine with a single machine.  The advantage of
> bulkloader is it handles splitting the sstable up and sending each
> piece to the right place(s) when you have more than one.
>
> On Wed, Jul 13, 2011 at 7:47 AM, Stephen Pope  wrote:
>>  I think I've solved my own problem here. After generating the sstable using 
>> json2sstable it looks like I can simply copy the created sstable into my 
>> data directory.
>>
>>  Can anyone think of any potential problems with doing it this way?
>>
>> -Original Message-
>> From: Stephen Pope [mailto:stephen.p...@quest.com]
>> Sent: Wednesday, July 13, 2011 9:32 AM
>> To: user@cassandra.apache.org
>> Subject: BulkLoader
>>
>>  I'm trying to figure out how to use the BulkLoader, and it looks like 
>> there's no way to run it against a local machine, because of this:
>>
>>                Set<InetAddress> hosts = Gossiper.instance.getLiveMembers();
>>                hosts.remove(FBUtilities.getLocalAddress());
>>                if (hosts.isEmpty())
>>                    throw new IllegalStateException("Cannot load any sstable, no live member found in the cluster");
>>
>>  Is this intended behavior? May I ask why? We'd like to be able to run it 
>> against the local machine.
>>
>>  Cheers,
>>  Steve
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Why do Digest Queries return hash instead of timestamp?

2011-07-13 Thread David Boxenhorn
Got it.

Thanks!

On Wed, Jul 13, 2011 at 6:05 PM, Jonathan Ellis  wrote:

> (1) the hash calculation is a small amount of CPU -- MD5 is
> specifically designed to be efficient in this kind of situation
> (2) we compute one hash per query, so for multiple columns the
> advantage over timestamp-per-column gets large quickly.
>
> On Wed, Jul 13, 2011 at 7:31 AM, David Boxenhorn 
> wrote:
> > Is that the actual reason?
> >
> > This seems like a big inefficiency to me. For those of us who don't worry
> > about this extreme edge case (that probably will NEVER happen in real
> life,
> > for most applications), is there a way to turn this off?
> >
> > Or am I wrong about this making the operation MUCH more expensive?
> >
> >
> > On Wed, Jul 13, 2011 at 3:20 PM, Boris Yen  wrote:
> >>
> >> For a specific column, if there are two versions with the same
> timestamp,
> >> the value of the column is used to break the tie.
> >> if v1.value().compareTo(v2.value()) < 0, it means that v2 wins.
> >> On Wed, Jul 13, 2011 at 7:13 PM, David Boxenhorn 
> >> wrote:
> >>>
> >>> How would you know which data is correct, if they both have the same
> >>> timestamp?
> >>>
> >>> On Wed, Jul 13, 2011 at 12:40 PM, Boris Yen 
> wrote:
> 
>  I can only say that the "data" does matter; that is why the developers
>  use a hash instead of a timestamp. If the hash value that comes from
>  another node is not a match, a read repair is performed so that the
>  correct data can be returned.
> 
>  On Wed, Jul 13, 2011 at 5:08 PM, David Boxenhorn 
>  wrote:
> >
> > If you have two pieces of data that are different but have the same
> > timestamp, how can you resolve consistency?
> >
> > This is a pathological situation to begin with, why should you waste
> > effort to (not) solve it?
> >
> > On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen 
> wrote:
> >>
> >> I guess it is because the timestamp does not guarantee data
> >> consistency, but hash does.
> >> Boris
> >>
> >> On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn <
> da...@citypath.com>
> >> wrote:
> >>>
> >>> I just saw this
> >>>
> >>> http://wiki.apache.org/cassandra/DigestQueries
> >>>
> >>> and I was wondering why it returns a hash of the data. Wouldn't it
> be
> >>> better and easier to return the timestamp? You don't really care
> what the
> >>> data is, you only care whether it is more or less recent than
> another piece
> >>> of data.
> >>
> >
> 
> >>>
> >>
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>


Re: One node down but it thinks its fine...

2011-07-13 Thread samal
Check that the seed IP is the same on all nodes and is not a loopback IP in the cluster.

On Wed, Jul 13, 2011 at 8:40 PM, Ray Slakinski wrote:

> One of our nodes, which happens to be the seed, thinks it's up and all the
> other nodes are down. However, all the other nodes think the seed is down
> instead. The logs for the seed node show everything is running as it should
> be. I've tried restarting the node, turning on/off gossip and thrift, and
> nothing seems to get the node to see the rest of its ring as up and running.
> I have also tried restarting one of the other nodes, which had no effect on
> the situation. Below are the ring outputs for the seed and one other node in
> the ring, plus a ping to show that the seed can ping the other node.
>
> # bin/nodetool -h 0.0.0.0 ring
> Address Status State Load Owns Token
>  141784319550391026443072753096570088105
> 127.0.0.1 Up Normal 4.61 GB 16.67% 0
> xx.xxx.30.210 Down Normal ? 16.67% 28356863910078205288614550619314017621
> xx.xx.90.87 Down Normal ? 16.67% 56713727820156410577229101238628035242
> xx.xx.22.236 Down Normal ? 16.67% 85070591730234615865843651857942052863
> xx.xx.97.96 Down Normal ? 16.67% 113427455640312821154458202477256070484
> xx.xxx.17.122 Down Normal ? 16.67% 141784319550391026443072753096570088105
>
>
> # ping xx.xxx.30.210
> PING xx.xxx.30.210 (xx.xxx.30.210) 56(84) bytes of data.
> 64 bytes from xx.xxx.30.210: icmp_req=1 ttl=61 time=0.299 ms
> 64 bytes from xx.xxx.30.210: icmp_req=2 ttl=61 time=0.287 ms
> ^C
> --- xx.xxx.30.210 ping statistics ---
> 2 packets transmitted, 2 received, 0% packet loss, time 999ms
> rtt min/avg/max/mdev = 0.287/0.293/0.299/0.006 ms
>
>
> # bin/nodetool -h xx.xxx.30.210 ring
> Address Status State Load Owns Token
>  141784319550391026443072753096570088105
> xx.xxx.23.40 Down Normal ? 16.67% 0
> xx.xxx.30.210 Up Normal 10.58 GB 16.67%
> 28356863910078205288614550619314017621
> xx.xx.90.87 Up Normal 10.47 GB 16.67%
> 56713727820156410577229101238628035242
> xx.xx.22.236 Up Normal 9.63 GB 16.67%
> 85070591730234615865843651857942052863
> xx.xx.97.96 Up Normal 10.68 GB 16.67%
> 113427455640312821154458202477256070484
> xx.xxx.17.122 Up Normal 10.18 GB 16.67%
> 141784319550391026443072753096570088105
>
> --
> Ray Slakinski
>
>
>


JSR-347

2011-07-13 Thread Pete Muir
Hi,

I am looking to "round out" the EG membership of JSR-347 so that we can get 
going with discussions. It would be great if someone from the Cassandra 
community could join to represent the experiences of developing HBase :-)

We'll be communicating using https://groups.google.com/forum/#!forum/jsr347 - 
so that would be a good place to start whilst we wait for the JCP to process 
formal nominations!

Let me know any queries

Best,

Pete

Re: One node down but it thinks its fine...

2011-07-13 Thread Sasha Dolgy
any firewall changes?  ping is fine ... but if you can't get from
node(a) to nodes(n) on the specific ports...

On Wed, Jul 13, 2011 at 6:47 PM, samal  wrote:
> Check that the seed IP is the same on all nodes and is not a loopback IP in the cluster.
>
> On Wed, Jul 13, 2011 at 8:40 PM, Ray Slakinski 
> wrote:
>>
>> One of our nodes, which happens to be the seed, thinks it's up and all the
>> other nodes are down. However, all the other nodes think the seed is down
>> instead. The logs for the seed node show everything is running as it should
>> be. I've tried restarting the node, turning on/off gossip and thrift, and
>> nothing seems to get the node to see the rest of its ring as up and running.
>> I have also tried restarting one of the other nodes, which had no effect on
>> the situation. Below are the ring outputs for the seed and one other node in
>> the ring, plus a ping to show that the seed can ping the other node.
>>
>> # bin/nodetool -h 0.0.0.0 ring
>> Address Status State Load Owns Token
>>  141784319550391026443072753096570088105
>> 127.0.0.1 Up Normal 4.61 GB 16.67% 0
>> xx.xxx.30.210 Down Normal ? 16.67% 28356863910078205288614550619314017621
>> xx.xx.90.87 Down Normal ? 16.67% 56713727820156410577229101238628035242
>> xx.xx.22.236 Down Normal ? 16.67% 85070591730234615865843651857942052863
>> xx.xx.97.96 Down Normal ? 16.67% 113427455640312821154458202477256070484
>> xx.xxx.17.122 Down Normal ? 16.67% 141784319550391026443072753096570088105
>>
>>
>> # ping xx.xxx.30.210
>> PING xx.xxx.30.210 (xx.xxx.30.210) 56(84) bytes of data.
>> 64 bytes from xx.xxx.30.210: icmp_req=1 ttl=61 time=0.299 ms
>> 64 bytes from xx.xxx.30.210: icmp_req=2 ttl=61 time=0.287 ms
>> ^C
>> --- xx.xxx.30.210 ping statistics ---
>> 2 packets transmitted, 2 received, 0% packet loss, time 999ms
>> rtt min/avg/max/mdev = 0.287/0.293/0.299/0.006 ms
>>
>>
>> # bin/nodetool -h xx.xxx.30.210 ring
>> Address Status State Load Owns Token
>>  141784319550391026443072753096570088105
>> xx.xxx.23.40 Down Normal ? 16.67% 0
>> xx.xxx.30.210 Up Normal 10.58 GB 16.67%
>> 28356863910078205288614550619314017621
>> xx.xx.90.87 Up Normal 10.47 GB 16.67%
>> 56713727820156410577229101238628035242
>> xx.xx.22.236 Up Normal 9.63 GB 16.67%
>> 85070591730234615865843651857942052863
>> xx.xx.97.96 Up Normal 10.68 GB 16.67%
>> 113427455640312821154458202477256070484
>> xx.xxx.17.122 Up Normal 10.18 GB 16.67%
>> 141784319550391026443072753096570088105
>>
>> --
>> Ray Slakinski
>>
>>
>
>



-- 
Sasha Dolgy
sasha.do...@gmail.com


Re: JSR-347

2011-07-13 Thread Yang
"data grids", it seems that this really does not have much
relationship to "java", since all major noSQL solutions explicitly
create interfaces in almost all languages and try to be
language-agnostic by using RPC like thrift,avro etc.

On Wed, Jul 13, 2011 at 9:06 AM, Pete Muir  wrote:
> Hi,
>
> I am looking to "round out" the EG membership of JSR-347 so that we can get 
> going with discussions. It would be great if someone from the Cassandra 
> community could join to represent the experiences of developing HBase :-)
>
> We'll be communicating using https://groups.google.com/forum/#!forum/jsr347 - 
> so that would be a good place to start whilst we wait for the JCP to process 
> formal nominations!
>
> Let me know any queries
>
> Best,
>
> Pete


Re: BulkLoader

2011-07-13 Thread Sylvain Lebresne
Also note that if you have a cassandra node running on the local node
from which you want to bulk load sstables, there is a JMX
(StorageService->bulkLoad) call to do just that. May be simpler than
using sstableloader if that is what you want to do.
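
For anyone who wants to try the MBean route, a minimal sketch of the
invocation, assuming the default JMX port 7199 and the standard StorageService
MBean name (the sstable directory is a placeholder, and note the correction
further down this thread that the call is broken before 0.8.2):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class BulkLoadViaJmx {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName storageService =
                    new ObjectName("org.apache.cassandra.db:type=StorageService");
            // bulkLoad takes the directory that holds the sstables to stream in.
            mbs.invoke(storageService, "bulkLoad",
                       new Object[] { "/path/to/sstables" },
                       new String[] { "java.lang.String" });
        } finally {
            connector.close();
        }
    }
}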

--
Sylvain

On Wed, Jul 13, 2011 at 3:46 PM, Stephen Pope  wrote:
>  Ahhh..ok. Thanks.
>
> -Original Message-
> From: Jonathan Ellis [mailto:jbel...@gmail.com]
> Sent: Wednesday, July 13, 2011 11:35 AM
> To: user@cassandra.apache.org
> Subject: Re: BulkLoader
>
> Because it's hooking directly into gossip, so the local instance it's
> ignoring is the bulkloader process, not Cassandra.
>
> You'd need to run the bulkloader from a different IP than Cassandra.
>
> On Wed, Jul 13, 2011 at 8:22 AM, Stephen Pope  wrote:
>>  Fair enough. My original question stands then. :)
>>
>>  Why aren't you allowed to talk to a local installation using BulkLoader?
>>
>> -Original Message-
>> From: Jonathan Ellis [mailto:jbel...@gmail.com]
>> Sent: Wednesday, July 13, 2011 11:06 AM
>> To: user@cassandra.apache.org
>> Subject: Re: BulkLoader
>>
>> Sure, that will work fine with a single machine.  The advantage of
>> bulkloader is it handles splitting the sstable up and sending each
>> piece to the right place(s) when you have more than one.
>>
>> On Wed, Jul 13, 2011 at 7:47 AM, Stephen Pope  wrote:
>>>  I think I've solved my own problem here. After generating the sstable 
>>> using json2sstable it looks like I can simply copy the created sstable into 
>>> my data directory.
>>>
>>>  Can anyone think of any potential problems with doing it this way?
>>>
>>> -Original Message-
>>> From: Stephen Pope [mailto:stephen.p...@quest.com]
>>> Sent: Wednesday, July 13, 2011 9:32 AM
>>> To: user@cassandra.apache.org
>>> Subject: BulkLoader
>>>
>>>  I'm trying to figure out how to use the BulkLoader, and it looks like 
>>> there's no way to run it against a local machine, because of this:
>>>
>>>                Set<InetAddress> hosts = Gossiper.instance.getLiveMembers();
>>>                hosts.remove(FBUtilities.getLocalAddress());
>>>                if (hosts.isEmpty())
>>>                    throw new IllegalStateException("Cannot load any sstable, no live member found in the cluster");
>>>
>>>  Is this intended behavior? May I ask why? We'd like to be able to run it 
>>> against the local machine.
>>>
>>>  Cheers,
>>>  Steve
>>>
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>


Escaping characters in cqlsh

2011-07-13 Thread Blake Visin
I am trying to get all the columns named "fmd:" in cqlsh.

I am using:

select 'fmd:'..'fmd;' from feeds where;

But I am getting errors (as expected).  Is there any way to escape the colon
or semicolon in cqlsh?

Thanks,
Blake


Re: One node down but it thinks its fine...

2011-07-13 Thread Ray Slakinski
Was all working before, but we ran out of file handles and ended up restarting 
the nodes. No yaml changes have occurred. 

Ray Slakinski

On 2011-07-13, at 12:55 PM, Sasha Dolgy  wrote:

> any firewall changes?  ping is fine ... but if you can't get from
> node(a) to nodes(n) on the specific ports...
> 
> On Wed, Jul 13, 2011 at 6:47 PM, samal  wrote:
>> Check that the seed IP is the same on all nodes and is not a loopback IP in the cluster.
>> 
>> On Wed, Jul 13, 2011 at 8:40 PM, Ray Slakinski 
>> wrote:
>>> 
>>> One of our nodes, which happens to be the seed, thinks it's up and all the
>>> other nodes are down. However, all the other nodes think the seed is down
>>> instead. The logs for the seed node show everything is running as it should
>>> be. I've tried restarting the node, turning on/off gossip and thrift, and
>>> nothing seems to get the node to see the rest of its ring as up and running.
>>> I have also tried restarting one of the other nodes, which had no effect on
>>> the situation. Below are the ring outputs for the seed and one other node in
>>> the ring, plus a ping to show that the seed can ping the other node.
>>> 
>>> # bin/nodetool -h 0.0.0.0 ring
>>> Address Status State Load Owns Token
>>>  141784319550391026443072753096570088105
>>> 127.0.0.1 Up Normal 4.61 GB 16.67% 0
>>> xx.xxx.30.210 Down Normal ? 16.67% 28356863910078205288614550619314017621
>>> xx.xx.90.87 Down Normal ? 16.67% 56713727820156410577229101238628035242
>>> xx.xx.22.236 Down Normal ? 16.67% 85070591730234615865843651857942052863
>>> xx.xx.97.96 Down Normal ? 16.67% 113427455640312821154458202477256070484
>>> xx.xxx.17.122 Down Normal ? 16.67% 141784319550391026443072753096570088105
>>> 
>>> 
>>> # ping xx.xxx.30.210
>>> PING xx.xxx.30.210 (xx.xxx.30.210) 56(84) bytes of data.
>>> 64 bytes from xx.xxx.30.210: icmp_req=1 ttl=61 time=0.299 ms
>>> 64 bytes from xx.xxx.30.210: icmp_req=2 ttl=61 time=0.287 ms
>>> ^C
>>> --- xx.xxx.30.210 ping statistics ---
>>> 2 packets transmitted, 2 received, 0% packet loss, time 999ms
>>> rtt min/avg/max/mdev = 0.287/0.293/0.299/0.006 ms
>>> 
>>> 
>>> # bin/nodetool -h xx.xxx.30.210 ring
>>> Address Status State Load Owns Token
>>>  141784319550391026443072753096570088105
>>> xx.xxx.23.40 Down Normal ? 16.67% 0
>>> xx.xxx.30.210 Up Normal 10.58 GB 16.67%
>>> 28356863910078205288614550619314017621
>>> xx.xx.90.87 Up Normal 10.47 GB 16.67%
>>> 56713727820156410577229101238628035242
>>> xx.xx.22.236 Up Normal 9.63 GB 16.67%
>>> 85070591730234615865843651857942052863
>>> xx.xx.97.96 Up Normal 10.68 GB 16.67%
>>> 113427455640312821154458202477256070484
>>> xx.xxx.17.122 Up Normal 10.18 GB 16.67%
>>> 141784319550391026443072753096570088105
>>> 
>>> --
>>> Ray Slakinski
>>> 
>>> 
>> 
>> 
> 
> 
> 
> -- 
> Sasha Dolgy
> sasha.do...@gmail.com


Re: CQL + Counters = bad request

2011-07-13 Thread Aaron Turner
I've tried using the Thrift/execute_cql_query() API as well, and it
doesn't work either.  I've also tried using a CF where the column
names are of AsciiType to see if that was the problem (quoted and
unquoted column names) and I get the exact same error of: no viable
alternative at character '+'

Frankly, I'm about ready to open a ticket against 0.8.1 saying
CQL/Counter support does not work at all.

Or is there a trick which isn't documented in the ticket? I tried
reading the Java code referred to in ticket #2473, but I'm in over my
head.

On Tue, Jul 12, 2011 at 6:46 PM, Aaron Turner  wrote:
> Doesn't seem to help:
>
> cqlsh> UPDATE RouterAggWeekly SET '1310367600' = '1310367600' + 17
> WHERE KEY = '1_20110728_ifoutmulticastpkts';
> Bad Request: line 1:55 no viable alternative at character '+'
>
> cqlsh> UPDATE RouterAggWeekly SET 1310367600 = '1310367600' + 17 WHERE
> KEY = '1_20110728_ifoutmulticastpkts';
> Bad Request: line 1:53 no viable alternative at character '+'
>
> cqlsh> UPDATE RouterAggWeekly SET '1310367600' = 1310367600 + 17 WHERE
> KEY = '1_20110728_ifoutmulticastpkts';
> Bad Request: line 1:53 no viable alternative at character '+'
>
> On Tue, Jul 12, 2011 at 5:35 PM, Jonathan Ellis  wrote:
>> Try quoting the column name.
>>
>> On Tue, Jul 12, 2011 at 5:30 PM, Aaron Turner  wrote:
>>> Using Cassandra 0.8.1 and cql 1.0.3 and following the syntax mentioned
>>> in https://issues.apache.org/jira/browse/CASSANDRA-2473
>>>
>>> cqlsh> UPDATE RouterAggWeekly SET 1310367600 = 1310367600 + 17 WHERE
>>> KEY = '1_20110728_ifoutmulticastpkts';
>>> Bad Request: line 1:51 no viable alternative at character '+'
>>>
>>> Column names are Long's, hence the INT = INT + INT
>>>
>>> Ideas?
>>>
>>> --
>>> Aaron Turner
>>> http://synfin.net/         Twitter: @synfinatic
>>> http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & 
>>> Windows
>>> Those who would give up essential Liberty, to purchase a little temporary
>>> Safety, deserve neither Liberty nor Safety.
>>>     -- Benjamin Franklin
>>> "carpe diem quam minimum credula postero"
>>>
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>>
>
>
>
> --
> Aaron Turner
> http://synfin.net/         Twitter: @synfinatic
> http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & 
> Windows
> Those who would give up essential Liberty, to purchase a little temporary
> Safety, deserve neither Liberty nor Safety.
>     -- Benjamin Franklin
> "carpe diem quam minimum credula postero"
>



-- 
Aaron Turner
http://synfin.net/         Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
    -- Benjamin Franklin
"carpe diem quam minimum credula postero"


Re: CQL + Counters = bad request

2011-07-13 Thread samal
>
> >>> cqlsh> UPDATE RouterAggWeekly SET 1310367600 = 1310367600 + 17 WHERE
> >>> KEY = '1_20110728_ifoutmulticastpkts';
> >>> Bad Request: line 1:51 no viable alternative at character '+'
>

I am able to insert it.
___

cqlsh>
cqlsh>  UPDATE counts SET 1310367600 = 1310367600 + 17 WHERE KEY =
'1_20110728_ifoutmulticastpkts';
cqlsh>  UPDATE counts SET 1310367600 = 1310367600 + 17 WHERE KEY =
'1_20110728_ifoutmulticastpkts';
cqlsh>
_
[default@test] list counts;
Using default limit of 100
---
RowKey: 1_20110728_ifoutmulticastpkts
=> (counter=12, value=16)
=> (counter=1310367600, value=34)
---
RowKey: 1
=> (counter=1, value=10)

2 Rows Returned.
[default@test]


Re: One node down but it thinks its fine...

2011-07-13 Thread Ray Slakinski
And fixed! A co-worker put in a bad host line entry last night that threw it
all off :( Thanks for the assist, guys.

-- 
Ray Slakinski


On Wednesday, July 13, 2011 at 1:32 PM, Ray Slakinski wrote:

> Was all working before, but we ran out of file handles and ended up 
> restarting the nodes. No yaml changes have occurred. 
> 
> Ray Slakinski
> 
> On 2011-07-13, at 12:55 PM, Sasha Dolgy  (mailto:sdo...@gmail.com)> wrote:
> 
> > any firewall changes? ping is fine ... but if you can't get from
> > node(a) to nodes(n) on the specific ports...
> > 
> > On Wed, Jul 13, 2011 at 6:47 PM, samal  > (mailto:sa...@wakya.in)> wrote:
> > > Check that the seed IP is the same on all nodes and is not a loopback IP
> > > in the cluster.
> > > 
> > > On Wed, Jul 13, 2011 at 8:40 PM, Ray Slakinski  > > (mailto:ray.slakin...@gmail.com)>
> > > wrote:
> > > > 
> > > > One of our nodes, which happens to be the seed, thinks it's up and all the
> > > > other nodes are down. However, all the other nodes think the seed is down
> > > > instead. The logs for the seed node show everything is running as it should
> > > > be. I've tried restarting the node, turning on/off gossip and thrift, and
> > > > nothing seems to get the node to see the rest of its ring as up and running.
> > > > I have also tried restarting one of the other nodes, which had no effect on
> > > > the situation. Below are the ring outputs for the seed and one other node in
> > > > the ring, plus a ping to show that the seed can ping the other node.
> > > > 
> > > > # bin/nodetool -h 0.0.0.0 ring
> > > > Address Status State Load Owns Token
> > > >  141784319550391026443072753096570088105
> > > > 127.0.0.1 Up Normal 4.61 GB 16.67% 0
> > > > xx.xxx.30.210 Down Normal ? 16.67% 
> > > > 28356863910078205288614550619314017621
> > > > xx.xx.90.87 Down Normal ? 16.67% 56713727820156410577229101238628035242
> > > > xx.xx.22.236 Down Normal ? 16.67% 85070591730234615865843651857942052863
> > > > xx.xx.97.96 Down Normal ? 16.67% 113427455640312821154458202477256070484
> > > > xx.xxx.17.122 Down Normal ? 16.67% 
> > > > 141784319550391026443072753096570088105
> > > > 
> > > > 
> > > > # ping xx.xxx.30.210
> > > > PING xx.xxx.30.210 (xx.xxx.30.210) 56(84) bytes of data.
> > > > 64 bytes from xx.xxx.30.210: icmp_req=1 ttl=61 time=0.299 ms
> > > > 64 bytes from xx.xxx.30.210: icmp_req=2 ttl=61 time=0.287 ms
> > > > ^C
> > > > --- xx.xxx.30.210 ping statistics ---
> > > > 2 packets transmitted, 2 received, 0% packet loss, time 999ms
> > > > rtt min/avg/max/mdev = 0.287/0.293/0.299/0.006 ms
> > > > 
> > > > 
> > > > # bin/nodetool -h xx.xxx.30.210 ring
> > > > Address Status State Load Owns Token
> > > >  141784319550391026443072753096570088105
> > > > xx.xxx.23.40 Down Normal ? 16.67% 0
> > > > xx.xxx.30.210 Up Normal 10.58 GB 16.67%
> > > > 28356863910078205288614550619314017621
> > > > xx.xx.90.87 Up Normal 10.47 GB 16.67%
> > > > 56713727820156410577229101238628035242
> > > > xx.xx.22.236 Up Normal 9.63 GB 16.67%
> > > > 85070591730234615865843651857942052863
> > > > xx.xx.97.96 Up Normal 10.68 GB 16.67%
> > > > 113427455640312821154458202477256070484
> > > > xx.xxx.17.122 Up Normal 10.18 GB 16.67%
> > > > 141784319550391026443072753096570088105
> > > > 
> > > > --
> > > > Ray Slakinski
> > 
> > 
> > 
> > -- 
> > Sasha Dolgy
> > sasha.do...@gmail.com (mailto:sasha.do...@gmail.com)




Re: Escaping characters in cqlsh

2011-07-13 Thread Jonathan Ellis
You can escape quotes but I don't think you can escape semicolons.
Can you create a ticket for us to fix this?
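
For the quote case, quotes can be escaped SQL-style by doubling them; a
quick sketch (row key here is made up):

cqlsh> SELECT * FROM feeds WHERE KEY = 'o''brien';

The semicolon inside a quoted literal is what trips up cqlsh's statement
splitter, hence the ticket.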

On Wed, Jul 13, 2011 at 10:16 AM, Blake Visin  wrote:
> I am trying to get all the columns named "fmd:" in cqlsh.
> I am using:
> select 'fmd:'..'fmd;' from feeds where;
> But I am getting errors (as expected).  Is there any way to escape the colon
> or semicolon in cqlsh?
> Thanks,
> Blake
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: BulkLoader

2011-07-13 Thread Sylvain Lebresne
I'll have to apologize on that one. Just saw that the JMX call I was
talking about doesn't work as it should.
I'll fix that for 0.8.2 but in the meantime you'll want to use
sstableloader on a different IP, as pointed out by Jonathan.
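
For reference, a minimal sstableloader run looks roughly like this (path is
made up; the directory is expected to be named after the keyspace and to
contain the sstables to stream):

bin/sstableloader /staging/MyKeyspace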

--
Sylvain

On Wed, Jul 13, 2011 at 5:11 PM, Sylvain Lebresne  wrote:
> Also note that if you have a cassandra node running on the local node
> from which you want to bulk load sstables, there is a JMX
> (StorageService->bulkLoad) call to do just that. May be simpler than
> using sstableloader if that is what you want to do.
>
> --
> Sylvain
>
> On Wed, Jul 13, 2011 at 3:46 PM, Stephen Pope  wrote:
>>  Ahhh..ok. Thanks.
>>
>> -Original Message-
>> From: Jonathan Ellis [mailto:jbel...@gmail.com]
>> Sent: Wednesday, July 13, 2011 11:35 AM
>> To: user@cassandra.apache.org
>> Subject: Re: BulkLoader
>>
>> Because it's hooking directly into gossip, so the local instance it's
>> ignoring is the bulkloader process, not Cassandra.
>>
You'd need to run the bulkloader from a different IP than Cassandra.
>>
>> On Wed, Jul 13, 2011 at 8:22 AM, Stephen Pope  wrote:
>>>  Fair enough. My original question stands then. :)
>>>
>>>  Why aren't you allowed to talk to a local installation using BulkLoader?
>>>
>>> -Original Message-
>>> From: Jonathan Ellis [mailto:jbel...@gmail.com]
>>> Sent: Wednesday, July 13, 2011 11:06 AM
>>> To: user@cassandra.apache.org
>>> Subject: Re: BulkLoader
>>>
>>> Sure, that will work fine with a single machine.  The advantage of
>>> bulkloader is it handles splitting the sstable up and sending each
>>> piece to the right place(s) when you have more than one.
>>>
>>> On Wed, Jul 13, 2011 at 7:47 AM, Stephen Pope  
>>> wrote:
  I think I've solved my own problem here. After generating the sstable 
 using json2sstable it looks like I can simply copy the created sstable 
 into my data directory.

  Can anyone think of any potential problems with doing it this way?

 -Original Message-
 From: Stephen Pope [mailto:stephen.p...@quest.com]
 Sent: Wednesday, July 13, 2011 9:32 AM
 To: user@cassandra.apache.org
 Subject: BulkLoader

  I'm trying to figure out how to use the BulkLoader, and it looks like 
 there's no way to run it against a local machine, because of this:

                Set<InetAddress> hosts = Gossiper.instance.getLiveMembers();
                hosts.remove(FBUtilities.getLocalAddress());
                if (hosts.isEmpty())
                    throw new IllegalStateException("Cannot load any 
 sstable, no live member found in the cluster");

  Is this intended behavior? May I ask why? We'd like to be able to run it 
 against the local machine.

  Cheers,
  Steve

>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of DataStax, the source for professional Cassandra support
>>> http://www.datastax.com
>>>
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>>
>


Re: Re: Re: Re: AntiEntropy?

2011-07-13 Thread Peter Schuller
> In the company I work for I suggested many times to run repair at least 1
> every 10 days (gcgraceseconds is set approx to 10 days in our config) -- but
> this book has been used against me :-) I will ask to run repair asap

Note that if GCGraceSeconds is 10 days, you want to run repair often
enough that there is never a moment where more than 10 days have passed
since the last successfully completed repair *STARTED*.

When scheduling repairs, factor in things like - what happens if
repair fails? Who gets alerted and how, and will there be time to fix
the problem? How long does repair take?

So basically, leave significant margin.
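
As an illustration only, a cron entry that starts repair comfortably inside
a 10-day GCGraceSeconds window might look like this (host name made up):

# run repair every 5 days at 02:00, leaving margin for failures and reruns
0 2 */5 * * /opt/cassandra/bin/nodetool -h cass-node1 repair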

-- 
/ Peter Schuller


Re: Off-heap Cache

2011-07-13 Thread Raj N
How do I ensure it is indeed using the SerializingCacheProvider?

Thanks
-Rajesh
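
One way to check, sketched here with made-up keyspace/CF names, is from
cassandra-cli: set the provider as Jonathan describes below, then inspect
the column family definition, which should list the row cache provider:

[default@MyKS] update column family MyCF with row_cache_provider='SerializingCacheProvider';
[default@MyKS] describe keyspace MyKS;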

On Tue, Jul 12, 2011 at 1:46 PM, Jonathan Ellis  wrote:

> You need to set row_cache_provider=SerializingCacheProvider on the
> columnfamily definition (via the cli)
>
> On Tue, Jul 12, 2011 at 9:57 AM, Raj N  wrote:
> > Do we need to do anything special to turn off-heap cache on?
> > https://issues.apache.org/jira/browse/CASSANDRA-1969
> > -Raj
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>


Re: commitlog replay missing data

2011-07-13 Thread Aaron Morton
Have you verified that data you expect to see is not in the server after 
shutdown?

WRT the difference between the Memtable data size and the SSTable live size, 
don't believe everything you read :)

Memtable live size is increased by the serialised byte size of every column 
inserted, and is never decremented. Deletes and overwrites will inflate this 
value. What was your workload like?

As of 0.8 we now have global memory management for cf's that tracks actual JVM 
bytes used by a CF. 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 12/07/2011, at 3:28 PM, Jeffrey Wang  wrote:

> Hey all,
> 
>  
> 
> Recently upgraded to 0.8.1 and noticed what seems to be missing data after a 
> commitlog replay on a single-node cluster. I start the node, insert a bunch 
> of stuff (~600MB), stop it, and restart it. There are log messages pertaining 
> to the commitlog replay and no errors, but some of the data is missing. If I 
> flush before stopping the node, everything is fine, and running cfstats in 
> the two cases shows different amounts of data in the SSTables. Moreover, the 
> amount of data that is missing is nondeterministic. Has anyone run into this? 
> Thanks.
> 
>  
> 
> Here is the output of a side-by-side diff between cfstats outputs for a 
> single CF before restarting (left) and after (right). Somehow a 37MB memtable 
> became a 2.9MB SSTable (note the difference in write count as well)?
> 
>  
> 
> Column Family: Blocks          before restart | after restart
> SSTable count:                              0 | 1
> Space used (live):                          0 | 2907637
> Space used (total):                         0 | 2907637
> Memtable Columns Count:                  8198 | 0
> Memtable Data Size:                  37550510 | 0
> Memtable Switch Count:                      0 | 1
> Read Count:                                 0 | 0
> Read Latency:                         NaN ms. | NaN ms.
> Write Count:                             8198 | 1526
> Write Latency:                      0.018 ms. | 0.011 ms.
> Pending Tasks:                              0 | 0
> Key cache capacity:                        20 | 20
> Key cache size:                             0 | 0
> Key cache hit rate:                       NaN | NaN
> Row cache:                           disabled | disabled
> Compacted row minimum size:                 0 | 1110
> Compacted row maximum size:                 0 | 2299
> Compacted row mean size:                    0 | 1960
> 
>  
> 
> Note that I patched https://issues.apache.org/jira/browse/CASSANDRA-2317 in 
> my version, but there are no deletions involved so I don’t think it’s 
> relevant unless I messed something up while patching.
> 
>  
> 
> -Jeffrey
> 


Re: Storing counters in the standard column families along with non-counter columns ?

2011-07-13 Thread Aaron Morton
If you can provide some more details on the use case we may be able to provide 
some data model help.

You can always use a dedicated CF for the counters, and use the same row key.

Cheers


-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 12/07/2011, at 6:36 AM, Aditya Narayan  wrote:

> Oops, that's really disheartening, and it could seriously impact our plans 
> for going live in the near future. Without this facility I guess counters 
> currently have very little usefulness.
> 
> On Mon, Jul 11, 2011 at 8:16 PM, Chris Burroughs  
> wrote:
> On 07/10/2011 01:09 PM, Aditya Narayan wrote:
> > Is there any target version in near future for which this has been promised
> > ?
> 
> The ticket is problematic in that it would -- unless someone has a
> clever new idea -- require breaking thrift compatibility to add it to
> the api.  Which is unfortunate since it would be so useful.
> 
> If it's in the 0.8.x series it will only be through CQL.
> 


Re: CQL + Counters = bad request

2011-07-13 Thread Aaron Turner
Thanks.  Looks like we tracked the problem down to the DataStax 0.8.1
rpm actually being 0.8.0.

rpm -qa | grep cassandra
apache-cassandra08-0.8.1-1

grep ' Cassandra version:' /var/log/cassandra/system.log | tail -1
INFO [main] 2011-07-13 12:04:31,039 StorageService.java (line 368)
Cassandra version: 0.8.0



On Wed, Jul 13, 2011 at 11:40 AM, samal  wrote:
>> >>> cqlsh> UPDATE RouterAggWeekly SET 1310367600 = 1310367600 + 17 WHERE
>> >>> KEY = '1_20110728_ifoutmulticastpkts';
>> >>> Bad Request: line 1:51 no viable alternative at character '+'
>
> I m able to insert it.
> ___
> cqlsh>
> cqlsh>  UPDATE counts SET 1310367600 = 1310367600 + 17 WHERE KEY =
> '1_20110728_ifoutmulticastpkts';
> cqlsh>  UPDATE counts SET 1310367600 = 1310367600 + 17 WHERE KEY =
> '1_20110728_ifoutmulticastpkts';
> cqlsh>
> _
> [default@test] list counts;
> Using default limit of 100
> ---
> RowKey: 1_20110728_ifoutmulticastpkts
> => (counter=12, value=16)
> => (counter=1310367600, value=34)
> ---
> RowKey: 1
> => (counter=1, value=10)
> 2 Rows Returned.
> [default@test]
>



-- 
Aaron Turner
http://synfin.net/         Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
    -- Benjamin Franklin
"carpe diem quam minimum credula postero"


Re: commitlog replay missing data

2011-07-13 Thread Peter Schuller
> Recently upgraded to 0.8.1 and noticed what seems to be missing data after a
> commitlog replay on a single-node cluster. I start the node, insert a bunch
> of stuff (~600MB), stop it, and restart it. There are log messages

If you stop with a kill, make sure you use batch commitlog sync mode
instead of periodic if you want guarantees on individual writes.

(I don't believe you'd expect a significant disk space discrepancy
though, since in practice the delay until write() should be small. But
don't quote me on this because I'd have to check the code to make sure
that commit log replay isn't dependent on some marker that isn't
written until commit log sync.)

-- 
/ Peter Schuller (@scode on twitter)


Re: commitlog replay missing data

2011-07-13 Thread mcasandra

Peter Schuller wrote:
> 
>> Recently upgraded to 0.8.1 and noticed what seems to be missing data
>> after a
>> commitlog replay on a single-node cluster. I start the node, insert a
>> bunch
>> of stuff (~600MB), stop it, and restart it. There are log messages
> 
> If you stop by a kill, make sure you use batched commitlog synch mode
> instead of periodic if you want guarantees on individual writes.
> 

What are the other ways to stop Cassandra?

What's the difference between batch vs periodic?



Re: commitlog replay missing data

2011-07-13 Thread Peter Schuller
> What are the other ways to stop Cassandra?

nodetool disablegossip
nodetool disablethrift
# wait for a bit until no one is sending it writes anymore
nodetool flush # only relevant if in periodic mode
# then kill it

> What's the difference between batch vs periodic?

Search for "batch" on http://wiki.apache.org/cassandra/StorageConfiguration

-- 
/ Peter Schuller (@scode on twitter)


Re: commitlog replay missing data

2011-07-13 Thread Peter Schuller
> # wait for a bit until no one is sending it writes anymore

More accurately, until all other nodes have realized it's down
(nodetool ring on each respective host).

-- 
/ Peter Schuller (@scode on twitter)


R: Re: Re: Re: Re: AntiEntropy?

2011-07-13 Thread cbert...@libero.it
>Note that if GCGraceSeconds is 10 days, you want to run repair often
>enough that there will never be a moment where there is more than
>exactly 10 days since the last successfully completed repair
>*STARTED*.

>When scheduling repairs, factor in things like - what happens if
>repair fails? Who gets alerted and how, and will there be time to fix
>the problem? How long does repair take?

Peter, thanks for the tip. I'm still very surprised by what I've read in the 
book about repair.
Best Regards

Carlo


Replicating to all nodes

2011-07-13 Thread Kyle Gibson
I am wondering if the following cluster figuration is possible with
cassandra, and if so, how it could be achieved. Please also feel free
to point out any issues that may make this configuration undesired
that I may not have thought of.

Suppose a cluster of N nodes.

Each node replicates the data of all other nodes.

Read and write operations should succeed even if only 1 node is online.

When a read is performed, it is performed against all active nodes.

When a write is performed, it is performed against all active nodes,
inactive/offline nodes are updated when they come back online.

Would this involve a new ConsistencyLevel, e.g.
ConsistencyLevel.Active? Does a facility exist which could mimic this
behavior?

I don't believe it does. Currently the replication factor is hard
coded based on key space, not a function of the number of nodes in the
cluster. You could say, if N = 7, configure replication factor = 7,
but then if only 6 nodes are online, writes would fail. Is this
correct?


JDBC CQL Driver unable to locate cassandra.yaml

2011-07-13 Thread Derek Tracy
I am trying to integrate the Cassandra JDBC CQL driver with my company's ETL
product.
We have an interface that performs database queries using their respective
JDBC drivers.
When I try to use the Cassandra CQL JDBC driver I keep getting a stacktrace:

Unable to locate cassandra.yaml

I am using Cassandra 0.8.1.  Is there a guide on how to utilize/setup the
JDBC driver?



Derek Tracy
trac...@gmail.com
-


Re: JDBC CQL Driver unable to locate cassandra.yaml

2011-07-13 Thread Jonathan Ellis
The current version of the driver does require having the server's
cassandra.yaml on the classpath.  This is a bug.
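
Until that is fixed, a workaround sketch is to put the directory holding the
server's cassandra.yaml on the client's classpath (paths and main class are
made up):

java -cp etl-app.jar:/etc/cassandra/conf com.example.EtlMain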

On Wed, Jul 13, 2011 at 3:13 PM, Derek Tracy  wrote:
> I am trying to integrate the Cassandra JDBC CQL driver with my companies ETL
> product.
> We have an interface that performs database queries using there respective
> JDBC drivers.
> When I try to use the Cassandra CQL JDBC driver I keep getting a stacktrace:
>
> Unable to locate cassandra.yaml
>
> I am using Cassandra 0.8.1.  Is there a guide on how to utilize/setup the
> JDBC driver?
>
>
>
> Derek Tracy
> trac...@gmail.com
> -
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Replicating to all nodes

2011-07-13 Thread Peter Schuller
> Read and write operations should succeed even if only 1 node is online.
>
> When a read is performed, it is performed against all active nodes.

Using QUORUM is the closest thing you get for reads without modifying
Cassandra. You can't make it wait for all nodes that happen to be up.

> When a write is performed, it is performed against all active nodes,
> inactive/offline nodes are updated when they come back online.

Writes always go to all nodes that are up, but if you want to wait for
them before returning "OK" to the client then no - except CL.ALL
(which means you don't survive one being down) and CL.QUORUM (which
means you don't wait for all if all are up).

> I don't believe it does. Currently the replication factor is hard
> coded based on key space, not a function of the number of nodes in the
> cluster. You could say, if N = 7, configure replication factor = 7,
> but then if only 6 nodes are online, writes would fail. Is this
> correct?

No. Reads/write fail according to the consistency level. The RF +
consistency level tells how many nodes must be up and successfully
service the request in order for the operation to succeed. RF just
tells you the total number of nodes in the replica set for a key;
whether an operation fails is up to the consistency level.
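
For reference, RF is fixed per keyspace at creation time; from cassandra-cli
in 0.8 that looks roughly like this (keyspace name made up):

[default@unknown] create keyspace MyKS
    with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
    and strategy_options = [{replication_factor:3}];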

I would ask: Why are you trying to do this? It really seems you're
trying to do the "wrong" thing. Why would you ever want to replicate
to all? If you want 3 copies in total, then do RF=3 and keep a 3 node
ring. If you need more capacity, you add nodes and retain RF. If you
need more redundancy, you have to increase RF. Those are two very
different axes along which to scale. I cannot think of any reason why
you would want to tie RF to the total number of nodes.

What is the goal you're trying to achieve?

-- 
/ Peter Schuller (@scode on twitter)


Question about compaction

2011-07-13 Thread Sameer Farooqui
Running Cassandra 0.8.1. Ran major compaction via:

sudo /home/ubuntu/brisk/resources/cassandra/bin/nodetool -h localhost
compact &

From what I'd read about Cassandra, I thought that after compaction all of
the different SSTables on disk for a Column Family would be merged into one
new file.

However, there are now a bunch of 0-sized Compacted files and a bunch of
Data files. Any ideas about why there are still so many files left?

Also, is a minor compaction the same thing as a read-only compaction in 0.7?


ubuntu@domU-12-31-39-0E-x-x:/raiddrive/data/DemoKS$ ls -l
total 270527136
-rw-r--r-- 1 root root0 2011-07-13 03:07 DemoCF-g-5670-Compacted
-rw-r--r-- 1 root root  89457447799 2011-07-10 00:26 DemoCF-g-5670-Data.db
-rw-r--r-- 1 root root   193456 2011-07-10 00:26 DemoCF-g-5670-Filter.db
-rw-r--r-- 1 root root  2081159 2011-07-10 00:26 DemoCF-g-5670-Index.db
-rw-r--r-- 1 root root 4276 2011-07-10 00:26
DemoCF-g-5670-Statistics.db
-rw-r--r-- 1 root root0 2011-07-13 03:07 DemoCF-g-5686-Compacted
-rw-r--r-- 1 root root920521489 2011-07-09 22:03 DemoCF-g-5686-Data.db
-rw-r--r-- 1 root root11776 2011-07-09 22:03 DemoCF-g-5686-Filter.db
-rw-r--r-- 1 root root   126725 2011-07-09 22:03 DemoCF-g-5686-Index.db
-rw-r--r-- 1 root root 4276 2011-07-09 22:03
DemoCF-g-5686-Statistics.db
-rw-r--r-- 1 root root0 2011-07-13 03:07 DemoCF-g-5781-Compacted
-rw-r--r-- 1 root root223970446 2011-07-09 22:38 DemoCF-g-5781-Data.db
-rw-r--r-- 1 root root 7216 2011-07-09 22:38 DemoCF-g-5781-Filter.db
-rw-r--r-- 1 root root32750 2011-07-09 22:38 DemoCF-g-5781-Index.db
-rw-r--r-- 1 root root 4276 2011-07-09 22:38
DemoCF-g-5781-Statistics.db
-rw-r--r-- 1 root root0 2011-07-13 03:07 DemoCF-g-5874-Compacted
-rw-r--r-- 1 root root156284248 2011-07-09 23:20 DemoCF-g-5874-Data.db
-rw-r--r-- 1 root root 5056 2011-07-09 23:20 DemoCF-g-5874-Filter.db
-rw-r--r-- 1 root root10400 2011-07-09 23:20 DemoCF-g-5874-Index.db
-rw-r--r-- 1 root root 4276 2011-07-09 23:20
DemoCF-g-5874-Statistics.db
-rw-r--r-- 1 root root0 2011-07-13 03:07 DemoCF-g-6938-Compacted
-rw-r--r-- 1 root root  22947541446 2011-07-10 11:43 DemoCF-g-6938-Data.db
-rw-r--r-- 1 root root49936 2011-07-10 11:43 DemoCF-g-6938-Filter.db
-rw-r--r-- 1 root root   563550 2011-07-10 11:43 DemoCF-g-6938-Index.db
-rw-r--r-- 1 root root 4276 2011-07-10 11:43
DemoCF-g-6938-Statistics.db
-rw-r--r-- 1 root root0 2011-07-13 03:07 DemoCF-g-6996-Compacted
-rw-r--r-- 1 root root224253930 2011-07-10 11:28 DemoCF-g-6996-Data.db
-rw-r--r-- 1 root root 7216 2011-07-10 11:27 DemoCF-g-6996-Filter.db
-rw-r--r-- 1 root root26250 2011-07-10 11:28 DemoCF-g-6996-Index.db
-rw-r--r-- 1 root root 4276 2011-07-10 11:28
DemoCF-g-6996-Statistics.db
-rw-r--r-- 1 root root0 2011-07-13 03:07 DemoCF-g-8324-Compacted


Re: How to remove/add node

2011-07-13 Thread Sameer Farooqui
As long as you have no data in this cluster, try clearing out the
/var/lib/cassandra directory from all nodes and restart Cassandra.

The only way to change tokens after they've been set is using a nodetool
move <new_token> or clearing /var/lib/cassandra.
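
For example (host and token are made up; the node then streams data to match
its new range):

bin/nodetool -h 10.0.0.1 move 85070591730234615865843651857942052863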


On Wed, Jul 13, 2011 at 7:41 AM, Abdul Haq Shaik <
abdulsk.cassan...@gmail.com> wrote:

> Hi,
>
> I have deleted the data, commitlog and saved cache directories. I have
> removed one of the nodes from the seeds of cassandra.yaml. When I tried to
> use nodetool, it's showing the removed node as up.
>
> Thanks,
>
> Abdul
>


Re: Replicating to all nodes

2011-07-13 Thread Kyle Gibson
Thanks for the reply Peter.

The goal is to configure a cluster in which reads and writes can
complete successfully even if only 1 node is online. For this to work,
each node would need the entire dataset. Your example of a 3 node ring
with RF=3 would satisfy this requirement. However, if two nodes are
offline, CL.QUORUM would not work, I would need to use CL.ONE. If all
3 nodes are online, CL.ONE is undershooting, I would want to use
CL.QUORUM (or maybe CL.ALL). Or does CL.ONE actually function this
way, somewhat?

A complication occurs when you want to add another node. Now there's a
4 node ring, but only 3 replicas, so each node isn't guaranteed to
have all of the data, so the cluster can't completely function when
N-1 nodes are offline. So this is why I would like to have the RF
scale relative to the size of the cluster. Am I mistaken?

Thanks!

On Wed, Jul 13, 2011 at 6:41 PM, Peter Schuller
 wrote:
>> Read and write operations should succeed even if only 1 node is online.
>>
>> When a read is performed, it is performed against all active nodes.
>
> Using QUORUM is the closest thing you get for reads without modifying
> Cassandra. You can't make it wait for all nodes that happen to be up.
>
>> When a write is performed, it is performed against all active nodes,
>> inactive/offline nodes are updated when they come back online.
>
> Writes always go to all nodes that are up, but if you want to wait for
> them before returning "OK" to the client than no - except CL.ALL
> (which means you don't survive one being down) and CL.QUORUM (which
> means you don't wait for all if all are up).
>
>> I don't believe it does. Currently the replication factor is hard
>> coded based on key space, not a function of the number of nodes in the
>> cluster. You could say, if N = 7, configure replication factor = 7,
>> but then if only 6 nodes are online, writes would fail. Is this
>> correct?
>
> No. Reads/write fail according to the consistency level. The RF +
> consistency level tells how many nodes must be up and successfully
> service the request in order for the operation to succeed. RF just
> tells you the number of total nodes int he replicate set for a key;
> whether an operation fails is up to the consistency level.
>
> I would ask: Why are you trying to do this? It really seems you're
> trying to do the "wrong" thing. Why would you ever want to replicate
> to all? If you want 3 copies in total, then do RF=3 and keep a 3 node
> ring. If you need more capacity, you add nodes and retain RF. If you
> need more redundancy, you have to increase RF. Those are two very
> different axis along which to scale. I cannot think of any reason why
> you would want to tie RF to the total number of nodes.
>
> What is the goal you're trying to achieve?
>
> --
> / Peter Schuller (@scode on twitter)
>


cassandra goes infinite loop and data lost.....

2011-07-13 Thread Yan Chunlu
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
collecting 0 of 2147483647: 100zs:false:14@1310168625866434


Re: cassandra goes infinite loop and data lost.....

2011-07-13 Thread Yan Chunlu
I gave cassandra 8GB heap size and somehow it ran out of memory and crashed.
After I started it, it just runs into the following infinite loop, the last
line:
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
collecting 0 of 2147483647: 100zs:false:14@1310168625866434

goes on forever.

I have 3 nodes and RF=2, so I am losing data. Does that mean I am screwed and
can't get it back?

 DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123)
collecting 20 of 2147483647: q74k:false:14@1308886095008943
DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123)
collecting 0 of 2147483647: 10fbu:false:1@1310223075340297
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
collecting 0 of 2147483647: apbg:false:13@1305641597957086
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
collecting 1 of 2147483647: auje:false:13@1305641597957075
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
collecting 2 of 2147483647: ayj8:false:13@1305641597957060
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
collecting 3 of 2147483647: b4fz:false:13@1305641597957096
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
collecting 0 of 2147483647: 100zs:false:14@1310168625866434
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
collecting 1 of 2147483647: 1017f:false:14@1310168680375612
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
collecting 2 of 2147483647: 1018e:false:14@1310168759614715
DEBUG [main] 2011-07-13 22:19:00,587 SliceQueryFilter.java (line 123)
collecting 3 of 2147483647: 101dd:false:14@1310169260225339


On Thu, Jul 14, 2011 at 11:27 AM, Yan Chunlu  wrote:

> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 100zs:false:14@1310168625866434




-- 
闫春路


Re: cassandra goes infinite loop and data lost.....

2011-07-13 Thread Bret Palsson
How much total memory does your machine have?

-- 
Bret


On Wednesday, July 13, 2011 at 9:27 PM, Yan Chunlu wrote:

> I gave cassandra 8GB heap size and somehow it run out of memory and crashed. 
> after I start it, it just runs in to the following infinite loop, the last 
> line:
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) 
> collecting 0 of 2147483647: 100zs:false:14@1310168625866434
> 
> goes for ever 
> 
> I have 3 nodes and RF=2, so I am losing data. is that means I am screwed and 
> can't get it back? 
> 
> DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) 
> collecting 20 of 2147483647: q74k:false:14@1308886095008943 
> DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) 
> collecting 0 of 2147483647: 10fbu:false:1@1310223075340297
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) 
> collecting 0 of 2147483647: apbg:false:13@1305641597957086
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) 
> collecting 1 of 2147483647: auje:false:13@1305641597957075
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) 
> collecting 2 of 2147483647: ayj8:false:13@1305641597957060
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) 
> collecting 3 of 2147483647: b4fz:false:13@1305641597957096
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) 
> collecting 0 of 2147483647: 100zs:false:14@1310168625866434
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) 
> collecting 1 of 2147483647: 1017f:false:14@1310168680375612
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) 
> collecting 2 of 2147483647: 1018e:false:14@1310168759614715
> DEBUG [main] 2011-07-13 22:19:00,587 SliceQueryFilter.java (line 123) 
> collecting 3 of 2147483647: 101dd:false:14@1310169260225339
> 
> 
> On Thu, Jul 14, 2011 at 11:27 AM, Yan Chunlu  (mailto:springri...@gmail.com)> wrote:
> >  DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) 
> > collecting 0 of 2147483647 (tel:2147483647): 
> > 100zs:false:14@1310168625866434 
> 
> 
> -- 
> 闫春路



Re: cassandra goes infinite loop and data lost.....

2011-07-13 Thread Yan Chunlu
16GB

On Thu, Jul 14, 2011 at 11:29 AM, Bret Palsson  wrote:

>  How much total memory does your machine have?
>
> --
> Bret
>
> On Wednesday, July 13, 2011 at 9:27 PM, Yan Chunlu wrote:
>
> I gave cassandra 8GB heap size and somehow it run out of memory and
> crashed. after I start it, it just runs in to the following infinite loop,
> the last line:
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 100zs:false:14@1310168625866434
>
> goes for ever
>
> I have 3 nodes and RF=2, so I am losing data. is that means I am screwed
> and can't get it back?
>
>  DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123)
> collecting 20 of 2147483647: q74k:false:14@1308886095008943
> DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 10fbu:false:1@1310223075340297
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: apbg:false:13@1305641597957086
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 1 of 2147483647: auje:false:13@1305641597957075
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 2 of 2147483647: ayj8:false:13@1305641597957060
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 3 of 2147483647: b4fz:false:13@1305641597957096
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 100zs:false:14@1310168625866434
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 1 of 2147483647: 1017f:false:14@1310168680375612
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 2 of 2147483647: 1018e:false:14@1310168759614715
> DEBUG [main] 2011-07-13 22:19:00,587 SliceQueryFilter.java (line 123)
> collecting 3 of 2147483647: 101dd:false:14@1310169260225339
>
>
> On Thu, Jul 14, 2011 at 11:27 AM, Yan Chunlu wrote:
>
>  DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 100zs:false:14@1310168625866434
>
>
>
>
> --
> 闫春路
>
>
>


-- 
Charles


Re: cassandra goes infinite loop and data lost.....

2011-07-13 Thread Yan Chunlu
The problem is I can't bring cassandra back up. Is that because there is not
enough memory for cassandra?

On Thu, Jul 14, 2011 at 11:29 AM, Bret Palsson  wrote:

> How much total memory does your machine have?
>
> --
> Bret
>
> On Wednesday, July 13, 2011 at 9:27 PM, Yan Chunlu wrote:
>
> I gave cassandra 8GB heap size and somehow it run out of memory and
> crashed. after I start it, it just runs in to the following infinite loop,
> the last line:
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 100zs:false:14@1310168625866434
>
> goes for ever
>
> I have 3 nodes and RF=2, so I am losing data. is that means I am screwed
> and can't get it back?
>
>  DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123)
> collecting 20 of 2147483647: q74k:false:14@1308886095008943
> DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 10fbu:false:1@1310223075340297
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: apbg:false:13@1305641597957086
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 1 of 2147483647: auje:false:13@1305641597957075
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 2 of 2147483647: ayj8:false:13@1305641597957060
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 3 of 2147483647: b4fz:false:13@1305641597957096
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 100zs:false:14@1310168625866434
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 1 of 2147483647: 1017f:false:14@1310168680375612
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 2 of 2147483647: 1018e:false:14@1310168759614715
> DEBUG [main] 2011-07-13 22:19:00,587 SliceQueryFilter.java (line 123)
> collecting 3 of 2147483647: 101dd:false:14@1310169260225339
>
>
> On Thu, Jul 14, 2011 at 11:27 AM, Yan Chunlu wrote:
>
>  DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 100zs:false:14@1310168625866434
>
>
>
>
> --
> 闫春路
>
>
>


-- 
闫春路


Re: Replicating to all nodes

2011-07-13 Thread Maki Watanabe
Consistency and availability trade off against each other.
If you use RF=7 + CL=ONE, your reads/writes will succeed as long as one
node is alive, while the data is replicated to 7 nodes.
Of course you will have a chance of reading old data in this case.
If you need strong consistency, you must use CL=QUORUM.
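
As a quick worked example of why QUORUM stays consistent (Cassandra computes
a quorum as RF/2 + 1 with integer division):

RF=7
Q=$(( RF / 2 + 1 ))     # Q = 4
echo $(( Q + Q ))       # 8 > RF=7: read and write quorums must overlap,
                        # so every quorum read sees the latest quorum write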

maki


2011/7/14 Kyle Gibson :
> Thanks for the reply Peter.
>
> The goal is to configure a cluster in which reads and writes can
> complete successfully even if only 1 node is online. For this to work,
> each node would need the entire dataset. Your example of a 3 node ring
> with RF=3 would satisfy this requirement. However, if two nodes are
> offline, CL.QUORUM would not work, I would need to use CL.ONE. If all
> 3 nodes are online, CL.ONE is undershooting, I would want to use
> CL.QUORUM (or maybe CL.ALL). Or does CL.ONE actually function this
> way, somewhat?
>
> A complication occurs when you want to add another node. Now there's a
> 4 node ring, but only 3 replicas, so each node isn't guaranteed to
> have all of the data, so the cluster can't completely function when
> N-1 nodes are offline. So this is why I would like to have the RF
> scale relative to the size of the cluster. Am I mistaken?
>
> Thanks!
>
> On Wed, Jul 13, 2011 at 6:41 PM, Peter Schuller
>  wrote:
>>> Read and write operations should succeed even if only 1 node is online.
>>>
>>> When a read is performed, it is performed against all active nodes.
>>
>> Using QUORUM is the closest thing you get for reads without modifying
>> Cassandra. You can't make it wait for all nodes that happen to be up.
>>
>>> When a write is performed, it is performed against all active nodes,
>>> inactive/offline nodes are updated when they come back online.
>>
>> Writes always go to all nodes that are up, but if you want to wait for
>> them before returning "OK" to the client than no - except CL.ALL
>> (which means you don't survive one being down) and CL.QUORUM (which
>> means you don't wait for all if all are up).
>>
>>> I don't believe it does. Currently the replication factor is hard
>>> coded based on key space, not a function of the number of nodes in the
>>> cluster. You could say, if N = 7, configure replication factor = 7,
>>> but then if only 6 nodes are online, writes would fail. Is this
>>> correct?
>>
>> No. Reads/write fail according to the consistency level. The RF +
>> consistency level tells how many nodes must be up and successfully
>> service the request in order for the operation to succeed. RF just
>> tells you the number of total nodes int he replicate set for a key;
>> whether an operation fails is up to the consistency level.
>>
>> I would ask: Why are you trying to do this? It really seems you're
>> trying to do the "wrong" thing. Why would you ever want to replicate
>> to all? If you want 3 copies in total, then do RF=3 and keep a 3 node
>> ring. If you need more capacity, you add nodes and retain RF. If you
>> need more redundancy, you have to increase RF. Those are two very
>> different axis along which to scale. I cannot think of any reason why
>> you would want to tie RF to the total number of nodes.
>>
>> What is the goal you're trying to achieve?
>>
>> --
>> / Peter Schuller (@scode on twitter)
>>
>



-- 
w3m


Re: Survey: Cassandra/JVM Resident Set Size increase

2011-07-13 Thread Zhu Han
On Wed, Jul 13, 2011 at 9:45 PM, Konstantin Naryshkin
wrote:

> Do you mean that it is using all of the available heap? That is the
> expected behavior of most long running Java applications. The JVM will not
> GC until it needs memory (or you explicitly ask it to) and will only free up
> a bit of memory at a time. That is very good behavior from a performance
> standpoint, since frequent, large GCs would make your application very
> unresponsive. It also makes Java applications take up all the memory you
> give them.
>
> - Original Message -
> From: "Sasha Dolgy" 
> To: user@cassandra.apache.org
> Sent: Tuesday, July 12, 2011 10:23:02 PM
> Subject: Re: Survey: Cassandra/JVM Resident Set Size increase
>
> I'll post more tomorrow ... However, we set up one node in a single node
> cluster and have left it with no data ... reviewing memory consumption
> graphs ... it increased daily until it gobbled (highly technical term) all
> memory ... the system is now running just below 100% memory usage ... which I
> find peculiar seeing that it is doing nothing ... with no data and no peers.
> On Jul 12, 2011 3:29 PM, "Chris Burroughs" 
> wrote:
> > ### Preamble
> >
> > There have been several reports on the mailing list of the JVM running
> > Cassandra using "too much" memory. That is, the resident set size is
> > >> (max java heap size + mmapped segments) and continues to grow until the
> > process swaps, kernel oom killer comes along, or performance just
> > degrades too far due to the lack of space for the page cache. It has
> > been unclear from these reports if there is a pattern. My hope here is
> > that by comparing JVM versions, OS versions, JVM configuration etc., we
> > will find something. Thank you everyone for your time.
> >
> >
> > Some example reports:
> > - http://www.mail-archive.com/user@cassandra.apache.org/msg09279.html
> > -
> >
>
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Very-high-memory-utilization-not-caused-by-mmap-on-sstables-td5840777.html
> > - https://issues.apache.org/jira/browse/CASSANDRA-2868
> > -
> >
>
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/OOM-or-what-settings-to-use-on-AWS-large-td6504060.html
> > -
> >
>
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-memory-problem-td6545642.html
> >
> > For reference theories include (in no particular order):
> > - memory fragmentation
> > - JVM bug
> > - OS/glibc bug
> > - direct memory
> > - swap induced fragmentation
> > - some other bad interaction of cassandra/jdk/jvm/os/nio-insanity.
> >
> > ### Survey
> >
> > 1. Do you think you are experiencing this problem?
>

Yes.


> >
> > 2. Why? (This is a good time to share a graph like
> > http://www.twitpic.com/5fdabn or
> > http://img24.imageshack.us/img24/1754/cassandrarss.png)
>

I observe the RSS of the cassandra process keeps going up to dozens of
gigabytes, even if the dataset (sstables) is just hundreds of megabytes.

> >
> > 2. Are you using mmap? (If yes be sure to have read
> > http://wiki.apache.org/cassandra/FAQ#mmap , and explain how you have
> > used pmap [or another tool] to rule you mmap and top decieving you.)
>

Yes. pmap tells me a lot of anonymous regions are created and expanded
during the life cycle of the cassandra process. That is the primary reason
for the RSS growth. I'm pretty sure these anonymous regions are not the Java
heap used by the JVM, as they are not contiguous.
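
For anyone wanting to reproduce that check, a sketch (assumes the process
can be found by its main class name):

# largest mappings by RSS (third column of pmap -x output)
pmap -x $(pgrep -f CassandraDaemon) | sort -n -k3 | tail -20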

>
> > 3. Are you using JNA? Was mlockall succesful (it's in the logs on
> startup)?
>

Yes. mlockall is successful too. I have not tried other settings.


> >
> > 4. Is swap enabled? Are you swapping?
>

No. Swap is disabled.


> >
> > 5. What version of Apache Cassandra are you using?
>

0.6.13


> >
> > 6. What is the earliest version of Apache Cassandra you recall seeing
> > this problem with?
>

Earlier version of 0.6.x branch.


> >
> > 7. Have you tried the patch from CASSANDRA-2654 ?
>

Not yet, as I do not query large datasets.


> >
> > 8. What jvm and version are you using?
>

java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)

I also tried openJDK.


>
> > 9. What OS and version are you using?
>

 The kernel version is "2.6.18-194.26.1.el5.028stab079.2", which is from
CentOS 5.4

The user level environment is Ubuntu 10.04 (Lucid) server edition. This
strange combination is because cassandra runs inside an OpenVZ container
(Ubuntu 10.04) on top of a CentOS host.

I am afraid the old kernel causes the memory fragmentation of the cassandra
process, but I cannot prove it, as I have not tried it on a recent kernel.

>
> > 10. What are your jvm flags?
>

Both CMS and parallel old GC can observe the problem. These are the flags
used:

-ea -Xms3G -Xmx3G -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFractio

Re: cassandra goes infinite loop and data lost.....

2011-07-13 Thread Jonathan Ellis
That says "I'm collecting data to answer requests."

I don't see anything here that indicates an infinite loop.

I do see that it's saying "N of 2147483647" which looks like you're
doing slices with a much larger limit than is advisable (good way to
OOM the way you already did).

On Wed, Jul 13, 2011 at 8:27 PM, Yan Chunlu  wrote:
> I gave cassandra 8GB heap size and somehow it run out of memory and crashed.
> after I start it, it just runs in to the following infinite loop, the last
> line:
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 100zs:false:14@1310168625866434
> goes for ever
> I have 3 nodes and RF=2, so I am losing data. is that means I am screwed and
> can't get it back?
> DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123)
> collecting 20 of 2147483647: q74k:false:14@1308886095008943
> DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 10fbu:false:1@1310223075340297
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: apbg:false:13@1305641597957086
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 1 of 2147483647: auje:false:13@1305641597957075
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 2 of 2147483647: ayj8:false:13@1305641597957060
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 3 of 2147483647: b4fz:false:13@1305641597957096
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 100zs:false:14@1310168625866434
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 1 of 2147483647: 1017f:false:14@1310168680375612
> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> collecting 2 of 2147483647: 1018e:false:14@1310168759614715
> DEBUG [main] 2011-07-13 22:19:00,587 SliceQueryFilter.java (line 123)
> collecting 3 of 2147483647: 101dd:false:14@1310169260225339
>
> On Thu, Jul 14, 2011 at 11:27 AM, Yan Chunlu  wrote:
>>
>> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
>> collecting 0 of 2147483647: 100zs:false:14@1310168625866434
>
>
> --
> 闫春路
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


RE: JDBC CQL Driver unable to locate cassandra.yaml

2011-07-13 Thread Vivek Mishra
Setting server.config -> $SERVER_PATH/cassandra.yaml as a system property
should resolve this?
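
A sketch of that approach (path and main class made up); note that the
system property DatabaseDescriptor reads in 0.8 is cassandra.config, so
verify the exact property name against your driver version:

java -Dcassandra.config=file:///etc/cassandra/conf/cassandra.yaml -cp etl-app.jar com.example.EtlMain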

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Thursday, July 14, 2011 3:53 AM
To: user@cassandra.apache.org
Subject: Re: JDBC CQL Driver unable to locate cassandra.yaml

The current version of the driver does require having the server's 
cassandra.yaml on the classpath.  This is a bug.

On Wed, Jul 13, 2011 at 3:13 PM, Derek Tracy  wrote:
> I am trying to integrate the Cassandra JDBC CQL driver with my
> companies ETL product.
> We have an interface that performs database queries using there
> respective JDBC drivers.
> When I try to use the Cassandra CQL JDBC driver I keep getting a stacktrace:
>
> Unable to locate cassandra.yaml
>
> I am using Cassandra 0.8.1.  Is there a guide on how to utilize/setup
> the JDBC driver?
>
>
>
> Derek Tracy
> trac...@gmail.com
> -
>
>



--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support 
http://www.datastax.com





Re: cassandra goes infinite loop and data lost.....

2011-07-13 Thread Yan Chunlu
Okay, I am not sure if it is an infinite loop; I changed log4j to "DEBUG" only
because cassandra never came online after starting, it seemed to just halt.
Once I enabled debug it started showing those messages very fast and never
ended.

I have just run nodetool cleanup, and it started reading the commitlog; it
seems normal now.

Thanks for the help. I am really a newbie with cassandra and have no idea how
slices work, could you give me more information? Thanks a lot!

On Thu, Jul 14, 2011 at 1:36 PM, Jonathan Ellis  wrote:

> That says "I'm collecting data to answer requests."
>
> I don't see anything here that indicates an infinite loop.
>
> I do see that it's saying "N of 2147483647" which looks like you're
> doing slices with a much larger limit than is advisable (good way to
> OOM the way you already did).
>
> On Wed, Jul 13, 2011 at 8:27 PM, Yan Chunlu  wrote:
> > I gave cassandra 8GB heap size and somehow it run out of memory and
> crashed.
> > after I start it, it just runs in to the following infinite loop, the
> last
> > line:
> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> > collecting 0 of 2147483647: 100zs:false:14@1310168625866434
> > goes for ever
> > I have 3 nodes and RF=2, so I am losing data. is that means I am screwed
> and
> > can't get it back?
> > DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123)
> > collecting 20 of 2147483647: q74k:false:14@1308886095008943
> > DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123)
> > collecting 0 of 2147483647: 10fbu:false:1@1310223075340297
> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> > collecting 0 of 2147483647: apbg:false:13@1305641597957086
> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> > collecting 1 of 2147483647: auje:false:13@1305641597957075
> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> > collecting 2 of 2147483647: ayj8:false:13@1305641597957060
> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> > collecting 3 of 2147483647: b4fz:false:13@1305641597957096
> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> > collecting 0 of 2147483647: 100zs:false:14@1310168625866434
> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> > collecting 1 of 2147483647: 1017f:false:14@1310168680375612
> > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> > collecting 2 of 2147483647: 1018e:false:14@1310168759614715
> > DEBUG [main] 2011-07-13 22:19:00,587 SliceQueryFilter.java (line 123)
> > collecting 3 of 2147483647: 101dd:false:14@1310169260225339
> >
> > On Thu, Jul 14, 2011 at 11:27 AM, Yan Chunlu 
> wrote:
> >>
> >> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
> >> collecting 0 of 2147483647: 100zs:false:14@1310168625866434
> >
> >
> > --
> > 闫春路
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>



-- 
闫春路


Re: How to remove/add node

2011-07-13 Thread Abdul Haq Shaik
Thanks a lot dear. I will try it out and will let you know if the problem
persists.

On Thu, Jul 14, 2011 at 5:52 AM, Sameer Farooqui wrote:

> As long as you have no data in this cluster, try clearing out the
> /var/lib/cassandra directory from all nodes and restart Cassandra.
>
> The only way to change tokens after they've been set is using a nodetool
> move  or clearing /var/lib/cassandra.
>
>
>
> On Wed, Jul 13, 2011 at 7:41 AM, Abdul Haq Shaik <
> abdulsk.cassan...@gmail.com> wrote:
>
>> Hi,
>>
>> I have deleted the data, commitlog and saved cache directories. I have
>> removed one of the nodes from the seeds of cassandra.yaml. When I tried to
>> use nodetool, it's showing the removed node as up.
>>
>> Thanks,
>>
>> Abdul
>>
>
>