DataStax Brisk

2011-06-30 Thread Sasha Dolgy
How far behind the Cassandra release cycle is Brisk?  If Cassandra 0.8.1
was released yesterday, when (if it hasn't already) will the Brisk
distribution pick up 0.8.1?

-sd

-- 
Sasha Dolgy
sasha.do...@gmail.com


word_count example in Cassandra 0.8.0

2011-06-30 Thread Markus Mock
Hello,

I am running into the following problem:  I am running a single node
cassandra setup (out of the box so to speak) and was trying out the code
in apache-cassandra-0.8.0-src/examples/hadoop_word_count.

The bin/word_count_setup seems to work fine as cassandra-cli reports that
there are 1000 rows when I do

list input_words limit 2000;
(after connecting via connect 127.0.0.1/9160; and use wordcount;)

However, after running bin/word_count it seems the reducer is not writing
into Cassandra, as list output_words returns 0 rows.
When setting the output reducer to the filesystem I get results in
/tmp/word_count0, /tmp/word_count1, etc.

Has anybody observed the same problem and has an idea what might be wrong?

Thanks.

  -- Markus


Re : Re : Re : get_range_slices result

2011-06-30 Thread karim abbouh
What I want is to get the records back in the same order in which they were inserted.
How can I achieve this with any comparator type?
If there is Java code for this it would be useful.




From: aaron morton 
To: user@cassandra.apache.org
Sent: Tuesday, 28 June 2011, 12:40
Subject: Re: Re : Re : get_range_slices result


First thing is you really should upgrade from 0.6, the current release is 0.8. 

Info on time uuid's
http://wiki.apache.org/cassandra/FAQ#working_with_timeuuid_in_java

If you are using a higher level client like Hector or Pelops it will take care 
of encoding for you. 
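If you do need to handle the encoding yourself, here is a minimal sketch in plain JDK Java of packing a time-based (version 1) UUID into the 16 raw bytes a TimeUUIDType column name expects. It assumes the version 1 UUID comes from a client-library helper (Hector and Pelops both provide one); java.util.UUID on its own only generates version 4 UUIDs.

    import java.nio.ByteBuffer;
    import java.util.UUID;

    public final class TimeUuidBytes {
        // Pack a version 1 (time-based) UUID into the 16-byte column name
        // that the TimeUUIDType comparator sorts on.
        public static byte[] toBytes(UUID timeUuid) {
            ByteBuffer buf = ByteBuffer.allocate(16);
            buf.putLong(timeUuid.getMostSignificantBits());
            buf.putLong(timeUuid.getLeastSignificantBits());
            return buf.array();
        }
    }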

Cheers


-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com 

On 28 Jun 2011, at 22:20, karim abbouh wrote:

Can I have an example of using TimeUUIDType as a comparator in client
Java code?
>
>
>
>
>
>From: karim abbouh 
>To: "user@cassandra.apache.org" 
>Sent: Monday, 27 June 2011, 17:59
>Subject: Re : Re : get_range_slices result
>
>
>I used TimeUUIDType as the type in the storage-conf.xml file
>
> 
>
>
>and I used it as the comparator in my Java code,
>but at execution time I get an exception: 
>
>Erreur --java.io.UnsupportedEncodingException: TimeUUIDType
>
>
>
>
>How should I write it?
>
>
>BR
>
>
>
>
>
>From: David Boxenhorn 
>To: user@cassandra.apache.org
>Cc: karim abbouh 
>Sent: Friday, 24 June 2011, 11:25
>Subject: Re: Re : get_range_slices result
>
>You can get the best of both worlds by repeating the key in a column,
>and creating a secondary index on that column.
>
>On Fri, Jun 24, 2011 at 1:16 PM, Sylvain Lebresne  wrote:
>> On Fri, Jun 24, 2011 at 10:21 AM, karim abbouh  wrote:
>>> I want the get_range_slices() function to return records sorted (ordered)
>>> by the key (rowId) used during the insertion.
>>> Is it possible?
>>
>> You will have to use the OrderPreservingPartitioner. This is not
>> without inconvenience, however.
>> See for instance
>> http://wiki.apache.org/cassandra/StorageConfiguration#line-100 or
>> http://ria101.wordpress.com/2010/02/22/cassandra-randompartitioner-vs-orderpreservingpartitioner/
>> which give more details on the pros and cons (the short version being
>> that the main advantage of
>> OrderPreservingPartitioner is what you're asking for, but its main
>> drawback is that load-balancing
>> the cluster will likely be very, very hard).
>>
>> In general the advice is to stick with RandomPartitioner and design a
>> data model that avoids needing
>> range slices (or at least needing the result to be sorted). This is
>> very often not too hard, more
>> efficient, and much simpler than dealing with the load balancing
>> problems of OrderPreservingPartitioner.
>>
>> --
>> Sylvain
>>
>>>
>>> 
>>> From: aaron morton 
>>> To: user@cassandra.apache.org
>>> Sent: Thursday, 23 June 2011, 20:30
>>> Subject: Re: get_range_slices result
>>>
>>> Not sure what your question is.
>>> Does this help ? http://wiki.apache.org/cassandra/FAQ#range_rp
>>> Cheers
>>> -
>>> Aaron Morton
>>> Freelance Cassandra Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> On 23 Jun 2011, at 21:59, karim abbouh wrote:
>>>
>>> How can the get_range_slices() function return keys in sorted order?
>>> BR
>>>
>>>
>>>
>>>
>>
>
>
>
>
>

Row cache

2011-06-30 Thread Shay Assulin
Hi,

I am running Cassandra 0.7.4 and I monitor the nodes using JConsole. 

I am trying to figure out where Cassandra reads the returned rows from, 
and there are a few strange things... 

1. I am reading a few rows (using Hector) and the 
org.apache.cassandra.db.ColumnFamilies...ReadCount 
remains 0 - it remains 0 with the memtable and after flushing the memtable.
2. The column family is configured to run with row cache and key cache, 
and although I am reading the same row over and over, the row-cache 
size/requests remain 0. The key-cache size/requests attributes do 
change.





Why does Cassandra not cache a row that was requested a few times?
What does the ReadCount attribute in ColumnFamilies indicate, and why does 
it remain zero?
How can I know where Cassandra read a row from (the memtable, the row cache, 
or an SSTable)?
Is the following correct? In a read operation Cassandra looks for the row 
in the memtable; if not found, it looks in the row cache; if not found, it 
looks in the SSTables (after looking in the key cache to optimize the access 
to the SSTable)?

10x



Re: hadoop results

2011-06-30 Thread William Oberman
I think I'll do the former, thanks!
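For reference, a minimal sketch of that reversed-slice approach against the raw Thrift API (the "stats" column family and the method name are hypothetical; a higher-level client such as Hector exposes the same reversed/count options):

    import java.nio.ByteBuffer;
    import java.util.List;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.ColumnOrSuperColumn;
    import org.apache.cassandra.thrift.ColumnParent;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.cassandra.thrift.SliceRange;

    public class NewestColumn {
        // Fetch only the newest (highest) TimeUUID column of a row by asking
        // for a reversed slice with count = 1.
        static List<ColumnOrSuperColumn> newest(Cassandra.Client client, ByteBuffer rowKey)
                throws Exception {
            SliceRange range = new SliceRange(ByteBuffer.wrap(new byte[0]),  // empty start
                                              ByteBuffer.wrap(new byte[0]),  // empty finish = whole row
                                              true,                          // reversed: highest column first
                                              1);                            // ...and stop after one column
            SlicePredicate predicate = new SlicePredicate();
            predicate.setSlice_range(range);
            ColumnParent parent = new ColumnParent("stats");   // hypothetical CF
            return client.get_slice(rowKey, parent, predicate, ConsistencyLevel.QUORUM);
        }
    }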

On Wed, Jun 29, 2011 at 11:16 PM, aaron morton wrote:

> How about  get_slice() with reversed == true and count = 1 to get the
> highest time UUID ?
>
> Or you can also store a column with a magic name that have the value of the
> timeuuid that is the current metric to use.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 30 Jun 2011, at 06:35, William Oberman wrote:
>
> > I'll start with my question: given a CF with comparator TimeUUIDType,
> what is the most efficient way to get the greatest column's value?
> >
> > Context: I've been running cassandra for a couple of months now, so
> obviously it's time to start layering more on top :-)  In my test
> environment, I managed to get pig/hadoop running, and developed a few
> scripts to collect metrics I've been missing since I switched from MySQL to
> cassandra (including the ever useful "select count(*) from table"
> equivalent).
> >
> > I was hoping to dump the results of this processing back into cassandra
> for use in other tools/processes.  My initial thought was: new CF called
> "stats" with comparator TimeUUIDType.  The basic idea being I'd store:
> > stat_name -> time stat was computed (as UUID) -> value
> > That way I can also see a historical perspective of any given stat for
> auditing (and for cumulative stats to see trends).  The stat_name itself is
> a URI that is composed of "what" and any constraints on the "what"
> (including an optional time range, if the stat supports it).  E.g.
> ClassOfSomething/ID/MetricName/OptionalTimeRange (or something, still
> deciding on the format of the URI).  But, right now, the only way I know to
> get the "current" stat value would be to iterate over all columns (the
> TimeUUIDs) and then return the last one.
> >
> > Thanks for any tips,
> >
> > will
>
>


-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com


Re: word_count example in Cassandra 0.8.0

2011-06-30 Thread Jonathan Ellis
Fixed in 0.8.1: https://issues.apache.org/jira/browse/CASSANDRA-2727

On Thu, Jun 30, 2011 at 3:09 AM, Markus Mock  wrote:
> Hello,
> I am running into the following problem:  I am running a single node
> cassandra setup (out of the box so to speak) and was trying out the code
> in apache-cassandra-0.8.0-src/examples/hadoop_word_count.
> The bin/word_count_setup seems to work fine as cassandra-cli reports that
> there are 1000 rows when I do
> list input_words limit 2000;
> (after connecting via connect 127.0.0.1/9160; and use wordcount;)
> However, after running bin/word_count it seems the reducer is not writing
> into cassandra as list output_words returns 0 rows.
> When setting the output reducer to filesystem I get results in
> /tmp/word_count0 /tmp/word_count1 etc .
> Has anybody observed the same problem and has an idea what might be wrong?
> Thanks.
>   -- Markus
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


SimpleAuthenticator

2011-06-30 Thread Earl Barnes
Hi,

I am encountering an error while trying to set up simple authentication in a
test environment.

BACKGROUND
Cassandra Version: ReleaseVersion: 0.7.2-0ubuntu4~lucid1
OS Level: Linux cassandra1 2.6.32-32-server #62-Ubuntu SMP Wed Apr 20
22:07:43 UTC 2011 x86_64 GNU/Linux
2 node cluster

Properties files exist in the following directory:
 > /etc/cassandra/access.properties
 > /etc/cassandra/passwd.properties
The authenticator element in the /etc/cassandra/cassandra.yaml file is
set to:
authenticator: org.apache.cassandra.auth.SimpleAuthenticator
The authority element in the /etc/cassandra/cassandra.yaml file is set
to:
authority: org.apache.cassandra.auth.SimpleAuthority

The cassandra.in.sh file located in /usr/share/cassandra has been
updated to show the location of the properties files in the following
manner:

# Location of access.properties and passwd.properties
JVM_OPTS="
-Dpasswd.properties=/etc/cassandra/passwd.properties
-Daccess.properties=/etc/cassandra/access.properties"

Also, the destination of the configuration directory:
CASSANDRA_CONF=/etc/cassandra

ERROR
After setting DEBUG mode, I get the following error message in the
system.log:

 INFO [main] 2011-06-30 10:12:01,365 AbstractCassandraDaemon.java (line 249)
Cassandra shutting down...
 INFO [main] 2011-06-30 10:12:01,366 CassandraDaemon.java (line 159) Stop
listening to thrift clients
 INFO [main] 2011-06-30 10:13:14,186 AbstractCassandraDaemon.java (line 77)
Logging initialized
 INFO [main] 2011-06-30 10:13:14,196 AbstractCassandraDaemon.java (line 97)
Heap size: 510263296/511311872
 WARN [main] 2011-06-30 10:13:14,227 CLibrary.java (line 93) Obsolete
version of JNA present; unable to read errno. Upgrade to JNA 3.2.7 or later
 WARN [main] 2011-06-30 10:13:14,227 CLibrary.java (line 93) Obsolete
version of JNA present; unable to read errno. Upgrade to JNA 3.2.7 or later
 WARN [main] 2011-06-30 10:13:14,228 CLibrary.java (line 125) Unknown
mlockall error 0
 INFO [main] 2011-06-30 10:13:14,234 DatabaseDescriptor.java (line 121)
Loading settings from file:/etc/cassandra/cassandra.yaml
 INFO [main] 2011-06-30 10:13:14,337 DatabaseDescriptor.java (line 181)
DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
ERROR [main] 2011-06-30 10:13:14,342 DatabaseDescriptor.java (line 405)
Fatal configuration error
org.apache.cassandra.config.ConfigurationException: When using
org.apache.cassandra.auth.SimpleAuthenticator passwd.properties properties
must be defined.
at
org.apache.cassandra.auth.SimpleAuthenticator.validateConfiguration(SimpleAuthenticator.java:148)
at
org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:200)
at
org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:100)
at
org.apache.cassandra.service.AbstractCassandraDaemon.init(AbstractCassandraDaemon.java:217)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at
org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:160)
Data from the output.log:

 INFO 10:12:01,365 Cassandra shutting down...
 INFO 10:12:01,366 Stop listening to thrift clients
 INFO 10:13:14,186 Logging initialized
 INFO 10:13:14,196 Heap size: 510263296/511311872
 WARN 10:13:14,227 Obsolete version of JNA present; unable to read errno.
Upgrade to JNA 3.2.7 or later
 WARN 10:13:14,227 Obsolete version of JNA present; unable to read errno.
Upgrade to JNA 3.2.7 or later
 WARN 10:13:14,228 Unknown mlockall error 0
 INFO 10:13:14,234 Loading settings from file:/etc/cassandra/cassandra.yaml
 INFO 10:13:14,337 DiskAccessMode 'auto' determined to be mmap,
indexAccessMode is mmap
ERROR 10:13:14,342 Fatal configuration error
org.apache.cassandra.config.ConfigurationException: When using
org.apache.cassandra.auth.SimpleAuthenticator passwd.properties properties
must be defined.
at
org.apache.cassandra.auth.SimpleAuthenticator.validateConfiguration(SimpleAuthenticator.java:148)
at
org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:200)
at
org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:100)
at
org.apache.cassandra.service.AbstractCassandraDaemon.init(AbstractCassandraDaemon.java:217)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at
org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:160)
When using org.apache.cassandra.auth.S

RE: custom reconciling columns?

2011-06-30 Thread Jeremiah Jordan
The reason to break it up is that the information will then be on
different servers, so you can have server 1 spending time retrieving row
1, while you have server 2 retrieving row 2, and server 3 retrieving row
3...  So instead of getting 3000 things from one server, you get 1000
from 3 servers in parallel...
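A minimal sketch of what that split might look like on the client side, assuming the userId_[timestamp rounded to day] bucketing Nate suggests further down (all names here are illustrative only):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.TimeUnit;

    // Build the row keys for a user's last `days` daily buckets, so the
    // buckets can be fetched in parallel (e.g. with a multiget) from several
    // replicas instead of reading one huge row from a single server.
    static List<String> bucketKeys(String userId, long nowMillis, int days) {
        List<String> keys = new ArrayList<String>();
        for (int i = 0; i < days; i++) {
            long day = TimeUnit.MILLISECONDS.toDays(nowMillis) - i;  // days since epoch
            keys.add(userId + "_" + day);   // e.g. "user42_15155"
        }
        return keys;
    }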



From: Yang [mailto:tedd...@gmail.com] 
Sent: Wednesday, June 29, 2011 12:07 AM
To: user@cassandra.apache.org
Subject: Re: custom reconciling columns?


ok, here is the profiling result. I think this is consistent (I have been
trying to remember how to use YourKit effectively ...)  see attached
picture 

since I actually do not use the thrift interface, but just directly use
the thrift.CassandraServer and run my code in the same JVM as cassandra,

and was running the whole thing on a single box, there is no message
serialization/deserialization cost. but more columns did add more
time.

the time was spent in the ConcurrentSkipListMap operations that
implement the memtable. 


regarding breaking up the row, I'm not sure it would reduce my run time,
since our requirement is to read the entire rolling window history (we
already have 
the TTL enabled, so the history is limited to a certain length, but it
is quite long: over 1000, and in some cases 5000 or more).  I
think accessing roughly 1000 items is not an uncommon requirement for
many applications. in our case, each column has about 30 bytes of data,
besides the meta data such as ttl, timestamp.  
at history length of 3000, the read takes about 12ms (remember this is
completely in-memory, no disk access) 

I just took a look at the expiring column logic; it looks like the
expiration does not come into play until
CassandraServer.internal_get()===>thriftifyColumns() gets called. so the
above memtable access time is still spent. yes, then breaking up the row
is going to be helpful, but only to the degree of preventing accessing 
expired columns (btw  if this is actually built into cassandra code
it would be nicer, so instead of spending multiple key lookups, I locate
to the row once, and then within the row, there are different
"generation" buckets, so those old generation buckets that are beyond
expiration are not read ); currently just accessing the 3000 live
columns is already quite slow.

I'm trying to see whether there are some easy magic bullets for a
drop-in replacement for concurrentSkipListMap...

Yang




On Tue, Jun 28, 2011 at 4:18 PM, Nate McCall  wrote:


I agree with Aaron's suggestion on data model and query here.
Since
there is a time component, you can split the row on a fixed
duration
for a given user, so the row key would become userId_[timestamp
rounded to day].

This provides you an easy way to roll up the information for the
date
ranges you need since the key suffix can be created without a
read.
This also benefits from spreading the read load over the cluster
instead of just the replicas since you have 30 rows in this case
instead of one.


On Tue, Jun 28, 2011 at 5:55 PM, aaron morton
 wrote:
> Can you provide some more info:
> - how big are the rows, e.g. number of columns and column size
?
> - how much data are you asking for ?
> - what sort of read query are you using ?
> - what sort of numbers are you seeing ?
> - are you deleting columns or using TTL ?
> I would consider issues with the data churn, data model and
query before
> looking at serialisation.
> Cheers
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> On 29 Jun 2011, at 10:37, Yang wrote:
>
> I can see that as my user history grows, the reads time
proportionally ( or
> faster than linear) grows.
> if my business requirements ask me to keep a month's history
for each user,
> it could become too slow.- I was suspecting that it's
actually the
> serializing and deserializing that's taking time (I can
definitely it's cpu
> bound)
>
>
> On Tue, Jun 28, 2011 at 3:04 PM, aaron morton

> wrote:
>>
>> There is no facility to do custom reconciliation for a
column. An append
>> style operation would run into many of the same problems as
the Counter
>> type, e.g. not every node may get an append and there is a
chance for lost
>> appends unless you go to all the trouble Counter's do.
>>
>> I would go with using a row for the user and columns for each
item. Then
>> you can have fast no look writes.
>>
>> What problems are you seeing with the reads ?
>>
>> Cheers
>>
>>
>> -
>> Aaron Morton
>> Freelanc

Re: Re : get_range_slices result

2011-06-30 Thread Terje Marthinussen
It should of course be noted that how hard it is to load balance depends a
lot on your dataset.

Some datasets load balance reasonably well even when ordered, and use of the
OPP is not a big problem at all (on the contrary); in quite a few use
cases with current HW, read performance really isn't your problem in any
case.

You may for instance find it more useful to simplify adding nodes for
growing data capacity to the "end" of the token range using OPP than getting
extra performance you don't really need.

Terje

On Fri, Jun 24, 2011 at 7:16 PM, Sylvain Lebresne wrote:

> On Fri, Jun 24, 2011 at 10:21 AM, karim abbouh  wrote:
> > I want the get_range_slices() function to return records sorted (ordered) by the
> > key (rowId) used during the insertion.
> > Is it possible?
>
> You will have to use the OrderPreservingPartitioner. This is not
> without inconvenience, however.
> See for instance
> http://wiki.apache.org/cassandra/StorageConfiguration#line-100 or
>
> http://ria101.wordpress.com/2010/02/22/cassandra-randompartitioner-vs-orderpreservingpartitioner/
> which give more details on the pros and cons (the short version being
> that the main advantage of
> OrderPreservingPartitioner is what you're asking for, but its main
> drawback is that load-balancing
> the cluster will likely be very, very hard).
>
> In general the advice is to stick with RandomPartitioner and design a
> data model that avoids needing
> range slices (or at least needing the result to be sorted). This is
> very often not too hard, more
> efficient, and much simpler than dealing with the load balancing
> problems of OrderPreservingPartitioner.
>
> --
> Sylvain
>
> >
> > 
> > From: aaron morton 
> > To: user@cassandra.apache.org
> > Sent: Thursday, 23 June 2011, 20:30
> > Subject: Re: get_range_slices result
> >
> > Not sure what your question is.
> > Does this help ? http://wiki.apache.org/cassandra/FAQ#range_rp
> > Cheers
> > -
> > Aaron Morton
> > Freelance Cassandra Developer
> > @aaronmorton
> > http://www.thelastpickle.com
> > On 23 Jun 2011, at 21:59, karim abbouh wrote:
> >
> > How can the get_range_slices() function return keys in sorted order?
> > BR
> >
> >
> >
> >
>


Re: custom reconciling columns?

2011-06-30 Thread Yang
thanks.

but then the client application has the responsibility to sort the 3
segments (assuming that I need to order the "user browsing history" in the
example), I guess the total time would not be significantly different.  also
this results in 3 times more seeks while the original way needs only one
seek.  this is probably fine if my cluster is mostly idle, but if it's
mostly busy, the load is going to increase.

now my thinking is that the read path does not really need a map (the thrift
api is a list of columns anyway, sorted), so it's a luxury to construct a
map (in fact a sortedmap) in the internal process. we could very well just
use a sorted list to do the read path, which would be much faster.
(hacking out this idea today ...)

yang

On Thu, Jun 30, 2011 at 8:27 AM, Jeremiah Jordan <
jeremiah.jor...@morningstar.com> wrote:

> **
> The reason to break it up is that the information will then be on different
> servers, so you can have server 1 spending time retrieving row 1, while you
> have server 2 retrieving row 2, and server 3 retrieving row 3...  So instead
> of getting 3000 things from one server, you get 1000 from 3 servers in
> parallel...
>
>  --
> *From:* Yang [mailto:tedd...@gmail.com]
> *Sent:* Wednesday, June 29, 2011 12:07 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: custom reconciling columns?
>
> ok, here is the profiling result. I think this is consistent (having been
> trying to recover how to effectively use yourkit ...)  see attached picture
>
> since I actually do not use the thrift interface, but just directly use the
> thrift.CassandraServer and run my code in the same JVM as cassandra,
> and was running the whole thing on a single box, there is no message
> serialization/deserialization cost. but more columns did add on to more
> time.
>
> the time was spent in the ConcurrentSkipListMap operations that implement
> the memtable.
>
>
> regarding breaking up the row, I'm not sure it would reduce my run time,
> since our requirement is to read the entire rolling window history (we
> already have
> the TTL enabled , so the history is limited to a certain length, but it is
> quite long: over 1000 , in some  cases, can be 5000 or more ) .  I think
> accessing roughly 1000 items is not an uncommon requirement for many
> applications. in our case, each column has about 30 bytes of data, besides
> the meta data such as ttl, timestamp.
> at history length of 3000, the read takes about 12ms (remember this is
> completely in-memory, no disk access)
>
> I just took a look at the expiring column logic, it looks that the
> expiration does not come into play until when the
> CassandraServer.internal_get()===>thriftifyColumns() gets called. so the
> above memtable access time is still spent. yes, then breaking up the row is
> going to be helpful, but only to the degree of preventing accessing
> expired columns (btw  if this is actually built into cassandra code it
> would be nicer, so instead of spending multiple key lookups, I locate to the
> row once, and then within the row, there are different "generation" buckets,
> so those old generation buckets that are beyond expiration are not read );
> currently just accessing the 3000 live columns is already quite slow.
>
> I'm trying to see whether there are some easy magic bullets for a drop-in
> replacement for concurrentSkipListMap...
>
> Yang
>
>
>
>
> On Tue, Jun 28, 2011 at 4:18 PM, Nate McCall  wrote:
>
>> I agree with Aaron's suggestion on data model and query here. Since
>> there is a time component, you can split the row on a fixed duration
>> for a given user, so the row key would become userId_[timestamp
>> rounded to day].
>>
>> This provides you an easy way to roll up the information for the date
>> ranges you need since the key suffix can be created without a read.
>> This also benefits from spreading the read load over the cluster
>> instead of just the replicas since you have 30 rows in this case
>> instead of one.
>>
>> On Tue, Jun 28, 2011 at 5:55 PM, aaron morton 
>> wrote:
>> > Can you provide some more info:
>> > - how big are the rows, e.g. number of columns and column size  ?
>> > - how much data are you asking for ?
>> > - what sort of read query are you using ?
>> > - what sort of numbers are you seeing ?
>> > - are you deleting columns or using TTL ?
>> > I would consider issues with the data churn, data model and query before
>> > looking at serialisation.
>> > Cheers
>> > -
>> > Aaron Morton
>> > Freelance Cassandra Developer
>> > @aaronmorton
>> > http://www.thelastpickle.com
>> > On 29 Jun 2011, at 10:37, Yang wrote:
>> >
>> > I can see that as my user history grows, the reads time proportionally (
>> or
>> > faster than linear) grows.
>> > if my business requirements ask me to keep a month's history for each
>> user,
>> > it could become too slow.- I was suspecting that it's actually the
>> > serializing and deserializing that's taking time (I can defin

Re: Row cache

2011-06-30 Thread Daniel Doubleday
Here's my understanding of things ... (this applies only for the regular heap 
implementation of row cache)

> Why Cassandra does not cache a row that was requested few times? 

What does the cache capacity read? Is it > 0?

> What the ReadCount attribute in ColumnFamilies indicates and why it remains 
> zero. 

Hm, I had that too at one time (read count won't go up while there were reads). But 
I didn't have the time to debug.

> How can I know from where Cassandra read a row (from MEMTable,RowCache or 
> SSTable)? 

It will always read from either
the row cache, or
the memtable(s) and sstable(s).

JMX should tell you (hits go up).

> does the following correct? In read operation Cassandra looks for the row in 
> the MEMTable - if not found it looks in the row-cache - if not found it looks 
> in SSTable (after looking in the key-cache to optimize the access to the 
> SSTable)? 

No. 

If row cache capacity is > 0 then a read will check whether the row is in the cache; if 
not, it reads the entire row and caches it. Then, or if the row was in the cache already, 
it will read from there and apply the respective filter to the cached CF.   
Writes update the memtable and the row cache when the row is cached. I must admit that 
I still don't quite understand why there's no race here. I haven't found any 
cache lock. So someone else should explain why a concurrent read / write cannot 
produce a lost update in the cached row.

If capacity is 0 then it will read from the current memtable, the memtable(s) 
that are being flushed, and all sstables that may contain the row (filtered by 
the bloom filter).
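In pseudocode, that decision looks roughly like this (illustrative only, not the actual Cassandra source; all names are made up):

    // Illustrative pseudocode of the read path described above.
    ColumnFamily read(RowKey key, QueryFilter filter) {
        if (rowCacheCapacity > 0) {
            ColumnFamily cached = rowCache.get(key);
            if (cached == null) {
                // read and cache the *entire* row, not just the filtered slice
                cached = readWholeRowFromMemtablesAndSSTables(key);
                rowCache.put(key, cached);
            }
            return filter.apply(cached);   // serve the query from the cached copy
        }
        // capacity == 0: collate the current memtable, the memtables being
        // flushed and the sstables that may contain the row (bloom-filtered)
        return collate(memtables(key), candidateSSTables(key), filter);
    }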

Hope that's correct and helps.

Cheers,
Daniel



Alternative Row Cache Implementation

2011-06-30 Thread Daniel Doubleday
Hi all - or rather devs

we have been working on an alternative implementation to the existing row 
cache(s)

We have 2 main goals:

- Decrease memory -> get more rows in the cache without suffering a huge 
performance penalty
- Reduce gc pressure

This sounds a lot like we should be using the new serializing cache in 0.8. 
Unfortunately our workload consists of loads of updates which would invalidate 
the cache all the time.

The second unfortunate thing is that the idea we came up with doesn't fit the 
new cache provider api...

It looks like this:

Like the serializing cache, we basically only cache the serialized byte buffer. 
We don't serialize the bloom filter, and we try to do some other minor compression 
tricks (var ints etc., not done yet). The main difference is that we don't 
deserialize but use the normal sstable iterators and filters as in the regular 
uncached case.

So the read path looks like this:

return filter.collectCollatedColumns(memtable iter, cached row iter)

The write path is not affected. It does not update the cache

During flush we merge all memtable updates with the cached rows.

These are early test results:

- Depending on row width and value size the serialized cache takes between 30% 
- 50% of memory compared with cached CF. This might be optimized further
- Read times increase by 5 - 10%

We haven't tested the effects on gc but hope that we will see improvements 
there because we only cache a fraction of objects (in terms of numbers) in old 
gen heap which should make gc cheaper. Of course there's also the option to use 
native mem like serializing cache does.

We believe that this approach is quite promising but as I said it is not 
compatible with the current cache api.

So my question is: does that sound interesting enough to open a jira or has 
that idea already been considered and rejected for some reason?

Cheers,
Daniel
 

Re: Alternative Row Cache Implementation

2011-06-30 Thread Edward Capriolo
On Thu, Jun 30, 2011 at 12:44 PM, Daniel Doubleday  wrote:

> Hi all - or rather devs
>
> we have been working on an alternative implementation to the existing row
> cache(s)
>
> We have 2 main goals:
>
> - Decrease memory -> get more rows in the cache without suffering a huge
> performance penalty
> - Reduce gc pressure
>
> This sounds a lot like we should be using the new serializing cache in 0.8.
> Unfortunately our workload consists of loads of updates which would
> invalidate the cache all the time.
>
> The second unfortunate thing is that the idea we came up with doesn't fit
> the new cache provider api...
>
> It looks like this:
>
> Like the serializing cache we basically only cache the serialized byte
> buffer. we don't serialize the bloom filter and try to do some other minor
> compression tricks (var ints etc not done yet). The main difference is that
> we don't deserialize but use the normal sstable iterators and filters as in
> the regular uncached case.
>
> So the read path looks like this:
>
> return filter.collectCollatedColumns(memtable iter, cached row iter)
>
> The write path is not affected. It does not update the cache
>
> During flush we merge all memtable updates with the cached rows.
>
> These are early test results:
>
> - Depending on row width and value size the serialized cache takes between
> 30% - 50% of memory compared with cached CF. This might be optimized further
> - Read times increase by 5 - 10%
>
> We haven't tested the effects on gc but hope that we will see improvements
> there because we only cache a fraction of objects (in terms of numbers) in
> old gen heap which should make gc cheaper. Of course there's also the option
> to use native mem like serializing cache does.
>
> We believe that this approach is quite promising but as I said it is not
> compatible with the current cache api.
>
> So my question is: does that sound interesting enough to open a jira or has
> that idea already been considered and rejected for some reason?
>
> Cheers,
> Daniel
>



The problem I see with the row cache implementation is more of a JVM
problem. This problem is not localized to Cassandra (IMHO), as I hear HBase
people have similar large cache / Xmx issues. Personally, I feel this is a
sign of Java showing its age. "Let us worry about the pointers" was a great
solution when systems had 32MB of memory, because the cost of walking the
object graph was small and possible in small time windows. But JVMs
already cannot handle 13+ GB of RAM, and it is quite common to see systems
with 32-64GB of physical memory. I am very curious to see how Java is going to
evolve on systems with 128GB or even higher memory.

The G1 collector will help somewhat; however, I do not see that really
pushing Xmx higher than it is now. HBase has even gone the route of using an
off-heap cache, https://issues.apache.org/jira/browse/HBASE-4018 , and some
Jira mentions Cassandra exploring this alternative as well.

Doing whatever is possible to shrink the current size of items in the cache is
awesome. Anything that delivers more bang for the buck is +1. However, I feel
that the VFS cache is the only way to effectively cache large datasets. I was
quite disappointed when I upped a machine from 16GB to 48GB of physical
memory. I said to myself "Awesome! now I can shave off a couple of GB for
larger row caches". I changed Xmx from 9GB to 13GB, upped the caches, and
restarted. I found the system spending a lot of time managing heap, and also
found that my compaction processes that did 200GB in 4 hours were now taking
6 or 8 hours.

I had heard that JVMs "top out around 20GB" but I found they "top out" much
lower. VFS cache +1


RE: bulk load

2011-06-30 Thread Priyanka
I have been working with Cassandra for the last 4 weeks and am trying to load a
large amount of data. I am trying to use the 
bulk loading technique but am not clear on the process. Could someone explain
the process for the bulk load? 
Also, is the new bulk loading utility discussed in the previous posts
available? 

Could someone help me in this regard? 


Priyanka

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/bulk-load-tp6505627p6534280.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: CQL injection attacks?

2011-06-30 Thread Nate McCall
The CQL drivers are all still sitting on top of the execute_cql_query
Thrift API method for now.
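To make the concern concrete, here is a hypothetical sketch (the table and column names are made up, and it assumes CQL follows the SQL convention of doubling single quotes inside string literals):

    // Hypothetical illustration of the injection concern when a CQL string
    // is built by concatenation and then sent through execute_cql_query.
    String userInput = "bob' OR 'a'='a";   // attacker-controlled value

    // Vulnerable: the quote in userInput changes the structure of the query.
    String bad = "SELECT * FROM users WHERE name = '" + userInput + "'";

    // Until prepared statements (CASSANDRA-2475) land, the client-side
    // mitigation is to escape single quotes by doubling them.
    String escaped = userInput.replace("'", "''");
    String better = "SELECT * FROM users WHERE name = '" + escaped + "'";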

On Wed, Jun 29, 2011 at 2:12 PM,   wrote:
>
> Someone asked a while ago whether Cassandra was vulnerable to injection 
> attacks:
>
> http://stackoverflow.com/questions/5998838/nosql-injection-php-phpcassa-cassandra
>
> With Thrift, the answer was 'no'.
>
> With CQL, presumably the situation is different, at least until prepared
> statements are possible (CASSANDRA-2475) ?
>
> Has there been any discussion on this already that someone could point me to,
> please? I couldn't see anything on JIRA (searching for CQL AND injection, CQL
> AND security, etc).
>
> Thanks.
>
> 
> This message was sent using IMP, the Internet Messaging Program.
>
> This email and any attachments to it may be confidential and are
> intended solely for the use of the individual to whom it is addressed.
> If you are not the intended recipient of this email, you must neither
> take any action based upon its contents, nor copy or show it to anyone.
> Please contact the sender if you believe you have received this email in
> error. QinetiQ may monitor email traffic data and also the content of
> email for the purposes of security. QinetiQ Limited (Registered in
> England & Wales: Company Number: 3796233) Registered office: Cody Technology
> Park, Ively Road, Farnborough, Hampshire, GU14 0LX http://www.qinetiq.com.
>


SimpleAuthenticator

2011-06-30 Thread earltbj
Hi,

 

I am encountering an error while trying to set up simple authentication in a
test environment.  

 

BACKGROUND
(1) Cassandra Version: ReleaseVersion: 0.7.2-0ubuntu4~lucid1
(2) OS Level: Linux cassandra1 2.6.32-32-server #62-Ubuntu SMP Wed Apr 20
22:07:43 UTC 2011 x86_64 GNU/Linux
2 node cluster

Properties file exist in the following directory:
 > /etc/cassandra/access.properties
 > /etc/cassandra/passwd.properties



The authenticator element in the /etc/cassandra/cassandra.yaml file is set
to:
authenticator: org.apache.cassandra.auth.SimpleAuthenticator
The authority element in the /etc/cassandra/cassandra.yaml file is set to:
authority: org.apache.cassandra.auth.SimpleAuthority

 

The cassandra.in.sh file located in /usr/share/cassandra has been updated to
show the location of the properties files in the following manner:

 

# Location of access.properties and passwd.properties
JVM_OPTS="
-Dpasswd.properties=/etc/cassandra/passwd.properties
-Daccess.properties=/etc/cassandra/access.properties"

 

Also, the destination of the configuration directory:

CASSANDRA_CONF=/etc/cassandra

 

ERROR

After setting DEBUG mode, I get the following error message in the
system.log:

 

 INFO [main] 2011-06-30 10:12:01,365 AbstractCassandraDaemon.java (line 249)
Cassandra shutting down...
 INFO [main] 2011-06-30 10:12:01,366 CassandraDaemon.java (line 159) Stop
listening to thrift clients
 INFO [main] 2011-06-30 10:13:14,186 AbstractCassandraDaemon.java (line 77)
Logging initialized
 INFO [main] 2011-06-30 10:13:14,196 AbstractCassandraDaemon.java (line 97)
Heap size: 510263296/511311872
 WARN [main] 2011-06-30 10:13:14,227 CLibrary.java (line 93) Obsolete
version of JNA present; unable to read errno. Upgrade to JNA 3.2.7 or later
 WARN [main] 2011-06-30 10:13:14,227 CLibrary.java (line 93) Obsolete
version of JNA present; unable to read errno. Upgrade to JNA 3.2.7 or later
 WARN [main] 2011-06-30 10:13:14,228 CLibrary.java (line 125) Unknown
mlockall error 0
 INFO [main] 2011-06-30 10:13:14,234 DatabaseDescriptor.java (line 121)
Loading settings from file:/etc/cassandra/cassandra.yaml
 INFO [main] 2011-06-30 10:13:14,337 DatabaseDescriptor.java (line 181)
DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
ERROR [main] 2011-06-30 10:13:14,342 DatabaseDescriptor.java (line 405)
Fatal configuration error
org.apache.cassandra.config.ConfigurationException: When using
org.apache.cassandra.auth.SimpleAuthenticator passwd.properties properties
must be defined.
at
org.apache.cassandra.auth.SimpleAuthenticator.validateConfiguration(SimpleAuthenticator.java:148)
at
org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:200)
at
org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:100)
at
org.apache.cassandra.service.AbstractCassandraDaemon.init(AbstractCassandraDaemon.java:217)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at
org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:160)


Data from the output.log:

 

 INFO 10:12:01,365 Cassandra shutting down...
 INFO 10:12:01,366 Stop listening to thrift clients
 INFO 10:13:14,186 Logging initialized
 INFO 10:13:14,196 Heap size: 510263296/511311872
 WARN 10:13:14,227 Obsolete version of JNA present; unable to read errno.
Upgrade to JNA 3.2.7 or later
 WARN 10:13:14,227 Obsolete version of JNA present; unable to read errno.
Upgrade to JNA 3.2.7 or later
 WARN 10:13:14,228 Unknown mlockall error 0
 INFO 10:13:14,234 Loading settings from file:/etc/cassandra/cassandra.yaml
 INFO 10:13:14,337 DiskAccessMode 'auto' determined to be mmap,
indexAccessMode is mmap
ERROR 10:13:14,342 Fatal configuration error
org.apache.cassandra.config.ConfigurationException: When using
org.apache.cassandra.auth.SimpleAuthenticator passwd.properties properties
must be defined.
at
org.apache.cassandra.auth.SimpleAuthenticator.validateConfiguration(SimpleAuthenticator.java:148)
at
org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:200)
at
org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:100)
at
org.apache.cassandra.service.AbstractCassandraDaemon.init(AbstractCassandraDaemon.java:217)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at
org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:160)
When using org.apache.cassandra.auth

RE: Cassandra ACID

2011-06-30 Thread Jeremiah Jordan
For your Consistency case, it is actually an ALL read that is needed,
not an ALL write.  An ALL read, with whatever consistency level of write
you need (to support machines dying), is the only way to get
consistent results in the face of a failed write at > ONE that
went to one node, but not the others.
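To make that concrete (assuming RF = 3, nodes A/B/C): suppose a QUORUM write of 'foo':'new' times out after reaching only A, so A holds 'new' while B and C still hold 'old'. A QUORUM read may pick {B, C} and return 'old', while a later QUORUM read that happens to include A returns 'new' - two reads, two answers. An ALL read always includes A and always resolves to the newest timestamp, which is why it is the read, not the preceding write, that has to be at ALL to stay consistent after a partially failed write.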



From: AJ [mailto:a...@dude.podzone.net] 
Sent: Friday, June 24, 2011 11:28 PM
To: user@cassandra.apache.org
Subject: Re: Cassandra ACID


Ok, here it is reworked; consider it a summary of the thread.  If I left
out an important point that you think is 100% correct even if you
already mentioned it, then make some noise about it and provide some
evidence so it's captured sufficiently.  And, if you're in a debate,
please try and get to a resolution; all will appreciate it.

It will be evident below that Consistency is not the only thing that is
"tunable", at least indirectly.  Unfortunately, you still can't
tunafish.  Ar ar ar.

Atomicity
All individual writes are atomic at the row level.  So, a batch mutate
for one specific key will apply updates to all the columns for that one
specific row atomically.  If part of the single-key batch update fails,
then all of the updates will be reverted since they all pertained to one
key/row.  Notice, I said 'reverted' not 'rolled back'.  Note: atomicity
and isolation are related to the topic of transactions but one does not
imply the other.  Even though row updates are atomic, they are not
isolated from other users' updates or reads.
Refs: http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic
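As a small sketch of such a single-row batch using Hector (assuming Hector's Mutator API; the keyspace, column family and column names are made up):

    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.mutation.Mutator;

    // Queue several column updates for ONE row key and send them as a single
    // batch mutation; per the FAQ above, updates to a single key are applied
    // atomically at the row level.
    public class SingleRowBatch {
        static void updateUserRow(Keyspace keyspace, String userKey) {
            Mutator<String> m = HFactory.createMutator(keyspace, StringSerializer.get());
            m.addInsertion(userKey, "Users", HFactory.createStringColumn("email", "ann@example.com"));
            m.addInsertion(userKey, "Users", HFactory.createStringColumn("city", "Pittsburgh"));
            m.execute();   // one row key -> both columns applied atomically
        }
    }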

Consistency
Cassandra does not provide the same scope of Consistency as defined in
the ACID standard.  Consistency in C* does not include referential
integrity since C* is not a relational database.  Any referential
integrity required would have to be handled by the client.  Also, even
though the official docs say that QUORUM writes/reads is the minimal
consistency_level setting to guarantee full consistency, this assumes
that the write preceding the read does not fail (see comments below).
Therefore, an ALL write would be necessary prior to a QUORUM read of the
same data.  For a multi-dc scenario use an ALL write followed by an
EACH_QUORUM read.
Refs: http://wiki.apache.org/cassandra/ArchitectureOverview

Isolation
NOTHING is isolated; because there is no transaction support in the
first place.  This means that two or more clients can update the same
row at the same time.  Their updates of the same or different columns
may be interleaved and leave the row in a state that may not make sense
depending on your application.  Note: this doesn't mean to say that two
updates of the same column will be corrupted, obviously; columns are the
smallest atomic unit ('atomic' in the more general thread-safe context).
Refs: None that directly address this explicitly and clearly and in one
place.

Durability
Updates are made highly durable at the level comparable to a DBMS by the
use of the commit log.  However, this requires "commitlog_sync: batch"
in cassandra.yaml.  For "some" performance improvement with "some" cost
in durability you can specify "commitlog_sync: periodic".  See
discussion below for more details.
Refs: Plenty + this thread.



On 6/24/2011 1:46 PM, Jim Newsham wrote: 

On 6/23/2011 8:55 PM, AJ wrote: 

Can any Cassandra contributors/guru's confirm my
understanding of Cassandra's degree of support for the ACID properties?

I provide official references when known.  Please let me
know if I missed some good official documentation.

Atomicity
All individual writes are atomic at the row level.  So,
a batch mutate for one specific key will apply updates to all the
columns for that one specific row atomically.  If part of the single-key
batch update fails, then all of the updates will be reverted since they
all pertained to one key/row.  Notice, I said 'reverted' not 'rolled
back'.  Note: atomicity and isolation are related to the topic of
transactions but one does not imply the other.  Even though row updates
are atomic, they are not isolated from other users' updates or reads.

Refs:
http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic

Consistency
If you want 100% consistency, use consistency level
QUORUM for both reads and writes and EACH_QUORUM in a multi-dc scenario.

Refs:
http://wiki.apache.org/cassandra/ArchitectureOverview




This is a pretty narrow interpretation of consistency.  In a
traditional database, consistency prevents you from getting into a
logically inconsistent state, where records in one table do not agree
with records in another table.  This includes referential integrity,
cascading deletes, etc.  It seems to me Cassandra has no support for
this concept whatsoever.


Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-06-30 Thread A J
I am a little confused about the reason why nodetool repair has to run
within GCGraceSeconds.

The documentation at:
http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair
is not very clear to me.

How can a delete be 'unforgotten' if I don't run nodetool repair? (I
understand that if a node is down for more than GCGraceSeconds, I
should not bring it up without resyncing it completely. Otherwise
deletes may reappear. http://wiki.apache.org/cassandra/DistributedDeletes
)
But not sure how exactly nodetool repair ties into this mechanism of
distributed deletes.

Thanks for any clarifications.


Re: Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-06-30 Thread Edward Capriolo
On Thu, Jun 30, 2011 at 4:25 PM, A J  wrote:

> I am little confused of the reason why nodetool repair has to run
> within GCGraceSeconds.
>
> The documentation at:
> http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair
> is not very clear to me.
>
> How can a delete be 'unforgotten' if I don't run nodetool repair? (I
> understand that if a node is down for more than GCGraceSeconds, I
> should not get it up without resynching is completely. Otherwise
> deletes may reappear.http://wiki.apache.org/cassandra/DistributedDeletes
> )
> But not sure how exactly nodetool repair ties into this mechanism of
> distributed deletes.
>
> Thanks for any clarifications.
>

Read repair does NOT repair tombstones. Failed writes/tombstones with
TimedOutException do not get hinted even if HH is on.
https://issues.apache.org/jira/browse/CASSANDRA-2034. Thus tombstones can
get lost.

Because of this the only way to find lost tombstones is to anti-entropy
repair. If you do not repair in the gc period a node could lose a tombstone
and the row could be read repaired and resurrected.

In our case we are lucky: we delete rows when they get old and stale. While
it is not great if a deleted row reappears, it is not harmful, so I can live
with less repairing than most.


Re: Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-06-30 Thread Konstantin Naryshkin
As I understand, it has to do with a node being up but missing the delete 
message (remember, if you apply the delete at CL.QUORUM, you can have almost 
half the replicas miss it and still succeed). Imagine that you have 3 nodes A, 
B, and C, each of which has a column 'foo' with a value 'bar'. Their state 
would be:
A: 'foo':'bar' B: 'foo':'bar' C: 'foo':'bar'

We attempt to delete column 'foo', and it succeeds on nodes A and B (meaning 
that we succeeded on CL.QUORUM). Unfortunately the packet going to node C runs 
afoul of the network gods and gets zapped in transit. The state is now:
A: 'foo':deleted B: 'foo':deleted C: 'foo':'bar'

If we try a read at this point, at CL.QUORUM, we are guaranteed to get at least 
one record that 'foo' was deleted and because of timestamps we know to tell the 
client as much.

After GCGraceSeconds and a compaction, the state of the nodes will be:
A: None B: None C: 'foo':'bar'

Some time later, we attempt a read and just happen to get C's response first. 
The response will be that 'foo' is storing 'bar'. Not only that, but read 
repair happens as well, so the state will become:
A: 'foo':'bar' B: 'foo':'bar' C: 'foo':'bar'

We have the infamous undelete.

- Original Message -
From: "A J" 
To: user@cassandra.apache.org
Sent: Thursday, June 30, 2011 8:25:29 PM
Subject: Meaning of 'nodetool repair has to run within GCGraceSeconds'

I am little confused of the reason why nodetool repair has to run
within GCGraceSeconds.

The documentation at:
http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair
is not very clear to me.

How can a delete be 'unforgotten' if I don't run nodetool repair? (I
understand that if a node is down for more than GCGraceSeconds, I
should not get it up without resynching is completely. Otherwise
deletes may reappear.http://wiki.apache.org/cassandra/DistributedDeletes
)
But not sure how exactly nodetool repair ties into this mechanism of
distributed deletes.

Thanks for any clarifications.


Re: SimpleAuthenticator

2011-06-30 Thread earltbj
Found the fix myself, and wanted to share the resolution.  Documentation
states that the "cassandra.in.sh" file needs to be updated with the
following values, if the properties files exist in the directory I've
stipulated:

JVM_OPTS="$JVM_OPTS -Dpasswd.properties=/etc/cassandra/passwd.properties"
JVM_OPTS="$JVM_OPTS -Daccess.properties=/etc/cassandra/access.properties"

Turns out that "cassandra.in.sh" was not being called at all during start
up.  Not sure if this is a bug or not, but to get around the issue I
inserted the two lines above into the "cassandra-env.sh" file, started up
the instance and ... the database comes up and I get prompted the following:

root@cassandra1:/etc/cassandra# cassandra-cli
Welcome to cassandra CLI.

Type 'help;' or '?' for help. Type 'quit;' or 'exit;' to quit.
[default@unknown] connect 11.1.11.111/9160;
Login failure. Did you specify 'keyspace', 'username' and 'password'?
[default@unknown]


--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/SimpleAuth-enticator-tp6534645p6534942.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Repair doesn't work after upgrading to 0.8.1

2011-06-30 Thread Héctor Izquierdo Seliva
Hi all,

I have upgraded all my cluster to 0.8.1. Today one of the disks in one
of the nodes died. After replacing the disk I tried running repair, but
this message appears:

 INFO [manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf] 2011-06-30
20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.80
from repair because it is on version 0.7 or sooner. You should consider
updating this node before running repair again.
 INFO [manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098] 2011-06-30
20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.76
from repair because it is on version 0.7 or sooner. You should consider
updating this node before running repair again.
 INFO [manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a] 2011-06-30
20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.80
from repair because it is on version 0.7 or sooner. You should consider
updating this node before running repair again.
 INFO [manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098] 2011-06-30
20:36:25,086 AntiEntropyService.java (line 179) Excluding /10.20.13.77
from repair because it is on version 0.7 or sooner. You should consider
updating this node before running repair again.
 INFO [manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf] 2011-06-30
20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.76
from repair because it is on version 0.7 or sooner. You should consider
updating this node before running repair again.
 INFO [manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098] 2011-06-30
20:36:25,086 AntiEntropyService.java (line 782) No neighbors to repair
with for sbs on
(170141183460469231731687303715884105727,28356863910078205288614550619314017621]:
 manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098 completed.
 INFO [manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a] 2011-06-30
20:36:25,086 AntiEntropyService.java (line 179) Excluding /10.20.13.79
from repair because it is on version 0.7 or sooner. You should consider
updating this node before running repair again.
 INFO [manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf] 2011-06-30
20:36:25,086 AntiEntropyService.java (line 782) No neighbors to repair
with for sbs on
(141784319550391026443072753096570088105,170141183460469231731687303715884105727]:
 manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf completed.
 INFO [manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a] 2011-06-30
20:36:25,087 AntiEntropyService.java (line 782) No neighbors to repair
with for sbs on
(113427455640312821154458202477256070484,141784319550391026443072753096570088105]:
 manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a completed.

What can I do?



Re: Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-06-30 Thread Jonathan Ellis
On Thu, Jun 30, 2011 at 3:47 PM, Edward Capriolo  wrote:
> Read repair does NOT repair tombstones.

It does, but you can't rely on RR to repair _all_ tombstones, because
RR only happens if the row in question is requested by a client.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-06-30 Thread A J
Thanks all !
In other words, I think it is safe to say that a node as a whole can
be made consistent only on 'nodetool repair'.

Has there been enough interest in providing anti-entropy without
compaction as a separate operation (nodetool repair does both) ?


On Thu, Jun 30, 2011 at 5:27 PM, Jonathan Ellis  wrote:
> On Thu, Jun 30, 2011 at 3:47 PM, Edward Capriolo  
> wrote:
>> Read repair does NOT repair tombstones.
>
> It does, but you can't rely on RR to repair _all_ tombstones, because
> RR only happens if the row in question is requested by a client.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>


Re: custom reconciling columns?

2011-06-30 Thread Yang
ok, I kind of found the magic bullet, but you can only use it to shoot your
enemy at really close range :)


for the read path, the thrift API already limits the output to a list of
columns, so it does not make sense to use maps in the internal operations.
plus the returned CF on the read path is not going to be modified/shared by
any other threads, so synchronization is not necessary. so
the solution is to modify ColumnFamilyStore so that getTopLevelColumns takes
a returnCF param, instead of always constructing it inside with
ColumnFamily.create().
so only read path behavior is changed.

in the read path, we pass in a FastColumnFamily implementation, which uses an
ArrayList internally to store sorted columns, does a binary search to insert,
and a merge for addAll(column).

I tried this out; it's about 50% faster on rows with 3000 cols.
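For reference, the binary-search insert into a sorted ArrayList is the standard JDK pattern; a minimal sketch (the real FastColumnFamily would also need the merge for addAll):

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Comparator;

    public class SortedInsert {
        // Insert one item into an already-sorted ArrayList, keeping it sorted.
        // Collections.binarySearch returns (-(insertionPoint) - 1) when the key
        // is absent, which we turn back into the insertion index.
        static <T> void insertSorted(ArrayList<T> columns, T column, Comparator<? super T> cmp) {
            int idx = Collections.binarySearch(columns, column, cmp);
            if (idx < 0) {
                idx = -(idx + 1);
            }
            columns.add(idx, column);   // O(log n) search, O(n) element shift
        }
    }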


Jonathan: do you think this is a viable approach? the only disadvantage is a
slight change to getTopLevelColumns so we have 2 flavors of this method

Thanks
Yang

On Wed, Jun 29, 2011 at 5:51 PM, Jonathan Ellis  wrote:

> On Tue, Jun 28, 2011 at 10:06 PM, Yang  wrote:
> > I'm trying to see whether there are some easy magic bullets for a drop-in
> > replacement for concurrentSkipListMap...
>
> I'm highly interested if you find one. :)
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>


Re: Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-06-30 Thread AJ

It would be helpful if this was automated somehow.


Re: Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-06-30 Thread Edward Capriolo
On Thu, Jun 30, 2011 at 5:27 PM, Jonathan Ellis  wrote:

> On Thu, Jun 30, 2011 at 3:47 PM, Edward Capriolo 
> wrote:
> > Read repair does NOT repair tombstones.
>
> It does, but you can't rely on RR to repair _all_ tombstones, because
> RR only happens if the row in question is requested by a client.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>

Doh! Right. I was thinking about range scans and read repair.
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Alternative-to-repair-td6098108.html


Re: Alternative Row Cache Implementation

2011-06-30 Thread Terje Marthinussen
We had a visitor from Intel a month ago.

One question from him was "What could you do if we gave you a server 2 years
from now that had 16TB of memory"

I went Eh... using Java?

2 years is maybe unrealistic, but you can already get some quite acceptable
prices even on servers in the 100GB memory range now if you buy in larger
quantities (30-50 servers and more in one go).

I don't think it is unrealistic that we will start seeing high-end consumer
(x64) servers with TBs of memory in a few years, and I really wonder where
that puts Java-based software.

Terje

On Fri, Jul 1, 2011 at 2:25 AM, Edward Capriolo wrote:

>
>
> On Thu, Jun 30, 2011 at 12:44 PM, Daniel Doubleday <
> daniel.double...@gmx.net> wrote:
>
>> Hi all - or rather devs
>>
>> we have been working on an alternative implementation to the existing row
>> cache(s)
>>
>> We have 2 main goals:
>>
>> - Decrease memory -> get more rows in the cache without suffering a huge
>> performance penalty
>> - Reduce gc pressure
>>
>> This sounds a lot like we should be using the new serializing cache in
>> 0.8.
>> Unfortunately our workload consists of loads of updates which would
>> invalidate the cache all the time.
>>
>> The second unfortunate thing is that the idea we came up with doesn't fit
>> the new cache provider api...
>>
>> It looks like this:
>>
>> Like the serializing cache we basically only cache the serialized byte
>> buffer. we don't serialize the bloom filter and try to do some other minor
>> compression tricks (var ints etc not done yet). The main difference is that
>> we don't deserialize but use the normal sstable iterators and filters as in
>> the regular uncached case.
>>
>> So the read path looks like this:
>>
>> return filter.collectCollatedColumns(memtable iter, cached row iter)
>>
>> The write path is not affected. It does not update the cache
>>
>> During flush we merge all memtable updates with the cached rows.
>>
>> These are early test results:
>>
>> - Depending on row width and value size the serialized cache takes between
>> 30% - 50% of memory compared with cached CF. This might be optimized further
>> - Read times increase by 5 - 10%
>>
>> We haven't tested the effects on gc but hope that we will see improvements
>> there because we only cache a fraction of objects (in terms of numbers) in
>> old gen heap which should make gc cheaper. Of course there's also the option
>> to use native mem like serializing cache does.
>>
>> We believe that this approach is quite promising but as I said it is not
>> compatible with the current cache api.
>>
>> So my question is: does that sound interesting enough to open a jira or
>> has that idea already been considered and rejected for some reason?
>>
>> Cheers,
>> Daniel
>>
>
>
>
> The problem I see with the row cache implementation is more of a JVM
> problem. This problem is not Cassandra-specific (IMHO), as I hear HBase
> people have similar large cache / Xmx issues. Personally, I feel this is a
> sign of Java showing its age. "Let us worry about the pointers" was a great
> solution when systems had 32MB of memory, because the cost of walking the
> object graph was small and possible in small time windows. But JVMs already
> cannot handle 13+ GB of RAM, and it is quite common to see systems
> with 32-64GB of physical memory. I am very curious to see how Java is going
> to evolve on systems with 128GB or even more memory.
>
> The G1 collector will help somewhat; however, I do not see it really pushing
> Xmx higher than it is now. HBase has even gone the route of using an
> off-heap cache, https://issues.apache.org/jira/browse/HBASE-4018 , and
> some Jira tickets mention Cassandra exploring this alternative as well.
>
> Doing whatever is possible to shrink the current size of items in the cache
> is awesome. Anything that delivers more bang for the buck is +1. However, I
> feel that the VFS cache is the only way to effectively cache large datasets.
> I was quite disappointed when I upped a machine from 16GB to 48GB of
> physical memory. I said to myself "Awesome! Now I can shave off a couple of
> GB for larger row caches." I changed Xmx from 9GB to 13GB, upped the caches,
> and restarted. I found the system spending a lot of time managing heap, and
> also found that my compaction processes that did 200GB in 4 hours were now
> taking 6 or 8 hours.
>
> I had heard that JVMs "top out around 20GB", but I found they "top out" much
> lower. VFS cache +1
>
>
>


Re: Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-06-30 Thread Watanabe Maki
Repair doesn't compact. Those are different processes already.
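For example, the two operations are triggered by separate nodetool commands
(both can also take a keyspace argument to narrow the scope):

nodetool -h localhost repair     # anti-entropy only
nodetool -h localhost compact    # forces a major compaction only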

maki


On 2011/07/01, at 7:21, A J  wrote:

> Thanks all !
> In other words, I think it is safe to say that a node as a whole can
> be made consistent only on 'nodetool repair'.
> 
> Has there been enough interest in providing anti-entropy without
> compaction as a separate operation (nodetool repair does both) ?
> 
> 
> On Thu, Jun 30, 2011 at 5:27 PM, Jonathan Ellis  wrote:
>> On Thu, Jun 30, 2011 at 3:47 PM, Edward Capriolo  
>> wrote:
>>> Read repair does NOT repair tombstones.
>> 
>> It does, but you can't rely on RR to repair _all_ tombstones, because
>> RR only happens if the row in question is requested by a client.
>> 
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>> 


Re: Alternative Row Cache Implementation

2011-06-30 Thread Jonathan Ellis
I'm interested. :)

On Thu, Jun 30, 2011 at 11:44 AM, Daniel Doubleday
 wrote:
> Hi all - or rather devs
>
> we have been working on an alternative implementation to the existing row 
> cache(s)
>
> We have 2 main goals:
>
> - Decrease memory -> get more rows in the cache without suffering a huge 
> performance penalty
> - Reduce gc pressure
>
> This sounds a lot like we should be using the new serializing cache in 0.8.
> Unfortunately our workload consists of loads of updates which would 
> invalidate the cache all the time.
>
> The second unfortunate thing is that the idea we came up with doesn't fit the 
> new cache provider api...
>
> It looks like this:
>
> Like the serializing cache we basically only cache the serialized byte 
> buffer. we don't serialize the bloom filter and try to do some other minor 
> compression tricks (var ints etc not done yet). The main difference is that 
> we don't deserialize but use the normal sstable iterators and filters as in 
> the regular uncached case.
>
> So the read path looks like this:
>
> return filter.collectCollatedColumns(memtable iter, cached row iter)
>
> The write path is not affected. It does not update the cache
>
> During flush we merge all memtable updates with the cached rows.
>
> These are early test results:
>
> - Depending on row width and value size the serialized cache takes between 
> 30% - 50% of memory compared with cached CF. This might be optimized further
> - Read times increase by 5 - 10%
>
> We haven't tested the effects on gc but hope that we will see improvements 
> there because we only cache a fraction of objects (in terms of numbers) in 
> old gen heap which should make gc cheaper. Of course there's also the option 
> to use native mem like serializing cache does.
>
> We believe that this approach is quite promising but as I said it is not 
> compatible with the current cache api.
>
> So my question is: does that sound interesting enough to open a jira or has 
> that idea already been considered and rejected for some reason?
>
> Cheers,
> Daniel
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
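
To make the read path described above a bit more concrete, here is a minimal,
self-contained sketch of the collation step (merging the memtable iterator
with the cached-row iterator). The types and the tie-breaking rule are
simplified stand-ins, not the real Cassandra API, which reconciles column
versions by timestamp:

import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

final class CollateSketch {
    // Merge two column streams that are each already sorted by the comparator.
    // On a name collision this sketch simply lets the memtable (newer) column
    // win; a real implementation would reconcile the two versions.
    static <C> List<C> collate(Iterator<C> memtable, Iterator<C> cached,
                               Comparator<? super C> cmp) {
        List<C> out = new ArrayList<C>();
        C m = memtable.hasNext() ? memtable.next() : null;
        C c = cached.hasNext() ? cached.next() : null;
        while (m != null || c != null) {
            if (c == null || (m != null && cmp.compare(m, c) < 0)) {
                out.add(m);                  // only the memtable side is left, or it sorts first
                m = memtable.hasNext() ? memtable.next() : null;
            } else if (m == null || cmp.compare(m, c) > 0) {
                out.add(c);                  // only the cached side is left, or it sorts first
                c = cached.hasNext() ? cached.next() : null;
            } else {
                out.add(m);                  // same column name: memtable version wins here
                m = memtable.hasNext() ? memtable.next() : null;
                c = cached.hasNext() ? cached.next() : null;
            }
        }
        return out;
    }
}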


Re: SimpleAuthenticator

2011-06-30 Thread aaron morton
cassandra.in.sh is old school (0.6 series); the 0.7 series uses
cassandra-env.sh. The packages put it in /etc/cassandra.

This works for me at the end of cassandra-env.sh:

JVM_OPTS="$JVM_OPTS -Dpasswd.properties=/etc/cassandra/passwd.properties"
JVM_OPTS="$JVM_OPTS -Daccess.properties=/etc/cassandra/access.properties"

btw at a minimum you should upgrade from 0.7.2 to 0.7.6-2 see 
https://github.com/apache/cassandra/blob/cassandra-0.7.6-2/NEWS.txt#L61

Hope that helps. 

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 1 Jul 2011, at 02:20, Earl Barnes wrote:

> Hi,
>  
> I am encountering an error while trying to set up simple authentication in a 
> test environment. 
>  
> BACKGROUND
> Cassandra Version: ReleaseVersion: 0.7.2-0ubuntu4~lucid1
> OS Level: Linux cassandra1 2.6.32-32-server #62-Ubuntu SMP Wed Apr 20 
> 22:07:43 UTC 2011 x86_64 GNU/Linux
> 2 node cluster
> Properties files exist in the following directory:
> 
>  > /etc/cassandra/access.properties
>  > /etc/cassandra/passwd.properties
> The authenticator element in the /etc/cassandra/cassandra.yaml file is set to:
> authenticator: org.apache.cassandra.auth.SimpleAuthenticator
> The authority element in the /etc/cassandra/cassandra.yaml file is set to:
> authority: org.apache.cassandra.auth.SimpleAuthority
>  
> The cassandra.in.sh file located in /usr/share/cassandra has been updated to 
> show the location of the properties files in the following manner:
>  
> # Location of access.properties and passwd.properties
> JVM_OPTS="
> -Dpasswd.properties=/etc/cassandra/passwd.properties
> -Daccess.properties=/etc/cassandra/access.properties"
>  
> Also, the destination of the configuration directory:
> CASSANDRA_CONF=/etc/cassandra
>  
> ERROR
> After setting DEBUG mode, I get the following error message in the system.log:
>  
>  INFO [main] 2011-06-30 10:12:01,365 AbstractCassandraDaemon.java (line 249) 
> Cassandra shutting down...
>  INFO [main] 2011-06-30 10:12:01,366 CassandraDaemon.java (line 159) Stop 
> listening to thrift clients
>  INFO [main] 2011-06-30 10:13:14,186 AbstractCassandraDaemon.java (line 77) 
> Logging initialized
>  INFO [main] 2011-06-30 10:13:14,196 AbstractCassandraDaemon.java (line 97) 
> Heap size: 510263296/511311872
>  WARN [main] 2011-06-30 10:13:14,227 CLibrary.java (line 93) Obsolete version 
> of JNA present; unable to read errno. Upgrade to JNA 3.2.7 or later
>  WARN [main] 2011-06-30 10:13:14,227 CLibrary.java (line 93) Obsolete version 
> of JNA present; unable to read errno. Upgrade to JNA 3.2.7 or later
>  WARN [main] 2011-06-30 10:13:14,228 CLibrary.java (line 125) Unknown 
> mlockall error 0
>  INFO [main] 2011-06-30 10:13:14,234 DatabaseDescriptor.java (line 121) 
> Loading settings from file:/etc/cassandra/cassandra.yaml
>  INFO [main] 2011-06-30 10:13:14,337 DatabaseDescriptor.java (line 181) 
> DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
> ERROR [main] 2011-06-30 10:13:14,342 DatabaseDescriptor.java (line 405) Fatal 
> configuration error
> org.apache.cassandra.config.ConfigurationException: When using 
> org.apache.cassandra.auth.SimpleAuthenticator passwd.properties properties 
> must be defined.
> at 
> org.apache.cassandra.auth.SimpleAuthenticator.validateConfiguration(SimpleAuthenticator.java:148)
> at 
> org.apache.cassandra.config.DatabaseDescriptor.(DatabaseDescriptor.java:200)
> at 
> org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:100)
> at 
> org.apache.cassandra.service.AbstractCassandraDaemon.init(AbstractCassandraDaemon.java:217)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:616)
> at 
> org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:160)
> Data from the output.log:
>  
>  INFO 10:12:01,365 Cassandra shutting down...
>  INFO 10:12:01,366 Stop listening to thrift clients
>  INFO 10:13:14,186 Logging initialized
>  INFO 10:13:14,196 Heap size: 510263296/511311872
>  WARN 10:13:14,227 Obsolete version of JNA present; unable to read errno. 
> Upgrade to JNA 3.2.7 or later
>  WARN 10:13:14,227 Obsolete version of JNA present; unable to read errno. 
> Upgrade to JNA 3.2.7 or later
>  WARN 10:13:14,228 Unknown mlockall error 0
>  INFO 10:13:14,234 Loading settings from file:/etc/cassandra/cassandra.yaml
>  INFO 10:13:14,337 DiskAccessMode 'auto' determined to be mmap, 
> indexAccessMode is mmap
> ERROR 10:13:14,342 Fatal configuration error
> org.apache.cassandra.config.ConfigurationException: When using 
> org.apache.cassandra.auth.SimpleAuthenticator passwd.properties properties 
> must be defined.
> at 
> org.apache.cassandra.auth.Si

Re: Repair doesn't work after upgrading to 0.8.1

2011-06-30 Thread aaron morton
This seems to be a known issue related to 
https://issues.apache.org/jira/browse/CASSANDRA-2818 e.g. 
https://issues.apache.org/jira/browse/CASSANDRA-2768

There was some discussion on the IRC list today, driftx said the simple fix was 
a full cluster restart. Or perhaps a rolling restart with the 2818 patch 
applied may work. 

Starting with "Dcassandra.load_ring_state=false" causes the node to rediscover 
the ring which may help (just a guess really). But if there is bad node start 
been passed around in gossip it will just get the bad state again.  
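
For example, one way to pass that flag (note the leading dash) is through the
same JVM_OPTS hook in cassandra-env.sh used for other system properties:

JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"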

Anyone else ?


-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 1 Jul 2011, at 09:11, Héctor Izquierdo Seliva wrote:

> Hi all,
> 
> I have upgraded all my cluster to 0.8.1. Today one of the disks in one
> of the nodes died. After replacing the disk I tried running repair, but
> this message appears:
> 
> INFO [manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf] 2011-06-30
> 20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.80
> from repair because it is on version 0.7 or sooner. You should consider
> updating this node before running repair again.
> INFO [manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098] 2011-06-30
> 20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.76
> from repair because it is on version 0.7 or sooner. You should consider
> updating this node before running repair again.
> INFO [manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a] 2011-06-30
> 20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.80
> from repair because it is on version 0.7 or sooner. You should consider
> updating this node before running repair again.
> INFO [manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098] 2011-06-30
> 20:36:25,086 AntiEntropyService.java (line 179) Excluding /10.20.13.77
> from repair because it is on version 0.7 or sooner. You should consider
> updating this node before running repair again.
> INFO [manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf] 2011-06-30
> 20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.76
> from repair because it is on version 0.7 or sooner. You should consider
> updating this node before running repair again.
> INFO [manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098] 2011-06-30
> 20:36:25,086 AntiEntropyService.java (line 782) No neighbors to repair
> with for sbs on
> (170141183460469231731687303715884105727,28356863910078205288614550619314017621]:
>  manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098 completed.
> INFO [manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a] 2011-06-30
> 20:36:25,086 AntiEntropyService.java (line 179) Excluding /10.20.13.79
> from repair because it is on version 0.7 or sooner. You should consider
> updating this node before running repair again.
> INFO [manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf] 2011-06-30
> 20:36:25,086 AntiEntropyService.java (line 782) No neighbors to repair
> with for sbs on
> (141784319550391026443072753096570088105,170141183460469231731687303715884105727]:
>  manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf completed.
> INFO [manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a] 2011-06-30
> 20:36:25,087 AntiEntropyService.java (line 782) No neighbors to repair
> with for sbs on
> (113427455640312821154458202477256070484,141784319550391026443072753096570088105]:
>  manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a completed.
> 
> What can I do?
> 



cassandra inside beanstalk?

2011-06-30 Thread Andrei Pozolotin
Hello;

I would like to run cassandra inside beanstalk;
http://aws.amazon.com/elasticbeanstalk/
along with the distributed client application;

This blog advises against it:

http://www.evidentsoftware.com/embedding-cassandra-within-tomcat-for-testing/

Is it really so?
Can Cassandra be tuned not to be a resource hog and co-exist
peacefully with the client?
Can the client take advantage of proximity by running on the same node as
Cassandra?

Thank you,

Andrei. 



Re: Repair doesn't work after upgrading to 0.8.1

2011-06-30 Thread Jonathan Ellis
This isn't 2818 -- (a) the 0.8.1 protocol is identical to 0.8.0 and
(b) the whole cluster is on the same version.

On Thu, Jun 30, 2011 at 9:35 PM, aaron morton  wrote:
> This seems to be a known issue related
> to https://issues.apache.org/jira/browse/CASSANDRA-2818 e.g. https://issues.apache.org/jira/browse/CASSANDRA-2768
> There was some discussion on the IRC list today, driftx said the simple fix
> was a full cluster restart. Or perhaps a rolling restart with the 2818 patch
> applied may work.
> Starting with "Dcassandra.load_ring_state=false" causes the node to
> rediscover the ring which may help (just a guess really). But if there is
> bad node start been passed around in gossip it will just get the bad state
> again.
> Anyone else ?
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> On 1 Jul 2011, at 09:11, Héctor Izquierdo Seliva wrote:
>
> Hi all,
>
> I have upgraded all my cluster to 0.8.1. Today one of the disks in one
> of the nodes died. After replacing the disk I tried running repair, but
> this message appears:
>
> INFO [manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf] 2011-06-30
> 20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.80
> from repair because it is on version 0.7 or sooner. You should consider
> updating this node before running repair again.
> INFO [manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098] 2011-06-30
> 20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.76
> from repair because it is on version 0.7 or sooner. You should consider
> updating this node before running repair again.
> INFO [manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a] 2011-06-30
> 20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.80
> from repair because it is on version 0.7 or sooner. You should consider
> updating this node before running repair again.
> INFO [manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098] 2011-06-30
> 20:36:25,086 AntiEntropyService.java (line 179) Excluding /10.20.13.77
> from repair because it is on version 0.7 or sooner. You should consider
> updating this node before running repair again.
> INFO [manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf] 2011-06-30
> 20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.76
> from repair because it is on version 0.7 or sooner. You should consider
> updating this node before running repair again.
> INFO [manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098] 2011-06-30
> 20:36:25,086 AntiEntropyService.java (line 782) No neighbors to repair
> with for sbs on
> (170141183460469231731687303715884105727,28356863910078205288614550619314017621]:
> manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098 completed.
> INFO [manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a] 2011-06-30
> 20:36:25,086 AntiEntropyService.java (line 179) Excluding /10.20.13.79
> from repair because it is on version 0.7 or sooner. You should consider
> updating this node before running repair again.
> INFO [manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf] 2011-06-30
> 20:36:25,086 AntiEntropyService.java (line 782) No neighbors to repair
> with for sbs on
> (141784319550391026443072753096570088105,170141183460469231731687303715884105727]:
> manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf completed.
> INFO [manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a] 2011-06-30
> 20:36:25,087 AntiEntropyService.java (line 782) No neighbors to repair
> with for sbs on
> (113427455640312821154458202477256070484,141784319550391026443072753096570088105]:
> manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a completed.
>
> What can I do?
>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Repair doesn't work after upgrading to 0.8.1

2011-06-30 Thread Terje Marthinussen
Unless it is a 0.8.1 RC or beta

On Fri, Jul 1, 2011 at 12:57 PM, Jonathan Ellis  wrote:

> This isn't 2818 -- (a) the 0.8.1 protocol is identical to 0.8.0 and
> (b) the whole cluster is on the same version.
>
> On Thu, Jun 30, 2011 at 9:35 PM, aaron morton 
> wrote:
> > This seems to be a known issue related
> > to https://issues.apache.org/jira/browse/CASSANDRA-2818 e.g.
> https://issues.apache.org/jira/browse/CASSANDRA-2768
> > There was some discussion on the IRC list today, driftx said the simple
> fix
> > was a full cluster restart. Or perhaps a rolling restart with the 2818
> patch
> > applied may work.
> > Starting with "Dcassandra.load_ring_state=false" causes the node to
> > rediscover the ring which may help (just a guess really). But if there is
> > bad node start been passed around in gossip it will just get the bad
> state
> > again.
> > Anyone else ?
> >
> > -
> > Aaron Morton
> > Freelance Cassandra Developer
> > @aaronmorton
> > http://www.thelastpickle.com
> > On 1 Jul 2011, at 09:11, Héctor Izquierdo Seliva wrote:
> >
> > Hi all,
> >
> > I have upgraded all my cluster to 0.8.1. Today one of the disks in one
> > of the nodes died. After replacing the disk I tried running repair, but
> > this message appears:
> >
> > INFO [manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf] 2011-06-30
> > 20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.80
> > from repair because it is on version 0.7 or sooner. You should consider
> > updating this node before running repair again.
> > INFO [manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098] 2011-06-30
> > 20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.76
> > from repair because it is on version 0.7 or sooner. You should consider
> > updating this node before running repair again.
> > INFO [manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a] 2011-06-30
> > 20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.80
> > from repair because it is on version 0.7 or sooner. You should consider
> > updating this node before running repair again.
> > INFO [manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098] 2011-06-30
> > 20:36:25,086 AntiEntropyService.java (line 179) Excluding /10.20.13.77
> > from repair because it is on version 0.7 or sooner. You should consider
> > updating this node before running repair again.
> > INFO [manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf] 2011-06-30
> > 20:36:25,085 AntiEntropyService.java (line 179) Excluding /10.20.13.76
> > from repair because it is on version 0.7 or sooner. You should consider
> > updating this node before running repair again.
> > INFO [manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098] 2011-06-30
> > 20:36:25,086 AntiEntropyService.java (line 782) No neighbors to repair
> > with for sbs on
> >
> (170141183460469231731687303715884105727,28356863910078205288614550619314017621]:
> > manual-repair-26f5a7dd-cf12-44de-9f8f-6b6335bdd098 completed.
> > INFO [manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a] 2011-06-30
> > 20:36:25,086 AntiEntropyService.java (line 179) Excluding /10.20.13.79
> > from repair because it is on version 0.7 or sooner. You should consider
> > updating this node before running repair again.
> > INFO [manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf] 2011-06-30
> > 20:36:25,086 AntiEntropyService.java (line 782) No neighbors to repair
> > with for sbs on
> >
> (141784319550391026443072753096570088105,170141183460469231731687303715884105727]:
> > manual-repair-bdb4055a-d370-4d2a-a1dd-70a7e4fa60cf completed.
> > INFO [manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a] 2011-06-30
> > 20:36:25,087 AntiEntropyService.java (line 782) No neighbors to repair
> > with for sbs on
> >
> (113427455640312821154458202477256070484,141784319550391026443072753096570088105]:
> > manual-repair-2a11d01c-e1e4-4f1e-b8cd-00a9a3fd2f4a completed.
> >
> > What can I do?
> >
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>