Re: Gossiper question

2011-05-18 Thread Peter Schuller
>   I have a 9 node cluster with RF=3 and am using Cassandra 0.70/Hector26. Recently
> we have been seeing a lot of "UnavailableException"s at the client side. Whenever
> this happens, I find the following pattern in the Cassandra node's log file at that
> given time,

UnavailableException is the expected error if an insufficient number
of nodes are up in the replica set for the row you're
reading/writing. "Up" in this case means according to gossip (what you
see with nodetool ring).
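
As a concrete illustration: with RF=3, a QUORUM read or write needs
floor(3/2)+1 = 2 replicas marked up, so two nodes of a replica set being
seen as down makes that range unavailable at QUORUM. A minimal sketch of
the standard quorum arithmetic (illustrative only, not Cassandra's
actual source):

  // Replicas that must be alive for an operation to proceed.
  static int blockFor(String consistencyLevel, int replicationFactor) {
      if ("ONE".equals(consistencyLevel)) return 1;
      if ("QUORUM".equals(consistencyLevel)) return replicationFactor / 2 + 1; // RF=3 -> 2
      if ("ALL".equals(consistencyLevel)) return replicationFactor;
      throw new IllegalArgumentException(consistencyLevel);
  }
  // If fewer than blockFor(...) replicas are up according to gossip,
  // the coordinator fails fast with UnavailableException rather than
  // waiting for a timeout.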

The question is really why nodes are being marked as down and whether
it's legitimate or not. There is
https://issues.apache.org/jira/browse/CASSANDRA-2554 which may be
relevant, but simple overload is also a possible reason.

-- 
/ Peter Schuller


Re: Knowing when there is a *real* need to add nodes

2011-05-18 Thread Tomer B
As for static disk usage I would add this:

test: df -kh
description: run the test after compaction (check GCGraceSeconds in
storage-conf.xml), as only then is data expunged permanently; run it on the data
disk, assuming here the commitlog disk is separate from the data dir.
green gauge: used space < 30% of disk capacity
yellow gauge: used space 30% - 50% of disk capacity
red gauge: used space > 50% of disk capacity
comments: Compactions can temporarily require up to 100% of the in-use space
(data file dir) in the worst case. When approaching 50% or more of disk capacity,
use RAID0 for the data dir disk; if you cannot, try increasing your disk size; if
you cannot, consider adding nodes (or consider adding nodes first if that's what
you wish).
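
A minimal sketch of automating that gauge (the data directory path is
illustrative; point it at your data file directory after compaction):

  import java.io.File;

  public class DiskGauge {
      public static void main(String[] args) {
          File dataDir = new File(args.length > 0 ? args[0]
                                                  : "/var/lib/cassandra/data");
          double used = 1.0 - (double) dataDir.getUsableSpace()
                                      / dataDir.getTotalSpace();
          // Thresholds follow the gauge above.
          String colour = used < 0.30 ? "green"
                        : used < 0.50 ? "yellow" : "red";
          System.out.printf("%s: %.0f%% used -> %s%n", dataDir, used * 100, colour);
      }
  }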

2011/5/12 Watanabe Maki 

> It's an interesting topic for me too.
> How about adding measurements on static disk utilization (% used) and memory
> utilization (RSS, JVM heap, JVM GC)?
>
> maki
>
> From iPhone
>
>
> On 2011/05/12, at 0:49, Tomer B  wrote:
>
> > Hi
> >
> > I'm trying to predict when my cluster will soon need new nodes
> > added. I want a continuous graph telling me of my cluster health so
> > that when I see my cluster becoming more and more busy (I want numbers
> > & measurements) I will know that I need to start purchasing more
> > machines and get them into my cluster; I want to know of that
> > beforehand.
> > I'm writing here what I came up with after doing some research over the net.
> > I would highly appreciate any additional gauge measurements and ranges
> > for testing my cluster health, so as to know beforehand when I'm
> > going to need more nodes soon. Although I'm writing down green
> > gauge, yellow gauge, red gauge, I'm also trying to find a continuous
> > graph where I can tell where our cluster stands (as much as
> > possible...)
> >
> > Also, my recommendations before adding new nodes are always:
> >
> > 1. Make sure all nodes are balanced, and if not, balance them.
> > 2. Separate the commit log drive from the data (SSTables) drive.
> > 3. Use mmap for the index only (disk_access_mode: mmap_index_only), not auto.
> > 4. Increase disk IO if possible.
> > 5. Avoid swapping as much as possible.
> >
> >
> > As for my gauge tests for when to add new nodes:
> >
> > test: nodetool tpstats -h 
> > green gauge: no pending count higher than 100
> > yellow gauge: pending counts 100-2000
> > red gauge: larger than 3000
> >
> > test: iostat -x -n -p -z 5 10  and iostat -xcn 5
> > green gauge: kw/s + kr/s is below 25% of disk IO capacity
> > yellow gauge: 25%-50%
> > red gauge: 50%+
> >
> > test: iostat -x -n -p -z 5 10 and check the %b column
> > green gauge: less than 10%
> > yellow gauge:  10%-80%
> > red gauge: 90%+
> >
> > test: nodetool cfstats --host localhost
> > green gauge: “SSTable count” item does not continually grow over time
> > yellow gauge:
> > red gauge: “SSTable count” item continually grows over time
> >
> > test: ./nodetool cfstats --host localhost | grep -i pending
> > green gauge: 0-2
> > yellow gauge: 3-100
> > red gauge: 101+
> >
> > I would highly appreciate any additional gauge measurements and ranges
> > for testing my cluster health, so as to know ***beforehand*** when
> > I'm going to need more nodes soon.
>
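
A rough sketch of automating the tpstats gauge quoted above (assumes
nodetool is on the PATH and the 0.7-era column layout of "Pool Name
Active Pending Completed", which may differ in other versions):

  import java.io.BufferedReader;
  import java.io.InputStreamReader;

  public class TpstatsGauge {
      public static void main(String[] args) throws Exception {
          Process p = new ProcessBuilder("nodetool", "-h", args[0], "tpstats").start();
          BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()));
          long maxPending = 0;
          String line;
          while ((line = r.readLine()) != null) {
              String[] cols = line.trim().split("\\s+");
              if (cols.length >= 3 && cols[2].matches("\\d+")) // Pending column
                  maxPending = Math.max(maxPending, Long.parseLong(cols[2]));
          }
          // Thresholds follow the gauge above (its 2000-3000 gap is collapsed here).
          String colour = maxPending < 100 ? "green"
                        : maxPending <= 2000 ? "yellow" : "red";
          System.out.println("max pending = " + maxPending + " -> " + colour);
      }
  }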


[RELEASE] Apache Cassandra 0.7.6 released

2011-05-18 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the release of Apache Cassandra
version 0.7.6.

Cassandra is a highly scalable second-generation distributed database,
bringing together Dynamo's fully distributed design and Bigtable's
ColumnFamily-based data model. You can read more here:

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 http://cassandra.apache.org/download/

This version is a bug fix release[1,3] and upgrade is highly encouraged.

Please always pay attention to the release notes[2] before upgrading,
especially if you upgrade from 0.7.2 or before. Upgrade from 0.7.3 or later
should be a snap.

If you encounter any problems, please let us know[4].

Have fun!


[1]: http://goo.gl/VYZ2e (CHANGES.txt)
[2]: http://goo.gl/jMRDE (NEWS.txt)
[3]: http://goo.gl/6ohkb (JIRA Release Notes)
[4]: https://issues.apache.org/jira/browse/CASSANDRA


Berlin Buzzword Hackathon

2011-05-18 Thread Daniel Doubleday
Hi all

I was wondering if anybody here is planning to go to Berlin Buzzwords
and attend the Cassandra hackathon.

I'm still undecided, but it might be good to have the chance to talk about
experiences in more detail.

Cheers,
Daniel

RE: AssertionError

2011-05-18 Thread Desimpel, Ignace
Hi Sylvain,

I did the upgrade from 0.7.4 to 0.7.5 and the exception does not occur anymore 
(on Windows ...). Thanks for pointing me to the bug fix.
From the 0.7.5 version I upgraded to the 0.7.6 version, and this is also OK, 
without any code changes and while still keeping the same data files generated 
with the 0.7.4 version.

Could you still give me a comment on the question regarding the AbstractType 
class change? To be on the safe side, I could simply make new array-backed byte 
buffers (that is what I need). But I ask the question because I want to avoid 
allocating any object that is not really needed, since I know that I will query 
for a lot of data of that type.

Ignace


-Original Message-
From: Desimpel, Ignace [mailto:ignace.desim...@nuance.com] 
Sent: Tuesday, May 17, 2011 3:33 PM
To: user@cassandra.apache.org
Subject: RE: AssertionError

Seems like the AbstractType class has changed going from 0.7.4 to 0.7.5. 
It is now required to implement a compose and a decompose method. I already did 
that, and it starts up with the 0.7.5 code using the 0.7.4 data and 
configuration (using a smaller extra test database). Below is a sample 
implementation to illustrate another question: in the compose method, can I 
simply create my own object and keep the given ByteBuffer? Or, as in the 
decompose example, do I need to duplicate the ByteBuffer? Could the paramT 
object be reused, or should I make a complete copy?

@Override
public Object compose(ByteBuffer paramByteBuffer) {
    // Wrap the incoming buffer without copying it.
    ReverseCFFloatValues oNew = new ReverseCFFloatValues();
    oNew.paramByteBuffer = paramByteBuffer;
    return oNew;
}

@Override
public ByteBuffer decompose(Object paramT) {
    // Return a duplicate so the caller gets independent position/limit.
    return ((ReverseCFFloatValues) paramT).paramByteBuffer.duplicate();
}



-Original Message-
From: Sylvain Lebresne [mailto:sylv...@datastax.com]
Sent: Tuesday, May 17, 2011 2:50 PM
To: user@cassandra.apache.org
Subject: Re: AssertionError

On Tue, May 17, 2011 at 1:46 PM, Desimpel, Ignace  
wrote:
> Ok, I will do that (the next test will be done on some linux boxes being 
> installed now, but for the time being I need to go on with the current windows 
> setup).
> Question: Can I use the 0.7.4 data files as is? Do I need to back up the 
> datafiles in order to be able to get back to the 0.7.4 version if needed?

Yes, you can use the 0.7.4 data files as is. And I can't think of a reason why you 
should have a problem getting back to 0.7.4 if needed, though snapshotting before 
cannot hurt.

>
> Ignace
>
> -Original Message-
> From: Sylvain Lebresne [mailto:sylv...@datastax.com]
> Sent: Tuesday, May 17, 2011 1:16 PM
> To: user@cassandra.apache.org
> Subject: Re: AssertionError
>
> First thing to do would be to update to 0.7.5.
>
> The AssertionError you're running into is an assertion where we check that a 
> skipBytes did skip all the bytes we had asked it to. As it turns out, the spec 
> for skipBytes authorizes it not to skip all the bytes asked for, even with no good 
> reason. I'm pretty sure on a linux box skipBytes on a file will always read 
> the number of asked bytes unless it reaches EOF, but I see you're running 
> windows, so who knows what can happen.
>
> Anyway, long story short, it's a "bug" in 0.7.4 that has been fixed in 0.7.5. 
> If you still run into this in 0.7.5 at least we'll know it's something else 
> (and we will have a more helpful error message).
>
> --
> Sylvain
>
> On Tue, May 17, 2011 at 12:41 PM, Desimpel, Ignace 
>  wrote:
>> I use a custom comparator class. So I think there is a high chance 
>> that I do something wrong there. I was thinking that the stack trace 
>> could give a clue and help me on the way, maybe because some already got the 
>> same error.
>>
>>
>>
>> Anyway, here is some more information you requested.
>>
>>
>>
>> Yaml definition :
>>
>> name: ForwardStringValues
>>
>>   column_type: Super
>>
>>   compare_with:
>> be.landc.services.search.server.db.cassandra.node.ForwardCFStringValues
>>
>>   compare_subcolumns_with: BytesType
>>
>>   keys_cached: 10
>>
>>   rows_cached: 0
>>
>>   comment: Stores the values of functions returning string
>>
>>   memtable_throughput_in_mb: 64
>>
>>   memtable_operations_in_millions: 15
>>
>>   min_compaction_threshold: 2
>>
>>   max_compaction_threshold: 5
>>
>>
>>
>> Column Family: ForwardStringValues
>>
>>     SSTable count: 8
>>
>>     Space used (live): 131311776690
>>
>>     Space used (total): 131311776690
>>
>>     Memtable Columns Count: 0
>>
>>     Memtable Data Size: 0
>>
>>     Memtable Switch Count: 0
>>
>>     Read Count: 1
>>
>>     Read Latency: 404.890 ms.
>>
>>     Write Count: 0
>>
>>     Write Latency: NaN ms.
>>
>>     Pending

Re: Berlin Buzzword Hackathon

2011-05-18 Thread Christoph Rueger
Oh damn, I would love to go, but I'll just be at Berlin Buzzwords for the
6th and 7th.
Next time :)

On Wed, May 18, 2011 at 2:31 PM, Daniel Doubleday
wrote:

> Hi all
>
> was wondering if there's anybody here planning to go to the Berlin
> Buzzwords and attend the cassandra hackathon.
>
> I'm still indecisive but it might be good to have the chance to talk about
> experiences in more detail.
>
> Cheers,
> Daniel


Re: AssertionError

2011-05-18 Thread Sylvain Lebresne
The compose() and decompose() methods of AbstractType are used only by
the PIG driver (in 0.7 at least; in 0.8 I think
CQL uses them too). If you're not using PIG, you're safe making
those functions simple pass-throughs, i.e., having
something along the lines of:
  class CustomComparator extends AbstractType
  {
  ...
  public ByteBuffer compose(ByteBuffer v) { return v; }
  public ByteBuffer decompose(ByteBuffer v) { return v; }
  }

I'm not a PIG expert, but even if you're using it, I'm not sure how
useful it is to actually diverge from what's above,
since PIG probably doesn't know much about your type. In any case,
those functions are not called during "normal" queries.

Sylvain

On Wed, May 18, 2011 at 2:40 PM, Desimpel, Ignace
 wrote:
> Hi Sylvain,
>
> I did the upgrade from 0.7.4 to 0.7.5 and the exception does not occur 
> anymore (on Windows ...). Thanks for pointing me to the bug fix.
> From the 0.7.5 version I upgraded to the 0.7.6 version, and this is also OK, 
> without any code changes and by still keeping the same data files generate 
> with the 0.7.4 version.
>
> Could you still give me a comment on the question regarding the AbstractType 
> class change? To be on the save side, I could simply make new array backed 
> byte buffers (that is what I need). But I ask the question because I want to 
> avoid allocating any object if it is not really needed since I know that I 
> will query for a lot of data of that type.
>
> Ignace
>
>
> -Original Message-
> From: Desimpel, Ignace [mailto:ignace.desim...@nuance.com]
> Sent: Tuesday, May 17, 2011 3:33 PM
> To: user@cassandra.apache.org
> Subject: RE: AssertionError
>
> Seems like the AbstractType class has changed going from 0.7.4 to 0.7.5.
> It is now required to implement a compose and decompose method. Already did 
> that, and it starts up with the 0.7.5 code using the 0.7.4 data and 
> configuration (using a smaller extra test database) Below I made a sample 
> implementations to illustrate another question : On the compose method , can 
> I simply create my own AbstractType class and use the given ByteBuffer. Or 
> like in the decompose example, do I need to duplicate the ByteBuffer or could 
> the paramT object  be reused or should I make a complete copy?
>
>        @Override
>        public Object compose(ByteBuffer paramByteBuffer){
>                ReverseCFFloatValues oNew = new ReverseCFFloatValues();
>                oNew.paramByteBuffer = paramByteBuffer;
>                return oNew;
>        }
>
>        @Override
>        public ByteBuffer decompose(Object paramT){
>                return 
> ((ReverseCFFloatValues)paramT).paramByteBuffer.duplicate();
>        }
>
>
>
> -Original Message-
> From: Sylvain Lebresne [mailto:sylv...@datastax.com]
> Sent: Tuesday, May 17, 2011 2:50 PM
> To: user@cassandra.apache.org
> Subject: Re: AssertionError
>
> On Tue, May 17, 2011 at 1:46 PM, Desimpel, Ignace 
>  wrote:
>> Ok, I will do that (next test will be done on some linux boxes being 
>> installed now, but at this time I need to gone with the current windows 
>> setup).
>> Question : Can I use the 0.7.4 data files as is? Do I need to backup the 
>> datafiles in order to be able to get back to the 0.7.4 version if needed?
>
> Yes you can use 0.7.4 data files as is. And I can't think of a reason why you 
> should have problem getting back to 0.7.4 if needed, though snapshotting 
> before cannot hurt.
>
>>
>> Ignace
>>
>> -Original Message-
>> From: Sylvain Lebresne [mailto:sylv...@datastax.com]
>> Sent: Tuesday, May 17, 2011 1:16 PM
>> To: user@cassandra.apache.org
>> Subject: Re: AssertionError
>>
>> First thing to do would be to update to 0.7.5.
>>
>> The assertionError you're running into is a assertion where we check if a 
>> skipBytes did skip all the bytes we had ask him to. As it turns out, the 
>> spec for skipBytes authorize it to not skip all the bytes asked even with no 
>> good reason. I'm pretty sure on a linux box skipBytes on a file will always 
>> read the number of asked bytes unless it reaches EOF, but I see you're 
>> running windows, so who knows what can happen.
>>
>> Anyway, long story short, it's a "bug" in 0.7.4 that has been fixed in 
>> 0.7.5. If you still run into this in 0.7.5 at least we'll know it's 
>> something else (and we will have a more helpful error message).
>>
>> --
>> Sylvain
>>
>> On Tue, May 17, 2011 at 12:41 PM, Desimpel, Ignace 
>>  wrote:
>>> I use a custom comparator class. So I think there is a high chance
>>> that I do something wrong there. I was thinking that the stack trace
>>> could give a clue and help me on the way, maybe because some already got 
>>> the same error.
>>>
>>>
>>>
>>> Anyway, here is some more information you requested.
>>>
>>>
>>>
>>> Yaml definition :
>>>
>>> name: ForwardStringValues
>>>
>>>   column_type: Super
>>>
>>>   compare_with:
>>> be.landc.services.search.server.db.cassandra.node.ForwardCFStringValues
>>>

RE: AssertionError

2011-05-18 Thread Desimpel, Ignace
Great! I'm not using PIG.

Thanks.

-Original Message-
From: Sylvain Lebresne [mailto:sylv...@datastax.com] 
Sent: Wednesday, May 18, 2011 3:07 PM
To: user@cassandra.apache.org
Subject: Re: AssertionError

The compose() and decompose() methods of AbstractType are used only by the PIG 
driver (in 0.7 at least; in 0.8 I think CQL uses them too). If you're not using 
PIG, you're safe making those functions simple pass-throughs, i.e., having 
something along the lines of:
  class CustomComparator extends AbstractType
  {
  ...
  public ByteBuffer compose(ByteBuffer v) { return v; }
  public ByteBuffer decompose(ByteBuffer v) { return v; }
  }

I'm not a PIG expert, but even if you're using it, I'm not sure how useful 
it is to actually diverge from what's above, since PIG probably doesn't know 
much about your type. In any case, those functions are not called during 
"normal" queries.

Sylvain

On Wed, May 18, 2011 at 2:40 PM, Desimpel, Ignace  
wrote:
> Hi Sylvain,
>
> I did the upgrade from 0.7.4 to 0.7.5 and the exception does not occur 
> anymore (on Windows ...). Thanks for pointing me to the bug fix.
> From the 0.7.5 version I upgraded to the 0.7.6 version, and this is also OK, 
> without any code changes and by still keeping the same data files generate 
> with the 0.7.4 version.
>
> Could you still give me a comment on the question regarding the AbstractType 
> class change? To be on the save side, I could simply make new array backed 
> byte buffers (that is what I need). But I ask the question because I want to 
> avoid allocating any object if it is not really needed since I know that I 
> will query for a lot of data of that type.
>
> Ignace
>
>
> -Original Message-
> From: Desimpel, Ignace [mailto:ignace.desim...@nuance.com]
> Sent: Tuesday, May 17, 2011 3:33 PM
> To: user@cassandra.apache.org
> Subject: RE: AssertionError
>
> Seems like the AbstractType class has changed going from 0.7.4 to 0.7.5.
> It is now required to implement a compose and decompose method. Already did 
> that, and it starts up with the 0.7.5 code using the 0.7.4 data and 
> configuration (using a smaller extra test database) Below I made a sample 
> implementations to illustrate another question : On the compose method , can 
> I simply create my own AbstractType class and use the given ByteBuffer. Or 
> like in the decompose example, do I need to duplicate the ByteBuffer or could 
> the paramT object  be reused or should I make a complete copy?
>
>        @Override
>        public Object compose(ByteBuffer paramByteBuffer){
>                ReverseCFFloatValues oNew = new ReverseCFFloatValues();
>                oNew.paramByteBuffer = paramByteBuffer;
>                return oNew;
>        }
>
>        @Override
>        public ByteBuffer decompose(Object paramT){
>                return 
> ((ReverseCFFloatValues)paramT).paramByteBuffer.duplicate();
>        }
>
>
>
> -Original Message-
> From: Sylvain Lebresne [mailto:sylv...@datastax.com]
> Sent: Tuesday, May 17, 2011 2:50 PM
> To: user@cassandra.apache.org
> Subject: Re: AssertionError
>
> On Tue, May 17, 2011 at 1:46 PM, Desimpel, Ignace 
>  wrote:
>> Ok, I will do that (next test will be done on some linux boxes being 
>> installed now, but at this time I need to gone with the current windows 
>> setup).
>> Question : Can I use the 0.7.4 data files as is? Do I need to backup the 
>> datafiles in order to be able to get back to the 0.7.4 version if needed?
>
> Yes you can use 0.7.4 data files as is. And I can't think of a reason why you 
> should have problem getting back to 0.7.4 if needed, though snapshotting 
> before cannot hurt.
>
>>
>> Ignace
>>
>> -Original Message-
>> From: Sylvain Lebresne [mailto:sylv...@datastax.com]
>> Sent: Tuesday, May 17, 2011 1:16 PM
>> To: user@cassandra.apache.org
>> Subject: Re: AssertionError
>>
>> First thing to do would be to update to 0.7.5.
>>
>> The assertionError you're running into is a assertion where we check if a 
>> skipBytes did skip all the bytes we had ask him to. As it turns out, the 
>> spec for skipBytes authorize it to not skip all the bytes asked even with no 
>> good reason. I'm pretty sure on a linux box skipBytes on a file will always 
>> read the number of asked bytes unless it reaches EOF, but I see you're 
>> running windows, so who knows what can happen.
>>
>> Anyway, long story short, it's a "bug" in 0.7.4 that has been fixed in 
>> 0.7.5. If you still run into this in 0.7.5 at least we'll know it's 
>> something else (and we will have a more helpful error message).
>>
>> --
>> Sylvain
>>
>> On Tue, May 17, 2011 at 12:41 PM, Desimpel, Ignace 
>>  wrote:
>>> I use a custom comparator class. So I think there is a high chance 
>>> that I do something wrong there. I was thinking that the stack trace 
>>> could give a clue and help me on the way, maybe because some already got 
>>> the same error.
>>>
>>>
>>>
>>> Anyway, here is some more information you requested.
>>>

Re: Questions about using MD5 encryption with SimpleAuthenticator

2011-05-18 Thread Ted Zlatanov
On Tue, 17 May 2011 15:52:22 -0700 Sameer Farooqui  
wrote: 

SF> Would still be nice though to use the bcrypt hash over MD5 for stronger
SF> security.

I used MD5 when I proposed SimpleAuthenticator for two reasons:

1) SimpleAuthenticator is supposed to be a demo of the authentication
interface.  It can be used for testing and trivial setups, but I
wouldn't use it in production.  So it's meant to get you going easily,
not to serve you long-term.

2) MD5 is built into Java.  At the time, bcrypt and SHA-* were not.  I
used MD5 only so the passwords are not stored in the clear, not to
provide production-level security.
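
(As an illustration of point 2, the whole hashing step fits in a few
lines of stock JDK code. This is just a sketch of MD5-hex hashing, not
SimpleAuthenticator's exact implementation:)

  import java.math.BigInteger;
  import java.security.MessageDigest;

  public class Md5Hex {
      public static void main(String[] args) throws Exception {
          MessageDigest md5 = MessageDigest.getInstance("MD5");
          byte[] digest = md5.digest(args[0].getBytes("UTF-8"));
          // Hex digest of the password; avoids cleartext storage but is
          // NOT production-grade password security.
          System.out.printf("%032x%n", new BigInteger(1, digest));
      }
  }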

You should consider carefully the implications of storing passwords in a
file on a database server, no matter how they are encrypted.  It would
be better to write a trivial AD/LDAP/etc. authenticator that fits your
specific needs and doesn't rely on a local file.

Ted



Re: Berlin Buzzword Hackathon

2011-05-18 Thread Eric Evans
On Wed, 2011-05-18 at 14:31 +0200, Daniel Doubleday wrote:
> was wondering if there's anybody here planning to go to the Berlin
> Buzzwords and attend the cassandra hackathon.

I'll be there.

> I'm still indecisive but it might be good to have the chance to talk
> about experiences in more detail. 

I thought a hackathon was for... hacking. ;)

-- 
Eric Evans
eev...@rackspace.com



Re: How to configure internode encryption in 0.8.0?

2011-05-18 Thread Jeremy Hanna
I'll CC Nirmal Ranganathan, who implemented the internode encryption; he might 
be able to give you some advice on this.
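
In the meantime, one thing worth checking is whether the local JVM
supports the suite at all; keytool only generates keys and certificates,
it does not pin a TLS cipher suite. A small diagnostic sketch (not part
of Cassandra):

  import javax.net.ssl.SSLServerSocketFactory;

  public class ListCiphers {
      public static void main(String[] args) {
          SSLServerSocketFactory f =
                  (SSLServerSocketFactory) SSLServerSocketFactory.getDefault();
          // If TLS_RSA_WITH_AES_256_CBC_SHA is missing from this list, the
          // installed providers/policy (e.g. the JCE unlimited-strength
          // policy files) are the likely cause rather than the keystore.
          for (String suite : f.getSupportedCipherSuites())
              System.out.println(suite);
      }
  }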

On May 17, 2011, at 7:47 PM, Sameer Farooqui wrote:

> Thanks for the link, Jeremy.
> 
> I generated the keystore and truststore for inter-node communication using 
> the link in the YAML file:
> http://download.oracle.com/javase/6/docs/technotes/guides/security/jsse/JSSERefGuide.html#CreateKeystore
> 
> Unfortunately, the default instructions in the above link used 
> TLS_RSA_WITH_AES_256_CBC_SHA. So, when I start Cassandra now, I get this 
> error:
> 
> ERROR 00:10:38,734 Exception encountered during startup.
> java.lang.IllegalArgumentException: Cannot support
> TLS_RSA_WITH_AES_256_CBC_SHA with currently installed providers
> at com.sun.net.ssl.internal.ssl.CipherSuiteList.<init>(CipherSuiteList.java:79)
> at com.sun.net.ssl.internal.ssl.SSLServerSocketImpl.setEnabledCipherSuites(SSLServerSocketImpl.java:166)
> at org.apache.cassandra.security.SSLFactory.getServerSocket(SSLFactory.java:55)
> 
> 
> The YAML file states that the cipher suite for authentication should be: 
> TLS_RSA_WITH_AES_128_CBC_SHA.
> 
> This is my first time using keytool and I've searched the web to see how I 
> can change the cipher from AES_256 to AES_128, but haven't found the answer.
> 
> Anyone know how to change the cipher to AES_128?
> 
> Here are the commands I used to generate the non-working keystore and 
> truststore:
> 
> 1) keytool -genkeypair -alias jdoe -keyalg RSA -validity 7 -keystore .keystore
> 2) keytool -list -v -keystore .keystore
> 3) keytool -export -alias jdoe -keystore .keystore -rfc -file jdoe.cer
> 4) cat jdoe.cer
> 5) keytool -import -alias jdoecert -file jdoe.cer -keystore .truststore
> 6) keytool -list -v -keystore .truststore
> 
> 
> - Sameer
> 
> On Mon, May 16, 2011 at 5:35 PM, Jeremy Hanna  
> wrote:
> Take a look at cassandra.yaml in your 0.8 download at the very bottom.  There 
> are docs and examples there.
> e.g. 
> http://svn.apache.org/repos/asf/cassandra/tags/cassandra-0.8.0-beta2/conf/cassandra.yaml
> 
> On May 16, 2011, at 6:36 PM, Sameer Farooqui wrote:
> 
> > I understand that 0.8.0 has configurable internode encryption 
> > (CASSANDRA-1567, 2152).
> >
> > I haven't been able to find any info on how to configure it though on this 
> > mailing list or the Datastax website.
> >
> > Can somebody point me towards how to set this up?
> >
> > - Sameer
> 
> 



Design for 'Most viewed Discussions' in a forum

2011-05-18 Thread Aditya Narayan
For a discussion forum, I need to show a page of the most viewed discussions.

To implement this, I maintain a count of views of each discussion, and when
the view count of a discussion passes a certain threshold, the
discussion id is added to a row of most viewed discussions.

This row of most viewed discussions contains columns with integer names and
values containing serialized lists of the ids of all discussions whose view
count equals the integer name of that column.

Thus if the view count of a discussion increases, I'll need to move its id
from the serialized list in one column to the serialized list in another column
whose name represents the updated view count of that discussion.

Thus I can get the most viewed discussions by fetching the appropriate number of
columns from one end of this integer-sorted row.
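
An in-memory analogue of that row, as a sketch (a Set stands in for the
serialized list stored in each column value; names are illustrative):

  import java.util.*;

  public class MostViewedRow {
      // Integer column names (view counts) -> ids bucketed at that count.
      private final TreeMap<Integer, Set<String>> row =
              new TreeMap<Integer, Set<String>>();

      // Move an id from its old count bucket to the next one, mirroring
      // the column rewrite described above.
      public void onView(String discussionId, int oldCount) {
          Set<String> old = row.get(oldCount);
          if (old != null) old.remove(discussionId);
          Set<String> next = row.get(oldCount + 1);
          if (next == null) row.put(oldCount + 1, next = new HashSet<String>());
          next.add(discussionId);
      }

      // Read from the high end of the integer-sorted row.
      public List<String> top(int limit) {
          List<String> out = new ArrayList<String>();
          for (Set<String> bucket : row.descendingMap().values())
              for (String id : bucket) {
                  if (out.size() == limit) return out;
                  out.add(id);
              }
          return out;
      }
  }

The read-modify-write across two columns is the racy part of this design
under concurrent views, which may be worth keeping in mind.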



I wanted to get feedback from you all, to know if this is a good design.

Thanks


Re: Design for 'Most viewed Discussions' in a forum

2011-05-18 Thread Aditya Narayan
I would arrange the memtable flush period so that the time
period for which these most viewed discussions are generated equals the
memtable flush period, so that the entire row of most viewed discussions
on a topic is in one, or at most two, memtables/SSTables.
This would also help minimize having several versions of the same column in the
parts of the row in different SSTables.


On Wed, May 18, 2011 at 11:04 PM, Aditya Narayan  wrote:

> *
> For a discussions forum, I need to show a page of most viewed discussions.
>
> For implementing this, I maintain a count of views of a discussion & when
> this views count of a discussion passes a certain threshold limit, the
> discussion Id is added to a row of most viewed discussions.
>
> This row of most viewed discussions contains columns with Integer names &
> values containing serialized lists of Ids of all discussions whose views
> count equals the Integral name of this column.
>
> Thus if the view count of a discussion increases I'll need to move its 'Id'
> from serialized list in some column to serialized list in another column
> whose name represents the updated views count on that discussion.
>
> Thus I can get the most viewed discussions by getting the appropriate no of
> columns from one end of this Integer sorted row.
>
> 
>
> I wanted to get feedback from you all, to know if this is a good design.
>
> Thanks
>
>
>
>
>
>


Recommendation on how to organize CFs

2011-05-18 Thread openvictor Open
Hello all,

I know organization is a broad topic and everybody may have their own idea of how
to do it, but I really want some advice and opinions, and I think it
could be interesting to discuss this matter.

Here is my problem: I am designing a messaging system internal to a website.
There are 3 big structures: Message, MessageList, and MessageBox. A
message/messagelist is identified only by a UUID; a MessageBox is
identified by a name (utf8 string). A messagebox has a set of MessageLists in
it and a messagelist has a set of messages in it, all of them being UUIDs.
Currently I have only two CFs: message and message_time. message is a
UTF8Type (cassandra 0.6.11, soon going to 0.8) and message_time is a
TimeUUIDType.

For example, if I want to request all messages in a certain messagelist I do:
message_time['messagelist:uuid(messagelist)']
If I want the information of a message I do: message['message:uuid(message)']
If I want all messagelists for a certain messagebox (called nameofbox for
user openvictor in this example) I do:
message_time['messagebox:openvictor:nameofbox']

My question to Cassandra users is: is it a good idea to regroup all those
things into two CFs? Are there advantages/drawbacks to these two CFs,
and should I change my organization for the long term?

Thank you,
Victor


Re: Design for 'Most viewed Discussions' in a forum

2011-05-18 Thread openvictor Open
Have you thought about using another kind of database, one which supports
volatile content, for example?

I am currently thinking about doing something similar. The best and simplest
option at the moment that I can think of is Redis. In Redis you have the
option of querying keys with wildcards. Your problem can be solved by just
inserting a UUID into Redis for a certain amount of time (the best is to
tailor this amount of time as an inverse function of the number of keys
existing in Redis).

*With Redis*
What I would do: I cut time into pieces of X minutes (15 minutes, for
example, by truncating a timestamp). Let timestampN be the timestamp for the
period of time ([N, N+15]), and let Topic1 and Topic2 be two topics. Then:

One or more people view Topic1, then Topic2, then again Topic1 in this
period of 15 minutes (HINCRBY is the increment command):

HINCRBY topics:Topic1:timestampN viewcount 1
HINCRBY topics:Topic2:timestampN viewcount 1
HINCRBY topics:Topic1:timestampN viewcount 1

Then you just query in the following way:

MGET topics:*:timestampN

* is the wildcard; you order by viewcount and you have what you are asking
for!
This is a simplified version of what you should do, but personally I really
like the combination of Cassandra and Redis.


Victor

2011/5/18 Aditya Narayan 

> I would arrange for memtable flush period in such a manner that the time
> period for which these most viewed discussions are generated equals the
> memtable flush timeperiod, so that the entire row of most viewed discussion
> on a topic is in one or maximum two memtables/ SST tables.
> This would also help minimize several versions of the same column in the
> row parts in different SST tables.
>
>
>
> On Wed, May 18, 2011 at 11:04 PM, Aditya Narayan  wrote:
>
>> *
>> For a discussions forum, I need to show a page of most viewed discussions.
>>
>> For implementing this, I maintain a count of views of a discussion & when
>> this views count of a discussion passes a certain threshold limit, the
>> discussion Id is added to a row of most viewed discussions.
>>
>> This row of most viewed discussions contains columns with Integer names &
>> values containing serialized lists of Ids of all discussions whose views
>> count equals the Integral name of this column.
>>
>> Thus if the view count of a discussion increases I'll need to move its
>> 'Id' from serialized list in some column to serialized list in another
>> column whose name represents the updated views count on that discussion.
>>
>> Thus I can get the most viewed discussions by getting the appropriate no
>> of columns from one end of this Integer sorted row.
>>
>> 
>>
>> I wanted to get feedback from you all, to know if this is a good design.
>>
>> Thanks
>>
>>
>>
>>
>>
>>
>


Re: Design for 'Most viewed Discussions' in a forum

2011-05-18 Thread Aditya Narayan
Thanks Victor!

Aren't there any good ways of doing this with Cassandra alone?

On Wed, May 18, 2011 at 11:41 PM, openvictor Open wrote:

> Have you thought about user another kind of Database, which supports
> volative content for example ?
>
> I am currently thinking about doing something similar. The best and
> simplest option at the moment that I can think of is Redis. In redis you
> have the option of querying keys with wildcards. Your problem can be done by
> just inserting an UUID into Redis for a certain amount of time ( the best is
> to tailor this amount of time as an inverse function of the number of keys
> existing in Redis).
>
> *With Redis*
> What I would do : I cut down time in pieces of X minutes ( 15 minutes, for
> example by truncating a timestamp). Let timestampN be the timestamp for the
> period of time ( [N,N+15] ), let Topic1 Topic2 be two topics then :
>
> One or more people will view Topic 1 then Topic2 then again Topic1 in this
> period of 15 minutes
> (HINCRBY is the Increment)
> HINCRBY topics:Topic1:timestampN viewcount 1
> HINCRBY topics:Topic2:timestampN viewcount 1
> HINCRBY topics:Topic1:timestampN viewcount 1
>
> Then you just query in the following way :
>
> MGET  topics:*:timestampN
>
> * is the wildcard, you order by viewcount and you have what you are asking
> for !
> This is a simplified version of what you should do but personnally I really
> like the combination of Cassandra and Redis.
>
>
> Victor
>
> 2011/5/18 Aditya Narayan 
>
>> I would arrange for memtable flush period in such a manner that the time
>> period for which these most viewed discussions are generated equals the
>> memtable flush timeperiod, so that the entire row of most viewed discussion
>> on a topic is in one or maximum two memtables/ SST tables.
>> This would also help minimize several versions of the same column in the
>> row parts in different SST tables.
>>
>>
>>
>> On Wed, May 18, 2011 at 11:04 PM, Aditya Narayan wrote:
>>
>>> *
>>> For a discussions forum, I need to show a page of most viewed
>>> discussions.
>>>
>>> For implementing this, I maintain a count of views of a discussion & when
>>> this views count of a discussion passes a certain threshold limit, the
>>> discussion Id is added to a row of most viewed discussions.
>>>
>>> This row of most viewed discussions contains columns with Integer names &
>>> values containing serialized lists of Ids of all discussions whose views
>>> count equals the Integral name of this column.
>>>
>>> Thus if the view count of a discussion increases I'll need to move its
>>> 'Id' from serialized list in some column to serialized list in another
>>> column whose name represents the updated views count on that discussion.
>>>
>>> Thus I can get the most viewed discussions by getting the appropriate no
>>> of columns from one end of this Integer sorted row.
>>>
>>> 
>>>
>>> I wanted to get feedback from you all, to know if this is a good design.
>>>
>>> Thanks
>>>
>>>
>>>
>>>
>>>
>>>
>>
>


Re: Design for 'Most viewed Discussions' in a forum

2011-05-18 Thread openvictor Open
I guess you can use the same system; you need two CFs for that, and I think
it's better to use 0.8 because it supports counters:

One CF with UTF8Type called active-topics and one CF with UUIDType called
topics-seen; then, using the same principle:

For each timestampN:

For each visit to Topic1, Topic2, Topic1,
you create a TimeUUID and you insert:
active-topics[topics:timestampN] = {Topic1:whateveryouwant}
and :
topics-seen[topic:Topic1]={TimeUUID1:whatever}


active-topics[topics:timestampN] = {Topic2:whateveryouwant}
and :
topics-seen[topic:Topic2]={TimeUUID2:whatever}


active-topics[topics:timestampN] = {Topic1:whateveryouwant}
and :
topics-seen[topic:Topic1]={TimeUUID3:whatever}


Then when you want to query, you first query all the topics (a slice) in
active-topics for topics:timestampN, and then you get all the counts from the
topics-seen CF for all topics in active-topics.

Not so simple... and it adds overhead compared to a simple counter
solution, but I think it is far more elegant; this is just my opinion.

Victor


2011/5/18 Aditya Narayan 

> Thanks victor!
>
> Aren't there any good ways by using Cassandra alone ?
>
>
> On Wed, May 18, 2011 at 11:41 PM, openvictor Open wrote:
>
>> Have you thought about user another kind of Database, which supports
>> volative content for example ?
>>
>> I am currently thinking about doing something similar. The best and
>> simplest option at the moment that I can think of is Redis. In redis you
>> have the option of querying keys with wildcards. Your problem can be done by
>> just inserting an UUID into Redis for a certain amount of time ( the best is
>> to tailor this amount of time as an inverse function of the number of keys
>> existing in Redis).
>>
>> *With Redis*
>> What I would do : I cut down time in pieces of X minutes ( 15 minutes, for
>> example by truncating a timestamp). Let timestampN be the timestamp for the
>> period of time ( [N,N+15] ), let Topic1 Topic2 be two topics then :
>>
>> One or more people will view Topic 1 then Topic2 then again Topic1 in this
>> period of 15 minutes
>> (HINCRBY is the Increment)
>> HINCRBY topics:Topic1:timestampN viewcount 1
>> HINCRBY topics:Topic2:timestampN viewcount 1
>> HINCRBY topics:Topic1:timestampN viewcount 1
>>
>> Then you just query in the following way :
>>
>> MGET  topics:*:timestampN
>>
>> * is the wildcard, you order by viewcount and you have what you are asking
>> for !
>> This is a simplified version of what you should do but personnally I
>> really like the combination of Cassandra and Redis.
>>
>>
>> Victor
>>
>> 2011/5/18 Aditya Narayan 
>>
>>> I would arrange for memtable flush period in such a manner that the time
>>> period for which these most viewed discussions are generated equals the
>>> memtable flush timeperiod, so that the entire row of most viewed discussion
>>> on a topic is in one or maximum two memtables/ SST tables.
>>> This would also help minimize several versions of the same column in the
>>> row parts in different SST tables.
>>>
>>>
>>>
>>> On Wed, May 18, 2011 at 11:04 PM, Aditya Narayan wrote:
>>>
 *
 For a discussions forum, I need to show a page of most viewed
 discussions.

 For implementing this, I maintain a count of views of a discussion &
 when this views count of a discussion passes a certain threshold limit, the
 discussion Id is added to a row of most viewed discussions.

 This row of most viewed discussions contains columns with Integer names
 & values containing serialized lists of Ids of all discussions whose views
 count equals the Integral name of this column.

 Thus if the view count of a discussion increases I'll need to move its
 'Id' from serialized list in some column to serialized list in another
 column whose name represents the updated views count on that discussion.

 Thus I can get the most viewed discussions by getting the appropriate no
 of columns from one end of this Integer sorted row.

 

 I wanted to get feedback from you all, to know if this is a good design.

 Thanks






>>>
>>
>


Re: Design for 'Most viewed Discussions' in a forum

2011-05-18 Thread openvictor Open
Sorry, I made a mistake in topics-seen!
When you insert, it should be:

topics-seen[topic:TopicX:timestampN]={TimeUUID3:whatever}

Sorry about that,
Victor
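
Putting the corrected key together, a rough in-memory analogue of the
two-CF scheme (names are illustrative; a real version would issue these
as column writes through your client):

  import java.util.*;

  public class TwoCfSketch {
      // active-topics: row "topics:<timestampN>" -> topics seen in that window
      final Map<String, Set<String>> activeTopics = new HashMap<String, Set<String>>();
      // topics-seen: row "topic:<name>:<timestampN>" -> one column per view
      final Map<String, List<UUID>> topicsSeen = new HashMap<String, List<UUID>>();

      void recordView(String topic, long timestampN) {
          String window = "topics:" + timestampN;
          if (!activeTopics.containsKey(window))
              activeTopics.put(window, new TreeSet<String>());
          activeTopics.get(window).add(topic);

          String row = "topic:" + topic + ":" + timestampN;
          if (!topicsSeen.containsKey(row))
              topicsSeen.put(row, new ArrayList<UUID>());
          topicsSeen.get(row).add(UUID.randomUUID()); // stand-in for a TimeUUID
      }

      // Slice the active topics for the window, then count view columns per topic.
      Map<String, Integer> viewCounts(long timestampN) {
          Map<String, Integer> counts = new HashMap<String, Integer>();
          Set<String> topics = activeTopics.get("topics:" + timestampN);
          if (topics != null)
              for (String topic : topics)
                  counts.put(topic,
                          topicsSeen.get("topic:" + topic + ":" + timestampN).size());
          return counts;
      }
  }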

2011/5/18 openvictor Open 

> I guess you can use the same system, you need two CF for that and I think
> it's better to use 0.8 because it supports counter :
>
> One CF with UTF8Type called active-topics one CF with UUIDType called
> topics-seen, then using the same principle :
>
> for each timestampN you create :
>
> For each visit to Topic1 Topic2 Topic1
>
> You create a TimeUUID and you insert
> active-topics[topics:timestampN] = {Topic1:whateveryouwant}
> and :
> topics-seen[topic:Topic1]={TimeUUID1:whatever}
>
>
> active-topics[topics:timestampN] = {Topic2:whateveryouwant}
> and :
> topics-seen[topic:Topic2]={TimeUUID2:whatever}
>
>
> active-topics[topics:timestampN] = {Topic1:whateveryouwant}
> and :
> topics-seen[topic:Topic1]={TimeUUID3:whatever}
>
>
> Then when you want to query, you query first all the topics (slice) in
> active-topics for topics:timestampN and then you get all counts in the
> topics-seen CF for all topics in active-topics.
>
> Not so simple... By the way it adds overhead compared to a simple counter
> solution but I think it is far more elegant, but this is just my opinion.
>
>
> Victor
>
>
> 2011/5/18 Aditya Narayan 
>
>> Thanks victor!
>>
>> Aren't there any good ways by using Cassandra alone ?
>>
>>
>> On Wed, May 18, 2011 at 11:41 PM, openvictor Open 
>> wrote:
>>
>>> Have you thought about user another kind of Database, which supports
>>> volative content for example ?
>>>
>>> I am currently thinking about doing something similar. The best and
>>> simplest option at the moment that I can think of is Redis. In redis you
>>> have the option of querying keys with wildcards. Your problem can be done by
>>> just inserting an UUID into Redis for a certain amount of time ( the best is
>>> to tailor this amount of time as an inverse function of the number of keys
>>> existing in Redis).
>>>
>>> *With Redis*
>>> What I would do : I cut down time in pieces of X minutes ( 15 minutes,
>>> for example by truncating a timestamp). Let timestampN be the timestamp for
>>> the period of time ( [N,N+15] ), let Topic1 Topic2 be two topics then :
>>>
>>> One or more people will view Topic 1 then Topic2 then again Topic1 in
>>> this period of 15 minutes
>>> (HINCRBY is the Increment)
>>> HINCRBY topics:Topic1:timestampN viewcount 1
>>> HINCRBY topics:Topic2:timestampN viewcount 1
>>> HINCRBY topics:Topic1:timestampN viewcount 1
>>>
>>> Then you just query in the following way :
>>>
>>> MGET  topics:*:timestampN
>>>
>>> * is the wildcard, you order by viewcount and you have what you are
>>> asking for !
>>> This is a simplified version of what you should do but personnally I
>>> really like the combination of Cassandra and Redis.
>>>
>>>
>>> Victor
>>>
>>> 2011/5/18 Aditya Narayan 
>>>
 I would arrange for memtable flush period in such a manner that the time
 period for which these most viewed discussions are generated equals the
 memtable flush timeperiod, so that the entire row of most viewed discussion
 on a topic is in one or maximum two memtables/ SST tables.
 This would also help minimize several versions of the same column in the
 row parts in different SST tables.



 On Wed, May 18, 2011 at 11:04 PM, Aditya Narayan wrote:

> *
> For a discussions forum, I need to show a page of most viewed
> discussions.
>
> For implementing this, I maintain a count of views of a discussion &
> when this views count of a discussion passes a certain threshold limit, 
> the
> discussion Id is added to a row of most viewed discussions.
>
> This row of most viewed discussions contains columns with Integer names
> & values containing serialized lists of Ids of all discussions whose views
> count equals the Integral name of this column.
>
> Thus if the view count of a discussion increases I'll need to move its
> 'Id' from serialized list in some column to serialized list in another
> column whose name represents the updated views count on that discussion.
>
> Thus I can get the most viewed discussions by getting the appropriate
> no of columns from one end of this Integer sorted row.
>
> 
>
> I wanted to get feedback from you all, to know if this is a good
> design.
>
> Thanks
>
>
>
>
>
>

>>>
>>
>


Re: [RELEASE] Apache Cassandra 0.7.6 released

2011-05-18 Thread Sylvain Lebresne
A small error in the debian setup script made its way into the debian
package of 0.7.6
(more details here: https://issues.apache.org/jira/browse/CASSANDRA-2641).
We are working on fixing the problem, but we must follow the apache
process, and as a result this may take a little longer than we would hope.

Note that if you are not using the debian package you can safely
ignore this mail.
Otherwise, you may want to wait a little longer before updating.

We will keep you posted as to when this is resolved.


PS: For the very impatient, you can also build the package from the
source after applying the second patch attached to the issue
(https://issues.apache.org/jira/browse/CASSANDRA-2641).

--
Sylvain

On Wed, May 18, 2011 at 12:19 PM, Sylvain Lebresne  wrote:
> The Cassandra team is pleased to announce the release of Apache Cassandra
> version 0.7.6.
>
> Cassandra is a highly scalable second-generation distributed database,
> bringing together Dynamo's fully distributed design and Bigtable's
> ColumnFamily-based data model. You can read more here:
>
>  http://cassandra.apache.org/
>
> Downloads of source and binary distributions are listed in our download
> section:
>
>  http://cassandra.apache.org/download/
>
> This version is a bug fix release[1,3] and upgrade is highly encouraged.
>
> Please always pay attention to the release notes[2] before upgrading,
> especially if you upgrade from 0.7.2 or before. Upgrade from 0.7.3 or later
> should be a snap.
>
> If you were to encounter any problem, please let us know[4].
>
> Have fun!
>
>
> [1]: http://goo.gl/VYZ2e (CHANGES.txt)
> [2]: http://goo.gl/jMRDE (NEWS.txt)
> [3]: http://goo.gl/6ohkb (JIRA Release Notes)
> [4]: https://issues.apache.org/jira/browse/CASSANDRA
>


Re: Questions about using MD5 encryption with SimpleAuthenticator

2011-05-18 Thread Aaron Morton
Also, if you were wearing an aluminium foil hat, you might be concerned about 
how the password is sent to the server.

Again though, see previous "I am not a security guy" comment and helpful link 
from Jonathan confirming that statement :)
Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 19/05/2011, at 1:19 AM, Ted Zlatanov  wrote:

> On Tue, 17 May 2011 15:52:22 -0700 Sameer Farooqui  
> wrote: 
> 
> SF> Would still be nice though to use the bcrypt hash over MD5 for stronger
> SF> security.
> 
> I used MD5 when I proposed SimpleAuthenticator for two reasons:
> 
> 1) SimpleAuthenticator is supposed to be a demo of the authentication
> interface.  It can be used for testing and trivial setups, but I
> wouldn't use it in production.  So it's meant to get you going easily,
> not to serve you long-term.
> 
> 2) MD5 is built into Java.  At the time, bcrypt and SHA-* were not.  I
> used MD5 only so the passwords are not stored in the clear, not to
> provide production-level security.
> 
> You should consider carefully the implications of storing passwords in a
> file on a database server, no matter how they are encrypted.  It would
> be better to write a trivial AD/LDAP/etc. authenticator that fits your
> specific needs and doesn't rely on a local file.
> 
> Ted
> 


Re: Native heap leaks?

2011-05-18 Thread Hannes Schmidt
One last word on the effect of memory mapped IO on the VIRT, RES and
SHR columns in the output of the top utility.

With mmap enabled, VIRT can be big, as much as the sum of the size of
all index and data files and the sizes of shared libraries. RES is the
sum of the sizes of

1) the Java heap,
2) the native heap (JVM-internal data, Java stacks for all threads,
direct buffers, JNA-allocated memory, PermGen)
3) the native stack,
4) minor things like static data (C not Java) and
5) with mmap enabled, the RAM pages into which the OS has currently
loaded contents of memory-mapped files.

The 5th component of RES is managed by the OS and can fluctuate
wildly. The SHR column shows the RAM pages into which the OS has
currently loaded contents of memory-mapped files but to which the
process has not written. Because Cassandra doesn't use
memory-mapped IO for writing (correct me if I am wrong), SHR and
component 5 of RES are identical. Without mmap, the 5th component will
be negligible.

To determine if there is a native leak, one needs to look at

m = RES - SHR - JavaHeap - PermGen - (JavaStack * #-of-threads).

If m keeps growing, there is a native leak.
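
For reference, the chunked readFully() workaround described further down
the thread could look like this sketch (the 1 MB chunk size is
illustrative):

  import java.io.DataInputStream;
  import java.io.IOException;

  public final class ChunkedRead {
      private static final int CHUNK = 1 << 20; // 1 MB

      // readFully() in bounded pieces so the NIO layer never needs one
      // direct buffer as large as the whole message.
      static byte[] readMessage(DataInputStream in) throws IOException {
          int size = in.readInt();
          byte[] buffer = new byte[size];
          for (int offset = 0; offset < size; offset += CHUNK)
              in.readFully(buffer, offset, Math.min(CHUNK, size - offset));
          return buffer;
      }
  }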

On Sun, May 15, 2011 at 11:52 AM, Hannes Schmidt  wrote:
> As promised: https://issues.apache.org/jira/browse/CASSANDRA-2654
>
> On Sun, May 15, 2011 at 7:09 AM, Jonathan Ellis  wrote:
>> Great debugging work!
>>
>> That workaround sounds like the best alternative to me too.
>>
>> On Sat, May 14, 2011 at 9:46 PM, Hannes Schmidt  wrote:
>>> Ok, so I think I found one major cause contributing to the increasing
>>> resident size of the Cassandra process. Looking at the OpenJDK sources
>>> was of great help in understanding the problem but my findings also
>>> apply to the Sun/Oracle JDK because the affected code is shared by
>>> both.
>>>
>>> Each IncomingTcpConnection (ITC) thread handles a socket to another
>>> node. That socket is a server socket returned from
>>> ServerSocket.accept() and as such it is implemented on top of an NIO
>>> socket channel (sun.nio.ch.SocketAdaptor) which in turn makes use of
>>> direct byte buffers. It obtains these buffers from sun.nio.ch.Util
>>> which caches the 3 most recently used buffers per thread. If a cached
>>> buffer isn't large enough for a message, a new one that is will
>>> replace it. The size of the buffer is determined by the amount of data
>>> that the application requests to be read. ITC uses the readFully()
>>> method of DataInputStream (DIS) to read data into a byte array
>>> allocated to hold the entire message:
>>>
>>> int size = socketStream.readInt();
>>> byte[] buffer = new byte[size];
>>> socketStream.readFully(buffer);
>>>
>>> Whatever the value of 'size' will end up being the size of the direct
>>> buffer allocated by the socket channel code.
>>>
>>> Our application uses range queries whose result sets are around 40
>>> megabytes in size. If a range isn't hosted on the node the application
>>> client is connected to, the range result set will be fetched from
>>> another node. When that other node has prepared the result it will
>>> send it back (asynchonously, this took me a while to grasp) and it
>>> will end up in the direct byte buffer that is cached by
>>> sun.nio.ch.Util for the ITC thread on the original node.
>>>
>>> The thing is that the buffer holds the entire message, all 40 megs of
>>> it. ITC is rather long-lived and so the buffers will simply stick
>>> around. Our range queries cover the entire ring (we do a lot of "map
>>> reduce") and so each node ends up with as many 40M buffers as we have
>>> nodes in the ring, 10 in our case. That's 400M of native heap space
>>> wasted on each node.
>>>
>>> Each ITC thread holds onto the historically largest direct buffer,
>>> possibly for a long time. This could be alleviated by periodically
>>> closing the connection and thereby releasing a potentially large
>>> buffer and replacing it with a new thread that starts with a clean
>>> slate. If all queries have large result sets, this solution won't
>>> help. Another alternative is to read the message incrementally rather
>>> than buffering it in its entirety in a byte array as ITC currently
>>> does. A third and possibly the simplest solution would be to read the
>>> messages into the buffer in chunks of say 1M. DIS has offers
>>> readFully( data, offset, length ) for that. I have tried this solution
>>> and it fixes this problem for us. I'll open an issue and submit my
>>> patch. We have observed the issue with 0.6.12 but from looking at ITC
>>> in trunk it seems to be affected too.
>>>
>>> It gets worse though: even after the ITC thread dies, the cached
>>> buffers stick around as they are being held via SoftReferences. SR's
>>> are released only as a last resort to prevent an OutOfMemoryException.
>>> Using SR's for caching direct buffers is silly because direct buffers
>>> have negligible impact on the Java heap but may have dramatic impact
>>> on the native heap. I am not the only one who thinks s

Re: Questions about using MD5 encryption with SimpleAuthenticator

2011-05-18 Thread Sameer Farooqui
I am wearing said hat and am freaking out right now :-)

Just kidding and good point. I guess it would be nice if clients like Hector
had an option to use TLS/SSL to encapsulate the application protocol.

But even SSL/TLS is subject to attacks from tools like SSLSNIFF:
http://www.thoughtcrime.org/software/sslsniff



On Wed, May 18, 2011 at 2:33 PM, Aaron Morton wrote:

> Also if you were wearing an aluminium foil hat you may also be concerned
> about how the password is sent to the server.
>
> Again though, see previous "I am not a security guy" comment and helpful
> link from Jonathan confirming that statement :)
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 19/05/2011, at 1:19 AM, Ted Zlatanov  wrote:
>
> > On Tue, 17 May 2011 15:52:22 -0700 Sameer Farooqui <
> cassandral...@gmail.com> wrote:
> >
> > SF> Would still be nice though to use the bcrypt hash over MD5 for
> stronger
> > SF> security.
> >
> > I used MD5 when I proposed SimpleAuthenticator for two reasons:
> >
> > 1) SimpleAuthenticator is supposed to be a demo of the authentication
> > interface.  It can be used for testing and trivial setups, but I
> > wouldn't use it in production.  So it's meant to get you going easily,
> > not to serve you long-term.
> >
> > 2) MD5 is built into Java.  At the time, bcrypt and SHA-* were not.  I
> > used MD5 only so the passwords are not stored in the clear, not to
> > provide production-level security.
> >
> > You should consider carefully the implications of storing passwords in a
> > file on a database server, no matter how they are encrypted.  It would
> > be better to write a trivial AD/LDAP/etc. authenticator that fits your
> > specific needs and doesn't rely on a local file.
> >
> > Ted
> >
>


Snapshotting to a different volume?

2011-05-18 Thread Sameer Farooqui
As of 0.8.0, is it possible to take a Cassandra snapshot to a different
volume (like an EBS volume dedicated to backups)?

About a year ago, Jonathan Ellis said that this won't be implemented b/c
snapshots are basically hard links:
http://mail-archives.apache.org/mod_mbox/cassandra-commits/201002.mbox/%3c821546961.221891265939548009.javamail.j...@brutus.apache.org%3E

But I don't fully understand that. If a snapshot is just a hardlink, won't
the snapshot also change as new data is written to the SSTables?

- Sameer


Re: Snapshotting to a different volume?

2011-05-18 Thread Watanabe Maki
SSTables are immutable. They won't change once written to disk.

From iPhone


On 2011/05/19, at 9:37, Sameer Farooqui  wrote:

> As of 0.8.0, is it possible to take a Cassandra snapshot to a different 
> volume (like a EBS volume dedicated for backups)?
> 
> About a year ago, Jonathan Ellis said that this won't be implemented b/c 
> snapshots are basically hard links:
> http://mail-archives.apache.org/mod_mbox/cassandra-commits/201002.mbox/%3c821546961.221891265939548009.javamail.j...@brutus.apache.org%3E
> 
> But I don't fully understand that. If a snapshot is just a hardlink, won't 
> the snapshot also change as new data is written to the SSTables?
> 
> - Sameer


Re: Snapshotting to a different volume?

2011-05-18 Thread Sameer Farooqui
Ahh.. yeah. And during a compaction a new SSTable is created with the merged
data.

So, if I take a snapshot before compaction, the old SSTables won't be
deleted (b/c the snapshot hard links still have a reference to the files).

But if I hadn't taken a snapshot before compaction, does compaction also
automatically delete the old SSTables?

FYI - The O'Reilly Cassandra book has a really misleading definition of
snapshotting. It says that a snapshot makes a copy of the keyspace and saves
it to a separate database file.

Is there a way to do a read from a snapshot? So, using a client like Hector,
can I request a read from a snapshot that I took like 2 weeks ago?

It sounds like the main benefit of a snapshot is that it makes a bunch of
nicely organized hard links in a separate folder from a specific point in
time.

So, taking snapshots to a different volume doesn't make sense since the hard
links can't span file systems. But it would be nice to have a feature where
the entire point-in-time copy of the SSTables can be copied to a different
volume. Currently if the data volume gets corrupted, the snapshots on it can
also get corrupted.

On Wed, May 18, 2011 at 5:44 PM, Watanabe Maki wrote:

> SSTables are immutable. Those won't changed once written to disk.
>
> From iPhone
>
>
> On 2011/05/19, at 9:37, Sameer Farooqui  wrote:
>
> As of 0.8.0, is it possible to take a Cassandra snapshot to a different
> volume (like a EBS volume dedicated for backups)?
>
> About a year ago, Jonathan Ellis said that this won't be implemented b/c
> snapshots are basically hard links:
> 
> http://mail-archives.apache.org/mod_mbox/cassandra-commits/201002.mbox/%3c821546961.221891265939548009.javamail.j...@brutus.apache.org%3E
>
> But I don't fully understand that. If a snapshot is just a hardlink, won't
> the snapshot also change as new data is written to the SSTables?
>
> - Sameer
>
>


Re: Snapshotting to a different volume?

2011-05-18 Thread Watanabe Maki
Please note that all files on a unix file system are basically hard links 
referring to a specific inode. If you make a hard link to a file, it means the inode 
has two referring names.
When the SSTable is compacted and GCed, Cassandra "deletes" the old SSTable but 
keeps the snapshot. Now the reference count of the inode becomes one.
The big advantage of hard links is that you don't need to copy the data, so a snapshot 
completes very fast. If you need a separate copy of the snapshot on a different 
volume, you can write a script to copy them.
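
Such a script could be as simple as this sketch (paths are illustrative;
plain cp or rsync works just as well):

  import java.io.*;

  public class CopySnapshot {
      static void copy(File src, File dst) throws IOException {
          if (src.isDirectory()) {
              dst.mkdirs();
              for (String child : src.list())
                  copy(new File(src, child), new File(dst, child));
              return;
          }
          InputStream in = new FileInputStream(src);
          OutputStream out = new FileOutputStream(dst);
          try {
              byte[] buf = new byte[1 << 16];
              for (int n; (n = in.read(buf)) > 0; )
                  out.write(buf, 0, n);
          } finally {
              in.close();
              out.close();
          }
      }

      public static void main(String[] args) throws IOException {
          // e.g. .../data/ks/snapshots/<tag> -> /mnt/backup/<tag>
          copy(new File(args[0]), new File(args[1]));
      }
  }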

From iPhone


On 2011/05/19, at 10:17, Sameer Farooqui  wrote:

> Ahh.. yeah. And during a compaction a new SSTable is created with the merged 
> data.
> 
> So, if I take a snapshot before compaction, the old SSTables won't be deleted 
> (b/c the snapshot hard links still have a reference to the files).
> 
> But if I hadn't taken a snapshot before compaction, does compaction also 
> automatically delete the old SSTables?
> 
> FYI - The O'Reilly Cassandra book has a really misleading definition of 
> snapshotting. It says that a snapshot makes a copy of the keyspace and saves 
> it to a separate database file.
> 
> Is there a way to do a read from a snapshot? So, using a client like Hector, 
> can I request a read from a snapshot that I took like 2 weeks ago?
> 
> It sounds like the main benefit of a snapshot is that it makes a bunch of 
> nicely organized hard links in a separate folder from a specific point in 
> time.
> 
> So, taking snapshots to a different volume doesn't make sense since the hard 
> links can't span file systems. But it would be nice to have a feature where 
> the entire point-in-time copy of the SSTables can be copied to a different 
> volume. Currently if the data volume gets corrupted, the snapshots on it can 
> also get corrupted.
> 
> On Wed, May 18, 2011 at 5:44 PM, Watanabe Maki  
> wrote:
> SSTables are immutable. Those won't changed once written to disk.
> 
> From iPhone
> 
> 
> On 2011/05/19, at 9:37, Sameer Farooqui  wrote:
> 
>> As of 0.8.0, is it possible to take a Cassandra snapshot to a different 
>> volume (like a EBS volume dedicated for backups)?
>> 
>> About a year ago, Jonathan Ellis said that this won't be implemented b/c 
>> snapshots are basically hard links:
>> http://mail-archives.apache.org/mod_mbox/cassandra-commits/201002.mbox/%3c821546961.221891265939548009.javamail.j...@brutus.apache.org%3E
>> 
>> But I don't fully understand that. If a snapshot is just a hardlink, won't 
>> the snapshot also change as new data is written to the SSTables?
>> 
>> - Sameer
> 


Using Toad to access Cassandra

2011-05-18 Thread Sameer Farooqui
Has anybody heard of or used Toad to access Cassandra?

http://www.quest.com/toad-for-cloud-databases/

They claim to: "Toad® for Cloud Databases provides a SQL-based interface
that makes it simple for you to generate queries, migrate, browse, and edit
data, as well as create reports and tables in a familiar SQL view."

Their whitepaper:
http://www.quest.com/Quest_Site_Assets/PDF/DS-Toad-for-Cloud-Databases-031811.pdf

Toad's explanation of Cassandra Column Families:
http://wiki.toadforcloud.com/index.php/Cassandra_Column_Families

Toad taking a jab at Hector:
http://blog.toadforcloud.com/2011/01/14/working-with-cassandra-0-7/


Using counters in 0.8

2011-05-18 Thread Ertio Lew
I am using Hector for a project & wanted to try out using counters with the
latest 0.8 Cassandra.

How do we work with counters in the 0.8 version? Any web links to such examples
are appreciated.
Has Hector started to provide an API for that?
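
For what it's worth, a hedged sketch of an increment with Hector's
0.8-era counter support (the method names createCounterColumn and
insertCounter are from memory and may differ by Hector version; the
"Counters" CF is assumed to be defined with
default_validation_class: CounterColumnType):

  import me.prettyprint.cassandra.serializers.StringSerializer;
  import me.prettyprint.hector.api.Cluster;
  import me.prettyprint.hector.api.Keyspace;
  import me.prettyprint.hector.api.factory.HFactory;
  import me.prettyprint.hector.api.mutation.Mutator;

  public class CounterExample {
      public static void main(String[] args) {
          Cluster cluster = HFactory.getOrCreateCluster("test", "localhost:9160");
          Keyspace ks = HFactory.createKeyspace("Keyspace1", cluster);
          Mutator<String> m = HFactory.createMutator(ks, StringSerializer.get());
          // insertCounter sends the increment immediately (like insert()):
          // add 1 to the "views" counter column of row "row1".
          m.insertCounter("row1", "Counters",
                          HFactory.createCounterColumn("views", 1L));
      }
  }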