Re: Occasional read timeouts seen during row scans

2014-08-01 Thread Clint Kelly
BTW a few other details, sorry for omitting these: - We are using version 2.0.4 of the Java driver - We are running against Cassandra 2.0.9 - I tried messing around with the page size (even reducing it down to a single record) and that didn't seem to help (in the cases where I was

Occasional read timeouts seen during row scans

2014-08-01 Thread Clint Kelly
Hi everyone, I am seeing occasional read timeouts during multi-row queries, but I'm having difficulty reproducing them or understanding what the problem is. First, some background: Our team wrote a custom MapReduce InputFormat that looks pretty similar to the DataStax InputFormat except that it

RE: select many rows one time or select many times?

2014-08-01 Thread Mohammed Guller
Did you benchmark these two options: 1) Select with IN 2) Select all words and filter in application Mohammed From: Philo Yang [mailto:ud1...@gmail.com] Sent: Thursday, July 31, 2014 10:45 AM To: user@cassandra.apache.org Subject: select many rows one time or select many times? Hi al

Re: question about commitlog segments and memlocking

2014-08-01 Thread Robert Coli
On Fri, Aug 1, 2014 at 2:53 AM, DE VITO Dominique < dominique.dev...@thalesgroup.com> wrote: > The instruction « CLibrary.tryMlockall(); » is called at the very > beginning of the setup() Cassandra method. > > So, the heap space is memlocked in memory (if OS rights are set). > > > > “mlockall()” i

Re: how do i know if nodetool repair is finished

2014-08-01 Thread Aiman Parvaiz
This is an old post; I am not sure if something has changed for newer C* versions. If nodetool compactionstats says there are no Validation compactions running (and the compaction queue is empty) and netstats says there is nothing streaming, there is a good chance the repair is finished or dead. If a nei

Re: adding more nodes into the cluster

2014-08-01 Thread Redmumba
The Cassandra wiki is notoriously out of date. The Datastax documentation is generally more correct on most things. On Fri, Aug 1, 2014 at 9:27 AM, Donald Smith < donald.sm...@audiencescience.com> wrote: > According to datastax’s documentation at > http://www.datastax.com/documentation/cassand

RE: adding more nodes into the cluster

2014-08-01 Thread Donald Smith
According to datastax’s documentation at http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html “By default, this setting [auto_bootstrap] is true and not listed in the cassandra.yaml file.” But http://wiki.apache.org/cassandra/StorageConfigurati

Re: select many rows one time or select many times?

2014-08-01 Thread Laing, Michael
I don't think there is an easy "answer" to this... A possible approach, based upon the implied dimensions of the problem, would be to maintain a bloom filter over "words" for each user as a partition key with the user as clustering key. Then a single query would efficiently yield the list of users

question about commitlog segments and memlocking

2014-08-01 Thread DE VITO Dominique
Hi, The instruction « CLibrary.tryMlockall(); » is called at the very beginning of the setup() Cassandra method. So, the heap space is memlocked in memory (if OS rights are set). "mlockall()" is called with "MCL_CURRENT" : "MCL_CURRENT Lock all pages currently mapped into the process's address

how do i know if nodetool repair is finished

2014-08-01 Thread KZ Win
I have a 2 node apache cassandra (2.0.3) cluster with rep factor of 1. I changed rep factor to 2 using the following command in cqlsh ALTER KEYSPACE "mykeyspace" WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 2 }; I then tried to run recommended "nodetool repair" after d
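
The reply in this thread suggests a heuristic for the question above: no Validation compactions in nodetool compactionstats, an empty compaction queue, and nothing streaming in netstats. A minimal sketch of that check follows. The function name is hypothetical, and the strings it looks for ("pending tasks: 0", "Not sending any streams", "Not receiving any streams") are assumptions about the nodetool output format that may differ between Cassandra versions. The function works on output text you have already captured and does not invoke nodetool itself.

```python
import re


def repair_looks_finished(compactionstats_output: str, netstats_output: str) -> bool:
    """Heuristic only: True when nothing repair-related appears to be running.

    As the reply notes, a True result means the repair is probably
    finished *or dead*; it cannot tell those two cases apart.
    """
    cs = compactionstats_output.lower()
    ns = netstats_output.lower()

    # Assumption: a running validation compaction is listed with the word "Validation".
    validation_running = "validation" in cs

    # Assumption: an empty compaction queue is reported as "pending tasks: 0".
    queue_empty = re.search(r"pending tasks:\s*0\b", cs) is not None

    # Assumption: an idle node reports both of these phrases in netstats.
    nothing_streaming = (
        "not sending any streams" in ns and "not receiving any streams" in ns
    )

    return (not validation_running) and queue_empty and nothing_streaming
```

One way to use it would be to capture the output of both commands on every node and call the function on each pair, treating the repair as probably done only when all nodes return True.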