Re: Cassandra 2.0.7 always fails due to 'too many open files' error

2014-05-15 Thread Nikolay Mihaylov
Sorry, somebody has probably mentioned it already, but did you check the global limit? cat /proc/sys/fs/file-max cat /proc/sys/fs/file-nr On Mon, May 5, 2014 at 10:31 PM, Bryan Talbot wrote: > Running > > #> cat /proc/$(cat /var/run/cassandra.pid)/limits > > as root or your cassandra user will tell you what
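For readers who want to turn the two global-limit files mentioned above into a single number, here is a minimal sketch. It assumes the usual Linux layout of /proc/sys/fs/file-nr (three integers: allocated handles, allocated-but-unused handles, system-wide maximum). The function names are illustrative and do not come from the thread.

```python
import os


def parse_file_nr(text):
    """Split the contents of /proc/sys/fs/file-nr into three integers."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return allocated, unused, maximum


def handle_usage(text):
    """Return the fraction of the system-wide handle limit currently in use."""
    allocated, unused, maximum = parse_file_nr(text)
    return (allocated - unused) / maximum


if __name__ == "__main__":
    path = "/proc/sys/fs/file-nr"
    # Only read the file where it exists (Linux); elsewhere do nothing.
    if os.path.exists(path):
        with open(path) as handle:
            print("global file handle usage: {:.2%}".format(handle_usage(handle.read())))
```

A value close to 100% would point at the global limit rather than the per-process limit shown by /proc/<pid>/limits.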

Failed to mkdirs $HOME/.cassandra

2014-05-15 Thread Bryan Talbot
How should the nodetool command be run as the user "nobody"? The nodetool command fails with an exception if it cannot create a .cassandra directory in the current user's home directory. I'd like to schedule some nodetool commands to run with least privilege as cron jobs. I'd like to run them as the

Re: Really need some advices on large data considerations

2014-05-15 Thread Michael Shuler
On 05/13/2014 08:13 PM, Yatong Zhang wrote: Thank you Aaron, but we're planning about 20T per node, is that feasible? 20T per node is 4x the max recommended data per node on high-end spec hardware of 5T/node on nodes with 16+ cores, 128-256G, SSD, and 10gigE. pgs 12-13 (the who

NTS, vnodes and 0% chance of data loss

2014-05-15 Thread William Oberman
I found this: http://mail-archives.apache.org/mod_mbox/cassandra-user/201404.mbox/%3ccaeduwd1erq-1m-kfj6ubzsbeser8dwh+g-kgdpstnbgqsqc...@mail.gmail.com%3E I read the three referenced cases. In addition, case 4123 references: http://www.mail-archive.com/dev@cassandra.apache.org/msg03844.html And

Stephanie Huynh invites you to Bay Area Big Data

2014-05-15 Thread Stephanie Huynh
Hi there, Stephanie Huynh has invited you to Bay Area Big Data. Stephanie Huynh says: Please join the Bay Area Big Data group to learn about the BigData and NoSQL landscape! -

What % of cassandra developers are employed by Datastax?

2014-05-15 Thread Kevin Burton
I'm curious what % of cassandra developers are employed by Datastax? … vs other companies. When MySQL was acquired by Oracle this became a big issue because even though you can't really buy an Open Source project, you can acquire all the developers and essentially do the same thing. It would be

Re: Storing log structured data in Cassandra without compactions for performance boost.

2014-05-15 Thread Aaron Morton
If you disable compaction you will end up with a *lot* of sstables, which will hurt read performance and be a pain to manage (including making repairs and bootstrapping take longer). STCS is not too onerous; I’d recommend leaving it on. If you want it to run less frequently, increase min_threshold.

Re: Efficient bulk range deletions without compactions by dropping SSTables.

2014-05-15 Thread Kevin Burton
> > > We basically do this same thing in one of our production clusters, but > rather than dropping SSTables, we drop Column Families. We time-bucket our > CFs, and when a CF has passed some time threshold (metadata or embedded in > CF name), it is dropped. This means there is a home-grown system t

Re: Automatic tombstone removal issue (STCS)

2014-05-15 Thread Paulo Ricardo Motta Gomes
I just updated CASSANDRA-6563 with more details and proposed a patch to solve the issue, in case anyone else is interested. https://issues.apache.org/jira/browse/CASSANDRA-6563 On Tue, May 6, 2014 at 10:00 PM, Paulo Ricardo Motta Gomes < paulo.mo...@chaordicsystems.com> wrote: > Robert: thanks f

Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-15 Thread Anton Brazhnyk
Greetings, I'm reading data from C* with Spark (via ColumnFamilyInputFormat) and I'd like to read just part of it - something like Spark's sample() function. Cassandra's API seems to allow this with its ConfigHelper.setInputRange(jobConfiguration, startToken, endToken) method, but it doesn't w

Re: How to balance this cluster out ?

2014-05-15 Thread Aaron Morton
This is not a problem with the token assignments. Here are the ideal assignments from the tools/bin/token-generator script DC #1: Node #1: 0 Node #2: 56713727820156410577229101238628035242 Node #3: 113427455640312821154458202477256070484 You are pr
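The three tokens listed above are consistent with evenly spaced positions on a ring of size 2**127, which is the RandomPartitioner token range. The sketch below reproduces them under that assumption. The formula is inferred from the listed values, not taken from the token-generator script itself, and the function name is illustrative.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def initial_tokens(node_count):
    """Return evenly spaced initial tokens for one data center."""
    step = RING_SIZE // node_count  # integer spacing between adjacent nodes
    return [index * step for index in range(node_count)]


if __name__ == "__main__":
    for number, token in enumerate(initial_tokens(3), start=1):
        print("Node #{}: {}".format(number, token))
```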

Mutation messages dropped

2014-05-15 Thread Raveendran, Varsha IN BLR STS
Hello, I am writing around 10 million records continuously into a single-node Cassandra (2.0.5). In the Cassandra log file I see an entry "272 MUTATION messages dropped in last 5000ms". Does this mean that 272 records were not written successfully? Thanks, Varsha

Re: Effect of number of keyspaces on write-throughput....

2014-05-15 Thread Krishna Chaitanya
Hello, Thanks for the reply. Currently, each client is writing about 470 packets per second where each packet is 1500 bytes. I have four clients writing simultaneously to the cluster. Each client is writing to a separate keyspace simultaneously. Hence, is there a lot of switching of keyspaces?
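For scale, the figures quoted above work out to a modest aggregate rate. A small sketch of the arithmetic follows, using only the numbers in the message (470 packets per second, 1500 bytes per packet, four clients). The constant and function names are illustrative.

```python
PACKETS_PER_SECOND = 470  # per client, from the message
PACKET_BYTES = 1500       # bytes per packet, from the message
CLIENTS = 4               # clients writing simultaneously, from the message


def aggregate_bytes_per_second(packets_per_second, packet_bytes, clients):
    """Total bytes per second arriving at the cluster from all clients."""
    return packets_per_second * packet_bytes * clients


if __name__ == "__main__":
    total = aggregate_bytes_per_second(PACKETS_PER_SECOND, PACKET_BYTES, CLIENTS)
    # 470 * 1500 = 705,000 bytes/s per client; times 4 clients = 2,820,000 bytes/s
    print("{} bytes/s (about {:.2f} MB/s)".format(total, total / 1e6))
```

That is roughly 2.82 MB/s, or 1,880 packets per second, across the four keyspaces.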

Re: Disable reads during node rebuild

2014-05-15 Thread sankalp kohli
This might be useful: Nodetool command to disable reads On Wed, May 14, 2014 at 8:31 AM, Paulo Ricardo Motta Gomes < paulo.mo...@chaordicsystems.com> wrote: > That's a nice workaround, will be really helpful in emergency situations > like this

Re: How long are expired values actually returned?

2014-05-15 Thread Aaron Morton
> Is this normal or am I doing something wrong? Probably this one. But the TTL is set based on the system clock on the server, so my first thought would be to check that the times are correct. If that fails, send over the schema and the insert. Cheers Aaron - Aaron Morton New Zealand

Re: Question about READS in a multi DC environment.

2014-05-15 Thread graham sanderson
Yeah, but all the requests for data/digest are sent at the same time… responses that aren’t “needed” to complete the request are dealt with asynchronously (possibly causing repair). In the original trace (which is confusing because I don’t think the clocks are in sync)… I don’t see anything th

Re: Disable reads during node rebuild

2014-05-15 Thread Aaron Morton
> As of 2.0.7, driftx has added this long-requested feature. Thanks A - Aaron Morton New Zealand @aaronmorton Co-Founder & Principal Consultant Apache Cassandra Consulting http://www.thelastpickle.com On 13/05/2014, at 9:36 am, Robert Coli wrote: > On Mon, May 12, 2014 at 10:18

Setting the read/write consistency globally in the CQL3 datastax java driver

2014-05-15 Thread Sebastian Schmidt
Hi, I'm using the CQL3 Datastax Cassandra Java client. I want to use a global read and write consistency for my queries. I know that I can set the consistencyLevel for every single prepared statement. But I want to do that just once per cluster or once per session. Is that possible? Kind Regards,

Query returns incomplete result

2014-05-15 Thread Lu, Boying
Hi, All, I use astyanax 1.56.48 + Cassandra 2.0.6 in my test code and run a query like this: query = keyspace.prepareQuery(..).getKey(...) .autoPaginate(true) .withColumnRange(new RangeBuilder().setLimit(pageSize).build()); ColumnList result; result= query.execute().getResult(); while (!

Re: Storing log structured data in Cassandra without compactions for performance boost.

2014-05-15 Thread Nate McCall
The following article has some good information for what you describe: http://www.datastax.com/dev/blog/optimizations-around-cold-sstables Some related tickets which will provide background: https://issues.apache.org/jira/browse/CASSANDRA-5228 https://issues.apache.org/jira/browse/CASSANDRA-5515