Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01

2013-03-27 Thread Andras Szerdahelyi
Aaron, What version are you using ? 1.1.9 Have you changed the bf_ chance ? The sstables need to be rebuilt for it to take effect. I did ( several times ) and I ran upgradesstables after Not sure what this means. Are you saying it's in a boat on a river, with tangerine trees and marmalade sk

Repair hangs after Upgrade to VNodes & 1.2.2

2013-03-27 Thread Ryan Lowe
Has anyone else experienced this? After upgrading to VNodes, I am having Repair issues. If I run `nodetool -h localhost repair`, then it will repair only the first Keyspace and then hang... I let it go for a week and nothing. If I run `nodetool -h localhost repair -pr`, then it appears to only r

Re: Repair hangs after Upgrade to VNodes & 1.2.2

2013-03-27 Thread Marco Matarazzo
> If I run `nodetool -h localhost repair`, then it will repair only the first > Keyspace and then hang... I let it go for a week and nothing. Do the node logs show any errors? > If I run `nodetool -h localhost repair -pr`, then it appears to only repair > the first VNode range, but does do all ke

Re: Repair hangs after Upgrade to VNodes & 1.2.2

2013-03-27 Thread Ryan Lowe
Marco, No there are no errors... the last line I see in my logs related to repair is : [repair #...] Sending completed merkle tree to /[node] for (keyspace1,columnfamily1) Ryan On Wed, Mar 27, 2013 at 8:49 AM, Marco Matarazzo < marco.matara...@hexkeep.com> wrote: > > If I run `nodetool -h lo

Re: TimeUUID Order Partitioner

2013-03-27 Thread Lanny Ripple
A type 4 UUID can be created from two Longs. You could MD5 your strings giving you 128 hashed bits and then make UUIDs out of that. Using Scala: import java.nio.ByteBuffer import java.security.MessageDigest import java.util.UUID val key = "Hello, World!" val md = MessageDigest
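The Scala snippet above is cut off in this archive view. As a rough illustration of the same idea (hash a string with MD5, then build a UUID from the two 64-bit halves), here is a minimal Python sketch. The function name `uuid_from_string` is hypothetical, and the sketch does not set the UUID version bits, so the result is a 128-bit identifier in UUID form and not strictly a type 4 UUID.

```python
import hashlib
import uuid


def uuid_from_string(key: str) -> uuid.UUID:
    """Build a UUID from the 128-bit MD5 digest of a string (illustrative only)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()  # 16 bytes = 128 bits
    most_sig = int.from_bytes(digest[:8], "big")   # first "Long"
    least_sig = int.from_bytes(digest[8:], "big")  # second "Long"
    # Combine the two 64-bit halves; version/variant bits are left untouched.
    return uuid.UUID(int=(most_sig << 64) | least_sig)


if __name__ == "__main__":
    print(uuid_from_string("Hello, World!"))
```

The mapping is deterministic, so the same string always yields the same UUID.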

Re: TimeUUID Order Partitioner

2013-03-27 Thread Lanny Ripple
Ah. TimeUUID. Not as useful for you then but still something for the toolbox. On Mar 27, 2013, at 8:42 AM, Lanny Ripple wrote: > A type 4 UUID can be created from two Longs. You could MD5 your strings > giving you 128 hashed bits and then make UUIDs out of that. Using Scala: > > import ja

Re: java.io.IOException: FAILED_TO_UNCOMPRESS(5) exception when running nodetool rebuild

2013-03-27 Thread Ondřej Černoš
Hi Aaron, I switched to 1.2.3 with no luck. I created https://issues.apache.org/jira/browse/CASSANDRA-5391 describing the problem. Maybe it's related to the EOFException problem, but I am not sure - I don't know Cassandra internals well and I have never seen the EOFException. regards, ondrej On

Re: Clearing tombstones

2013-03-27 Thread Joel Samuelsson
I see. The cleanup operation took several minutes though. This doesn't seem normal then? My replication settings should be very normal (simple strategy and replication factor 1). 2013/3/26 Tyler Hobbs > > On Tue, Mar 26, 2013 at 5:39 AM, Joel Samuelsson < > samuelsson.j...@gmail.com> wrote: > >

Digest Query Seems to be corrupt on certain cases

2013-03-27 Thread Ravikumar Govindarajan
We started receiving OOMs in our cassandra grid and took a heap dump. We are running version 1.0.7 with LOCAL_QUORUM from both reads/writes. After some analysis, we kind of identified the problem, with SliceByNamesReadCommand, involving a single Super-Column. This seems to be happening only in dig

Re: TimeUUID Order Partitioner

2013-03-27 Thread Carlos Pérez Miguel
Thanks, Lanny. That is what I am doing. Actually I'm having another problem. My UUIDOrderedPartitioner doesn't order by time. Instead, it orders by byte order and I cannot find why. Which are the functions that control ordering between tokens? I have implemented time ordering in the "compareTo" fu
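For context on why byte order and time order can disagree: in a version 1 (time-based) UUID the low 32 bits of the timestamp come first in the byte layout, so comparing raw bytes does not sort by time. The Python sketch below only demonstrates that property; `make_time_uuid` is a hypothetical helper, and it says nothing about which comparison the partitioner in this thread actually ends up using.

```python
import uuid


def make_time_uuid(timestamp: int) -> uuid.UUID:
    """Build a version 1 UUID from a 60-bit timestamp (clock seq and node fixed)."""
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | 0x1000  # version 1
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version, 0x80, 0, 0))


earlier = make_time_uuid(0x0FFFFFFFF)  # time_low is all ones
later = make_time_uuid(0x100000000)    # time_low wraps to zero, time_mid is 1

by_bytes = sorted([earlier, later], key=lambda u: u.bytes)
by_time = sorted([earlier, later], key=lambda u: u.time)

if __name__ == "__main__":
    print("byte order:", by_bytes)
    print("time order:", by_time)
```

Here the byte-ordered list puts `later` first, while the time-ordered list puts `earlier` first.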

Re: Repair hangs after Upgrade to VNodes & 1.2.2

2013-03-27 Thread Ryan Lowe
Upgrading to 1.2.3 fixed the -pr Repair.. I'll just use that from now on (which is what I prefer!) Thanks, Ryan On Wed, Mar 27, 2013 at 9:11 AM, Ryan Lowe wrote: > Marco, > > No there are no errors... the last line I see in my logs related to repair > is : > > [repair #...] Sending completed m

Re: recv_describe_keyspace bug in org.apache.cassandra.thrift.Cassandra ?

2013-03-27 Thread cscetbon.ext
Okay. I found an issue already opened for that https://issues.apache.org/jira/browse/CASSANDRA-5234 and added my comment to it, as it's labeled as 'Not a problem'. Thanks -- Cyril SCETBON On Mar 26, 2013, at 9:24 PM, aaron morton <aa...@thelastpickle.com> wrote: Is there a way to have the c

Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01

2013-03-27 Thread Wei Zhu
Welcome to the wonderland of SSTableSize of LCS. There is some discussion around it, but no guidelines yet. I asked the people in the IRC, someone is running as high as 128M on the production with no problem. I guess you have to test it on your system and see how it performs. Attached is the

Timeseries data

2013-03-27 Thread Kanwar Sangha
Hi - I have a query on Read with Cassandra. We are planning to have a dynamic column family and each column would be based on a timeseries. Inserting data - key => ‘xxx’, {column_name => TimeUUID(now), :column_value => ‘value’ }, {column_name => TimeUUID(now), :column_value => ‘value’ },..

Re: Timeseries data

2013-03-27 Thread Bryan Talbot
In the worst case, that is possible, but compaction strategies try to minimize the number of SSTables that a row appears in so a row being in ALL SStables is not likely for most cases. -Bryan On Wed, Mar 27, 2013 at 12:17 PM, Kanwar Sangha wrote: > Hi – I have a query on Read with Cassandra.

Re: cfhistograms

2013-03-27 Thread aaron morton
> I think we all go through this learning curve. Here is the answer I gave > last time this question was asked: +1 > What I don't understand here is "Row Size" column. Why is it always 0? Is it zero all the way down? What does cfstats say about the compacted max row size? Cheers ---

Re: nodetool repair hung?

2013-03-27 Thread aaron morton
> > nodetool repair is not coming back on the command line As an aside, the nodetool command makes a call to the server for each KS you are repairing. The calls are done in serial and if your terminal session times out the repair will stop after the last call nodetool made. If I'm manually running no

Re: Delete Issues with cassandra cluster

2013-03-27 Thread aaron morton
> Node1 seeds Node2 > Node2 seeds Node1 > Node3 seeds Node1 General best practice is to have the same seed list for all nodes. You want 2 or 3 seeds per data centre. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 2

Re: Multiple Primary Keys on an IN clause or 2i?

2013-03-27 Thread aaron morton
> CREATE TABLE msg_archive( > thread_id varchar, > ts timestamp, > msg blob, > PRIMARY KEY (thread_id, ts)) This with reversed clustering so the most recent columns are at the start (makes it quicker to get the last X messages) see http://www.datastax.com/docs/1.2/cql_cli/cql/CREATE_TABLE#cql-cre

Re: Infinit Loop in CompactionExecutor

2013-03-27 Thread aaron morton
> Is there a workaround beside upgrading? We are not ready to upgrade just yet. Cannot see one. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 26/03/2013, at 7:42 PM, Arya Goudarzi wrote: > Hi, > > I am experien

Re: schema disagrement exception

2013-03-27 Thread aaron morton
Your cluster is angry http://wiki.apache.org/cassandra/FAQ#schema_disagreement If you are just starting I suggest blasting it away and restarting. Hope that helps - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 26/03/201

Re: weird behavior with RAID 0 on EC2

2013-03-27 Thread aaron morton
I noticed this on an m1.xlarge (cassandra 1.1.10) instance today as well, 1 or 2 disks in a raid 0 running at 85 to 100% the others 35 to 50ish. Have not looked into it. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

Re: nodetool status inconsistencies, repair performance and system keyspace compactions

2013-03-27 Thread aaron morton
> During one of my tests - see this thread in this mailing list: > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/java-io-IOException-FAILED-TO-UNCOMPRESS-5-exception-when-running-nodetool-rebuild-td7586494.html That thread has been updated, check the bug ondrej created. > How w

Re: CQL vs. non-CQL data models

2013-03-27 Thread aaron morton
> Is this data model defined by Thrift? How closely does it reflect the > Cassandra internal data model? Yes. Astyanax is a thrift based API, and the thrift model closely matches the internal model. CQL 3 provides some abstractions on top of the internal model. > Is there any documentation or o

Re: Returning A Generated Id From An Insert

2013-03-27 Thread aaron morton
>> Is it possible to do something similar with CQL (e.g. could I be >> returned the generated timeuuid from >> now() somehow?) No. Writes do not read, and the state of the row after your write may or may not include all of the columns your write included. Cheers - Aaron M

Re: For clients, which node to connect too? (Python, CQL 1.4 driver)

2013-03-27 Thread aaron morton
Heard back from one of the contributors: the interface is as outlined by Python DB-API 2.0, so it's by design. Your choices are to add it to the library and push a patch, or handle it outside of the library. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorto

Re: Insert v/s Update performance

2013-03-27 Thread aaron morton
* Check for GC activity in the logs
* Check the volume the commit log is on to see if it's over utilised.
* Check if the dropped messages correlate to compaction, look at the compaction_* settings in yaml and consider reducing the throughput.
Like Dean says if you have existing data it will res

Re: Clearing tombstones

2013-03-27 Thread aaron morton
> The cleanup operation took several minutes though. This doesn't seem normal > then It read all the data and made sure the node was a replica for it. Since a single node cluster replicates all data, there was not a lot to throw away. > My replication settings should be very normal (simple strate

Re: Digest Query Seems to be corrupt on certain cases

2013-03-27 Thread aaron morton
> We started receiving OOMs in our cassandra grid and took a heap dump What are the JVM settings ? What was the error stack? > I am pasting the serialized byte array of SliceByNamesReadCommand, which > seems to be corrupt on issuing certain digest queries. Sorry I don't follow what you are sayi

Re: TimeUUID Order Partitioner

2013-03-27 Thread aaron morton
> That is the order I would expect to find if I read the CF, but if I do, I > obtain (with any client or library I've tried): > What happens if you export sstables with sstable2json ? Put some logging in Memtable.FlushRunnable.writeSortedContents to see the order the rows are written Cheers

Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01

2013-03-27 Thread aaron morton
> You nailed it. A significant number of reads are done from hundreds of > sstables ( I have to add, compaction is apparently constantly 6000-7000 tasks > behind and the vast majority of the reads access recently written data ) So that's not good. If IO is saturated then maybe LCS is not for you

Re: Timeseries data

2013-03-27 Thread aaron morton
sstablekey can help you find which sstables your keys are in. But yes, a slice call will need to read from all sstables the row has a fragment in. This is one reason we normally suggest partitioning time series data by month or year or something sensible in your problem domain. You will proba
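One way to picture the suggestion to partition time series data by month is to fold a time bucket into the row key, so each row only holds one month of columns. The Python sketch below is only an illustration of that idea; the function name `bucketed_row_key` and the `series:YYYY-MM` key format are assumptions, not something taken from the thread.

```python
from datetime import datetime, timezone


def bucketed_row_key(series_id: str, ts: datetime) -> str:
    """Return a row key that groups a series' columns by UTC calendar month.

    `ts` is assumed to be a timezone-aware datetime.
    """
    bucket = ts.astimezone(timezone.utc)
    return f"{series_id}:{bucket:%Y-%m}"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(bucketed_row_key("sensor-1", now))
```

A read over a time range would then query only the row keys for the months that range touches, instead of one ever-growing row.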

Re: Digest Query Seems to be corrupt on certain cases

2013-03-27 Thread Ravikumar Govindarajan
VM Settings are -javaagent:./../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms8G -Xmx8G -Xmn800M -XX:+HeapDumpOnOutOfMemoryError -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccup