Re: Cassandra 0.8.4 node keeps crashing with OOM errors

2012-09-18 Thread Feng Qu
try increasing vm.max_map_count per http://blog.timstoop.nl/2011/04/20/cassandra-java-io-ioerror-java-io-ioexception-map-failed/     Feng Qu > > From: Raj N >To: user@cassandra.apache.org >Sent: Tuesday, September 18, 2012 6:37 PM >Subject: Cassandra 0.8.4 node

Re: Disk configuration in new cluster node

2012-09-18 Thread Віталій Тимчишин
Network also matters. It would take a lot of time to send 6TB over a 1Gb link, even fully saturating it. IMHO you can try with 10Gb, but you will need to raise your streaming/compaction limits a lot. Also you will need to ensure that your compaction can keep up. It is often done in one thread and I a
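
As a rough illustration of the point about link speed, the transfer time can be estimated with simple arithmetic. This is only a back-of-the-envelope sketch: the function name `transfer_hours` and the `efficiency` parameter are made up for this example, decimal units are assumed (1 TB = 10^12 bytes), and real streaming throughput will be lower than the raw link rate.

```python
def transfer_hours(data_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move data_tb terabytes over a link_gbps link.

    efficiency is a hypothetical knob (0 < efficiency <= 1) for the
    fraction of the raw link rate that is actually achieved.
    """
    bits = data_tb * 1e12 * 8                         # decimal terabytes -> bits
    seconds = bits / (link_gbps * 1e9 * efficiency)   # bits / (bits per second)
    return seconds / 3600


# Compare a fully saturated 1Gb link with a 10Gb link for 6TB of data.
for gbps in (1, 10):
    print(f"{gbps} Gb/s: {transfer_hours(6, gbps):.1f} hours")
```

At full saturation this works out to roughly 13.3 hours on 1Gb and about 1.3 hours on 10Gb, before any throttling by streaming or compaction limits.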

Cassandra 0.8.4 node keeps crashing with OOM errors

2012-09-18 Thread Raj N
One of our nodes keeps crashing continuously with out of memory errors. I see the following error in the logs - INFO 21:03:54,007 Creating new commitlog segment /local3/logs/cassandra/commitlog/CommitLog-1348016634007.log Java HotSpot(TM) 64-Bit Server VM warning: Attempt to allocate stack guard

Re: Row caches

2012-09-18 Thread Jason Wee
Which version is that? In version 1.1.2, nodetool does take the column family. setcachecapacity - Set the key and row cache capacities of a given column family On Wed, Sep 19, 2012 at 2:15 AM, rohit reddy wrote: > Hi, > > Is it possible to enable row cache per column family after the colum

Re: persistent compaction issue (1.1.4 and 1.1.5)

2012-09-18 Thread Michael Kjellman
Potentially the pending compactions are a symptom and not the root cause/problem. When updating a 3rd column family with a larger sstable_size_in_mb it looks like the schema may not be in a good state [default@] UPDATE COLUMN FAMILY screenshots WITH compaction_strategy=LeveledCompactionStrate

Re: Is Cassandra right for me?

2012-09-18 Thread Hiller, Dean
Oh, and yes, that is the correct link. Dean From: Marcelo Elias Del Valle <mvall...@gmail.com> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org> Date: Tuesday, September 18, 2012 10:50 AM To: "user@cassandra.apache.org

Re: Is Cassandra right for me?

2012-09-18 Thread Hiller, Dean
Cassandra is fully aware of all tables created with playOrm and you can still use DataStax Enterprise features to get real-time analytics. playOrm is a layer on top of Cassandra and, as with any layer, it makes a developer more productive at a slight cost of performance just like hibernate on top o

guarantee of write-read order?

2012-09-18 Thread Yang
I remember that the memtable uses ConcurrentSkipListMap underneath, and multiple writes/reads can proceed at the same time. in my Lock implementation on cassandra, I seem to run into cases where 2 clients A and B write into the same row, different columns, on a 1-node cluster but the resulting rea

Re: persistent compaction issue (1.1.4 and 1.1.5)

2012-09-18 Thread Michael Kjellman
Thanks, I just modified the schema on the worst offending column family (as determined by the .json) from 10MB to 200MB. Should I kick off a compaction on this cf now/repair?/scrub? Thanks -michael From: Віталій Тимчишин <tiv...@gmail.com> Reply-To: "user@cassandra.apache.org

Re: persistent compaction issue (1.1.4 and 1.1.5)

2012-09-18 Thread Віталій Тимчишин
I started using LeveledCompaction some time ago, and from my experience this indicates that some SSTables are on lower levels than they should be. Compaction keeps running, moving them up level by level, but the total count does not change as new data comes in. The numbers seem pretty high to me. Such numbers me

Re: persistent compaction issue (1.1.4 and 1.1.5)

2012-09-18 Thread Michael Kjellman
There are a large number of members in generation 0, which I'm assuming refers to L0 according to a few of the .json files I checked in my largest column families. On the particular node I'm checking, I have already tried a scrub and repair. What steps should I take to move these SSTables to the n

Re: Question on Read Repair

2012-09-18 Thread Vijay
Yes, if you are using 1.1 take a look at the dclocal_read_repair_chance and read_repair_chance CF settings. Regards, On Sun, Sep 16, 2012 at 5:03 PM, Raj N wrote: > Hi, > I have a 2 DC setup (DC1:3, DC2:3). All reads and writes are at > LOCAL_QUORUM. The question is if I do reads at LOCAL_

Re: Is Cassandra right for me?

2012-09-18 Thread Marcelo Elias Del Valle
You're talking about this project, right? https://github.com/deanhiller/playorm I will take a look. However, I don't think using Cassandra's model itself (with CFs / key-values) would be a problem, I just need to know where the advantage lies. From your answer, my guess is it lies in better pe

Re: Disk configuration in new cluster node

2012-09-18 Thread Casey Deccio
On Tue, Sep 18, 2012 at 1:54 AM, aaron morton wrote: > each with several disks having large capacity, totaling 10 - 12 TB. Is > this (another) bad idea? > > Yes. Very bad. > If you had 6TB on average system with spinning disks you would measure > duration of repairs and compactions in days. > > I

Re: persistent compaction issue (1.1.4 and 1.1.5)

2012-09-18 Thread Ben Coverston
In your data directory there should be a .json file for each column family that holds the manifest. Do any of those indicate that you have a large number of SSTables in L0? This number is also indicated in JMX by the UnLeveledSSTables count for each column family. If not it's possible that the n

Re: are counters stable enough for production?

2012-09-18 Thread Robin Verlangen
" To go further, would it maybe be an idea to count everything twice? One as postive value and once as negative value. When reading the counters, the application could just compare the negative and positive counter to get an error margin. " This sounds interesting. Maybe someone should implement t

Re: persistent compaction issue (1.1.4 and 1.1.5)

2012-09-18 Thread Michael Kjellman
Leveled. Nothing in the logs. Normal compactions seem to be occurring... these ones just won't go away. I've tried a rolling restart and literally tried killing our entire cluster and bringing up one node at a time in case gossip was causing this. Same result. The compactions are there immediate

Re: are counters stable enough for production?

2012-09-18 Thread horschi
"The repair of taking the highest value of two inconsistent might cause getting higher values?" If a counter counts backwards (therefore has negative values), then repair would still choose the larger value? Or does cassandra take the highter absolute value? This would result to an undercounting

updating CF from a mapper-only Hadoop job

2012-09-18 Thread Brian Jeltema
I wrote a Hadoop mapper-only job that uses BulkOutputFormat to load a Cassandra table. That job would consistently fail with a flurry of exceptions (primary cause looks like EOFExceptions streaming between nodes). I restructured the job to use an identity mapper and perform the updates in the r

Re: Invalid Counter Shard errors?

2012-09-18 Thread Alain RODRIGUEZ
I would like to understand or do my best helping you to understand this issue. I got the following (shortened) logs: (03a227f0-a5c3-11e1--b7f5e49dceff, 3, 3) and (03a227f0-a5c3-11e1--b7f5e49dceff, 3, 0) (03a227f0-a5c3-11e1--b7f5e49dceff, 6, 6) and (03a227f0-a5c3-11e1--b7f5e49dceff

Re: Is Cassandra right for me?

2012-09-18 Thread Hiller, Dean
Until Aaron replies, here are my thoughts on the relational piece… If everything in my model fits into a relational database, if my data is structured, would it still be a good idea to use Cassandra? Why? The playOrm project explores exactly this issue……A query on 1,000,000 rows in a

Re: Is Cassandra right for me?

2012-09-18 Thread Marcelo Elias Del Valle
I will have just 6 columns in my CF, but I will have about a billion writes per hour. In this case, I think Cassandra applies, from what you are saying. This answer helped a lot too, thanks! 2012/9/18 Hiller, Dean > I wanted to clarify where that statement on wide rows comes from …. > > R

Re: Is Cassandra right for me?

2012-09-18 Thread Marcelo Elias Del Valle
Aaron, Thank you very much for the answers! They helped me a lot! I would just like a bit more clarification about the points below, if you allow me: - You can query your data using Hadoop easily enough. You may want to take a look at DSE from http://datastax.com/ it makes using Hadoop a

Re: HTimedOutException and cluster not working

2012-09-18 Thread Jason Wee
Hi Aaron, thank you for your reply. Please read inline comment. On Tue, Sep 18, 2012 at 7:36 PM, aaron morton wrote: > What version are you on ? > cassandra version 1.0.8 > > HTimedOutException is logged for all the nodes. > > TimedOutException happens when less than CL replica nodes respond to

Re: Is Cassandra right for me?

2012-09-18 Thread Hiller, Dean
I wanted to clarify where that statement on wide rows comes from …. Realize some people make the claim that if you don't have 1000's of columns in "some" rows in Cassandra you are doing something wrong. This is not true, BUT it comes from the fact that people are setting up indexes. This

Re: Query advice to prevent node overload

2012-09-18 Thread André Cruz
On Sep 18, 2012, at 3:06 AM, aaron morton wrote: >> select filename from inode where filename > '/tmp' and filename < '/tmq' and >> sentinel = 'x'; Wouldn't that return files from directories '/tmp1', '/tmp2', for example? I thought the goal was to return files and subdirectories recursively i
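
The concern about '/tmp1' and '/tmp2' can be checked with plain string comparison. The sketch below is only an illustration in Python, which compares strings by code point; for ASCII paths that matches byte ordering, but the actual ordering in Cassandra depends on the comparator or validator in use. The sample file names, the helper `in_range`, and the tighter bounds '/tmp/' and '/tmp0' are assumptions made up for this example, not something proposed in the thread.

```python
def in_range(name: str, low: str, high: str) -> bool:
    """True if low < name < high under code-point string ordering."""
    return low < name < high


# Hypothetical file names, for illustration only.
names = [
    "/tmp/a.txt",
    "/tmp/sub/b.txt",
    "/tmp1/c.txt",
    "/tmp2/d.txt",
    "/tmq",
    "/var/log",
]

# Bounds from the quoted query: everything that starts with "/tmp" qualifies,
# so entries under /tmp1 and /tmp2 are included as well.
matches_naive = [n for n in names if in_range(n, "/tmp", "/tmq")]

# Assumed alternative: "/" is code point 0x2F and "0" is 0x30, so these bounds
# keep only names that start with "/tmp/".
matches_dir = [n for n in names if in_range(n, "/tmp/", "/tmp0")]

print(matches_naive)
print(matches_dir)
```

With these sample names the first range also picks up the entries under /tmp1 and /tmp2, while the tighter bounds keep only entries below /tmp/ itself, including subdirectories.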

Re: HTimedOutException and cluster not working

2012-09-18 Thread aaron morton
What version are you on ? > HTimedOutException is logged for all the nodes. TimedOutException happens when fewer than CL replica nodes respond to the coordinator in time. You could get the error from all nodes in your cluster if the 3 nodes that store the key are having problems. > MutationS
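
The timeout rule described above can be modelled in a few lines. This is a simplified sketch, not Cassandra's coordinator code: the function names are invented for the example, only a handful of consistency levels are covered, and the quorum size is taken as floor(RF / 2) + 1.

```python
def required_responses(consistency_level: str, replication_factor: int) -> int:
    """Number of replica responses the coordinator waits for (simplified)."""
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    return levels[consistency_level]


def times_out(consistency_level: str, replication_factor: int,
              responses_in_time: int) -> bool:
    """True if fewer replicas than the consistency level requires replied in time."""
    return responses_in_time < required_responses(consistency_level, replication_factor)


# With 3 replicas for a key, QUORUM needs 2 timely responses.
print(required_responses("QUORUM", 3))
print(times_out("QUORUM", 3, responses_in_time=1))
print(times_out("ONE", 3, responses_in_time=1))
```

In this model the node that reports the error does not matter: any coordinator fails the request if the replicas that own the key do not answer in time, which is why every node in the cluster can log the exception.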

Re: Stream definition is lost after server restart

2012-09-18 Thread aaron morton
What is the query you are using to read the streams ? Can you reduce the fault to "this query is not returning data but it's there" ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 18/09/2012, at 4:11 PM, Ishan Thilina wrote: > Sorry, >

downgrade from 1.1.4 to 1.0.X

2012-09-18 Thread Arend-Jan Wijtzes
Hi, We are running Cassandra 1.1.4 and would like to experiment with Datastax Enterprise which uses 1.0.8. Can we safely downgrade a production cluster or is it incompatible? Any special steps involved? Arend-Jan -- Arend-Jan Wijtzes -- Wiseguys -- www.wise-guys.nl

Re: Composite Column Types Storage

2012-09-18 Thread Sylvain Lebresne
> Range queries do not use bloom filters. It holds good for composite-columns > also right? Since I assume you are referring to column's bloom filters (key's bloom filters are always used) then yes, that holds good for composite columns. Currently, composite column names are completely opaque to th

Re: Bloom Filters in Cassandra

2012-09-18 Thread aaron morton
Some more background http://spyced.blogspot.com/2009/01/all-you-ever-wanted-to-know-about.html In addition to the SSTable bloom filter for keys, there are row level bloom filters for columns. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

Re: Is Cassandra right for me?

2012-09-18 Thread aaron morton
> Also, I saw a presentation which said that if I don't have rows with more > than a hundred rows in Cassandra, either I am doing something wrong or I > shouldn't be using Cassandra. I do not agree with that statement. (I read that as rows with more than a hundred _columns_) > I need to suppor

Re: persistent compaction issue (1.1.4 and 1.1.5)

2012-09-18 Thread aaron morton
What Compaction Strategy are you using ? Are there any errors in the logs ? If you restart a node how long does it take for the numbers to start to rise ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 18/09/2012, at 7:39 AM, Michael Kje

Re: are counters stable enough for production?

2012-09-18 Thread Robin Verlangen
@Alain: " If you don't have much time to read this, just know that it's a random error, which appear with low frequency, but regularly, seems to appear quite randomly, and nobody knows the reason why it appears yet. Also, you need to know that it's repaired by taking the highest of the two inconsis

Re: are counters stable enough for production?

2012-09-18 Thread Alain RODRIGUEZ
Hi @Robin, about the log message: "Sometimes you can see log messages that indicate that counters are out of sync in the cluster and they get "repaired". My guess would be that the repairs actually destroys it, however I have no knowledge of the underlying techniques. " Here you got an answer fo

Re: Cassandra Messages Dropped

2012-09-18 Thread aaron morton
Any errors in the log ? The node recovers ? Do you use secondary indexes ? If so check comments for memtable_flush_queue_size in the yaml. if this value is too low writes may back up. But I would not expect it to cause dropped messages. > nodetool info also shows we have over a gig of avail

Re: Repair: Issue in netstats

2012-09-18 Thread B R
Thanks a lot for clarifying. We'll complete the upgrade of all nodes. Regards. On Mon, Sep 17, 2012 at 3:51 PM, Sylvain Lebresne wrote: > On Mon, Sep 17, 2012 at 11:06 AM, B R > wrote: > > Could this problem be due to running repair on a node upgraded to 1.0.11 > but > > the other node in the c

Re: are counters stable enough for production?

2012-09-18 Thread rohit bhatia
@Robin I'm pretty sure the GC issue is due to counters only, since we have only write-heavy counter-incrementing traffic. GC frequency also increases linearly with write load. @Bartlomiej On stress testing, we see GC frequency and consequently write latency increase to several milliseconds. At 50k

Re: Cassandra supercolumns with same name

2012-09-18 Thread aaron morton
They are. Can you provide some more information ? What happens when you read the super column ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 18/09/2012, at 5:33 AM, Cyril Auburtin wrote: > First sorry but I'm using an old version 0.

Re: Disk configuration in new cluster node

2012-09-18 Thread aaron morton
> Given the advice to use a single RAID 0 volume, I think that's what I'll do. > By system mirror, you are referring to the volume on which the OS is > installed? Yes. I was thinking about a simple RAID 1 OS volume and RAID 0 data volume setup. With the Commit Log on the OS volume so it does

Re: Composite Column Types Storage

2012-09-18 Thread aaron morton
> It is slowly dawning on me that I need a super-column to use column blooms > effectively and at the same time don't want the entire sub-column list > deserialized. Queries by name use the row level bloom filter, regardless of the CF type. > In fact, for my use-case I also do not need a colum

Re: are counters stable enough for production?

2012-09-18 Thread Robin Verlangen
We've not been trying to create inconsistencies as you describe above. But it seems legit that those situations cause problems. Sometimes you can see log messages that indicate that counters are out of sync in the cluster and they get "repaired". My guess would be that the repairs actually destroy

Re: are counters stable enough for production?

2012-09-18 Thread Bartłomiej Romański
Garbage is one more issue we are having with counters. We are operating under very heavy load. Counters are spread over 7 nodes with SSD drives and we often see CPU usage between 90-100%. We are doing mostly reads. Latency is very important for us so GC pauses taking longer than 10ms (often arou

Re: are counters stable enough for production?

2012-09-18 Thread Robin Verlangen
@Rohit: We also use counters quite a lot (let's say 2000 increments / sec), but don't see the 50-100KB of garbage per increment. Are you sure that memory is coming from your counters? Best regards, Robin Verlangen *Software engineer* * * W http://www.robinverlangen.nl E ro...@us2.nl Disclaimer: T

Re: are counters stable enough for production?

2012-09-18 Thread rohit bhatia
We use counters in an 8-node cluster with RF 2 on Cassandra 1.0.5. We use phpcassa and execute cql queries through thrift to work with composite types. We do not have any problem of overcounts as we tally with RDBMS daily. It works fine but we are having some GC pressure for young generation. Per