Re: Nodes dropping out of cluster due to GC

2010-06-03 Thread Peter Schüller
> We did indeed have a problem with our GC settings. The survivor ratio was too low. After changing that things are better but we are still seeing GC that takes 5-10 seconds, which is enough for the node to drop out of the cluster briefly.
This still indicates full GCs. What is your write

Re: writing speed test

2010-06-02 Thread Peter Schüller
Since this thread has now gone on for a while... As far as I can tell you never specify the characteristics of your writes. Evaluating expected write throughput in terms of "MB/s to disk" is pretty impossible if one does not know anything about the nature of the writes. If you're expecting 50 MB,

Re: Why are writes faster than reads?

2010-05-25 Thread Peter Schüller
> I'm fairly certain the write path hits the commit log first, then the memtable.
I didn't mean to imply an ordering between the two (I probably should not have said "memtable plus commit log"...), and yes I believe so.
--
/ Peter Schuller aka scode
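The ordering discussed in this message (commit log first, then memtable) can be illustrated with a toy model. This is only a sketch under assumptions, not Cassandra's actual code: the class name `ToyWriteNode`, its methods, and the use of an in-memory list in place of an on-disk log file are all hypothetical.

```python
class ToyWriteNode:
    """Toy model of a write path: append to a log, then update a memtable.

    Hypothetical illustration only; a real commit log is an on-disk file
    that is appended to sequentially.
    """

    def __init__(self):
        self.commit_log = []  # append-only; stands in for a sequential file
        self.memtable = {}    # in-memory view of the latest value per key

    def write(self, key, value):
        # Step 1: record the mutation in the append-only log (sequential I/O).
        self.commit_log.append((key, value))
        # Step 2: apply it to the in-memory table; no random disk I/O needed.
        self.memtable[key] = value

    def replay(self):
        # After a crash the memtable is lost; replaying the log rebuilds it.
        recovered = {}
        for key, value in self.commit_log:
            recovered[key] = value
        return recovered


node = ToyWriteNode()
node.write("user:1", "alice")
node.write("user:1", "bob")
node.write("user:2", "carol")
```

In this model a write never seeks: it appends to the log and updates memory, which is the property the related "Why are writes faster than reads?" replies attribute to writes being sequential.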

Re: Why are writes faster than reads?

2010-05-25 Thread Peter Schüller
> I have seen several off-hand mentions that writes are inherently faster than reads. Why is this so?
I believe the primary factor people are referring to is that writes are faster than reads in terms of disk I/O because writes are inherently sequential. Writes initially only happen in-memory pl

Re: Why Cassandra is "space inefficient" compared to MySQL?

2010-05-25 Thread Peter Schüller
> Could you please tell me why?
There might be pending sstable removals on disk, which won't happen until GC or restart. If you just did a bulk insert and checked diskspace immediately afterwards, I think this is a possible explanation. (See "Write path" on http://wiki.apache.org/cassandra/Archit

Re: how does cassandra compare with mongodb?

2010-05-14 Thread Peter Schüller
> Not sure if this was mentioned, but MongoDB is strongly consistent while Cassandra is eventually consistent -- at least about a month ago when I looked at it in more detail, though with vector clocks in 0.7, this may be less of an issue.
Did Mongo switch away from the "fsync() every now an

Re: replication impact on write throughput

2010-05-11 Thread Peter Schüller
> The biggest impact on your write performance will most likely be the consistency level of your writes. In other words, how many nodes you want to wait for before you acknowledge the write back to the client.
I believe the consistency level is only expected to have a significant impact on lat

Re: Read Latency

2010-05-11 Thread Peter Schüller
> isolated requests, obviously in scale the RAID should perform better... I have not started testing concurrent reads in scale as the single reads are too slow to begin with. I am getting 20-30ms response time off of internal
Concurrent reads are what you need to do in order to see the benefit

Re: bloom filter

2010-05-07 Thread Peter Schüller
> what is the benefit of creating a bloom filter when Cassandra writes data, how does it help?
It allows Cassandra to answer requests for non-existent keys without going to disk, except in cases where the bloom filter gives a false positive. See: http://spyced.blogspot.com/2009/01/all-you-ever
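As a rough illustration of the reply above, here is a minimal bloom filter sketch. It is not Cassandra's implementation: the class `BloomFilter`, the helper `read_key`, the SHA-256-based hashing, and the sizes (1024 bits, 3 hashes) are all assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Minimal bloom filter; sizes and hashing scheme are illustrative."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def read_key(key, bloom, disk_rows, stats):
    """Hypothetical read path: consult the filter before touching 'disk'."""
    if not bloom.might_contain(key):
        return None  # answered without any disk access
    stats["disk_reads"] += 1  # reached for real keys and for false positives
    return disk_rows.get(key)


disk_rows = {"k1": "v1", "k2": "v2"}  # dict standing in for on-disk data
bloom = BloomFilter()
for k in disk_rows:
    bloom.add(k)  # the filter is populated when the data is written

stats = {"disk_reads": 0}
hit = read_key("k1", bloom, disk_rows, stats)
```

A key that was added is never reported absent, so real keys always reach the disk lookup. A missing key usually skips the lookup, and only a false positive pays for a wasted read.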

Re: eventuality

2010-05-05 Thread Peter Schüller
> I have one question about the eventuality, i.e. do you know what are the variables from which it depends. Well the most obvious is the ConsistencyLevel, so let's assume it is set to ONE. The question is that the eventuality is the relative time to spread changes across the cassandra no

Re: Quorom consistency in a changing ring

2010-04-26 Thread Peter Schüller
> Increasing the replication level is known to break it.
Thanks! Yes, of that I am aware. When I said ring changes I meant nodes being added and removed, or just re-balanced, implying tokens moving around the ring.
--
/ Peter Schuller aka scode

Re: GC options

2010-04-13 Thread Peter Schüller
> FYI, G1 has been in 1.6 since u14.
Yes, but (last time I checked) in a considerably older form. The JDK 1.7 one is more mature.
--
/ Peter Schuller aka scode

Re: GC options

2010-04-13 Thread Peter Schüller
> I'm working on getting our latency as consistent as possible, and the gc likes to kick off 60+ms periods of unavailability for a node, which for my application leads to a reasonable number of timed out requests. Outside of the gc event, we get good responses.
>
> I'm happy with reduced t