> We did indeed have a problem with our GC settings. The survivor ratio was
> too low. After changing that things are better but we are still seeing GC
> that takes 5-10 seconds, which is enough for the node to drop out of the
> cluster briefly.
This still indicates full GCs. What is your write
Since this thread has now gone on for a while...
As far as I can tell you never specify the characteristics of your
writes. Evaluating expected write throughput in terms of "MB/s to
disk" is nearly impossible if one does not know anything about the
nature of the writes. If you're expecting 50 MB,
> I'm fairly certain the write path hits the commit log first, then the
> memtable.
I didn't mean to imply an ordering between the two (I probably should
not have said "memtable plus commit log"...), and yes I believe so.
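
To make the two pieces concrete, here is a minimal sketch of a write
that touches both a commit log and a memtable. It is not Cassandra
code: ToyNode, its tab-separated record format and the demo() helper
are all hypothetical, and the log-then-memtable ordering simply
follows what is believed above.

```python
import os
import tempfile


class ToyNode:
    """Hypothetical single-node model: an append-only log plus a dict."""

    def __init__(self, commit_log_path):
        # Append mode: every write lands at the end of the file.
        self.commit_log = open(commit_log_path, "ab")
        self.memtable = {}

    def write(self, key, value):
        record = f"{key}\t{value}\n".encode()
        self.commit_log.write(record)  # sequential append for durability
        self.commit_log.flush()
        self.memtable[key] = value     # in-memory update, latest value wins

    def close(self):
        self.commit_log.close()


def demo():
    """Write the same key twice and return (memtable, commit log lines)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "commit.log")
        node = ToyNode(path)
        node.write("k1", "v1")
        node.write("k1", "v2")
        node.close()
        with open(path, "rb") as f:
            log_lines = f.read().splitlines()
    return node.memtable, log_lines
```

The log keeps both records while the memtable keeps only the latest
value per key; neither step requires a seek to an existing location on
disk.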
--
/ Peter Schuller aka scode
> I have seen several off-hand mentions that writes are inherently faster than
> reads. Why is this so?
I believe the primary factor people are referring to is that writes
are faster than reads in terms of disk I/O because writes are
inherently sequential. Writes initially only happen in-memory pl
> Could you please tell me why?
There might be pending sstable removals on disk, which won't happen
until GC or restart. If you just did a bulk insert and checked
disk space immediately afterwards, I think this is a possible
explanation.
(See "Write path" on http://wiki.apache.org/cassandra/Archit
> Not sure if this was mentioned, but MongoDB is strongly consistent while
> Cassandra is eventually consistent -- at least about a month ago when I
> looked at it in more detail, though with vector clocks in 0.7, this may be
> less of an issue.
Did Mongo switch away from the "fsync() every now an
> The biggest impact on your write performance will most likely be the
> consistency level of your writes. In other words, how many nodes you want to
> wait for before you acknowledge the write back to the client.
I believe the consistency level is only expected to have a significant
impact on lat
> isolated requests, obviously in scale the RAID should perform better... I
> have not started testing concurrent reads in scale as the single reads are
> too slow to begin with. I am getting 20-30ms response time off of internal
Concurrent reads are what you need in order to see the benefit
> what is the benefit of creating bloom filter when cassandra writes data, how
> does it helps ?
It allows Cassandra to answer requests for non-existent keys without
going to disk, except in cases where the bloom filter gives a false
positive.
See:
http://spyced.blogspot.com/2009/01/all-you-ever
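
As a rough illustration of the idea, here is a toy bloom filter
guarding a lookup. This is only a sketch: the class names, the sizes
(1024 bits, 3 hashes) and the salted SHA-256 hashing are assumptions
made for the example, not what Cassandra actually uses.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter: false positives possible, false negatives not."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class ToyTable:
    """Hypothetical stand-in for an on-disk table with a filter in front."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.disk_reads = 0  # counts simulated trips to disk
        self.filter = BloomFilter()
        for key in self.rows:
            self.filter.add(key)

    def get(self, key):
        if not self.filter.might_contain(key):
            return None       # definitely absent: skip the disk entirely
        self.disk_reads += 1  # key is present, or this is a false positive
        return self.rows.get(key)
```

A lookup for a key that was never written normally returns None
without incrementing disk_reads; only a false positive costs a wasted
read.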
> I have one question about the eventuality, i.e. do you know what are the
> variables from which it depends. Well the most obvious is the
> ConsistencyLevel, so let's assume it is set to ONE. The question is that the
> eventuality is the relative time to spread changes across the cassandra
> no
> Increasing the replication level is known to break it.
Thanks! Yes, of that I am aware. When I said ring changes I meant
nodes being added and removed, or just re-balanced, implying tokens
moving around the ring.
--
/ Peter Schuller aka scode
> FYI, G1 has been in 1.6 since u14.
Yes, but (last time I checked) in a considerably older form. The JDK
1.7 one is more mature.
--
/ Peter Schuller aka scode
> I'm working on getting our latency as consistent as possible, and the gc
> likes to kick off 60+ms periods of unavailability for a node, which for my
> application leads to a reasonable number of timed out requests. Outside of
> the gc event, we get good responses.
>
> I'm happy with reduced t