Re: Does replicate_on_write=true imply that CL.QUORUM for reads is unnecessary?

2013-05-31 Thread Peter Schuller
deletions, it's safe to >> only use CL.ONE and disable the read repair if we're never deleting >> counters. (And, of course, if we did start deleting counters, we'd need to >> revert those client and column family changes.) > > -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Read IO

2013-02-20 Thread Peter Schuller
nel settings (typically trading pollution of page cache vs. number of I/O:s). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Simulating a failed node

2012-10-27 Thread Peter Schuller
cas to even try to write to. (Note though: Reads are a bit of a different story and if you want to test behavior when nodes go down I suggest including that. See CASSANDRA-2540 and CASSANDRA-3927.) -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Java 7 support?

2012-10-24 Thread Peter Schuller
FWIW, we're using openjdk7 on most of our clusters. For those where we are still on openjdk6, it's not because of an issue - just haven't gotten to rolling out the upgrade yet. We haven't had any issues that I recall with upgrading the JDK. -- / Peter Sc

Re: nodetool cleanup

2012-10-22 Thread Peter Schuller
On Oct 22, 2012 11:54 AM, "B. Todd Burruss" wrote: > > does "nodetool cleanup" perform a major compaction in the process of > removing unwanted data? No.

Re: Why data tripled in size after repair?

2012-10-01 Thread Peter Schuller
o build my own > version of cassandra? It's in the 1.1 branch; I don't remember if it went into a release yet. If not, it'll be in the next 1.1.x release. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Why data tripled in size after repair?

2012-09-26 Thread Peter Schuller
- because you're creating a single sstable bigger than what would normally happen, and it takes more total disk space before it will be part of a compaction again. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: JVM 7, Cass 1.1.1 and G1 garbage collector

2012-09-24 Thread Peter Schuller
on it's not, is likely that it depends on the situation. Further, even if you do play the lottery and win - if you don't know *why*, how are you able to extrapolate the behavior of the system with slightly changed workloads? It's very hard to blackbox-test GC settings, which is probably why GC tuning can be perceived as a useless game of whack-a-mole. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Invalid Counter Shard errors?

2012-09-19 Thread Peter Schuller
and cannot be safely retried. Cassandra counters are generally not useful if *strict* correctness is desired, for this reason. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Invalid Counter Shard errors?

2012-09-19 Thread Peter Schuller
er case as far as I can tell (off the top of my head), *some* counter increment is lost. The only way I can see (again off the top of my head) the resulting value being correct is if the later increment (N2 in this case) is somehow including N1 as well (e.g., because it was generated by first reading the current counter value). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: JVM 7, Cass 1.1.1 and G1 garbage collector

2012-09-15 Thread Peter Schuller
t that I know of if you're on Hotspot, is to have the application behave in such a way that it avoids the causes of unpredictable behavior w.r.t. GC by being careful about its memory allocation and *retention* profile. For the specific case of avoiding *ever* seeing a full gc, it gets even more complex. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Changing bloom filter false positive ratio

2012-09-14 Thread Peter Schuller
't see how it would since each sstable will effectively cover almost the entire range (since you're effectively spraying random tokens at it, unless clients are writing data in md5 order). (Maybe it's different for ordered partitioning though.) -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: JVM 7, Cass 1.1.1 and G1 garbage collector

2012-09-12 Thread Peter Schuller
> Our full gc:s are typically not very frequent. Few days or even weeks > in between, depending on cluster. *PER NODE* that is. On a cluster of hundreds of nodes, that's pretty often (and all it takes is a single node). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
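The point about per-node versus cluster-wide frequency is simple arithmetic. A minimal sketch, assuming full GCs are independent and evenly spread across nodes (the function name and the example numbers are illustrative, not from the thread):

```python
def cluster_full_gc_per_day(node_count: int, days_between_full_gc_per_node: float) -> float:
    """Expected number of full GCs per day somewhere in the cluster.

    Assumes every node sees one full GC per `days_between_full_gc_per_node`
    on average, independently of the other nodes.
    """
    if node_count < 0 or days_between_full_gc_per_node <= 0:
        raise ValueError("node_count must be >= 0 and the interval must be > 0")
    return node_count / days_between_full_gc_per_node


if __name__ == "__main__":
    # Hypothetical cluster: 300 nodes, one full GC per node per week.
    print(round(cluster_full_gc_per_day(300, 7.0), 1))
```

Under those assumptions, a per-node interval that sounds rare still means dozens of full GCs per day across a few hundred nodes.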

Re: JVM 7, Cass 1.1.1 and G1 garbage collector

2012-09-12 Thread Peter Schuller
like to see how it is > in action. FWIW, J9's "balanced" collector is very similar to G1 in its design. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: JVM 7, Cass 1.1.1 and G1 garbage collector

2012-09-12 Thread Peter Schuller
c analysis). The only question is how often. But given the lack of handling of such failure modes, the effect on clients is huge. Recommend data reads by default to mitigate this and a slew of other sources of problems (and for counter increments, we're rolling out least-active-request routi

Re: JVM 7, Cass 1.1.1 and G1 garbage collector

2012-09-10 Thread Peter Schuller
mbered set scanning costs (driven by inter-region pointers). If you can avoid that, one might hope to avoid full gc:s altogether. The jury is still out on my side; but like I said, I've seen promising indications. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Cassandra 1.1.1 on Java 7

2012-09-08 Thread Peter Schuller
> Has anyone tried running 1.1.1 on Java 7? Have been running jdk 1.7 on several clusters on 1.1 for a while now. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Invalid Counter Shard errors?

2012-09-06 Thread Peter Schuller
This problem is not new to 1.1. On Sep 6, 2012 5:51 AM, "Radim Kolar" wrote: > i would migrate to 1.0 because 1.1 is highly unstable. >

Re: force gc?

2012-09-02 Thread Peter Schuller
-obnoxiously iterate over all rows: for row_id, row in your_column_family.get_range(): https://github.com/pycassa/pycassa -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: force gc?

2012-09-02 Thread Peter Schuller
> I think that was clear from your post. I don't see a problem with your > process. Setting gc grace to 0 and forcing compaction should indeed > return you to the smallest possible on-disk size. (But may be unsafe as documented; can cause deleted data to pop back up, etc.) -- /

Re: force gc?

2012-09-02 Thread Peter Schuller
sing compression) the Cassandra on-disk format is not as compact as PostgreSQL. For example column names are duplicated in each row, and the row key is duplicated twice (once in index, once in data). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Memory Usage of a connection

2012-08-30 Thread Peter Schuller
le requesting large amounts of data? Large or many columns (or both), etc. Essentially all "working" data that your request touches is allocated on the heap and contributes to allocation rate and ParNew frequency. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: JMX(RMI) dynamic port allocation problem still exists?

2012-08-28 Thread Peter Schuller
I can recommend Jolokia highly for providing an HTTP/JSON interface to JMX (it can be trivially run in agent mode by just altering JVM args): http://www.jolokia.org/ -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Node forgets about most of its column families

2012-08-28 Thread Peter Schuller
re able to disablegossip and make other nodes not send requests to it. disabling thrift would also be advised, or even firewalling it prior to restart. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Why so slow?

2012-08-19 Thread Peter Schuller
You're almost certainly using a client that doesn't set TCP_NODELAY on the thrift TCP socket. The Nagle algorithm is enabled, leading to 200 ms latency for each request, and thus 5 requests/second. http://en.wikipedia.org/wiki/Nagle's_algorithm -- / Peter Schulle
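For illustration, this is how TCP_NODELAY is set on a plain TCP socket, together with the arithmetic behind the 5 requests/second figure. It is a sketch using only the Python standard library; how a particular Thrift client exposes this option is not shown in the thread, and the helper names below are made up for the example.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create an unconnected TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 tells the kernel to send small writes immediately
    # instead of holding them back to coalesce with later data.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def max_sequential_requests_per_second(latency_seconds: float) -> float:
    """Upper bound on throughput for one connection issuing one request at a time."""
    return 1.0 / latency_seconds


if __name__ == "__main__":
    s = make_nodelay_socket()
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
    s.close()
    # 200 ms per request on a single synchronous connection.
    print(max_sequential_requests_per_second(0.2))
```

With one request in flight at a time, a fixed 200 ms stall per request caps a connection at 1 / 0.2 = 5 requests per second regardless of how fast the server is.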

Re: nodetool repair uses insane amount of disk space

2012-08-17 Thread Peter Schuller
t failure domains (for reasons outlined in 3810 above). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: 0.8 --> 1.1 Upgrade: Any Issues?

2012-07-19 Thread Peter Schuller
NDRA-3820 -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Invalid Counter Shard errors?

2012-06-02 Thread Peter Schuller
n 1.1, so the root cause remains unknown as far as I can tell (had previously hoped the root cause were thread-unsafe shard merging, or one of the other counter related issues fixed during the 0.8 run). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: how to increase compaction rate?

2012-03-11 Thread Peter Schuller
> multithreaded_compaction: false Set to true. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: OOM opening bloom filter

2012-03-11 Thread Peter Schuller
ng, but it certainly should significantly decrease memory use. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: OOM opening bloom filter

2012-03-11 Thread Peter Schuller
ad, want to adjust target bloom filter false positive rates: https://issues.apache.org/jira/browse/CASSANDRA-3497 -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: TTL 3 hours + GC grace 0

2012-03-11 Thread Peter Schuller
ache.org/cassandra/Operations#Dealing_with_the_consequences_of_nodetool_repair_not_running_within_GCGraceSeconds -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: LeveledCompaction and/or SnappyCompressor causing memory pressure during repair

2012-03-10 Thread Peter Schuller
sage because of long-running repairs retaining sstables and delaying their unload/removal (index sampling/bloom filters filling your heap). If it really only happens for leveled/snappy however, I don't know what that might be caused by. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Frequency of Flushing in 1.0

2012-02-26 Thread Peter Schuller
lots of writes, index sampling will be insignificant. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Analysis of performance benchmarking - unexpected results

2012-02-16 Thread Peter Schuller
ould not have knowledge about topology and relative latency other than what is driven by traffic, and I could imagine this happening if read repair were turned completely off. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
0 +? Nowadays there is code to actively make caches smaller if Cassandra detects that you seem to be running low on heap. Watch cassandra.log for messages to that effect (don't remember the exact message right now). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
tage pending backing up constantly. If on the other hand these are batch jobs where throughput is the concern, it's not relevant. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
gine unless you have sub-second resolution, but would still exhibit unevenness and have an effect on latency. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
What is your total data size (nodetool info/nodetool ring) per node, your heap size, and the amount of memory on the system? -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
close to maximum at all times, and pending racking up consistently. If you're just close, you'll likely see spikes sometimes. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
cache only caches the index positions in the data file, and not the actual data. The key cache will only ever eliminate the I/O that would have been required to lookup the index entry; it doesn't help to eliminate seeking to get the data (but as usual, it may still be in the operating system pa

Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
in query latency is high. That said, if you're seeing consistently bad latencies for a while where you sometimes see consistently good latencies, that sounds different but would hopefully be observable somehow. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
e node while it is being slow, and observe. Figure out what the bottleneck is. iostat, top, nodetool tpstats, nodetool netstats, nodetool compactionstats. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Multiple data center nodetool ring output display 0% owns

2012-02-12 Thread Peter Schuller
'own' 0% (actually, 1/(2^128) :) ), and > depending on your replication factor might have no data (if replication were > 1). It's also incorrect for rack awareness if your topology is such that the rack awareness changes ownership (see https://issues.apache.org/jira/brows

Re: Cassandra 1.0.6 multi data center question

2012-02-08 Thread Peter Schuller
Again the *schema* gets propagated and the keyspace will exist everywhere. You should just have exactly zero amount of data for the keyspace in the DC w/o replicas. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Cassandra 1.0.6 multi data center question

2012-02-08 Thread Peter Schuller
sstables on disk with data for the keyspace? -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Cassandra 1.0.6 multi data center question

2012-02-08 Thread Peter Schuller
It is expected that the schema is replicated everywhere, but *data* won't be in the DC with 0 replicas. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Can you query Cassandra while it's doing major compaction

2012-02-02 Thread Peter Schuller
ontinuously happening. A good rule of thumb is that an individual node should be able to handle your traffic when doing compaction; you don't want to be in the position where you're just barely dealing with the traffic, and a node doing compaction not being able to handle it.

Re: read-repair?

2012-02-02 Thread Peter Schuller
rashed with data loss prior to the data making it elsewhere. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: read-repair?

2012-02-01 Thread Peter Schuller
umn wins. This accomplishes the reads-see-writes invariant. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
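The message above is truncated, so the following is only a sketch of the usual argument, assuming it refers to overlapping read and write replica sets plus timestamp-based reconciliation. The names `Column`, `quorum`, `read_overlaps_write` and `reconcile` are illustrative and not Cassandra APIs.

```python
from typing import List, NamedTuple


class Column(NamedTuple):
    value: str
    timestamp: int  # client-supplied write timestamp


def quorum(rf: int) -> int:
    """Number of replicas in a quorum for replication factor rf."""
    return rf // 2 + 1


def read_overlaps_write(rf: int, write_replicas: int, read_replicas: int) -> bool:
    """True if every read set must share at least one replica with every write set."""
    return read_replicas + write_replicas > rf


def reconcile(replies: List[Column]) -> Column:
    """Pick the reply with the highest timestamp (tie-breaking is not modeled)."""
    return max(replies, key=lambda c: c.timestamp)


if __name__ == "__main__":
    rf = 3
    print(read_overlaps_write(rf, quorum(rf), quorum(rf)))  # QUORUM write + QUORUM read
    print(read_overlaps_write(rf, 1, 1))                    # ONE write + ONE read
    print(reconcile([Column("old", 1), Column("new", 2)]).value)
```

When the read and write sets are guaranteed to overlap, at least one reply carries the latest write, and picking the highest timestamp surfaces it to the reader.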

Re: how to delete data with level compaction

2012-01-28 Thread Peter Schuller
. Have a good amount of margin. Less so with leveled compaction than size tiered compaction, but still important. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: how to delete data with level compaction

2012-01-28 Thread Peter Schuller
y reach a steady state of disk usage. It only becomes a problem if you're almost *entirely* full and are trying to delete data in a panic. How far away are you from entirely full? Are you just worried about the future or are you about to run out of disk space right now? -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: ideal cluster size

2012-01-21 Thread Peter Schuller
ecially in relation to memory size), it's not necessarily the best trade-off. Consider the time it takes to do repairs, streaming, node start-up, etc. If it's only about CPU resources then bigger nodes probably make more sense if the h/w is cost effective. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: delay in data deleting in cassadra

2012-01-20 Thread Peter Schuller
AL_QUORUM if it's only within a DC). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Garbage collection freezes cassandra node

2012-01-20 Thread Peter Schuller
ssen promotion into old-gen). Experiment on a single node, making sure you're not causing too much disk I/O by stealing memory otherwise used by page cache. Once you have something that works you might try slowly going back down. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: ideal cluster size

2012-01-19 Thread Peter Schuller
correspondingly big single cluster. It is probably more useful to try to select hardware such that you have a greater number of smaller nodes, than it is to focus on node count (although once you start reaching the "few hundreds" level you're entering territory of less actual

Re: Garbage collection freezes cassandra node

2012-01-19 Thread Peter Schuller
icularly detailed tuning of GC issues is pretty useless on 0.7 given the significant changes in 1.0. Don't even bother spending time on this until you're on 1.0, unless this is about a production cluster that you cannot upgrade for some reason. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Cannot start cassandra node anymore

2012-01-06 Thread Peter Schuller
to me there are hints on the node, for other nodes, that contain writes to a deleted column family. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Cassandra stress test and max vs. average read/write latency.

2012-01-02 Thread Peter Schuller
o reproduce them.  But > offhand, I don't see any to throttle back the load created by the > stress test. I'm not aware of one built-in. It would be a useful patch IMO, to allow setting a target rate. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Consistency Level

2011-12-29 Thread Peter Schuller
og4j-server.properties at the top) and the strategy (assuming you are using NetworkTopologyStrategy) will log selected endpoints, and confirm that it's indeed picking endpoints that you think it should based on getendpoints. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Consistency Level

2011-12-28 Thread Peter Schuller
But > this is not happening. 1 node among the ones in the replica set of your row has to be up. > Will the read repair happen automatically even if I read and write using the > consistency level ONE? Yes, assuming it's turned on. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Restart for change of endpoint_snitch ?

2011-12-27 Thread Peter Schuller
> If I change endpoint_snitch from SimpleSnitch to PropertyFileSnitch, > does it require restart of cassandra on that node ? Yes. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: better anti OOM

2011-12-27 Thread Peter Schuller
d aren't latency critical, that's probably fine though. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: improving cassandra-vs-mongodb-vs-couchdb-vs-redis

2011-12-27 Thread Peter Schuller
r 99% of all benchmarks ever published on the internet... -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: will compaction delete empty rows after all columns expired?

2011-12-27 Thread Peter Schuller
sing reasons. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: will compaction delete empty rows after all columns expired?

2011-12-27 Thread Peter Schuller
> Compaction should delete empty rows once gc_grace_seconds is passed, right? Yes. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: index sampling

2011-12-27 Thread Peter Schuller
rly documented in > http://wiki.apache.org/cassandra/LargeDataSetConsiderations - that bloom > filters + index sampling will be responsible for most memory used by node. > Caching itself has minimal use on large data set used for OLAP. I added some information at the end. -

Re: better anti OOM

2011-12-27 Thread Peter Schuller
r to heap capacity than regular compaction. Also, consider tweaking compaction throughput settings to control the rate of allocation generated during a compaction, even if you don't need it for disk I/O purposes. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: better anti OOM

2011-12-26 Thread Peter Schuller
XXX - try XXX = number of CPU cores for example in this case). Alternatively, a larger young gen to avoid so much getting promoted during compaction. But really, in short: The easiest fix is probably to increase the heap size. I know this e-mail doesn't begin to explain details but it's such a long story. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: better anti OOM

2011-12-26 Thread Peter Schuller
ng compaction wouldn't really help anything other than short-term avoiding a fallback to full GC. I suggest you describe exactly what the problem is you have and why you think stopping compaction/repair is the appropriate solution. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: reported bloom filter FP ratio

2011-12-26 Thread Peter Schuller
nable bloom filters will be committed. That is a good reason for both to be configurable IMO. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: reported bloom filter FP ratio

2011-12-26 Thread Peter Schuller
this an option now), but a 1% false positive hit rate will be completely unacceptable in some circumstances. In others, perfectly acceptable due to the decrease in memory use and few reads. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
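The memory-versus-accuracy trade-off mentioned above can be estimated with the textbook formula for an optimally configured bloom filter, bits per key = -ln(p) / (ln 2)^2. This is a sketch of that general formula only; it is an assumption that it approximates Cassandra's actual sizing, and the function name is illustrative.

```python
import math


def bloom_bits_per_key(false_positive_rate: float) -> float:
    """Bits per key for an optimal bloom filter with the given false positive rate.

    Textbook result: m / n = -ln(p) / (ln 2)^2, assuming the optimal
    number of hash functions is used.
    """
    if not 0.0 < false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be strictly between 0 and 1")
    return -math.log(false_positive_rate) / (math.log(2) ** 2)


if __name__ == "__main__":
    for p in (0.1, 0.01, 0.001):
        print(p, round(bloom_bits_per_key(p), 2))
```

Each tenfold reduction in the false positive rate costs roughly 4.8 more bits per key, which is why a looser target can save significant memory on nodes holding very many rows.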

Re: reported bloom filter FP ratio

2011-12-25 Thread Peter Schuller
s to sstables will be higher than the number of reads to the CF (unless you happen to have exactly one sstable or no rows ever span sstables). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Cassandra stress test and max vs. average read/write latency.

2011-12-22 Thread Peter Schuller
that you're seeing w/o GC correlation? Off the top of my head, that seems very unexpected (assuming a non-saturated system) and would definitely invite investigation IMO. If you're willing to start iterating with the source code I'd start bisecting down the call stack and see where it's happening. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Routine nodetool repair

2011-12-22 Thread Peter Schuller
> One other thing to consider is are you creating a few very large rows ? You > can check the min, max and average row size using nodetool cfstats. Normally I agree, but assuming the two-node cluster has RF 2 it would actually not matter ;) -- / Peter Schuller (@scode

Re: Routine nodetool repair

2011-12-21 Thread Peter Schuller
n happens to be in on the given node (I am assuming you're not using leveled compaction). That is in addition to any imbalance that might result from your population of data in the cluster. Running repair can affect the live size, but *lack* of repair won't cause a live size divergen

Re: Cassandra stress test and max vs. average read/write latency.

2011-12-19 Thread Peter Schuller
l, and then see whether or not you can sustain that in terms of old-gen. Start with this in any case: Run Cassandra with -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Garbage collection freezes cassandra node

2011-12-19 Thread Peter Schuller
"occupancy") of CMS to a lower percentage, making the concurrent mark phase start earlier. * Increase heap size significantly (probably not necessary based on your graph, but for good measure). If it then goes away, report back and we can perhaps figure out details. There are other things

Re: Garbage collection freezes cassandra node

2011-12-19 Thread Peter Schuller
GC log around the time of the pause). Your graph is looking very unusual for CMS. It's possible that everything is as it otherwise should be and CMS is kicking in too late, but I am kind of skeptical of that given the extremely smooth look of your graph. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: CPU bound workload

2011-12-18 Thread Peter Schuller
s best, but it's polling. Unfortunately the JDK provides no way to properly monitor for GC events within the Java application. The GC inspector can miss a GC. Also, the GC inspector only tells you time + type of GC; a GC log will provide all sorts of details. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Crazy compactionstats

2011-12-14 Thread Peter Schuller
until suddenly snapping back to 0 again once compactions catch up. Whether or not non-zero is a problem depends on the Cassandra version, how many concurrent compactors you are running, and your column families/data sizes/flushing speeds etc. (Sorry, kind of a long story) -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: CPU bound workload

2011-12-11 Thread Peter Schuller
ractively "wait for it", I suggest something as simple as firing up top + iostat for each host and having them on the screen at the same time, and looking for what happens when you see this again. If the problem is fallback to full GC for example, the affected nodes should be

Re: Meaning of values in tpstats

2011-12-11 Thread Peter Schuller
nd the negative effects increase as you have higher demands of low latency on other traffic to the cluster. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: read/write counts

2011-12-11 Thread Peter Schuller
eneral, you will see a magnification by a factor of RF on the local statistics (in aggregate) relative to the StorageProxy stats. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Consistence for node shutdown and startup

2011-12-11 Thread Peter Schuller
ransactions -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: cassandra in production environment

2011-12-11 Thread Peter Schuller
ses RHEL 6.1 specifically? I mean I can say that I've run Cassandra on Debian Squeeze in production, but that doesn't really help you ;) -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: CPU bound workload

2011-12-10 Thread Peter Schuller
quential I/O asynchronously. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: CPU bound workload

2011-12-10 Thread Peter Schuller
ggestions (apart from spreading the load on more nodes). > > Cluster is 5 node, BOP, RF=3, AMD opteron 4174 CPU (6 x 2.3 Ghz cores), > Gigabit ethernet, RAID-0 SATA2 disks For starters, what *is* the throughput? How many counter mutations are you submitting per second? -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: ParNew and caching

2011-12-10 Thread Peter Schuller
minating them entirely may not be possible). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Meaning of values in tpstats

2011-12-10 Thread Peter Schuller
NLY doing these queries, that's not a problem per se. But if you are also expecting other requests to have low latency, then you want to avoid it. In general, batching is good - but don't overdo it, especially for reads, and especially if you're going to disk for the workload. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Cassandra behavior too fragile?

2011-12-07 Thread Peter Schuller
ld all just sit there and work without intervention. It's a pretty big ticket though and not something I'm gonna be working on in my spare time, so I don't know whether or when I would actually work on that ticket (depends on priorities). I have the ideas but I can't promise to fix it :) -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Cassandra not suitable?

2011-12-06 Thread Peter Schuller
ks - and figure out what the most cost-effective solution is. Note that if you're bottlenecking on disk I/O, it's not surprising at all that repairing ~ 100 gigs of data takes more than 24 hours. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Repair failure under 0.8.6

2011-12-04 Thread Peter Schuller
owse/CASSANDRA-3483 is done. The patch attached to that ticket should work for 0.8.6 I suspect (but no guarantees). This also assumes you have no reads running against the cluster. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Repair failure under 0.8.6

2011-12-04 Thread Peter Schuller
er at all, outside > the > maintenance like the repair. Ok. So what I'm getting at then is that there may be real legitimate connectivity problems that you aren't noticing in any other way since you don't have active traffic to the cluster. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Repair failure under 0.8.6

2011-12-04 Thread Peter Schuller
allow you to rule that out (or not). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Repair failure under 0.8.6

2011-12-03 Thread Peter Schuller
Filed https://issues.apache.org/jira/browse/CASSANDRA-3569 to fix it so that streams don't die due to conviction. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Repair failure under 0.8.6

2011-12-03 Thread Peter Schuller
e exception you're seeing should be indicative that it really was considered Down by the node. You might grep the log for references to the node in question (UP or DOWN) to confirm. The question is why though. I would check if the node has maybe automatically restarted, or went into full GC, e

Re: Decommission a node causing high IO usage on 2 other nodes

2011-11-29 Thread Peter Schuller
the two "old" nodes affected by decommissioning node N. (Unless I'm tripping myself up somewhere now...) -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Pending ReadStage is exploding on only one node

2011-11-24 Thread Peter Schuller
completely disk bound, and that'll show up as a huge amount of pending ReadStage. "iostat -x -k 1" should confirm it. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Local quorum reads

2011-11-18 Thread Peter Schuller
> No it's not just the cli tool, our app has the same issue coming back with > read issues. You are supposed to not be able to read it. But you should be getting a proper error, not an empty result. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
