Re: how does compaction_throughput_kb_per_sec affect disk io?

2011-09-26 Thread Yan Chunlu
okay, thanks! On Mon, Sep 26, 2011 at 10:38 PM, Jonathan Ellis wrote: > compaction throughput doesn't affect flushing or reads > > On Mon, Sep 26, 2011 at 7:40 AM, Yan Chunlu wrote: > > I am using the default 16MB when running repair. but the disk io is still > > quite high: > > Device:

Re: Update of column sometimes takes 10 seconds

2011-09-26 Thread aaron morton
Try turning up the logging to DEBUG and watch the requests come through. Check that the two inserts do indeed have different timestamps. In cases of lost updates, timestamps are most often the cause of the kerfuffle. btw, in this situation the commit log is a red herring riding a scapegoat. D

Re: reverse range query performance

2011-09-26 Thread aaron morton
It does not matter too much, but are you looking to get all the columns for some known keys (get_slice, multiget_slice)? Or are you getting the columns for keys within a range (get_range_slices)? If you do a reversed query the server will skip to the "end" of the column range. Here is some
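To make the reversed-slice behaviour concrete, here is a minimal sketch in plain Python. It only models the ordering semantics of a slice over sorted column names; it is not Cassandra or Thrift client code. The function name `get_slice` and its parameters are hypothetical stand-ins, and the 1 to 1500 column names are borrowed from the original question in this thread. The assumption modelled is that in a reversed slice `start` is the upper bound and `finish` the lower bound, with results returned in descending order.

```python
from bisect import bisect_left, bisect_right


def get_slice(names, start, finish, reversed_=False, count=100):
    """Return up to `count` names from a sorted list, modelling a column slice.

    Hypothetical helper: models result ordering only, not server-side cost.
    """
    if reversed_:
        # Assumption: in a reversed slice, `start` is the upper bound
        # and `finish` is the lower bound.
        lo = bisect_left(names, finish) if finish is not None else 0
        hi = bisect_right(names, start) if start is not None else len(names)
        return names[lo:hi][::-1][:count]
    lo = bisect_left(names, start) if start is not None else 0
    hi = bisect_right(names, finish) if finish is not None else len(names)
    return names[lo:hi][:count]


columns = list(range(1, 1501))  # column names 1..1500, as in the question

# Open-ended reversed slice: begins at the "end" of the column range.
print(get_slice(columns, None, None, reversed_=True, count=3))
# Bounded reversed slice: start (1000) is greater than finish (990).
print(get_slice(columns, 1000, 990, reversed_=True, count=5))
```

In this model the first call yields the three highest column names and the second walks down from 1000, which is the ordering a reversed query is expected to produce.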

Re: Thrift CPU Usage

2011-09-26 Thread Jeremiah Jordan
Yes. All the stress tool does is flood data through the API, no real processing or anything happens. So thrift reading/writing data should be the majority of the CPU time... On 09/26/2011 08:32 AM, Baskar Duraikannu wrote: Hello - I have been running read tests on Cassandra using "stress" t

Re: Thrift CPU Usage

2011-09-26 Thread Baskar Duraikannu
Aaron, from the CPU samples report. Here are the relevant parts of the report (-Xrunhprof:cpu=samples, depth=4). TRACE 300668: java.net.SocketInputStream.socketRead0(SocketInputStream.java:Unknown line) java.net.SocketInputStream.read(SocketInputStream.java:129) org.a

Re: SCF column comparator

2011-09-26 Thread aaron morton
it's the other way around… row-key: super-column: (sub)column: When using Create Column Family in the CLI: key_validation_class applies to the row key comparator applies to the super column (when using a Super Column Family) subcomparator applied

Re: Surgecon Meetup?

2011-09-26 Thread Wilson Mar
I'll be there from Tuesday night thru the weekend. - wilson...@gmail.com, 310.320-7878 On Mon, Sep 26, 2011 at 4:55 PM, Dan Kuebrich wrote: > I'll be at Surge on Thursday, would love to meet up.  Anyone else planning > to be there? > > On Sun, Sep 25, 2011 at 7:27 PM, Chris Burroughs > wrote:

Re: Surgecon Meetup?

2011-09-26 Thread Dan Kuebrich
I'll be at Surge on Thursday, would love to meet up. Anyone else planning to be there? On Sun, Sep 25, 2011 at 7:27 PM, Chris Burroughs wrote: > Surge [1] is scalability focused conference in late September hosted in > Baltimore. It's a pretty cool conference with a good mix of > operationally

Re: Thrift CPU Usage

2011-09-26 Thread aaron morton
How are you deciding what is thrift ? Thrift is used to handle connections and serialize / de-serialize off the wire. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 27/09/2011, at 2:32 AM, Baskar Duraikannu wrote: > Hello

Re: Token != DecoratedKey assertion

2011-09-26 Thread aaron morton
Looks like a mismatch between the key the index says should be at a certain position in the data file and the key that is actually there. I've not checked but scrub *may* fix this. Try it and see. (repair is for repairing consistency between nodes, scrub fixes local issues with data.)

RE: Update of column sometimes takes 10 seconds

2011-09-26 Thread Rick Whitesel (rwhitese)
Thank you to all for the quick response. The test that fails is doing an insert, another insert (to update data) and then a get to validate. If I make multiple copies of the same test and execute them in succession, different copies will fail on successive runs. Each test only has a single get, so o

Re: Update of column sometimes takes 10 seconds

2011-09-26 Thread Jonathan Ellis
Sounds a lot like this to me: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Updates-lost-td6739612.html On Mon, Sep 26, 2011 at 1:55 PM, Rick Whitesel (rwhitese) < rwhit...@cisco.com> wrote: > Hi All: > > We have a simple junit test that inserts a column, immediat

GC for ParNew on 0.8.6

2011-09-26 Thread Philippe
Ever since upgrading to 0.8.6, my nodes' system.log is littered with GCInspector logs such as these INFO [ScheduledTasks:1] 2011-09-26 21:23:40,468 GCInspector.java (line 122) GC for ParNew: 209 ms for 1 collections, 4747932608 used; max is 16838033408 INFO [ScheduledTasks:1] 2011-09-26 21:23:43,7
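As a rough aid for reading lines like the one quoted above, the following sketch pulls the collector name, pause time, and heap figures out of a GCInspector message. The regular expression is an assumption based only on the sample line in this post, and `parse_gc_line` is a hypothetical helper, not part of Cassandra.

```python
import re

# Pattern inferred from the sample log line only; other versions may differ.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<pause_ms>\d+) ms for "
    r"(?P<collections>\d+) collections, (?P<used>\d+) used; max is (?P<max>\d+)"
)


def parse_gc_line(line):
    """Return the parsed fields as a dict, or None if the line does not match."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    fields = {k: (v if k == "collector" else int(v)) for k, v in m.groupdict().items()}
    # Fraction of the maximum heap reported as in use.
    fields["heap_fraction"] = fields["used"] / fields["max"]
    return fields


sample = (
    "INFO [ScheduledTasks:1] 2011-09-26 21:23:40,468 GCInspector.java (line 122) "
    "GC for ParNew: 209 ms for 1 collections, 4747932608 used; max is 16838033408"
)
info = parse_gc_line(sample)
print(info["collector"], info["pause_ms"], f"{info['heap_fraction']:.1%}")
```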

RE: Update of column sometimes takes 10 seconds

2011-09-26 Thread Rick Whitesel (rwhitese)
Waiting 10 seconds between the update and reading the updated data seems to always work. Not waiting the 10 seconds will cause the test to randomly pass or fail. -Rick From: Jim Ancona [mailto:j...@anconafamily.com] Sent: Monday, September 26, 2011 3:04 PM To: user@cassandra.apache.org Su

Re: Update of column sometimes takes 10 seconds

2011-09-26 Thread Jim Ancona
Do you actually see the update occur if you wait for 10 seconds (as your subject implies), or do you just see intermittent failures when running the unit test? If it's the latter, are you sure that the update has a greater timestamp than the insert? I've seen similar unit tests fail because
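The timestamp concern raised here can be illustrated with a simplified model of timestamp-based reconciliation. This is a sketch in plain Python, not Cassandra code: the `Column` class, the `reconcile` function, and the timestamp values are all hypothetical. It also assumes that a tie keeps the existing column, which is a simplification; tie handling in a real cluster may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    value: str
    timestamp: int  # client-supplied timestamp (hypothetical units)


def reconcile(current: Column, incoming: Column) -> Column:
    """Keep the write with the strictly greater timestamp (simplified model)."""
    # Assumption: on a tie the existing column is kept; real tie-breaking may differ.
    return incoming if incoming.timestamp > current.timestamp else current


# Hypothetical test data: an insert followed by two candidate updates.
insert = Column("status", "old", timestamp=1317070000000)
update_same_tick = Column("status", "new", timestamp=1317070000000)
update_later = Column("status", "new", timestamp=1317070000001)

print(reconcile(insert, update_same_tick).value)  # update does not win in this model
print(reconcile(insert, update_later).value)      # update wins
```

If a test issues the insert and the update fast enough that both carry the same timestamp, this model shows how the update can appear to be lost even though both writes succeeded.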

Update of column sometimes takes 10 seconds

2011-09-26 Thread Rick Whitesel (rwhitese)
Hi All: We have a simple junit test that inserts a column, immediately updates that column and then validates that the data updated. Cassandra is run embedded in the unit test. Sometimes the test will pass, i.e. the updated data is correct, and sometimes the test will fail. The configuration is

reverse range query performance

2011-09-26 Thread Ramesh Natarajan
Hi, I am trying to use the range query to retrieve a bunch of columns in reverse order. The API documentation has a parameter bool reversed which should return the results in reverse order. Let's say my row has about 1500 columns with column names 1 to 1500, and I query

Re: Seed nodes in cassandra.yaml can not be hostnames

2011-09-26 Thread Radim Kolar
On 26.9.2011 16:37, Jonathan Ellis wrote: The seed names should match what the seeds advertise as listen_address. I can't think of a reason host names shouldn't work, I used a DNS alias; that was probably the reason why it didn't work.

Re: how does compaction_throughput_kb_per_sec affect disk io?

2011-09-26 Thread mcasandra
I would think that compaction_throughput_kb_per_sec does have an indirect impact on disk IO. A high number, or setting it to 0, means there is no throttling on how much IO is being performed. Wouldn't it impact normal reads from disk during the time when disk IO or util is high which compaction is t

SCF column comparator

2011-09-26 Thread Sam Hodgson
Hi all, I'm trying to create a Threads SCF that will store message thread IDs in date order and I want to store the threadID => subject as the supercolumns. Please correct me if I'm wrong but my understanding of a super column family is as follows: Category: //row key Timestamp: //Colu

Re: Quick advice needed - CF backup and restore

2011-09-26 Thread Jonathan Ellis
dsh -g mycluster -c "nodetool -h localhost snapshot" see http://www.netfort.gr.jp/~dancer/software/dsh.html.en On Mon, Sep 26, 2011 at 8:36 AM, Oleg Proudnikov wrote: > Hi, > > What is the easiest way to save/backup a single column family across the > cluster > and later reload it? > > Thank yo

Re: how does compaction_throughput_kb_per_sec affect disk io?

2011-09-26 Thread Jonathan Ellis
compaction throughput doesn't affect flushing or reads On Mon, Sep 26, 2011 at 7:40 AM, Yan Chunlu wrote: > I am using the default 16MB when running repair. but the disk io is still > quite high: > Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz > avgqu-sz   await r_awa

Re: Seed nodes in cassandra.yaml can not be hostnames

2011-09-26 Thread Jonathan Ellis
The seed names should match what the seeds advertise as listen_address. I can't think of a reason host names shouldn't work, but as Peter said, using host names is a bad idea anyway. 2011/9/25 Radim Kolar : > I just discovered that using host names for seed nodes in cassandra.yaml do > not work.

Quick advice needed - CF backup and restore

2011-09-26 Thread Oleg Proudnikov
Hi, What is the easiest way to save/backup a single column family across the cluster and later reload it? Thank you very much, Oleg

Thrift CPU Usage

2011-09-26 Thread Baskar Duraikannu
Hello - I have been running read tests on Cassandra using the "stress" tool. I have been noticing that thrift seems to be taking a lot of CPU, over 70%, when I look at the "CPU samples" report. Is this normal? CPU usage seems to go down by 5 to 10% when I change the RPC from "sync" to "async". Is

how does compaction_throughput_kb_per_sec affect disk io?

2011-09-26 Thread Yan Chunlu
I am using the default 16MB when running repair. but the disk io is still quite high: Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdb 136.00 0.00 506.00 26.00 63430.00 5880.00 260.56 101.73 224.38

[ANN] Mojo's Cassandra Maven Plugin 0.8.6-1 released

2011-09-26 Thread Stephen Connolly
The Mojo team is pleased to announce the release of Mojo's Cassandra Maven Plugin version 0.8.6-1. Mojo's Cassandra Plugin is used when you want to install and control a test instance of Apache Cassandra from within your Apache Maven build. The Cassandra Plugin has the following goals. * cassa

Re: messages stopped for 3 minutes?

2011-09-26 Thread Yang
h. never mind, possibly the first 24 seconds delay was caused by GC, the GC logging was not printed in system.log, I found one line on stdout that possibly corresponds to that. I found I left out the enable parallel remark param, let me add that and retry. Thanks Yang On Mon, Sep 26, 2011

Re: Seed nodes in cassandra.yaml can not be hostnames

2011-09-26 Thread Peter Schuller
> I just discovered that using host names for seed nodes in cassandra.yaml do > not work. This is done on purpose? I believe so yes, to avoid relying on DNS to map correctly given that everything else is based on IP address. (IIRC, someone chime in if there is a different reason.) -- / Peter Sch