Re: GC options

2010-04-13 Thread Benjamin Black
Got it, thanks 2010/4/13 Peter Schüller : >> FYI, G1 has been in 1.6 since u14. > > Yes, but (last time I checked) in a considerably older form. The JDK > 1.7 one is more mature. > > -- > / Peter Schuller aka scode >

Re: GC options

2010-04-13 Thread Peter Schüller
> FYI, G1 has been in 1.6 since u14. Yes, but (last time I checked) in a considerably older form. The JDK 1.7 one is more mature. -- / Peter Schuller aka scode

Re: GC options

2010-04-13 Thread Benjamin Black
FYI, G1 has been in 1.6 since u14. 2010/4/13 Peter Schüller : >> I'm working on getting our latency as consistent as possible, and the gc >> likes to kick off 60+ms periods of unavailability for a node, which for my >> application leads to a reasonable number of timed out requests. Outside of >

Re: Starting Cassandra Fauna

2010-04-13 Thread Nirmala Agadgar
Hi, can anyone please list the steps to install and run Cassandra on CentOS? That would help me follow along, check what I missed, and get it running correctly. Also, if I want to insert some data programmatically, where do I need to place the code in Fauna? Can anyone help me with this? On Mon, Apr 12, 2010 at 10:3

Re: GC options

2010-04-13 Thread Peter Schüller
> I'm working on getting our latency as consistent as possible, and the gc > likes to kick off 60+ms periods of unavailability for a node, which for my > application leads to a reasonable number of timed out requests. Outside of > the gc event, we get good responses. > > I'm happy with reduced t

Re: [RELEASE] 0.6.0

2010-04-13 Thread Lu Ming
Cheers! From: Donyee Sent: Tuesday, April 13, 2010 11:10 PM To: user@cassandra.apache.org Subject: Re: [RELEASE] 0.6.0 Great! 2010/4/13 Eric Evans I'm pleased to announce the official release of Apache Cassandra 0.6.0, (aka The Best Evar). Get yours now from the usual place[1], and a

Re: EC2, XFS, and ec2-consistent-snapshot with Cassandra

2010-04-13 Thread Scott White
> I've implemented this with MySQL before, and it worked extremely well > (miles beyond mysqldump or mysqlhotcopy). On a given node, you sacrifice a > short period of availability (less than 0.5 seconds) to get a full, > consistent snapshot of your EBS volume that can be sent off to S3 in the > ba

Re: Caching is a full row?

2010-04-13 Thread Paul Prescod
On Tue, Apr 13, 2010 at 5:26 PM, Rob Coli wrote: > On 4/13/10 5:04 PM, Paul Prescod wrote: >> >> Am I correct in my understanding that the unit of caching (and >> fetching from disk?) is a full row? > > Cassandra has both a Key and a Row cache. Unfortunately there appears to be > no current wiki d

Re: Caching is a full row?

2010-04-13 Thread Rob Coli
On 4/13/10 5:04 PM, Paul Prescod wrote: Am I correct in my understanding that the unit of caching (and fetching from disk?) is a full row? Cassandra has both a Key and a Row cache. Unfortunately there appears to be no current wiki doc describing them. If you are looking into the topic, wiki u

Caching is a full row?

2010-04-13 Thread Paul Prescod
Am I correct in my understanding that the unit of caching (and fetching from disk?) is a full row? cf = cacheRow(filter.key); Also, that the thing that is cached is just the top level columns? cached = getTopLevelColumns(new IdentityQueryFilter(key, new QueryPath(colu

EC2, XFS, and ec2-consistent-snapshot with Cassandra

2010-04-13 Thread Ben Standefer
On EC2, it is common and recommended ( http://developer.amazonwebservices.com/connect/entry.jspa?categoryID=100&externalID=1663) to use XFS's freeze/thaw functionality to create near-online snapshots of an EBS volume for MySQL snapshots. "Besides being a stable, modern, high performance, journalin

Re: Silly Question

2010-04-13 Thread Christian Torres
Any idea how, from within an app, I can find out how many records/rows (news items, for example) I have? On Tue, Apr 13, 2010 at 4:59 PM, Eric Evans wrote: > On Tue, 2010-04-13 at 16:56 -0600, Christian Torres wrote: > > Sorry! Can I from Cassandra Cli get all the rows? > > No, sorry. > > -- > Eric Evans > eev...@r

Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
I notice that the documentation on the read path is quite compressed on this page: * http://wiki.apache.org/cassandra/ArchitectureOverview What is the best documentation of the read path? I'm also curious about the granularity and policies around caching. Paul Prescod

Re: Silly Question

2010-04-13 Thread Eric Evans
On Tue, 2010-04-13 at 16:56 -0600, Christian Torres wrote: > Sorry! Can I from Cassandra Cli get all the rows? No, sorry. -- Eric Evans eev...@rackspace.com

Re: Silly Question

2010-04-13 Thread Christian Torres
Sorry! Can I get all the rows from the Cassandra CLI? On Tue, Apr 13, 2010 at 4:47 PM, Christian Torres wrote: > It's true! Thx. It works > > > On Tue, Apr 13, 2010 at 4:40 PM, Tupshin Harper wrote: > >> On 4/13/2010 3:39 PM, Christian Torres wrote: >> >> Maybe some other people have asked this but

Re: GC options

2010-04-13 Thread Benjamin Black
Which version of the JVM are you using? Recent builds (u18, u19) have significantly improved GC. On Tue, Apr 13, 2010 at 3:51 PM, Daniel Kluesing wrote: > Has anyone done any tuning on the jvm gc options or are the options included > in bin/cassandra.in.sh basically the best choice? > > I'm wor

GC options

2010-04-13 Thread Daniel Kluesing
Has anyone done any tuning on the jvm gc options or are the options included in bin/cassandra.in.sh basically the best choice? I'm working on getting our latency as consistent as possible, and the gc likes to kick off 60+ms periods of unavailability for a node, which for my application leads t

Re: Silly Question

2010-04-13 Thread Christian Torres
It's true! Thx. It works On Tue, Apr 13, 2010 at 4:40 PM, Tupshin Harper wrote: > On 4/13/2010 3:39 PM, Christian Torres wrote: > > Maybe some other people have asked this but I'm testing php to use with > cassandra and the error the example here gave me was: > > [link] http://wiki.apache.org/

Re: Silly Question

2010-04-13 Thread Tupshin Harper
On 4/13/2010 3:39 PM, Christian Torres wrote: Maybe some other people have asked this but I'm testing php to use with cassandra and the error the example here gave me was: [link] http://wiki.apache.org/cassandra/ThriftExamples03 *Bad type in structure.* I think is because I'm using the new ve

Silly Question

2010-04-13 Thread Christian Torres
Maybe some other people have asked this but I'm testing php to use with cassandra and the error the example here gave me was: [link] http://wiki.apache.org/cassandra/ThriftExamples03 *Bad type in structure.* I think it is because I'm using the new version 0.6 -- Christian Torres * Desarrollador W

Re: BMT flush on windows?

2010-04-13 Thread Sonny Heer
Got it, thanks! On Tue, Apr 13, 2010 at 3:12 PM, Jonathan Ellis wrote: > you have three options > > (a) connect with jconsole or another jmx client and invoke flush that way > (b) run org.apache.cassandra.tools.NodeCmd manually > (c) write a bat file for NodeCmd like the nodetool shell script in

Re: BMT flush on windows?

2010-04-13 Thread Jonathan Ellis
you have three options (a) connect with jconsole or another jmx client and invoke flush that way (b) run org.apache.cassandra.tools.NodeCmd manually (c) write a bat file for NodeCmd like the nodetool shell script in bin/ On Tue, Apr 13, 2010 at 5:08 PM, Sonny Heer wrote: > Is there any way to ru

BMT flush on windows?

2010-04-13 Thread Sonny Heer
Is there any way to run a keyspace flush on a windows box?

Re: batch_mutate silently failing

2010-04-13 Thread Lee Parker
nevermind. I figured out what the problem was. I was not putting the column inside a ColumnOrSuperColumn container. Lee Parker l...@spredfast.com [image: Spredfast] On Tue, Apr 13, 2010 at 4:19 PM, Lee Parker wrote: > I upgraded my dev environment to 0.6.0 today in expectation of upgrading >
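The fix described above (wrapping each column in a ColumnOrSuperColumn before handing it to batch_mutate) can be sketched roughly as follows. This is only an illustration in Python rather than PHP: the classes are hypothetical stand-ins for the Thrift-generated structs, and the field names and the nested row key -> column family -> mutation list layout are assumptions based on my reading of the 0.6 Thrift interface, not code from this thread.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


# Hypothetical stand-ins for the Thrift-generated structs (assumed field names).
@dataclass
class Column:
    name: bytes
    value: bytes
    timestamp: int


@dataclass
class ColumnOrSuperColumn:
    column: Optional[Column] = None
    super_column: Optional[object] = None


@dataclass
class Mutation:
    column_or_supercolumn: Optional[ColumnOrSuperColumn] = None
    deletion: Optional[object] = None


def insert_mutation(name: bytes, value: bytes, timestamp: int) -> Mutation:
    """Wrap the column in its container before placing it in a Mutation."""
    wrapped = ColumnOrSuperColumn(column=Column(name, value, timestamp))
    return Mutation(column_or_supercolumn=wrapped)


def build_mutation_map(
    row_key: str, column_family: str, columns: Dict[bytes, bytes], timestamp: int
) -> Dict[str, Dict[str, List[Mutation]]]:
    """Build the assumed row key -> column family -> list of mutations layout."""
    mutations = [insert_mutation(n, v, timestamp) for n, v in columns.items()]
    return {row_key: {column_family: mutations}}
```

Putting a bare Column where the container is expected is the kind of mistake the message describes; here the wrapping lives in one helper so it cannot be skipped.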

batch_mutate silently failing

2010-04-13 Thread Lee Parker
I upgraded my dev environment to 0.6.0 today in expectation of upgrading our prod environment soon. I am trying to rewrite some of our code to use batch_mutate with the Thrift PHP library directly. I'm not getting any result back, not even an exception or failure message, but the data is never sh

Re: [RELEASE] 0.6.0

2010-04-13 Thread Lee Parker
Awesome. This is greatly appreciated. Lee Parker l...@spredfast.com [image: Spredfast] On Tue, Apr 13, 2010 at 3:54 PM, Eric Evans wrote: > On Tue, 2010-04-13 at 12:12 -0500, Eric Evans wrote: > > > Any chance of an updated debian package? > > > > Yes. RSN. > > I leaned into it. An updated pac

Re: Recovery from botched compaction

2010-04-13 Thread Anthony Molinaro
On Tue, Apr 13, 2010 at 10:54:51AM -0500, Jonathan Ellis wrote: > On Sat, Apr 10, 2010 at 2:24 PM, Anthony Molinaro > wrote: > >  This is sort of a pre-emptive question as the compaction I'm doing hasn't > > failed yet but I expect it to any time now.  I have a cluster which has been > > storing

Re: [RELEASE] 0.6.0

2010-04-13 Thread Eric Evans
On Tue, 2010-04-13 at 12:12 -0500, Eric Evans wrote: > > Any chance of an updated debian package? > > Yes. RSN. I leaned into it. An updated package has been uploaded to the Cassandra repo (see: http://wiki.apache.org/cassandra/DebianPackaging). Enjoy. -- Eric Evans eev...@rackspace.com

Re: Worst case #iops to read a row

2010-04-13 Thread Jonathan Ellis
On Tue, Apr 13, 2010 at 1:55 PM, Paul Prescod wrote: > What do you mean by "bad practice"? The document above implies that it > is nearly impossible. It implies that you will have between 1 and 4 > SSTables. Does the administrator have a choice in this matter? You can tune the 4 number via JMX (p

Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
On Tue, Apr 13, 2010 at 12:00 PM, Benjamin Black wrote: >> I am probably being totally naive, but is the answer to the question >> "worst iops on read" just: >> >>  3 reads per SSTable * 4 SStables * ReplicationFactor ? >> >> = 3 * 4 * 3 = 36? >> > > Why does RF enter this? A simplistic model for

Re: Tragedy: A high-level Cassandra Object Abstraction for Python

2010-04-13 Thread Paul Bohm
Ok, will add it. I originally tried to build upon both lazyboy and pycassa, but started from scratch when the abstraction didn't feel right both times. The only part of pycassa that I use now is the socket/reconnect code, which does exactly what's needed for now. I'm however planning to switch it

Re: Worst case #iops to read a row

2010-04-13 Thread Benjamin Black
On Tue, Apr 13, 2010 at 11:55 AM, Paul Prescod wrote: > > What do you mean by "bad practice"? The document above implies that it > is nearly impossible. It implies that you will have between 1 and 4 > SSTables. Does the administrator have a choice in this matter? > Hey, I am arguing the proposed

Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
On Tue, Apr 13, 2010 at 11:52 AM, Scott White wrote: > >... > > Agreed. Kind of sorry to see Scott White and Benjamin Black being in agreement, but I guess that's the way yin and yang works. Opposition is illusory in any case. Paul Prescod

Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
On Tue, Apr 13, 2010 at 11:31 AM, Benjamin Black wrote: > ... > How frequently do you want to write SSTables?  How much memory do you > want Memtables to consume?  How long do you want to wait between > Memtable flushes?  There is an entire wiki page on  Memtable tuning: > http://wiki.apache.org/c

Re: Tragedy: A high-level Cassandra Object Abstraction for Python

2010-04-13 Thread Jonathan Ellis
You should add it to http://wiki.apache.org/cassandra/ClientOptions (click Login to edit). Does it use pycassa under the hood, out of curiosity? On Tue, Apr 13, 2010 at 1:50 PM, Paul Bohm wrote: > Hey everyone, > > I've lately been working on Tragedy, a (yet another) Cassandra object > abstracti

Re: Worst case #iops to read a row

2010-04-13 Thread Scott White
> Do you understand you are assuming there have been no compactions, > which would be extremely bad practice given this number of SSTables? > A major compaction, as would be best practice given this volume, would > result in 1 SSTable per CF per node. One. Similarly, you are > assuming the update

Tragedy: A high-level Cassandra Object Abstraction for Python

2010-04-13 Thread Paul Bohm
Hey everyone, I've lately been working on Tragedy, a (yet another) Cassandra object abstraction for Python. While there clearly is no shortage of high quality Python Cassandra libraries out there already, Tragedy has a different enough design that you might still find it useful. The README contai

Re: Worst case #iops to read a row

2010-04-13 Thread Benjamin Black
On Tue, Apr 13, 2010 at 11:31 AM, Paul Prescod wrote: > I am just checking math, not model. > > On Tue, Apr 13, 2010 at 10:48 AM, Time Less wrote: > >> >> numRowsOnNode = 10B / 20 = 500M. > > 50 million > 10B / 20 is 500M. The rest of the analysis from our pseudonymous friend remains faulty.
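A quick check of the division disputed in this exchange, as a minimal sketch. The function and constant names are illustrative and not from the thread, and this verifies only the arithmetic, not the model under discussion.

```python
def rows_per_node(total_rows: int, divisor: int) -> int:
    """Integer division of the total row count by the divisor used in the thread."""
    return total_rows // divisor


TOTAL_ROWS = 10 * 10**9  # "10B" rows
DIVISOR = 20             # the divisor quoted as "10B / 20"

print(rows_per_node(TOTAL_ROWS, DIVISOR))  # 500000000, i.e. 500M
```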

Re: Worst case #iops to read a row

2010-04-13 Thread Benjamin Black
On Tue, Apr 13, 2010 at 10:48 AM, Time Less wrote: > > >> > If I have 10B rows in my CF, and I can fit 10k rows per >> > SStable, and the SStables are spread across 5 nodes, and I have 1 bloom The error you are making is in thinking the Memtable thresholds are the SSTable limits. They are not.

Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
I am just checking math, not model. On Tue, Apr 13, 2010 at 10:48 AM, Time Less wrote: > > numRowsOnNode = 10B / 20 = 500M. 50 million > replicationFactor = 3. > rowsPerSStable = 128MB / 1K = 131k. > > Therefore worst-case iops per read on this cluster are: > (500M * 3 / 131k) * 3 = 150M / 131

Re: [RELEASE] 0.6.0

2010-04-13 Thread Eric Evans
On Tue, 2010-04-13 at 10:43 -0700, Ned Wolpert wrote: > Is 0.6.0 a repackage of 0.6.0rc1? If we're running 0.6.0rc1 do we need > to upgrade? Yes (repackage), and no (you don't need to upgrade). -- Eric Evans eev...@rackspace.com

Reading thousands of columns

2010-04-13 Thread James Golick
Hi All, I'm seeing about 35-50ms to read 1000 columns from a CF using get_range_slices. The columns are TimeUUIDType with empty values. The row cache is enabled and I'm running the query 500 times in a row, so I can only assume the row is cached. Is that about what's expected or am I doing somet

Re: Worst case #iops to read a row

2010-04-13 Thread Time Less
> If I have 10B rows in my CF, and I can fit 10k rows per > > SStable, and the SStables are spread across 5 nodes, and I have 1 bloom > > filter false positive and 1 tombstone and ask the wrong node for the key, > > then: > > > > Mv = (((2B/10k)+1+1)*3)+1 == ((200,000)+2)*3+1 == 300,007 iops to rea
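The quoted expression can be evaluated directly. This is a sketch under stated assumptions: the parameter names are my own reading of the terms (one bloom filter false positive, one tombstone, a factor of 3 per SSTable, one extra hop for asking the wrong node), and it checks only the arithmetic, not whether the model itself is sound.

```python
def worst_case_iops(
    rows_on_node: int,
    rows_per_sstable: int,
    false_positives: int = 1,
    tombstones: int = 1,
    reads_per_sstable: int = 3,
    wrong_node_hops: int = 1,
) -> int:
    """Evaluate (((rows_on_node / rows_per_sstable) + 1 + 1) * 3) + 1 as written."""
    sstables = rows_on_node // rows_per_sstable
    return (sstables + false_positives + tombstones) * reads_per_sstable + wrong_node_hops


ROWS_ON_NODE = 2 * 10**9   # "2B": 10B rows spread across 5 nodes
ROWS_PER_SSTABLE = 10_000  # "10k" rows per SSTable

print(worst_case_iops(ROWS_ON_NODE, ROWS_PER_SSTABLE))  # 600007
```

Taken literally, the expression gives (200,000 + 2) * 3 + 1 = 600,007, which does not match the 300,007 in the quoted text.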

Re: [RELEASE] 0.6.0

2010-04-13 Thread Jordan Pittier
For those who can't wait : http://perso.rezel.net/cassandra_0.6.0-1_all.deb md5sum is 6dd71e18e1e0239e50302098d395536e Based on https://svn.apache.org/repos/asf/cassandra/tags/cassandra-0.6.0/ On Tue, Apr 13, 2010 at 7:43 PM, Ned Wolpert wrote: > Is 0.6.0 a repackage of 0.6.0rc1? If we're running

Re: [RELEASE] 0.6.0

2010-04-13 Thread Ned Wolpert
Is 0.6.0 a repackage of 0.6.0rc1? If we're running 0.6.0rc1 do we need to upgrade? On Tue, Apr 13, 2010 at 10:12 AM, Eric Evans wrote: > On Tue, 2010-04-13 at 11:14 -0500, Lee Parker wrote: > > Any chance of an updated debian package? > > Yes. RSN. > > -- > Eric Evans > eev...@rackspace.com > >

Re: [RELEASE] 0.6.0

2010-04-13 Thread Eric Evans
On Tue, 2010-04-13 at 11:14 -0500, Lee Parker wrote: > Any chance of an updated debian package? Yes. RSN. -- Eric Evans eev...@rackspace.com

Re: [RELEASE] 0.6.0

2010-04-13 Thread Jonathan Ellis
Compared to what? 0.5? It's still pure java and runs fine on Windows, the cassandra.bat is somewhat improved, and there are still no bat files for nodetool and the other utilities. On Tue, Apr 13, 2010 at 11:51 AM, dir dir wrote: > Any change or update in the new version (especially for the im

Re: [RELEASE] 0.6.0

2010-04-13 Thread dir dir
Any change or update in the new version (especially for the implementation in the Microsoft Windows)?? Thanks. Dir. On Tue, Apr 13, 2010 at 11:14 PM, Lee Parker wrote: > Any chance of an updated debian package? > > Lee Parker > l...@spredfast.com > > [image: Spredfast] > On Tue, Apr 13, 2010 a

Re: Internal error

2010-04-13 Thread Jonathan Ellis
On Tue, Apr 13, 2010 at 11:23 AM, Scott White wrote: > ERROR [pool-1-thread-60] 2010-04-13 04:58:11,806 Cassandra.java (line 1492) > Internal error processing insert internal error means "check the server log for the stacktrace" > Also a side question I was curious about, is the state of a boost

Internal error

2010-04-13 Thread Scott White
We started seeing for the first time errors that look like this last night after things were running smoothly for a couple days after switching to 0.6-rc1: ERROR [pool-1-thread-60] 2010-04-13 04:58:11,806 Cassandra.java (line 1492) Internal error processing insert We had not made any other change

Re: [RELEASE] 0.6.0

2010-04-13 Thread Lee Parker
Any chance of an updated debian package? Lee Parker l...@spredfast.com [image: Spredfast] On Tue, Apr 13, 2010 at 10:10 AM, Donyee wrote: > Great! > > 2010/4/13 Eric Evans > > >> I'm pleased to announce the official release of Apache Cassandra 0.6.0, >> (aka The Best Evar). Get yours now from

Re: Recovery from botched compaction

2010-04-13 Thread Jonathan Ellis
On Sat, Apr 10, 2010 at 2:24 PM, Anthony Molinaro wrote: >  This is sort of a pre-emptive question as the compaction I'm doing hasn't > failed yet but I expect it to any time now.  I have a cluster which has been > storing user profile data for a client.  Recently I've had to go back and > reload

Re: [RELEASE] 0.6.0

2010-04-13 Thread Donyee
Great! 2010/4/13 Eric Evans > > I'm pleased to announce the official release of Apache Cassandra 0.6.0, > (aka The Best Evar). Get yours now from the usual place[1], and as > always, be sure to read the release notes[2] (especially if you're > upgrading). > > Since this is our first official rel

[RELEASE] 0.6.0

2010-04-13 Thread Eric Evans
I'm pleased to announce the official release of Apache Cassandra 0.6.0, (aka The Best Evar). Get yours now from the usual place[1], and as always, be sure to read the release notes[2] (especially if you're upgrading). Since this is our first official release since graduating to a top-level projec

Re: RE : Re: RE : Re: Two dimensional matrices

2010-04-13 Thread Eric Evans
On Tue, 2010-04-13 at 16:05 +0200, Philippe wrote: > I'm confused : don't range queries such as the ones we've been > discussing require using an ordered partitioner ? Alright, so distribution depends on your choice of token. -- Eric Evans eev...@rackspace.com

RE : Re: RE : Re: Two dimensional matrices

2010-04-13 Thread Philippe
I'm confused: don't range queries such as the ones we've been discussing require using an ordered partitioner? On 13 Apr 2010 at 15:58, "Eric Evans" wrote: On Tue, 2010-04-13 at 08:57 +0200, Philippe wrote: > Okay so if i switch columns and super columns i... Sure. > Assuming this is all co

Re: RE : Re: Two dimensional matrices

2010-04-13 Thread Eric Evans
On Tue, 2010-04-13 at 08:57 +0200, Philippe wrote: > Okay so if i switch columns and super columns in my model i get what i > want > don't i? > > Super column = x > Column = time frame > Now i can get 2d range extracts from the grid and every cell will > contain all time frame data. Is this correc

Re: Off line client nodes?

2010-04-13 Thread Colin Yates
Thanks for all your advice - I am evaluating couchDB now. I can see that each is designed for different use-cases and couchDB is optimal for this. Many thanks all. Col Colin Yates gmail.com> writes: