Re: slides for "Testing out a slab allocator for Cassandra to reduce GC promotion failures by @stuhood "?

2011-08-25 Thread Ryan King
(on Hbase though, ) > http://www.cloudera.com/blog/2011/03/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-3/ The cassandra implementation is somewhat similar. -ryan > > thanks > > On Thu, Aug 25, 2011 at 10:01 AM, Ryan King wrote: >> On Thu, Aug 25

Re: slides for "Testing out a slab allocator for Cassandra to reduce GC promotion failures by @stuhood "?

2011-08-25 Thread Ryan King
On Thu, Aug 25, 2011 at 9:33 AM, Yang wrote: > http://twitoaster.com/country-us/lenn0x/testing-out-a-slab-allocator-for-cassandra-to-reduce-gc-promotion-failures-by-stuhood-cassandra-memtables-gc-cc-jointheflock/ > > hi:  I'm interested in learning more about the slaballocator, anyone > has a copy

Re: Customized Secondary Index Schema

2011-08-24 Thread Ryan King
On Tue, Aug 23, 2011 at 10:03 AM, Alvin UW wrote: > Hello, > > As mentioned by Ed Anuff in his blog and slides, one way to build customized > secondary index is: > We use one CF, each row to represent a secondary index, with the secondary > index name as row key. > For example, > > Indexes = { > "

Re: Memory overhead of vector clocks…. how often are they pruned?

2011-08-24 Thread Ryan King
TSiAKU5vXoDA&sa=X&oi=book_result&ct=result&resnum=1&ved=0CBkQ6AEwAA#v=onepage&q&f=false >> >> … so… are 'timestamps' pruned? >> >> Even this mechanism seems like it will dominate the amount of memory used in >> Cassandra.  I co

Re: Memory overhead of vector clocks…. how often are they pruned?

2011-08-24 Thread Ryan King
On Tue, Aug 23, 2011 at 7:58 PM, Kevin Burton wrote: > I had a thread going the other day about vector clock memory usage and that > it is a series of (clock id, clock):ts and the ability to prune old entries > … I'm specifically curious here how often old entries are pruned. > > If you're storin

Re: Avoid Simultaneous Minor Compactions?

2011-08-21 Thread Ryan King
You should throttle your compactions to a sustainable level. -ryan On Sun, Aug 21, 2011 at 10:22 PM, Hefeng Yuan wrote: > We just noticed that at one time, 4 nodes were doing minor compaction > together, each of them took 20~60 minutes. > We're on 0.8.1, 6 nodes, RF5. > This simultaneous compac

Re: cassandra server disk full

2011-08-03 Thread Ryan King
, Aug 2, 2011 at 9:27 AM, Jim Ancona wrote: > On Mon, Aug 1, 2011 at 6:12 PM, Ryan King wrote: >> On Fri, Jul 29, 2011 at 12:02 PM, Chris Burroughs >> wrote: >>> On 07/25/2011 01:53 PM, Ryan King wrote: >>>> Actually I was wrong– our patch will disable gos

Re: Question about eventually consistent in Cassandra

2011-08-03 Thread Ryan King
On Wed, Aug 3, 2011 at 10:09 AM, mcasandra wrote: > What happens when DC is in different time zone so 9:00 pacific vs 11:00 > Central Nothing. Timestamps have no knowledge of timezones, they're just offsets from an arbitrary point in the past. -ryan
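A minimal sketch of the point above, assuming timestamps are expressed as microseconds since the Unix epoch (a common client convention, but an assumption here, as are the fixed UTC offsets and the helper name micros_since_epoch). The two wall-clock times from the question name the same instant, so they map to the same number:

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for early August (daylight time); assumed for illustration.
PDT = timezone(timedelta(hours=-7))
CDT = timezone(timedelta(hours=-5))
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_since_epoch(moment):
    """Return whole microseconds between the Unix epoch and an aware datetime."""
    return (moment - EPOCH) // timedelta(microseconds=1)


# 9:00 Pacific and 11:00 Central on the same day are the same instant.
pacific = datetime(2011, 8, 3, 9, 0, tzinfo=PDT)
central = datetime(2011, 8, 3, 11, 0, tzinfo=CDT)

same_instant = micros_since_epoch(pacific) == micros_since_epoch(central)
print(same_instant)
```

The timezone only affects how the instant is displayed; the offset that gets compared is identical in both data centers, provided the clocks themselves are in sync.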

Re: cassandra server disk full

2011-08-01 Thread Ryan King
On Fri, Jul 29, 2011 at 12:02 PM, Chris Burroughs wrote: > On 07/25/2011 01:53 PM, Ryan King wrote: >> Actually I was wrong– our patch will disable gossip and thrift but >> leave the process running: >> >> https://issues.apache.org/jira/browse/CASSANDRA-2118 >>

Re: Cassandra Pig with network topology and data centers.

2011-07-29 Thread Ryan King
It'd be great if we had different settings for inter- and intra-DC read repair. -ryan On Fri, Jul 29, 2011 at 5:06 PM, Jake Luciani wrote: > Yes it's read repair you can lower the read repair chance to tune this. > > > > On Jul 29, 2011, at 6:31 PM, Aaron Griffith > wrote: > >> I currently hav

Re: Aggregation and Co-Processors

2011-07-28 Thread Ryan King
On Thu, Jul 28, 2011 at 12:08 PM, Stephen Pope wrote: > I just finished watching the video by Eric Evans on “CQL – Not just NoSQL. > It’s MoSQL”, and I heard mention of aggregation queries. He said there’s > been some talk about it, and that you guys were calling it “co-processors”. > Can somebody

Re: cassandra server disk full

2011-07-25 Thread Ryan King
Actually, I was wrong; our patch will disable gossip and thrift but leave the process running: https://issues.apache.org/jira/browse/CASSANDRA-2118 If people are interested in that, I can make sure it's up to date with our latest version. -ryan On Mon, Jul 25, 2011 at 10:07 AM, Ryan King wrote

Re: cassandra server disk full

2011-07-25 Thread Ryan King
We have a patch somewhere that will kill the node on IOErrors, since those tend to be of the class that are unrecoverable. -ryan On Thu, Jul 7, 2011 at 8:02 PM, Jonathan Ellis wrote: > Yeah, ideally it should probably die or drop into read-only mode if it > runs out of space. > (https://issues.a

Re: Strong Consistency with ONE read/writes

2011-07-12 Thread Ryan King
If you're interested in this idea, you should read up about Spinnaker: http://www.vldb.org/pvldb/vol4/p243-rao.pdf -ryan On Mon, Jul 11, 2011 at 2:48 PM, Yang wrote: > I'm not proposing any changes to be done, but this looks like a very > interesting topic for thought/hack/learning, so the follo

Re: question on capacity planning

2011-06-29 Thread Ryan King
On Wed, Jun 29, 2011 at 5:36 AM, Jacob, Arun wrote: > if I'm planning to store 20TB of new data per week, and expire all data > every 2 weeks, with a replication factor of 3, do I only need approximately > 120 TB of disk? I'm going to use ttl in my column values to automatically > expire data. Or

Re: counter question

2011-06-24 Thread Ryan King
On Fri, Jun 24, 2011 at 6:08 AM, Joseph Stein wrote: > cool > now that 0.8 is out any chance Rainbird is going to be open sourced? Not anytime soon. We're busy launching a bunch of stuff (some of which you'll hear about at CassandraSF). -ryan > if not then I guess I will be building my own Scal

Re: hinted handoff sleeping

2011-06-23 Thread Ryan King
On Thu, Jun 23, 2011 at 2:55 PM, Jeffrey Wang wrote: > Hey all, > > > > We’re running a slightly patched version of 0.7.3 on a cluster of 5 nodes. > I’ve been noticing a number of messages in our logs which look like this > (after a node goes “down” and comes back up, usually just due to a GC): >

Re: No Transactions: An Example

2011-06-23 Thread Ryan King
On Thu, Jun 23, 2011 at 2:05 PM, Les Hazlewood wrote: > Hi Dominic, > Thanks so much for providing this information.  I was unaware of Cages and > this looks like it could be used effectively for certain things. > >> This is because Cassandra uses the timestamps of columns that have been >> writte

Re: 99.999% uptime - Operations Best Practices?

2011-06-22 Thread Ryan King
On Wed, Jun 22, 2011 at 2:24 PM, Les Hazlewood wrote: > I'm planning on using Cassandra as a product's core data store, and it is > imperative that it never goes down or loses data, even in the event of a > data center failure.  This uptime requirement ("five nines": 99.999% uptime) > w/ WAN capab

Re: simple question about merged SSTable sizes

2011-06-22 Thread Ryan King
On Wed, Jun 22, 2011 at 10:00 AM, Jonathan Colby wrote: > Thanks for the explanation.  I'm still a bit "skeptical". > > So if you really needed to control the maximum size of compacted SSTables,   > you need to delete data at such a rate that the new files created by > compaction are less than or

Re: SSTable corruption blocking compaction and scrub can't fix it

2011-06-17 Thread Ryan King
Even without lsof, you should be able to get the data from /proc/$pid -ryan On Fri, Jun 17, 2011 at 5:08 AM, Dominic Williams wrote: > Unfortunately I shutdown that node and anyway lsof wasn't installed. > But $ulimit gives > unlimited > > On 17 June 2011 13:00, Sylvain Lebresne wrote: >> >> On

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread Ryan King
On Thu, Jun 16, 2011 at 2:12 PM, AJ wrote: > On 6/16/2011 2:37 PM, Ryan King wrote: >> >> On Thu, Jun 16, 2011 at 1:05 PM, AJ  wrote: > >> >>>> >>>> The Cassandra consistency model is pretty elegant and this type of >>>> approach bre

Re: compression for regular column names?

2011-06-16 Thread Ryan King
On Thu, Jun 16, 2011 at 3:41 PM, E R wrote: > Hi all, > > As a way of gaining familiarity with Cassandra I am migrating a table > that is currently stored in a relational database and mapping it into > a Cassandra column family. We add about 700,000 new rows a day to this > table, and the average

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread Ryan King
On Thu, Jun 16, 2011 at 1:05 PM, AJ wrote: > On 6/16/2011 10:58 AM, Dan Hendry wrote: >> >> I think this would add a lot of complexity behind the scenes and be >> conceptually confusing, particularly for new users. > > I'm not so sure about this.  Cass is already somewhat sophisticated and I > don

Re: snitch & thrift

2011-06-16 Thread Ryan King
On Thu, Jun 16, 2011 at 6:11 AM, Terje Marthinussen wrote: > Hi all! > Assuming a node ends up in GC land for a while, there is a good chance that > even though it performs terribly and the dynamic snitching will help you to > avoid it on the gossip side, it will not really help you much if thrift

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread Ryan King
On Thu, Jun 16, 2011 at 8:18 AM, AJ wrote: > Good morning all. > > Hypothetical Setup: > 1 data center > RF = 3 > Total nodes > 3 > > Problem: > Suppose I need maximum consistency for one critical operation; thus I > specify CL = ALL for reads.  However, this will fail if only 1 replica > endpoint

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Ryan King
There's a ticket open for this: https://issues.apache.org/jira/browse/CASSANDRA-2521. Vote on it if you think it's important. -ryan On Wed, Jun 15, 2011 at 7:34 PM, Jeffrey Kesselman wrote: > The GC cleanup approach, if depending on specific objects being GCd, > is fundamentally flawed. > > I bro

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Ryan King
There's a ticket open to address this: https://issues.apache.org/jira/browse/CASSANDRA-1974 -ryan On Wed, Jun 15, 2011 at 8:49 AM, Terje Marthinussen wrote: > > > On Thu, Jun 16, 2011 at 12:48 AM, Terje Marthinussen > wrote: >> >> Even if the gc call cleaned all files, it is not really accepta

Re: Cassandra scaling problem in virtualized environment

2011-06-14 Thread Ryan King
On Tue, Jun 14, 2011 at 8:16 AM, Schuilenga, Jan Taeke wrote: > Hi All, > > We are having issues testing Cassandra in a virtualized environment (Vmware > ESX). > Our challenge is to combine a  high number of concurrent users with a very > low maximum response time. > Immediately we ran into a prob

Re: need some help with counters

2011-06-09 Thread Ryan King
On Thu, Jun 9, 2011 at 1:06 PM, Ian Holsman wrote: > Hi Ryan. > you wouldn't have your version of cassandra up on github would you?? No, and the patch isn't in our version yet either. We're still working on it. -ryan

Re: need some help with counters

2011-06-09 Thread Ryan King
On Thu, Jun 9, 2011 at 12:41 PM, Ian Holsman wrote: > Hi. > > I had a brief look at CASSANDRA-2103 (expiring counter columns), and I was > wondering if anyone can help me with my problem. > > I want to keep some page-view stats on a URL at different levels of > granularity (page views per hour,

Re: Ideas for Big Data Support

2011-06-09 Thread Ryan King
On Thu, Jun 9, 2011 at 7:40 AM, Edward Capriolo wrote: > > > On Thu, Jun 9, 2011 at 4:23 AM, AJ wrote: >> >> [Please feel free to correct me on anything or suggest other workarounds >> that could be employed now to help.] >> >> Hello, >> >> This is purely theoretical, as I don't have a big workin

Re: multiple clusters communicating

2011-06-07 Thread Ryan King
On Mon, Jun 6, 2011 at 5:01 PM, Jeffrey Wang wrote: > Hey all, > > > > We’re seeing a strange issue in which two completely separate clusters > (0.7.3) on the same subnet (X.X.X.146 through X.X.X.150) with 3 machines > (146-148) and 2 machines (149-150). Both of them are seeded with the > respecti

Re: [RELEASE] 0.8.0

2011-06-07 Thread Ryan King
On Mon, Jun 6, 2011 at 7:00 PM, Terje Marthinussen wrote: > Yes, I am aware of it but it was not an alternative for this project which > will face production soon. > The patch I have is fairly non-intrusive (especially vs. 674) so I think it > can be interesting depending on how quickly 674 will b

Re: Multiple large disks in server - setup considerations

2011-06-07 Thread Ryan King
On Tue, Jun 7, 2011 at 4:34 AM, Erik Forsberg wrote: > On Tue, 31 May 2011 13:23:36 -0500 > Jonathan Ellis wrote: > >> Have you read http://wiki.apache.org/cassandra/CassandraHardware ? > > I had, but it was a while ago so I guess I kind of deserved an RTFM! :-) > > After re-reading it, I still w

Re: [RELEASE] 0.8.0

2011-06-06 Thread Ryan King
On Mon, Jun 6, 2011 at 6:09 AM, Terje Marthinussen wrote: > Of course I talked too soon. > I saw a corrupted commitlog some days back after killing cassandra and I > just came across a committed hints file after a cluster restart for some > config changes :( > Will look into that. > Otherwise, not

Re: rainbird question (why is the 1minute buffer needed?)

2011-05-23 Thread Ryan King
Maybe. We haven't really tested it without buffering and probably won't anytime soon. 1 minute latency is good enough for what we're doing. On Mon, May 23, 2011 at 1:58 PM, Jeremy Hanna wrote: > > On May 23, 2011, at 2:23 PM, Ryan King wrote: > >> On Mon, May 23,

Re: rainbird question (why is the 1minute buffer needed?)

2011-05-23 Thread Ryan King
On Mon, May 23, 2011 at 12:06 PM, Yang wrote: > Thanks Ryan, > > could you please share more details: according to what you observed in > testing,  why was performance  worse if you do not do extra buffering? > > I was thinking (could be wrong)  that without extra buffering, the > counter update g

Re: rainbird question (why is the 1minute buffer needed?)

2011-05-23 Thread Ryan King
On Sun, May 22, 2011 at 11:00 AM, Yang wrote: > Thanks, > > I did read through that pdf doc, and went through the counters code in > 0.8-rc2, I think I understand the logic in that code. > > in my hypothetical implementation, I am not suggesting to overstep the > complicated logic in counters code

Re: Ghost token

2011-05-13 Thread Ryan King
That's the same as the last one. The token space is a circle so the last one at the list is repeated at the top. -ryan On Fri, May 13, 2011 at 9:59 AM, Scott McPheeters wrote: > Has anyone seen this and know if it is causing an issue or how to fix > it?  Anytime I run nodetool ring (on any node)

Re: CQL, 0.8, and need for language drivers

2011-04-13 Thread Ryan King
On Tue, Apr 12, 2011 at 7:16 PM, Jeremy Hanna wrote: > As some may have heard, CQL is going to be in 0.8.  It's a level of > abstraction that will hopefully make the lives of client developers > substantially easier.  The ideal is to make it so client devs only need to do > work to make a clien

Re: Analysing hotspot gc logs

2011-04-11 Thread Ryan King
On Mon, Apr 11, 2011 at 10:35 AM, Chris Burroughs wrote: > To avoid taking my own thread [1] off on a tangent.  Does anyone have a > recommendation for a tool for graphical analysis (ie make useful graphs) > out of hotspot gc logs?  Google searches have turned up several results > along the lines

Re: problem with large batch mutation set

2011-04-07 Thread Ryan King
On Wed, Apr 6, 2011 at 11:49 PM, Ross Black wrote: > Hi, > > I am using the thrift client batch_mutate method with Cassandra 0.7.0 on > Ubuntu 10.10. > > When the size of the mutations gets too large, the client fails with the > following exception: > > Caused by: org.apache.thrift.transport.TTr

Re: RTG/MRTG/Cricket replacement using Cassandra?

2011-03-31 Thread Ryan King
We have a solution for time series data on cassandra at Twitter that we'd like to open source, but it requires 0.8/trunk so we're not going to release it until that's stable. See http://www.slideshare.net/kevinweil/rainbird-realtime-analytics-at-twitter-strata-2011 -ryan On Thu, Mar 31, 2011 at

Re: stress.py bug?

2011-03-21 Thread Ryan King
On Mon, Mar 21, 2011 at 9:34 AM, pob wrote: > You mean, > more threads in stress.py? The purpose was figure out whats the > biggest bandwidth that C* can use. You should try more threads, but at some point you'll hit diminishing returns there. You may need to drive load from more than one host.

Re: stress.py bug?

2011-03-21 Thread Ryan King
On Mon, Mar 21, 2011 at 4:02 AM, pob wrote: > Hi, > I'm inserting data from client node with stress.py to cluster of 6 nodes. > They are all on 1Gbps network, max real throughput of network is 930Mbps > (after measurement). > python stress.py -c 1 -S 17  -d{6nodes}  -l3 -e QUORUM >  --operatio

Re: reduced cached mem; resident set size growth

2011-03-20 Thread Ryan King
The test was inconclusive because we decommissioned that cluster before it had been running long enough to exhibit the problem. -ryan On Wed, Mar 16, 2011 at 7:27 PM, Zhu Han wrote: > > > On Thu, Feb 3, 2011 at 1:49 AM, Ryan King wrote: >> >> On Wed, Feb 2, 2011 a

Re: FW: Very slow batch insert using version 0.7.2

2011-03-10 Thread Ryan King
Why use such a large batch size? -ryan On Thu, Mar 10, 2011 at 6:31 AM, Desimpel, Ignace wrote: > > > Hello, > > I had a demo application with embedded cassandra version 0.6.x, inserting > about 120 K  row mutations in one call. > > In version 0.6.x that usually took about 5 seconds, and I could

Re: Question about insert performance in multiple node cluster

2011-02-28 Thread Ryan King
On Mon, Feb 28, 2011 at 2:05 PM, Flachbart, Dirk (HP Software - TransactionVision) wrote: > Replication factor is set to 1, and I'm using ConsistencyLevel.ANY. And yep, > I tried doubling the threads from 16 to 32 when running with the second > server, didn't make a difference. > > Regarding the

Re: Question about insert performance in multiple node cluster

2011-02-28 Thread Ryan King
On Mon, Feb 28, 2011 at 9:24 AM, Flachbart, Dirk (HP Software - TransactionVision) wrote: > Hi, > > > > We are trying to use Cassandra for high-performance insertion of simple > key/value records. I have set up Cassandra on two of my machines in my local > network (Windows 2008 server), using pret

Re: Basic Cassandra Architecture questions

2011-02-11 Thread Ryan King
On Fri, Feb 11, 2011 at 9:37 AM, mcasandra wrote: > > Is commit log file maintained on every node that's responsible to keep key > ranges? So if Key A is supposed to go to Node, 1,2,3 then the commit log for > Key A will be on each of these nodes? Is this commit log like redo log of > oracle, whic

Re: Column name size

2011-02-11 Thread Ryan King
On Fri, Feb 11, 2011 at 2:06 AM, Patrik Modesto wrote: > Hi all! > > I'm thinking if size of a column name could matter for a large dataset > in Cassandra  (I mean lots of rows). For example what if I have a row > with 10 columns each has 10 bytes value and 10 bytes name. Do I have > half the row

Re: Cassandra memory consumption

2011-02-08 Thread Ryan King
Which JVM and version are you using? -ryan On Tue, Feb 8, 2011 at 7:32 AM, Victor Kabdebon wrote: > It is really weird that I am the only one to have this issue. > I restarted Cassandra today and already the memory consumption is over the > limit : > > root  1739  4.0 24.5 664968 494996 pts/4 

Re: Ruby thrift is trying to write Time as string

2011-02-07 Thread Ryan King
On Sat, Feb 5, 2011 at 10:12 PM, Joshua Partogi wrote: > Hi, > > I don't know whether my assumption is right or not. When I tried to insert a > Time value into a column I am getting this exception: > > vendor/ruby/1.8/gems/thrift-0.5.0/lib/thrift/protocol/binary_protocol.rb:106:in > `write_string'

Re: New Generation Size guidelines

2011-02-04 Thread Ryan King
On Fri, Feb 4, 2011 at 1:45 PM, Oleg Proudnikov wrote: > > Hi All, > > I have a 3 server cluster with RF=2. My heap is 2G out of a 4G RAM. The > servers > have 4 cores. I used default heap settings. The Eden space ended up around 60M > and the Survivor spaces are around 7M. This feels a little bi

Re: Using a synchronized counter that keeps track of no of users on the application & using it to allot UserIds/ keys to the new users after sign up

2011-02-04 Thread Ryan King
On Thu, Feb 3, 2011 at 9:12 PM, Aklin_81 wrote: > Thanks Matthew & Ryan, > > The main inspiration behind me trying to generate Ids in sequential > manner is to reduce the size of the userId, since I am using it for > heavy denormalization. UUIDs are 16 bytes long, but I can also have a > unique Id

Re: Using a synchronized counter that keeps track of no of users on the application & using it to allot UserIds/ keys to the new users after sign up

2011-02-03 Thread Ryan King
You could also consider snowflake: http://github.com/twitter/snowflake which gives you ids that roughly sort by time (but aren't sequential). -ryan On Thu, Feb 3, 2011 at 11:13 AM, Matthew E. Kennedy wrote: > Unless you need your user identifiers to be sequential for some reason, I > would sa

Re: Do supercolumns have a purpose?

2011-02-03 Thread Ryan King
On Thu, Feb 3, 2011 at 6:49 AM, Jonathan Ellis wrote: > On Thu, Feb 3, 2011 at 6:44 AM, Sylvain Lebresne wrote: >> On Thu, Feb 3, 2011 at 3:00 PM, David Boxenhorn wrote: >>> >>> The advantage would be to enable secondary indexes on supercolumn >>> families. >> >> Then I suggest opening a ticket

Re: 0.7.0 mx4j, get attribute

2011-02-02 Thread Ryan King
On Wed, Feb 2, 2011 at 10:40 AM, Chris Burroughs wrote: > I'm using 0.7.0 and experimenting with the new mx4j support. > > http://host:port/mbean?objectname=org.apache.cassandra.request%3Atype%3DReadStage > > Returns a nice pretty html page.  For purposes of monitoring I would > like to get a sing

Re: reduced cached mem; resident set size growth

2011-02-02 Thread Ryan King
On Wed, Feb 2, 2011 at 10:29 AM, Chris Burroughs wrote: > On 02/02/2011 12:49 PM, Ryan King wrote: >> We're seeing a similar problem with one of our clusters (but over a >> longer time scale). It's possible that it's not a leak, but just >> fragmentation. Unless you

Re: reduced cached mem; resident set size growth

2011-02-02 Thread Ryan King
On Wed, Feb 2, 2011 at 6:22 AM, Chris Burroughs wrote: > On 01/28/2011 09:19 PM, Chris Burroughs wrote: >> Thanks Oleg and Zhu.  I swear that wasn't a new hotspot version when I >> checked, but that's obviously not the case.  I'll update one node to the >> latest as soon as I can and report back.

Re: GeoIndexing in Cassandra, Open Sourced?

2011-01-21 Thread Ryan King
On Fri, Jan 21, 2011 at 12:24 PM, Joseph Stein wrote: > Thanks Ryan, Jake and Mike for the quick responses. > I will mull through this weekend between engineering things from scratch or > going the Solr/Solandra route as Jake points out is an option (and the > effort/time related with introducing

Re: GeoIndexing in Cassandra, Open Sourced?

2011-01-21 Thread Ryan King
Not open source, but here's a preso on how simplegeo do it: http://www.slideshare.net/mmalone/scaling-gis-data-in-nonrelational-data-stores Note: we do it very differently here at Twitter (but aren't at liberty to discuss in detail). I say this just to point out that there are several valid strat

Re: Document Mapper for Ruby?

2011-01-20 Thread Ryan King
Not sure what you mean by document mapper, but CassandraObject might fit the bill: https://github.com/nzkoz/cassandra_object -ryan On Wed, Jan 19, 2011 at 11:03 PM, Joshua Partogi wrote: > Hi all, > > Is anyone aware of a document mapper for Ruby similar to MongoMapper? > > Thanks heaps for your

Re: Tombstone lifespan after multiple deletions

2011-01-17 Thread Ryan King
On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn wrote: > If I delete a row, and later on delete it again, before GCGraceSeconds has > elapsed, does the tombstone live longer? Each delete is a new tombstone, which should answer your question. -ryan > In other words, if I have the following scen

Re: Storing big objects into columns

2011-01-13 Thread Ryan King
On Thu, Jan 13, 2011 at 2:44 PM, Victor Kabdebon wrote: > Is there any recommended maximum size for a Column? (not the very upper > limit which is 2Gb) > Why is it useful to chunk the content into multiple columns? I think you're going to have to do some tests yourself. You want to chunk it so

Re: Storing big objects into columns

2011-01-13 Thread Ryan King
On Thu, Jan 13, 2011 at 2:38 PM, Victor Kabdebon wrote: > Dear all, > In a project I would like to store "big" objects in columns, serialized. For > example entire images (several Ko to several Mo), flash animations (several > Mo) etc... > Does someone use Cassandra with those relatively big colum

Re: cassandra row cache

2011-01-13 Thread Ryan King
I'm not sure if this is entirely true, but I *think* older versions of Cassandra used a version of the ConcurrentLinkedHashMap (which backs the row cache) that used the Second Chance algorithm, rather than LRU, which might explain this non-LRU-like behavior. I may be entirely wrong about this though

Re: Should nodetool ring give equal load ?

2011-01-12 Thread Ryan King
On Wed, Jan 12, 2011 at 2:08 PM, mck wrote: > >> You're using an ordered partitioner and your nodes are evenly spread >> around the ring, but your data probably isn't evenly distributed. > > This load number seems equals to `du -hs ` and > since i've got N == RF shouldn't the data size always be t

Re: Should nodetool ring give equal load ?

2011-01-12 Thread Ryan King
On Wed, Jan 12, 2011 at 2:00 PM, mck wrote: > I'm using 0.7.0-rc3, 3 nodes, RF=3, and ByteOrderedPartitioner. > > When i run "nodetool ring" it reports > >> Address         Status State   Load            Owns    Token >>                                                         >> Token(bytes[ff0343

Re: Ruby database migrations for Cassandra - ActiveColumn

2011-01-11 Thread Ryan King
Awesome and great to see you're using our fauna cassandra gem. :) -ryan On Tue, Jan 11, 2011 at 10:18 AM, Mike Wynholds wrote: > Happy new year all- > I just wanted to mention that I have released a new Cassandra data > management gem called ActiveColumn.  The first major feature is > ActiveReco

Re: Insert LongType with ruby

2011-01-04 Thread Ryan King
On Tue, Jan 4, 2011 at 12:50 PM, vicent roca daniel wrote: > I'm getting more consistent results using Time.stamp instead of Time > From: https://github.com/fauna/cassandra/blob/master/lib/cassandra/long.rb Yeah, you were probably overwriting values then. -ryan

Re: Does Cassandra run better on Amazon EC2 or Rackspace cloud servers?

2011-01-03 Thread Ryan King
On Mon, Jan 3, 2011 at 3:04 PM, Cassy Andra wrote: > My company is looking to develop a software prototype based off Cassandra in > the cloud. We expect to run 5 - 10 NoSQL servers for the prototype. I've > read online (Jonathan Ellis was pretty vocal about this) that EC2 has some > I/O issues. Is

Re: Insert LongType with ruby

2011-01-03 Thread Ryan King
llect' > from > /Users/armandolalala/.rvm/gems/ruby-1.9.2-p0/gems/cassandra-0.9.0/lib/cassandra/cassandra.rb:125:in > `insert' > from (irb):6 > from /Users/armandolalala/.rvm/rubies/ruby-1.9.2-p0/bin/irb:17:in `' > > On Mon, Jan 3, 2011 at 10:06 PM, Ryan King w

Re: Insert LongType with ruby

2011-01-03 Thread Ryan King
On Mon, Jan 3, 2011 at 12:56 PM, vicent roca daniel wrote: > Hi again! > code: > require 'rubygems' > require 'cassandra' > app = Cassandra.new('AOM', servers = "127.0.0.1:9160") > app.insert(:NumData, 'device1-cpu', { Time.now => 10.to_s }) I'm going to assume you're getting an exception here? I

Re: Insert LongType with ruby

2011-01-03 Thread Ryan King
On Mon, Jan 3, 2011 at 9:32 AM, vicent roca daniel wrote: > Hi Ryan, > When I insert the column, I don't get  any error. But, when I inspect the > contents, I don't see a valid number. > also, If I try to do a range query, I'm not getting the expected results. Please show the code you're using.

Re: cassandra ruby undefined method

2011-01-03 Thread Ryan King
On Sat, Jan 1, 2011 at 2:42 PM, vicent roca daniel wrote: > Hi guys, I'm new in this list && Cassandra :) > I'm playing with Cassandra with the ruby wrapper, and I can't figure out > what's happening with this error... > I have this: > app = Cassandra.new('AOM', servers = "127.0.0.1:9160", :transp

Re: Insert LongType with ruby

2011-01-03 Thread Ryan King
On Sun, Jan 2, 2011 at 3:45 PM, vicent roca daniel wrote: > Hi guys, I need your help. > I'm trying to insert a column name of type LongType using the ruby wrapper, > but I can't get it working. > What I'm trying is something like this: >    app.insert(:Data, 'device1-cpu', { Time.now => 1234.to_s

Re: Too many open files Exception + java.lang.ArithmeticException: / by zero

2010-12-16 Thread Ryan King
Are you creating a new connection for each row you insert (and if so are you closing it)? -ryan On Wed, Dec 15, 2010 at 8:13 AM, Amin Sakka, Novapost wrote: > Hello, > I'm using cassandra 0.7.0 rc1, a single node configuration, replication > factor 1, random partitioner, 2 GO heap size. > I ran

Re: Fauna Questions

2010-12-15 Thread Ryan King
On Tue, Dec 14, 2010 at 7:14 AM, Alberto Velandia wrote: > Hi has anyone noticed that the documentation for the Cassandra Class is gone > from the website? > > http://blog.evanweaver.com/2010/12/06/cassandra-0-8/ http://rdoc.info/gems/cassandra will always have the latest rdocs. > I was wonderi

Re: Consistency question caused by Read_all and Write_one

2010-12-10 Thread Ryan King
On Fri, Dec 10, 2010 at 12:49 PM, Alvin UW wrote: > Hello, > > > I got a consistency problem in Cassandra. > > Given a column family with a record:    Id   Name >     1    David > > There are three backups for this column family. > > Assume there

Re: Running multiple instances on a single server --micrandra ??

2010-12-09 Thread Ryan King
Overall, I don't think this is a crazy idea, though I think I'd prefer cassandra to manage this setup. The problem you will run into is that because the storage port is assumed to be the same across the cluster you'll only be able to do this if you can assign multiple IPs to each server (one for e

fauna cassandra client 0.9.0

2010-12-08 Thread Ryan King
I just pushed a 0.9.0 release of the fauna-cassandra ruby client. This is our first release that includes support for Cassandra 0.7 (currently supporting RC1 and not earlier 0.7 releases). code/download: https://rubygems.org/gems/cassandra git: http://github.com/fauna/cassandra File any bugs on g

Re: If one seed node crash, how can I add one seed node?

2010-12-07 Thread Ryan King
On Tue, Dec 7, 2010 at 1:07 PM, Eric Gilmore wrote: > What would comprise a sane and reasonably balanced list? Should there be a > certain proportion of seeds per total nodes? Any other considerations > besides a) list must be identical on all nodes and b) you can't > auto-bootstrap a seed node

Re: fauna/cassandra gem does not work with Cassandra 0.7

2010-12-07 Thread Ryan King
Please file this on github issues: https://github.com/fauna/cassandra/issues. And I'll get to it soon. -ryan On Tue, Dec 7, 2010 at 2:21 AM, Joshua Partogi wrote: > Hi, > > I pull out fauna/cassandra gem 0.10.0 from github. > > I then tried to get a value from cassandra as such. > > irb(main):00

Re: If one seed node crash, how can I add one seed node?

2010-12-07 Thread Ryan King
Note that there's not really anything special about the seed node and it's all relative: the cluster doesn't necessarily have to agree on who the seeds are. So, to bring up a new node to replace the old seed, just set the new node's seed to any existing node in the system. After that you can go ba

Re: Newbie question about connecting to a cassandra server from another server using Fauna

2010-12-06 Thread Ryan King
It would help if you give us more context. The code snippet you've given us is incomplete and not very helpful. -ryan On Mon, Dec 6, 2010 at 12:33 PM, Alberto Velandia wrote: > Hi I've successfully managed to connect to the server through the > cassandra-cli command but still no luck on doing it

Testathon at Twitter on December 13th

2010-12-06 Thread Ryan King
We're going to be hosting people at the Twitter offices the evening of December 13th to focus on testing 0.7. If you're interested please contact me offlist and I'll add you to the invite. Note that we're trying to keep the group small and focused. -ryan

Re: avro + cassandra + ruby

2010-11-16 Thread Ryan King
On Tue, Nov 16, 2010 at 10:25 AM, Jonathan Ellis wrote: > On Tue, Sep 28, 2010 at 6:35 PM, Ryan King wrote: >> One thing you should try is to make thrift use >> BinaryProtocolAccelerated, rather than the pure-ruby implementation >> (we should change the default). > >

Re: Cassandra 0.7 beta3 BinaryMemtable and Supercolumns

2010-11-12 Thread Ryan King
On Fri, Nov 12, 2010 at 7:33 AM, Aditya Muralidharan wrote: > Thanks for the response. We're trying to get a general idea of the insert and > retrieval performance, and we figured BinaryMemtable would be a great enabler > for our bulk import scenarios. Normal thrift inserts are certainly fast, b

Re: CF Stats in 0.7beta3

2010-11-10 Thread Ryan King
Yeah, that's really microsecond latency. Note, though, that this isn't the full request timing; it's just the storage proxy down, so it doesn't account for any latency added by thrift or the network. -ryan On Wed, Nov 10, 2010 at 1:43 PM, Rock, Paul wrote: > Afternoon all - I'm playing with 0.7bet

Re: High BloomFilterFalseRation

2010-11-02 Thread Ryan King
On Tue, Nov 2, 2010 at 1:28 AM, Daniel Doubleday wrote: > Hi all > > had some time yesterday to dig a lil deeper. And maybe this saves someone who > made the same mistake the time so ... > > After trying to reproduce the problem in unit tests with the same data which > led nowhere because every

Re: atomic test-or-set

2010-10-05 Thread Ryan King
On Tue, Oct 5, 2010 at 8:23 AM, Ian Rogers wrote: > > Does Cassandra have an atomic test-or-set operation? > > That is, I want to check to see if a key has a value and, if not, set it to > something.  But it must be an atomic operation - I can't do a separate fetch > and then set from the applicat

Re: avro + cassandra + ruby

2010-09-30 Thread Ryan King
On Thu, Sep 30, 2010 at 1:08 PM, Gabor Torok wrote: > I added a comment to an existing issue: > https://issues.apache.org/jira/browse/AVRO-537 Cool. I'll work with Jeff (who sits about 10 feet from me) to get this fixed. :) -ryan

Re: avro + cassandra + ruby

2010-09-29 Thread Ryan King
On Tue, Sep 28, 2010 at 4:06 PM, Gabor Torok wrote: > Hi, > I'm attempting to use avro to talk to cassandra because the ruby thrift > client's read performance is pretty bad (I measured 4x slower than java). > > However, I run into a problem when calling multiget_slice. > The server gives a Keysp

Re: avro + cassandra + ruby

2010-09-28 Thread Ryan King
On Tue, Sep 28, 2010 at 4:06 PM, Gabor Torok wrote: > Hi, > I'm attempting to use avro to talk to cassandra because the ruby thrift > client's read performance is pretty bad (I measured 4x slower than java). Only 4x feels like a win. :) One thing you should try is to make thrift use BinaryProto

Re: Client developer mailing list

2010-09-01 Thread Ryan King
On Wed, Sep 1, 2010 at 4:40 AM, Guilherme Defreitas wrote: > Hi guys, > I'm new in cassandra development and I would like to know witch is the best > (stable) client in Ruby to use with Cassandra? It will be use in a rails > project, but it don't need to be "Active Record" like. Try http://github

Re: SEO friendly pagination

2010-08-25 Thread Ryan King
On Wed, Aug 25, 2010 at 11:20 AM, Petr Odut wrote: > Hi, > I've read about pagination in cassandra. My current implementation is > get_range_slices with startKey = lastKey + 1, but I need to get the > specified page directly. Is it any chance to do this? > > If you look at twitter, it has direct p

Re: is it my cassandra cluster ok?

2010-08-25 Thread Ryan King
Looks like you need to do some load balancing. -ryan On Wed, Aug 25, 2010 at 12:33 AM, john xie wrote: > /opt/apache-cassandra-0.6.4/bin/nodetool --host 192.168.123.100 ring > Address       Status     Load          Range >      Ring > > 162027259805094200094770502377853667196 > 192.168.123.101Up

Re: cache sizes using percentages

2010-08-17 Thread Ryan King
On Tue, Aug 17, 2010 at 10:55 AM, Artie Copeland wrote: > if i set a key cache size of 100% the way i understand how that works is: > - the cache is not write through, but read through > - a key gets added to the cache on the first read if not already available > - the size of the cache will alway
