Re: Not overwriting values

2010-09-21 Thread Phil Stanhope
My experience is that timestamps have to be sequentially increasing for writes to work. Soft/silent error if you do not follow this protocol. Haven't tested against > 0.6.4 though. On Tue, Sep 21, 2010 at 8:29 AM, Lucas Nodine wrote: > Chris, I believe if the timestamp being written if the same
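A minimal sketch of one way a client could keep its write timestamps strictly increasing, as the reply above recommends. The class name MonotonicMicros is hypothetical and is not part of any Cassandra client; it only assumes that the client supplies its own microsecond timestamps on each write.

```python
import threading
import time


class MonotonicMicros:
    """Hands out strictly increasing microsecond timestamps (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last = 0

    def next(self):
        with self._lock:
            now = int(time.time() * 1_000_000)
            # If the clock has not advanced (or stepped backwards), bump by one
            # so that no two calls ever return the same value.
            self._last = max(now, self._last + 1)
            return self._last


clock = MonotonicMicros()
stamps = [clock.next() for _ in range(1000)]
```

This only guarantees ordering within one process; writers on different machines would still need some other way to avoid collisions.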

Re: OrderPreservingPartitioner for get_range_slices

2010-09-15 Thread Phil Stanhope
My experience for the last question is ... it depends. If you have NO changes to the store (which I would argue could be abnormal, it's not in a production environment allowing writes) ... then you can do a full range/key scan and get no repeats. Factors that will impact scanning all keys includ

Re: Cassandra on AWS across Regions

2010-09-02 Thread Phil Stanhope
Ben, can you elaborate on some infrastructure topology issues that would break this approach? On Wed, Sep 1, 2010 at 6:25 PM, Benjamin Black wrote: > On Wed, Sep 1, 2010 at 4:16 PM, Andres March wrote: > > I didn't have anything specific in mind. I understand all the issues > around > > DNS and

Re: Cassandra summit video downloads?

2010-08-28 Thread Phil Stanhope
Thanks ... The format and download links did the trick -phil On Aug 28, 2010, at 7:38 AM, Jeremy Hanna wrote: > Looks like below each video (on blip.tv) there's a download link - downloads > to a flash video (.flv) file. > > On Aug 28, 2010, at 5:33 AM, Phil Stanhope wrote

Cassandra summit video downloads?

2010-08-28 Thread Phil Stanhope
I'm about to be on a series of long plane flights ... is there a way to download the videos from the summit for offline viewing?

Re: Identifying Tombstones

2010-07-01 Thread Phil Stanhope
I understand that tombstones are an internal implementation detail ... yet, the fact remains in 0.6.2 that a key/col creation followed by a delete of the key/col will result in the key being returned in a get_range_slices call. If the CF is flushed and compacted (after GCGraceSeconds), the key will

Re: Finding new Cassandra data

2010-06-22 Thread Phil Stanhope
I can envision two fundamentally different approaches: 1. A CF that is CompareWith LONG ... use microsecond timestamps as your keys ... then you can filter by time ranges. This implies that you are willing to do a double write (once for the original data and then again for the logging). And a t
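A rough, in-memory sketch of the first approach: entries named by microsecond timestamps, kept in sorted order, then sliced by time range. Nothing here talks to Cassandra. TimeIndex, log, and slice are hypothetical names, and the 8-byte big-endian packing is an assumption about how a long-typed name would be encoded.

```python
import bisect
import struct


def pack_long(ts_micros):
    # 8-byte big-endian; for non-negative values, byte order matches numeric order.
    return struct.pack(">q", ts_micros)


def unpack_long(raw):
    return struct.unpack(">q", raw)[0]


class TimeIndex:
    """Toy stand-in for a time-ordered log of 'what changed when'."""

    def __init__(self):
        self._names = []    # packed timestamps, kept sorted
        self._values = {}   # packed timestamp -> key of the original row

    def log(self, ts_micros, original_key):
        # The "double write": called once in addition to the original insert.
        name = pack_long(ts_micros)
        if name not in self._values:
            bisect.insort(self._names, name)
        self._values[name] = original_key

    def slice(self, start_micros, finish_micros):
        # Inclusive time-range filter over the sorted names.
        lo = bisect.bisect_left(self._names, pack_long(start_micros))
        hi = bisect.bisect_right(self._names, pack_long(finish_micros))
        return [(unpack_long(n), self._values[n]) for n in self._names[lo:hi]]


index = TimeIndex()
index.log(100, "row-a")
index.log(300, "row-c")
index.log(200, "row-b")
recent = index.slice(150, 300)
```

Note that two writes with the same timestamp would overwrite each other in this toy index, which is one reason the timestamp precision matters.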

Re: Possible bug in Cassandra MapReduce

2010-06-18 Thread Phil Stanhope
"blow all the data away" ... how do you do that? What is the timestamp precision that you are using when creating key/col or key/supercol/col items? I have seen a failure to write a key when the timestamp is identical to the previous timestamp of a deleted key/col. While I didn't examine the source
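A toy model of the behavior described above, not Cassandra's actual reconciliation code. It simply assumes that a write is silently dropped unless its timestamp is strictly greater than the timestamp already stored for that column, including the timestamp of a delete. ToyColumn is a hypothetical name.

```python
class ToyColumn:
    """Single column with timestamp-based conflict resolution (illustrative only)."""

    def __init__(self):
        self.value = None
        self.timestamp = -1
        self.deleted = False

    def write(self, value, timestamp):
        # Assumption: a write must be strictly newer than what is stored.
        if timestamp <= self.timestamp:
            return False  # dropped without raising, i.e. a "silent" failure
        self.value, self.timestamp, self.deleted = value, timestamp, False
        return True

    def delete(self, timestamp):
        # Assumption: a delete at the same timestamp as a write wins.
        if timestamp < self.timestamp:
            return False
        self.value, self.timestamp, self.deleted = None, timestamp, True
        return True


col = ToyColumn()
first_write = col.write("a", 10)
removed = col.delete(10)
same_stamp_write = col.write("b", 10)   # identical timestamp to the delete
later_write = col.write("b", 11)        # strictly newer timestamp
```

Under these assumptions, coarse timestamp precision (for example, milliseconds in a tight loop) makes the dropped-write case much more likely.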

Re: what is the best way to truncate a column family

2010-06-18 Thread Phil Stanhope
ration an admin api so I think it's a > fair tradeoff. > > On Fri, Jun 18, 2010 at 11:50 PM, Phil Stanhope wrote: > In 0.6.x the iterating approach works ... but you need to flush and compact > (after GCGraceSeconds) in order to NOT see the keys in the CF. > > Will the

Re: what is the best way to truncate a column family

2010-06-18 Thread Phil Stanhope
In 0.6.x the iterating approach works ... but you need to flush and compact (after GCGraceSeconds) in order to NOT see the keys in the CF. Will the behavior of the truncate method in 0.7 require flush/compact as well? Or will it be immediate? -phil On Jun 18, 2010, at 1:29 PM, Benjamin Black w

Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Phil Stanhope
How are you doing your inserts? I draw a clear line between 1) bootstrapping a cluster with data and 2) simulating expected/projected read/write behavior. If you are bootstrapping then I would look into the batch_mutate APIs. They allow you to improve your performance on writes dramatically. I
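A small sketch of preparing rows for batched writes when bootstrapping. It does not call any client library: chunked and build_mutation_map are hypothetical helpers, the plain tuples stand in for real mutation objects, and the nested key -> column family -> mutations shape is only assumed to resemble what a batch call would expect.

```python
def chunked(rows, batch_size):
    """Yield successive slices of rows, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def build_mutation_map(batch, column_family):
    """Group (key, column, value, timestamp) rows by key for one batched call."""
    mutation_map = {}
    for key, column, value, timestamp in batch:
        per_key = mutation_map.setdefault(key, {})
        per_key.setdefault(column_family, []).append((column, value, timestamp))
    return mutation_map


rows = [
    ("k1", "name", "a", 1),
    ("k1", "age", "b", 2),
    ("k2", "name", "c", 3),
]
batches = [build_mutation_map(b, "Users") for b in chunked(rows, 2)]
```

Each element of batches would then be handed to whatever batch write call the client exposes, one round trip per batch instead of one per column.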