Re: compression

2010-04-01 Thread casablinca126.com
Hi Ran, I think there's no compression on the server end. I am doing the gzip compression on the client side myself. Cheers, Cao Jiguang 2010-04-01 casablinca126.com From: Ran Tavory Sent: 2010-04-01 14:37:59 To: user@cassandra.apache.org Cc: Subject: compression What sort of compres
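As a rough illustration of the client-side approach described in this message, here is a minimal Python sketch that gzips a value before it is handed to whatever client call writes the column, and decompresses it after a read. The helper names `compress_value` and `decompress_value` are hypothetical and not part of any Cassandra client; only the standard-library `gzip` module is used.

```python
import gzip


def compress_value(value: bytes, level: int = 6) -> bytes:
    """Gzip a column value on the client before writing it (hypothetical helper)."""
    return gzip.compress(value, compresslevel=level)


def decompress_value(blob: bytes) -> bytes:
    """Reverse compress_value after reading the column back (hypothetical helper)."""
    return gzip.decompress(blob)


if __name__ == "__main__":
    original = b"some highly repetitive payload " * 100
    stored = compress_value(original)
    # The server only ever sees `stored`; it stays opaque bytes to Cassandra.
    print(len(original), len(stored))
    assert decompress_value(stored) == original
```

Since the server stores the compressed bytes as-is, every reader of the column has to know that it is gzipped.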

Re: Read Performance

2010-04-01 Thread Cemal Dalar
Hi James, I don't know how to get the statistics below or calculate the access times (read/write in ms) from your previous mails. Can you explain a little? I'd like to work on it also. CD On Thu, Apr 1, 2010 at 4:15 AM, Jonathan Ellis wrote: > On Wed, Mar 31, 2010 at 6:21 PM, James Golick > w

RE: compression

2010-04-01 Thread Weijun Li
Thrift client doesn’t seem to compress anything unless you change the Thrift protocol or use a transport that supports compression. I modified TSocket to support compression, but it occasionally hits a broken pipe error due to crappy Java zlib support (so that clients have to reconnect to get around the s

Re: expiring data out of Cassandra/time to live

2010-04-01 Thread Sylvain Lebresne
> On that topic, what exactly is keeping this feature out of the official > releases? The patch changes the thrift API. Among possibly other reasons, I think that was one reason why it wasn't even considered for inclusion in the 0.6 branch. As for trunk (and thus for the future 0.7), there is scheduled

Re: Cassandra data file corrupt

2010-04-01 Thread JKnight JKnight
Dear David Timothy Strauss, Could you tell me in more detail about backups? As far as I know, Cassandra compacts its data files as new data arrives, so they can change many times. How do I back up Cassandra data? Thanks. On Wed, Mar 31, 2010 at 8:16 AM, David Timothy Strauss < da...@fourkitchens.com> wrote: > Cassan

Re: compression

2010-04-01 Thread Rao Venugopal
To Cao Jiguang I was watching this presentation on bigtable yesterday http://video.google.com/videoplay?docid=7278544055668715642# and Jeff mentioned that they compared three different compression libraries BMDiff, LZO and gzip. Apparently, gzip was the most cpu intensive and they ended up goin

Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
So we are adding another node to the cluster with the latest 0.6 branch (RC1). It seems to be hung in some limbo state. Before bootstrapping our cluster had 50-60GB spread fairly evenly across 4 machines, with RF=3. One machine had more load than the others, and sure enough bootstrapping select

Re: compression

2010-04-01 Thread Tatu Saloranta
On Thu, Apr 1, 2010 at 8:27 AM, Rao Venugopal wrote: > To Cao Jiguang > > I was watching this presentation on bigtable yesterday > http://video.google.com/videoplay?docid=7278544055668715642# > > and Jeff mentioned that they compared three different compression libraries > BMDiff, LZO and gzip.  

Re: Stalled Bootstrapping Process

2010-04-01 Thread Gary Dusbabek
Does the JMX StreamingService list any incoming/outgoing files/hosts on the sending/receiving nodes? Gary. On Thu, Apr 1, 2010 at 10:26, Dan Di Spaltro wrote: > So we are adding another node to the cluster with the latest 0.6 branch > (RC1).  It seems to be hung in some limbo state. > Before boo

LazyBoy question

2010-04-01 Thread Gary
I am trying out the lazyboy library to access cassandra. I was able to get the data in and out using the Record save/load functions. Is there a way to get a slice, or all the records under a CF, so I can iterate? It is probably a naive question, as I am just getting into this field. Thanks, Gary

Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
The light-blue machine is in Operation Mode: Bootstrap On Thu, Apr 1, 2010 at 9:26 AM, Dan Di Spaltro wrote: > So we are adding another node to the cluster with the latest 0.6 branch > (RC1). It seems to be hung in some limbo state. > > Before bootstrapping our cluster had 50-60GB spread fairly

Re: Stalled Bootstrapping Process

2010-04-01 Thread Jonathan Ellis
which node rebooted, the red one, or the blue one? On Thu, Apr 1, 2010 at 11:26 AM, Dan Di Spaltro wrote: > So we are adding another node to the cluster with the latest 0.6 branch > (RC1).  It seems to be hung in some limbo state. > Before bootstrapping our cluster had 50-60GB spread fairly evenl

Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
Red one. Gary - both say nothing is happening with no destinations or sources. On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis wrote: > which node rebooted, the red one, or the blue one? > > On Thu, Apr 1, 2010 at 11:26 AM, Dan Di Spaltro > wrote: > > So we are adding another node to the clust

Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
Before the Red one rebooted it had 1 active STREAM-STAGE. Now it has 0 in STREAM-STAGE. On Thu, Apr 1, 2010 at 11:57 AM, Dan Di Spaltro wrote: > Red one. > > Gary - both say nothing is happening with no destinations or sources. > > > On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis wrote: > >> w

Re: Stalled Bootstrapping Process

2010-04-01 Thread Jonathan Ellis
Bootstrap source restarting will always fail bootstrap. You'll need to restart the blue one too now, I'm afraid. On Thu, Apr 1, 2010 at 2:01 PM, Dan Di Spaltro wrote: > Before the Red one rebooted it had 1 active STREAM-STAGE.  Now it has 0 in > STREAM-STAGE. > > On Thu, Apr 1, 2010 at 11:57 AM,

Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
Okay, so should I run any more commands like cleanup before? On Thu, Apr 1, 2010 at 12:09 PM, Jonathan Ellis wrote: > Bootstrap source restarting will always fail bootstrap. You'll need > to restart the blue one too now, I'm afraid. > > On Thu, Apr 1, 2010 at 2:01 PM, Dan Di Spaltro > wrote: >

Re: Stalled Bootstrapping Process

2010-04-01 Thread Jonathan Ellis
There shouldn't be anything to clean up. (The temporary streaming files it anticompacted are automatically removed on restart) On Thu, Apr 1, 2010 at 2:17 PM, Dan Di Spaltro wrote: > Okay, so should I run any more commands like cleanup before? > > On Thu, Apr 1, 2010 at 12:09 PM, Jonathan Ellis

Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
But I didn't restart the red one. On Thu, Apr 1, 2010 at 12:18 PM, Jonathan Ellis wrote: > There shouldn't be anything to clean up. (The temporary streaming > files it anticompacted are automatically removed on restart) > > On Thu, Apr 1, 2010 at 2:17 PM, Dan Di Spaltro > wrote: > > Okay, so s

Re: Stalled Bootstrapping Process

2010-04-01 Thread Jonathan Ellis
On Thu, Apr 1, 2010 at 2:22 PM, Dan Di Spaltro wrote: > But I didn't restart the red one. >> >> > On Thu, Apr 1, 2010 at 11:57 AM, Dan Di Spaltro >> >> > >> >> > wrote: >> >> >> >> >> >> Red one. >> >> >> >> >> >> On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis >> >> >> wrote: >> >> >>> >> >> >

Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
Sorry, I meant the red one restarted about a day ago. The graph shows the dip in disk space. But it has nowhere near returned to the previous amount of disk usage. I was referring to how the red one didn't reclaim all its space (I figure about 60GB actually belongs on that machine). Is that normal (it

Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
So it looks like it's still performing anti-compaction. Is the compactionmanager the best way to track this? On Thu, Apr 1, 2010 at 12:31 PM, Dan Di Spaltro wrote: > Sorry I meant the red one restarted about a day ago. The graph shows > the dip in disk space. But it nowhere near returned to th

Re: Stalled Bootstrapping Process

2010-04-01 Thread Jonathan Ellis
Right. On Thu, Apr 1, 2010 at 3:15 PM, Dan Di Spaltro wrote: > So it looks like it's still performing anti-compaction. The > compactionmanager is the best way to track this? > > On Thu, Apr 1, 2010 at 12:31 PM, Dan Di Spaltro > wrote: >> Sorry I meant the red one restarted about a day ago. The

Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
Seems to be doing more stuff now. I've attached an updated screenshot. On Thu, Apr 1, 2010 at 1:16 PM, Jonathan Ellis wrote: > Right. > > On Thu, Apr 1, 2010 at 3:15 PM, Dan Di Spaltro > wrote: >> So it looks like it's still performing anti-compaction. The >> compactionmanager is the best way t

Re: Read Performance

2010-04-01 Thread James Golick
I don't have the additional hardware to try to isolate this issue atm, so I decided to push some code that performs 20% of reads directly from cassandra. The cache hit rate has gone up to about 88% now and it's still climbing, albeit slowly. There remains plenty of free cache space. So far, the av

Re: Read Performance

2010-04-01 Thread Joseph Stump
Taking our flamewar offline. :-D On Thu, Apr 1, 2010 at 1:36 PM, James Golick wrote: > I don't have the additional hardware to try to isolate this issue atm You'd be able to spin up hardware to isolate that issue on AWS. ;) --Joe

Re: Read Performance

2010-04-01 Thread Jeremy Dunck
Or rackspace. ;) On Thu, Apr 1, 2010 at 2:49 PM, Joseph Stump wrote: > Taking our flamewar offline. :-D > > On Thu, Apr 1, 2010 at 1:36 PM, James Golick wrote: >> I don't have the additional hardware to try to isolate this issue atm > > You'd be able to spin up hardware to isolate that issu

Re: Read Performance

2010-04-01 Thread James Golick
Damnit! On Thu, Apr 1, 2010 at 2:05 PM, Jeremy Dunck wrote: > Or rackspace. ;) > > On Thu, Apr 1, 2010 at 2:49 PM, Joseph Stump wrote: > > Taking our flamewar offline. :-D > > > > On Thu, Apr 1, 2010 at 1:36 PM, James Golick > wrote: > >> I don't have the additional hardware to try to iso

Re: Stalled Bootstrapping Process

2010-04-01 Thread Jonathan Ellis
I would turn debug logging on globally on the new node; that will answer more questions than just the streaming package.

Re: Read Performance

2010-04-01 Thread Peter Chang
pwned. On Thu, Apr 1, 2010 at 2:09 PM, James Golick wrote: > Damnit! > > > On Thu, Apr 1, 2010 at 2:05 PM, Jeremy Dunck wrote: > >> Or rackspace. ;) >> >> On Thu, Apr 1, 2010 at 2:49 PM, Joseph Stump wrote: >> > Taking our flamewar offline. :-D >> > >> > On Thu, Apr 1, 2010 at 1:36 PM, Ja

Proxy instances?

2010-04-01 Thread David King
Is it possible to have Cassandra instances that serve only as proxies to the rest of the cluster, but have no storage themselves? Maybe with a keyspace length of 0?

Re: Proxy instances?

2010-04-01 Thread Brandon Williams
On Thu, Apr 1, 2010 at 7:19 PM, David King wrote: > Is it possible to have Cassandra instances that serve only as proxies to > the rest of the cluster, but have no storage themselves? Maybe with a > keyspace length of 0? contrib/client_only is what you're looking for. -Brandon

Creating a Total Ordered Queue in Cassandra

2010-04-01 Thread Jeremy Davis
I'm in the process of implementing a Totally Ordered Queue in Cassandra, and wanted to bounce my ideas off the list and also see if there are any other suggestions. I've come up with an external source of IDs that are always increasing (but not monotonic), and I've also used external synchronizat

Re: Creating a Total Ordered Queue in Cassandra

2010-04-01 Thread Keith Thornhill
you mention never deleting from the queue, so what purpose is this serving? (if you don't pop off the front, is it really a queue?) seems if guaranteed order of messages is required, there are many other projects which are focused towards that problem (rabbitmq, kestrel, activemq, etc) or am i mi

Re: Re: compression

2010-04-01 Thread casablinca126.com
Hi, Great! Thanks to Rao and Tatu :) I will test them and let you know what I found. Regards, Cao Jiguang - From: Tatu Saloranta Sent: 2010-04-02 01:08:52 To: u...@cassandra.apache.org Cc: Subject: Re: compression

Re: Read Performance

2010-04-01 Thread James Golick
Well, folks, I'm feeling a little stupid right now (adding to the injury inflicted by one Mr. Stump :-P). So, here's the story. The cache hit rate is up around 97% now. The ruby code is down to around 20-25ms to multiget the 20 rows. I did some profiling, though, and realized that a lot of time wa

Re: Read Performance

2010-04-01 Thread Brandon Williams
On Thu, Apr 1, 2010 at 9:37 PM, James Golick wrote: > Well, folks, I'm feeling a little stupid right now (adding to the injury > inflicted by one Mr. Stump :-P). > > So, here's the story. The cache hit rate is up around 97% now. The ruby > code is down to around 20-25ms to multiget the 20 rows. I

Re: Creating a Total Ordered Queue in Cassandra

2010-04-01 Thread Jeremy Davis
You are correct, it is not a queue in the classic sense... I'm storing the entire "conversation" with a client in perpetuity, and then playing it back in the order received. Rabbitmq/activemq etc all have about the same throughput 3-6K persistent messages/sec, and are not good for storing the conv
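The "store everything, replay in order" idea in this message can be modelled with a small in-memory Python sketch. It assumes each message carries an externally assigned, always-increasing integer ID, as mentioned earlier in the thread. The `ConversationLog` class and its methods are hypothetical stand-ins for a Cassandra row whose columns are sorted by ID; this is not the poster's actual implementation.

```python
import bisect
from typing import Dict, Iterator, List, Optional, Tuple


class ConversationLog:
    """In-memory stand-in for one row whose columns are kept sorted by message ID."""

    def __init__(self) -> None:
        self._ids: List[int] = []            # sorted message IDs
        self._messages: Dict[int, bytes] = {}  # ID -> payload

    def append(self, msg_id: int, payload: bytes) -> None:
        """Store a message; nothing is ever deleted."""
        if msg_id in self._messages:
            raise ValueError(f"duplicate message id {msg_id}")
        bisect.insort(self._ids, msg_id)
        self._messages[msg_id] = payload

    def replay(self, start_after: Optional[int] = None) -> Iterator[Tuple[int, bytes]]:
        """Yield messages in ID order, optionally resuming after a given ID."""
        start = 0 if start_after is None else bisect.bisect_right(self._ids, start_after)
        for msg_id in self._ids[start:]:
            yield msg_id, self._messages[msg_id]


if __name__ == "__main__":
    log = ConversationLog()
    for msg_id, payload in [(10, b"hello"), (30, b"bye"), (20, b"how are you")]:
        log.append(msg_id, payload)
    for msg_id, payload in log.replay():
        print(msg_id, payload)
```

Resuming with `start_after` corresponds to reading a slice of the row that starts just past the last ID a consumer has already seen.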

Re: Read Performance

2010-04-01 Thread James Golick
Yes. J. Sent from my iPhone. On 2010-04-01, at 9:21 PM, Brandon Williams wrote: On Thu, Apr 1, 2010 at 9:37 PM, James Golick wrote: Well, folks, I'm feeling a little stupid right now (adding to the injury inflicted by one Mr. Stump :-P). So, here's the story. The cache hit rate is up a

Re: Creating a Total Ordered Queue in Cassandra

2010-04-01 Thread Jeremy Davis
Since twitter is everyone's favorite analogy: It's like twitter, but faster and with bigger messages that I may need to go back and replay in order to mine for more details at a later date. Thus, I call it a queue, because the order of messages is important.. But not anything like a message broker/

best practice for migrating data

2010-04-01 Thread AJ Chen
when adding/changing a column to a column family for existing data in cassandra, what's a good way to do it? thanks, -aj-- AJ Chen, PhD Chair, Semantic Web SIG, sdforum.org http://web2express.org twitter @web2express Palo Alto, CA, USA

Re: Creating a Total Ordered Queue in Cassandra

2010-04-01 Thread Tatu Saloranta
On Thu, Apr 1, 2010 at 9:43 PM, Jeremy Davis wrote: > > You are correct, it is not a queue in the classic sense... I'm storing the > entire "conversation" with a client in perpetuity, and then playing it back > in the order received. > > Rabbitmq/activemq etc all have about the same throughput 3-6