Re: compression

2010-04-01 Thread casablinca126.com
hi Ran,
I think there's no compression on the server end. I am doing the gzip 
compression on the client side myself.
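
A minimal sketch of what client-side gzip could look like, assuming column values are plain byte strings that get compressed just before the write and decompressed just after the read. The helper names pack_value and unpack_value are hypothetical and not part of any Cassandra client:

```python
import gzip


def pack_value(value: bytes) -> bytes:
    """Gzip-compress a column value before handing it to the client library."""
    return gzip.compress(value)


def unpack_value(stored: bytes) -> bytes:
    """Reverse pack_value after reading the column back."""
    return gzip.decompress(stored)


# Illustrative payload only; repetitive data compresses well.
original = b"reputation-data " * 200
stored = pack_value(original)
restored = unpack_value(stored)
```

With this approach the server only ever sees opaque compressed bytes, so every reader has to know that the values are gzipped.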

cheers,

Cao Jiguang



From: Ran Tavory
Sent: 2010-04-01 14:37:59
To: user@cassandra.apache.org
Cc:
Subject: compression
 
What sort of compression (if any) is performed by cassandra?
Does the thrift client compress anything before sending to the server to 
preserve bandwidth?
Does the server compress the values in the columns to preserve disk or memory?


... I assume compaction, performed on the server side, is different than 
compression... however, does compaction include any compression features as 
well?


Thanks


Re: Read Performance

2010-04-01 Thread Cemal Dalar
Hi James,

I don't know how to collect the statistics below or how to calculate the access
times (read/write in ms) from your previous mails. Could you explain a little?
I'd like to work on this as well.

CD

On Thu, Apr 1, 2010 at 4:15 AM, Jonathan Ellis  wrote:

> On Wed, Mar 31, 2010 at 6:21 PM, James Golick 
> wrote:
> > Keyspace: ActivityFeed
> > Read Count: 699443
> > Read Latency: 16.11017477192566 ms.
>
> > Column Family: Events
> > Read Count: 232378
> > Read Latency: 0.396 ms.
> > Row cache capacity: 50
> > Row cache size: 62768
> > Row cache hit rate: 0.007716049382716049
>
> This says that
>
>  - recent queries to Events are much faster than the lifetime average
> for your Keyspace
>  - even though you have almost no row cache hits (~1700 out of 232000
> reads)
>
> Not sure what to make of that, tbh.  If it were me I would try to
> reproduce on a test machine w/o all that pesky live traffic confusing
> things.
>
> -Jonathan
>
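
The "~1700 out of 232000" estimate quoted above can be reproduced approximately by multiplying the read count by the row cache hit rate. This is only a sketch: it assumes the reported hit rate applies to the full read count, which may not hold if the rate covers only a recent window.

```python
# Figures copied from the quoted Events column family statistics.
read_count = 232378
row_cache_hit_rate = 0.007716049382716049

# Assumption: the hit rate covers the same reads as the read count.
estimated_hits = read_count * row_cache_hit_rate
estimated_misses = read_count - estimated_hits

print(f"estimated row cache hits: {estimated_hits:.0f} of {read_count} reads")
```

The result lands in the same ballpark as the rough figure in the quoted reply, which is why the row cache cannot explain the low recent read latency.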


RE: compression

2010-04-01 Thread Weijun Li
The Thrift client doesn't seem to compress anything unless you change the Thrift 
protocol or use a transport that supports compression. I modified TSocket to 
support compression, but it occasionally hits a broken pipe error due to crappy 
Java zlib support (so clients have to reconnect to get around the socket 
error).  Because this support lives in the transport layer, you get compression 
for everything or for nothing. 
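
To illustrate the "all or none" point, here is a minimal sketch of transport-level compression using only Python's standard zlib module. It is not the modified TSocket described above and not Thrift code. CompressedWriter and CompressedReader are hypothetical names, and an in-memory buffer stands in for the socket:

```python
import io
import zlib


class CompressedWriter:
    """Hypothetical wrapper: compresses every byte written to the stream."""

    def __init__(self, raw):
        self._raw = raw
        self._compressor = zlib.compressobj()

    def write(self, data: bytes) -> None:
        self._raw.write(self._compressor.compress(data))

    def flush(self) -> None:
        # Z_SYNC_FLUSH pushes out everything buffered so far, so the peer can
        # decode the message without waiting for the stream to end.
        self._raw.write(self._compressor.flush(zlib.Z_SYNC_FLUSH))
        self._raw.flush()


class CompressedReader:
    """Hypothetical counterpart: decompresses whatever bytes arrive."""

    def __init__(self):
        self._decompressor = zlib.decompressobj()

    def feed(self, chunk: bytes) -> bytes:
        return self._decompressor.decompress(chunk)


wire = io.BytesIO()  # stands in for the socket
writer = CompressedWriter(wire)
writer.write(b"insert key1 " * 50)
writer.flush()
writer.write(b"get key1")
writer.flush()

reader = CompressedReader()
received = reader.feed(wire.getvalue())
```

Because the wrapper sits below the protocol, it cannot tell one call from another: every message on the connection is compressed, whether or not it benefits.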

 

The Cassandra server doesn't seem to support compression either, and we are doing 
that for the memory cache by plugging memcached into Cassandra. Still testing…

 

-Weijun

 

From: Ran Tavory [mailto:ran...@gmail.com] 
Sent: Wednesday, March 31, 2010 11:37 PM
To: user@cassandra.apache.org
Subject: compression

 

What sort of compression (if any) is performed by cassandra?

Does the thrift client compress anything before sending to the server to 
preserve bandwidth?

Does the server compress the values in the columns to preserve disk or memory?

 

... I assume compaction, performed on the server side, is different than 
compression... however, does compaction include any compression features as 
well?

 

Thanks



Re: expiring data out of Cassandra/time to live

2010-04-01 Thread Sylvain Lebresne
> On that topic, what exactly is keeping this feature out of the official
> releases?

The patch changes the thrift API. Among possibly other reasons, I think that was
one reason why it wasn't even considered for inclusion in the 0.6 branch. As for
trunk (and thus the future 0.7), there are scheduled internal changes (vector
clocks and changes to the SSTable format at least) that will force this patch to
be rewritten somehow.
I think that is part of the reason why it is not yet included. But of course,
that being said, I'm all for an inclusion.

(As a side note, patches for the 0.6 version are now attached to the jira ticket.
That should make it much easier for those who want to test than checking out the
old svn version and merging back to 0.6.)

>
> On Wed, Mar 31, 2010 at 3:43 PM, Daniel Kluesing  wrote:
>>
>> We also applied this patch to the 0.6 branch and have been running it for
>> a bit over a week. Works well, would love to see it get into trunk/0.7
>> proper.
>>
>>
>>
>> From: Ryan Daum [mailto:r...@thimbleware.com]
>> Sent: Wednesday, March 31, 2010 11:49 AM
>> To: user@cassandra.apache.org
>> Subject: Re: expiring data out of Cassandra/time to live
>>
>>
>>
>> I was able to successfully merge this patch into the 0.6 branch a few
>> weeks ago by doing the following:
>>
>>
>>
>> Downloading the patch
>> Checking out the trunk of Cassandra from github
>> Rolling back (checking out) the git repo to the same date that the patch
>> was submitted to Jira
>> Applying the patch
>> Committing to Git
>> Merging forward to the 0.6 branch
>> Resolve one or two minor conflicts.
>>
>>
>>
>> R
>>
>>
>>
>> On Wed, Mar 31, 2010 at 2:46 PM, Jonathan Ellis  wrote:
>>
>> Sounds like you want to follow
>> https://issues.apache.org/jira/browse/CASSANDRA-699.  There is a patch
>> there but I wouldn't recommend merging it if Java scares you. :)
>>
>> On Wed, Mar 31, 2010 at 1:39 PM, Mike Gallamore
>>  wrote:
>> > Hello everyone,
>> >
>> > I saw a thread on the incubator user chat that started a few months ago:
>> >
>> > http://www.mail-archive.com/cassandra-u...@incubator.apache.org/msg02047.html
>> > . It looks like this is the new official user mailing list so I'll add
>> > my
>> > thoughts/question here.
>> >
>> > Is there any way to set a TTL on data stored in Cassandra? Deleting old
>> > SSTables isn't enough for my needs. I need the data to go away after a
>> > fixed
>> > period of time. Here is what I'm trying to do and my reasoning why I
>> > think
>> > Cassandra and not something like Flare/Memcache mets my need:
>> >
>> > I'm building a reputation system. We get lots of data at my work (in the
>> > 10's of GB of reputation data a day). The trick is that old data is not
>> > useful as a senders ip address might have changed, they might have had a
>> > bot
>> > on their system and no have removed it, etc. So I need to be able to
>> > keep
>> > data for a fixed period of time and then afterwords it isn't
>> > needed/ideally
>> > would be GC'd out.
>> >
>> > We want to do one thing if we either never heard of the individual or at
>> > least not since the expiry time, and another thing based on the
>> > reputation
>> > data that is stored in Cassandra if it is current. So ideally a
>> > Cassandra
>> > call for a key for someone who's reputation is expired would return
>> > nothing
>> > and we'd reply with our default reputation for that individual. There
>> > really
>> > is no point using network bandwidth to return all the fields associated
>> > with
>> > that key only to look at a timestamp and end up ignoring it anyways.
>> > Similarly the latency of requesting first the timestamp and then the
>> > data in
>> > two separate requests is prohibitive.
>> >
>> > Why Cassandra:
>> >
>> > Our data is complex and is hard to handle completely in a key/value
>> > sense.
>> > In the past we were doing this and just encoding the complex structure
>> > inside of JSON but this isn't ideal. It is very nice algorithmically to
>> > be
>> > able to say: give me this column, or update this element of this hash
>> > etc,
>> > rather than having to pull the old version, decode, modify, re-encode
>> > and
>> > push back to a cache based system.
>> > Our data is large (in the low TB's at the moment, but expected to grow
>> > to
>> > 50-100TB of live data)
>> > Need quick response for both searches and writes: typically for each
>> > thing
>> > we track we get a request for the reputation, the message gets processed
>> > and
>> > then we get feedback back from the recipient. So reads and writes are
>> > symmetric.
>> > High request rate: millions per hour
>> > hundreds of millions of unique reputations (this is way crawling though
>> > the
>> > data with a script purging old data doesn't make sense)
>> > Availablity/load balancing a must. Data needs to be replicated a disk
>> > copy
>> > is useful so if we have a power outage we don't lose the system.
>> > It would be interesting to keep a local subset of our data at customers
>> > sites and

Re: Cassandra data file corrupt

2010-04-01 Thread JKnight JKnight
Dear David Timothy Strauss,

Could you tell me more about backups? As far as I know, Cassandra compacts its
data files as new data arrives, so the files can change many times.

How should I back up Cassandra data?

Thanks.

On Wed, Mar 31, 2010 at 8:16 AM, David Timothy Strauss <
da...@fourkitchens.com> wrote:

> Cassandra has always supported two great ways to prevent data loss:
>
> * Replication
> * Backups
>
> I doubt Cassandra will ever focus extensively on single-node recovery when
> it's so easy to wipe and rebuild any node from the cluster.
> --
> *From: * JKnight JKnight 
> *Date: *Wed, 31 Mar 2010 03:48:01 -0400
> *To: *
> *Subject: *Cassandra data file corrupt
>
> Dear all,
>
> My Cassandra data file had problem and I can not get data from this file.
> And all row after error row can not be accessed. So I lost a lot of data.
>
> Will next version of Cassandra implement the way to prevent data lost.
> Maybe we use the checkpoint. If data file corrupt, we will read from the
> next checkpoint.
>
> If not, can you suggest me the way to implement this function?
>
> --
> Best regards,
> JKnight
>



-- 
Best regards,
JKnight


Re: compression

2010-04-01 Thread Rao Venugopal
To Cao Jiguang

I was watching this presentation on bigtable yesterday
http://video.google.com/videoplay?docid=7278544055668715642#

and Jeff mentioned that they compared three different compression libraries:
BMDiff, LZO and gzip.   Apparently, gzip was the most CPU-intensive and they
ended up going with BMDiff.
I didn't find any open source / free implementation of BMDiff, but I found
LZO.
http://www.oberhumer.com/opensource/lzo/


Thanks
-Venu


On Thu, Apr 1, 2010 at 3:07 AM, Weijun Li  wrote:

>  Thrift client doesn’t seem to compress anything unless you change thrift
> protocol or use a transport that support compression. I modified TSocket to
> support compression but it occasionally has broken pipe error due to crappy
> Java zlib support (so that clients has to reconnect to get around the socket
> error).  This is a support in transport layer meaning you’ll get compression
> support for all or none.
>
>
>
> Cassandra server doesn’t seem to support compression either and we are
> doing that for memory cache by plugging memcached into Cassandra. Still
> testing…
>
>
>
> -Weijun
>
>
>
> *From:* Ran Tavory [mailto:ran...@gmail.com]
> *Sent:* Wednesday, March 31, 2010 11:37 PM
>
> *To:* user@cassandra.apache.org
> *Subject:* compression
>
>
>
> What sort of compression (if any) is performed by cassandra?
>
> Does the thrift client compress anything before sending to the server to
> preserve bandwidth?
>
> Does the server compress the values in the columns to preserve disk or
> memory?
>
>
>
> ... I assume compaction, performed on the server side, is different than
> compression... however, does compaction include any compression features as
> well?
>
>
>
> Thanks
>


Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
So we are adding another node to the cluster with the latest 0.6 branch
(RC1).  It seems to be hung in some limbo state.

Before bootstrapping our cluster had 50-60GB spread fairly evenly across 4
machines, with RF=3.   One machine had more load than the others, and sure
enough bootstrapping selected that node.   That is the red machine.  The
light blue machine is the new machine.

I have attached a graph to illustrate when the bootstrap process started.

In jconsole the streamingservice status was "performing anticompaction..."
for over 18-24 hrs.  It is currently in "nothing is happening".   It did
have 1 active STREAM-STAGE task, but the machine had to be rebooted for
something unrelated to cassandra. Now the light blue machine appears to be
getting data, but it's growing at virtually the same rate as the other
machines, which makes me think it is part of the cluster and not actually
streaming data from the machine it's supposed to.

Any other ideas on how to debug?


-- 
Dan Di Spaltro

Re: compression

2010-04-01 Thread Tatu Saloranta
On Thu, Apr 1, 2010 at 8:27 AM, Rao Venugopal  wrote:
> To Cao Jiguang
>
> I was watching this presentation on bigtable yesterday
> http://video.google.com/videoplay?docid=7278544055668715642#
>
> and Jeff mentioned that they compared three different compression libraries
> BMDiff, LZO and gzip.   Apparently, gzip was the most cpu intensive and they
> ended up going with BMDiff.
> I didn't find any Open source / Free implementation of BMDiff but I found
> LZO.
> http://www.oberhumer.com/opensource/lzo/

Another IMO good alternative is LZF -- it has characteristics similar
to LZO. Gzip (i.e. deflate) is a two-phase compressor, with the usual
Lempel-Ziv first, then Huffman (the oldest statistical encoding). LZO, LZF
and most other newer, simpler but less compressing variants usually
only do Lempel-Ziv.
Why LZF? Because there are simple Java free+open implementations: H2
has codec, I ported it to Voldemort, and I think there was talk of
generalizing one from H2 as stand-alone codec for reuse. Possibly
others may have ported it for other libs/frameworks too (there were
multiple jira issues for adding some of these to hadoop). Block format
itself is simple, and it is possible to decode adjacent blocks
separately by skipping encoded blocks without decoding: this can be
used to allow some level of random access (access random block, decode
it, access something inside the block).
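
The block-skipping idea can be sketched as follows. This is NOT the real LZF block format: the 8-byte header layout is made up for illustration, and zlib stands in for the codec because LZF is not in the Python standard library. The point is only that a length-prefixed block can be skipped without decoding it:

```python
import struct
import zlib

# Hypothetical header: (compressed length, uncompressed length), big-endian.
HEADER = struct.Struct(">II")


def encode_blocks(data: bytes, block_size: int = 1024) -> bytes:
    """Compress data as a sequence of independently decodable blocks."""
    out = bytearray()
    for start in range(0, len(data), block_size):
        chunk = data[start:start + block_size]
        packed = zlib.compress(chunk)
        out += HEADER.pack(len(packed), len(chunk))
        out += packed
    return bytes(out)


def read_block(encoded: bytes, index: int) -> bytes:
    """Return block number `index`, skipping earlier blocks undecoded."""
    offset = 0
    for _ in range(index):
        packed_len, _raw_len = HEADER.unpack_from(encoded, offset)
        offset += HEADER.size + packed_len  # jump over the compressed payload
    packed_len, raw_len = HEADER.unpack_from(encoded, offset)
    start = offset + HEADER.size
    block = zlib.decompress(encoded[start:start + packed_len])
    assert len(block) == raw_len
    return block


data = bytes(range(256)) * 20  # 5120 bytes, so five 1024-byte blocks
encoded = encode_blocks(data)
third = read_block(encoded, 2)
```

Reading block 2 touches only the headers of blocks 0 and 1, which is what gives the coarse random access described above.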

Performance-wise simpler codecs are fast enough to add less overhead
than fastest parsing of textual formats (json, xml), but more
importantly, they are MUCH faster to write (once again, not much more
overhead than format encoding). It is compression speed that really
kills gzip, esp. since it is often server that has to do it, for
small-requests, large-responses.

-+ Tatu +-


Re: Stalled Bootstrapping Process

2010-04-01 Thread Gary Dusbabek
Does the JMX StreamingService list any incoming/outgoing files/hosts
on the sending/receiving nodes?

Gary.

On Thu, Apr 1, 2010 at 10:26, Dan Di Spaltro  wrote:
> So we are adding another node to the cluster with the latest 0.6 branch
> (RC1).  It seems to be hung in some limbo state.
> Before bootstrapping our cluster had 50-60GB spread fairly evenly across 4
> machines, with RF=3.   One machine had more load than the others, and sure
> enough bootstrapping selected that node.   That is the red machine.  The
> light blue machine is the new machine.
> I have attached a graph to illustrate when the bootstrap process started.
> In jconsole the streamingservice status was "performing anticompaction..."
> for over 18-24 hrs.  It is currently in "nothing is happening".   It did
> have 1 active STREAM-STAGE task, but the machine had to be rebooted for
> something unrelated to cassandra. Now the light blue machine appears to be
> getting data, but its growing at virtually the same rate as the other
> machines which makes me think it is part of the cluster and not actually
> streaming data from the machine its supposed to.
> Any other ideas on how to debug?
>
> --
> Dan Di Spaltro
>


LazyBoy question

2010-04-01 Thread Gary
I am trying out the lazyboy library to access cassandra. I was able to get
the data in and out using the Record save/load functions. Is there a way to get
a slice, or all the records under a CF, so I can iterate? It is probably a
naive question, as I am just getting into this field.

Thanks,
Gary


Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
The light-blue machine is in Operation Mode: Bootstrap

On Thu, Apr 1, 2010 at 9:26 AM, Dan Di Spaltro wrote:

> So we are adding another node to the cluster with the latest 0.6 branch
> (RC1).  It seems to be hung in some limbo state.
>
> Before bootstrapping our cluster had 50-60GB spread fairly evenly across 4
> machines, with RF=3.   One machine had more load than the others, and sure
> enough bootstrapping selected that node.   That is the red machine.  The
> light blue machine is the new machine.
>
> I have attached a graph to illustrate when the bootstrap process started.
>
> In jconsole the streamingservice status was "performing anticompaction..."
> for over 18-24 hrs.  It is currently in "nothing is happening".   It did
> have 1 active STREAM-STAGE task, but the machine had to be rebooted for
> something unrelated to cassandra. Now the light blue machine appears to be
> getting data, but its growing at virtually the same rate as the other
> machines which makes me think it is part of the cluster and not actually
> streaming data from the machine its supposed to.
>
> Any other ideas on how to debug?
>
>
> --
> Dan Di Spaltro
>



-- 
Dan Di Spaltro


Re: Stalled Bootstrapping Process

2010-04-01 Thread Jonathan Ellis
which node rebooted, the red one, or the blue one?

On Thu, Apr 1, 2010 at 11:26 AM, Dan Di Spaltro  wrote:
> So we are adding another node to the cluster with the latest 0.6 branch
> (RC1).  It seems to be hung in some limbo state.
> Before bootstrapping our cluster had 50-60GB spread fairly evenly across 4
> machines, with RF=3.   One machine had more load than the others, and sure
> enough bootstrapping selected that node.   That is the red machine.  The
> light blue machine is the new machine.
> I have attached a graph to illustrate when the bootstrap process started.
> In jconsole the streamingservice status was "performing anticompaction..."
> for over 18-24 hrs.  It is currently in "nothing is happening".   It did
> have 1 active STREAM-STAGE task, but the machine had to be rebooted for
> something unrelated to cassandra. Now the light blue machine appears to be
> getting data, but its growing at virtually the same rate as the other
> machines which makes me think it is part of the cluster and not actually
> streaming data from the machine its supposed to.
> Any other ideas on how to debug?
>
> --
> Dan Di Spaltro
>


Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
Red one.

Gary - both say nothing is happening with no destinations or sources.

On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis  wrote:

> which node rebooted, the red one, or the blue one?
>
> On Thu, Apr 1, 2010 at 11:26 AM, Dan Di Spaltro 
> wrote:
> > So we are adding another node to the cluster with the latest 0.6 branch
> > (RC1).  It seems to be hung in some limbo state.
> > Before bootstrapping our cluster had 50-60GB spread fairly evenly across
> 4
> > machines, with RF=3.   One machine had more load than the others, and
> sure
> > enough bootstrapping selected that node.   That is the red machine.  The
> > light blue machine is the new machine.
> > I have attached a graph to illustrate when the bootstrap process started.
> > In jconsole the streamingservice status was "performing
> anticompaction..."
> > for over 18-24 hrs.  It is currently in "nothing is happening".   It did
> > have 1 active STREAM-STAGE task, but the machine had to be rebooted for
> > something unrelated to cassandra. Now the light blue machine appears to
> be
> > getting data, but its growing at virtually the same rate as the other
> > machines which makes me think it is part of the cluster and not actually
> > streaming data from the machine its supposed to.
> > Any other ideas on how to debug?
> >
> > --
> > Dan Di Spaltro
> >
>



-- 
Dan Di Spaltro


Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
Before the Red one rebooted it had 1 active STREAM-STAGE.  Now it has 0 in
STREAM-STAGE.

On Thu, Apr 1, 2010 at 11:57 AM, Dan Di Spaltro wrote:

> Red one.
>
> Gary - both say nothing is happening with no destinations or sources.
>
>
> On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis  wrote:
>
>> which node rebooted, the red one, or the blue one?
>>
>> On Thu, Apr 1, 2010 at 11:26 AM, Dan Di Spaltro 
>> wrote:
>> > So we are adding another node to the cluster with the latest 0.6 branch
>> > (RC1).  It seems to be hung in some limbo state.
>> > Before bootstrapping our cluster had 50-60GB spread fairly evenly across
>> 4
>> > machines, with RF=3.   One machine had more load than the others, and
>> sure
>> > enough bootstrapping selected that node.   That is the red machine.  The
>> > light blue machine is the new machine.
>> > I have attached a graph to illustrate when the bootstrap process
>> started.
>> > In jconsole the streamingservice status was "performing
>> anticompaction..."
>> > for over 18-24 hrs.  It is currently in "nothing is happening".   It did
>> > have 1 active STREAM-STAGE task, but the machine had to be rebooted for
>> > something unrelated to cassandra. Now the light blue machine appears to
>> be
>> > getting data, but its growing at virtually the same rate as the other
>> > machines which makes me think it is part of the cluster and not actually
>> > streaming data from the machine its supposed to.
>> > Any other ideas on how to debug?
>> >
>> > --
>> > Dan Di Spaltro
>> >
>>
>
>
>
> --
> Dan Di Spaltro
>



-- 
Dan Di Spaltro


Re: Stalled Bootstrapping Process

2010-04-01 Thread Jonathan Ellis
If the bootstrap source restarts, the bootstrap will always fail.  You'll need
to restart the blue one too now, I'm afraid.

On Thu, Apr 1, 2010 at 2:01 PM, Dan Di Spaltro  wrote:
> Before the Red one rebooted it had 1 active STREAM-STAGE.  Now it has 0 in
> STREAM-STAGE.
>
> On Thu, Apr 1, 2010 at 11:57 AM, Dan Di Spaltro 
> wrote:
>>
>> Red one.
>> Gary - both say nothing is happening with no destinations or sources.
>>
>> On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis  wrote:
>>>
>>> which node rebooted, the red one, or the blue one?
>>>
>>> On Thu, Apr 1, 2010 at 11:26 AM, Dan Di Spaltro 
>>> wrote:
>>> > So we are adding another node to the cluster with the latest 0.6 branch
>>> > (RC1).  It seems to be hung in some limbo state.
>>> > Before bootstrapping our cluster had 50-60GB spread fairly evenly
>>> > across 4
>>> > machines, with RF=3.   One machine had more load than the others, and
>>> > sure
>>> > enough bootstrapping selected that node.   That is the red machine.
>>> >  The
>>> > light blue machine is the new machine.
>>> > I have attached a graph to illustrate when the bootstrap process
>>> > started.
>>> > In jconsole the streamingservice status was "performing
>>> > anticompaction..."
>>> > for over 18-24 hrs.  It is currently in "nothing is happening".   It
>>> > did
>>> > have 1 active STREAM-STAGE task, but the machine had to be rebooted for
>>> > something unrelated to cassandra. Now the light blue machine appears to
>>> > be
>>> > getting data, but its growing at virtually the same rate as the other
>>> > machines which makes me think it is part of the cluster and not
>>> > actually
>>> > streaming data from the machine its supposed to.
>>> > Any other ideas on how to debug?
>>> >
>>> > --
>>> > Dan Di Spaltro
>>> >
>>
>>
>>
>> --
>> Dan Di Spaltro
>
>
>
> --
> Dan Di Spaltro
>


Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
Okay, so should I run any other commands, like cleanup, before restarting?

On Thu, Apr 1, 2010 at 12:09 PM, Jonathan Ellis  wrote:

> Bootstrap source restarting will always fail bootstrap.  You'll need
> to restart the blue one too now, I'm afraid.
>
> On Thu, Apr 1, 2010 at 2:01 PM, Dan Di Spaltro 
> wrote:
> > Before the Red one rebooted it had 1 active STREAM-STAGE.  Now it has 0
> in
> > STREAM-STAGE.
> >
> > On Thu, Apr 1, 2010 at 11:57 AM, Dan Di Spaltro  >
> > wrote:
> >>
> >> Red one.
> >> Gary - both say nothing is happening with no destinations or sources.
> >>
> >> On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis 
> wrote:
> >>>
> >>> which node rebooted, the red one, or the blue one?
> >>>
> >>> On Thu, Apr 1, 2010 at 11:26 AM, Dan Di Spaltro <
> dan.dispal...@gmail.com>
> >>> wrote:
> >>> > So we are adding another node to the cluster with the latest 0.6
> branch
> >>> > (RC1).  It seems to be hung in some limbo state.
> >>> > Before bootstrapping our cluster had 50-60GB spread fairly evenly
> >>> > across 4
> >>> > machines, with RF=3.   One machine had more load than the others, and
> >>> > sure
> >>> > enough bootstrapping selected that node.   That is the red machine.
> >>> >  The
> >>> > light blue machine is the new machine.
> >>> > I have attached a graph to illustrate when the bootstrap process
> >>> > started.
> >>> > In jconsole the streamingservice status was "performing
> >>> > anticompaction..."
> >>> > for over 18-24 hrs.  It is currently in "nothing is happening".   It
> >>> > did
> >>> > have 1 active STREAM-STAGE task, but the machine had to be rebooted
> for
> >>> > something unrelated to cassandra. Now the light blue machine appears
> to
> >>> > be
> >>> > getting data, but its growing at virtually the same rate as the other
> >>> > machines which makes me think it is part of the cluster and not
> >>> > actually
> >>> > streaming data from the machine its supposed to.
> >>> > Any other ideas on how to debug?
> >>> >
> >>> > --
> >>> > Dan Di Spaltro
> >>> >
> >>
> >>
> >>
> >> --
> >> Dan Di Spaltro
> >
> >
> >
> > --
> > Dan Di Spaltro
> >
>



-- 
Dan Di Spaltro


Re: Stalled Bootstrapping Process

2010-04-01 Thread Jonathan Ellis
There shouldn't be anything to clean up.  (The temporary streaming
files it anticompacted are automatically removed on restart)

On Thu, Apr 1, 2010 at 2:17 PM, Dan Di Spaltro  wrote:
> Okay, so should I run any more commands like cleanup before?
>
> On Thu, Apr 1, 2010 at 12:09 PM, Jonathan Ellis  wrote:
>>
>> Bootstrap source restarting will always fail bootstrap.  You'll need
>> to restart the blue one too now, I'm afraid.
>>
>> On Thu, Apr 1, 2010 at 2:01 PM, Dan Di Spaltro 
>> wrote:
>> > Before the Red one rebooted it had 1 active STREAM-STAGE.  Now it has 0
>> > in
>> > STREAM-STAGE.
>> >
>> > On Thu, Apr 1, 2010 at 11:57 AM, Dan Di Spaltro
>> > 
>> > wrote:
>> >>
>> >> Red one.
>> >> Gary - both say nothing is happening with no destinations or sources.
>> >>
>> >> On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis 
>> >> wrote:
>> >>>
>> >>> which node rebooted, the red one, or the blue one?
>> >>>
>> >>> On Thu, Apr 1, 2010 at 11:26 AM, Dan Di Spaltro
>> >>> 
>> >>> wrote:
>> >>> > So we are adding another node to the cluster with the latest 0.6
>> >>> > branch
>> >>> > (RC1).  It seems to be hung in some limbo state.
>> >>> > Before bootstrapping our cluster had 50-60GB spread fairly evenly
>> >>> > across 4
>> >>> > machines, with RF=3.   One machine had more load than the others,
>> >>> > and
>> >>> > sure
>> >>> > enough bootstrapping selected that node.   That is the red machine.
>> >>> >  The
>> >>> > light blue machine is the new machine.
>> >>> > I have attached a graph to illustrate when the bootstrap process
>> >>> > started.
>> >>> > In jconsole the streamingservice status was "performing
>> >>> > anticompaction..."
>> >>> > for over 18-24 hrs.  It is currently in "nothing is happening".   It
>> >>> > did
>> >>> > have 1 active STREAM-STAGE task, but the machine had to be rebooted
>> >>> > for
>> >>> > something unrelated to cassandra. Now the light blue machine appears
>> >>> > to
>> >>> > be
>> >>> > getting data, but its growing at virtually the same rate as the
>> >>> > other
>> >>> > machines which makes me think it is part of the cluster and not
>> >>> > actually
>> >>> > streaming data from the machine its supposed to.
>> >>> > Any other ideas on how to debug?
>> >>> >
>> >>> > --
>> >>> > Dan Di Spaltro
>> >>> >
>> >>
>> >>
>> >>
>> >> --
>> >> Dan Di Spaltro
>> >
>> >
>> >
>> > --
>> > Dan Di Spaltro
>> >
>
>
>
> --
> Dan Di Spaltro
>


Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
But I didn't restart the red one.

On Thu, Apr 1, 2010 at 12:18 PM, Jonathan Ellis  wrote:

> There shouldn't be anything to clean up.  (The temporary streaming
> files it anticompacted are automatically removed on restart)
>
> On Thu, Apr 1, 2010 at 2:17 PM, Dan Di Spaltro 
> wrote:
> > Okay, so should I run any more commands like cleanup before?
> >
> > On Thu, Apr 1, 2010 at 12:09 PM, Jonathan Ellis 
> wrote:
> >>
> >> Bootstrap source restarting will always fail bootstrap.  You'll need
> >> to restart the blue one too now, I'm afraid.
> >>
> >> On Thu, Apr 1, 2010 at 2:01 PM, Dan Di Spaltro  >
> >> wrote:
> >> > Before the Red one rebooted it had 1 active STREAM-STAGE.  Now it has
> 0
> >> > in
> >> > STREAM-STAGE.
> >> >
> >> > On Thu, Apr 1, 2010 at 11:57 AM, Dan Di Spaltro
> >> > 
> >> > wrote:
> >> >>
> >> >> Red one.
> >> >> Gary - both say nothing is happening with no destinations or sources.
> >> >>
> >> >> On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis 
> >> >> wrote:
> >> >>>
> >> >>> which node rebooted, the red one, or the blue one?
> >> >>>
> >> >>> On Thu, Apr 1, 2010 at 11:26 AM, Dan Di Spaltro
> >> >>> 
> >> >>> wrote:
> >> >>> > So we are adding another node to the cluster with the latest 0.6
> >> >>> > branch
> >> >>> > (RC1).  It seems to be hung in some limbo state.
> >> >>> > Before bootstrapping our cluster had 50-60GB spread fairly evenly
> >> >>> > across 4
> >> >>> > machines, with RF=3.   One machine had more load than the others,
> >> >>> > and
> >> >>> > sure
> >> >>> > enough bootstrapping selected that node.   That is the red
> machine.
> >> >>> >  The
> >> >>> > light blue machine is the new machine.
> >> >>> > I have attached a graph to illustrate when the bootstrap process
> >> >>> > started.
> >> >>> > In jconsole the streamingservice status was "performing
> >> >>> > anticompaction..."
> >> >>> > for over 18-24 hrs.  It is currently in "nothing is happening".
> It
> >> >>> > did
> >> >>> > have 1 active STREAM-STAGE task, but the machine had to be
> rebooted
> >> >>> > for
> >> >>> > something unrelated to cassandra. Now the light blue machine
> appears
> >> >>> > to
> >> >>> > be
> >> >>> > getting data, but its growing at virtually the same rate as the
> >> >>> > other
> >> >>> > machines which makes me think it is part of the cluster and not
> >> >>> > actually
> >> >>> > streaming data from the machine its supposed to.
> >> >>> > Any other ideas on how to debug?
> >> >>> >
> >> >>> > --
> >> >>> > Dan Di Spaltro
> >> >>> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Dan Di Spaltro
> >> >
> >> >
> >> >
> >> > --
> >> > Dan Di Spaltro
> >> >
> >
> >
> >
> > --
> > Dan Di Spaltro
> >
>



-- 
Dan Di Spaltro


Re: Stalled Bootstrapping Process

2010-04-01 Thread Jonathan Ellis
On Thu, Apr 1, 2010 at 2:22 PM, Dan Di Spaltro  wrote:
> But I didn't restart the red one.

>> >> > On Thu, Apr 1, 2010 at 11:57 AM, Dan Di Spaltro
>> >> > 
>> >> > wrote:
>> >> >>
>> >> >> Red one.
>> >> >>
>> >> >> On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis 
>> >> >> wrote:
>> >> >>>
>> >> >>> which node rebooted, the red one, or the blue one?

I'm confused.


Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
Sorry, I meant the red one restarted about a day ago.  The graph shows
the dip in disk space.  But it has nowhere near returned to the previous
amount of disk usage.  I was referring to how the red one didn't
reclaim all its space (I figure about 60gb actually belongs on that
machine). Is that normal (it's currently taking up about 100gb)?

2 minutes ago, I restarted the blue one.

Now the streamservice task is performing anti-compaction on the red one.

On Thu, Apr 1, 2010 at 12:25 PM, Jonathan Ellis  wrote:
>
> On Thu, Apr 1, 2010 at 2:22 PM, Dan Di Spaltro  
> wrote:
> > But I didn't restart the red one.
>
> >> >> > On Thu, Apr 1, 2010 at 11:57 AM, Dan Di Spaltro
> >> >> > 
> >> >> > wrote:
> >> >> >>
> >> >> >> Red one.
> >> >> >>
> >> >> >> On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis 
> >> >> >> wrote:
> >> >> >>>
> >> >> >>> which node rebooted, the red one, or the blue one?
>
> I'm confused.

--
Dan Di Spaltro


Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
So it looks like it's still performing anti-compaction.  Is the
compactionmanager the best way to track this?

On Thu, Apr 1, 2010 at 12:31 PM, Dan Di Spaltro  wrote:
> Sorry I meant the red one restarted about a day ago.  The graph shows
> the dip in disk space.  But it no where near returned to the previous
> amount of disk usage.  I was referring to how the red one didn't
> reclaim all its space (I figure about 60gb actually belong on that
> machine) Is that normal (its currently taking up about 100gb)?
>
> 2 minutes ago, I restarted the blue one.
>
> Now the streamservice task is performing anti-compaction on the red one.
>
> On Thu, Apr 1, 2010 at 12:25 PM, Jonathan Ellis  wrote:
>>
>> On Thu, Apr 1, 2010 at 2:22 PM, Dan Di Spaltro  
>> wrote:
>> > But I didn't restart the red one.
>>
>> >> >> > On Thu, Apr 1, 2010 at 11:57 AM, Dan Di Spaltro
>> >> >> > 
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> Red one.
>> >> >> >>
>> >> >> >> On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis 
>> >> >> >> wrote:
>> >> >> >>>
>> >> >> >>> which node rebooted, the red one, or the blue one?
>>
>> I'm confused.
>
> --
> Dan Di Spaltro
>



-- 
Dan Di Spaltro


Re: Stalled Bootstrapping Process

2010-04-01 Thread Jonathan Ellis
Right.

On Thu, Apr 1, 2010 at 3:15 PM, Dan Di Spaltro  wrote:
> So it looks like its still performing anti-compaction.  The
> compactionmanager is the best way to track this?
>
> On Thu, Apr 1, 2010 at 12:31 PM, Dan Di Spaltro  
> wrote:
>> Sorry I meant the red one restarted about a day ago.  The graph shows
>> the dip in disk space.  But it no where near returned to the previous
>> amount of disk usage.  I was referring to how the red one didn't
>> reclaim all its space (I figure about 60gb actually belong on that
>> machine) Is that normal (its currently taking up about 100gb)?
>>
>> 2 minutes ago, I restarted the blue one.
>>
>> Now the streamservice task is performing anti-compaction on the red one.
>>
>> On Thu, Apr 1, 2010 at 12:25 PM, Jonathan Ellis  wrote:
>>>
>>> On Thu, Apr 1, 2010 at 2:22 PM, Dan Di Spaltro  
>>> wrote:
>>> > But I didn't restart the red one.
>>>
>>> >> >> > On Thu, Apr 1, 2010 at 11:57 AM, Dan Di Spaltro
>>> >> >> > 
>>> >> >> > wrote:
>>> >> >> >>
>>> >> >> >> Red one.
>>> >> >> >>
>>> >> >> >> On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis 
>>> >> >> >> wrote:
>>> >> >> >>>
>>> >> >> >>> which node rebooted, the red one, or the blue one?
>>>
>>> I'm confused.
>>
>> --
>> Dan Di Spaltro
>>
>
>
>
> --
> Dan Di Spaltro
>


Re: Stalled Bootstrapping Process

2010-04-01 Thread Dan Di Spaltro
Seems to be doing more stuff now.

I've attached an updated screenshot.

On Thu, Apr 1, 2010 at 1:16 PM, Jonathan Ellis  wrote:
> Right.
>
> On Thu, Apr 1, 2010 at 3:15 PM, Dan Di Spaltro  
> wrote:
>> So it looks like its still performing anti-compaction.  The
>> compactionmanager is the best way to track this?
>>
>> On Thu, Apr 1, 2010 at 12:31 PM, Dan Di Spaltro  
>> wrote:
>>> Sorry I meant the red one restarted about a day ago.  The graph shows
>>> the dip in disk space.  But it no where near returned to the previous
>>> amount of disk usage.  I was referring to how the red one didn't
>>> reclaim all its space (I figure about 60gb actually belong on that
>>> machine) Is that normal (its currently taking up about 100gb)?
>>>
>>> 2 minutes ago, I restarted the blue one.
>>>
>>> Now the streamservice task is performing anti-compaction on the red one.
>>>
>>> On Thu, Apr 1, 2010 at 12:25 PM, Jonathan Ellis  wrote:

>>>> On Thu, Apr 1, 2010 at 2:22 PM, Dan Di Spaltro
>>>> wrote:
>>>> > But I didn't restart the red one.
>>>>
>>>> >> >> > On Thu, Apr 1, 2010 at 11:57 AM, Dan Di Spaltro
>>>> >> >> >
>>>> >> >> > wrote:
>>>> >> >> >>
>>>> >> >> >> Red one.
>>>> >> >> >>
>>>> >> >> >> On Thu, Apr 1, 2010 at 11:55 AM, Jonathan Ellis
>>>> >> >> >>
>>>> >> >> >> wrote:
>>>> >> >> >>>
>>>> >> >> >>> which node rebooted, the red one, or the blue one?
>>>>
>>>> I'm confused.
>>>
>>> --
>>> Dan Di Spaltro
>>>
>>
>>
>>
>> --
>> Dan Di Spaltro
>>
>



-- 
Dan Di Spaltro

Re: Read Performance

2010-04-01 Thread James Golick
I don't have the additional hardware to try to isolate this issue atm, so I
decided to push some code that performs 20% of reads directly from
cassandra. The cache hit rate has gone up to about 88% now and it's still
climbing, albeit slowly. There remains plenty of free cache space.

So far, the average time to multi_get those 20 rows is still hovering around
35-45ms.

I'll report back with more info as it comes in.

On Thu, Apr 1, 2010 at 12:06 AM, Cemal Dalar  wrote:

> Hi James,
>
> I don't know how to get the below statistics data and calculate the access
> times (read/write in ms) in your previous mails. Can you explain a little?
> I'd like to work on it also.
>
> CD
>
>
> On Thu, Apr 1, 2010 at 4:15 AM, Jonathan Ellis  wrote:
>
>> On Wed, Mar 31, 2010 at 6:21 PM, James Golick 
>> wrote:
>> > Keyspace: ActivityFeed
>> > Read Count: 699443
>> > Read Latency: 16.11017477192566 ms.
>>
>> > Column Family: Events
>> > Read Count: 232378
>> > Read Latency: 0.396 ms.
>> > Row cache capacity: 50
>> > Row cache size: 62768
>> > Row cache hit rate: 0.007716049382716049
>>
>> This says that
>>
>>  - recent queries to Events are much faster than the lifetime average
>> for your Keyspace
>>  - even though you have almost no row cache hits (~1700 out of 232000
>> reads)
>>
>> Not sure what to make of that, tbh.  If it were me I would try to
>> reproduce on a test machine w/o all that pesky live traffic confusing
>> things.
>>
>> -Jonathan
>>
>
>


Re: Read Performance

2010-04-01 Thread Joseph Stump
Taking our flamewar offline. :-D

On Thu, Apr 1, 2010 at 1:36 PM, James Golick  wrote:
> I don't have the additional hardware to try to isolate this issue atm

You'd be able to spin up hardware to isolate that issue on AWS. ;)

--Joe


Re: Read Performance

2010-04-01 Thread Jeremy Dunck
Or rackspace.  ;)

On Thu, Apr 1, 2010 at 2:49 PM, Joseph Stump  wrote:
> Taking our flamewar offline. :-D
>
> On Thu, Apr 1, 2010 at 1:36 PM, James Golick  wrote:
>> I don't have the additional hardware to try to isolate this issue atm
>
> You'd be able to spin up hardware to isolate that issue on AWS. ;)
>
> --Joe
>


Re: Read Performance

2010-04-01 Thread James Golick
Damnit!

On Thu, Apr 1, 2010 at 2:05 PM, Jeremy Dunck  wrote:

> Or rackspace.  ;)
>
> On Thu, Apr 1, 2010 at 2:49 PM, Joseph Stump  wrote:
> > Taking our flamewar offline. :-D
> >
> > On Thu, Apr 1, 2010 at 1:36 PM, James Golick 
> wrote:
> >> I don't have the additional hardware to try to isolate this issue atm
> >
> > You'd be able to spin up hardware to isolate that issue on AWS. ;)
> >
> > --Joe
> >
>


Re: Stalled Bootstrapping Process

2010-04-01 Thread Jonathan Ellis
I would turn debug logging on globally on the new node; that will
answer more questions than just the streaming package.


Re: Read Performance

2010-04-01 Thread Peter Chang
pwned.

On Thu, Apr 1, 2010 at 2:09 PM, James Golick  wrote:

> Damnit!
>
>
> On Thu, Apr 1, 2010 at 2:05 PM, Jeremy Dunck  wrote:
>
>> Or rackspace.  ;)
>>
>> On Thu, Apr 1, 2010 at 2:49 PM, Joseph Stump  wrote:
>> > Taking our flamewar offline. :-D
>> >
>> > On Thu, Apr 1, 2010 at 1:36 PM, James Golick 
>> wrote:
>> >> I don't have the additional hardware to try to isolate this issue atm
>> >
>> > You'd be able to spin up hardware to isolate that issue on AWS. ;)
>> >
>> > --Joe
>> >
>>
>
>


Proxy instances?

2010-04-01 Thread David King
Is it possible to have Cassandra instances that serve only as proxies to the 
rest of the cluster, but have no storage themselves? Maybe with a keyspace 
length of 0?

Re: Proxy instances?

2010-04-01 Thread Brandon Williams
On Thu, Apr 1, 2010 at 7:19 PM, David King  wrote:

> Is it possible to have Cassandra instances that serve only as proxies to
> the rest of the cluster, but have no storage themselves? Maybe with a
> keyspace length of 0?


contrib/client_only is what you're looking for.

-Brandon


Creating a Total Ordered Queue in Cassandra

2010-04-01 Thread Jeremy Davis
I'm in the process of implementing a Totally Ordered Queue in Cassandra, and
wanted to bounce my ideas off the list and also see if there are any other
suggestions.

I've come up with an external source of ID's that are always increasing (but
not monotonic), and I've also used external synchronization to ensure only
one writer to a given queue. And I handle de-duping in the app.


My current solution is (simplified):

Use the "QueueId" as the key into a row of a CF.
Then every column in that row corresponds to a new entry in the queue, with
a custom Comparator to sort the columns by my external ID, which is always
increasing.

Technically I never delete data from the Queue, and I just page through it
from a given ID using a SliceRange, etc.

Obviously the problem is that the row needs to get compacted, so I
started bucketizing with multiple rows for a given queue (for example, one
per day; again I'm simplifying), so the key is now "QueueId+Day".

Does this seem reasonable? It's solvable, but is starting to seem
complicated to implement... It would be very easy if I didn't have to have
multiple buckets..
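
A minimal sketch of the per-day bucketing described above, assuming one row
per queue per UTC day. The class and method names (QueueBuckets, rowKey,
rowKeyFor, rowKeysBetween) and the ":" separator are made up for
illustration; none of them are part of any Cassandra API. It only shows how
the "QueueId+Day" row keys could be built and how a reader could list the
buckets it has to page through for a date range.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;

public class QueueBuckets {
    // Hypothetical row key layout: "<queueId>:<yyyy-MM-dd>".
    static String rowKey(String queueId, LocalDate day) {
        return queueId + ":" + day; // LocalDate.toString() is ISO-8601
    }

    // Bucket for a single message, chosen by its timestamp (UTC is an assumption).
    static String rowKeyFor(String queueId, Instant timestamp) {
        return rowKey(queueId, timestamp.atZone(ZoneOffset.UTC).toLocalDate());
    }

    // All buckets a reader must visit, oldest first, to replay [from, to].
    static List<String> rowKeysBetween(String queueId, LocalDate from, LocalDate to) {
        List<String> keys = new ArrayList<>();
        for (LocalDate d = from; !d.isAfter(to); d = d.plusDays(1)) {
            keys.add(rowKey(queueId, d));
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(rowKeyFor("q42", Instant.parse("2010-04-01T23:59:59Z")));
        for (String key : rowKeysBetween("q42", LocalDate.of(2010, 3, 31), LocalDate.of(2010, 4, 2))) {
            System.out.println(key);
        }
    }
}
```

The reader pages through each bucket in turn with the same column slice
logic as before, moving to the next day's row only when the current one is
exhausted, so most of the extra complexity stays in one small helper.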



My other thought is to store one entry per row, and perform get_range_slices
with a KeyRange, using the OrderPreservingPartitioner.
But it isn't exactly clear to me what the order of the keys is in this
system, so I don't know how to construct my keys and queries appropriately.
Is this lexical string order? Or something else?

So, for example, assuming my QueueIds are longs and my IDs are also
longs, my key would be (in Java):

long queueId;
long msgId;

String key = "" + queueId + ":" + msgId;

And if I wanted to do a query, my key range might be:
String start = "" + queueId + ":0";
String end = "" + queueId + ":" + Long.MAX_VALUE;

(Will I have to left-pad the msgIds with 0s?)

And is this going to be efficient if my msgId isn't monotonically
increasing?
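
On the left-padding question: a small sketch, assuming the keys end up being
compared as plain strings (that is an assumption about the partitioner, not
something confirmed in this thread) and that both IDs are non-negative. The
class PaddedQueueKey and its methods are hypothetical names. It shows that
the unpadded form sorts "7:9" after "7:10", while padding both numbers to 19
digits (the width of Long.MAX_VALUE) keeps string order equal to numeric
order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PaddedQueueKey {
    // Unpadded key, as in the message above.
    static String naiveKey(long queueId, long msgId) {
        return "" + queueId + ":" + msgId;
    }

    // Zero-padded key: Long.MAX_VALUE has 19 decimal digits, so 19 is wide
    // enough for any non-negative long. Negative ids are not handled here.
    static String key(long queueId, long msgId) {
        return String.format("%019d:%019d", queueId, msgId);
    }

    public static void main(String[] args) {
        long queueId = 7;
        long[] msgIds = {9, 10, 100, 2};

        List<String> naive = new ArrayList<>();
        List<String> padded = new ArrayList<>();
        for (long msgId : msgIds) {
            naive.add(naiveKey(queueId, msgId));
            padded.add(key(queueId, msgId));
        }
        // Collections.sort on strings uses String.compareTo, i.e. lexical order.
        Collections.sort(naive);
        Collections.sort(padded);
        System.out.println("naive:  " + naive);
        System.out.println("padded: " + padded);

        // Range bounds for one queue, in the padded form.
        System.out.println("start = " + key(queueId, 0));
        System.out.println("end   = " + key(queueId, Long.MAX_VALUE));
    }
}
```

With fixed-width fields the start and end keys bracket exactly one queue's
messages, and gaps in the msgId sequence do not affect the ordering.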

Thanks,
-JD


Re: Creating a Total Ordered Queue in Cassandra

2010-04-01 Thread Keith Thornhill
you mention never deleting from the queue, so what purpose is this
serving? (if you don't pop off the front, is it really a queue?)

seems if guaranteed order of messages is required, there are many
other projects which are focused towards that problem (rabbitmq,
kestrel, activemq, etc)

or am i misunderstanding your needs here?

-keith

On Thu, Apr 1, 2010 at 6:32 PM, Jeremy Davis
 wrote:
> I'm in the process of implementing a Totally Ordered Queue in Cassandra, and
> wanted to bounce my ideas off the list and also see if there are any other
> suggestions.
>
> I've come up with an external source of ID's that are always increasing (but
> not monotonic), and I've also used external synchronization to ensure only
> one writer to a given queue. And I handle de-duping in the app.
>
>
> My current solution is : (simplified)
>
> Use the "QueueId", to Key into a row of a CF.
> Then, every column in that CF corresponds to a new entry in the Queue, with
> a custom Comparator to sort the columns by my external ID that is always
> increasing.
>
> Technically I never delete data from the Queue, and I just page through it
> from a given ID using a SliceRange, etc.
>
> Obviously the problem being that the row needs to get compacted. so then I
> started bucketizing with multiple rows for a given queue (for example one
> per day (again I'm simplifying))...(so the Key is now "QueueId+Day"...)
>
> Does this seem reasonable? It's solvable, but is starting to seem
> complicated to implement... It would be very easy if I didn't have to have
> multiple buckets..
>
>
>
> My other thought is to store one entry per row, and perform get_range_slices
> and specify a KeyRange, with the OrderPreservingPartitioner.
> But it isn't exactly clear to me what the Order of the keys are in this
> system, so I don't know how to construct my key and queries appropriately...
> Is this Lexical String Order? Or?
>
> So for example.. Assuming my QueueId's are longs, and my ID's are also
> longs.. My key would be (in Java):
>
> long queueId;
> long msgId;
>
> key = "" + queueId + ":" + msgId;
>
> And if I wanted to do a query my key range might be from
> start = "" + queueId + ":0"
> end = "" + queueId + ":" + Long.MAX_VALUE;
>
> (Will I have to left pad the msgIds with 0's)?
>
> And is this going to be efficient if my msgId isn't monotonically
> increasing?
>
> Thanks,
> -JD
>


Re: Re: compression

2010-04-01 Thread casablinca126.com
Hi,
Great!
Thanks to Rao and Tatu :)
I will test them and let you know what I find.
Regards,
Cao Jiguang

-
From: Tatu Saloranta
Sent: 2010-04-02 01:08:52
To: u...@cassandra.apache.org
Cc:
Subject: Re: compression

On Thu, Apr 1, 2010 at 8:27 AM, Rao Venugopal  wrote:
> To Cao Jiguang
>
> I was watching this presentation on bigtable yesterday
> http://video.google.com/videoplay?docid=7278544055668715642#
>
> and Jeff mentioned that they compared three different compression libraries
> BMDiff, LZO and gzip. Apparently, gzip was the most cpu intensive and they
> ended up going with BMDiff.
> I didn't find any Open source / Free implementation of BMDiff but I found
> LZO.
> http://www.oberhumer.com/opensource/lzo/

Another IMO good alternative is LZF -- it has characteristics similar
to LZO. Gzip (i.e. deflate) is a two-phase compressor, with the usual
Lempel-Ziv pass first, then Huffman coding (the oldest statistical
encoding). LZO, LZF and most other newer, simpler but less-compressing
variants usually do only Lempel-Ziv.
Why LZF? Because there are simple, free and open Java implementations: H2
has a codec, I ported it to Voldemort, and I think there was talk of
generalizing the one from H2 as a stand-alone codec for reuse. Others
may have ported it to other libs/frameworks too (there were
multiple jira issues for adding some of these to hadoop). The block format
itself is simple, and it is possible to decode adjacent blocks
separately by skipping encoded blocks without decoding: this can be
used to allow some level of random access (access a random block, decode
it, access something inside the block).

Performance-wise, the simpler codecs are fast enough to add less overhead
than the fastest parsing of textual formats (json, xml), but more
importantly, they are MUCH faster to write (once again, not much more
overhead than format encoding). It is compression speed that really
kills gzip, esp. since it is often the server that has to do it, for
small-request, large-response workloads.
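
For reference on the gzip side of this comparison, here is a minimal sketch
of the client-side approach mentioned earlier in the thread: compress a
value before writing it and decompress it after reading. It uses only
java.util.zip from the JDK; the class name ClientSideGzip and the sample
payload are made up for illustration, and LZF is not shown because it is
not part of the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ClientSideGzip {
    // Compress a column value on the client before handing it to the store.
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Reverse step after reading the value back.
    static byte[] decompress(byte[] packed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Repetitive sample payload, so it compresses well.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            sb.append("{\"event\":\"login\",\"user\":\"example\"}");
        }
        byte[] raw = sb.toString().getBytes(StandardCharsets.UTF_8);

        long start = System.nanoTime();
        byte[] packed = compress(raw);
        long micros = (System.nanoTime() - start) / 1000;

        System.out.println("raw bytes:        " + raw.length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("compress time us: " + micros);
        System.out.println("round trip ok:    " + java.util.Arrays.equals(raw, decompress(packed)));
    }
}
```

Timing a single call like this is only a rough indication; comparing codecs
fairly would need repeated runs on representative data.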

-+ Tatu +-


Re: Read Performance

2010-04-01 Thread James Golick
Well, folks, I'm feeling a little stupid right now (adding to the injury
inflicted by one Mr. Stump :-P).

So, here's the story. The cache hit rate is up around 97% now. The ruby code
is down to around 20-25ms to multiget the 20 rows. I did some profiling,
though, and realized that a lot of time was being spent in thrift. Turns
out, that's where pretty much all the time was going.

I just ran the same test using java (scala) and the load is taking around
2-4ms.

On Thu, Apr 1, 2010 at 4:37 PM, Peter Chang  wrote:

> pwned.
>
>
> On Thu, Apr 1, 2010 at 2:09 PM, James Golick wrote:
>
>> Damnit!
>>
>>
>> On Thu, Apr 1, 2010 at 2:05 PM, Jeremy Dunck  wrote:
>>
>>> Or rackspace.  ;)
>>>
>>> On Thu, Apr 1, 2010 at 2:49 PM, Joseph Stump  wrote:
>>> > Taking our flamewar offline. :-D
>>> >
>>> > On Thu, Apr 1, 2010 at 1:36 PM, James Golick 
>>> wrote:
>>> >> I don't have the additional hardware to try to isolate this issue atm
>>> >
>>> > You'd be able to spin up hardware to isolate that issue on AWS. ;)
>>> >
>>> > --Joe
>>> >
>>>
>>
>>
>


Re: Read Performance

2010-04-01 Thread Brandon Williams
On Thu, Apr 1, 2010 at 9:37 PM, James Golick  wrote:

> Well, folks, I'm feeling a little stupid right now (adding to the injury
> inflicted by one Mr. Stump :-P).
>
> So, here's the story. The cache hit rate is up around 97% now. The ruby
> code is down to around 20-25ms to multiget the 20 rows. I did some
> profiling, though, and realized that a lot of time was being spent in
> thrift. Turns out, that's where pretty much all the time was going.
>
> I just ran the same test using java (scala) and the load is taking around
> 2-4ms.
>

That's with the binary accelerated thrift for ruby?

-Brandon


Re: Creating a Total Ordered Queue in Cassandra

2010-04-01 Thread Jeremy Davis
You are correct, it is not a queue in the classic sense... I'm storing the
entire "conversation" with a client in perpetuity, and then playing it back
in the order received.

Rabbitmq/activemq etc. all have about the same throughput of 3-6K persistent
messages/sec, and are not good for storing the conversation forever... Also,
I can easily scale cassandra past that message rate and not have to worry
about which message broker/cluster I'm connecting to, or which one has the
conversation, etc.



On Thu, Apr 1, 2010 at 7:02 PM, Keith Thornhill  wrote:

> you mention never deleting from the queue, so what purpose is this
> serving? (if you don't pop off the front, is it really a queue?)
>
> seems if guaranteed order of messages is required, there are many
> other projects which are focused towards that problem (rabbitmq,
> kestrel, activemq, etc)
>
> or am i misunderstanding your needs here?
>
> -keith
>
> On Thu, Apr 1, 2010 at 6:32 PM, Jeremy Davis
>  wrote:
> > I'm in the process of implementing a Totally Ordered Queue in Cassandra,
> and
> > wanted to bounce my ideas off the list and also see if there are any
> other
> > suggestions.
> >
> > I've come up with an external source of ID's that are always increasing
> (but
> > not monotonic), and I've also used external synchronization to ensure
> only
> > one writer to a given queue. And I handle de-duping in the app.
> >
> >
> > My current solution is : (simplified)
> >
> > Use the "QueueId", to Key into a row of a CF.
> > Then, every column in that CF corresponds to a new entry in the Queue,
> with
> > a custom Comparator to sort the columns by my external ID that is always
> > increasing.
> >
> > Technically I never delete data from the Queue, and I just page through
> it
> > from a given ID using a SliceRange, etc.
> >
> > Obviously the problem being that the row needs to get compacted. so then
> I
> > started bucketizing with multiple rows for a given queue (for example one
> > per day (again I'm simplifying))...(so the Key is now "QueueId+Day"...)
> >
> > Does this seem reasonable? It's solvable, but is starting to seem
> > complicated to implement... It would be very easy if I didn't have to
> have
> > multiple buckets..
> >
> >
> >
> > My other thought is to store one entry per row, and perform
> get_range_slices
> > and specify a KeyRange, with the OrderPreservingPartitioner.
> > But it isn't exactly clear to me what the Order of the keys are in this
> > system, so I don't know how to construct my key and queries
> appropriately...
> > Is this Lexical String Order? Or?
> >
> > So for example.. Assuming my QueueId's are longs, and my ID's are also
> > longs.. My key would be (in Java):
> >
> > long queueId;
> > long msgId;
> >
> > key = "" + queueId + ":" + msgId;
> >
> > And if I wanted to do a query my key range might be from
> > start = "" + queueId + ":0"
> > end = "" + queueId + ":" + Long.MAX_VALUE;
> >
> > (Will I have to left pad the msgIds with 0's)?
> >
> > And is this going to be efficient if my msgId isn't monotonically
> > increasing?
> >
> > Thanks,
> > -JD
> >
>


Re: Read Performance

2010-04-01 Thread James Golick

Yes.

J.

Sent from my iPhone.

On 2010-04-01, at 9:21 PM, Brandon Williams  wrote:

> On Thu, Apr 1, 2010 at 9:37 PM, James Golick wrote:
>> Well, folks, I'm feeling a little stupid right now (adding to the
>> injury inflicted by one Mr. Stump :-P).
>>
>> So, here's the story. The cache hit rate is up around 97% now. The
>> ruby code is down to around 20-25ms to multiget the 20 rows. I did
>> some profiling, though, and realized that a lot of time was being
>> spent in thrift. Turns out, that's where pretty much all the time
>> was going.
>>
>> I just ran the same test using java (scala) and the load is taking
>> around 2-4ms.
>
> That's with the binary accelerated thrift for ruby?
>
> -Brandon


Re: Creating a Total Ordered Queue in Cassandra

2010-04-01 Thread Jeremy Davis
Since twitter is everyone's favorite analogy:
It's like twitter, but faster and with bigger messages that I may need to go
back and replay in order to mine for more details at a later date.
Thus, I call it a queue, because the order of messages is important.. But
not anything like a message broker/pub-sub/topic/ etc...

-JD



On Thu, Apr 1, 2010 at 9:43 PM, Jeremy Davis
wrote:

>
> You are correct, it is not a queue in the classic sense... I'm storing the
> entire "conversation" with a client in perpetuity, and then playing it back
> in the order received.
>
> Rabbitmq/activemq etc all have about the same throughput 3-6K persistent
> messages/sec, and are not good for storing the conversation forever... Also
> I can easily scale cassandra past that message rate and not have to worry
> about which message broker/cluster I'm connecting to/has the
> conversation/etc.
>
>
>
>
> On Thu, Apr 1, 2010 at 7:02 PM, Keith Thornhill  wrote:
>
>> you mention never deleting from the queue, so what purpose is this
>> serving? (if you don't pop off the front, is it really a queue?)
>>
>> seems if guaranteed order of messages is required, there are many
>> other projects which are focused towards that problem (rabbitmq,
>> kestrel, activemq, etc)
>>
>> or am i misunderstanding your needs here?
>>
>> -keith
>>
>> On Thu, Apr 1, 2010 at 6:32 PM, Jeremy Davis
>>  wrote:
>> > I'm in the process of implementing a Totally Ordered Queue in Cassandra,
>> and
>> > wanted to bounce my ideas off the list and also see if there are any
>> other
>> > suggestions.
>> >
>> > I've come up with an external source of ID's that are always increasing
>> (but
>> > not monotonic), and I've also used external synchronization to ensure
>> only
>> > one writer to a given queue. And I handle de-duping in the app.
>> >
>> >
>> > My current solution is : (simplified)
>> >
>> > Use the "QueueId", to Key into a row of a CF.
>> > Then, every column in that CF corresponds to a new entry in the Queue,
>> with
>> > a custom Comparator to sort the columns by my external ID that is always
>> > increasing.
>> >
>> > Technically I never delete data from the Queue, and I just page through
>> it
>> > from a given ID using a SliceRange, etc.
>> >
>> > Obviously the problem being that the row needs to get compacted. so then
>> I
>> > started bucketizing with multiple rows for a given queue (for example
>> one
>> > per day (again I'm simplifying))...(so the Key is now "QueueId+Day"...)
>> >
>> > Does this seem reasonable? It's solvable, but is starting to seem
>> > complicated to implement... It would be very easy if I didn't have to
>> have
>> > multiple buckets..
>> >
>> >
>> >
>> > My other thought is to store one entry per row, and perform
>> get_range_slices
>> > and specify a KeyRange, with the OrderPreservingPartitioner.
>> > But it isn't exactly clear to me what the Order of the keys are in this
>> > system, so I don't know how to construct my key and queries
>> appropriately...
>> > Is this Lexical String Order? Or?
>> >
>> > So for example.. Assuming my QueueId's are longs, and my ID's are also
>> > longs.. My key would be (in Java):
>> >
>> > long queueId;
>> > long msgId;
>> >
>> > key = "" + queueId + ":" + msgId;
>> >
>> > And if I wanted to do a query my key range might be from
>> > start = "" + queueId + ":0"
>> > end = "" + queueId + ":" + Long.MAX_VALUE;
>> >
>> > (Will I have to left pad the msgIds with 0's)?
>> >
>> > And is this going to be efficient if my msgId isn't monotonically
>> > increasing?
>> >
>> > Thanks,
>> > -JD
>> >
>>
>
>


best practice for migrating data

2010-04-01 Thread AJ Chen
When adding a column to, or changing a column in, a column family that
already holds data in Cassandra, what's a good way to do it?
Thanks,
-aj
--
AJ Chen, PhD
Chair, Semantic Web SIG, sdforum.org
http://web2express.org
twitter @web2express
Palo Alto, CA, USA


Re: Creating a Total Ordered Queue in Cassandra

2010-04-01 Thread Tatu Saloranta
On Thu, Apr 1, 2010 at 9:43 PM, Jeremy Davis
 wrote:
>
> You are correct, it is not a queue in the classic sense... I'm storing the
> entire "conversation" with a client in perpetuity, and then playing it back
> in the order received.
>
> Rabbitmq/activemq etc all have about the same throughput 3-6K persistent
> messages/sec, and are not good for storing the conversation forever... Also
> I can easily scale cassandra past that message rate and not have to worry
> about which message broker/cluster I'm connecting to/has the
> conversation/etc.

Also: I think RabbitMQ specifically does not have distributed message
stores -- each message lives in just one queue node, meaning that when
that node is down (or gets wiped out), so are the messages for that
particular queue. Otherwise it seems like a really nice queuing system.
The other potential concern is that all message metadata has to
fit in the main memory of the host that owns the message (the message
payload can be persisted, I think).
So while RabbitMQ and ActiveMQ are obviously better matches for
queuing (with very powerful semantics, optional transactionality,
etc.), Cassandra seems to have better distribution and
fault-tolerance properties. This could be useful for some scenarios.
In fact I wonder if "traditional" MQs could be considered quite a bit
like RDBMSs with regard to distribution, horizontal scaling (or lack
thereof) by adding nodes, and the cost of ACID features
(high expressive power vs simple scalability).

I am actually also interested in similar aspects; using queue name and
sequence identifier for implementing queue-like constructs, and was
happy to see this question. But in my case, I would want to eventually
also delete messages, so I would not have to rely as much on
monotonically increasing ids aspect. This would allow
many-senders-single-receiver use case, with little or no external
synchronization.

-+ Tatu +-