unsubscribe

2010-12-10 Thread Massimo Carro
unsubscribe

Massimo Carro

www.liquida.it - www.liquida.com


Re: Quorum: killing 1 out of 3 server kills the cluster (?)

2010-12-10 Thread Timo Nentwig

On Dec 9, 2010, at 18:50, Tyler Hobbs wrote:

> If you switch your writes to CL ONE when a failure occurs, you might as well 
> use ONE for all writes.  ONE and QUORUM behave the same when all nodes are 
> working correctly.

That's finally a precise statement! :) I was wondering what "to at least 1 
replica's commit log" is supposed to actually mean: 
http://wiki.apache.org/cassandra/API

Does quorum mean that data is replicated to q nodes or to at least q nodes? I 
just added another blank machine to my cluster. Nothing happened, as expected 
(I had stopped writing to the cluster), but after I ran nodetool repair it held 
more data than all the other nodes. So it copied data from the other nodes to 
this one? I assumed that data is replicated to q nodes, not to all. Is quorum 
'only' about consistency and not about saving storage space?

> - Tyler
> 
> On Thu, Dec 9, 2010 at 11:26 AM, Timo Nentwig  
> wrote:
> 
> On Dec 9, 2010, at 17:55, Sylvain Lebresne wrote:
> 
> >> I naively assume that if I kill either node that holds N1 (i.e. node 1 or 
> >> 3), N1 will still remain on another node. Only if both fail, I actually 
> >> lose data. But apparently this is not how it works...
> >
> > Sure, the data that N1 holds is also on another node and you won't
> > lose it by only losing N1.
> > But when you do a quorum query, you are saying to Cassandra "Please
> > please would you fail my request
> > if you can't get a response from 2 nodes". So if only 1 node holding
> > the data is up at the moment of the
> > query then Cassandra, which is very polite software, does what you
> > asked and fails.
> 
And my application would fall back to ONE. Quorum writes will also fail, so I 
would also use ONE so that the app stays up. What would I have to do to make 
the data redistribute when the broken node is up again? Simply call nodetool 
repair on it?
> 
> > If you want Cassandra to send you an answer with only one node up, use
> > CL=ONE (as said by David).
> >
> >>
> >>> On Thu, Dec 9, 2010 at 6:05 PM, Sylvain Lebresne  
> >>> wrote:
> >>> It's 2 out of the number of replicas, not the number of nodes. At RF=2, 
> >>> you have
> >>> 2 replicas. And since quorum is also 2 with that replication factor,
> >>> you cannot lose
> >>> a node, otherwise some queries will end up as UnavailableException.
> >>>
> >>> Again, this is not related to the total number of nodes. Even with 200
> >>> nodes, if
> >>> you use RF=2, you will have some queries that fail (although much fewer 
> >>> than what
> >>> you are probably seeing).
> >>>
> >>> On Thu, Dec 9, 2010 at 5:00 PM, Timo Nentwig  
> >>> wrote:
> 
>  On Dec 9, 2010, at 16:50, Daniel Lundin wrote:
> 
> > Quorum is really only useful when RF > 2, since for a quorum to
> > succeed RF/2+1 replicas must be available.
> 
>  2/2+1==2 and I killed 1 of 3, so... don't get it.
> 
> > This means for RF = 2, consistency levels QUORUM and ALL yield the same 
> > result.
> >
> > /d
> >
> > On Thu, Dec 9, 2010 at 4:40 PM, Timo Nentwig  
> > wrote:
> >> Hi!
> >>
> >> I've 3 servers running (0.7rc1) with a replication_factor of 2 and use 
> >> quorum for writes. But when I shut down one of them, 
> >> UnavailableExceptions are thrown. Why is that? Isn't it the point of 
> >> quorum and a fault-tolerant DB that it continues with the remaining 2 
> >> nodes and redistributes the data to the broken one as soon as it's up 
> >> again?
> >>
> >> What may I be doing wrong?
> >>
> >> thx
> >> tcn
> 
> 
> >>>
> >>
> >>
> 
> 



Re: [RELEASE] 0.7.0 rc2

2010-12-10 Thread Nate McCall
RPM is available from http://rpm.riptano.com

Artifacts for maven folks are available as well:
http://mvn.riptano.com/content/repositories/riptano/

-Nate

On Fri, Dec 10, 2010 at 11:00 AM, Eric Evans  wrote:
>
> I'd have thought all that turkey and stuffing would have done more
> damage to momentum, but judging by the number of bug-fixes in the last
> couple of weeks, that isn't the case.
>
> As usual, I'd be remiss if I didn't point out that this is not yet a
> stable release.  It's getting pretty close, but we're not ready to stick
> a fork in it yet.  Be sure to test it thoroughly before upgrading
> something important.
>
> Please be sure to read through the changes[1] and release notes[2].
> Report any problems you find[3], and if you have any questions, don't
> hesitate to ask.
>
> Thanks!
>
> [1]: http://goo.gl/ZMQEe (CHANGES.txt)
> [2]: http://goo.gl/R35HH (NEWS.txt)
> [3]: https://issues.apache.org/jira/browse/CASSANDRA
> [4]: http://people.apache.org/~eevans/cassandra_0.7.0~rc2_all.deb
>
> --
> Eric Evans
> eev...@rackspace.com
>
>
>


Re: unsubscribe

2010-12-10 Thread Eric Evans
On Fri, 2010-12-10 at 10:17 +0100, Massimo Carro wrote:
> unsubscribe

http://wiki.apache.org/cassandra/FAQ#unsubscribe

-- 
Eric Evans
eev...@rackspace.com



Estimate Keys - JMX

2010-12-10 Thread Dan Hendry
Recent versions of Cassandra (I noticed it in RC1, which is what I am using)
add a very cool "estimateKeys" JMX operation for each column family. I have
a quick question: Is the value an estimate for the number of keys on a
particular node or for the entire cluster?


I have a 5 node cluster (RF=2) and am getting wildly different estimates
from different nodes. For example, one says ~50 million and another ~75
million. 
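For what it's worth, here is a minimal JMX client sketch for pulling the
estimate from a single node. The ObjectName pattern and the default JMX port
(8080 in the 0.7 era) are assumptions and may differ by version; keyspace and
column family names are placeholders.

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Sketch: invoke the estimateKeys operation on one node's column family MBean.
public class EstimateKeys {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://node1:8080/jmxrmi");
        MBeanServerConnection mbs =
                JMXConnectorFactory.connect(url).getMBeanServerConnection();
        ObjectName cf = new ObjectName("org.apache.cassandra.db:"
                + "type=ColumnFamilies,keyspace=MyKeyspace,columnfamily=MyCF");
        Object estimate = mbs.invoke(cf, "estimateKeys", null, null);
        System.out.println("estimated keys on this node: " + estimate);
    }
}

As I understand it, the estimate is derived from the node's own SSTable index
samples, so it is per-node; with RF=2, per-node numbers also count each key
on both of its replicas relative to the cluster-wide total.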


Dan Hendry

(403) 660-2297




Re: Stuck with adding nodes

2010-12-10 Thread Daniel Doubleday
Thanks for your help Peter.

We gave up and rolled back to our mysql implementation (we did all writes to 
our old store in parallel so we did not lose anything).
Problem was that every solution we came up with would require at least one major 
compaction before the new nodes could join, and our cluster could not survive 
this (in terms of serving requests at reasonable latencies).

But thanks anyway,
Daniel

On Dec 9, 2010, at 8:25 PM, Peter Schuller wrote:

>> Currently I am copying all data files (that's all existing data) from one 
>> node to the new nodes in the hope that I could then manually assign them their 
>> new token range (nodetool move) and do cleanup.
> 
> Unless I'm misunderstanding you I believe you should be setting the
> initial token. nodetool move would be for a node already in the ring.
> And keep in mind that a nodetool move is currently a
> decommission+bootstrap - so if you're teetering on the edge of
> overload you will want to keep that in mind when moving a node to
> avoid ending up in a worse situation as another node temporarily
> receives more load than usual as a result of increased ring ownership.
> 
>> Obviously I will try this tomorrow (it's been a long day) on a test system 
>> but any advice would be highly appreciated.
> 
> One possibility if you have additional hardware to spare temporarily,
> is to add more nodes than you actually need and then, once you are
> significantly over capacity, you have the flexibility to move nodes
> around to an optimum position and then decommission those machines
> that were only "borrowed". I.e., initial bootstrap of nodes takes a
> shorter amount of time because you're giving them less token space per
> new node. And once all are in the ring, you're free to move things
> around and then free up the hardware.
> 
> (Another option may be to implement throttling of the anti-compaction
> so that it runs very slowly during peak hours, but that requires
> patching cassandra or else firewall/packet filtering fu and is
> probably likely to be more risky than it's worth.)
> 
> -- 
> / Peter Schuller



cassandra database viewer

2010-12-10 Thread Liangzhao Zeng
Is there any database viewer in cassandra to browse the content of the
database, like what DB2 or Oracle have?


Thanks,

Liangzhao


Re: Quorum: killing 1 out of 3 server kills the cluster (?)

2010-12-10 Thread Peter Schuller
> If you switch your writes to CL ONE when a failure occurs, you might as well
> use ONE for all writes.  ONE and QUORUM behave the same when all nodes are
> working correctly.

Consistency wise yes, but not durability wise. Writing at QUORUM but
reading at ONE is useful if you want higher durability guarantees but
are fine with inconsistent reads. In other words, if you want to avoid
losing data if a node completely blows up and its data becomes
irrevocably lost forever. (The data that would be lost would be data
written at ONE that only reached that node, and no others.)

-- 
/ Peter Schuller


Memory leak with Sun Java 1.6 ?

2010-12-10 Thread Jedd Rashbrooke
 Howdi,

 We're using Cassandra 0.6.6 - intending to wait until 0.7 before
 we do any more upgrades.

 We're running a cluster of 16 boxes of 7.1GB each, on Amazon EC2
 using Ubuntu 10.04 (LTS).

 Today we saw one box kick its little feet up, and after investigating
 the other machines, it looks like they're all approaching the same fate.

 Over the past month or so, it looks like memory has slowly
 been exhausted.  Neither nodetool drain nor jmap can run; both
 produce this error:

 Error occurred during initialization of VM
 Could not reserve enough space for object heap

 We've got Xmx/Xms set to 4GB.

 top shows free memory around 50-80MB, file cache under
 10MB, and the java process at 12+GB virt and 7.1GB res.

 This feels like a Java problem, not a Cassandra one, but I'm
 open to suggestions.  To ensure I don't get bothered over
 the weekend we're doing a rolling restart of Cassandra on
 each of the boxes now.  The last time they were restarted
 was just over a month ago.  Now I'm wondering whether I
 should (until 0.7.1 is available) schedule in a slower rolling
 restart over several days, every few weeks.

 I've shared a Zabbix graph of system memory at:

 http://www.imagebam.com/image/3b4213110283969

 cheers,
 Jedd.


Re: Quorum: killing 1 out of 3 server kills the cluster (?)

2010-12-10 Thread Peter Schuller
> That's finally a precise statement! :) I was wondering what "to at least 1 
> replica's commit log" is supposed to actually mean: 
> http://wiki.apache.org/cassandra/API

The main idea is that it has been "officially delivered" to one
replica. If Cassandra only did batch-wise commits such that a write
was never ACKed until it was durable, it would mean that the write had
been durably recorded on at least 1 replica.

I suspect the phrasing is there to get around the fact that the write
is not actually durable if nodes are configured to use periodic sync
mode.

> Does quorum mean that data is replicated to q nodes or to at least q nodes?

That it is replicated to at least a quorum of nodes before the write
is considered successful. This does not prevent further propagation to
all nodes; data always gets replicated according to the replication
factor. Consistency levels only affect the consistency requirements of
the particular request.
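To make the arithmetic concrete, a tiny sketch of the quorum formula that
keeps coming up in this thread (plain Java, nothing Cassandra-specific):

// quorum = RF/2 + 1 (integer division); it depends only on the
// replication factor, never on the total number of nodes.
public class QuorumMath {
    static int quorum(int rf) { return rf / 2 + 1; }

    public static void main(String[] args) {
        System.out.println("RF=2 -> quorum " + quorum(2)); // 2: no replica may be down
        System.out.println("RF=3 -> quorum " + quorum(3)); // 2: one replica may be down
    }
}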

>  I just added another blank machine to my cluster. Nothing happened, as 
> expected (I had stopped writing to the cluster), but after I ran nodetool 
> repair it held more data than all other nodes. So it copied data from the 
> other nodes to this one? I assumed that data is replicated to q nodes, not 
> to all. Is quorum 'only' about consistency and not about saving storage space?

The new node should have gotten its appropriate amount according to
the ring responsibility (i.e., tokens). I'm not sure why a new node
would get more than its fair share (according to tokens) of data
though.

There is one extreme case, which would be if the cluster has seen lots
of writes in degraded states, so that there is a lot of data around the
cluster that has not yet reached its full replica set. A repair on
a new node might make the new node the only one that has all the
data it should have... but you'd have to have written data at a low
consistency level during pretty shaky periods for this to have a
significant effect (especially if hinted handoff is turned on).

-- 
/ Peter Schuller


hazelcast

2010-12-10 Thread B. Todd Burruss

http://www.hazelcast.com/product.jsp

has anyone tested hazelcast as a distributed locking mechanism for java 
clients?  seems very attractive on the surface.
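For reference, a minimal sketch of what the locking usage looks like,
assuming a Hazelcast 3.x-era API where getLock returns a cluster-backed
java.util.concurrent.locks.Lock; the lock name is a placeholder.

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import java.util.concurrent.locks.Lock;

public class DistributedLockDemo {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        Lock lock = hz.getLock("my-resource"); // same name = same lock cluster-wide
        lock.lock(); // blocks until the cluster-wide lock is acquired
        try {
            // critical section: one holder at a time across the whole cluster
        } finally {
            lock.unlock();
        }
        hz.shutdown();
    }
}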


Re: Memory leak with Sun Java 1.6 ?

2010-12-10 Thread Peter Schuller
>  Over the past month or so, it looks like memory has slowly
>  been exhausted.  Both nodetool drain and jmap can't run, and
>  produce this error:
>
>     Error occurred during initialization of VM
>     Could not reserve enough space for object heap
>
>  We've got Xmx/Xms set to 4GB.
>
>  top shows free memory around 50-80MB, file cache under
>  10MB, and the java process at 12+GB virt and 7.1GB res.
>
>  This feels like a Java problem, not a Cassandra one, but I'm
>  open to suggestions.  To ensure I don't get bothered over
>  the weekend we're doing a rolling restart of Cassandra on
>  each of the boxes now.  The last time they were restarted
>  was just over a month ago.  Now I'm wondering whether I
>  should (until 0.7.1 is available) schedule in a slower rolling
>  restart over several days, every few weeks.

Memory-mapped files will count toward the virtual size and, to the extent
that they are resident in memory, the resident size of the process.
However, your graph:

>  I've shared a Zabbix graph of system memory at:
>
>     http://www.imagebam.com/image/3b4213110283969

certainly indicates that it is not the explanation, since you should
otherwise be seeing 'cached' occupy the remainder of memory above the heap
size. In addition, the allocation failures from jmap indicate memory is
truly short.

Just to confirm, what does the free +/- buffers show if you run
'free'? (I.e., middle line, under 'free' column)

A Java memory leak would likely indicate non-heap managed memory
(since I think it's unlikely that the JVM fails to limit the actual
heap size). The question is what that memory is being used for.

To cargo cult it: Are you running a modern JVM? (Not e.g. openjdk b17
in lenny or some such.) If it is a JVM issue, ensuring you're using a
reasonably recent JVM is probably much easier than trying to track
it down...

-- 
/ Peter Schuller


Re: hazelcast

2010-12-10 Thread Germán Kondolf
Hi, I'm using it as a complement to cassandra, to avoid "duplicate"
searches and duplicate content at a given moment in time.
It works really nicely so far, no critical issues, at least with the
functionality I'm using from it.

-- 
//GK
german.kond...@gmail.com
// sites
http://twitter.com/germanklf
http://ar.linkedin.com/in/germankondolf

On Fri, Dec 10, 2010 at 2:50 PM, B. Todd Burruss  wrote:
> http://www.hazelcast.com/product.jsp
>
> has anyone tested hazelcast as a distributed locking mechanism for java
> clients?  seems very attractive on the surface.
>


Re: hazelcast

2010-12-10 Thread Norman Maurer
Hi there,

I'm not using it atm but plan to in my next project. It really looks nice :)

Bye,
Norman

2010/12/10 Germán Kondolf :
> Hi, I'm using it as a complement of cassandra, to avoid "duplicate"
> searches and duplicate content in a given moment in time.
> It works really nice by now, no critical issues, at least the
> functionallity I'm using from it.
>
> --
> //GK
> german.kond...@gmail.com
> // sites
> http://twitter.com/germanklf
> http://ar.linkedin.com/in/germankondolf
>
> On Fri, Dec 10, 2010 at 2:50 PM, B. Todd Burruss  wrote:
>> http://www.hazelcast.com/product.jsp
>>
>> has anyone tested hazelcast as a distributed locking mechanism for java
>> clients?  seems very attractive on the surface.
>>
>


Re: hazelcast

2010-12-10 Thread B. Todd Burruss
thx for the feedback.  regarding locking, has anyone done a comparison 
to zookeeper?  does zookeeper provide functionality over hazelcast?


On 12/10/2010 11:08 AM, Norman Maurer wrote:

Hi there,

I'm not using it atm but plan to in my next project. It really looks nice :)

Bye,
Norman

2010/12/10 Germán Kondolf:

Hi, I'm using it as a complement of cassandra, to avoid "duplicate"
searches and duplicate content in a given moment in time.
It works really nice by now, no critical issues, at least the
functionallity I'm using from it.

--
//GK
german.kond...@gmail.com
// sites
http://twitter.com/germanklf
http://ar.linkedin.com/in/germankondolf

On Fri, Dec 10, 2010 at 2:50 PM, B. Todd Burruss  wrote:

http://www.hazelcast.com/product.jsp

has anyone tested hazelcast as a distributed locking mechanism for java
clients?  seems very attractive on the surface.



Re: hazelcast

2010-12-10 Thread Germán Kondolf
I don't know much about Zookeeper, but as far as I've read, it runs
outside the JVM process.
Hazelcast is just a framework: you can programmatically start and
shut down the cluster, and it takes just an XML file to configure it.

Hazelcast also provides good caching features to integrate with
Hibernate, distributed executors, clusterized queues, distributed
events, and so on. I don't know if that is supported by Zookeeper; I
think not, because it is not Zookeeper's main goal.

On Fri, Dec 10, 2010 at 4:49 PM, B. Todd Burruss  wrote:
> thx for the feedback.  regarding locking, has anyone done a comparison to
> zookeeper?  does zookeeper provide functionality over hazelcast?
>
> On 12/10/2010 11:08 AM, Norman Maurer wrote:
>>
>> Hi there,
>>
>> I'm not using it atm but plan to in my next project. It really looks nice
>> :)
>>
>> Bye,
>> Norman
>>
>> 2010/12/10 Germán Kondolf:
>>>
>>> Hi, I'm using it as a complement of cassandra, to avoid "duplicate"
>>> searches and duplicate content in a given moment in time.
>>> It works really nice by now, no critical issues, at least the
>>> functionallity I'm using from it.
>>>
>>> --
>>> //GK
>>> german.kond...@gmail.com
>>> // sites
>>> http://twitter.com/germanklf
>>> http://ar.linkedin.com/in/germankondolf
>>>
>>> On Fri, Dec 10, 2010 at 2:50 PM, B. Todd Burruss
>>>  wrote:

 http://www.hazelcast.com/product.jsp

 has anyone tested hazelcast as a distributed locking mechanism for java
 clients?  seems very attractive on the surface.

>



-- 
//GK
german.kond...@gmail.com
// sites
http://twitter.com/germanklf
http://www.facebook.com/germanklf
http://ar.linkedin.com/in/germankondolf


Re: hazelcast

2010-12-10 Thread Kani
As far as I know they are adding a clusterized semaphore in the next version.


On Fri, Dec 10, 2010 at 5:15 PM, Germán Kondolf wrote:

> I don't know much about Zookeeper, but as far as I read, it is out of
> JVM process.
> Hazelcast is just a framework and you can programmatically start and
> shutdown the cluster, it's just an xml to configure it.
>
> Hazelcast also provides good caching features to integrate with
> Hibernate, distributed executors, clusterized queues, distributed
> events, and so on. I don't know if that is supported by Zookeeper, I
> think not, because is not the main goal of it.
>
> On Fri, Dec 10, 2010 at 4:49 PM, B. Todd Burruss 
> wrote:
> > thx for the feedback.  regarding locking, has anyone done a comparison to
> > zookeeper?  does zookeeper provide functionality over hazelcast?
> >
> > On 12/10/2010 11:08 AM, Norman Maurer wrote:
> >>
> >> Hi there,
> >>
> >> I'm not using it atm but plan to in my next project. It really looks
> nice
> >> :)
> >>
> >> Bye,
> >> Norman
> >>
> >> 2010/12/10 Germán Kondolf:
> >>>
> >>> Hi, I'm using it as a complement of cassandra, to avoid "duplicate"
> >>> searches and duplicate content in a given moment in time.
> >>> It works really nice by now, no critical issues, at least the
> >>> functionallity I'm using from it.
> >>>
> >>> --
> >>> //GK
> >>> german.kond...@gmail.com
> >>> // sites
> >>> http://twitter.com/germanklf
> >>> http://ar.linkedin.com/in/germankondolf
> >>>
> >>> On Fri, Dec 10, 2010 at 2:50 PM, B. Todd Burruss
> >>>  wrote:
> 
>  http://www.hazelcast.com/product.jsp
> 
>  has anyone tested hazelcast as a distributed locking mechanism for
> java
>  clients?  seems very attractive on the surface.
> 
> >
>
>
>
> --
> //GK
> german.kond...@gmail.com
> // sites
> http://twitter.com/germanklf
> http://www.facebook.com/germanklf
> http://ar.linkedin.com/in/germankondolf
>


Consistency question caused by Read_all and Write_one

2010-12-10 Thread Alvin UW
Hello,


I have a consistency question about Cassandra.

Given a column family with a record:

    Id   Name
    1    David

There are three replicas of this column family.

Assume two write operations are issued by the same application,
in this order: write_one("1", "Dan"); write_one("1", "Ken").
What will Read_all("1") get?

Now assume the above two write operations happen at exactly the same time in
two applications.
Again, what will Read_all("1") get?

Thanks.

Alvin


Re: Consistency question caused by Read_all and Write_one

2010-12-10 Thread Peter Schuller
> Assume two write operations are issued by the same application,
> in this order: write_one("1", "Dan"); write_one("1", "Ken").
> What will Read_all("1") get?

Assuming read_all means reading at consistency level ALL, it sees the
latest value ("Ken").

> Assume the above two write operations happen at exactly the same time in two
> applications.
> Again, what will Read_all("1") get?

If they happen at exactly the same time, in the sense that the
timestamps (microseconds) end up being exactly equal, IIRC the
tie-breaker is the value (I think the greater value wins). But note
that this is likely not a problem; if it is, you probably already have
an issue with races anyway. Unless you have some very very strict
synchronization going on between your clients, it will be essentially
equivalent to an outside observer whether two clients wrote two values
with timestamps within almost the same microsecond, or wrote two values
with timestamps that are exactly equal. In both cases you're
effectively racing and you'll see afterwards who won. If you require
otherwise, you probably require synchronization, regardless of what
Cassandra does in terms of tie-breaking with identical timestamps. (If
you have such strict ordering requirements, then by what means do you
allocate the timestamps in a guaranteed-in-order way? You're probably
not.)
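A toy sketch of the reconciliation rule as recalled above; note that the
value tie-break (greater value wins) is this thread's recollection, not
verified against Cassandra's source:

// Higher timestamp wins; on an exact timestamp tie, the lexically
// greater value is assumed to win.
public class Reconcile {
    static byte[] resolve(long ts1, byte[] v1, long ts2, byte[] v2) {
        if (ts1 != ts2) return ts1 > ts2 ? v1 : v2;
        return compare(v1, v2) >= 0 ? v1 : v2; // timestamp tie: compare values
    }

    static int compare(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] winner = resolve(1L, "Dan".getBytes(), 1L, "Ken".getBytes());
        System.out.println(new String(winner)); // "Ken" under the assumed tie-break
    }
}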

-- 
/ Peter Schuller


Re: Consistency question caused by Read_all and Write_one

2010-12-10 Thread Ryan King
On Fri, Dec 10, 2010 at 12:49 PM, Alvin UW  wrote:
> Hello,
>
>
> I got a consistency problem in Cassandra.
>
> Given a column family with a record:
>
>     Id   Name
>     1    David
>
> There are three backups for this column family.
>
> Assume two write operations are issued by the same application,
> in this order: write_one("1", "Dan"); write_one("1", "Ken").
> What will Read_all("1") get?
>
> Assume the above two write operations happen at exactly the same time in two
> applications.
> Again, what will Read_all("1") get?

By "exactly the same" do you mean "with the same timestamp"?

-ryan


Cassandra for Ad-hoc Aggregation and formula calculation

2010-12-10 Thread Arun Cherian
Hi,

I have been reading up on Cassandra for the past few weeks and I am
highly impressed by the features it offers. At work, we are starting
work on a product that will handle several million CDRs (Call Data
Records, which basically can be thought of as a .CSV file) per day. We
will have to store the data, and perform aggregations and calculations
on them. A few veteran RDBMS admin friends (we are a small .NET shop, we
don't have any in-house DB talent) recommended Infobright and NoSQL to
us, hence my search. I was wondering if Cassandra is a good fit
for

1. Storing several million data records per day (each record will be a
few KB in size) without any data loss.
2. Aggregation of certain fields in the stored records, like averages
across a time period.
3. Using certain existing fields to calculate new values on the fly
and store it too.
4. We were wondering if pre-aggregation was a good choice (calculating
aggregates per 1 min, 5 min, 15 min etc. ahead of time), but in case we
need ad-hoc aggregation, does Cassandra support that over this amount
of data?

Thanks,
Arun


Re: cassandra database viewer

2010-12-10 Thread Aaron Morton
This is the only thing I can think of 
https://github.com/driftx/chiton

Have not used it myself.

Aaron
On 11/12/2010, at 5:33 AM, Liangzhao Zeng  wrote:

> Is there any database viewer in cassandra to browse the content of the 
> database, like what DB2 or Oracle have?
> 
> 
> Thanks,
> 
> Liangzhao


Re: cassandra database viewer

2010-12-10 Thread Shashank Tiwari
what about https://github.com/suguru/cassandra-webconsole? any good?


On Fri, Dec 10, 2010 at 2:00 PM, Aaron Morton wrote:

> This is the only thing I can think of
> https://github.com/driftx/chiton
>
> Have not used it myself.
>
> Aaron
> On 11/12/2010, at 5:33 AM, Liangzhao Zeng 
> wrote:
>
> > Is there any database viewer in cassandra to browse the content of the
> database, like what DB2 or Oracle have?
> >
> >
> > Thanks,
> >
> > Liangzhao
>


How do you implement pagination?

2010-12-10 Thread Joshua Partogi
Hi all,

I am interested to see how people do record pagination with cassandra,
because I cannot find anything like MySQL's LIMIT in cassandra.

From what I understand, you need to tell cassandra the Record ID for the
beginning of the slice and the number of records you want to get after that
Record. I am using UUID instead of Long for the Record ID.

My question is, how does your application get the next Record ID after the
current slice that is displayed on the page?
Let's say I want to display records 1-10: do I actually grab 11 records but
only display 10 records, and only keep the ID of the 11th record so I can
use it for pagination?

Sorry if the question is a bit obscure, but I am still figuring out how to
do pagination.

Thanks very much for your assistance.

Kind regards,
Joshua.

-- 
http://twitter.com/jpartogi 


Re: How do you implement pagination?

2010-12-10 Thread Tyler Hobbs
Yes, what you described is the correct way to do it.  Your next slice will
start with that 11th column.
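A minimal sketch of that pattern; fetchSlice is a hypothetical stand-in for
your client's get_slice call (Hector, Pelops, raw Thrift), not a real API:

import java.util.Collections;
import java.util.List;

public class Pagination {
    static final int PAGE_SIZE = 10;

    // Hypothetical: returns up to 'count' column names starting at 'start' (inclusive).
    static List<String> fetchSlice(String start, int count) {
        return Collections.emptyList(); // your client call goes here
    }

    static void printPage(String start) {
        List<String> cols = fetchSlice(start, PAGE_SIZE + 1); // fetch one extra
        int shown = Math.min(PAGE_SIZE, cols.size());
        for (int i = 0; i < shown; i++) {
            System.out.println(cols.get(i));
        }
        if (cols.size() > PAGE_SIZE) {
            // the 11th column is not displayed; it is the start of the next page
            System.out.println("next page starts at: " + cols.get(PAGE_SIZE));
        }
    }
}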

- Tyler

On Fri, Dec 10, 2010 at 7:01 PM, Joshua Partogi wrote:

> Hi all,
>
> I am interested to see people's way to do record pagination with cassandra
> because I can not find anything like MySQL LIMIT in cassandra.
>
> From what I understand you need to tell cassandra the Record ID for the
> beginning of the slice and the number of record you want to get after that
> Record. I am using UUID instead of Long for the Record ID.
>
> My question is, how does your application get the next Record ID after the
> current slice that is displayed on the page?
> Let's say I want to display record 1-10, do I actually grab 11 records but
> only display 10 records and only keep the ID of the 11th records so I can
> use it for pagination?
>
> Sorry if the question is a bit obscured, but I am still figuring out how to
> do pagination.
>
> Thanks very much for your assistance.
>
> Kind regards,
> Joshua.
>
> --
> http://twitter.com/jpartogi 
>


Re: How do you implement pagination?

2010-12-10 Thread Joshua Partogi
So you're actually getting n+1 records? Correct? So this is the right way to
do it?


On Sat, Dec 11, 2010 at 1:02 PM, Tyler Hobbs  wrote:

> Yes, what you described is the correct way to do it.  Your next slice will
> start with that 11th column.
>
> - Tyler
>
>
> On Fri, Dec 10, 2010 at 7:01 PM, Joshua Partogi wrote:
>
>> Hi all,
>>
>> I am interested to see people's way to do record pagination with cassandra
>> because I can not find anything like MySQL LIMIT in cassandra.
>>
>> From what I understand you need to tell cassandra the Record ID for the
>> beginning of the slice and the number of record you want to get after that
>> Record. I am using UUID instead of Long for the Record ID.
>>
>> My question is, how does your application get the next Record ID after the
>> current slice that is displayed on the page?
>> Let's say I want to display record 1-10, do I actually grab 11 records but
>> only display 10 records and only keep the ID of the 11th records so I can
>> use it for pagination?
>>
>> Sorry if the question is a bit obscured, but I am still figuring out how
>> to do pagination.
>>
>> Thanks very much for your assistance.
>>
>> Kind regards,
>> Joshua.
>>
>> --
>> http://twitter.com/jpartogi 
>>
>
>


-- 
http://twitter.com/jpartogi


Re: hazelcast

2010-12-10 Thread Mubarak Seyed
How about KeptCollections (backed by ZooKeeper)?

https://github.com/anthonyu/KeptCollections

Thanks,
Mubarak

On Fri, Dec 10, 2010 at 12:15 PM, Germán Kondolf
wrote:

> I don't know much about Zookeeper, but as far as I read, it is out of
> JVM process.
> Hazelcast is just a framework and you can programmatically start and
> shutdown the cluster, it's just an xml to configure it.
>
> Hazelcast also provides good caching features to integrate with
> Hibernate, distributed executors, clusterized queues, distributed
> events, and so on. I don't know if that is supported by Zookeeper, I
> think not, because is not the main goal of it.
>
> On Fri, Dec 10, 2010 at 4:49 PM, B. Todd Burruss 
> wrote:
> > thx for the feedback.  regarding locking, has anyone done a comparison to
> > zookeeper?  does zookeeper provide functionality over hazelcast?
> >
> > On 12/10/2010 11:08 AM, Norman Maurer wrote:
> >>
> >> Hi there,
> >>
> >> I'm not using it atm but plan to in my next project. It really looks
> nice
> >> :)
> >>
> >> Bye,
> >> Norman
> >>
> >> 2010/12/10 Germán Kondolf:
> >>>
> >>> Hi, I'm using it as a complement of cassandra, to avoid "duplicate"
> >>> searches and duplicate content in a given moment in time.
> >>> It works really nice by now, no critical issues, at least the
> >>> functionallity I'm using from it.
> >>>
> >>> --
> >>> //GK
> >>> german.kond...@gmail.com
> >>> // sites
> >>> http://twitter.com/germanklf
> >>> http://ar.linkedin.com/in/germankondolf
> >>>
> >>> On Fri, Dec 10, 2010 at 2:50 PM, B. Todd Burruss
> >>>  wrote:
> 
>  http://www.hazelcast.com/product.jsp
> 
>  has anyone tested hazelcast as a distributed locking mechanism for
> java
>  clients?  seems very attractive on the surface.
> 
> >
>
>
>
> --
> //GK
> german.kond...@gmail.com
> // sites
> http://twitter.com/germanklf
> http://www.facebook.com/germanklf
> http://ar.linkedin.com/in/germankondolf
>



-- 
Thanks,
Mubarak Seyed.


RE: How do you implement pagination?

2010-12-10 Thread Dan Hendry
Or you can just start the next slice at the (n+1)th id, given that ids must be
unique (you don't have to specify an existing id as the start of a slice). You
don't HAVE to load the (n+1)th record.


This (slightly) more optimal approach has the disadvantage that you don't
know with certainty when you have reached the end of all records. This may
or may not be acceptable for your application.
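A sketch of this variant for contrast; fetchSlice and successor are
hypothetical helpers (successor builds the smallest id strictly greater than
the last one shown, since slice starts are inclusive):

import java.util.Collections;
import java.util.List;

public class PaginationVariant {
    static final int PAGE_SIZE = 10;

    static List<String> fetchSlice(String start, int count) {
        return Collections.emptyList(); // your client call goes here
    }

    // Illustrative only: append the lowest byte to step just past 'id'.
    static String successor(String id) { return id + "\0"; }

    static List<String> nextPage(String lastSeenId) {
        List<String> page = fetchSlice(successor(lastSeenId), PAGE_SIZE);
        // A short page (probably) means no more records; an exactly full
        // final page is indistinguishable from a non-final one, as noted above.
        return page;
    }
}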


Dan


From: joshua.j...@gmail.com [mailto:joshua.j...@gmail.com] On Behalf Of
Joshua Partogi
Sent: December-10-10 21:05
To: user@cassandra.apache.org
Subject: Re: How do you implement pagination?


So you're actually getting n+1 record? Correct? So this is the right way to
do it?



On Sat, Dec 11, 2010 at 1:02 PM, Tyler Hobbs  wrote:

Yes, what you described is the correct way to do it.  Your next slice will
start with that 11th column.

- Tyler


On Fri, Dec 10, 2010 at 7:01 PM, Joshua Partogi 
wrote:

Hi all,

I am interested to see people's way to do record pagination with cassandra
because I can not find anything like MySQL LIMIT in cassandra. 

From what I understand you need to tell cassandra the Record ID for the
beginning of the slice and the number of record you want to get after that
Record. I am using UUID instead of Long for the Record ID. 

My question is, how does your application get the next Record ID after the
current slice that is displayed on the page? 
Let's say I want to display record 1-10, do I actually grab 11 records but
only display 10 records and only keep the ID of the 11th records so I can
use it for pagination?

Sorry if the question is a bit obscured, but I am still figuring out how to
do pagination. 

Thanks very much for your assistance.

Kind regards,
Joshua.

-- 
http://twitter.com/jpartogi  





-- 
http://twitter.com/jpartogi




Re: Running multiple instances on a single server --micrandra ??

2010-12-10 Thread Edward Capriolo
On Thu, Dec 9, 2010 at 10:40 PM, Bill de hÓra  wrote:
>
>
> On Tue, 2010-12-07 at 21:25 -0500, Edward Capriolo wrote:
>
> The idea behind "micrandra" is for a 6 disk system run 6 instances of
> Cassandra, one per disk. Use the RackAwareSnitch to make sure no
> replicas live on the same node.
>
> The downsides
> 1) we would have to manage 6x the instances of cassandra
> 2) we would have some overhead for each JVM.
>
> The upsides ?
> 1) Since disk/instance failure only degrades the overall performance
> 1/6th (RAID0 you lost the entire node) (RAID5 still takes a hit when
> down a disk)
> 2) Moves and joins have less work to do
> 3) Can scale up a single node by adding a single disk to an existing
> system (assuming the ram and cpu is light)
> 4) OPP would be "easier" to balance out hot spots (maybe not on this
> one in not an OPP)
>
> What does everyone think? Does it ever make sense to run this way?
>
> It might for read heavy loads.
>
> When I looked at this, it was pointed out to me it's simpler to run fewer
> bigger coarser nodes and take the entire node/server out when something goes
> wrong. Basically give each Cassandra a server.
>
> I wonder if it would be better to rethink compaction if that's what's
> driving the idea. It seems to be what is biting everyone, along with GC.
>
> Bill

Having 6 IPs on a machine would be a given in this setup. That is not
an issue for me.

It is not "biting" me. We all know that going from 10 to 20 nodes is
pretty simple. However, organic growth from 10 to 16, then a couple of months
later from 16 to 22, can take some effort with 300-600 GB per node,
since each join and cleanup can take a while. I am wondering if
dividing a single large node into multiple smaller instances would
make this type of growth easier.


Re: Running multiple instances on a single server --micrandra ??

2010-12-10 Thread Edward Capriolo
On Fri, Dec 10, 2010 at 11:39 PM, Edward Capriolo  wrote:
> On Thu, Dec 9, 2010 at 10:40 PM, Bill de hÓra  wrote:
>>
>>
>> On Tue, 2010-12-07 at 21:25 -0500, Edward Capriolo wrote:
>>
>> The idea behind "micrandra" is for a 6 disk system run 6 instances of
>> Cassandra, one per disk. Use the RackAwareSnitch to make sure no
>> replicas live on the same node.
>>
>> The downsides
>> 1) we would have to manage 6x the instances of cassandra
>> 2) we would have some overhead for each JVM.
>>
>> The upsides ?
>> 1) Since disk/instance failure only degrades the overall performance
>> 1/6th (RAID0 you lost the entire node) (RAID5 still takes a hit when
>> down a disk)
>> 2) Moves and joins have less work to do
>> 3) Can scale up a single node by adding a single disk to an existing
>> system (assuming the ram and cpu is light)
>> 4) OPP would be "easier" to balance out hot spots (maybe not on this
>> one in not an OPP)
>>
>> What does everyone think? Does it ever make sense to run this way?
>>
>> It might for read heavy loads.
>>
>> When I looked at this, it was pointed out to me it's simpler to run fewer
>> bigger coarser nodes and take the entire node/server out when something goes
>> wrong. Basically give each Cassandra a server.
>>
>> I wonder if it would be better to rethink compaction if that's what's
>> driving the idea. It seems to be what is biting everyone, along with GC.
>>
>> Bill
>
> Having 6 IPs on a machine would be a given in this setup. That is not
> an issue for me.
>
> It is not "biting" me. We all know that going from 10 to 20 nodes is
> pretty simple. However, organic growth from 10 to 16, then a couple of months
> later from 16 to 22, can take some effort with 300-600 GB per node,
> since each join and cleanup can take a while. I am wondering if
> dividing a single large node into multiple smaller instances would
> make this type of growth easier.
>

To clearly explain the scenario: a 5 node cluster, each node owning 20% of
the ring. They each have 6 disks and ~200 GB of data.
Going to 10 nodes is easy: you can join each new node directly between two
existing nodes.

However, if you are going from say 5 -> 8, this gets dicey. Do you
calculate the ideal ring positions for 10 nodes?
20% | 20% | 10% | 10% | 10% | 10% | 10% | 10%  This results in three
joins and several cleanups. With this choice you save time, but you hope
you do not get to the point where the first two nodes get overloaded.

If you decide to work with the ideal tokens for 8, you have many moves and
joins. Until we have:

https://issues.apache.org/jira/browse/CASSANDRA-1418
https://issues.apache.org/jira/browse/CASSANDRA-1427

Having 6 smaller instances on a node with 6 disks would make it
easier to keep close to balanced without having to double your cluster
size each time you grow, or doing a series of moves to get balanced
again.
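For reference, the ideal RandomPartitioner tokens being chased by those moves
and joins are usually computed as token_i = i * 2^127 / N; a quick sketch:

import java.math.BigInteger;

public class IdealTokens {
    public static void main(String[] args) {
        int n = 8; // target cluster size
        BigInteger range = BigInteger.valueOf(2).pow(127);
        for (int i = 0; i < n; i++) {
            BigInteger token = range.multiply(BigInteger.valueOf(i))
                                    .divide(BigInteger.valueOf(n));
            System.out.println("node " + i + ": " + token);
        }
    }
}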


RE: Cassandra for Ad-hoc Aggregation and formula calculation

2010-12-10 Thread Dan Hendry
Perhaps other, more experienced and reputable contributors to this list can 
comment, but to be frank: Cassandra is probably not for you (at least for now). 
I personally feel Cassandra is one of the stronger NoSQL options out there and 
has the potential to become the de facto standard; but it's not quite there yet 
and does not inherently meet your requirements.

To give you some background, I started experimenting with Cassandra as a 
personal project to avoid having to look through days' worth of server logs (and 
because I thought it was cool). The project ballooned and has become my 
organization's primary metrics and analytics platform, which currently processes 
200 million+ events/records per day. I doubt any traditional database solution 
could have performed as well as Cassandra, but the development and operations 
process has not been without severe growing pains. 

> 1. Storing several million data records per day (each record will be a
> few KB in size) without any data loss.

Absolutely, no problems on this front. A cluster of moderately beefy servers 
will handle this with no complaints. As long as you are careful to avoid 
hotspots in your data distribution, Cassandra truly is damn near linearly 
scalable with hardware. 

> 2. Aggregation of certain fields in the stored records, like Avg
> across time period.

Cassandra cannot do this on its own (by design and for good reason). There have 
been efforts to add support for higher-level data processing languages (such as 
Pig and Hive) but they are not out-of-the-box solutions and, in my experience, 
difficult to get working properly. I ended up writing my own data 
processing/report generation framework that works ridiculously well for my 
particular case. In relation to your requirements, calculating averages across 
fields would probably have to be implemented manually (and executed as a 
periodic, automated task). Although non-trivial, this isn’t quite as bad as you 
might think.

> 3. Using certain existing fields to calculate new values on the fly
> and store it too.

Not quite sure what you are asking here. To go back to the last point: to 
calculate anything new, you are probably going to have to load all the records 
on which that calculation depends into a separate process/server. Generally, I 
would say Cassandra isn’t particularly good at 'on the fly' data aggregation 
tasks (certainly not at all to the extent an SQL database is). To be fair, 
that's also not what it is explicitly designed for or advertised to do well. 

> 4. We were wondering if pre-aggregation was a good choice (calculating
> aggregation per 1 min, 5 min, 15 min etc ahead of time) but in case we
> need ad-hoc aggregation, does Cassandra support that over this amount
> of data?

Cassandra is GREAT for accessing/storing/retrieving/post-processing anything 
that can be pre-computed. If you have been doing any amount of reading, you 
will likely have heard that in SQL you model data, while in Cassandra (and most 
other NoSQL databases) you model your queries (sorry for ripping off whoever 
said this originally). If there is one thing/concept I have learned about 
Cassandra, it is: pre-compute (or asynchronously compute) anything you possibly 
can, and don’t be afraid to write a ridiculous amount to the Cassandra 
database. In terms of ad-hoc aggregation, there is no nice simple scripting 
language for Cassandra data processing (e.g. SQL). That said, you can do most 
things pretty quickly with a bit of code. Consider that loading a few hundred 
to a few thousand records (< 3k) can be pretty quick (< 100 ms, often < 10 ms, 
particularly if they are cached). Our organization basically uses the following 
approach: 'use Cassandra for generating continuous 10-second-accuracy time 
series reports, but MySQL and a production DB replica for any ad-hoc single 
value report the boss wants NOW'.


Based on what you have described, it sounds like you are thinking about your 
problem from a SQL-like point of view: store data once, then 
query/filter/aggregate it in multiple different ways to obtain useful 
information. If possible, try to leverage the power of Cassandra and store it in 
efficient, per-query pre-optimized forms. For example, I can imagine the 
average call duration being an important parameter in a system analyzing call 
data records. Instead of storing all the information about a call in one place, 
store the 'call duration' in a separate column family, each row holding the 
durations for a given hour, one column per call (the column name being a 
TimeUUID and the value a single integer). My metrics system does something 
similar to this and loads batches of 15,000 records (a column slice) in < 200 
ms. By parallelizing across 10 threads loading from different rows, I can 
process the average, standard deviation and a factor roughly meaning 'how close 
to Gaussian' for 1 million records in < 5 seconds. 
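An in-memory sketch of that hour-bucketing idea; the row-key scheme and
helper names are illustrative (in Cassandra the sums would live in a column
family, not a HashMap):

import java.util.HashMap;
import java.util.Map;

public class HourlyAverage {
    // key -> {sum of durations, call count}; stands in for a column family row
    static final Map<String, long[]> buckets = new HashMap<String, long[]>();

    static String rowKey(long epochMillis) {
        return "callDuration:" + (epochMillis / 3600000L); // hour bucket
    }

    static void recordCall(long epochMillis, long durationSeconds) {
        long[] agg = buckets.get(rowKey(epochMillis));
        if (agg == null) {
            agg = new long[2];
            buckets.put(rowKey(epochMillis), agg);
        }
        agg[0] += durationSeconds;
        agg[1] += 1;
    }

    static double averageForHour(long epochMillis) {
        long[] agg = buckets.get(rowKey(epochMillis));
        return agg == null ? 0.0 : (double) agg[0] / agg[1];
    }
}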

To reiterate, Cassandra is not the solution if you are looking for 'Database: I 
command th