Re: implementing a 'sorted set' on top of cassandra

2017-01-13 Thread Benjamin Roth
Whether your proposed solution is crazy depends on your needs :) It sounds like you can live with non-realtime data, so it is OK to cache it. Why precompute the results if you only need 5% of them? Why not use Redis as a cache with expiring sorted sets that are filled on demand from Cassandra partitions
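
A minimal sketch of the fill-on-demand idea above, using plain Python stand-ins rather than real Redis or Cassandra clients. The class `SortedSetCache`, the `loader` callback (standing in for a read of one Cassandra partition), and the TTL value are all hypothetical names chosen for illustration; nothing here comes from the thread itself.

```python
import time
from typing import Callable, Dict, List, Tuple

Scores = Dict[str, int]


class SortedSetCache:
    """In-memory stand-in for a cache of expiring sorted sets (hypothetical)."""

    def __init__(self, loader: Callable[[str], Scores], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader      # assumption: reads one partition's counters
        self._ttl = ttl_seconds    # how long a ranked set stays valid
        self._clock = clock        # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, List[Tuple[str, int]]]] = {}
        self.loads = 0             # number of times the loader was called

    def top(self, set_id: str, n: int) -> List[Tuple[str, int]]:
        """Return the n highest-scored members, loading only on a miss."""
        now = self._clock()
        entry = self._entries.get(set_id)
        if entry is None or entry[0] <= now:
            # Miss or expired: read the raw scores once and sort them here,
            # highest score first, ties broken by member name.
            scores = self._loader(set_id)
            self.loads += 1
            ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
            entry = (now + self._ttl, ranked)
            self._entries[set_id] = entry
        return entry[1][:n]
```

With this shape, sets that are never read are never sorted, and a read that arrives within the TTL sees possibly stale scores, which matches the "not-realtime" trade-off mentioned above.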

Re: implementing a 'sorted set' on top of cassandra

2017-01-13 Thread Benjamin Roth
Not if you want to sort by score (a counter). On 14.01.2017 08:33, "DuyHai Doan" wrote: > Clustering column can be seen as sorted set > > Table abstraction == Map> > > > On Sat, Jan 14, 2017 at 2:28 AM, Edward Capriolo > wrote: > >> >> >> On Fri, Jan 13, 2017 at 8:14 PM, Jonathan Haddad >> wro

Re: implementing a 'sorted set' on top of cassandra

2017-01-13 Thread DuyHai Doan
Clustering column can be seen as sorted set Table abstraction == Map> On Sat, Jan 14, 2017 at 2:28 AM, Edward Capriolo wrote: > > > On Fri, Jan 13, 2017 at 8:14 PM, Jonathan Haddad > wrote: > >> I've thought about this for years and have never arrived on a >> particularly great implementation

Re: implementing a 'sorted set' on top of cassandra

2017-01-13 Thread Edward Capriolo
On Fri, Jan 13, 2017 at 8:14 PM, Jonathan Haddad wrote: > I've thought about this for years and have never arrived on a particularly > great implementation. Your idea will be maybe OK if the sets are very > small and if the values don't change very often. But in a system where the > values of t

Re: implementing a 'sorted set' on top of cassandra

2017-01-13 Thread Jonathan Haddad
I've thought about this for years and have never arrived on a particularly great implementation. Your idea will be maybe OK if the sets are very small and if the values don't change very often. But in a system where the values of the keys in the set change frequently (lots of tombstones) or the s

Re: implementing a 'sorted set' on top of cassandra

2017-01-13 Thread Edward Capriolo
On Fri, Jan 13, 2017 at 5:14 PM, Mike Torra wrote: > We currently use redis to store sorted sets that we increment many, many > times more than we read. For example, only about 5% of these sets are ever > read. We are getting to the point where redis is becoming difficult to > scale (currently at

implementing a 'sorted set' on top of cassandra

2017-01-13 Thread Mike Torra
We currently use redis to store sorted sets that we increment many, many times more than we read. For example, only about 5% of these sets are ever read. We are getting to the point where redis is becoming difficult to scale (currently at >20 nodes). We've started using cassandra for other thin

Re: incremental repairs with -pr flag?

2017-01-13 Thread Bruno Lavoie
Another point: I've done another test on my 5-node cluster. I created a keyspace with a replication factor of 5 and inserted some data in it. I ran a full repair on each node to make sstables appear on disk, then ran multiple times on each node: 1 - nodetool repair 2 - nodetool repair -pr Due to incr

Re: incremental repairs with -pr flag?

2017-01-13 Thread Bruno Lavoie
Thanks for your reply, but I can't figure out why DataStax does not recommend running primary-range with incremental repair... It just does less work on each repair call on the repaired node. In the end, when all the nodes are repaired using either method, isn't all data equally consistent?

Re: Metric to monitor partition size

2017-01-13 Thread Bryan Cheng
We're on 2.X so this information may not apply to your version, but you should see: 1) A log statement upon compaction, like "Writing large partition", including the primary partition key (see https://issues.apache.org/jira/browse/CASSANDRA-9643). Configurable threshold in cassandra.yaml 2) Probl

Re: Backups eating up disk space

2017-01-13 Thread Kunal Gangakhedkar
Great, thanks a lot to all for the help :) I finally took the dive and went with Razi's suggestions. In summary, this is what I did: - turn off incremental backups on each of the nodes in rolling fashion - remove the 'backups' directory from each keyspace on each node. This ended up freein

Re: Check snapshot / sstable integrity

2017-01-13 Thread Jérôme Mainaud
Hello Alain, Thank you for your answer. Basically having a tool to check all sstables in a folder using the checksum would be nice. But finally I can have the same result using some shasum tool. The goal is to verify integrity of files copied back from an external backup tool. The question came
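
A rough sketch of the "shasum" approach mentioned above: record a digest for every file in a folder before the backup, then compare after the files are copied back. The function names `sha256_of`, `build_manifest` and `verify` are hypothetical, and this only compares whole-file SHA-256 digests; it does not read or validate any checksum files that Cassandra itself writes next to sstables.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large sstables need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> Dict[str, str]:
    """Map each file's path (relative to folder) to its SHA-256 digest."""
    return {
        str(p.relative_to(folder)): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def verify(folder: Path, manifest: Dict[str, str]) -> List[str]:
    """Return the relative paths that are missing or whose digest differs."""
    bad = []
    for rel, expected in manifest.items():
        candidate = folder / rel
        if not candidate.is_file() or sha256_of(candidate) != expected:
            bad.append(rel)
    return sorted(bad)
```

The manifest would be built on the source node before the copy and stored alongside the backup; an empty list from `verify` on the restored folder means every recorded file came back byte-for-byte identical.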