Re: Forum and its threads

2011-12-12 Thread Waqar Azeem
Hi, 'threads' are nested in a 'forum', so I decided to create a column family 'thread' with a column named 'parent'. Does this idea fit the Cassandra philosophy? I feel that extracting the data recursively is computationally expensive for a site like reddit.com. Therefore one ide

Tried to restart a node, and now it won't come up

2011-12-12 Thread huyle
Hi, We are running 1.0.3. I tried to restart a node, but it won't come up. Below is the error logged: INFO [SSTableBatchOpen:1] 2011-12-13 04:14:52,095 SSTableReader.java (line 134) Opening /u01/cassandra/data/system/HintsColumnFamily-hb-114 (66 bytes) INFO [main] 2011-12-13 04:14:52,145 Datab

Deleted rows re-appearing on repair in 0.8.6

2011-12-12 Thread Maxim Potekhin
Hello, I know that this problem used to exist in 0.8.1 -- I delete rows, run a repair and these rows are back with a vengeance. I recall I was told that this was fixed in 0.8.6 -- is that the case? I still keep seeing that behavior. Thanks Maxim

Re: Need to reconcile data from 2 drives

2011-12-12 Thread Stephane Legay
That's good to know, thanks. I'm looking through my inbox but can't find the notification you're referring to. Oh well. Looks like we'll survive the day. That wasn't much fun at all. On Mon, Dec 12, 2011 at 4:50 PM, Michael Shuler wrote: > On 12/12/2011 12:53 PM, Stephane Legay wrote: > > On Dec

Re: Need to reconcile data from 2 drives

2011-12-12 Thread Michael Shuler
On 12/12/2011 12:53 PM, Stephane Legay wrote: > On Dec. 9th, for some reason both instances were rebooted (not sure yet > what triggered the reboot). You should have received an emailed reboot schedule - this has an example: http://cloudscaling.com/blog/cloud-computing/aws-rebooting-100s-or-1000s

Re: Node repair : excessive data

2011-12-12 Thread Tyler Hobbs
On Mon, Dec 12, 2011 at 3:47 PM, Brian Fleming wrote: > > However after the repair completed, we had over 2.5 times the original > load. Issuing a 'cleanup' reduced this to about 1.5 times the original > load. We observed an increase in the number of keys via 'cfstats' which is > obviously accou

Re: best practices for simulating transactions in Cassandra

2011-12-12 Thread John Laban
Ok, great. I'll be sure to look into the virtualization-specific NTP guides. Another benefit of using Cassandra over Zookeeper for locking is that you don't have to worry about losing your connection to Zookeeper (and with it your locks) while hammering away at data in Cassandra. If using Cassan

Re: Need to reconcile data from 2 drives

2011-12-12 Thread Jeremiah Jordan
If you don't want downtime, you can take the original data and use the bulk sstable loader to send it back into the cluster. If you don't mind downtime you can take all the files from both data folders and put them together, make sure there aren't any with the same names (rename them if there
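Before combining two data folders as described above, it may help to list which file names exist in both. The following is a minimal sketch of that check only; the function name and both directory arguments are hypothetical, and it does not rename anything. Renaming is left out on purpose, since every component file of one SSTable has to be renamed consistently.

```python
import os


def colliding_names(dir_a, dir_b):
    """Return the sorted file names that appear in both directories.

    dir_a and dir_b are placeholder paths for the two data folders;
    only names directly inside each folder are compared.
    """
    names_a = set(os.listdir(dir_a))
    names_b = set(os.listdir(dir_b))
    return sorted(names_a & names_b)
```

An empty result suggests the two folders can be merged without overwriting files; a non-empty one lists the names that need attention first.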

Re: best practices for simulating transactions in Cassandra

2011-12-12 Thread John Laban
Hey Jake, So I guess my problem is that I've never really relied on NTP before to try to guarantee consistency in my application. Does it tend to work really well in practice? What's the maximum clock skew you can see even when running NTP (especially if you're using more than one DC where you m

Node repair : excessive data

2011-12-12 Thread Brian Fleming
Hi, We simulated a node 'failure' on one of our nodes by deleting the entire Cassandra installation directory & reconfiguring a fresh instance with the same token. When we issued a 'repair' it started streaming data back onto the node as expected. However after the repair completed, we had over

Re: each_quorum in pycassa

2011-12-12 Thread Tyler Hobbs
pycassa.ConsistencyLevel.EACH_QUORUM On Mon, Dec 12, 2011 at 2:51 PM, A J wrote: > What is the syntax for each_quorum in pycassa ? > > Thanks. > -- Tyler Hobbs DataStax
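A minimal sketch of how that constant might be passed to a column family, assuming pycassa 1.x. The function name and its three arguments are hypothetical placeholders, and only the write level is set here because EACH_QUORUM reads may be rejected by Cassandra versions of this era.

```python
def open_each_quorum_cf(keyspace, column_family, servers):
    """Return a pycassa ColumnFamily whose writes use EACH_QUORUM.

    keyspace, column_family and servers (e.g. ['localhost:9160']) are
    placeholders supplied by the caller, not values from the thread.
    """
    # Imported lazily so the sketch can be loaded without pycassa installed.
    import pycassa

    pool = pycassa.ConnectionPool(keyspace, server_list=servers)
    return pycassa.ColumnFamily(
        pool,
        column_family,
        write_consistency_level=pycassa.ConsistencyLevel.EACH_QUORUM,
    )
```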

Re: best practices for simulating transactions in Cassandra

2011-12-12 Thread Dominic Williams
Hi John, On 12 December 2011 19:35, John Laban wrote: > > So I responded to your algorithm in another part of this thread (very > interesting) but this part of the paper caught my attention: > > > When client application code releases a lock, that lock must not > actually be > > released for a pe

each_quorum in pycassa

2011-12-12 Thread A J
What is the syntax for each_quorum in pycassa ? Thanks.

Re: best practices for simulating transactions in Cassandra

2011-12-12 Thread Jake Luciani
> > > Jake: The algorithm you've outlined is pretty similar to how Zookeeper > clients implement locking. The only potential issue that I see with it > implemented in Cassandra is that it uses the timestamps of the inserted > columns to determine the winner of the lock. The column timestamps are

Re: best practices for simulating transactions in Cassandra

2011-12-12 Thread John Laban
Hi Dominic, So I responded to your algorithm in another part of this thread (very interesting) but this part of the paper caught my attention: > When client application code releases a lock, that lock must not actually be > released for a period equal to one millisecond plus twice the maximum pos

Re: best practices for simulating transactions in Cassandra

2011-12-12 Thread John Laban
Dominic/Jake: very interesting. This is getting more into fundamentals on locking/isolation rather than transactions/atomicity, but it is still relevant as I was going to use ZooKeeper for that stuff, but it would certainly be nice to KISS and remove a component from my setup if I can do without it.

Need to reconcile data from 2 drives

2011-12-12 Thread Stephane Legay
Here's the situation. We're running a 2-node cluster on EC2 (v 0.8.6). Each node writes data to an EBS volume mounted on /mnt2. On Dec. 9th, for some reason both instances were rebooted (not sure yet what triggered the reboot). But the EBS volumes were not added to /etc/fstab, and didn't mo

Re: cassandra in production environment

2011-12-12 Thread Ramesh Natarajan
- We are seeing DecoratedKey error during compaction. Looks like the sha1sum of the data file doesn't match the digest file created by Cassandra. I don't have any clue where things are failing. It could be either at OS level, ESXi HBA level, or the DotHill iSCSI raid layer. - We are using Sun JRE 1
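The digest comparison mentioned above can be reproduced by hand. This is a rough sketch, not Cassandra's own check: the function names are hypothetical, and it assumes the digest file starts with the hex SHA-1 string, as `sha1sum` output does.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Compute the hex SHA-1 of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(data_path, digest_path):
    """Compare a data file's SHA-1 with the first field of a digest file.

    Assumption: the first whitespace-separated field of the digest file
    is the expected hex digest. An empty digest file counts as a mismatch.
    """
    with open(digest_path, "r") as handle:
        fields = handle.read().split()
    if not fields:
        return False
    return sha1_of_file(data_path) == fields[0].lower()
```

Running the same comparison on a copy of the file read through a different path (for example, from another host) can help narrow down which layer is altering the bytes.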

RE: cassandra in production environment

2011-12-12 Thread Jason Wellonen
RHEL 6.1 and 6.2 with KVM. No file corruptions that I am aware of. Jason -Original Message- From: Ramesh Natarajan [mailto:rames...@gmail.com] Sent: Sunday, December 11, 2011 5:05 PM To: user@cassandra.apache.org Subject: cassandra in production environment Hi, We are currently test

Re: cassandra in production environment

2011-12-12 Thread Jeremiah Jordan
What java are you using? OpenJDK or Sun/Oracle (http://www.oracle.com/technetwork/java/javase/downloads/index.html)? If you are using OpenJDK you might try Sun. Have you run diagnostics on the disk? It is more likely there is an issue with your disk, not with Cassandra. On 12/11/2011 07:0

Re: Prevent create snapshot when truncate

2011-12-12 Thread Edward Capriolo
This is a tad annoying: a few commands snapshot for safety reasons, namely truncate, scrub, and maybe cleanup. You have to keep a mental note to run nodetool clearsnapshot a few seconds after running these. On Mon, Dec 12, 2011 at 3:29 AM, ruslan usifov wrote: > Hello > > Every time when we

Re: a query that's killing cassandra

2011-12-12 Thread Philippe
You've got at least one row over 1GB, compacted ! Have you checked whether you are running out of heap ? 2011/12/12 Wojtek Kulik > Hello everyone! > > I have a strange problem with Cassandra (v1.0.5, one node, 8GB, 2xCPU): a > query asking for each key from a certain (super) CF results in timeou

unsubscribe

2011-12-12 Thread Priyanka Ganuthula

One ColumnFamily places data on only 3 out of 4 nodes

2011-12-12 Thread Bart Swedrowski
Hello everyone, I seem to have come across a rather weird (at least for me!) problem / behaviour with Cassandra. I am running a 4-node cluster on Cassandra 0.8.7. For the keyspace in question, I have RF=3, SimpleStrategy with multiple ColumnFamilies inside the KeySpace. One of the ColumnFamilies

RE: Suggestion about syntax of CREATE COLUMN FAMILY

2011-12-12 Thread Stephen Pope
I'd like to second this. I've been working with Cassandra for a good while now, but when I first started little things like this were confusing. From: Don Smith [mailto:dsm...@likewise.com] Sent: Friday, December 09, 2011 3:41 PM To: user@cassandra.apache.org Subject: Suggestion about syntax of C

Re: best practices for simulating transactions in Cassandra

2011-12-12 Thread Jake Luciani
I've written a locking mechanism for Solandra (I refer to it as a reservation system) which basically allows you to acquire a lock. This is used to ensure a node is servicing unique sequential IDs for Lucene. It sounds a bit similar to Dominic's description but I'll explain how the Solandra one wo

a query that's killing cassandra

2011-12-12 Thread Wojtek Kulik
Hello everyone! I have a strange problem with Cassandra (v1.0.5, one node, 8GB, 2xCPU): a query asking for each key from a certain (super) CF results in timeout and almost dead cassandra after that (it's somewhat alive, but does not return any data - has to be restarted). CF details:

Re: best practices for simulating transactions in Cassandra

2011-12-12 Thread Dominic Williams
Hi guys, just thought I'd chip in... Fight My Monster is still using Cages, which is working fine, but... I'm looking at using Cassandra to replace Cages/ZooKeeper(!) There are 2 main reasons:- 1. Although a fast ZooKeeper cluster can handle a lot of load (we aren't getting anywhere near to capa

Re: CPU bound workload

2011-12-12 Thread Philippe
> > Ah, I keep always assuming random partitions since it is a very common > case (just to be sure: unless you specifically want the ordering > despite the downsides, you generally want to default to the random > partitioner). > Yes, I'm working on geographical data so everything is keyed by a deri

Re: Meaning of values in tpstats

2011-12-12 Thread Philippe
> > Took me a while to figure out that // == "parallel" :) > Sorry, that's left over from Math classes :) > I'm pretty sure (but not entirely, I'd have to check the code) that > the request is forwarded as one request to the necessary node(s); what > Humm... hadn't even thought of that forwarding

Prevent create snapshot when truncate

2011-12-12 Thread ruslan usifov
Hello, every time we do a truncate, Cassandra automatically creates a snapshot. How can we prevent this?