Column Slice Query performance after deletions

2013-03-02 Thread Víctor Hugo Oliveira Molinar
Hello guys. I'm investigating the cause of performance degradation in the following scenario: - I have a column family in which each row holds thousands of columns (between 10k and 200k). I also have thousands of rows, not much more than 15k. - These rows are

Re: Column Slice Query performance after deletions

2013-03-02 Thread Michael Kjellman
When was the last time you did a cleanup on the cf? On Mar 2, 2013, at 9:48 AM, "Víctor Hugo Oliveira Molinar" wrote: > Hello guys. > I'm investigating the reasons of performance degradation for my case scenario > which follows: > > - I do have a column family which is filled of thousands of c

Re: Column Slice Query performance after deletions

2013-03-02 Thread Víctor Hugo Oliveira Molinar
I run daily maintenance on my cluster in which I truncate this column family, because its data doesn't need to be kept for more than a day. Since all the regular operations on it finish around 4 hours before the end of the day, I regularly run a truncate on it followed by a repair at the end of the da

Re: Column Slice Query performance after deletions

2013-03-02 Thread Michael Kjellman
What is your gc_grace set to? Sounds like your performance decreases as the number of tombstone records increases (which I would expect). On Mar 2, 2013, at 10:28 AM, "Víctor Hugo Oliveira Molinar" <vhmoli...@gmail.com> wrote: I have a daily maintenance of my cluster where I truncate thi

Re: Column Slice Query performance after deletions

2013-03-02 Thread Edward Capriolo
Cassandra's data files are write-once. Deletes are another write. Until compaction they all live on disk. Making really big rows has this problem. On Sat, Mar 2, 2013 at 1:42 PM, Michael Kjellman wrote: > What is your gc_grace set to? Sounds like as the number of tombstones > records increase your
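To illustrate the point above about deletes being just another write, here is a minimal toy sketch in Python. It is not Cassandra code: the `Cell` type, the `slice_first_n` helper, and the row layout are all hypothetical, and a real read also merges several data files by timestamp. It only shows why a slice over a wide row gets slower when many deleted columns have not yet been compacted away: the reader still has to step over every tombstone before it reaches live data.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for one column in a wide row."""
    name: str
    value: Optional[str]  # None marks a tombstone (a delete is just another write)


def slice_first_n(cells: List[Cell], n: int) -> Tuple[List[Cell], int]:
    """Return the first n live cells in name order and how many cells were scanned."""
    live: List[Cell] = []
    scanned = 0
    for cell in sorted(cells, key=lambda c: c.name):
        if len(live) == n:
            break
        scanned += 1
        if cell.value is not None:  # tombstones are read but never returned
            live.append(cell)
    return live, scanned


# Assumed example row: 1000 columns, the first 900 deleted but not yet compacted.
row = [Cell(f"col{i:04d}", None if i < 900 else "v") for i in range(1000)]
live, scanned = slice_first_n(row, 10)
print(len(live), scanned)
```

In this toy row, asking for 10 columns forces a scan over 910 cells, because the 900 tombstones sort ahead of the live data.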

Re: Column Slice Query performance after deletions

2013-03-02 Thread Víctor Hugo Oliveira Molinar
What is your gc_grace set to? Sounds like as the number of tombstones records increase your performance decreases. (Which I would expect) gc_grace is default. Cassandra's data files are write once. Deletes are another write. Until compaction they all live on disk. Making really big rows has these

Re: Column Slice Query performance after deletions

2013-03-02 Thread Michael Kjellman
Tombstones stay around until gc_grace expires, so you could lower it to see if that fixes the performance issues. Size-tiered or leveled compaction? On Mar 2, 2013, at 11:15 AM, "Víctor Hugo Oliveira Molinar" <vhmoli...@gmail.com> wrote: What is your gc_grace set to? Sounds like as the number
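As a rough sketch of the gc_grace suggestion above, the rule can be written as a small Python predicate. The function name and timestamps are hypothetical. The only Cassandra fact assumed is that the default `gc_grace_seconds` is 864000 (10 days). A tombstone that passes this check is merely eligible for removal; it is actually dropped only when a compaction processes it.

```python
GC_GRACE_SECONDS_DEFAULT = 864_000  # 10 days, the default gc_grace_seconds


def tombstone_purgeable(deleted_at: int, now: int,
                        gc_grace_seconds: int = GC_GRACE_SECONDS_DEFAULT) -> bool:
    """True once the tombstone is older than the grace period (times in epoch seconds)."""
    return now - deleted_at > gc_grace_seconds


# A column deleted 5 hours ago (assumed timestamps, for illustration only).
deleted_at = 1_362_200_000
now = deleted_at + 5 * 3600

print(tombstone_purgeable(deleted_at, now))                         # default grace
print(tombstone_purgeable(deleted_at, now, gc_grace_seconds=3600))  # 1-hour grace
```

With the default, a tombstone written during the day is still on disk at the end of the day; with a much shorter grace period it becomes eligible sooner, at the cost of a narrower window in which repair can propagate the delete to every replica.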

Re: Column Slice Query performance after deletions

2013-03-02 Thread Víctor Hugo Oliveira Molinar
Tombstones stay around until gc grace so you could lower that to see if that fixes the performance issues. If the tombstones get collected, the column will live again, causing data inconsistency, since I can't run a repair during regular operations. Not sure if I got your thoughts on this. Size

Re: -pr vs. no -pr

2013-03-02 Thread Jim Cistaro
One other slight advantage of -pr... We sometimes have repairs that hang and need to be killed and restarted. -pr means you only have to "redo" a fraction of the work. jc -Original Message- From: , Dean Reply-To: "user@cassandra.apache.org" Date: Friday, March 1, 2013 5:46 AM To: "user@cassan