Re: Performance issues with "many" CQL columns

2016-02-14 Thread Gianluca Borello
Considering the (simplified) table that I wrote before: create table data ( id bigint, ts bigint, column1 blob, column2 blob, column3 blob, ... column29 blob, column30 blob, primary key (id, ts) ) A user request (varies every time) translates into a set of queries asking for a subset of the columns (< 1

Re: Performance issues with "many" CQL columns

2016-02-14 Thread Jack Krupansky
What does your query actually look like today? Is your non-EQ on timestamp selecting a single row, a few rows, or many rows (dozens, hundreds, thousands)? -- Jack Krupansky On Sun, Feb 14, 2016 at 7:40 PM, Gianluca Borello wrote: > Thanks again. > > One clarification about "reading in a single

Re: Performance issues with "many" CQL columns

2016-02-14 Thread Gianluca Borello
Thanks again. One clarification about "reading in a single SELECT": in my point 2, I mentioned the need to read a variable subset of columns every time, usually in the range of ~5 out of 30. I can't find a way to do that in a single SELECT unless I use the IN operator (which I can't, as explained)

Re: Performance issues with "many" CQL columns

2016-02-14 Thread Jack Krupansky
You can definitely read all of the columns in a single SELECT. And the N INSERTs can be batched and will insert fewer cells in the storage engine than the previous approach. -- Jack Krupansky On Sun, Feb 14, 2016 at 7:31 PM, Gianluca Borello wrote: > Thank you for your reply. > > Your advice is def

Re: Performance issues with "many" CQL columns

2016-02-14 Thread Gianluca Borello
Thank you for your reply. Your advice is definitely sound, although it still seems suboptimal to me because: 1) It requires N INSERT queries from the application code (where N is the number of columns) 2) It requires N SELECT queries from my application code (where N is the number of columns I n

Re: Performance issues with "many" CQL columns

2016-02-14 Thread Jack Krupansky
You could add the column number as an additional clustering key. And then you can actually use COMPACT STORAGE for even more efficient storage and access (assuming there is only a single non-PK data column, the blob value.) You can then access (read or write) an individual column/blob or a slice o
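
A minimal sketch of the layout Jack describes, for illustration only. The names data_by_column, column_id and value are hypothetical and do not appear in the thread; id, ts and the blob type come from the table Gianluca posted. The code only builds CQL strings (no driver, no cluster), and it assumes a Cassandra version of that era in which COMPACT STORAGE tables can still be created.

```python
# Sketch only: data_by_column, column_id and value are hypothetical names.
# id, ts and the blob type come from the table posted in the thread.

# The column number becomes the last clustering key; the blob is the single
# non-PK column, which is what COMPACT STORAGE requires here.
CREATE_TABLE = (
    "CREATE TABLE data_by_column (\n"
    "    id bigint,\n"
    "    ts bigint,\n"
    "    column_id int,\n"
    "    value blob,\n"
    "    PRIMARY KEY (id, ts, column_id)\n"
    ") WITH COMPACT STORAGE"
)

# Write one blob for one (id, ts, column_id).
INSERT_ONE = (
    "INSERT INTO data_by_column (id, ts, column_id, value) "
    "VALUES (?, ?, ?, ?)"
)

# Read a single blob: every key column is restricted by equality.
SELECT_ONE = (
    "SELECT value FROM data_by_column "
    "WHERE id = ? AND ts = ? AND column_id = ?"
)

# Read every blob stored for one (id, ts) pair in a single SELECT.
SELECT_ALL = (
    "SELECT column_id, value FROM data_by_column "
    "WHERE id = ? AND ts = ?"
)


def build_insert_batch(n_columns: int) -> str:
    """Return one unlogged batch holding n_columns INSERT statements."""
    if n_columns < 1:
        raise ValueError("n_columns must be at least 1")
    statements = [INSERT_ONE + ";" for _ in range(n_columns)]
    return "\n".join(["BEGIN UNLOGGED BATCH", *statements, "APPLY BATCH"])


if __name__ == "__main__":
    print(CREATE_TABLE)
    print(build_insert_batch(3))
```

All statements in such a batch target the same partition (same id), which is the case where batching the per-column INSERTs is reasonable. Whether this layout actually beats the 30-column table for a given workload is not something the sketch can show.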

Re: Performance issues with CQL3 collections?

2013-06-28 Thread Fabien Rousseau
IMHO, having many tombstones can slow down reads and writes in the following cases: - For reads, it is slow if the requested slice contains many tombstones - For writes, it is slower if the row in the memtable contains many tombstones. It's because, if the IntervalTree contains N intervals,

Re: Performance issues with CQL3 collections?

2013-06-28 Thread Sylvain Lebresne
As documented at http://cassandra.apache.org/doc/cql3/CQL.html#collections, lists have 3 operations that require a read before a write (and should thus be avoided in performance-sensitive code), namely setting and deleting by index, and removing by value. Outside of that, collections involve n

Re: Performance issues with CQL3 collections?

2013-06-27 Thread Theo Hultberg
the thing I was doing was definitely triggering the range tombstone issue, this is what I was doing: UPDATE clocks SET clock = ? WHERE shard = ? in this table: CREATE TABLE clocks (shard INT PRIMARY KEY, clock MAP) however, from the stack overflow posts it sounds like they aren't necess

Re: Performance issues with CQL3 collections?

2013-06-27 Thread aaron morton
Can you provide details of the mutation statements you are running? The Stack Overflow posts don't seem to include them. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 27/06/2013, at 5:58 AM, Theo Hultberg wrote:

Re: Performance issues with CQL3 collections?

2013-06-26 Thread Theo Hultberg
do I understand it correctly if I think that collection modifications are done by reading the collection, writing a range tombstone that would cover the collection and then re-writing the whole collection again? or is it just the modified parts of the collection that are covered by the range tombst

Re: Performance issues with CQL3 collections?

2013-06-26 Thread Fabien Rousseau
Hi, I'm pretty sure that it's related to this ticket: https://issues.apache.org/jira/browse/CASSANDRA-5677 I'd be happy if someone tests this patch. It should apply easily on 1.2.5 & 1.2.6. After applying the patch, by default, the current implementation is still used, but modify your cassandra.

RE: Performance issues after upgrading to Cassandra 0.7.6-2 from Cassandra 0.6.6

2011-06-03 Thread Nir Cohen
From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Friday, June 03, 2011 2:41 AM To: user@cassandra.apache.org Subject: Re: Performance issues after upgrading to Cassandra 0.7.6-2 from Cassandra 0.6.6 Where is th

Re: Performance issues after upgrading to Cassandra 0.7.6-2 from Cassandra 0.6.6

2011-06-02 Thread Jonathan Ellis
Where is the bottleneck? See http://spyced.blogspot.com/2010/01/linux-performance-basics.html On Thu, Jun 2, 2011 at 11:18 AM, Nir Cohen wrote: > Hi all, > > Recently we have upgraded our Cassandra servers to version 0.7.6-2 from > 0.6.6 > > We are experiencing severe performance issues – we m

Re: Performance Issues

2010-07-13 Thread Ran Tavory
Since you're using Hector, hector-users@ is a good place to be, so u...@cassandra to bcc. operateWithFailover is one stop before sending the request over the network and waiting, so it makes lots of sense that a significant part of the application's time is spent in it. On Tue, Jul 13, 2010 at 6:22 PM, Sa