Re: Possibility of going OOM using get_count

2011-09-25 Thread Boris Yen
Hi Aaron, thanks for the explanation. I know performance will vary when the offset is very large, as mentioned in CASSANDRA-261. Even if users implement the offset on the client side, they suffer the same issues. I just think it would be nice if Cassandra

Re: Possibility of going OOM using get_count

2011-09-24 Thread aaron morton
The changes in get_count() are designed to stop counts for very large rows running out of memory as they try to hold millions of columns in memory. So if you ask to count all the cols in a row with 1M cols, it will (by default) read the first 1024 columns, and then the next 1024 using the last
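
A rough sketch of the paging idea described above, not the actual Cassandra code. It assumes a row can be modelled as a sorted Python list of unique column names, and `get_slice` here is a hypothetical in-memory stand-in for a slice query with an inclusive start column. Only the 1024 default page size comes from the message; every name is made up for illustration.

```python
from bisect import bisect_left

def get_slice(row, start, count):
    """Hypothetical stand-in for a slice query: return up to `count`
    column names >= `start` (inclusive), in sorted order."""
    i = 0 if start is None else bisect_left(row, start)
    return row[i:i + count]

def paged_count(row, page_size=1024):
    """Count columns one page at a time instead of loading the whole row."""
    if page_size < 2:
        # The start column is inclusive, so each later page repeats one
        # column; a page size of 1 would never make progress.
        raise ValueError("page_size must be at least 2")
    total = 0
    start = None
    while True:
        page = get_slice(row, start, page_size)
        if start is not None:
            page = page[1:]  # drop the column already counted on the previous page
        if not page:
            return total
        total += len(page)
        start = page[-1]  # last column seen becomes the next start column

row = [f"col{i:07d}" for i in range(3000)]
print(paged_count(row))
```

At most `page_size` column names are held in memory at any time, which is the point of the change.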

Re: Possibility of going OOM using get_count

2011-09-22 Thread Boris Yen
On Fri, Sep 23, 2011 at 12:28 PM, aaron morton wrote: > Offsets have been discussed previously. IIRC the main concerns were > either: > > There is no way to reliably count to start the offset, i.e. we do not lock > the row > In the new get_count function, Cassandra does the internal paging in

Re: Possibility of going OOM using get_count

2011-09-22 Thread aaron morton
Offsets have been discussed previously. IIRC the main concerns were either: there is no way to reliably count to the start of the offset, i.e. we do not lock the row; or performance related, as there is no reliable way to skip 10,000 columns other than counting 10,000 columns. With a start col

Re: Possibility of going OOM using get_count

2011-09-22 Thread Boris Yen
I was wondering if it is possible to use an approach similar to CASSANDRA-2894 so that the slice predicate supports the offset concept? With an offset, it would be much easier to implement paging from the client side. Boris On Mon, Sep 19, 2011 at 9

Re: Possibility of going OOM using get_count

2011-09-19 Thread Tharindu Mathew
Yes Aaron, that self-implemented paging is what I'm trying. Jonathan, the last column read in the previous result is the starting column of the next iteration. The end column remains constant. This is using slice ranges. AFAIU, that should work. Regards, Tharindu
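
The client-side paging described above might look roughly like this. This is a sketch under assumptions, not client library code: the row is modelled as a sorted list of unique column names, and `get_slice` is a hypothetical in-memory stand-in for a slice-range query with inclusive start and end columns. All function and parameter names are invented for illustration.

```python
from bisect import bisect_left, bisect_right

def get_slice(row, start, finish, count):
    """Hypothetical stand-in for a slice-range query: up to `count`
    column names in [start, finish], both ends inclusive."""
    lo = bisect_left(row, start)
    hi = bisect_right(row, finish)
    return row[lo:min(hi, lo + count)]

def iter_columns(row, start, finish, page_size=100):
    """Yield every column in [start, finish], fetching one page at a time."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2")
    first_page = True
    while True:
        page = get_slice(row, start, finish, page_size)
        if not first_page:
            page = page[1:]  # the start column was already returned last time
        if not page:
            return
        yield from page
        start = page[-1]     # last column read starts the next iteration
        first_page = False   # the end column `finish` stays constant

row = ["a", "b", "c", "d", "e", "f", "g"]
print(list(iter_columns(row, "b", "f", page_size=3)))
print(sum(1 for _ in iter_columns(row, "b", "f", page_size=3)))
```

Counting on the client is then just a matter of counting what the iterator yields, at the cost of pulling every column over the wire.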

Re: Possibility of going OOM using get_count

2011-09-19 Thread Jonathan Ellis
Unfortunately no, because you don't know what the actual last-column-counted was. On Mon, Sep 19, 2011 at 4:25 AM, aaron morton wrote: > get_count() supports the same predicate as get_slice. So you can implement > the paging yourself. > Cheers > - > Aaron Morton > Freelance Cassan

Re: Possibility of going OOM using get_count

2011-09-19 Thread aaron morton
get_count() supports the same predicate as get_slice. So you can implement the paging yourself. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 19/09/2011, at 8:45 PM, Tharindu Mathew wrote: > > > On Mon, Sep 19, 2011 at 12:40

Re: Possibility of going OOM using get_count

2011-09-19 Thread Tharindu Mathew
On Mon, Sep 19, 2011 at 12:40 PM, Benoit Perroud wrote: > The workaround for 0.7 is calling get_slice and count on client side. > It's heavier, sure, but you will then be able to set start column > accordingly. > I was afraid of that :( Will follow that method. Thanks. > > > > 2011/9/19 Tharin

Re: Possibility of going OOM using get_count

2011-09-19 Thread Benoit Perroud
The workaround for 0.7 is calling get_slice and counting on the client side. It's heavier, sure, but you will then be able to set the start column accordingly. 2011/9/19 Tharindu Mathew : > Thanks Aaron and Jake for the replies. > Any chance of a possible workaround to use for Cassandra 0.7? > > On Mon, Se

Re: Possibility of going OOM using get_count

2011-09-18 Thread Tharindu Mathew
Thanks Aaron and Jake for the replies. Any chance of a possible workaround to use for Cassandra 0.7? On Mon, Sep 19, 2011 at 3:48 AM, aaron morton wrote: > Cool > > Thanks, A > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On

Re: Possibility of going OOM using get_count

2011-09-18 Thread aaron morton
Cool Thanks, A - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 19/09/2011, at 9:55 AM, Jake Luciani wrote: > This is fixed in 1.0 > https://issues.apache.org/jira/browse/CASSANDRA-2894 > > > On Sun, Sep 18, 2011 at 2:16 PM, Tharindu M

Re: Possibility of going OOM using get_count

2011-09-18 Thread Jake Luciani
This is fixed in 1.0 https://issues.apache.org/jira/browse/CASSANDRA-2894 On Sun, Sep 18, 2011 at 2:16 PM, Tharindu Mathew wrote: > Hi everyone, > > I noticed this line in the API docs, > > The method is not O(1). It takes all the columns from disk to calculate the > answer. The only benefit of

Re: Possibility of going OOM using get_count

2011-09-18 Thread aaron morton
yes. - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 19/09/2011, at 7:16 AM, Tharindu Mathew wrote: > Hi everyone, > > I noticed this line in the API docs, > The method is not O(1). It takes all the columns from disk to calculate the >

Possibility of going OOM using get_count

2011-09-18 Thread Tharindu Mathew
Hi everyone, I noticed this line in the API docs: "The method is not O(1). It takes all the columns from disk to calculate the answer. The only benefit of the method is that you do not need to pull all the columns over Thrift interface to count them." Does this mean if a row has a large number of c