Re: Proper use of COUNT

2016-04-19 Thread DuyHai Doan
Jack, you should have a look at my blog post; I did some testing with various values for paging using aggregate functions: http://www.doanduyhai.com/blog/?p=2015
On Tue, Apr 19, 2016 at 10:23 PM, Jack Krupansky wrote:
> BTW, I did notice this Jira for setting a client timeout for cqlsh, so
> mayb

Re: Proper use of COUNT

2016-04-19 Thread Jack Krupansky
BTW, I did notice this Jira for setting a client timeout for cqlsh, so maybe this is the culprit for that user: CASSANDRA-7516 - Configurable client timeout for cqlsh https://issues.apache.org/jira/browse/CASSANDRA-7516 Or, should they actually be using the --request-timeout command line option f

Re: Proper use of COUNT

2016-04-19 Thread Jack Krupansky
Sylvain & Tyler, this Jira is for a user reporting a timeout for SELECT COUNT(*) using 3.3: https://issues.apache.org/jira/browse/CASSANDRA-11566 I'll let one of you guys follow up on that. I mean, I thought it was timing out due to the amount of data, but you guys are saying that paging should ma

Re: Proper use of COUNT

2016-04-19 Thread Tyler Hobbs
On Tue, Apr 19, 2016 at 11:32 AM, Jack Krupansky wrote:
> Are the queries sent from the coordinator to other nodes sequencing
> through partitions in token order and that's what allows the coordinator to
> dedupe with just a single page at a time? IOW, if a target node responds
> with a row fro

Re: Proper use of COUNT

2016-04-19 Thread Jack Krupansky
Thanks, Tyler. "Deduping (i.e. normal conflict resolution) happens per-page" Are the queries sent from the coordinator to other nodes sequencing through partitions in token order and that's what allows the coordinator to dedupe with just a single page at a time? IOW, if a target node responds wit

Re: Proper use of COUNT

2016-04-19 Thread Tyler Hobbs
On Tue, Apr 19, 2016 at 9:51 AM, Jack Krupansky wrote:
> 1. Another clarification: All of the aggregate functions, AVG, SUM, MIN,
> MAX are in exactly the same boat as COUNT, right?
Yes.
> 2. Is the paging for COUNT, et al, done within the coordinator node?
Yes.
> 3. Does dedupe o
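To illustrate why AVG, SUM, MIN and MAX can be in the same boat as COUNT, here is a minimal sketch of aggregates folded one page at a time. It is not Cassandra's coordinator code; the `RunningAggregates` class and its method names are hypothetical, and pages are plain Python lists standing in for query result pages.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunningAggregates:
    """Hypothetical accumulator: keeps only running totals, never the rows."""
    count: int = 0
    total: float = 0.0
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def add_page(self, page: List[float]) -> None:
        # Fold one page of values into the running state, then drop the page.
        for value in page:
            self.count += 1
            self.total += value
            if self.minimum is None or value < self.minimum:
                self.minimum = value
            if self.maximum is None or value > self.maximum:
                self.maximum = value

    @property
    def average(self) -> Optional[float]:
        # AVG is derived from SUM and COUNT; undefined for zero rows.
        return self.total / self.count if self.count else None


if __name__ == "__main__":
    agg = RunningAggregates()
    for page in ([3.0, 1.0], [4.0], [1.0, 5.0]):
        agg.add_page(page)
    print(agg.count, agg.total, agg.minimum, agg.maximum, agg.average)
```

Under these assumptions the state kept between pages is a handful of numbers, so memory use depends on the page size rather than on the total number of rows scanned.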

Re: Proper use of COUNT

2016-04-19 Thread Jack Krupansky
Thanks for that clarification, Sylvain. 1. Another clarification: All of the aggregate functions, AVG, SUM, MIN, MAX are in exactly the same boat as COUNT, right? 2. Is the paging for COUNT, et al, done within the coordinator node? 3. Does dedupe on the coordinator node consume memory proportion

Re: Proper use of COUNT

2016-04-19 Thread Sylvain Lebresne
> Except for relatively small or narrow queries, it seems to have a
> propensity for timing out.
For recent enough versions of C*, it shouldn't, since it pages internally (it will be slow, as it always will be, but it shouldn't time out if a decent page size is used, which should be the default)
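The idea of counting with internal paging can be sketched as follows. This is a simplified illustration, not Cassandra's implementation: `pages` and `paged_count` are hypothetical helpers, the rows are an in-memory iterable rather than a real table, and the page size of 5000 is only an assumed value for the example.

```python
from typing import Iterable, Iterator, List


def pages(rows: Iterable[int], page_size: int) -> Iterator[List[int]]:
    """Yield successive pages holding at most page_size rows each."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    page: List[int] = []
    for row in rows:
        page.append(row)
        if len(page) == page_size:
            yield page
            page = []
    if page:
        # Emit the final, possibly shorter, page.
        yield page


def paged_count(rows: Iterable[int], page_size: int = 5000) -> int:
    """Count rows one page at a time, keeping only a running total."""
    total = 0
    for page in pages(rows, page_size):
        total += len(page)
    return total


if __name__ == "__main__":
    print(paged_count(range(12345), page_size=5000))
```

Each page is a separate, bounded unit of work, which is why a full count can be slow overall without any single request having to cover the whole data set.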

Proper use of COUNT

2016-04-18 Thread Jack Krupansky
Based on a recent inquiry and a recent thread of my own, and the coming support for wide rows, I'll focus in on this question that I feel needs better documentation of recommended best practice: When can the COUNT(*) aggregate row-counting function be used? Except for relatively small or narrow q