Re: large range read in Cassandra

2015-02-02 Thread Dan Kinder
For the benefit of others: I ended up finding out that the CQL library I was using (https://github.com/gocql/gocql) currently defaults to no paging, so Cassandra was trying to pull all rows of the partition into memory at once. Setting the page size to a reasonable numbe
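
To illustrate why the page size matters here, below is a minimal sketch in plain Go that does not talk to Cassandra at all. `fetchPage` and `scan` are hypothetical helpers made up for this example: `fetchPage` stands in for one round trip to the server, and a page size of 0 is used to mean "no paging". The only point is that the peak number of rows held at once is the whole partition without paging, and at most one page with it. As far as I know, the real setting in gocql is `PageSize` on the query (or on the cluster config), with the rows then read through an iterator, but check the driver docs for the version you run.

```go
package main

import "fmt"

// fetchPage is a hypothetical stand-in for one server round trip. It returns
// at most pageSize rows starting at offset; pageSize <= 0 means "no paging",
// so everything from offset to the end comes back in one response.
func fetchPage(partition []int, offset, pageSize int) []int {
	if offset >= len(partition) {
		return nil
	}
	end := len(partition)
	if pageSize > 0 && offset+pageSize < end {
		end = offset + pageSize
	}
	return partition[offset:end]
}

// scan walks the whole partition page by page. It reports how many rows were
// seen in total and the largest number of rows held in a single response.
func scan(partition []int, pageSize int) (rows, maxResident int) {
	for offset := 0; offset < len(partition); {
		page := fetchPage(partition, offset, pageSize)
		if len(page) > maxResident {
			maxResident = len(page)
		}
		rows += len(page)
		offset += len(page)
	}
	return rows, maxResident
}

func main() {
	partition := make([]int, 100000) // pretend this is one wide partition
	for _, size := range []int{0, 1000} {
		rows, peak := scan(partition, size)
		fmt.Printf("pageSize=%d rows=%d peak=%d\n", size, rows, peak)
	}
}
```

Both runs see the same 100000 rows; only the peak differs (100000 rows in one response without paging, 1000 with a page size of 1000).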

Re: large range read in Cassandra

2014-11-25 Thread Dan Kinder
Thanks, very helpful Rob, I'll watch for that. On Tue, Nov 25, 2014 at 11:45 AM, Robert Coli wrote: > On Tue, Nov 25, 2014 at 10:45 AM, Dan Kinder wrote: > >> To be clear, I expect this range query to take a long time and perform >> relatively heavy I/O. What I expected Cassandra to do was use

Re: large range read in Cassandra

2014-11-25 Thread Robert Coli
On Tue, Nov 25, 2014 at 10:45 AM, Dan Kinder wrote: > To be clear, I expect this range query to take a long time and perform > relatively heavy I/O. What I expected Cassandra to do was use auto-paging ( > https://issues.apache.org/jira/browse/CASSANDRA-4415, > http://stackoverflow.com/questions/1

Re: large range read in Cassandra

2014-11-25 Thread Dan Kinder
Thanks Rob. To be clear, I expect this range query to take a long time and perform relatively heavy I/O. What I expected Cassandra to do was use auto-paging ( https://issues.apache.org/jira/browse/CASSANDRA-4415, http://stackoverflow.com/questions/17664438/iterating-through-cassandra-wide-row-with

Re: large range read in Cassandra

2014-11-24 Thread Robert Coli
On Mon, Nov 24, 2014 at 4:26 PM, Dan Kinder wrote: > We have a web crawler project currently based on Cassandra ( > https://github.com/iParadigms/walker, written in Go and using the gocql > driver), with the following relevant usage pattern: > > - Big range reads over a CF to grab potentially mil

large range read in Cassandra

2014-11-24 Thread Dan Kinder
Hi,

We have a web crawler project currently based on Cassandra (https://github.com/iParadigms/walker, written in Go and using the gocql driver), with the following relevant usage pattern:

- Big range reads over a CF to grab potentially millions of rows and dispatch new links to crawl
- Fast inse