What is the data size of the column family you're trying to fetch with
paging? Are you storing big blobs or just primitive values?
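
The reason for asking: the fetch size is counted in rows, not bytes, so the
memory needed to build one page depends on how big each row is. Below is a
rough back-of-the-envelope sketch of that arithmetic. The average row sizes
(200 bytes for primitive columns, 1 MiB for a blob) are hypothetical
placeholders, not measurements from your cluster, and the estimate ignores
per-row overhead, tombstones and concurrent requests, so treat it as a lower
bound at best.

```java
public class PageHeapEstimate {

    // Rough payload size of one page: rows per page times average row size.
    static long bytesPerPage(int fetchSize, long avgRowBytes) {
        return (long) fetchSize * avgRowBytes;
    }

    // Convert a byte count to mebibytes for easier reading.
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // Page sizes mentioned in the original post.
        int[] fetchSizes = {5000, 1000};
        // Hypothetical average row sizes: small primitive columns vs. a 1 MiB blob.
        long[] avgRowSizes = {200L, 1024L * 1024L};

        for (int fetchSize : fetchSizes) {
            for (long rowBytes : avgRowSizes) {
                long pageBytes = bytesPerPage(fetchSize, rowBytes);
                System.out.printf("fetchSize=%d, avgRow=%d bytes -> about %.1f MiB per page%n",
                        fetchSize, rowBytes, toMiB(pageBytes));
            }
        }
    }
}
```

If the rows turn out to be small, the page payload alone would not account for
the heap growth you describe, and the cause would have to be looked for
elsewhere.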

On Fri, Jan 9, 2015 at 8:33 AM, Mohammed Guller <moham...@glassbeam.com>
wrote:

>  Hi –
>
>
>
> We have an ETL application that reads all rows from Cassandra (2.1.2),
> filters them and stores a small subset in an RDBMS. Our application is
> using DataStax's Java driver (2.1.4) to fetch data from the C* nodes. Since
> the Java driver supports automatic paging, I was under the impression that
> SELECT queries should not cause an OOM error on the C* nodes. However, even
> with just 16GB of data on each node, the C* nodes start throwing OOM errors
> as soon as the application starts iterating through the rows of a table.
>
>
>
> The application code looks something like this:
>
>
>
> Statement stmt = new SimpleStatement("SELECT x,y,z FROM cf").setFetchSize(5000);
>
> ResultSet rs = session.execute(stmt);
>
> while (!rs.isExhausted()) {
>
>       Row row = rs.one();
>
>       process(row);
>
> }
>
>
>
> Even after we reduced the page size to 1000, the C* nodes still crash. C*
> is running on m3.xlarge machines (4 cores, 15GB). We manually increased the
> heap size to 8GB just to see how much heap C* consumes. Within 10-15 minutes,
> the heap usage climbs to 7.6GB. That does not make sense. Either
> automatic paging is not working or we are missing something.
>
>
>
> Does anybody have insights as to what could be happening? Thanks.
>
>
>
> Mohammed
>
>
>
>
>
