What is the data size of the column family you're trying to fetch with paging? Are you storing big blobs or just primitive values?
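The reason for asking: the fetch size is a number of rows, not a number of bytes, so the memory needed to build one page depends on how wide the rows are. Here is a rough back-of-the-envelope sketch. The class name and the average row sizes are made up for illustration; only the fetch sizes 5000 and 1000 come from your message.

```java
// Rough estimate of how many bytes one result page may hold.
// All row sizes below are hypothetical; measure your own table to get real numbers.
class PageSizeEstimate {

    // Bytes in one page = rows per page * average bytes per row.
    // Math.multiplyExact throws ArithmeticException on overflow instead of wrapping.
    static long estimatePageBytes(int fetchSize, long avgRowBytes) {
        return Math.multiplyExact((long) fetchSize, avgRowBytes);
    }

    public static void main(String[] args) {
        int[] fetchSizes = {5000, 1000};            // the two page sizes tried in the original post
        long[] avgRowBytes = {100L, 1024L * 1024L}; // assumed: small primitive row vs. 1 MiB blob row

        for (int fetchSize : fetchSizes) {
            for (long rowBytes : avgRowBytes) {
                long pageBytes = estimatePageBytes(fetchSize, rowBytes);
                System.out.printf("fetchSize=%d, avgRowBytes=%d -> about %d MiB per page%n",
                        fetchSize, rowBytes, pageBytes / (1024L * 1024L));
            }
        }
    }
}
```

If the rows turn out to be blob-sized, even a page of 1000 rows could be large relative to the heap, which is why the row size matters here.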
On Fri, Jan 9, 2015 at 8:33 AM, Mohammed Guller <moham...@glassbeam.com> wrote:

> Hi,
>
> We have an ETL application that reads all rows from Cassandra (2.1.2),
> filters them, and stores a small subset in an RDBMS. Our application uses
> DataStax's Java driver (2.1.4) to fetch data from the C* nodes. Since the
> Java driver supports automatic paging, I was under the impression that
> SELECT queries should not cause an OOM error on the C* nodes. However, even
> with just 16GB of data on each node, the C* nodes start throwing OOM errors
> as soon as the application starts iterating through the rows of a table.
>
> The application code looks something like this:
>
> Statement stmt = new SimpleStatement("SELECT x,y,z FROM cf").setFetchSize(5000);
> ResultSet rs = session.execute(stmt);
> while (!rs.isExhausted()) {
>     Row row = rs.one();
>     process(row);
> }
>
> Even after we reduced the page size to 1000, the C* nodes still crash. C*
> is running on M3.xlarge machines (4 cores, 15GB). We manually increased the
> heap size to 8GB just to see how much heap C* consumes. Within 10-15 minutes,
> the heap usage climbs to 7.6GB. That does not make sense. Either automatic
> paging is not working or we are missing something.
>
> Does anybody have insights as to what could be happening? Thanks.
>
> Mohammed