> I was playing with a single-node Cassandra installation when discovered
> that a request like [SELECT COUNT(*) FROM CF] seems to load the entire
> dataset of CF into RAM.
>
This is the case (the whole CF will be loaded in memory), and it's currently a known limitation of Cassandra 1.2. This will be fixed in Cassandra 2.0, but the fix requires some groundwork (done in https://issues.apache.org/jira/browse/CASSANDRA-4415) that is too complex to backport to 1.2. So avoid those count queries for now unless you know the data set is small.

> As far as I understand, a counting request works roughly the same way as
> [SELECT * FROM] with only difference that it doesn't return any data back.
> Is my reasoning correct?
>

That part is pretty much correct. If you do SELECT * FROM CF (that is, without any WHERE clause), it will also load the whole CF in memory, and I would bet that it will OOM as well (if the count(*) does).

--
Sylvain
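
One way to follow the advice above and still get a row count on 1.2 is to count client-side in bounded pages, so that no single request touches the whole CF. What follows is only a minimal sketch in Python, under stated assumptions: the table is assumed to have a single-column primary key called `key` (a hypothetical name), and `fetch_page` is a hypothetical helper that would run something like SELECT key FROM CF WHERE token(key) > token(<last_key>) LIMIT <page_size>. An in-memory stand-in replaces the real query here so the sketch runs without a cluster.

```python
import bisect
from typing import Callable, List, Optional

# Hypothetical signature: (last_key_seen_or_None, page_size) -> list of keys
FetchPage = Callable[[Optional[str], int], List[str]]


def count_in_pages(fetch_page: FetchPage, page_size: int = 1000) -> int:
    """Count rows while holding at most one page of keys in memory."""
    total = 0
    last_key: Optional[str] = None
    while True:
        page = fetch_page(last_key, page_size)
        total += len(page)
        # A short (or empty) page means there is nothing left to read.
        if len(page) < page_size:
            return total
        # Resume the next request right after the last key we saw.
        last_key = page[-1]


def make_fake_fetch(keys: List[str]) -> FetchPage:
    """In-memory stand-in for the paged query (assumption, not a driver API).

    A real cluster would return rows in token order; this stand-in simply
    orders by the key itself, which is enough to exercise the loop.
    """
    ordered = sorted(keys)

    def fetch_page(last_key: Optional[str], limit: int) -> List[str]:
        start = 0 if last_key is None else bisect.bisect_right(ordered, last_key)
        return ordered[start:start + limit]

    return fetch_page


if __name__ == "__main__":
    fake = make_fake_fetch([f"k{i:05d}" for i in range(2500)])
    print(count_in_pages(fake, page_size=1000))
```

The page size bounds how many keys any one request returns, which is the point of the exercise; the price is one round trip per page, so this is slow on a large CF.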