SELECT some_column vs SELECT *

Kai Wang Tue, 24 Nov 2015 07:47:11 -0800

Hi all,

If I have the following table:
CREATE TABLE t (
  pk int,
  ck int,
  c1 int,
  c2 int,
  ...
  PRIMARY KEY (pk, ck)
)


There are lots of non-clustering columns (1000+). From time to time I need
to do a query like this:

SELECT c1 FROM t WHERE pk = abc AND ck > xyz;

How efficient is this query compared to SELECT * ...? Apparently SELECT c1
would save a lot of network bandwidth since only c1 needs to be transferred
on the wire. But I am more interested in the impact on disk IO. If I
understand C* storage engine correctly, one CQL row is clustered together
on disk. That means c1 from different rows are stored apart. In the case of
SELECT c1, does C* do multiple seeks to only lift c1 of each row from disk
or lift the whole row into memory and return c1 from there?

>From comments on https://issues.apache.org/jira/browse/CASSANDRA-5762 it
seems C* lifts the whole row as of 1.2.7. Is this still the case on 2.1.*?

Thanks.

SELECT some_column vs SELECT *

Reply via email to