Shenghua,

> The problem is the user might only want all the data via a "select *"
> like statement. It seems that 257 connections to query the rows are necessary.
> However, is there any way to prohibit 257 concurrent connections?

Your reasoning is correct. The number of connections should be tunable via the "cassandra.input.split.size" property. See ConfigHelper.setInputSplitSize(..)

The problem is that vnodes completely trash this, since the splits returned don't span across vnodes. There's an issue open for this, https://issues.apache.org/jira/browse/CASSANDRA-6091, but part of the problem is that the thrift code involved here is being rewritten¹ to be pure CQL.

In the meantime you can override CqlInputFormat and manually re-merge splits together wherever their location sets match, so as to better honour inputSplitSize and return to a more reasonable number of connections. We do this using code similar to this patch: https://github.com/michaelsembwever/cassandra/pull/2/files

~mck

¹ https://issues.apache.org/jira/browse/CASSANDRA-8358
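
PS. To make the re-merging idea concrete, here is a minimal, self-contained sketch of just the grouping logic. It is not the code from the patch linked above: the Split class, the merge(..) method and the row estimates are hypothetical stand-ins, and a real version would presumably live in a CqlInputFormat subclass and work on the actual Hadoop split objects. The assumption is simply that per-vnode splits with identical replica sets can be combined until they reach the configured split size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class SplitMerger {

    // Hypothetical stand-in for an input split: the replica hosts owning the
    // token range, plus an estimated row count. Not a Cassandra/Hadoop class.
    static final class Split {
        final Set<String> locations;
        final long estimatedRows;

        Split(long estimatedRows, String... locations) {
            this.estimatedRows = estimatedRows;
            this.locations = new TreeSet<>(Arrays.asList(locations));
        }
    }

    // Combine splits whose location sets match. A merged split is closed once
    // its estimated rows reach targetRows (the role of inputSplitSize).
    static List<List<Split>> merge(List<Split> splits, long targetRows) {
        // Group by location set, preserving first-seen order.
        Map<Set<String>, List<Split>> byLocation = new LinkedHashMap<>();
        for (Split s : splits) {
            byLocation.computeIfAbsent(s.locations, k -> new ArrayList<>()).add(s);
        }

        List<List<Split>> merged = new ArrayList<>();
        for (List<Split> group : byLocation.values()) {
            List<Split> current = new ArrayList<>();
            long rows = 0;
            for (Split s : group) {
                current.add(s);
                rows += s.estimatedRows;
                if (rows >= targetRows) {
                    merged.add(current);
                    current = new ArrayList<>();
                    rows = 0;
                }
            }
            // Keep whatever is left over as a final, smaller merged split.
            if (!current.isEmpty()) {
                merged.add(current);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Six small vnode splits of ~100 rows each across two replica sets.
        List<Split> splits = Arrays.asList(
                new Split(100, "A", "B"), new Split(100, "B", "C"),
                new Split(100, "A", "B"), new Split(100, "A", "B"),
                new Split(100, "B", "C"), new Split(100, "A", "B"));

        // With a target of 300 rows, six splits collapse to three.
        System.out.println("merged splits: " + merge(splits, 300).size());
    }
}
```

Each merged group would then be served by one task, and so one connection, instead of one per vnode range. Note that this only ever combines splits sharing exactly the same hosts, so data locality is preserved.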