I ran another experiment and verified that 3*257 mappers were indeed created
(1 of the 257 ranges is effectively null).

Thanks, mck, for the information!

On Wed, Jan 28, 2015 at 12:17 AM, mck <m...@apache.org> wrote:

> Shenghua,
>
> > The problem is the user might only want all the data via a "select *"
> > like statement. It seems that 257 connections to query the rows are
> > necessary.
> > However, is there any way to prohibit 257 concurrent connections?
>
>
> Your reasoning is correct.
> The number of connections should be tunable via the
> "cassandra.input.split.size" property. See
> ConfigHelper.setInputSplitSize(..)
>
> The problem is that vnodes completely trash this, since the splits
> returned don't span across vnodes.
> There's an issue out for this –
> https://issues.apache.org/jira/browse/CASSANDRA-6091
>  but part of the problem is that the thrift stuff involved here is
>  getting rewritten¹ to be pure cql.
>
> In the meantime you can override the CqlInputFormat and manually re-merge
> splits together, where location sets match, so as to better honour
> inputSplitSize and to return to a more reasonable number of connections.
> We do this, using code similar to this patch
> https://github.com/michaelsembwever/cassandra/pull/2/files
>
> ~mck
>
> ¹ https://issues.apache.org/jira/browse/CASSANDRA-8358
>
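
For anyone following along, the re-merge idea described above can be
sketched roughly as below. This is only an illustration and is not taken
from the linked patch: SplitMerger, Split and the estimatedRows field are
hypothetical stand-ins, not Cassandra or Hadoop classes. It assumes each
split knows its replica hosts and an estimated row count, groups splits
whose location sets match, and packs each group into merged splits of at
most targetRows rows (a single split larger than targetRows is kept as is).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class SplitMerger {

    /** Hypothetical stand-in for an input split; not a Cassandra class. */
    public static final class Split {
        public final Set<String> locations;    // replica hosts for this split
        public final List<String> tokenRanges; // ranges this split will read
        public final long estimatedRows;       // assumed row estimate

        public Split(Set<String> locations, List<String> tokenRanges, long estimatedRows) {
            // Defensive copies so merged splits do not share mutable state.
            this.locations = new TreeSet<>(locations);
            this.tokenRanges = new ArrayList<>(tokenRanges);
            this.estimatedRows = estimatedRows;
        }
    }

    /** Merge splits that share a location set, up to targetRows per merged split. */
    public static List<Split> merge(List<Split> splits, long targetRows) {
        // Set equality is content-based, so the location set can be the map key.
        Map<Set<String>, List<Split>> byLocation = new LinkedHashMap<>();
        for (Split s : splits) {
            byLocation.computeIfAbsent(s.locations, k -> new ArrayList<>()).add(s);
        }

        List<Split> merged = new ArrayList<>();
        for (Map.Entry<Set<String>, List<Split>> group : byLocation.entrySet()) {
            List<String> ranges = new ArrayList<>();
            long rows = 0;
            for (Split s : group.getValue()) {
                // Close the current merged split if adding s would exceed the target.
                if (!ranges.isEmpty() && rows + s.estimatedRows > targetRows) {
                    merged.add(new Split(group.getKey(), ranges, rows));
                    ranges = new ArrayList<>();
                    rows = 0;
                }
                ranges.addAll(s.tokenRanges);
                rows += s.estimatedRows;
            }
            if (!ranges.isEmpty()) {
                merged.add(new Split(group.getKey(), ranges, rows));
            }
        }
        return merged;
    }
}
```

In a real job the merged split would still have to read each of its token
ranges in turn over one connection, so the record reader needs to cope
with a list of ranges rather than a single one.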



-- 

Regards,
Shenghua (Daniel) Wan
