Hi Shenghua, as I understand it, each range is assigned to its own mapper, and mappers do not share connections. Reading all the data therefore takes at least 256 connections in total. However, those 256 connections are not all open at the same time unless you have 256 mappers running at the same time.
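To illustrate the point, here is a minimal sketch that does not use Hadoop or Cassandra at all. It assumes each split opens exactly one connection for the lifetime of its task, and it models mapper slots as a fixed thread pool. The class and method names are hypothetical, made up for this example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class SplitConcurrencySketch {

    /**
     * Runs `splits` simulated map tasks on `slots` worker threads and
     * returns the largest number of "connections" open at any one moment.
     */
    static int peakOpenConnections(int splits, int slots) throws Exception {
        AtomicInteger open = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(slots);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < splits; i++) {
                pending.add(pool.submit(() -> {
                    // Assumption: one connection per split, opened when the task starts.
                    int now = open.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(2); // stand-in for reading the rows of the split
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        open.decrementAndGet(); // connection closed when the task ends
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for every simulated task to finish
            }
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) throws Exception {
        // 257 splits, but only 4 tasks may run at once.
        System.out.println("peak open connections: " + peakOpenConnections(257, 4));
    }
}
```

All 257 tasks run, so 257 connections are made in total, but the peak never exceeds the number of worker slots. In a real job the equivalent limit would be however many map tasks your cluster runs concurrently.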
On Tue, Jan 27, 2015 at 9:34 PM, Shenghua(Daniel) Wan <wansheng...@gmail.com> wrote:
> By default, each C* node is set with 256 tokens. On a local 1-node C*
> server, my hadoop drop creates 256 connections to the server. Is there any
> way to control this behavior? e.g. reduce the number of connections to a
> pre-configured gap.
>
> I debugged C* source code and found the client asks for partition ranges,
> or virtual nodes. Then the client was told by server there were 257 ranges,
> corresponding to 257 column family splits.
>
> Here is a snapshot of my logs
>
> 15/01/27 18:02:20 DEBUG hadoop.AbstractColumnFamilyInputFormat: adding
> ColumnFamilySplit((9121856086738887846, '-9223372036854775808] @[localhost])
> ...
> totally 257 splits.
>
> The problem is the user might only want all the data via a "select *" like
> statement. It seems that 257 connections to query the rows are necessary.
> However, is there any way to prohibit 257 concurrent connections?
>
> My C* version is 2.0.11 and I also tried CqlPagingInputFormat, which has
> same behavior.
>
> Thank you.
>
> --
>
> Regards,
> Shenghua (Daniel) Wan
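On the 256 tokens versus 257 splits: a likely explanation, consistent with the split in the log ending at -9223372036854775808, is that the one range wrapping around the ring is cut in two at the minimum token. The sketch below is my own reconstruction of that idea, not the Cassandra code. It assumes Murmur3Partitioner, whose minimum token is Long.MIN_VALUE, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class RingRangesSketch {

    // Assumption: Murmur3Partitioner, minimum token -9223372036854775808.
    static final long MIN_TOKEN = Long.MIN_VALUE;

    /**
     * Builds the (start, end] ranges between consecutive sorted tokens and
     * splits the range that wraps around the ring at MIN_TOKEN.
     */
    static List<long[]> unwrappedRanges(long[] sortedTokens) {
        List<long[]> ranges = new ArrayList<>();
        int n = sortedTokens.length;
        for (int i = 0; i < n; i++) {
            long start = sortedTokens[i];
            long end = sortedTokens[(i + 1) % n]; // last token pairs with the first
            boolean wraps = start >= end && end != MIN_TOKEN;
            if (wraps) {
                ranges.add(new long[] {start, MIN_TOKEN}); // (last token, MIN]
                ranges.add(new long[] {MIN_TOKEN, end});   // (MIN, first token]
            } else {
                ranges.add(new long[] {start, end});
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 256 evenly spaced illustrative tokens; real vnode tokens are random.
        long step = Long.MAX_VALUE / 128;
        long[] tokens = new long[256];
        for (int i = 0; i < tokens.length; i++) {
            tokens[i] = (long) (i - 128) * step;
        }
        System.out.println("ranges: " + unwrappedRanges(tokens).size());
    }
}
```

With 256 tokens this gives 255 ordinary ranges plus two halves of the wrapping range, which matches the 257 splits reported.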