I mean that when the number of nodes grows, the total number of virtual nodes grows with it. A connection is created for each vnode (that is, each partition range). With 3 nodes, 256 tokens each, and replication factor=1 for simplicity, there will be 3*256 = 768 virtual nodes, and therefore that many connections. Let me know if there is any incorrect reasoning here. Thanks.
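P.S. To make the arithmetic concrete, here is a rough sketch of how I am counting. The function names and the figure of 8 concurrent mappers are made up for illustration and are not part of any C* or Hadoop API. It also ignores the extra wrap-around range that made my 1-node log show 257 splits instead of 256. The second function reflects Huiliang's point that the connections are only all open at once if that many mappers run at the same time.

```python
def total_vnodes(nodes: int, tokens_per_node: int = 256) -> int:
    """Token ranges (vnodes) in the ring; assumes one input split per range."""
    return nodes * tokens_per_node


def peak_connections(splits: int, concurrent_mappers: int) -> int:
    """Each mapper holds its own connection, so the number open at once
    is bounded by whichever is smaller: splits or running mappers."""
    return min(splits, concurrent_mappers)


if __name__ == "__main__":
    splits = total_vnodes(3)            # 3 nodes, default 256 tokens each
    print(splits)                       # total connections over the whole job
    print(peak_connections(splits, 8))  # hypothetical 8 mappers running at once
```

So the total over the life of the job grows with the cluster, while the number open at any moment depends on how many mappers the Hadoop cluster runs concurrently.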
On Tue, Jan 27, 2015 at 11:21 PM, Huiliang Zhang <zhl...@gmail.com> wrote:

> In that case, each node will have 256/3 connections at most. Still 256
> mappers. Someone please correct me if I am wrong.
>
> On Tue, Jan 27, 2015 at 11:04 PM, Shenghua(Daniel) Wan <
> wansheng...@gmail.com> wrote:
>
>> Hi, Huiliang,
>> Great to hear from you, again!
>> Imagine you have 3 nodes, replication factor=1, and using default number of
>> tokens. You will have 3*256 mappers... In that case, you will be soon out
>> of mappers or reach the limit.
>>
>> On Tue, Jan 27, 2015 at 10:59 PM, Huiliang Zhang <zhl...@gmail.com>
>> wrote:
>>
>>> Hi Shenghua, as I understand, each range is assigned to a mapper. Mapper
>>> will not share connections. So, it needs at least 256 connections to read
>>> all. But all 256 connections should not be set up at the same time unless
>>> you have 256 mappers running at the same time.
>>>
>>> On Tue, Jan 27, 2015 at 9:34 PM, Shenghua(Daniel) Wan <
>>> wansheng...@gmail.com> wrote:
>>>
>>>> By default, each C* node is set with 256 tokens. On a local 1-node C*
>>>> server, my hadoop drop creates 256 connections to the server. Is there any
>>>> way to control this behavior? e.g. reduce the number of connections to a
>>>> pre-configured gap.
>>>>
>>>> I debugged C* source code and found the client asks for partition
>>>> ranges, or virtual nodes. Then the client was told by server there were 257
>>>> ranges, corresponding to 257 column family splits.
>>>>
>>>> Here is a snapshot of my logs
>>>>
>>>> 15/01/27 18:02:20 DEBUG hadoop.AbstractColumnFamilyInputFormat: adding
>>>> ColumnFamilySplit((9121856086738887846, '-9223372036854775808]
>>>> @[localhost])
>>>> ...
>>>> totally 257 splits.
>>>>
>>>> The problem is the user might only want all the data via a "select *"
>>>> like statement. It seems that 257 connections to query the rows are
>>>> necessary. However, is there any way to prohibit 257 concurrent
>>>> connections?
>>>>
>>>> My C* version is 2.0.11 and I also tried CqlPagingInputFormat, which
>>>> has same behavior.
>>>>
>>>> Thank you.
>>>>
>>>> --
>>>>
>>>> Regards,
>>>> Shenghua (Daniel) Wan

--

Regards,
Shenghua (Daniel) Wan