I mean that when the number of nodes grows, there are more virtual nodes in
total. For each vnode (or partition range), a connection will be created.
For 3 nodes with 256 tokens each (replication factor=1 for simplicity), there
will be 3*256 = 768 virtual nodes, and therefore that many connections. Let me
know if there is any incorrect reasoning here. Thanks.
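
To make the arithmetic concrete, here is a minimal Python sketch. The
function names and the map_slots parameter are made up for illustration and
are not part of C* or Hadoop. It assumes one split per token range and one
connection per running mapper, as discussed in this thread; the extra 257th
split seen in my log is not modeled.

```python
def total_token_ranges(nodes: int, tokens_per_node: int) -> int:
    """Total vnodes (token ranges) in the ring, assuming one split per range."""
    return nodes * tokens_per_node


def peak_concurrent_connections(splits: int, map_slots: int) -> int:
    """Assuming one connection per running mapper, the peak number of open
    connections is bounded by how many mappers can run at the same time."""
    return min(splits, map_slots)


if __name__ == "__main__":
    # map_slots=8 is an arbitrary, hypothetical cluster-wide mapper limit.
    for nodes in (1, 3):
        splits = total_token_ranges(nodes, tokens_per_node=256)
        peak = peak_concurrent_connections(splits, map_slots=8)
        print(f"nodes={nodes} splits={splits} peak_connections={peak}")
```

Under these assumptions the total number of connections over the whole job
grows with the number of nodes, while the number open at any one time depends
on how many mappers are allowed to run concurrently.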

On Tue, Jan 27, 2015 at 11:21 PM, Huiliang Zhang <zhl...@gmail.com> wrote:

> In that case, each node will have 256/3 connections at most. Still 256
> mappers. Someone please correct me if I am wrong.
>
> On Tue, Jan 27, 2015 at 11:04 PM, Shenghua(Daniel) Wan <
> wansheng...@gmail.com> wrote:
>
>> Hi, Huiliang,
>> Great to hear from you, again!
>> Imagine you have 3 nodes, replication factor=1, and the default number of
>> tokens. You will have 3*256 mappers... In that case, you will soon run out
>> of mappers or reach the limit.
>>
>>
>> On Tue, Jan 27, 2015 at 10:59 PM, Huiliang Zhang <zhl...@gmail.com>
>> wrote:
>>
>>> Hi Shenghua, as I understand it, each range is assigned to a mapper. Mappers
>>> do not share connections, so it takes at least 256 connections to read
>>> everything. But all 256 connections should not be open at the same time unless
>>> you have 256 mappers running at the same time.
>>>
>>> On Tue, Jan 27, 2015 at 9:34 PM, Shenghua(Daniel) Wan <
>>> wansheng...@gmail.com> wrote:
>>>
>>>> By default, each C* node is set with 256 tokens. On a local 1-node C*
>>>> server, my hadoop drop creates 256 connections to the server. Is there any
>>>> way to control this behavior, e.g. to reduce the number of connections to a
>>>> pre-configured cap?
>>>>
>>>> I debugged the C* source code and found that the client asks for partition
>>>> ranges, or virtual nodes. The server then told the client there were 257
>>>> ranges, corresponding to 257 column family splits.
>>>>
>>>> Here is a snippet from my logs:
>>>>
>>>> 15/01/27 18:02:20 DEBUG hadoop.AbstractColumnFamilyInputFormat: adding
>>>> ColumnFamilySplit((9121856086738887846, '-9223372036854775808] 
>>>> @[localhost])
>>>> ...
>>>> 257 splits in total.
>>>>
>>>> The problem is that the user might simply want all the data, as with a
>>>> "select *"-like statement. It seems that 257 connections are needed to
>>>> query all the rows. However, is there any way to prevent 257 concurrent
>>>> connections?
>>>>
>>>> My C* version is 2.0.11. I also tried CqlPagingInputFormat, which
>>>> has the same behavior.
>>>>
>>>> Thank you.
>>>>
>>>> --
>>>>
>>>> Regards,
>>>> Shenghua (Daniel) Wan
>>>>
>>>
>>>
>>
>>
>> --
>>
>> Regards,
>> Shenghua (Daniel) Wan
>>
>
>


-- 

Regards,
Shenghua (Daniel) Wan
