Hi Ewan,

It’s not the low 30 bits that are reserved, just the first 2^30 block IDs
(0 through 2^30 - 1). Block IDs are longs, so we are unlikely to ever run out.

The choice was arbitrary and not based on the likelihood of collision with
earlier block IDs.
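
To illustrate the distinction, here is a minimal sketch. It is not the actual
SequentialBlockIdGenerator source; the class, constant and method names are
hypothetical, and it assumes allocation simply counts up from 2^30. It shows
that skipping the first 2^30 IDs is a range check on the whole value, while
the low 30 bits of later IDs are used freely.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not the real HDFS class: sequential IDs that skip
// the first 2^30 values (0 .. 2^30 - 1).
final class BlockIdSketch {
    // Number of IDs skipped at the start of the ID space (assumed: 2^30).
    static final long RESERVED_ID_COUNT = 1L << 30;

    // Next ID to hand out; starts just past the reserved range.
    private final AtomicLong next = new AtomicLong(RESERVED_ID_COUNT);

    // Returns the next sequential ID.
    long nextId() {
        return next.getAndIncrement();
    }

    // True only for IDs inside the skipped range [0, 2^30).
    static boolean inReservedRange(long id) {
        return id >= 0 && id < RESERVED_ID_COUNT;
    }

    // Extracts the low 30 bits of an ID.
    static long low30Bits(long id) {
        return id & (RESERVED_ID_COUNT - 1);
    }

    public static void main(String[] args) {
        BlockIdSketch gen = new BlockIdSketch();
        long first = gen.nextId();
        long second = gen.nextId();
        System.out.println(first);                    // 1073741824 (2^30)
        System.out.println(second);                   // 1073741825
        System.out.println(low30Bits(second));        // 1: low bits are in use
        System.out.println(inReservedRange(second));  // false
    }
}
```

With a signed 64-bit ID, the skipped range is 2^30 out of 2^63 non-negative
values, which is why running out is not a practical concern.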


On 8/25/16, 2:38 AM, "Ewan Higgs" <ewan.hi...@hgst.com> wrote:

    Hi all,
    I see in o.a.h.hdfs.server.blockmanagement.SequentialBlockIdGenerator that 
the low 30 bits of the Block ID are reserved. This was set out in HDFS-4645 
[1,2]:
    
    “””
    We do not change the block ID of any existing blocks on upgrade. Such 
existing blocks whose IDs were randomly generated are subsequently referred to 
as legacy blocks.
    
    Henceforth block IDs will be allocated sequentially starting from a fixed 
constant e.g. 2^30.
    “””
    
    This doesn’t really follow since a uniform distribution wouldn’t have made 
block IDs all that likely to have populated those low 30 bits. My only guess is 
that the pseudorandom number generator in the legacy block ID generation was 
not uniform across the 64 bit block ID space.
    
    In the Jira, Suresh suggested only 16 bits:
    
    “””
    I think we could reserve few block IDs say 0-65535 and start generating 
from 65535. When it reaches some max, we could rollover to negative numbers. 
That is a decision that can be made in the future.
    “””
    
    So I’m curious why the initial range was skipped if the pseudorandom number 
block ID generator wouldn’t really have favoured the low range of block IDs. 
And why was the initial range of 16 bits changed to skip the initial 30 bits?
    
    Thanks for the help in understanding this!
    
    Yours,
    Ewan
    
    [1] https://issues.apache.org/jira/browse/HDFS-4645
    [2] https://issues.apache.org/jira/secure/attachment/12589172/SequentialblockIDallocation.pdf
    
    

