Haohui Mai created HDFS-5698:
--------------------------------

             Summary: Use protobuf to serialize / deserialize FSImage
                 Key: HDFS-5698
                 URL: https://issues.apache.org/jira/browse/HDFS-5698
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Haohui Mai
            Assignee: Haohui Mai


Currently, the code serializes FSImage using in-house serialization mechanisms. 
There are several disadvantages to the current approach:

# Mixing the responsibilities of reconstruction and serialization / 
deserialization. The current serialization / deserialization code paths carry 
a lot of logic whose only purpose is to maintain compatibility. Worse, that 
logic is interleaved with the complex logic of reconstructing the namespace, 
making the code difficult to follow.
# Poor documentation of the current FSImage format. The format of the FSImage 
is practically defined by the implementation. A bug in the implementation is 
therefore a bug in the specification. It also makes writing third-party tools 
quite difficult.
# Changing schemas is non-trivial. Adding a field to the FSImage requires 
bumping the layout version every time. Bumping the layout version requires 
(1) users to explicitly upgrade their clusters, and (2) new code to maintain 
backward compatibility.
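
The third point can be illustrated with a minimal sketch of a hand-rolled, 
positional format. It is not the actual FSImage code: the record layout, the 
version constants, and the names write_inode / read_inode are all hypothetical. 
It only shows why every added field forces a new layout version and another 
version branch in the reader.

```python
import io
import struct

# Hypothetical layout versions for this sketch. They are negative and
# decrease with each format change, so "newer" means "more negative".
LAYOUT_BASE = -1        # record = id (int64) + name
LAYOUT_WITH_MTIME = -2  # record additionally stores mtime (int64)


def write_inode(out, layout, inode_id, name, mtime=0):
    """Write one record in the fixed field order that `layout` expects."""
    encoded = name.encode("utf-8")
    out.write(struct.pack(">qH", inode_id, len(encoded)))
    out.write(encoded)
    if layout <= LAYOUT_WITH_MTIME:
        out.write(struct.pack(">q", mtime))


def read_inode(src, layout):
    """Read one record; each new field needs another version branch here."""
    if layout < LAYOUT_WITH_MTIME:
        # A reader cannot guess the shape of fields it has never seen.
        raise ValueError("unsupported layout version %d" % layout)
    inode_id, name_len = struct.unpack(">qH", src.read(10))
    name = src.read(name_len).decode("utf-8")
    mtime = 0
    if layout <= LAYOUT_WITH_MTIME:
        (mtime,) = struct.unpack(">q", src.read(8))
    return {"id": inode_id, "name": name, "mtime": mtime}


buf = io.BytesIO()
write_inode(buf, LAYOUT_WITH_MTIME, 7, "a.txt", mtime=99)
buf.seek(0)
record = read_inode(buf, LAYOUT_WITH_MTIME)
```

Because the fields are identified only by position, a reader that assumes 
LAYOUT_BASE would stop 8 bytes early on the record above and misparse 
whatever follows, which is why the version has to be bumped and checked.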


This jira proposes to use protobuf to serialize the FSImage. Protobuf is 
already used to serialize / deserialize RPC messages in Hadoop.

Protobuf addresses all of the above problems. It cleanly separates the 
responsibility of serialization from that of reconstructing the namespace. The 
protobuf files document the current format of the FSImage. Developers can then 
add optional fields with ease, since old code can still read the new FSImage 
by skipping the fields it does not know.
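
The compatibility claim rests on the protobuf wire format, where every field 
is tagged with its field number and wire type so that a reader can skip 
numbers it does not recognize. The sketch below hand-encodes that wire format 
in plain Python rather than using the protobuf library, and handles only the 
varint and length-delimited wire types. The message shape (id, name, mtime) 
and the field numbers are assumptions made for illustration, not the schema 
this jira will introduce.

```python
VARINT, LEN = 0, 2  # the two protobuf wire types handled in this sketch


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data, pos):
    """Decode a varint starting at pos; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_field(number, value):
    """Encode one field: int as varint, str as length-delimited UTF-8."""
    if isinstance(value, int):
        return encode_varint(number << 3 | VARINT) + encode_varint(value)
    payload = value.encode("utf-8")
    return encode_varint(number << 3 | LEN) + encode_varint(len(payload)) + payload


def decode_message(data, known_fields):
    """Decode the fields in known_fields ({number: name}); skip all others."""
    message, pos = {}, 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        number, wire_type = key >> 3, key & 0x7
        if wire_type == VARINT:
            value, pos = decode_varint(data, pos)
        elif wire_type == LEN:
            length, pos = decode_varint(data, pos)
            raw = data[pos:pos + length]
            pos += length
            value = raw.decode("utf-8") if number in known_fields else None
        else:
            raise ValueError("wire type %d not handled in this sketch" % wire_type)
        if number in known_fields:
            message[known_fields[number]] = value
    return message


# Hypothetical schemas: the new one adds an optional field number 3.
OLD_SCHEMA = {1: "id", 2: "name"}
NEW_SCHEMA = {1: "id", 2: "name", 3: "mtime"}

old_image = encode_field(1, 7) + encode_field(2, "a.txt")
new_image = old_image + encode_field(3, 1388534400)

old_reader_on_new_image = decode_message(new_image, OLD_SCHEMA)
new_reader_on_old_image = decode_message(old_image, NEW_SCHEMA)
```

In this sketch the old reader recovers id and name from the new image and 
ignores field 3, while the new reader simply finds no mtime in the old image 
and has to fall back to a default. Neither direction needs a layout version 
check, provided existing field numbers are never reused or retyped.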



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
