[ https://issues.apache.org/jira/browse/HADOOP-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer resolved HADOOP-5071.
--------------------------------------
    Resolution: Won't Fix

Hadoop 1.0 was released.

> Hadoop 1.0 Compatibility Requirements
> -------------------------------------
>
>            Key: HADOOP-5071
>            URL: https://issues.apache.org/jira/browse/HADOOP-5071
>        Project: Hadoop Common
>     Issue Type: Sub-task
>       Reporter: Sanjay Radia
>       Assignee: Sanjay Radia
>
> The purpose of this Jira is to decide on Hadoop 1.0 Compatibility requirements.
> A proposal is described below that was discussed on email alias core-...@hadoop.apache.org
> Release terminology used below:
> *Standard release numbering: major, minor, dot releases*
> * Only bug fixes in dot releases: m.x.y
> ** no changes to API, disk format, protocols or config etc. in a dot release
> * new features in major (m.0) and minor (m.x.0) releases
> *Hadoop Compatibility Proposal*
> - *1 API Compatibility*
> No need for client recompilation when upgrading across minor releases (i.e. from m.x to m.y, where x <= y).
> Classes or methods deprecated in m.x can be removed in (m+1).0.
> Note that this is stronger than what we have been doing in Hadoop 0.x releases.
> These are fairly standard compatibility rules for major and minor releases.
> - *2 Data Compatibility*
> -- Motivation: Users expect file systems to preserve data transparently across releases.
> -- 2.a HDFS metadata and data can change across minor or major releases, but such changes are transparent to user applications. That is, a release upgrade must automatically convert the metadata and data as needed. Further, a release upgrade must allow a cluster to roll back to the older version and its older disk format. (Rollback needs to restore the original data, not any updated data.)
> -- 2.a-WeakerAutomaticConversion:
> Automatic conversion is supported across a small number of releases.
> If a user wants to jump across multiple releases, he may be forced to go through a few intermediate releases to get to the final desired release.
> - *3 Wire Protocol Compatibility*
> We offer no wire compatibility in our 0.x releases today.
> -- Motivation: The motivation *isn't* to make the Hadoop protocols public. Applications will not call the protocol directly but through a library (in our case the FileSystem class and its implementations). Instead the motivation is that customers run multiple clusters and have apps that access data across clusters. Customers cannot be expected to update all clusters simultaneously.
> -- 3.a Old m.x clients can connect to new m.y servers, where x <= y, but the old clients might get reduced functionality or performance. m.x clients might not be able to connect to (m+1).z servers.
> -- 3.b New m.y clients must be able to connect to old m.x servers, where x < y, but only for old m.x functionality.
> Comment: Generally old API methods continue to use old RPC methods. However, it is legal to have new implementations of old API methods call new RPC methods, as long as the library transparently handles the fallback case for old servers.
> -- 3.c At any major release transition [i.e. from a release m.x to a release (m+1).0], a user should be able to read data from the cluster running the old version.
> --- Motivation: Data copying across clusters is a common operation for many customers. For example, this is routinely done at Yahoo; another use case is HADOOP-4058. Today, http (or hftp) provides a guaranteed compatible way of copying data across versions. Clearly one cannot force a customer to simultaneously update all its Hadoop clusters to a new major release. We can satisfy this requirement via the http/hftp mechanism or some other mechanism.
> -- 3.c-Stronger
> Shall we add a stronger requirement for 1.0: wire compatibility across major versions?
> That is, not just for reading but for all operations. This can be supported by class loading or other games.
> Note we can wait to provide this when 2.0 happens. If Hadoop provided this guarantee then it would allow customers to partition their data across clusters without risking apps breaking across major releases due to wire incompatibility issues.
> --- Motivation: Data copying is a compromise. Customers really want to run apps across clusters running different versions.
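
The wire-protocol rules 3.a through 3.c in the quoted proposal can be read as a small decision table over client and server release numbers. The following is a minimal Python sketch of that reading, not anything from Hadoop itself: `Version`, `wire_guarantee`, and the four result labels are hypothetical names introduced here for illustration. It also assumes that rule 3.c applies to any client exactly one major release ahead of the server, which is a slightly looser reading than the "(m+1).0" wording.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A release number m.x; the dot-release digit is omitted because
    dot releases carry only bug fixes under the proposal."""
    major: int
    minor: int


def wire_guarantee(client: Version, server: Version) -> str:
    """Classify what the proposal promises for a client/server pair.

    Labels are illustrative only:
      "connects"          - rule 3.a: old or equal client, newer server,
                            possibly with reduced functionality
      "old-functionality" - rule 3.b: newer client, older server
      "read-only"         - rule 3.c: client one major release ahead
                            (assumed reading, e.g. via http/hftp)
      "no-guarantee"      - everything else
    """
    if client.major == server.major:
        if client.minor <= server.minor:
            return "connects"
        return "old-functionality"
    if client.major == server.major + 1:
        return "read-only"
    return "no-guarantee"


if __name__ == "__main__":
    pairs = [
        (Version(1, 0), Version(1, 2)),
        (Version(1, 2), Version(1, 0)),
        (Version(2, 0), Version(1, 3)),
        (Version(1, 3), Version(2, 0)),
    ]
    for client, server in pairs:
        print(client, server, wire_guarantee(client, server))
```

Under the stronger 3.c-Stronger variant the "read-only" case would presumably widen to all operations; the sketch models only the baseline rules.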