If protocol compatibility of v2 and v3 is a goal, HADOOP-8990 should be a blocker for v2.
__Luke

On Fri, Apr 26, 2013 at 12:07 PM, Eli Collins <e...@cloudera.com> wrote:

> On Fri, Apr 26, 2013 at 11:15 AM, Arun C Murthy <a...@hortonworks.com> wrote:
> >
> > On Apr 25, 2013, at 7:31 PM, Roman Shaposhnik wrote:
> >
> >> On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <a...@hortonworks.com> wrote:
> >>
> >>> With that in mind, I really want to make a serious push to lock down
> >>> APIs and wire-protocols for hadoop-2.0.5-beta.
> >>> Thus, we can confidently support hadoop-2.x in a compatible manner in
> >>> the future. So, it's fine to add new features,
> >>> but please ensure that all APIs are frozen for hadoop-2.0.5-beta
> >>
> >> Arun, since it sounds like you have a pretty definite idea
> >> in mind for what you want 'beta' label to actually mean,
> >> could you, please, share the exact criteria?
> >
> > Sorry, I'm not sure if this is exactly what you are looking for but, as
> > I mentioned above, the primary aim would be make the final set of required
> > API/write-protocol changes so that we can call it a 'beta' i.e. once
> > 2.0.5-beta ships users & downstream projects can be confident about forward
> > compatibility in hadoop-2.x line. Obviously, we might discover a blocker
> > bug post 2.0.5 which *might* necessitate an unfortunate change - but that
> > should be an outstanding exception.
>
> Arun, Suresh,
>
> Mind reviewing the following page Karthik put together on
> compatibility? http://wiki.apache.org/hadoop/Compatibility
>
> I think we should do something similar to what Sanjay proposed in
> HADOOP-5071 for Hadoop v2. If we get on the same page on
> compatibility terms/APIs then we can quickly draft the policy, at
> least for the things we've already got consensus on. I think our new
> developers, users, downstream projects, and partners would really
> appreciate us making this clear. If people like the content we can
> move it to the Hadoop website and maintain it in svn like the bylaws.
>
> The reason I think we need to do so is because there's been confusion
> about what types of compatibility we promise and some open questions
> which I'm not sure everyone is clear on. Examples:
> - Are we going to preserve Hadoop v3 clients against v2 servers now
> that we have protobuf support? (I think so..)
> - Can we break rolling upgrade of daemons in updates post GA? (I don't
> think so..)
> - Do we disallow HDFS metadata changes that require an HDFS upgrade in
> an update? (I think so..)
> - Can we remove methods from v2 and v2 updates that were deprecated in
> v0.20-22? (Unclear)
> - Will we preserve binary compatibility for MR2 going forward? (I think
> so..)
> - Does the ability to support multiple versions of MR simultaneously
> via MR2 change the MR API compatibility story? (I don't think so..)
> - Are the RM protocols sufficiently stable to disallow incompatible
> changes potentially required by non-MR projects? (Unclear, most large
> Yarn deployments I'm aware of are running 0.23, not v2 alphas)
>
> I'm also not sure there's currently consensus on what an incompatible
> change is. For example, I think HADOOP-9151 is incompatible because it
> broke client/server wire compatibility with previous releases and any
> change that breaks wire compatibility is incompatible. Suresh felt it
> was not an incompatible change because it did not affect API
> compatibility (ie PB is not considered part of the API) and the change
> occurred while v2 is in alpha. Not sure we need to go through the
> whole exercise of what's allowed in an alpha and beta (water under the
> bridge, hopefully), but I do think we should clearly define an
> incompatible change. It's fine that v2 has been a bit wild wild west
> in the alpha development stage but I think we need to get a little
> more rigorous.
>
> Thanks,
> Eli
>