I hear what you are saying. Let's begin with your three concerns:

- How will we keep the community motivated on fixing both master and branch-2?
Until we do a stable release from master, stable releases can come only from branch-2. A contributor who wants a fix to reach users on a stable line quickly will have to land that fix on branch-2. Also, a release manager can pick whatever fixes she wants, so even if a contributor doesn't commit a fix to branch-2, a release manager who wants to do a release containing a particular set of fixes can always do so.

- *Harder cherry-picks between master and branch-2*.
That is certainly possible. But the hope is that we keep branch-2 stable, so we don't backport large features that may run into this issue. Smaller, focused bug-fix backports should still be possible.

- *Removal of MR2 on the master branch*.
This is something I personally would like to see, but the exact timing will be decided by the community. I am certainly not saying that as soon as branch-2 is created, we should remove MR2 on master.

I would also say that, in the end, the ASF is a volunteer organization; we can't force people to adopt one branch or another. It's up to the contributors which JIRAs they work on, and when and where they commit them. By not creating a branch-2, the only thing we can guarantee is that the rate of development on master remains slow, because we don't want to start making backward-incompatible changes without explicitly acknowledging that.

Thanks,
Ashutosh

On Thu, Mar 9, 2017 at 12:01 PM, Sergio Pena <sergio.p...@cloudera.com> wrote:

> Hey Ashutosh, thanks for soliciting feedback on this.
>
> I like the idea you're proposing; maintaining compatibility and at the
> same time adding newer features to Hive consumes a lot of development
> time and effort.
>
> However, I think some users and companies have just started to use the
> Hive 2.x branch as their main major upgrade on Hive (possibly due to
> waiting for stabilization and testing upgrades), but cutting this major
> branch that has had just 1 year of life might make us look like we will
> forget about the quality of Hive 2.x as we did with branch-1.
>
> Hive 1.x latest version was 1.2, and its development stopped because of
> new features on Hive 2.x.
> Hive 2.x latest version is 2.1, and we want to create Hive 3.x because
> of newer features and incompatibilities.
> Will Hive 3.x have the same future after 3.1 is released?
>
> What I'm also concerned about are these three things:
>
> - *Branch-2 quality commitment*.
>   How will we keep the community motivated on fixing both master and
>   branch-2?
> - *Harder cherry-picks between master and branch-2*.
>   Because master will be incompatible by nature, cherry-picks to
>   branch-2 will be harder.
> - *Removal of MR2 on the master branch*.
>   This was marked as deprecated just last year, but MR2 is still an
>   engine that is used by several users.
>
> I accept that the end of life of major versions will come at some point,
> and these concerns will expire, but Hive 2.x is kind of young, isn't it?
>
> Should we try to stabilize the Hive 2.x line first, and have a few more
> releases before starting to work on Hive 3.0?
> Should we add more test coverage to Hive jenkins jobs to validate Hive
> 2.x quality?
> Should we agree on a date for when we should drop community support on
> Hive versions, to let users know about this?
>
> Again, I like your proposal, but I'm afraid that users who just upgraded
> to 2.x won't get any more features and improvements because they will be
> developed on 3.0.
>
> - Sergio
>
> On Mon, Mar 6, 2017 at 1:24 PM, Ashutosh Chauhan <ashutosh.chau...@gmail.com> wrote:
>
> > The way it helps shed debt is that devs can now refactor without
> > fear of breaking some rarely used features. The way it helps add
> > features faster is that, since the codebase is lean and easier to
> > reason about, it's much easier to add new features.
> >
> > More importantly though, it also helps users because we are setting
> > the expectation from the dev community. They can expect future 2.x
> > releases to be backward compatible.
> > At the same time, whenever they decide to upgrade they only need to
> > test their application once against 3.x, as opposed to continuous
> > breakage of one form or another if we continue to make incompatible
> > changes in master without branching for 2.x.
> >
> > Thanks,
> > Ashutosh
> >
> > On Sat, Mar 4, 2017 at 10:19 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> >
> > > Also i dont follow how we remove
> > >
> > > On Saturday, March 4, 2017, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> > >
> > > > On Fri, Mar 3, 2017 at 8:46 PM, Thejas Nair <thejas.n...@gmail.com> wrote:
> > > >
> > > >> +1
> > > >> There are some features that are incomplete and that I would not
> > > >> recommend for any real production use. The 'legacy authorization
> > > >> mode' is a great example of that:
> > > >> https://cwiki.apache.org/confluence/display/Hive/Hive+Default+Authorization+-+Legacy+Mode
> > > >> It is an inherently insecure mode that nobody should be using.
> > > >>
> > > >> There is also potential to clean up the thrift api. However, there
> > > >> are many users of this api, so we would need to go the
> > > >> deprecate-then-remove-after-a-couple-of-releases route for that.
> > > >>
> > > >> I am sure there are many other candidates. We will have to
> > > >> evaluate each of those features on the risk/benefit of keeping
> > > >> them and arrive at a decision.
> > > >>
> > > >> Also, +1 on getting a 2.2 release out before we branch.
> > > >>
> > > >> On Fri, Mar 3, 2017 at 1:50 PM, Ashutosh Chauhan <hashut...@apache.org> wrote:
> > > >>
> > > >> > Hi all,
> > > >> >
> > > >> > The Hive project has come a long way. With widespread adoption
> > > >> > also come expectations.
> > > >> > Expectations of being backward compatible and not breaking
> > > >> > things. However, that doesn't come free of cost and results in
> > > >> > a lot of legacy code which can't be refactored without fear of
> > > >> > breaking things. As a result, the project has accumulated a lot
> > > >> > of debt over time. At the same time, there are also a lot of
> > > >> > features which have seen little uptake. We may want to drop
> > > >> > some of those.
> > > >> >
> > > >> > In order to move forward and shed that debt, we may need a major
> > > >> > version release which allows us to make backward-incompatible
> > > >> > changes and drop rarely used features. At the same time, there
> > > >> > are lots of users who are consuming the currently released
> > > >> > 2.1, 2.2 branches and can be expected to stay on them for some
> > > >> > time. So, I propose that we create branch-2 from the current tip,
> > > >> > do future 2.x releases from that branch, and keep it backward
> > > >> > compatible. This will allow devs to land breaking changes on
> > > >> > master and pave the way to release hive 3.0 in the future.
> > > >> >
> > > >> > Of course, each specific incompatible change and feature drop,
> > > >> > even on master, needs to be evaluated on its own merit on the
> > > >> > corresponding jira. This email is just a solicitation of
> > > >> > feedback for creating branch-2 and allowing breaking changes in
> > > >> > master. Thoughts?
> > > >> >
> > > >> > Thanks,
> > > >> > Ashutosh
> > > >> >
> > > >
> > > > One of the challenges of the developers conducting the risk-benefit
> > > > analysis is that the developers are mostly focused on new features,
> > > > but there are deployments of hive that are 5+ years old, and people
> > > > that rely on the features are not on the mailing list.
> > > >
> > > > For example, I developed and use this frequently:
> > > >
> > > > https://community.hortonworks.com/articles/8861/apache-hive-groovy-udf-examples.html
> > > >
> > > > My career went away from hive for a while. I was quite surprised to
> > > > find out that in the cli->beeline transition it was more or less
> > > > decided not to port it. I learned of this the first time I was
> > > > forced to work in a hive server only environment and it did not
> > > > work.
> > > >
> > > > Now I have to go and spend time adding this back so I don't have to
> > > > work around it not being there.
> > > >
> > > > What we should continue doing is making the code modular: we need
> > > > to break hard dependencies, like ThriftSerde or OrcSerde being
> > > > "native" and having to be linked to the metastore, and move them
> > > > out into proper submodules. There is too much code that only works
> > > > for one implementation of a serde etc.
> > > >
> > >
> > > I would like a timeline to understand this. It sounds as if master is
> > > not releasable currently, so already broken in a way. We make a
> > > branch and aggressively break it more?
> > >
> > > I'm not following what about this branching policy makes adding
> > > features faster or how it helps shed debt faster.
> > >
> > > --
> > > Sorry this was sent from mobile. Will do less grammar and spell check
> > > than usual.