Hmm.  I'm not sure why something in the history would not be
"relevant"-- do you mean that the code was removed during the merge?
While that happens sometimes, it doesn't seem that common.

In any case, we have this same problem with CHANGES.txt, JIRA, and
every other issue tracking system we use.  Just because an issue was
relevant to Hadoop 0.23 doesn't mean it's still relevant to Hadoop
2.6.  We have a "fix version" but not a "this stopped being relevant
at version."

I'm not really sure what you mean by "a merge-forward model" either.
It sounds suspiciously like something that was designed around a
subversion workflow.  If a change isn't relevant to a branch, why
would you merge it into that branch?  In Hadoop, we do feature
development either in trunk or in a feature branch.  When it's done we
merge it into the release branches that it's relevant to.

I suppose some changes from some feature branches are sometimes
overwritten by the existing code during a merge.  But even if they
are, we still pull them into CHANGES.txt.  So again-- same behavior as
git log.  And in practice it doesn't seem to be a problem.
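
For illustration, the kind of branch comparison mentioned further down
the thread (a delta of git commits, keyed by JIRA id) could look roughly
like the sketch below.  This is not the actual script; the function names,
the regex, and the list of project prefixes are assumptions made for the
example.

```python
import re
import subprocess

# Assumption: commit subjects mention a JIRA key such as HDFS-1234.
JIRA_KEY = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_keys(subjects):
    """Return the set of JIRA keys mentioned in commit subject lines."""
    keys = set()
    for line in subjects:
        keys.update(JIRA_KEY.findall(line))
    return keys


def branch_delta(subjects_a, subjects_b):
    """JIRA keys found in branch A's log but missing from branch B's log."""
    return sorted(jira_keys(subjects_a) - jira_keys(subjects_b))


def commit_subjects(branch, repo="."):
    """Read one-line commit subjects for a branch via `git log --format=%s`."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--format=%s", branch],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()
```

Something like branch_delta(commit_subjects("trunk"),
commit_subjects("branch-2")) would then list the JIRAs that are in trunk
but not in branch-2 (branch names are only examples).  Comparing by JIRA
key rather than by commit hash means a cherry-pick, which gets a new hash,
still counts as the same change.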

best,
Colin

On Wed, Mar 18, 2015 at 1:20 PM, Sean Busbey <bus...@cloudera.com> wrote:
> On the matter of handling merges in the history, this comes up over in
> Apache Accumulo where development follows a merge-forward model (commits go
> oldest first and merge into newer branches). This means that every commit
> on an older-but-still-active development branch eventually ends up merged
> into the history of newer branches even when the issue was only relevant to
> the older branch. The most obvious problem with relying on just the git history
> for changes then is that there's no way to programmatically know which of
> the commits that show up in the log for a given release tag are relevant to
> that release and which ones were only relevant to the older development
> line.
>
> -Sean
>
> On Wed, Mar 18, 2015 at 2:59 PM, Colin P. McCabe <cmcc...@apache.org> wrote:
>
>> Allen, can you forward those private conversations (or some excerpt
>> thereof) to the list to explain the problem that you see?
>>
>> I have been using "git log" to track change history for years and
>> never had a problem.  In fact, we don't even maintain CHANGES.txt in
>> Cloudera's distribution including Hadoop.  It causes too many spurious
>> conflicts during cherry picks so we just discard the CHANGES.txt part
>> of the change when backporting things to our branches.  When you are
>> backporting hundreds of patches, and each one has a conflict on
>> CHANGES.txt (and generally, ALL of them do), it's just not worth it to
>> hand-resolve those conflicts.
>>
>> I also wrote a script to compare which JIRAs were in which branches by
>> doing a delta of the git commits.  It works pretty well.  You can even
>> visualize merges in git if you want, with tools like gitk (or even
>> plain old git log with the right options)
>>
>> Colin
>>
>>
>> On Tue, Mar 17, 2015 at 11:21 AM, Allen Wittenauer <a...@altiscale.com>
>> wrote:
>> >
>> > Nope.  I’m not particularly in the mood to write a book about a topic
>> > that I’ve beat to death in private conversations over the past 6 months
>> > other than highlighting that any solution needs to be able to work
>> > against scenarios like we had 3 years ago with four active release
>> > branches + trunk.
>> >
>> > On Mar 17, 2015, at 10:56 AM, Yongjun Zhang <yzh...@cloudera.com> wrote:
>> >
>> >> Thanks Ravi and Colin for the feedback.
>> >>
>> >> Hi Allen,
>> >>
>> >> You pointed out that "git log" has problems when dealing with branches
>> >> that have merges; would you please elaborate on the problem?
>> >>
>> >> Thanks.
>> >>
>> >> --Yongjun
>> >>
>> >> On Mon, Mar 16, 2015 at 7:08 PM, Colin McCabe <cmcc...@alumni.cmu.edu>
>> >> wrote:
>> >>
>> >>> Branch merges made it hard to access change history on subversion
>> >>> sometimes.
>> >>>
>> >>> You can read the tale of woe here:
>> >>>
>> >>>
>> >>> http://programmers.stackexchange.com/questions/206016/maintaining-svn-history-for-a-file-when-merge-is-done-from-the-dev-branch-to-tru
>> >>>
>> >>> Excerpt:
>> >>> "....prior to Subversion 1.8. The files in the branch and the files in
>> >>> trunk are copies and Subversion keeps track with svn log only for
>> >>> specific files, not across branches."
>> >>>
>> >>> I think that's how the custom of CHANGES.txt started, and it was
>> >>> cargo-culted forward into the git era despite not serving much purpose
>> >>> any more these days (in my opinion.)
>> >>>
>> >>> best,
>> >>> Colin
>> >>>
>> >>> On Mon, Mar 16, 2015 at 4:49 PM, Ravi Prakash <ravi...@ymail.com>
>> wrote:
>> >>>> +1 for automating the information contained in CHANGES.txt. Some
>> >>>> changes go in without JIRAs sometimes (CVEs, e.g.). I like git log
>> >>>> because it's the absolute source of truth (cryptographically secure,
>> >>>> audited, distributed, yadadada). We could always use git hooks to
>> >>>> force a commit message format.
>> >>>>
>> >>>> a) Cherry-picks have the same message (by default) as the original.
>> >>>> b) I'm not sure why branch mergers would be a problem?
>> >>>> c) "Whoops I missed something in the previous commit" wouldn't happen
>> >>>>    if our hooks were smartish.
>> >>>> d) "No identification of what type of commit it was without hooking
>> >>>>    into JIRA anyway." This would be in the format of the commit message.
>> >>>>
>> >>>> Either way I think would be an improvement.
>> >>>> Thanks for your ideas folks
>> >>>>
>> >>>>
>> >>>>
>> >>>>     On Monday, March 16, 2015 11:51 AM, Colin P. McCabe <
>> >>> cmcc...@apache.org> wrote:
>> >>>>
>> >>>>
>> >>>> +1 for generating CHANGES.txt from JIRA and/or git as part of making a
>> >>>> release.  Or just dropping it altogether.  Keeping it under version
>> >>>> control creates a lot of false conflicts whenever submitting a patch and
>> >>>> generally makes committing minor changes unpleasant.
>> >>>>
>> >>>> Colin
>> >>>>
>> >>>> On Sat, Mar 14, 2015 at 8:36 PM, Yongjun Zhang <yzh...@cloudera.com>
>> >>> wrote:
>> >>>>> Hi Allen,
>> >>>>>
>> >>>>> Thanks a lot for your input!
>> >>>>>
>> >>>>> Looks like problems a, c, and d that you listed are not too bad,
>> >>>>> assuming we can solve d by pulling this info from JIRA as Sean
>> >>>>> pointed out.
>> >>>>>
>> >>>>> Problem b (branch mergers) seems to be a real one, and your approach
>> >>>>> of using the JIRA system to build changes.txt is a reasonably good
>> >>>>> way. This does count on us updating JIRA accurately. Since this
>> >>>>> update is a manual process, inconsistencies are possible, but that
>> >>>>> may not be too bad, since any mistake found here can be remedied by
>> >>>>> fixing the JIRA side and refreshing the result.
>> >>>>>
>> >>>>> I wonder if we as a community should switch to using your way, and
>> >>>>> save committers the effort of taking care of CHANGES.txt (quite a
>> >>>>> saving, IMO). Hope more people can share their thoughts.
>> >>>>>
>> >>>>> Thanks.
>> >>>>>
>> >>>>> --Yongjun
>> >>>>>
>> >>>>> On Fri, Mar 13, 2015 at 4:45 PM, Allen Wittenauer <a...@altiscale.com>
>> >>> wrote:
>> >>>>>
>> >>>>>>
>> >>>>>> I think the general consensus is don’t include the changes.txt file
>> >>>>>> in your commit. It won’t be correct for both branches if such a
>> >>>>>> commit is destined for both. (No, the two branches aren’t the same.)
>> >>>>>>
>> >>>>>> No, git log isn’t more accurate.  The problems are:
>> >>>>>>
>> >>>>>> a) cherry picks
>> >>>>>> b) branch mergers
>> >>>>>> c) “whoops i missed something in that previous commit”
>> >>>>>> d) no identification of what type of commit it was without hooking
>> >>>>>>    into JIRA anyway.
>> >>>>>>
>> >>>>>> This is why I prefer building the change log from JIRA.  We already
>> >>>>>> build release notes from JIRA, BTW.  (Not that anyone appears to read
>> >>>>>> them given the low quality of our notes…)  Anyway, here’s what I’ve
>> >>>>>> been building/using as changes.txt and release notes:
>> >>>>>>
>> >>>>>> https://github.com/aw-altiscale/hadoop-release-metadata
>> >>>>>>
>> >>>>>> I try to update these every day. :)
>> >>>>>>
>> >>>>>> On Mar 13, 2015, at 4:07 PM, Yongjun Zhang <yzh...@cloudera.com>
>> >>> wrote:
>> >>>>>>
>> >>>>>>> Thanks Esteban, I assume this report gets info purely from the jira
>> >>>>>>> database, but not "git log" of a branch, right?
>> >>>>>>>
>> >>>>>>> I hope we get the info from "git log" of a release branch because
>> >>>>>>> that'd be more accurate.
>> >>>>>>>
>> >>>>>>> --Yongjun
>> >>>>>>>
>> >>>>>>> On Fri, Mar 13, 2015 at 3:11 PM, Esteban Gutierrez <este...@cloudera.com> wrote:
>> >>>>>>>
>> >>>>>>>> JIRA already provides a report:
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12327179&styleName=Html&projectId=12310240
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> cheers,
>> >>>>>>>> esteban.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> Cloudera, Inc.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Fri, Mar 13, 2015 at 3:01 PM, Sean Busbey <bus...@cloudera.com> wrote:
>> >>>>>>>>
>> >>>>>>>>> So long as you include the issue number, you can automate pulling
>> >>>>>>>>> the type from jira directly instead of putting it in the message.
>> >>>>>>>>>
>> >>>>>>>>> On Fri, Mar 13, 2015 at 4:49 PM, Yongjun Zhang <
>> >>> yzh...@cloudera.com>
>> >>>>>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Hi,
>> >>>>>>>>>>
>> >>>>>>>>>> I found that changing CHANGES.txt when committing a jira is
>> >>>>>>>>>> error-prone because of the different sections in the file, and
>> >>>>>>>>>> sometimes we forget about changing this file.
>> >>>>>>>>>>
>> >>>>>>>>>> After all, git log would indicate the history of a branch. I
>> >>>>>>>>>> wonder if we could switch to a new method:
>> >>>>>>>>>>
>> >>>>>>>>>> 1. When committing, ensure the message includes the type of the
>> >>>>>>>>>> jira: "New Feature", "Bug Fixes", "Improvement", etc.
>> >>>>>>>>>>
>> >>>>>>>>>> 2. No longer make changes to CHANGES.txt for each commit.
>> >>>>>>>>>>
>> >>>>>>>>>> 3. Before releasing a branch, create CHANGES.txt by using the
>> >>>>>>>>>> "git log" command for the given branch.
>> >>>>>>>>>>
>> >>>>>>>>>> Thanks.
>> >>>>>>>>>>
>> >>>>>>>>>> --Yongjun
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> --
>> >>>>>>>>> Sean
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>
>> >>>>
>> >>>
>> >
>>
>
>
>
> --
> Sean
