On Sun, 29 Dec 2019, Joseph Myers wrote:

> I've now made those changes to the checked-in list so it's pure UTF-8, and
> thus easier to review and edit.  We still need to implement code in
> bugdb.py to use that list to pick the preferred form from each list of
> variants (and people may wish to change the preferred forms in some
> cases).

I've now implemented that code in bugdb.py.

Given those fixes, I'm planning to compare author names from the
reposurgeon conversion and Maxim's conversion, as I think cases where they
find different authors (not just different email addresses) are good cases
for manual review (we already have various such manual author fixups for
individual commits in bugdb.py).

In fact that manual review may show up *other* commits that should be
reattributed.  One example Maxim gave of a missing author was Aymeric
Vincent.  That was a commit on premerge-fsf-branch where the reposurgeon
heuristic "don't use attributions from ChangeLog for a ChangeLog-only
commit" applied.  But whether or not the commit just adding the ChangeLog
entry should be reattributed to the person named in that ChangeLog entry,
the real changes that ChangeLog entry relates to are two previous commits
(each file committed separately), so it shows up that those two previous
commits ought to be reattributed.

-- 
Joseph S. Myers
jos...@codesourcery.com