On Sat, 28 Dec 2019, Joseph Myers wrote:

> On Sat, 28 Dec 2019, Richard Earnshaw (lists) wrote:
> 
> > I've added the list of emails that I posted yesterday to the conversion
> > scripts.  I've not written anything to reprocess that yet.  I want to
> > leave that until we've completed the general review of the preferred
> > changes we want.  Auto-generating that data from the list will probably
> > be easier than maintaining it inside bugdb.py for now.
> 
> I've now pushed a change to automate removing "" or () around names.  
> Together with the automatic conversion of ISO-8859-1 names to UTF-8 that 
> should slightly reduce the number of cases needing handling from that 
> list.

Concretely, what I'd suggest is: convert ISO-8859-1 entries in the 
checked-in list to UTF-8, removing anything that thereby becomes a 
duplicate or unnecessary; handle anything whose encoding isn't simply 
ISO-8859-1 or UTF-8 via a hardcoded entry in bugdb.py using hex escapes 
like the existing such entries there.  Once the checked-in list is pure 
UTF-8 it's easier for people to review and edit.  Where the issue is only 
presence of ISO-8859 NBSP, or "" or () around the names, remove that in 
the checked-in list and again remove duplicates.  That way the list can be 
limited to non-encoding variations.
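
As a rough illustration of the cleanup described above (not the actual
conversion-script code), something along these lines could be run over the
checked-in list.  The function names and the sample names are hypothetical,
and the "try UTF-8, else fall back to ISO-8859-1" rule is an assumption; an
ISO-8859-1 decode never fails, so entries in any other encoding would not be
detected here and would still need hardcoded handling.

```python
def decode_name(raw: bytes) -> str:
    """Decode as UTF-8 when valid; otherwise assume ISO-8859-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is ISO-8859-1.
        return raw.decode("iso-8859-1")


def clean_name(name: str) -> str:
    """Replace NBSP with a plain space and strip surrounding "" or ()."""
    name = name.replace("\u00a0", " ").strip()
    wrappers = {('"', '"'), ("(", ")")}
    while len(name) >= 2 and (name[0], name[-1]) in wrappers:
        name = name[1:-1].strip()
    return name


def normalize_list(raw_names):
    """Return cleaned names in first-seen order with duplicates removed."""
    seen = set()
    result = []
    for raw in raw_names:
        name = clean_name(decode_name(raw))
        if name and name not in seen:
            seen.add(name)
            result.append(name)
    return result


if __name__ == "__main__":
    sample = [
        b"J\xf6rg Example",               # ISO-8859-1 bytes
        "J\u00f6rg Example".encode("utf-8"),
        b'"Jane Doe"',
        b"(Jane Doe)",
        b"Jane\xa0Doe",                   # ISO-8859-1 NBSP
    ]
    for name in normalize_list(sample):
        print(name)
```

With the sample above, the five raw entries collapse to two names, which is
the kind of reduction that leaves only non-encoding variations to review.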

-- 
Joseph S. Myers
j...@polyomino.org.uk
