On Sun, Aug 18, 2013 at 10:01:51PM +0300, Daniel Shahaf wrote:
> Ivan Zhakov wrote on Sun, Aug 18, 2013 at 22:04:58 +0400:
> > > * r1514785
> > > ra_serf: Improve SSL certificate verification failure message.
> > > @@ -211,6 +210,8 @@ Candidate changes:
> > > informative. Regression from Subversion 1.7.x
> > > Votes:
> > > +1: ivan, stefan2
> > > + danielsh: I believe chopping off the last 2 bytes is wrong, _(", ")
> > > + would be longer than two bytes in Japanese locale.
> >
> > Actually not, because we use UTF8 internally so ', ' will be always two
> > bytes long.
>
> Yes, ", " will be two bytes long, but _(", ") may be any number of
> bytes. It is not guaranteed that the localised version ends with an
> ASCII comma and an ASCII space; it might end with a character whose
> representation has three bytes.
>
Case in point:

    >>> import unicodedata
    >>> unicodedata.lookup('ARABIC COMMA').encode('utf-8')
    b'\xd8\x8c'

If we add an Arabic localization, the localised version would end with
bytes D8 8C 20 00, and chopping off two bytes would result in
a bytestring that ends with D8 00 (a lone lead byte followed by the
terminating NUL), which is invalid UTF-8.

Daniel

> > String will be converted to the required console locale if needed.
> >
> > The code could be improved btw: remove ', ' and ': ' from localized strings
> > and add them separately to prevent translators from breaking output accidentally.
> > But it does not prevent backporting this change IMHO.
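
To make the truncation problem concrete, here is a minimal Python sketch. It is not the ra_serf code; the helper names (chop_two_bytes, is_valid_utf8, strip_separator), the sample message, and the Arabic separator are all hypothetical, chosen only to show the difference between trimming a fixed two bytes and trimming the actual length of the translated separator.

```python
import unicodedata


def chop_two_bytes(msg: bytes) -> bytes:
    """Drop the last two bytes, assuming the separator is always ", "."""
    return msg[:-2]


def strip_separator(msg: bytes, sep: bytes) -> bytes:
    """Drop the trailing separator by its real encoded length, if present."""
    if sep and msg.endswith(sep):
        return msg[:-len(sep)]
    return msg


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Untranslated separator: two ASCII bytes.
ascii_sep = ", ".encode("utf-8")

# Hypothetical Arabic translation of the separator (an assumption for
# illustration): ARABIC COMMA followed by a space, three bytes in UTF-8.
arabic_sep = (unicodedata.lookup("ARABIC COMMA") + " ").encode("utf-8")

ascii_msg = b"reason one" + ascii_sep
arabic_msg = b"reason one" + arabic_sep

# Fixed two-byte trim: fine for ASCII, leaves a dangling D8 for Arabic.
ascii_trimmed = chop_two_bytes(ascii_msg)
arabic_trimmed = chop_two_bytes(arabic_msg)

# Length-aware trim: removes the whole separator in both cases.
arabic_fixed = strip_separator(arabic_msg, arabic_sep)

print(is_valid_utf8(ascii_trimmed))
print(is_valid_utf8(arabic_trimmed))
print(is_valid_utf8(arabic_fixed))
```

Trimming by the encoded length of whatever _(", ") returned, or appending the separator only between items so nothing needs trimming, would avoid depending on the translation being two bytes long.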