Re: Encoding of user ID strings

2016-05-24 Thread Ingo Klöcker
On Tuesday 24 May 2016 08:26:54 Werner Koch wrote: > On Mon, 23 May 2016 20:19, r...@sixdemonbag.org said: > > At first blush it appears the answer is "no, but most people use > > UTF-8." > > > If so that's fine, but I'll have to silently discard a number of > > user > > OpenPGP requires that th

Re: Encoding of user ID strings

2016-05-23 Thread Werner Koch
On Mon, 23 May 2016 20:19, r...@sixdemonbag.org said: > At first blush it appears the answer is "no, but most people use UTF-8." > If so that's fine, but I'll have to silently discard a number of user OpenPGP requires that the user ID is UTF-8 encoded. Older PGP versions did not care about enco

Re: Encoding of user ID strings

2016-05-23 Thread Andrew Gallagher
On 23 May 2016, at 23:24, Robert J. Hansen wrote: >> In the case of "all 8-bit characters, no 7-bit" you're dealing with >> either a practical joker or EBCDIC. Same thing really... > > Or KOI8-R/Windows-1251. I'd forgotten about that. Or any of the ISO-8859 variants that encode non-Latin scripts. Or s

Re: Encoding of user ID strings

2016-05-23 Thread Robert J. Hansen
> In the case of "all 8-bit characters, no 7-bit" you're dealing with > either a practical joker or EBCDIC. Same thing really... Or KOI8-R/Windows-1251. > After that you're into heuristics. There are quite a few programs out > there that attempt to detect encodings statistically, but with such a

Re: Encoding of user ID strings

2016-05-23 Thread Andrew Gallagher
> On 23 May 2016, at 20:19, Robert J. Hansen wrote: > > Is there any way to determine the encoding for a user ID string? > > At first blush it appears the answer is "no, but most people use UTF-8." You can tell fairly reliably if someone is using either vanilla ASCII or UTF-8, in the cases of
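
The ASCII-or-UTF-8 check mentioned in this message could look roughly like the sketch below. It is an illustration only, not code from the thread: the function name `classify_user_id` is hypothetical, and it assumes the user ID is available as raw bytes. Anything that is neither pure 7-bit ASCII nor strictly valid UTF-8 is reported as "unknown" and left to whatever heuristics the caller prefers.

```python
def classify_user_id(raw: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'unknown' for a raw user ID byte string.

    Hypothetical helper for illustration; the name and return values are
    assumptions, not part of any OpenPGP implementation.
    """
    # Pure 7-bit data is valid ASCII (and therefore also valid UTF-8).
    if all(byte < 0x80 for byte in raw):
        return "ascii"
    try:
        # Python's strict decoder rejects malformed sequences, overlong
        # forms, and encoded surrogates.
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # Some legacy 8-bit encoding or garbled UTF-8; telling these
        # apart needs heuristics beyond this sketch.
        return "unknown"
    return "utf-8"
```

A false "utf-8" result is possible in principle, since some legacy 8-bit text happens to form valid UTF-8 sequences, so the answer is best treated as "fairly reliable" rather than certain.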

Encoding of user ID strings

2016-05-23 Thread Robert J. Hansen
Is there any way to determine the encoding for a user ID string? At first blush it appears the answer is "no, but most people use UTF-8." If so that's fine, but I'll have to silently discard a number of user IDs that appear to be in foreign encodings or are garbled UTF-8. I'd prefer not to do th
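
One possible way to avoid silently discarding such user IDs is to decode them without losing the original bytes. The sketch below is an assumption for illustration, not something proposed in the thread; the helper names are hypothetical. It relies on Python's standard `surrogateescape` error handler, which maps each undecodable byte to a lone surrogate code point so the exact byte string can be recovered later.

```python
def decode_keeping_bytes(raw: bytes) -> str:
    """Decode as UTF-8, smuggling undecodable bytes through as surrogates.

    Hypothetical helper; valid UTF-8 decodes normally.
    """
    return raw.decode("utf-8", errors="surrogateescape")


def original_bytes(text: str) -> bytes:
    """Recover the exact bytes that decode_keeping_bytes() was given."""
    return text.encode("utf-8", errors="surrogateescape")


def is_clean(text: str) -> bool:
    """True if the decoded string contains no escaped (undecodable) bytes."""
    return not any(0xDC80 <= ord(ch) <= 0xDCFF for ch in text)
```

Strings for which `is_clean` returns false are not safe to print or store as ordinary UTF-8 text, so they would need to be flagged or escaped for display rather than shown directly.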