Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-30 Thread Roy Smith
On 20/10/2013 03:13, I wrote: > Heck, I can't even really move off 2.6 because we use Amazon's EMR > service, which is stuck on 2.6. On Sunday, October 20, 2013 5:11:32 AM UTC-4, Mark Lawrence wrote: > Dear Amazon, > > Please upgrade to Python 3.3 or similar so that users can have better > unic

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-20 Thread Mark Lawrence
On 20/10/2013 03:13, Roy Smith wrote: In article , Chris Angelico wrote: Heck, I can't even really move off 2.6 because we use Amazon's EMR service, which is stuck on 2.6. Hrm. 2.6 is now in source-only security-only support, and that's about to end (there's a 2.6.9 in the pipeline, and th

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-19 Thread Chris Angelico
On Sun, Oct 20, 2013 at 1:26 PM, Ben Finney wrote: > Roy Smith writes: > >> In article , >> Chris Angelico wrote: >> >> > > Heck, I can't even really move off 2.6 because we use Amazon's EMR >> > > service, which is stuck on 2.6. >> > >> > Hrm. 2.6 is now in source-only security-only support, a

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-19 Thread Ben Finney
Roy Smith writes: > In article , > Chris Angelico wrote: > > > > Heck, I can't even really move off 2.6 because we use Amazon's EMR > > > service, which is stuck on 2.6. > > > > Hrm. 2.6 is now in source-only security-only support, and that's > > about to end (there's a 2.6.9 in the pipeline,

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-19 Thread Roy Smith
In article , Chris Angelico wrote: > > Heck, I can't even really move off 2.6 because we use Amazon's EMR > > service, which is stuck on 2.6. > > Hrm. 2.6 is now in source-only security-only support, and that's about > to end (there's a 2.6.9 in the pipeline, and that's that). It's about > time

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-19 Thread Chris Angelico
On Sun, Oct 20, 2013 at 12:52 PM, Roy Smith wrote: > In article , > Chris Angelico wrote: > >> Or are you saying that that particular error code path did NOT handle >> non-ASCII characters? > > Exactly. The fundamental error was caught, and then we raised another > UnicodeEncodeError generating
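
The failure mode described here, an error handler that itself raises UnicodeEncodeError while building its message, can be avoided by escaping non-ASCII text before it reaches an ASCII-only sink. The following is a minimal sketch in Python 3 syntax, not the code from the thread (which ran on 2.6); the names ascii_safe and describe_failure are hypothetical.

```python
def ascii_safe(value):
    """Render any value as ASCII text, escaping what cannot be encoded."""
    # 'backslashreplace' substitutes an escape sequence instead of raising
    # UnicodeEncodeError, so the error path cannot fail on odd input.
    return str(value).encode("ascii", "backslashreplace").decode("ascii")


def describe_failure(username, error):
    """Build a log message that is safe to write to an ASCII-only log."""
    return "request failed for user %s: %s" % (
        ascii_safe(username),
        ascii_safe(error),
    )


# U+1F60E is SMILING FACE WITH SUNGLASSES, the character from the thread.
print(describe_failure("roy\U0001F60E", ValueError("bad input")))
```

The escaped form keeps the code point visible in the log, which is usually more useful for debugging than dropping the character.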

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-19 Thread Roy Smith
In article , Chris Angelico wrote: > On Sun, Oct 20, 2013 at 3:49 AM, Roy Smith wrote: > > So, yesterday, I tracked down an uncaught exception stack in our logs to a > > user whose username included the unicode character 'SMILING FACE WITH > > SUNGLASSES' (U+1F60E). It turns out, that's perf

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-19 Thread Chris Angelico
On Sun, Oct 20, 2013 at 3:49 AM, Roy Smith wrote: > So, yesterday, I tracked down an uncaught exception stack in our logs to a > user whose username included the unicode character 'SMILING FACE WITH > SUNGLASSES' (U+1F60E). It turns out, that's perfectly fine as a user name, > except that in o

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-19 Thread Roy Smith
On Saturday, October 19, 2013 12:16:02 PM UTC-4, Steven D'Aprano wrote: > Another reasonable use for accent-stripping is searches. If I'm searching > for music by the Blue Öyster Cult, it would be good to see results for > Blue Oyster Cult as well. Tell me about it (I work at Songza; music sear
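
Accent-insensitive search of the kind mentioned here is often done by comparing folded keys rather than the raw strings. A rough sketch, assuming Python 3.3 or later for str.casefold; the function names search_key and matches are made up for illustration, and a real music search would need far more than this:

```python
import unicodedata


def search_key(text):
    """Fold text to a caseless, accent-free key used only for comparison."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping the base letters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


def matches(query, title):
    """True if the folded query occurs inside the folded title."""
    return search_key(query) in search_key(title)


print(matches("blue oyster cult", "Blue Öyster Cult"))
```

The original title is left untouched for display; only the comparison key is folded.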

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-19 Thread Steven D'Aprano
On Sat, 19 Oct 2013 11:14:30 -0300, Zero Piraeus wrote: > : > > On Sat, Oct 19, 2013 at 09:19:12AM +, Steven D'Aprano wrote: >> Make no mistake, this sort of simple-minded stripping of accents and >> diacritics is an extremely ham-fisted thing to do. [...] > Joking aside, there is a legitimat

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-19 Thread rusi
On Saturday, October 19, 2013 8:40:37 PM UTC+5:30, Roy Smith wrote: > Zero Piraeus wrote: > > > For example, a miscreant may create the username 'míguel' in order to > > pose as another user 'miguel', relying on other users' inattentiveness. > > Asciifying is one way of reducing the risk of that.

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-19 Thread Roy Smith
In article , Zero Piraeus wrote: > For example, a miscreant may create the username 'míguel' in order to > pose as another user 'miguel', relying on other users' inattentiveness. > Asciifying is one way of reducing the risk of that. Determining if two strings are "almost the same" is not easy.
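
To make the 'míguel' versus 'miguel' case concrete, here is a small sketch of a registration-time collision check. It is only an illustration under stated assumptions: the names skeleton and collides are hypothetical, and the check catches diacritic and case differences only. As the reply notes, "almost the same" is a much harder problem, and look-alike letters from other scripts pass straight through.

```python
import unicodedata


def skeleton(name):
    """Reduce a username to a caseless, accent-free comparison key."""
    decomposed = unicodedata.normalize("NFKD", name)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


def collides(new_name, existing_names):
    """True if new_name differs from an existing name only by accents or case."""
    taken = {skeleton(existing) for existing in existing_names}
    return skeleton(new_name) in taken


existing = {"miguel", "roy"}
print(collides("míguel", existing))
# A Cyrillic 'і' (U+0456) looks like a Latin 'i' but has no decomposition,
# so this simple check does not flag it.
print(collides("m\u0456guel", existing))
```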

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-19 Thread Zero Piraeus
: On Sat, Oct 19, 2013 at 09:19:12AM +, Steven D'Aprano wrote: > Make no mistake, this sort of simple-minded stripping of accents and > diacritics is an extremely ham-fisted thing to do. I used to live on a street called Calle Colón, so I'm aware of the dangers of stripping diacritics: http

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-19 Thread caldwellinva
Zero/Steven ... thank you for your replies ... they were both very helpful, both in addressing the immediate issue and for getting a better understanding of the context of the conversion. Greatly appreciate your taking the time for such good solutions.

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-19 Thread Roy Smith
In article , caldwelli...@gmail.com wrote: > I am looking for an example of a UNICODE to ASCII conversion example that > will remove diacritics from characters (and leave the characters, i.e., Klüft > to Kluft) as well as handle the conversion of other characters, like große to > grosse. http

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-19 Thread Steven D'Aprano
On Fri, 18 Oct 2013 13:45:53 -0700, caldwellinva wrote: > Hi! > > I am looking for an example of a UNICODE to ASCII conversion example > that will remove diacritics from characters (and leave the characters, > i.e., Klüft to Kluft) as well as handle the conversion of other > characters, like groß

Re: Looking for UNICODE to ASCII Conversion Example Code

2013-10-18 Thread Zero Piraeus
: On Fri, Oct 18, 2013 at 01:45:53PM -0700, caldwelli...@gmail.com wrote: > I am looking for an example of a UNICODE to ASCII conversion example > that will remove diacritics from characters (and leave the characters, > i.e., Klüft to Kluft) as well as handle the conversion of other > characters,
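
For readers who want something concrete for the question as posed (Klüft to Kluft, große to grosse), one common approach with only the standard library is to decompose the text and drop the combining marks. This is a sketch, not necessarily what any reply in the thread proposed. The SPECIAL_CASES table and the name asciify are assumptions made for illustration, and the table is deliberately incomplete.

```python
import unicodedata

# Hypothetical lookup for letters that have no Unicode decomposition;
# extend it for whatever languages the data actually contains.
SPECIAL_CASES = {"ß": "ss", "æ": "ae", "Æ": "AE", "ø": "o", "Ø": "O"}


def asciify(text):
    """Return an ASCII-only approximation of text."""
    # Replace letters that normalization alone would not handle.
    replaced = "".join(SPECIAL_CASES.get(ch, ch) for ch in text)
    # NFKD splits 'ü' into 'u' plus a combining diaeresis (U+0308).
    decomposed = unicodedata.normalize("NFKD", replaced)
    # Drop the combining marks, keeping the base letters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII is silently discarded.
    return base.encode("ascii", "ignore").decode("ascii")


print(asciify("Klüft"))
print(asciify("große"))
```

Note that the final step throws away characters it cannot map, so text in non-Latin scripts comes back empty. Whether that is acceptable depends on the use case.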