Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-30 Thread Roy Smith
On 20/10/2013 03:13, I wrote:
> Heck, I can't even really move off 2.6 because we use Amazon's EMR
> service, which is stuck on 2.6.
On Sunday, October 20, 2013 5:11:32 AM UTC-4, Mark Lawrence wrote:
> Dear Amazon,
>
> Please upgrade to Python 3.3 or similar so that users can have better
> unic

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-20 Thread Mark Lawrence
On 20/10/2013 03:13, Roy Smith wrote: In article , Chris Angelico wrote: Heck, I can't even really move off 2.6 because we use Amazon's EMR service, which is stuck on 2.6. Hrm. 2.6 is now in source-only security-only support, and that's about to end (there's a 2.6.9 in the pipeline, and th

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-19 Thread Chris Angelico
On Sun, Oct 20, 2013 at 1:26 PM, Ben Finney wrote:
> Roy Smith writes:
>
>> In article ,
>> Chris Angelico wrote:
>>
>> > > Heck, I can't even really move off 2.6 because we use Amazon's EMR
>> > > service, which is stuck on 2.6.
>> >
>> > Hrm. 2.6 is now in source-only security-only support, a

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-19 Thread Ben Finney
Roy Smith writes:
> In article ,
> Chris Angelico wrote:
>
> > > Heck, I can't even really move off 2.6 because we use Amazon's EMR
> > > service, which is stuck on 2.6.
> >
> > Hrm. 2.6 is now in source-only security-only support, and that's
> > about to end (there's a 2.6.9 in the pipeline,

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-19 Thread Roy Smith
In article , Chris Angelico wrote:
> > Heck, I can't even really move off 2.6 because we use Amazon's EMR
> > service, which is stuck on 2.6.
>
> Hrm. 2.6 is now in source-only security-only support, and that's about
> to end (there's a 2.6.9 in the pipeline, and that's that). It's about
> time

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-19 Thread Chris Angelico
On Sun, Oct 20, 2013 at 12:52 PM, Roy Smith wrote:
> In article ,
> Chris Angelico wrote:
>
>> Or are you saying that that particular error code path did NOT handle
>> non-ASCII characters?
>
> Exactly. The fundamental error was caught, and then we raised another
> UnicodeEncodeError generating
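
The failure mode described above (an error handler that itself trips over a non-ASCII username) can be sketched roughly as follows. This is only an illustration, not the code from the thread: it assumes Python 3 rather than the 2.6 discussed here, and the username and the helper name describe_for_log are hypothetical. The idea is to build the log text with the "backslashreplace" error handler so the error path cannot raise a second UnicodeEncodeError.

```python
def describe_for_log(username):
    """Return an ASCII-only rendering of username that is safe to log.

    Hypothetical helper: non-ASCII characters become backslash escapes
    instead of raising UnicodeEncodeError.
    """
    return username.encode("ascii", "backslashreplace").decode("ascii")


# Hypothetical username containing SMILING FACE WITH SUNGLASSES (U+1F60E).
username = "cool\U0001F60E"

# A strict ASCII encode fails on the emoji...
try:
    username.encode("ascii")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# ...while the escaped form can always be written to an ASCII-only log.
log_line = "bad username: " + describe_for_log(username)
```

Here strict_ok ends up False, and log_line contains the escape sequence \U0001f60e in place of the emoji.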

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-19 Thread Roy Smith
In article , Chris Angelico wrote:
> On Sun, Oct 20, 2013 at 3:49 AM, Roy Smith wrote:
> > So, yesterday, I tracked down an uncaught exception stack in our logs to a
> > user whose username included the unicode character 'SMILING FACE WITH
> > SUNGLASSES' (U+1F60E). It turns out, that's perf

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-19 Thread Chris Angelico
On Sun, Oct 20, 2013 at 3:49 AM, Roy Smith wrote:
> So, yesterday, I tracked down an uncaught exception stack in our logs to a
> user whose username included the unicode character 'SMILING FACE WITH
> SUNGLASSES' (U+1F60E). It turns out, that's perfectly fine as a user name,
> except that in o

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-19 Thread Roy Smith
On Saturday, October 19, 2013 12:16:02 PM UTC-4, Steven D'Aprano wrote:
> Another reasonable use for accent-stripping is searches. If I'm searching
> for music by the Blue Öyster Cult, it would be good to see results for
> Blue Oyster Cult as well.
Tell me about it (I work at Songza; music sear
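
For the search use case, one common approach is to leave the stored names untouched and compare on a folded key instead. The sketch below is an illustration under assumptions, not anything from Songza: it assumes Python 3, and search_key, search, and the sample catalog are hypothetical. It uses only unicodedata.normalize and unicodedata.combining from the standard library.

```python
import unicodedata


def search_key(text):
    """Fold case and drop combining marks, for comparison only."""
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search(query, names):
    """Return the original names whose folded form contains the folded query."""
    key = search_key(query)
    return [name for name in names if key in search_key(name)]


# Hypothetical catalog; the displayed names keep their diacritics.
catalog = ["Blue Öyster Cult", "Motörhead", "Björk"]
matches = search("blue oyster cult", catalog)
```

With this catalog, matches is ["Blue Öyster Cult"]: the accent-free query finds the accented name, and the name is still displayed as originally spelled.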

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-19 Thread Steven D'Aprano
On Sat, 19 Oct 2013 11:14:30 -0300, Zero Piraeus wrote:
> :
>
> On Sat, Oct 19, 2013 at 09:19:12AM +, Steven D'Aprano wrote:
>> Make no mistake, this sort of simple-minded stripping of accents and
>> diacritics is an extremely ham-fisted thing to do.
[...]
> Joking aside, there is a legitimat

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-19 Thread rusi
On Saturday, October 19, 2013 8:40:37 PM UTC+5:30, Roy Smith wrote:
> Zero Piraeus wrote:
>
> > For example, a miscreant may create the username 'míguel' in order to
> > pose as another user 'miguel', relying on other users inattentiveness.
> > Asciifying is one way of reducing the risk of that.

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-19 Thread Roy Smith
In article , Zero Piraeus wrote:
> For example, a miscreant may create the username 'míguel' in order to
> pose as another user 'miguel', relying on other users inattentiveness.
> Asciifying is one way of reducing the risk of that.
Determining if two strings are "almost the same" is not easy.

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-19 Thread Zero Piraeus
:
On Sat, Oct 19, 2013 at 09:19:12AM +, Steven D'Aprano wrote:
> Make no mistake, this sort of simple-minded stripping of accents and
> diacritics is an extremely ham-fisted thing to do.
I used to live on a street called Calle Colón, so I'm aware of the dangers of stripping diacritics: http

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-19 Thread caldwellinva
Zero/Steven ... thank you for your replies ... they were both very helpful, both in addressing the immediate issue and for getting a better understanding of the context of the conversion. Greatly appreciate your taking the time for such good solutions.

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-19 Thread Roy Smith
In article , caldwelli...@gmail.com wrote:
> I am looking for an example of a UNICODE to ASCII conversion example that
> will remove diacritics from characters (and leave the characters, i.e., Klüft
> to Kluft) as well as handle the conversion of other characters, like große to
> grosse.
http

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-19 Thread Steven D'Aprano
On Fri, 18 Oct 2013 13:45:53 -0700, caldwellinva wrote:
> Hi!
>
> I am looking for an example of a UNICODE to ASCII conversion example
> that will remove diacritics from characters (and leave the characters,
> i.e., Klüft to Kluft) as well as handle the conversion of other
> characters, like groß

Re: Looking for UNICODE to ASCII Conversioni Example Code

2013-10-18 Thread Zero Piraeus
:
On Fri, Oct 18, 2013 at 01:45:53PM -0700, caldwelli...@gmail.com wrote:
> I am looking for an example of a UNICODE to ASCII conversion example
> that will remove diacritics from characters (and leave the characters,
> i.e., Klüft to Kluft) as well as handle the conversion of other
> characters,

Looking for UNICODE to ASCII Conversioni Example Code

2013-10-18 Thread caldwellinva
Hi! I am looking for an example of a UNICODE to ASCII conversion example that will remove diacritics from characters (and leave the characters, i.e., Klüft to Kluft) as well as handle the conversion of other characters, like große to grosse. There used to be a program called any2ascii.py (htt
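
For readers landing on this thread, here is a minimal sketch of the kind of conversion being asked for. It is not any2ascii.py, and it is not necessarily what the replies in this thread proposed. It assumes Python 3, and the function name to_ascii and the SPECIAL_CASES table are hypothetical. Diacritics are removed by decomposing with unicodedata.normalize("NFKD", ...) and dropping combining marks; characters such as ß have no decomposition, so they need an explicit mapping.

```python
import unicodedata

# Hypothetical mapping for characters that NFKD does not decompose.
# Extend it for whatever other characters your data contains.
SPECIAL_CASES = {
    "ß": "ss",
    "ẞ": "SS",
}


def to_ascii(text):
    """Return an ASCII approximation of text (lossy, as warned elsewhere in this thread)."""
    # 1. Apply the explicit replacements first.
    replaced = "".join(SPECIAL_CASES.get(ch, ch) for ch in text)
    # 2. Split accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", replaced)
    # 3. Drop the combining marks, keeping the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # 4. Discard anything that is still not ASCII.
    return stripped.encode("ascii", "ignore").decode("ascii")
```

With the two examples from the question, to_ascii("Klüft") gives "Kluft" and to_ascii("große") gives "grosse". Note that step 4 silently drops any character not covered by the table or by decomposition, which may or may not be what you want.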