Re: Normalize a polish L

2007-10-23 Thread Roberto Bonvallet
On Oct 22, 7:50 pm, Mike Orr <[EMAIL PROTECTED]> wrote:
> Well, that gets into official vs unofficial conversions. Does the
> Spanish Academy really say 'ü' should be converted to 'u'?

No, but it's the only conversion that makes sense. The only Spanish letter that doesn't have a standard common

Re: Normalize a polish L

2007-10-22 Thread Mike Orr
On Oct 16, 9:51 am, Roberto Bonvallet <[EMAIL PROTECTED]> wrote:
> For example, in Spanish, "ü" (u with umlaut) should be represented as
> "u", but in German, it should be represented as "ue".
>
> pingüino -> pinguino
> Frühstück -> Fruehstueck
>
> I'd like that web applications (e.g. blogs
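The language-dependent conversion described in this message could be sketched roughly as below. This is only an illustration written for Python 3: the `TRANSLIT` table, the language codes, and the `transliterate` helper are hypothetical names, and the table entries are examples rather than any official rule set.

```python
# Hypothetical per-language tables; entries are illustrative, not official.
TRANSLIT = {
    "es": {"\u00fc": "u", "\u00dc": "U"},            # ü, Ü
    "de": {
        "\u00fc": "ue", "\u00dc": "Ue",              # ü, Ü
        "\u00f6": "oe", "\u00d6": "Oe",              # ö, Ö
        "\u00e4": "ae", "\u00c4": "Ae",              # ä, Ä
        "\u00df": "ss",                              # ß
    },
}


def transliterate(text, lang):
    """Replace characters using the table for `lang`; unknown languages pass through."""
    table = str.maketrans(TRANSLIT.get(lang, {}))
    return text.translate(table)


print(transliterate("ping\u00fcino", "es"))        # pinguino
print(transliterate("Fr\u00fchst\u00fcck", "de"))  # Fruehstueck
```

The same input character maps to different output depending on the language argument, which is the point being made in the thread: the conversion cannot be decided from the character alone.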

Re: Normalize a polish L

2007-10-16 Thread Roberto Bonvallet
On Oct 15, 6:57 pm, John Machin <[EMAIL PROTECTED]> wrote:
> To "asciify" such text, you need to build a look-up table that suits
> your purpose. unicodedata.decomposition() is (accidentally) useful in
> providing *some* of the entries for such a table.

This is the only approach that can actually
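One possible shape for the look-up table approach quoted above, as a Python 3 sketch: hand-written entries cover characters such as "Ł" that have no Unicode decomposition, and `unicodedata.decomposition()` supplies a base letter for the rest. The `MANUAL` table and the `asciify` and `asciify_char` helpers are hypothetical names, and the chosen replacements are assumptions, not a recommendation from the thread.

```python
import unicodedata

# Hand-written entries (assumed choices) for characters without a decomposition.
MANUAL = {
    "\u0141": "L",   # LATIN CAPITAL LETTER L WITH STROKE
    "\u0142": "l",   # LATIN SMALL LETTER L WITH STROKE
    "\u00d8": "O",   # LATIN CAPITAL LETTER O WITH STROKE
    "\u00f8": "o",   # LATIN SMALL LETTER O WITH STROKE
    "\u00df": "ss",  # LATIN SMALL LETTER SHARP S
}


def asciify_char(ch):
    """Return an ASCII stand-in for one character, or '' if none is known."""
    if ord(ch) < 128:
        return ch
    if ch in MANUAL:
        return MANUAL[ch]
    decomp = unicodedata.decomposition(ch)
    # Use canonical decompositions only; compatibility ones start with "<...>".
    if decomp and not decomp.startswith("<"):
        base = chr(int(decomp.split()[0], 16))
        return asciify_char(base)  # the base may itself need another step
    return ""


def asciify(text):
    return "".join(asciify_char(ch) for ch in text)


print(asciify("\u0141ukasz"))       # Lukasz
print(asciify("ping\u00fcino"))     # pinguino
```

Checking the manual table first lets a purpose-specific choice override whatever the decomposition data would give.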

Re: Normalize a polish L

2007-10-16 Thread Peter Bengtsson
On Oct 15, 10:57 pm, John Machin <[EMAIL PROTECTED]> wrote:
> On Oct 16, 2:33 am, Peter Bengtsson <[EMAIL PROTECTED]> wrote:
> > In UTF8, \u0141 is a capital L with a little dash through it as can be
> > seen in this image: http://static.peterbe.com/lukasz.png
> >
> > I tried this:
> > >>> import uni

Re: Normalize a polish L

2007-10-15 Thread Bjoern Schliessmann
Thorsten Kampe wrote:
> The 'L' is actually pronounced like the English "w"...

'Ł' originally comes from "L" () and is AFAIK transcribed so. Also, a friend of mine writes himself "Lukas" (pronounced L-) even though in Polish his name is Łukas (short Wh-).

Regard

Re: Normalize a polish L

2007-10-15 Thread Bjoern Schliessmann
Thorsten Kampe wrote:
> Why do you try to use characters in a character set that does not
> contain these characters? That doesn't make any sense.

I thought KNode was smart enough to switch to UTF-8; obviously, it isn't.

Regards, Björn

--
BOFH excuse #121: halon system went off and killed t

Re: Normalize a polish L

2007-10-15 Thread John Machin
On Oct 16, 2:33 am, Peter Bengtsson <[EMAIL PROTECTED]> wrote:
> In UTF8, \u0141 is a capital L with a little dash through it as can be
> seen in this image: http://static.peterbe.com/lukasz.png
>
> I tried this:
> >>> import unicodedata
> >>> unicodedata.normalize('NFKD', u'\u0141').encode('ascii','ig

Re: Normalize a polish L

2007-10-15 Thread Thorsten Kampe
* Bjoern Schliessmann (Mon, 15 Oct 2007 21:51:54 +0200)
> Thorsten Kampe wrote:
> > The 'L' is actually pronounced like the English "w"...
>
> '?' originally comes from "L" () and
> is AFAIK transcribed so.

There are lots of possible transcriptions for "LATIN CAPI

Re: Normalize a polish L

2007-10-15 Thread Rob Wolfe
Peter Bengtsson <[EMAIL PROTECTED]> writes:
> In UTF8, \u0141 is a capital L with a little dash through it as can be
> seen in this image:
> http://static.peterbe.com/lukasz.png
>
> I tried this:
> >>> import unicodedata
> >>> unicodedata.normalize('NFKD', u'\u0141').encode('ascii','ignore')
> ''
>
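The empty result quoted above can be reproduced with a small Python 3 sketch (the thread itself uses Python 2 `u''` literals, so the result there prints as `''` rather than `b''`). The helper names `ascii_fold` and `has_decomposition` are made up for this illustration. U+0141 has no decomposition in the Unicode database, so NFKD leaves it unchanged and the `'ignore'` error handler then drops it.

```python
import unicodedata


def ascii_fold(text):
    """NFKD-decompose, then silently drop every code point that is not ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore")


def has_decomposition(ch):
    """True if the Unicode database lists a decomposition for `ch`."""
    return unicodedata.decomposition(ch) != ""


print(has_decomposition("\u00fc"), ascii_fold("\u00fc"))  # True b'u'
print(has_decomposition("\u0141"), ascii_fold("\u0141"))  # False b''
```

For "ü" the decomposition yields "u" plus a combining diaeresis, and only the combining mark is dropped; for "Ł" nothing ASCII is left to keep.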

Re: Normalize a polish L

2007-10-15 Thread Bjoern Schliessmann
Thorsten Kampe wrote:
> The 'L' is actually pronounced like the English "w"...

'?' originally comes from "L" () and is AFAIK transcribed so. Also, a friend of mine writes himself "Lukas" (pronounced L-) even though in Polish his name is ?ukas (short Wh-).

Regard

Re: Normalize a polish L

2007-10-15 Thread Thorsten Kampe
* Peter Bengtsson (Mon, 15 Oct 2007 16:33:26 -)
> In UTF8, \u0141 is a capital L with a little dash through it as can be
> seen in this image:
> http://static.peterbe.com/lukasz.png
> I tried this:
> >>> import unicodedata
> >>> unicodedata.normalize('NFKD', u'\u0141').encode('ascii','ignore')