Fredrik Lundh wrote:
> John Machin wrote:
>
> > Another point: there are many non-latin1 characters that could be
> > mapped to ASCII. For example:
> > u"\u0141ukasziewicz".translate(unaccented_map())
> > doesn't work unless an entry is added to the no-decomposition table:
> > 0x0141: u"L"
John Machin wrote:
> Another point: there are many non-latin1 characters that could be
> mapped to ASCII. For example:
> u"\u0141ukasziewicz".translate(unaccented_map())
> doesn't work unless an entry is added to the no-decomposition table:
> 0x0141: u"L", # LATIN CAPITAL LETTER L WITH STR
Fredrik Lundh wrote:
> John Machin wrote:
>
> > 3. ... and to check for missing maps. The OP may be working only with
> > French text, and may not care about Icelandic and German letters, but
> > other readers who stumble on this (and miss past thread(s) on this
> > topic) may like something done
John Machin wrote:
> 3. ... and to check for missing maps. The OP may be working only with
> French text, and may not care about Icelandic and German letters, but
> other readers who stumble on this (and miss past thread(s) on this
> topic) may like something done with \xde (capital thorn), \xfe (small thorn), etc.
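
In the same spirit, and purely as an illustration (these ASCII
transliterations are conventional choices, not something the thread
prescribes), the hand-maintained table from the sketch above could be
extended for latin-1 letters that have no decomposition at all:

EXTRA_REPLACEMENTS = {
    0x00c6: u"AE",  # LATIN CAPITAL LETTER AE
    0x00e6: u"ae",  # LATIN SMALL LETTER AE
    0x00d0: u"D",   # LATIN CAPITAL LETTER ETH
    0x00f0: u"d",   # LATIN SMALL LETTER ETH
    0x00d8: u"O",   # LATIN CAPITAL LETTER O WITH STROKE
    0x00f8: u"o",   # LATIN SMALL LETTER O WITH STROKE
    0x00de: u"Th",  # LATIN CAPITAL LETTER THORN
    0x00fe: u"th",  # LATIN SMALL LETTER THORN
    0x00df: u"ss",  # LATIN SMALL LETTER SHARP S
}
CHAR_REPLACEMENT.update(EXTRA_REPLACEMENTS)   # merge into the table sketched earlier
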
Frederic Rentsch wrote:
> Try this:
>
> from_characters = '\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xff\xe7\xe8\xe9\xea\x
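
A much-shortened sketch of the same byte-oriented idea, with made-up
character lists rather than Frederic Rentsch's actual table: in Python 2,
string.maketrans pairs two byte strings of equal length, and the resulting
table only applies to latin-1 byte strings, not to Unicode objects.

import string

# abbreviated, illustrative lists (not the full table from the post above)
from_chars = ('\xc0\xc1\xc2\xc3\xc4\xc5\xc7\xc8\xc9\xca\xcb'
              '\xe0\xe1\xe2\xe3\xe4\xe5\xe7\xe8\xe9\xea\xeb')
to_chars   = ('AAAAAAC' 'EEEE'
              'aaaaaac' 'eeee')
table = string.maketrans(from_chars, to_chars)   # 256-byte translation table

print 'd\xe9j\xe0 vu'.translate(table)           # -> deja vu
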
Dan wrote:
> On 22 nov, 22:59, "John Machin" <[EMAIL PROTECTED]> wrote:
>
>
>>> processes (Vigenère)
>>>
>> So why do you want to strip off accents? The history of communication
>> has several examples of significant difference in meaning caused by
>> minute differences in punctuation or
On 22 nov, 22:59, "John Machin" <[EMAIL PROTECTED]> wrote:
> > processes (Vigenère)
> So why do you want to strip off accents? The history of communication
> has several examples of significant difference in meaning caused by
> minute differences in punctuation or accents including one of which you
On Wed, 22 Nov 2006 22:59:01 +0100, John Machin <[EMAIL PROTECTED]>
wrote:
[snip]
> So why do you want to strip off accents? The history of communication
> has several examples of significant difference in meaning caused by
> minute differences in punctuation or accents including one of which you
Klaas wrote:
> It's not too hard to imagine an accentual difference, eg:
especially in languages where certain combinations really are distinct
letters, not just letters with accents or silly marks.
I have a Swedish children's book somewhere, in which some characters are
harassed by a big ugly
David H Wild wrote:
> In article <[EMAIL PROTECTED]>,
>John Machin <[EMAIL PROTECTED]> wrote:
> > So why do you want to strip off accents? The history of communication
> > has several examples of significant difference in meaning caused by
> > minute differences in punctuation or accents includ
In article <[EMAIL PROTECTED]>,
John Machin <[EMAIL PROTECTED]> wrote:
> So why do you want to strip off accents? The history of communication
> has several examples of significant difference in meaning caused by
> minute differences in punctuation or accents including one of which you
> may hav
Dan wrote:
> Thank you for your answers.
>
> In fact, I'm just getting started with Python.
That was a good decision. Welcome!
>
> I was looking to transform a text through elementary cryptographic
> processes (Vigenère).
So why do you want to strip off accents? The history of communication
has severa
Thank you for your answers.
In fact, I'm just getting started with Python.
I was looking to transform a text through elementary cryptographic
processes (Vigenère).
The initial text is in a file, and my system uses UTF-8 by default
(Ubuntu).
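
Putting the pieces together, a minimal sketch of what is described here:
read the UTF-8 file, fold away the accents, then apply a classic A-Z
Vigenère cipher. The file name ("message.txt") and the key are made up for
illustration, and the accent stripping uses the standard
normalize-then-drop-combining-marks idiom (Python 2):

import codecs
import unicodedata

def strip_accents(text):
    # decompose accented characters (NFKD) and drop the combining marks
    nfkd = unicodedata.normalize("NFKD", text)
    return u"".join(c for c in nfkd if not unicodedata.combining(c))

def vigenere_encrypt(plaintext, key):
    key = key.upper()
    out = []
    i = 0
    for ch in plaintext.upper():
        if u"A" <= ch <= u"Z":
            shift = ord(key[i % len(key)]) - ord(u"A")
            out.append(unichr((ord(ch) - ord(u"A") + shift) % 26 + ord(u"A")))
            i += 1
        else:
            out.append(ch)    # spaces, digits and punctuation pass through
    return u"".join(out)

text = codecs.open("message.txt", "r", "utf-8").read()
print vigenere_encrypt(strip_accents(text), u"CLEF")
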
hg wrote:
> Duncan Booth wrote:
> > hg <[EMAIL PROTECTED]> wrote:
> >
> >>> or in other words, put this at the top of your file (where "utf-8" is
> >>> whatever your editor/system is using):
> >>>
> >>># -*- coding: utf-8 -*-
> >>>
> >>> and use
> >>>
> >>>u''
> >>>
> >>> for all non-ASCII
Fredrik Lundh wrote:
> hg wrote:
>
>> How would you handle the string.maketrans then ?
>
> maketrans works on bytes, not characters. what makes you think that you
> can use maketrans if you haven't gotten the slightest idea what's in the
> string?
>
> if you want to get rid of accents in a Unicode string
hg wrote:
> How would you handle the string.maketrans then ?
maketrans works on bytes, not characters. what makes you think that you
can use maketrans if you haven't gotten the slightest idea what's in the
string?
if you want to get rid of accents in a Unicode string, you can do the
approach
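
One widely used Unicode-aware approach (offered here as a sketch, not
necessarily the code the message refers to) is to decompose the string with
NFKD and drop the combining marks. A Python 2 interactive session:

>>> import unicodedata
>>> s = u"Vigen\xe8re d\xe9j\xe0 vu"
>>> u"".join(c for c in unicodedata.normalize("NFKD", s)
...          if not unicodedata.combining(c))
u'Vigenere deja vu'
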
Duncan Booth wrote:
> hg <[EMAIL PROTECTED]> wrote:
>
>>> or in other words, put this at the top of your file (where "utf-8" is
>>> whatever your editor/system is using):
>>>
>>># -*- coding: utf-8 -*-
>>>
>>> and use
>>>
>>>u''
>>>
>>> for all non-ASCII literals.
>>>
>>>
>>>
>> Hi,
>>
>>
hg <[EMAIL PROTECTED]> wrote:
>> or in other words, put this at the top of your file (where "utf-8" is
>> whatever your editor/system is using):
>>
>># -*- coding: utf-8 -*-
>>
>> and use
>>
>>u''
>>
>> for all non-ASCII literals.
>>
>>
>>
>
> Hi,
>
> The problem is that:
>
> # -
hg wrote:
> Fredrik Lundh wrote:
>> hg wrote:
>>
>>> We noticed that len('à') != len('a')
>> sounds odd.
>>
>> >>> len('à') == len('a')
>> True
>>
>> are you perhaps using an UTF-8 editor?
>>
>> to keep your sanity, no matter what editor you're using, I recommend
>> adding a coding directive to the
Fredrik Lundh wrote:
> hg wrote:
>
>> We noticed that len('à') != len('a')
>
> sounds odd.
>
> >>> len('à') == len('a')
> True
>
> are you perhaps using an UTF-8 editor?
>
> to keep your sanity, no matter what editor you're using, I recommend
> adding a coding directive to the source file, and
hg wrote:
> We noticed that len('à') != len('a')
sounds odd.
>>> len('à') == len('a')
True
are you perhaps using an UTF-8 editor?
to keep your sanity, no matter what editor you're using, I recommend
adding a coding directive to the source file, and using *only* Unicode
string literals for non-ASCII text.
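
A minimal example of what that looks like in practice (the file name and its
contents are only illustrative), assuming the file itself is saved as UTF-8:

# -*- coding: utf-8 -*-
# demo.py: the coding line tells Python 2 how to decode the bytes of this
# source file; the u prefix makes the literal a Unicode string rather than
# a raw byte string.
print len(u'à')    # 1: one character
print len('à')     # 2: without the u prefix this is two UTF-8 bytes
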
Hi,
I'm bringing over a thread that's going on on f.c.l.python.
The point was to get rid of French accents from words.
We noticed that len('à') != len('a') and I found the hack below to fix
the "problem" ... yet I do not understand - especially since 'à' is
included in the extended ASCII table,
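
For what it is worth, the observation almost certainly comes from the
editor/terminal being UTF-8 rather than latin-1 ("extended ASCII"): in UTF-8
the single character 'à' is encoded as two bytes, and a Python 2 byte string
simply counts bytes. A quick interactive illustration:

>>> 'à'                     # typed at a UTF-8 terminal: two bytes arrive
'\xc3\xa0'
>>> len('\xc3\xa0'), len(u'\xe0')
(2, 1)
>>> '\xc3\xa0'.decode('utf-8') == u'\xe0'
True
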