>
> is there a way to sort this string properly (sorted()?)
> I mean first 'a' then 'à' then 'e' etc. (sorted puts accented letters at
> the end). Or should I have to provide a comparison function to sorted?
After setting the locale...
locale.strcoll()
--
damjan
--
http://mail.python.org/mai
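A minimal sketch of the locale.strcoll() suggestion above, written for Python 3. The helper name sort_with_locale and the default locale_name="" (meaning "use the environment's locale") are assumptions made for illustration; whether 'à' actually sorts right after 'a' depends on which locales are installed on the machine.

```python
import functools
import locale


def sort_with_locale(words, locale_name=""):
    """Sort words using the collation rules of a locale.

    locale_name="" asks for the user's default locale (an assumption);
    pass something like "fr_FR.UTF-8" if that locale is installed.
    """
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        # Locale not available: collation is left unchanged, which is
        # often plain code point order (accented letters at the end).
        pass
    # locale.strcoll is a comparison function, so wrap it for sorted().
    return sorted(words, key=functools.cmp_to_key(locale.strcoll))


print(sort_with_locale(["e", "à", "a", "é"]))
```

locale.strxfrm can be passed directly as the key instead of wrapping strcoll; both rely on the same locale setting.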
Ricardo Aráoz wrote:
> Lawrence D'Oliveiro wrote:
>> In message <[EMAIL PROTECTED]>, tool69 wrote:
>>
>>> p2.content = """Ce poste possède des accents : é à ê è"""
>>
>> My guess is this is being encoded as a Latin-1 string, but when you try
>> to output it it goes through the ASCII encoder, whi
Lawrence D'Oliveiro wrote:
> In message <[EMAIL PROTECTED]>, tool69 wrote:
>
>> p2.content = """Ce poste possède des accents : é à ê è"""
>
> My guess is this is being encoded as a Latin-1 string, but when you try to
> output it it goes through the ASCII encoder, which doesn't understand the
> ac
Diez B. Roggisch wrote:
> tool69 wrote:
>
>> Hi,
>>
>> I would like to transform reST contents to HTML, but got problems
>> with accented chars.
>>
>> Here's a rather simplified version using SVN Docutils 0.5:
>>
>> %-
>>
>> #!/usr/bin
Lawrence D'Oliveiro wrote:
> In message <[EMAIL PROTECTED]>, tool69 wrote:
>
>> p2.content = """Ce poste possède des accents : é à ê è"""
>
> My guess is this is being encoded as a Latin-1 string, but when you try to
> output it it goes through the ASCII encoder, which doesn't understand the
>
tool69 wrote:
> Hi,
>
> I would like to transform reST contents to HTML, but got problems
> with accented chars.
>
> Here's a rather simplified version using SVN Docutils 0.5:
>
> %-
>
> #!/usr/bin/env python
> # -*- coding: utf-8 -*-
In message <[EMAIL PROTECTED]>, tool69 wrote:
> p2.content = """Ce poste possède des accents : é à ê è"""
My guess is this is being encoded as a Latin-1 string, but when you try to
output it it goes through the ASCII encoder, which doesn't understand the
accents. Try this:
p2.content = u"""Ce po
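To illustrate the guess above about the ASCII encoder, here is a small Python 3 sketch (the thread itself is from the Python 2 era, where the u"" prefix mattered; in Python 3 every str is already Unicode). It only shows that the accented text cannot be encoded as ASCII while Latin-1 and UTF-8 both accept it. It is not the Docutils code from the original post.

```python
text = "Ce poste possède des accents : é à ê è"

# The ASCII codec has no code for the accented letters, so this fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Latin-1 uses one byte per character here; UTF-8 needs two bytes for
# each accented letter.
latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")

print(ascii_ok, len(text), len(latin1_bytes), len(utf8_bytes))
```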
Serge Orlov wrote:
> The problem is that U+0587 is a ligature in Western Armenian dialect
> (hy locale) and a character in Eastern Armenian dialect (hy_AM locale).
> It is strange that the code point is marked as a compatibility char. It is
> either a mistake or a political decision. It used to be a ligature bef
Jean-Paul Calderone wrote:
> On Fri, 24 Mar 2006 09:33:19 +1100, John Machin <[EMAIL PROTECTED]> wrote:
> >On 24/03/2006 8:36 AM, Peter Otten wrote:
> >> John Machin wrote:
> >>
> >>>You can replace ALL of this upshifting and accent removal in one blow by
> >>>using the string translate() method wi
Martin v. Löwis wrote:
> John Machin wrote:
> >> and, for things like u'\u0565\u0582' (ARMENIAN SMALL LIGATURE ECH
> >> YIWN), it does not even work.
> >
> > Sorry, I don't understand.
> > 0565 is stand-alone ECH
> > 0582 is stand-alone YIWN
> > 0587 is the ligature.
> > What doesn't work? At first
John Machin wrote:
>> and, for things like u'\u0565\u0582' (ARMENIAN SMALL LIGATURE ECH
>> YIWN), it does not even work.
>
> Sorry, I don't understand.
> 0565 is stand-alone ECH
> 0582 is stand-alone YIWN
> 0587 is the ligature.
> What doesn't work? At first guess, in the absence of an Armenian
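For readers following the code points quoted above, a short Python 3 sketch using the standard unicodedata module. It only looks up the character names and shows how compatibility normalization treats U+0587; it makes no claim about which form is linguistically correct.

```python
import unicodedata

ligature = "\u0587"      # the single code point discussed above
pair = "\u0565\u0582"    # the two stand-alone letters

print(unicodedata.name(ligature))
print([unicodedata.name(c) for c in pair])

# NFKD splits the ligature into the two letters ...
decomposed = unicodedata.normalize("NFKD", ligature)
# ... but NFKC does not glue the two letters back into the ligature,
# because compatibility decompositions are never recomposed.
recomposed = unicodedata.normalize("NFKC", pair)

print(decomposed == pair, recomposed == ligature)
```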
John Machin wrote:
> Some of the transformations are a little unfortunate :-(
here's a slightly silly way to map a unicode string to its "unaccented"
version:
###
import unicodedata, sys
CHAR_REPLACEMENT = {
0xc6: u"AE", # LATIN CAPITAL LETTER AE
0xd0: u"D", # LATIN CAPITAL LETTER ETH
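Since the snippet above is cut off, here is an independent Python 3 sketch of the same general idea: decompose each character and drop the combining marks, with a small hand-written table for letters that have no decomposition. Only the 0xc6 and 0xd0 entries come from the snippet; the other table entries and the function name unaccent are assumptions, not the original author's code.

```python
import unicodedata

# Letters without a Unicode decomposition need explicit replacements.
CHAR_REPLACEMENT = {
    0xC6: "AE",  # LATIN CAPITAL LETTER AE (from the snippet above)
    0xD0: "D",   # LATIN CAPITAL LETTER ETH (from the snippet above)
    0xE6: "ae",  # LATIN SMALL LETTER AE (assumed entry)
    0xF0: "d",   # LATIN SMALL LETTER ETH (assumed entry)
}


def unaccent(text):
    """Return text with accents removed where Unicode data allows it."""
    out = []
    for ch in text:
        if ord(ch) in CHAR_REPLACEMENT:
            out.append(CHAR_REPLACEMENT[ord(ch)])
            continue
        # NFKD turns e.g. U+00E9 into "e" followed by a combining accent.
        decomposed = unicodedata.normalize("NFKD", ch)
        out.append("".join(c for c in decomposed
                           if not unicodedata.combining(c)))
    return "".join(out)


print(unaccent("Ce poste poss\u00e8de des accents : \u00e9 \u00e0 \u00ea \u00e8"))
```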
On 24/03/2006 11:44 PM, Peter Otten wrote:
> John Machin wrote:
>
>
>>0x00d0: ord('D'), # Ð
>>0x00f0: ord('o'), # ð
>>Icelandic capital eth becomes D, OK; but the small letter becomes o!!!
>
>
> I see information flow from Iceland is a bit better than from Armenia :-)
No information flow neede
Duncan Booth wrote:
> [...]
> Unfortunately, just as I finished writing this I discovered that the
> latscii module isn't as robust as I thought, it blows up on consecutive
> accented characters.
>
> :(
Replace the error handler with this (untested) and it should work with
consecutive accent
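The replacement handler itself is cut off above, so the following is not Duncan Booth's code and not the latscii module. It is a separate Python 3 sketch of the technique: a codec error handler, registered with codecs.register_error, that strips accents while encoding to ASCII. The handler name "strip_accents" and the "?" fallback are assumptions. It processes the whole failing slice, so runs of consecutive accented characters are handled.

```python
import codecs
import unicodedata


def strip_accents_handler(exc):
    """Replace each unencodable character by its unaccented base letter."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # exc.start:exc.end may span several consecutive accented characters.
    chunk = exc.object[exc.start:exc.end]
    out = []
    for ch in chunk:
        decomposed = unicodedata.normalize("NFKD", ch)
        base = "".join(c for c in decomposed
                       if not unicodedata.combining(c))
        # Fall back to "?" when no ASCII base letter exists (assumption).
        out.append(base if base.isascii() else "?")
    # Return the replacement text and the position to resume encoding at.
    return "".join(out), exc.end


codecs.register_error("strip_accents", strip_accents_handler)


def to_ascii(text):
    return text.encode("ascii", errors="strip_accents").decode("ascii")


print(to_ascii("poss\u00e8de \u00e9\u00e0\u00ea\u00e8"))
```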
John Machin wrote:
> 0x00d0: ord('D'), # Ð
> 0x00f0: ord('o'), # ð
> Icelandic capital eth becomes D, OK; but the small letter becomes o!!!
I see information flow from Iceland is a bit better than from Armenia :-)
> Some of the transformations are a little unfortunate :-(
The OP, as you pointed
On 24/03/2006 8:11 PM, Duncan Booth wrote:
> Peter Otten wrote:
>
>
>>>You can replace ALL of this upshifting and accent removal in one blow
>>>by using the string translate() method with a suitable table.
>>
>>Only if you convert to unicode first or if your data maintains 1 byte
>>== 1 character
Duncan Booth wrote:
> There's a nice little codec from Skip Montanaro for removing accents from
> latin-1 encoded strings. It also has an error handler so you can convert
> from unicode to ascii and strip all the accents as you do so:
>
> http://orca.mojam.com/~skip/python/latscii.py
>
import
Peter Otten wrote:
>> You can replace ALL of this upshifting and accent removal in one blow
>> by using the string translate() method with a suitable table.
>
> Only if you convert to unicode first or if your data maintains 1 byte
> == 1 character, in particular it is not UTF-8.
>
There's a ni
On 24/03/2006 2:19 PM, Jean-Paul Calderone wrote:
> On Fri, 24 Mar 2006 09:33:19 +1100, John Machin <[EMAIL PROTECTED]>
> wrote:
>
>> On 24/03/2006 8:36 AM, Peter Otten wrote:
>>
>>> John Machin wrote:
>>>
>>>> You can replace ALL of this upshifting and accent removal in one
>>>> blow by
>>>> us
On Fri, 24 Mar 2006 09:33:19 +1100, John Machin <[EMAIL PROTECTED]> wrote:
>On 24/03/2006 8:36 AM, Peter Otten wrote:
>> John Machin wrote:
>>
>>>You can replace ALL of this upshifting and accent removal in one blow by
>>>using the string translate() method with a suitable table.
>>
>> Only if you
On 24/03/2006 8:36 AM, Peter Otten wrote:
> John Machin wrote:
>
>>You can replace ALL of this upshifting and accent removal in one blow by
>>using the string translate() method with a suitable table.
>
> Only if you convert to unicode first or if your data maintains 1 byte == 1
> character, in p
John Machin wrote:
> You can replace ALL of this upshifting and accent removal in one blow by
> using the string translate() method with a suitable table.
Only if you convert to unicode first or if your data maintains 1 byte == 1
character, in particular it is not UTF-8.
Peter
--
http://mail.
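A rough Python 3 sketch of the translate() approach and of Peter's caveat. The table below is a hypothetical example covering only a handful of letters, not the table from the thread. In Python 3, str.translate always works on characters, so the "1 byte == 1 character" problem only appears if you try to do the same thing on UTF-8 encoded bytes.

```python
# Hypothetical table: code point -> replacement string.
TABLE = {
    0x00C7: "C",  # LATIN CAPITAL LETTER C WITH CEDILLA
    0x00E7: "c",  # LATIN SMALL LETTER C WITH CEDILLA
    0x00E0: "a",  # LATIN SMALL LETTER A WITH GRAVE
    0x00E8: "e",  # LATIN SMALL LETTER E WITH GRAVE
    0x00E9: "e",  # LATIN SMALL LETTER E WITH ACUTE
    0x00EA: "e",  # LATIN SMALL LETTER E WITH CIRCUMFLEX
    0x00D0: "D",  # LATIN CAPITAL LETTER ETH
    0x00F0: "d",  # LATIN SMALL LETTER ETH (d, not o)
}


def upshift_unaccent(text):
    """Remove the accents listed in TABLE, then uppercase the result."""
    return text.translate(TABLE).upper()


print(upshift_unaccent("\u00e9\u00c7"))

# The caveat: in UTF-8 one accented character is two bytes, so a
# byte-for-byte translation table cannot map it to a single letter.
utf8_length = len("\u00e9".encode("utf-8"))
print(utf8_length)
```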
On 23/03/2006 10:07 PM, bussiere bussiere wrote:
> hi, I'm making a program for formatting strings,
> or
> I've added:
> #!/usr/bin/python
> # -*- coding: utf-8 -*-
>
> at the beginning of my script but
>
> str = str.replace('Ç', 'C')
> str = str.replace('é', 'E')
> str = str.repl
Seems to work fine for me.
>>> x="éÇ"
>>> x=x.replace('é','E')
>>> x
'E\xc7'
>>> x=x.replace('Ç','C')
>>> x
'E\xc7'
>>> x=x.replace('Ç','C')
>>> x
'EC'
You should also be able to use .upper() method to
uppercase everything in the string in a single statement:
tstr=ligneA.upper()
Note: you should neve
bussiere bussiere wrote:
> hi, I'm making a program for formatting strings,
> I've added:
> #!/usr/bin/python
> # -*- coding: utf-8 -*-
>
> at the beginning of my script but
>
> str = str.replace('Ç', 'C')
> ...
> doesn't work: it gives me " and , instead of replacing é by E
Are you sure your scr
> (to email use "boris at batiment71 dot ch")
oops, that's "boris at batiment71 dot net"
--
http://mail.python.org/mailman/listinfo/python-list