unescape HTML entities

2006-10-28 Thread Rares Vernica
Hi,

How can I unescape HTML entities like "&nbsp;"?

I know about xml.sax.saxutils.unescape() but it only deals with "&amp;", 
"&lt;", and "&gt;".

Also, I know about htmlentitydefs.entitydefs, but not only is this 
dictionary the opposite of what I need, it does not have "&nbsp;".

It has to be in python 2.4.

Thanks a lot,
Ray
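
A minimal sketch of one possible approach, assuming it is acceptable to pass an extra entity table to xml.sax.saxutils.unescape() through its optional second argument. Note that htmlentitydefs stores bare names such as "nbsp", without the surrounding "&" and ";". The names HTML_ENTITIES and unescape_html are made up for this example, and the try/except imports are only there so the same code also runs on Python 3.

```python
from xml.sax.saxutils import unescape

try:
    from htmlentitydefs import name2codepoint   # Python 2
except ImportError:
    from html.entities import name2codepoint    # Python 3

try:
    unichr                                      # Python 2
except NameError:
    unichr = chr                                # Python 3

# Build {"&nbsp;": u"\xa0", ...} from the bare names in name2codepoint.
# amp, lt and gt are left out because unescape() handles them itself.
HTML_ENTITIES = dict(
    ('&%s;' % name, unichr(code))
    for name, code in name2codepoint.items()
    if name not in ('amp', 'lt', 'gt')
)

def unescape_html(text):
    """Replace named HTML entities in text (hypothetical helper)."""
    return unescape(text, HTML_ENTITIES)

result = unescape_html(u'a&nbsp;b &lt;c&gt; &eacute; &amp;')
```

This only covers named entities; numeric references such as "&#160;" would need separate handling.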

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unescape HTML entities

2006-10-30 Thread Rares Vernica
Thanks a lot for all the answers!
Ray

Frederic Rentsch wrote:
> Rares Vernica wrote:
>> Hi,
>>
>> How can I unescape HTML entities like "&nbsp;"?
>>
>> I know about xml.sax.saxutils.unescape() but it only deals with "&amp;", 
>> "&lt;", and "&gt;".
>>
>> Also, I know about htmlentitydefs.entitydefs, but not only is this 
>> dictionary the opposite of what I need, it does not have "&nbsp;".
>>
>> It has to be in python 2.4.
>>
>> Thanks a lot,
>> Ray
>>
> One way is this:
> 
>  >>> import SE  # Download from http://cheeseshop.python.org/pypi/SE/2.2%20beta
>  >>> SE.SE ('HTM2ISO.se')('input_file_name', 'output_file_name')  # HTM2ISO.se is included
> 'output_file_name'
> 
> For repeated translations the SE object would be assigned to a variable:
> 
>  >>> HTM_Decoder = SE.SE ('HTM2ISO.se')
> 
> SE objects take and return strings as well as file names which is useful 
> for translating string variables, doing line-by-line translations and 
> for interactive development or verification. A simple way to check a 
> substitution set is to use its definitions as test data. The following 
> is a section of the definition file HTM2ISO.se:
> 
> test_string = '''
> &oslash;=(xf8)   #  248  f8
> &ugrave;=(xf9)   #  249  f9
> &uacute;=(xfa)   #  250  fa
> &ucirc;=(xfb)    #  251  fb
> &uuml;=(xfc)     #  252  fc
> &yacute;=(xfd)   #  253  fd
> &thorn;=(xfe)    #  254  fe
> &eacute;=(xe9)
> &ecirc;=(xea)
> &euml;=(xeb)
> &igrave;=(xec)
> &iacute;=(xed)
> &icirc;=(xee)
> &iuml;=(xef)
> '''
> 
>  >>> print HTM_Decoder (test_string)
> 
> ø=(xf8)   #  248  f8
> ù=(xf9)   #  249  f9
> ú=(xfa)   #  250  fa
> û=(xfb)#  251  fb
> ü=(xfc) #  252  fc
> ý=(xfd)   #  253  fd
> þ=(xfe)#  254  fe
> é=(xe9)
> ê=(xea)
> ë=(xeb)
> ì=(xec)
> í=(xed)
> î=(xee)
> ï=(xef)
> 
> Another feature of SE is modularity.
> 
>  >>> strip_tags = '''
>    ~<(.|\x0a)*?>~=(9)   # one tag to one tab
>    ~~=(9)   # one comment to one tab
> |   # run
>    "~\x0a[ \x09\x0d\x0a]*~=(x0a)"   # delete empty lines
>    ~\t+~=(32)   # one or more tabs to one space
>    ~\x20\t+~=(32)   # one space and one or more tabs to one space
>    ~\t+\x20~=(32)   # one or more tabs and one space to one space
> '''
> 
>  >>> HTM_Stripper_Decoder = SE.SE (strip_tags + ' HTM2ISO.se ')   # Order doesn't matter
> 
> If you write 'strip_tags' to a file, say 'STRIP_TAGS.se' you'd name it 
> together with HTM2ISO.se:
> 
>  >>> HTM_Stripper_Decoder = SE.SE ('STRIP_TAGS.se  HTM2ISO.se')   # Order doesn't matter
> 
> Or, if you have two SE objects, one for stripping tags and one for 
> decoding the ampersands, you can nest them like this:
> 
>  >>> test_string = " style='line-height:110%'>René est un garçon qui 
> paraît plus âgé. "
> 
>  >>> print Tag_Stripper (HTM_Decoder (test_string))
>   René est un garçon qui paraît plus âgé.
> 
> Nesting works with file names too, because file names are returned:
> 
>  >>> Tag_Stripper (HTM_Decoder ('input_file_name'), 'output_file_name')
> 'output_file_name'
> 
> 
> Frederic
> 
> 
> 



Re: unescape HTML entities

2006-10-31 Thread Rares Vernica
Hi,

How does your code deal with entities like "&#39;"?

Thanks,
Ray

Klaus Alexander Seistrup wrote:
> Rares Vernica wrote:
> 
>> How can I unescape HTML entities like "&nbsp;"?
>>
>> I know about xml.sax.saxutils.unescape() but it only deals with
>> "&amp;", "&lt;", and "&gt;".
>>
>> Also, I know about htmlentitydefs.entitydefs, but not only is this 
>> dictionary the opposite of what I need, it does not have 
>> "&nbsp;".
> 
> How about something like:
> 
> #v+
> #!/usr/bin/env python
> '''dehtml.py'''
> 
> import re
> import htmlentitydefs
> 
> # Match any named entity known to htmlentitydefs, e.g. "&aelig;".
> myrx = re.compile('&(' + '|'.join(htmlentitydefs.name2codepoint.keys()) + ');')
> 
> def dehtml(s):
>     # Replace each named entity with its Unicode character.
>     return re.sub(
>         myrx,
>         lambda m: unichr(htmlentitydefs.name2codepoint[m.group(1)]),
>         s
>     )
> # end def dehtml
> 
> if __name__ == '__main__':
>     import sys
>     print dehtml(sys.stdin.read()).encode('utf-8')
> # end if
> 
> #v-
> 
> E.g.:
> 
> #v+
> 
> $ echo 'fr&aelig;kke fr&oslash;l&aring;r' | ./dehtml.py
> frække frølår
> $ 
> 
> #v-
> 
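
A hedged sketch of how the regex approach above might be extended to numeric character references, decimal and hexadecimal, as well as named entities. ENTITY_RX and replace_entity are names invented for this example, and leaving unknown entities untouched is a design choice, not something the code above does. The try/except imports are only there so it also runs on Python 3.

```python
import re

try:
    from htmlentitydefs import name2codepoint   # Python 2
except ImportError:
    from html.entities import name2codepoint    # Python 3

try:
    unichr                                      # Python 2
except NameError:
    unichr = chr                                # Python 3

# "&#39;", "&#x27;" or a named entity such as "&aelig;".
ENTITY_RX = re.compile(r'&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);')

def replace_entity(match):
    body = match.group(1)
    try:
        if body[:2] in ('#x', '#X'):
            return unichr(int(body[2:], 16))    # hexadecimal reference
        if body.startswith('#'):
            return unichr(int(body[1:]))        # decimal reference
        return unichr(name2codepoint[body])     # named entity
    except (KeyError, ValueError, OverflowError):
        # Unknown name or out-of-range number: leave the text untouched.
        return match.group(0)

def dehtml(s):
    return ENTITY_RX.sub(replace_entity, s)

sample = dehtml(u'it&#39;s fr&aelig;kke &#x41; &bogus; &amp;')
```

On Python 3.4 and later the standard library function html.unescape() covers the same ground, but that does not help with Python 2.4.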



Re: unescape HTML entities

2006-11-01 Thread Rares Vernica
Hi,

Nice module!

I downloaded 2.3 and started to play with it. The files have funny 
names: they are all caps, including the extension.

For example, the main module file is "SE.PY". If you try "import SE" it 
will not work, as Python expects the file extension to be "py".

Thanks,
Ray



Re: unescape HTML entities

2006-11-01 Thread Rares Vernica
Hi,

I downloaded 2.2 beta, just to be sure I have the same version as you 
specified. (The file names are no longer funny.) Anyway, it does not seem 
to do as you said:

In [14]: import SE

In [15]: SE.version
---> SE.version()
Out[15]: 'SE 2.2 beta - SEL 2.2 beta'

In [16]: HTM_Decoder = SE.SE ('HTM2ISO.se')

In [17]: test_string = '''
: ø=(xf8)   #  248  f8
: ù=(xf9)   #  249  f9
: ú=(xfa)   #  250  fa
: û=(xfb)#  251  fb
: ü=(xfc) #  252  fc
: ý=(xfd)   #  253  fd
: þ=(xfe)#  254  fe
: é=(xe9)
: ê=(xea)
: ë=(xeb)
: ì=(xec)
: í=(xed)
: î=(xee)
: ï=(xef)
: '''

In [18]: print HTM_Decoder (test_string)

ø=(xf8)   #  248  f8
ù=(xf9)   #  249  f9
ú=(xfa)   #  250  fa
û=(xfb)#  251  fb
ü=(xfc) #  252  fc
ý=(xfd)   #  253  fd
þ=(xfe)#  254  fe
é=(xe9)
ê=(xea)
ë=(xeb)
ì=(xec)
í=(xed)
î=(xee)
ï=(xef)


In [19]:

Thanks,
Ray





Re: Physical constants

2006-11-03 Thread Rares Vernica
Hi,

I am not sure how the constants are implemented in math, but here is how 
I would do it. The main idea is to declare the constants as globals in 
some file.

Declare all the constants in a file:
const.py
---
pi = 3.14

Whenever you want to use pi from another file, just do:
somecode.py
---
from const import pi

a = 2 * pi
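
Here is a slightly fuller sketch of the same idea, squeezed into one file so it can be run as is. In a real project the constants would sit in their own module (const.py above) and be imported where needed. The values are approximate and the Orbit class is invented for the example. As far as the caller is concerned, math.pi and math.e are simply module-level floats used the same way.

```python
import math

# In a real project these would live in const.py and be pulled in with
# "from const import G, M_SUN, AU". Values are approximate, SI units.
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30    # solar mass, kg
AU = 1.496e11       # astronomical unit, m

class Orbit(object):
    """Hypothetical class: a small body on an orbit around the Sun."""

    def __init__(self, semi_major_axis):
        self.a = semi_major_axis

    def period(self):
        # Kepler's third law; the constants are read from module scope,
        # so nothing has to be passed down or instantiated.
        return 2 * math.pi * math.sqrt(self.a ** 3 / (G * M_SUN))

earth_days = Orbit(AU).period() / 86400.0
```

Any class or function in the program can use G the same way, without creating an instance of anything.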

Regards,
Ray

Tommy Grav wrote:
> I have some code for doing orbital computations. The code is kind of
> extensive with many classes, each having several functions. In these
> functions I need to use constants (like the gravitational constant). 
> What is the best way of implementing a solution when constants are
> used in several different classes and functions? I do not want to 
> pass the constant down through the functions. I have thought of
> making a class of constants but I do not want to create an 
> instance in each function. How are the pi and e constants in math
> coded?
> 
> Tommy
> 
> [EMAIL PROTECTED] 
> 
> http://homepage.mac.com/tgrav/
> 
> 
> "Any intelligent fool can make things bigger, 
> more complex, and more violent. It takes a 
> touch of genious -- and a lot of courage -- 
> to move in the opposite direction"
>  -- Albert Einstein
> 
> 



Re: use of (a**b) (a*b)

2006-11-07 Thread Rares Vernica
Hi,

Check out some examples:
In [16]: 9./2
Out[16]: 4.5

In [17]: 9.//2
Out[17]: 4.0

In [18]: 2*3
Out[18]: 6

In [19]: 2**3
Out[19]: 8

Here is the documentation for these operations:
http://docs.python.org/lib/typesnumeric.html
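
Applied to the expression from the question, a small sketch (the value of s is arbitrary and the variable names are only for illustration):

```python
s = 4

product = 3 * s * 2     # plain multiplication: 3 * 4 * 2
power = 3 * s ** 2      # ** binds tighter than *, so this is 3 * (4 ** 2)

true_div = 9.0 / 2      # ordinary division of floats
floor_div = 9.0 // 2    # floor division; the result is still a float
negative = -9 // 2      # floors toward negative infinity, not toward zero

print(product)
print(power)
print(true_div)
print(floor_div)
print(negative)
```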

Regards,
Ray

Santosh Chikkerur wrote:
> Hi Friends,
> Please explain the use of the ' ** ' operator and of ' // '.
> 
> In f=(3*s**2), how is the result different if only a single '*' is 
> used? The same question applies to the divide operator.
> 
> 
> Thanks in advance,
> Santosh
> 
> 



remove a list from a list

2006-11-17 Thread Rares Vernica
Hi,

I have the following problem:

I have a list like
   e = ['a', 'b', 'e']
and another list like
   l = ['A', 'a', 'c', 'D', 'E']
I would like to remove from l all the elements that appear in e 
case-insensitive. That is, the result would be
   r = ['c', 'D']

What is a *nice* way of doing it?

Thanks a lot,
Ray



Re: remove a list from a list

2006-11-17 Thread Rares Vernica
That is a nice solution.

But, how about modifying the list in place?

That is, l would become ['c', 'D'].

Thanks a lot,
Ray

Tim Chase wrote:
>> I have a list like
>>e = ['a', 'b', 'e']
>> and another list like
>>l = ['A', 'a', 'c', 'D', 'E']
>> I would like to remove from l all the elements that appear in e 
>> case-insensitive. That is, the result would be
>>r = ['c', 'D']
>>
>> What is a *nice* way of doing it?
> 
> 
> Well, it's usually advantageous (for speed purposes) to make a 
> set out of your lookup data.  One can then use it for a list 
> comprehension something like
> 
>  >>> e = ['a', 'b', 'e']
>  >>> l = ['A', 'a', 'c', 'D', 'E']
>  >>> s = set(e)
>  >>> [x for x in l if x.lower() not in s]
> ['c', 'D']
> 
> This presumes that "e" is all lowercase letters.  Otherwise, you 
> can force it with
> 
>   s = set(c.lower() for c in e)
> 
> -tkc
> 
> 
> 
> 
> 
> 



Re: remove a list from a list

2006-11-17 Thread Rares Vernica
Yeah, I ended up doing a similar kind of loop. That is pretty messy.

Is there any other way?

Thanks,
Ray

Tim Chase wrote:
>> That is a nice solution.
>>
>> But, how about modifying the list in place?
>>
>> That is, l would become ['c', 'D'].
>>
>>>  >>> e = ['a', 'b', 'e']
>>>  >>> l = ['A', 'a', 'c', 'D', 'E']
>>>  >>> s = set(e)
>>>  >>> [x for x in l if x.lower() not in s]
>>> ['c', 'D']
> 
> 
> Well...changing the requirements midstream, eh? ;-)
> 
> You can just change that last item to be a reassignment if "l" is 
> all you care about:
> 
>  >>> l = [x for x in l ...]
> 
> Things get a bit hairier if you *must* do it in-place.  You'd 
> have to do something like this (untested)
> 
> for i in xrange(len(l), 0, -1):
>     if l[i-1].lower() in s:
>         del l[i-1]
> 
> 
> which should do the job.
> 
> -tkc
> 
> 
> 
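
A short sketch of why the in-place loop runs backwards. Both function names are made up for this example, and the first one is deliberately wrong: deleting while walking forward shifts the remaining items left, so the element that slides into the freed slot is never examined.

```python
def remove_forward_buggy(items, unwanted):
    """Deliberately flawed: skips the item that follows each deletion."""
    for i, item in enumerate(items):
        if item.lower() in unwanted:
            del items[i]

def remove_backward(items, unwanted):
    """Delete from the end so earlier indexes stay valid."""
    for i in range(len(items) - 1, -1, -1):
        if items[i].lower() in unwanted:
            del items[i]

s = set(['a', 'b', 'e'])

forward = ['A', 'a', 'c', 'D', 'E']
remove_forward_buggy(forward, s)    # 'a' survives: it moved into slot 0

backward = ['A', 'a', 'c', 'D', 'E']
remove_backward(backward, s)
```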



Re: remove a list from a list

2006-11-17 Thread Rares Vernica
Sorry for not being clear from the beginning and for not using clear 
variable names.

Problem context:

import os
dirs_exclude = set(('a', 'b', 'e'))
for root, dirs, files in os.walk('python/Lib/email'):
    # Task:
    # delete from "dirs" the directory names from "dirs_exclude"
    # case-insensitive

The solution so far is:

for i in xrange(len(dirs), 0, -1):
    if dirs[i-1].lower() in dirs_exclude:
        del dirs[i-1]

I am looking for a nicer solution.

Thanks a lot,
Ray

Tim Chase wrote:
>> Yeah, I ended up doing a similar kind of loop. That is pretty messy.
>>
>> Is there any other way?
> 
> I've already provided 2 (or 3 depending on how one counts) 
> solutions, each of which solve an interpretation of your original 
> problem, neither of which involve more than 3 lines of fairly 
> clean code.  Perhaps a little more context regarding what you 
> *want* to do would help.  However, I suspect that answer is 
> "there is no *cleaner* way to do it".
> 
> Unless you're modifying an existing list that is referenced 
> elsewhere, the reassignment (l = [x for x in l ...]) solution 
> should work just fine.  Thus, unless you have a situation akin to:
> 
>   g = l
>   l = [x for x in l if x.lower() not in s]
>   assert(thing_from_s not in g)
> 
> then just reassign "l".  If not, use the loop.  It's that easy 
> and clean.  Don't try to opaquify it by collapsing it further. 
> Perhaps, if your loop is messy, use my clean loop suggestion.
> 
> -tkc
> 
> 
> 



Re: remove a list from a list

2006-11-17 Thread Rares Vernica
The problem with skipping over them is that "walk" would still walk them 
and their content. If they have a lot of other dirs and files inside 
then this might end up being time consuming.

Thanks,
Ray

Neil Cerutti wrote:
> On 2006-11-17, Rares Vernica <[EMAIL PROTECTED]> wrote:
>> Sorry for not being clear from the beginning and for not using
>> clear variable names.
>>
>> Problem context:
>>
>> import os
>> dirs_exclude = set(('a', 'b', 'e'))
>> for root, dirs, files in os.walk('python/Lib/email'):
>>  # Task:
>>  # delete from "dirs" the directory names from "dirs_exclude"
>>  # case-insensitive
>>
>> The solution so far is:
>>
>> for i in xrange(len(dirs), 0, -1):
>>     if dirs[i-1].lower() in dirs_exclude:
>>         del dirs[i-1]
>>
>> I am looking for a nicer solution.
> 
> I'd probably just skip over those dirs as I came to them instead of
> troubling about mutating the list, unless the list is needed in
> more than one place.
> 



Re: remove a list from a list

2006-11-17 Thread Rares Vernica
This solution I think is pretty nice:

source[:] = [x for x in source if x.lower() not in target]
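
In the os.walk context it might look like the sketch below. The function name walk_pruned is made up, and dirs_exclude is assumed to hold lowercase names. The slice assignment matters because os.walk (in its default top-down mode) recurses into whatever is left in the very same "dirs" list; rebinding the name with "dirs = [...]" would not prune anything.

```python
import os

def walk_pruned(top, dirs_exclude):
    """Yield (root, files) pairs without descending into excluded dirs."""
    for root, dirs, files in os.walk(top):  # topdown=True by default
        # Mutate the list object that os.walk itself holds on to.
        dirs[:] = [d for d in dirs if d.lower() not in dirs_exclude]
        yield root, files
```

For example, walk_pruned('python/Lib/email', set(['a', 'b', 'e'])) never enters a directory named 'A' or 'a', so its content is not walked at all.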

Thanks a lot for all the answers,
Ray

Steven D'Aprano wrote:
> On Fri, 17 Nov 2006 12:00:46 -0800, Rares Vernica wrote:
> 
>> Problem context:
>>
>> import os
>> dirs_exclude = set(('a', 'b', 'e'))
>> for root, dirs, files in os.walk('python/Lib/email'):
>>  # Task:
>>  # delete from "dirs" the directory names from "dirs_exclude"
>>  # case-insensitive
>>
>> The solution so far is:
>>
>> for i in xrange(len(dirs), 0, -1):
>>     if dirs[i-1].lower() in dirs_exclude:
>>         del dirs[i-1]
>>
>> I am looking for a nicer solution.
> 
> Define "nicer".
> 
> First thing I'd do is change the loop:
> 
> for i in xrange(len(dirs)-1, -1, -1):
>     if dirs[i].lower() in dirs_exclude:
>         del dirs[i]
> 
> Second thing I'd do is encapsulate it in a function instead of calling it
> in place:
> 
> def remove_in_place(source, target):
>     for i in xrange(len(source)-1, -1, -1):
>         if source[i].lower() in target:
>             del source[i]
> 
> Third thing I'd do is throw the delete-in-place code away and build a
> new list using the set idiom, finally using list slicing to change the
> source in place:
> 
> def remove_in_place2(source, target):
>     target = set(s.lower() for s in target)
>     source[:] = [x for x in source if x.lower() not in target]
>     # note the assignment to a slice
> 
> And finally, I would test the two versions remove_in_place and
> remove_in_place2 to see which is faster.
> 
> 
> import timeit
> 
> setup = """from __main__ import remove_in_place
> target = list("aEIOu")
> source = list("AbcdEfghIjklmnOpqrstUvwxyz")
> """
> 
> tester = """tmplist = source[:] # make a copy of the list!
> remove_in_place(tmplist, target)
> """
> 
> timeit.Timer(tester, setup).timeit()
> 
> You have to make a copy of the list on every iteration because you are
> changing it in place; otherwise you change the values you are testing
> against, and the second iteration onwards doesn't have to remove anything.
> 
> 
> (All code above untested. Use at own risk.)
> 



re.search

2006-03-05 Thread Rares Vernica
Hi,

Isn't the following code supposed to return ('1994')?

 >>> re.search('(\d{4})?', '4 1994').groups()
(None,)

Thanks,
Ray
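
A small sketch of what seems to be going on, using only the re module. Because of the trailing "?", the whole pattern is allowed to match the empty string, and re.search stops at the first position where the pattern matches, which is position 0. There are no four digits there, so the group takes no part in the match and is reported as None. The variable names are only for illustration.

```python
import re

text = '4 1994'

# Optional group: succeeds at position 0 with an empty match.
optional = re.search(r'(\d{4})?', text)

# Required group: the search has to move on until four digits are found.
required = re.search(r'(\d{4})', text)

print(optional.span(), optional.groups())
print(required.span(), required.groups())
```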


Unicode error handler

2007-01-26 Thread Rares Vernica
Hi,

Does anyone know of any Unicode encode/decode error handler that does a 
better replace job than the default replace error handler?

For example I have an iso-8859-1 string that has an 'e' with an accent 
(you know, the French 'e's). When I use s.encode('ascii', 'replace') the 
'e' will be replaced with '?'. I would prefer it to be replaced with an 
'e' even if I know it is not 100% correct.

If only this letter were the problem I would do it manually, but 
there is an entire set of letters that need to be replaced with their 
closest ascii letter.

Is there an encode/decode error handler that can replace all the 
not-ascii letters from iso-8859-1 with their closest ascii letter?

Thanks a lot,
Ray



Re: Unicode error handler

2007-01-26 Thread Rares Vernica
It does the job.

Thanks a lot,
Ray

Peter Otten wrote:
> Rares Vernica wrote:
> 
>> Is there an encode/decode error handler that can replace all the
>> not-ascii letters from iso-8859-1 with their closest ascii letter?
> 
> A mapping, not an error handler, but it might do the job:
> 
> http://effbot.org/zone/unicode-convert.htm
> 
> Peter
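
For the record, an actual error handler can also be registered with codecs.register_error(). The following is only a sketch under some assumptions: the handler name 'closest_ascii' is made up, and "closest letter" is approximated by Unicode NFKD decomposition with the combining marks thrown away. Letters with no such decomposition (the German sharp s, for example) still come out as '?'. An iso-8859-1 byte string has to be decoded to Unicode before it can be encoded this way.

```python
import codecs
import unicodedata

def closest_ascii(error):
    """Error handler: swap each unencodable character for its base letter."""
    if not isinstance(error, UnicodeEncodeError):
        raise error
    pieces = []
    for char in error.object[error.start:error.end]:
        # NFKD splits u'\xe9' into 'e' plus a combining accent.
        decomposed = unicodedata.normalize('NFKD', char)
        base = u''.join(c for c in decomposed if ord(c) < 128)
        pieces.append(base or u'?')
    # Return the replacement text and the position where encoding resumes.
    return (u''.join(pieces), error.end)

# 'closest_ascii' is a made-up handler name for this sketch.
codecs.register_error('closest_ascii', closest_ascii)

encoded = u'caf\xe9 na\xefve'.encode('ascii', 'closest_ascii')
```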



Re: locale, format monetary values

2006-04-18 Thread Rares Vernica
That's it. Thanks a lot!

There is no example of locale.format in the docs, so I was confused.

Regards,
Ray

deelan wrote:
> Rares Vernica wrote:
>> Hi,
>>
>> Can I use locale to format monetary values? If yes, how? If no, is 
>> there something I can use?
>>
>> E.g.,
>> I have 10000 and I want to get "$10,000".
> 
> try something like:
> 
>  >>> import locale
>  >>> locale.setlocale(locale.LC_ALL, "en-US")
> 'English_United States.1252'
>  >>> locale.format("%.2f", 10000, True)
> '10,000.00'
>  >>> locale.format("$%.2f", 10000, True)
> '$10,000.00'
> 
> bye.
> 
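
A hedged sketch of an alternative: newer Python versions (2.5 and later) also have locale.currency(), which adds the currency symbol itself. Locale names are platform-specific, so 'en_US.UTF-8' below is an assumption that holds on many Linux systems but not, for example, on Windows, where the quoted example uses "en-US". The function name format_dollars and the fallback branch are invented for this example, and the fallback needs a Python with str.format.

```python
import locale

def format_dollars(amount):
    """Format amount as US dollars with grouping and two decimals."""
    try:
        # Note: setlocale changes the locale for the whole process.
        locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
        return locale.currency(amount, grouping=True)
    except (locale.Error, ValueError):
        # Fallback that does not depend on any installed locale.
        return '${:,.2f}'.format(amount)

print(format_dollars(10000))
```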


Re: which datastructure for fast sorted insert?

2008-05-25 Thread Rares Vernica
use a set to store them:

>>> s=set()
>>> s.add('a')
>>> s.add('b')
>>> s
set(['a', 'b'])
>>> s.add('a')
>>> s
set(['a', 'b'])
>>> s.add('c')
>>> s
set(['a', 'c', 'b'])
>>> 

it does remove duplicates, but it is not ordered. to order it you can
use:

>>> l=list(s)
>>> l.sort()
>>> l
['a', 'b', 'c']
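
if the list has to stay sorted after every insert, another option is the bisect module from the standard library. this is only a sketch (the function name is made up): finding the position is a binary search, but the list insert itself still has to shift elements, so it is not free for very large lists.

```python
import bisect

def insert_sorted_unique(items, value):
    """Insert value into the already sorted list, skipping duplicates."""
    pos = bisect.bisect_left(items, value)
    if pos == len(items) or items[pos] != value:
        items.insert(pos, value)

words = []
for w in ['b', 'a', 'c', 'a']:
    insert_sorted_unique(words, w)

print(words)
```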

hth,
Rares


readline support

2006-04-03 Thread Rares Vernica
Hi,

I am trying to get readline support in python.

I am working on Linux and I have the latest version from svn. After I 
./configure and make I can run python, but the readline support is not 
there.

If I do:
%./configure >& out.txt
%grep readline out.txt
checking for readline in -lreadline... yes
checking for rl_callback_handler_install in -lreadline... yes
checking for rl_pre_input_hook in -lreadline... yes
checking for rl_completion_matches in -lreadline... yes

it seems that readline library is there and is found by python.

How can I further track down this issue?

Thanks,
Ray
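
One hedged way to narrow this down is to ask the freshly built interpreter directly whether the module imports and, if it does not, what the error message says. That configure finds the library does not by itself show that the readline extension module was built. The helper name try_import is made up for this sketch.

```python
def try_import(name):
    """Return (True, module) on success or (False, error message)."""
    try:
        return True, __import__(name)
    except ImportError as exc:
        return False, str(exc)

ok, detail = try_import('readline')
print(ok)
print(detail)
```

Running this with the new ./python binary rather than the system one shows which build is actually missing the module.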


locale, format monetary values

2006-04-16 Thread Rares Vernica
Hi,

Can I use locale to format monetary values? If yes, how? If no, is there 
something I can use?

E.g.,
I have 10000 and I want to get "$10,000".

Thanks,
Ray