Rares Vernica wrote:
> Hi,
>
> Nice module!
>
> I downloaded 2.3 and I started to play with it. The file names have
> funny names, they are all caps, including extension.
>
> For example the main module file is "SE.PY". If you try "import SE" it
> will not work as Python expects the file extension to be "py".
>
> Thanks,
> Ray
>
> Frederic Rentsch wrote:
>
>> Rares Vernica wrote:
>>
>>> Hi,
>>>
>>> How can I unescape HTML entities like "&nbsp;"?
>>>
>>> I know about xml.sax.saxutils.unescape() but it only deals with "&amp;",
>>> "&lt;", and "&gt;".
>>>
>>> Also, I know about htmlentitydefs.entitydefs, but not only this
>>> dictionary is the opposite of what I need, it does not have "&nbsp;".
>>>
>>> It has to be in python 2.4.
>>>
>>> Thanks a lot,
>>> Ray
>>>
>> One way is this:
>>
>> >>> import SE  # Download from http://cheeseshop.python.org/pypi/SE/2.2%20beta
>> >>> SE.SE ('HTM2ISO.se')('input_file_name', 'output_file_name')  # HTM2ISO.se is included
>> 'output_file_name'
>>
>> For repeated translations the SE object would be assigned to a variable:
>>
>> >>> HTM_Decoder = SE.SE ('HTM2ISO.se')
>>
>> SE objects take and return strings as well as file names which is useful
>> for translating string variables, doing line-by-line translations and
>> for interactive development or verification. A simple way to check a
>> substitution set is to use its definitions as test data.
>> The following is a section of the definition file HTM2ISO.se:
>>
>> >>> test_string = '''
>> &oslash;=(xf8) # 248 f8
>> &ugrave;=(xf9) # 249 f9
>> &uacute;=(xfa) # 250 fa
>> &ucirc;=(xfb) # 251 fb
>> &uuml;=(xfc) # 252 fc
>> &yacute;=(xfd) # 253 fd
>> &thorn;=(xfe) # 254 fe
>> &#233;=(xe9)
>> &#234;=(xea)
>> &#235;=(xeb)
>> &#236;=(xec)
>> &#237;=(xed)
>> &#238;=(xee)
>> &#239;=(xef)
>> '''
>>
>> >>> print HTM_Decoder (test_string)
>>
>> ø=(xf8) # 248 f8
>> ù=(xf9) # 249 f9
>> ú=(xfa) # 250 fa
>> û=(xfb) # 251 fb
>> ü=(xfc) # 252 fc
>> ý=(xfd) # 253 fd
>> þ=(xfe) # 254 fe
>> é=(xe9)
>> ê=(xea)
>> ë=(xeb)
>> ì=(xec)
>> í=(xed)
>> î=(xee)
>> ï=(xef)
>>
>> Another feature of SE is modularity.
>>
>> >>> strip_tags = '''
>> ~<(.|\x0a)*?>~=(9) # one tag to one tab
>> ~<!--(.|\x0a)*?-->~=(9) # one comment to one tab
>> | # run
>> "~\x0a[ \x09\x0d\x0a]*~=(x0a)" # delete empty lines
>> ~\t+~=(32) # one or more tabs to one space
>> ~\x20\t+~=(32) # one space and one or more tabs to one space
>> ~\t+\x20~=(32) # one or more tab and one space to one space
>> '''
>>
>> >>> HTM_Stripper_Decoder = SE.SE (strip_tags + ' HTM2ISO.se ')  # Order doesn't matter
>>
>> If you write 'strip_tags' to a file, say 'STRIP_TAGS.se', you'd name it
>> together with HTM2ISO.se:
>>
>> >>> HTM_Stripper_Decoder = SE.SE ('STRIP_TAGS.se HTM2ISO.se')  # Order doesn't matter
>>
>> Or, if you have two SE objects, one for stripping tags and one for
>> decoding the ampersands, you can nest them like this:
>>
>> >>> test_string = "<p class=MsoNormal style='line-height:110%'><i>Ren&eacute;</i> est un gar&ccedil;on qui para&icirc;t plus &acirc;g&eacute;. </p>"
>>
>> >>> print Tag_Stripper (HTM_Decoder (test_string))
>> René est un garçon qui paraît plus âgé.
>>
>> Nesting works with file names too, because file names are returned:
>>
>> >>> Tag_Stripper (HTM_Decoder ('input_file_name'), 'output_file_name')
>> 'output_file_name'
>>
>>
>> Frederic
>>
>

Arrrgh!
Did it again, capitalizing extensions. We had solved this problem and here we
have it again. I am so sorry. Fortunately it isn't hard to solve: renaming the
files fixes it once one identifies the problem, which you did. I shall change
the upload within the next sixty seconds.

Frederic

I'm glad you find it useful.

--
http://mail.python.org/mailman/listinfo/python-list
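
For readers who only need the entity-decoding step from the original question, here is a minimal sketch that uses nothing but the standard library. It is not how SE works and it does not strip tags. It also assumes Python 3, whereas the thread asks for Python 2.4; there the same table should be available as htmlentitydefs.name2codepoint, and unichr() would take the place of chr(). The name unescape_entities is made up for this illustration.

```python
import re
from html.entities import name2codepoint

# Matches named (&eacute;), decimal (&#233;) and hexadecimal (&#xe9;) references.
_ENTITY = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def unescape_entities(text):
    """Replace HTML character references in text with the characters they name.

    Hypothetical helper for illustration; unknown references are left untouched.
    """
    def replace(match):
        ref = match.group(1)
        try:
            if ref[:2].lower() == "#x":
                return chr(int(ref[2:], 16))   # hexadecimal reference
            if ref.startswith("#"):
                return chr(int(ref[1:]))       # decimal reference
            return chr(name2codepoint[ref])    # named reference
        except (KeyError, ValueError, OverflowError):
            return match.group(0)              # unknown or out of range: keep as is

    return _ENTITY.sub(replace, text)


print(unescape_entities("gar&ccedil;on &amp; &#233;t&#xe9;"))  # garçon & été
```

The substitution runs in a single pass, so "&amp;lt;" comes out as "&lt;" rather than being decoded twice. On Python 3.4 and later, html.unescape() from the standard library does the same job and covers more edge cases, so a hand-written version like this is mainly useful for seeing what is going on.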