[Stefan Behnel] wrote:

>Martin Bless wrote:
>> What's a good way to encode and decode those entities like &euro; or
>> &#8364; ?
>
>Hmm, since you provide code, I'm not quite sure what your actual question is.

- What's a GOOD way?
- Am I reinventing the wheel?
- Are there well tested, fast, state of the art, builtin ways?
- Is something like line.decode('htmlentities') out there?
- Am I in conformity with relevant RFCs? (I'm hoping so ...)

>So I'll just comment on the code here.
>
>
>> def entity2uc(entity):
>>     """Convert entity like &#123; to unichr.
>>
>>     Return (result,True) on success or (input string, False)
>>     otherwise. Example:
>>         entity2cp('&euro;')   -> (u'\u20ac',True)
>>         entity2cp('&#8364;')  -> (u'\u20ac',True)
>>         entity2cp('&#x20ac;') -> (u'\u20ac',True)
>>         entity2cp('&foobar;') -> ('&foobar;',False)
>>     """
>
>Is there a reason why you return a tuple instead of just returning the
>converted result and raising an exception if the conversion fails?

Mainly a matter of style. When I use the function later, the tuple
makes it unambiguously clear that some entities may have been left
unconverted, but I don't have to deal with the details of how that was
discovered. And maybe I'd like to change the algorithm in the future?
This way it's nicely encapsulated.

Have a nice day

Martin
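
For reference, here is a minimal sketch of the two return styles discussed above. It is not the code from the original post (which is not shown in full here), and it assumes Python 3, where `chr` and `html.entities.name2codepoint` take the place of the Python 2 `unichr` and `htmlentitydefs` that the thread's code would have used. The regular expression and the helper name `entity2uc_strict` are made up for illustration.

```python
import re
from html.entities import name2codepoint

# Hypothetical Python 3 sketch; the thread's own code targets Python 2.
# Matches one complete entity: &name;  &#123;  or  &#x7B;
_ENTITY_RE = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def entity2uc(entity):
    """Tuple style: return (character, True) or (input, False)."""
    match = _ENTITY_RE.fullmatch(entity)
    if match is None:
        return entity, False
    body = match.group(1)
    try:
        if body[:2] in ("#x", "#X"):
            return chr(int(body[2:], 16)), True   # hexadecimal reference
        if body.startswith("#"):
            return chr(int(body[1:])), True       # decimal reference
        return chr(name2codepoint[body]), True    # named entity
    except (KeyError, ValueError, OverflowError):
        # Unknown name, or a number outside the Unicode range.
        return entity, False


def entity2uc_strict(entity):
    """Exception style, as suggested in the quoted reply."""
    result, ok = entity2uc(entity)
    if not ok:
        raise ValueError("cannot convert entity: %r" % (entity,))
    return result


print(entity2uc("&euro;"))     # ('€', True)
print(entity2uc("&#x20AC;"))   # ('€', True)
print(entity2uc("&foobar;"))   # ('&foobar;', False)
```

With the tuple style the caller has to check the flag; with the exception style a failure cannot be ignored by accident. On the "builtin ways" question: Python 3.4 and later ship `html.unescape`, which decodes all entities in a string in one call, so a hand-written function like the one above is mostly of interest when per-entity success reporting is needed.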