On 3 Jul., 06:40, Simon Forman <sajmik...@gmail.com> wrote:
> On Jul 2, 4:31 am, Tep <petshm...@googlemail.com> wrote:
>
> > On 2 Jul., 10:25, Tep <petshm...@googlemail.com> wrote:
>
> > > On 2 Jul., 01:56, MRAB <pyt...@mrabarnett.plus.com> wrote:
>
> > > > someone wrote:
> > > > > Hello,
>
> > > > > how can I replace '—' sign from string? Or do split at that character?
> > > > > Getting unicode error if I try to do it:
>
> > > > > UnicodeDecodeError: 'ascii' codec can't decode byte 0x97 in position
> > > > > 1: ordinal not in range(128)
>
> > > > > Thanks, Pet
>
> > > > > script is # -*- coding: UTF-8 -*-
>
> > > > It sounds like you're mixing bytestrings with Unicode strings. I can't
> > > > be any more helpful because you haven't shown the code.
>
> > > Oh, I'm sorry. Here it is
>
> > > def cleanInput(input):
> > >     return input.replace('—', '')
>
> > I also need:
>
> > #input is html source code, I have problem with only this character
> > #input = 'foo — bar'
> > #return should be foo
> > def splitInput(input):
> >     parts = input.split(' — ')
> >     return parts[0]
>
> > Thanks!
>
> Okay people want to help you but you must make it easy for us.
>
> Post again with a small piece of code that is runnable as-is and that
> causes the traceback you're talking about, AND post the complete
> traceback too, as-is.
>
> I just tried a bit of your code above in my interpreter here and it
> worked fine:
>
> |>>> data = 'foo — bar'
> |>>> data.split('—')
> |['foo ', ' bar']
> |>>> data = u'foo — bar'
> |>>> data.split(u'—')
> |[u'foo ', u' bar']
>
> Figure out the smallest piece of "html source code" that causes the
> problem and include that with your next post.

The problem was that I had converted the "html source code" to a unicode
object and didn't encode it back to UTF-8 before using split...

Thanks for the help, and sorry for the not-so-smart question.

Pet

>
> HTH,
> ~Simon
>
> You might also read this: http://catb.org/esr/faqs/smart-questions.html

--
http://mail.python.org/mailman/listinfo/python-list
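
For anyone landing on this thread later, here is a minimal sketch of the mismatch described above. It assumes Python 3, where mixing bytes and text fails with a TypeError instead of the Python 2 UnicodeDecodeError quoted in the thread. The names raw_html and split_input and the sample input are hypothetical; the real HTML source was never posted.

```python
# Hypothetical sample: bytes as they might arrive from a file or a socket.
# "\u2014" is the dash character the thread is about.
raw_html = "foo \u2014 bar".encode("utf-8")


def split_input(text):
    # Return the part of `text` before the first " <dash> " separator.
    return text.split(" \u2014 ")[0]


# Decode once at the boundary, then keep data and separator the same type.
text = raw_html.decode("utf-8")
first = split_input(text)
print(first)

# Mixing the two types is the kind of thing that goes wrong:
mixed_error = None
try:
    text.split(" \u2014 ".encode("utf-8"))  # str data, bytes separator
except TypeError as exc:
    mixed_error = exc
    print("mixing str and bytes failed:", exc)
```

Either side of the fix works, as long as the string and the separator are the same type: decode the input and split with a text separator (as here), or keep everything as encoded bytes, which is what the original poster ended up doing.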