On Jul 2, 4:31 am, Tep <petshm...@googlemail.com> wrote:
> On 2 Jul., 10:25, Tep <petshm...@googlemail.com> wrote:
>
> > On 2 Jul., 01:56, MRAB <pyt...@mrabarnett.plus.com> wrote:
> > > someone wrote:
> > > > Hello,
> > > >
> > > > how can I replace '—' sign from string? Or do split at that character?
> > > > Getting unicode error if I try to do it:
> > > >
> > > > UnicodeDecodeError: 'ascii' codec can't decode byte 0x97 in position
> > > > 1: ordinal not in range(128)
> > > >
> > > > Thanks, Pet
> > > >
> > > > script is # -*- coding: UTF-8 -*-
> > >
> > > It sounds like you're mixing bytestrings with Unicode strings. I can't
> > > be any more helpful because you haven't shown the code.
> >
> > Oh, I'm sorry. Here it is
> >
> > def cleanInput(input)
> >     return input.replace('—', '')
>
> I also need:
>
> #input is html source code, I have problem with only this character
> #input = 'foo — bar'
> #return should be foo
> def splitInput(input)
>     parts = input.split(' — ')
>     return parts[0]
>
> Thanks!

Okay people want to help you but you must make it easy for us.

Post again with a small piece of code that is runnable as-is and that
causes the traceback you're talking about, AND post the complete
traceback too, as-is.

I just tried a bit of your code above in my interpreter here and it
worked fine:

|>>> data = 'foo — bar'
|>>> data.split('—')
|['foo ', ' bar']
|>>> data = u'foo — bar'
|>>> data.split(u'—')
|[u'foo ', u' bar']

Figure out the smallest piece of "html source code" that causes the
problem and include that with your next post.

HTH,
~Simon

You might also read this:
http://catb.org/esr/faqs/smart-questions.html

-- 
http://mail.python.org/mailman/listinfo/python-list
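
P.S. A guess while we wait for the traceback, so treat it as a sketch and not a diagnosis: byte 0x97 is the em dash in Windows-1252, which suggests the html is a cp1252 bytestring that gets implicitly decoded as ASCII somewhere. If that is what is happening, decoding it once up front and working with Unicode from then on should avoid the error. The version below is written for Python 3, where bytes and str are separate types. The cp1252 encoding and the names to_text, clean_input and split_input are my assumptions, not something taken from your script.

```python
# Sketch for Python 3, where bytes and str are distinct types.
# Assumption: the html arrived as Windows-1252 bytes, in which 0x97 is the
# em dash (U+2014). The original poster has not confirmed the encoding.
EM_DASH = "\u2014"


def to_text(raw, encoding="cp1252"):
    """Decode raw bytes to text once, at the input boundary."""
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw  # already text, nothing to decode


def clean_input(raw):
    """Remove every em dash from the input."""
    return to_text(raw).replace(EM_DASH, "")


def split_input(raw):
    """Return the text before the first space-dash-space separator."""
    return to_text(raw).split(" " + EM_DASH + " ")[0]


raw = b"foo \x97 bar"  # byte 0x97, as named in the traceback

# Decoding as ASCII fails on 0x97, the same kind of error as in the report.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    print(exc)

print(repr(clean_input(raw)))  # 'foo  bar'
print(repr(split_input(raw)))  # 'foo'
```

On Python 2 the same idea would be input.decode('cp1252') followed by .replace(u'\u2014', u''), again assuming the data really is cp1252.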