Re: getting rid of —

Tep Fri, 03 Jul 2009 03:31:34 -0700

On 3 Jul., 06:40, Simon Forman <sajmik...@gmail.com> wrote:
> On Jul 2, 4:31 am, Tep <petshm...@googlemail.com> wrote:
>
> > On 2 Jul., 10:25, Tep <petshm...@googlemail.com> wrote:
>
> > > On 2 Jul., 01:56, MRAB <pyt...@mrabarnett.plus.com> wrote:
>
> > > > someone wrote:
> > > > > Hello,
>
> > > > > how can I remove the '—' sign from a string? Or split at that character?
> > > > > I get a unicode error if I try to do it:
>
> > > > > UnicodeDecodeError: 'ascii' codec can't decode byte 0x97 in position
> > > > > 1: ordinal not in range(128)
>
> > > > > Thanks, Pet
>
> > > > > script is # -*- coding: UTF-8 -*-
>
> > > > It sounds like you're mixing bytestrings with Unicode strings. I can't
> > > > be any more helpful because you haven't shown the code.
>
> > > Oh, I'm sorry. Here it is
>
> > > def cleanInput(input):
> > >     return input.replace('—', '')
>
> > I also need:
>
> > #input is html source code, I have problem with only this character
> > #input = 'foo — bar'
> > #return should be foo
> > def splitInput(input):
> >     parts = input.split(' — ')
> >     return parts[0]
>
> > Thanks!
>
> Okay people want to help you but you must make it easy for us.
>
> Post again with a small piece of code that is runnable as-is and that
> causes the traceback you're talking about, AND post the complete
> traceback too, as-is.
>
> I just tried a bit of your code above in my interpreter here and it
> worked fine:
>
> |>>> data = 'foo — bar'
> |>>> data.split('—')
> |['foo ', ' bar']
> |>>> data = u'foo — bar'
> |>>> data.split(u'—')
> |[u'foo ', u' bar']
>
> Figure out the smallest piece of "html source code" that causes the
> problem and include that with your next post.


The problem was that I had converted the "html source code" to a unicode
object and didn't encode it back to UTF-8 before calling split...
Thanks for the help, and sorry for the not-so-smart question.
Pet
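
For anyone finding this later, here is a rough sketch of the idea, not the
exact code from my script. It assumes the HTML arrives as UTF-8 bytes, and the
names split_input and clean_input are made up for illustration. Note that it
goes the other way from what I described above: instead of encoding back to
UTF-8, it decodes once and keeps both the text and the separator as unicode,
which avoids the mix of bytestrings and unicode strings that MRAB pointed out.

```python
# -*- coding: utf-8 -*-
# Sketch only: keep the text and the separator the same type (unicode here).
# The function names and the UTF-8 default are assumptions, not from the thread.

def split_input(html_bytes, encoding="utf-8"):
    """Return the part of the input before ' — ' (em dash with spaces)."""
    text = html_bytes.decode(encoding)   # bytes -> unicode, once, at the boundary
    return text.split(u" \u2014 ")[0]    # separator is unicode as well


def clean_input(html_bytes, encoding="utf-8"):
    """Return the input with every em dash removed."""
    text = html_bytes.decode(encoding)
    return text.replace(u"\u2014", u"")


raw = u"foo \u2014 bar".encode("utf-8")  # stand-in for the real HTML source
print(split_input(raw))
```

If the source is not really UTF-8, the decode step will need a different
encoding argument.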

>
> HTH,
> ~Simon
>
> You might also read this: http://catb.org/esr/faqs/smart-questions.html

-- 
http://mail.python.org/mailman/listinfo/python-list
