Re: getting rid of —

Simon Forman Thu, 02 Jul 2009 21:46:31 -0700

On Jul 2, 4:31 am, Tep <[email protected]> wrote:
> On 2 Jul., 10:25, Tep <[email protected]> wrote:
>
>
>
> > On 2 Jul., 01:56, MRAB <[email protected]> wrote:
>
> > > someone wrote:
> > > > Hello,
>
> > > > How can I remove the '—' sign from a string? Or split at that character?
> > > > I get a Unicode error if I try to do it:
>
> > > > UnicodeDecodeError: 'ascii' codec can't decode byte 0x97 in position
> > > > 1: ordinal not in range(128)
>
> > > > Thanks, Pet
>
> > > > script is # -*- coding: UTF-8 -*-
>
> > > It sounds like you're mixing bytestrings with Unicode strings. I can't
> > > be any more helpful because you haven't shown the code.
>
> > Oh, I'm sorry. Here it is
>
> > def cleanInput(input):
> >     return input.replace('—', '')
>
> I also need:
>
> #input is html source code, I have problem with only this character
> #input = 'foo — bar'
> #return should be foo
> def splitInput(input):
>     parts = input.split(' — ')
>     return parts[0]
>
> Thanks!


Okay, people want to help you, but you must make it easy for us.

Post again with a small piece of code that is runnable as-is and that
causes the traceback you're talking about, AND post the complete
traceback too, as-is.
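
For illustration, a report of that kind could be as small as the sketch
below. The input bytes are hypothetical (I don't know what your real data
looks like), and the explicit decode() only stands in for whatever your
code actually does when it fails.

```python
import traceback

# Hypothetical input: a byte string containing the byte 0x97.
raw = b'foo \x97 bar'

try:
    # Stand-in for the failing step: decoding non-ASCII bytes as ASCII.
    raw.decode('ascii')
except UnicodeDecodeError:
    # Capture the complete traceback so it can be pasted unedited.
    report = traceback.format_exc()
    print(report)
```

A handful of lines plus the full traceback text is all anyone needs in
order to reproduce the problem.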

I just tried a bit of your code above in my interpreter here and it
worked fine:

|>>> data = 'foo — bar'
|>>> data.split('—')
|['foo ', ' bar']
|>>> data = u'foo — bar'
|>>> data.split(u'—')
|[u'foo ', u' bar']

Figure out the smallest piece of "html source code" that causes the
problem and include that with your next post.
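
In the meantime, here is a guess at what may be going on, following MRAB's
remark about mixing bytestrings with Unicode strings. Byte 0x97 is the em
dash in Windows-1252, so the sketch below assumes your html arrives as
Windows-1252 bytes. That is only an assumption, and the sample data and
the function names clean_input and split_input are made up for the
example. The idea is to decode once, then work only with Unicode strings.

```python
# -*- coding: utf-8 -*-

# Hypothetical sample: bytes as they might come from a Windows-1252 page,
# where the em dash is the single byte 0x97.
raw = b'foo \x97 bar'

# Assumption: the source really is Windows-1252; check the page's charset.
text = raw.decode('cp1252')

EM_DASH = u'\u2014'

def clean_input(text):
    """Remove every em dash from a Unicode string."""
    return text.replace(EM_DASH, u'')

def split_input(text):
    """Return the part before the first ' <em dash> ' separator."""
    return text.split(u' ' + EM_DASH + u' ')[0]

print(repr(clean_input(text)))
print(repr(split_input(text)))
```

If your data turns out to be in some other encoding, only the decode()
call should need to change.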

HTH,
~Simon

You might also read this: http://catb.org/esr/faqs/smart-questions.html
-- 
http://mail.python.org/mailman/listinfo/python-list
