Re: Some questions about decode/encode

glacier Sun, 27 Jan 2008 02:21:31 -0800

On Jan 24, 3:29 pm, "Gabriel Genellina" <[EMAIL PROTECTED]> wrote:
> On Thu, 24 Jan 2008 04:52:22 -0200, glacier <[EMAIL PROTECTED]> wrote:
>
> > According to your reply, what will happen if I try to decode a long
> > string in separate pieces?
> > I mean:
> > ######################################
> > a='你好吗'*100000
> > s1 = u''
> > cur = 0
> > while cur < len(a):
> >     d = min(len(a)-cur,1023)
> >     s1 += a[cur:cur+d].decode('mbcs')
> >     cur += d
> > ######################################
>
> > Could the code above produce any bogus characters in s1?
>
> Don't do that. You might be splitting the input string at a point that is  
> not a character boundary. You won't get bogus output, decode will raise a  
> UnicodeDecodeError instead.
> You can control how errors are handled, see  
> http://docs.python.org/lib/string-methods.html#l2h-237
>
> --
> Gabriel Genellina


Thanks Gabriel,

I think I understand what will happen if I don't split the string at
a character boundary.
What I'm not sure about is whether the decode method itself can
mis-split the string at a boundary.
Can you tell me?
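
For reference, here is a minimal sketch of how I understand chunked
decoding could be done without cutting a character in half, using an
incremental decoder from the standard codecs module. The function name
decode_in_chunks and its parameters are my own, and 'gbk' is only a
stand-in assumption for 'mbcs' (which exists only on Windows), so treat
this as an illustration rather than a tested answer for mbcs:

```python
# -*- coding: utf-8 -*-
import codecs

def decode_in_chunks(data, encoding="gbk", chunk_size=1023):
    """Decode a byte string piece by piece (hypothetical helper)."""
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = []
    for cur in range(0, len(data), chunk_size):
        # The incremental decoder buffers a trailing incomplete
        # character and completes it when the next chunk arrives.
        parts.append(decoder.decode(data[cur:cur + chunk_size]))
    # final=True makes the decoder raise UnicodeDecodeError if the
    # input ends in the middle of a character.
    parts.append(decoder.decode(b"", final=True))
    return u"".join(parts)

# Each character is two bytes in GBK, so an odd chunk size of 1023
# bytes regularly ends in the middle of a character.
a = u"你好吗".encode("gbk") * 100000
s1 = decode_in_chunks(a)
```

If this is right, the result should equal decoding the whole string in
one call, because the decoder carries the leftover byte over to the
next chunk instead of decoding it on its own.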

Thanks a lot.
-- 
http://mail.python.org/mailman/listinfo/python-list
