"Rhamphoryncus" <[EMAIL PROTECTED]> writes:
> > > >   i = s.index(e) => s[i] = e
> > > > Then this algorithm is no longer guaranteed to work with strings.
> > > It never worked correctly on unicode strings anyway (which becomes the
> > > canonical string in python 3.0).
> >
> > What?!   Are you sure?  That sounds broken to me.
> 
> Nope, it's pretty fundamental to working with text, unicode only being
> an extreme example: there's a wide number of ways to break down a
> chunk of text, making the odds of "e" being any particular one fairly
> low.  Python's unicode type only makes this slightly worse, not
> promising any particular one is available.

I don't understand this.  I thought that Unicode was a character
coding system like ASCII, except with an enormous character set and
a bunch of different algorithms for encoding Unicode strings as byte
sequences.  But I've thought of those algorithms (UTF-8 and so forth)
as basically being kludgy data compression schemes, and Unicode
strings are still just sequences of code points.
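
To make that view concrete, here is a minimal sketch, assuming Python 3
(where str is a sequence of code points).  The sample string and the
variable names are made up for illustration.  It shows one string as
code points, the same string under two encodings, and the index
invariant from the quoted example for a single code point.

```python
import unicodedata

# Hypothetical sample text; U+00E9 is the precomposed "e with acute".
s = "caf\u00e9"

# The string itself is a sequence of code points.
code_points = [hex(ord(c)) for c in s]

# Encodings turn the same code points into different byte sequences.
utf8_bytes = s.encode("utf-8")
utf16_bytes = s.encode("utf-16-le")

# For a single code point e, s[s.index(e)] == e holds.
e = "\u00e9"
i = s.index(e)
found = s[i]

# The same visible text can also be spelled as "e" followed by a
# combining acute accent (NFD), a different sequence of code points.
decomposed = unicodedata.normalize("NFD", s)

print(code_points)
print(len(s), len(utf8_bytes), len(utf16_bytes), len(decomposed))
```

The decomposed form may be the kind of thing the quoted reply has in
mind: the text looks the same, but it is no longer the same sequence of
code points, so s and decomposed compare unequal.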
-- 
http://mail.python.org/mailman/listinfo/python-list
