"Rhamphoryncus" <[EMAIL PROTECTED]> writes: > > > > i = s.index(e) => s[i] = e > > > > Then this algorithm is no longer guaranteed to work with strings. > > > It never worked correctly on unicode strings anyway (which becomes the > > > canonical string in python 3.0). > > > > What?! Are you sure? That sounds broken to me. > > Nope, it's pretty fundamental to working with text, unicode only being > an extreme example: there's a wide number of ways to break down a > chunk of text, making the odds of "e" being any particular one fairly > low. Python's unicode type only makes this slightly worse, not > promising any particular one is available.
I don't understand this. I thought that unicode was a character coding system like ascii, except with an enormous character set combined with a bunch of different algorithms for encoding unicode strings as byte sequences. But I've thought of those algorithms (UTF-8 and so forth) as basically being kludgy data compression schemes, and unicode strings are still just sequences of code points.
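
For concreteness, here is a minimal sketch of the property being argued about, assuming Python 3 (where str is the unicode type). The helper name index_invariant_holds is made up for illustration. It only shows two things: that index() on a str searches for substrings rather than elements, and that a str is indexed by code point even though one visible character can be spelled with one code point or with several.

```python
import unicodedata


def index_invariant_holds(s, e):
    """Return True if s[s.index(e)] == e (hypothetical helper for this sketch)."""
    i = s.index(e)
    return s[i] == e


# Lists: index() searches for an element, so the property holds.
print(index_invariant_holds([10, 20, 30], 20))

# Strings: index() searches for a substring, so the property holds
# only when e is exactly one code point long.
print(index_invariant_holds("hello", "l"))
print(index_invariant_holds("hello", "ll"))

# A str is a sequence of code points, but the same visible character
# can be one code point (composed) or several (decomposed).
composed = "\u00e9"                                  # e-acute as one code point
decomposed = unicodedata.normalize("NFD", composed)  # 'e' followed by U+0301
print(len(composed), len(decomposed))

# Searching for the composed form misses the decomposed spelling.
print(composed in "caf" + decomposed)
```

Whether that last result counts as "broken" depends on which way of breaking the text into units you meant; normalizing both strings to the same form with unicodedata.normalize before searching is one common way to make the two spellings compare equal.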