On Sun, Mar 8, 2015 at 3:40 AM, Mark Lawrence <breamore...@yahoo.co.uk> wrote:
>> Here's an example:
>>
>>     b = b'\x80'
>>
>> Yes, it generates an exception. IOW, UTF-8 is not a bijective mapping
>> from str objects to bytes objects.
>>
>
> Python 2 might, Python 3 doesn't.

He was talking about this line of code:

    b.decode('utf-8').encode('utf-8') == b

With the above assignment, that does indeed throw an error - which is
correct behaviour.

Challenge: Figure out a byte-string input that will make this function
return True.

    def is_utf8_broken(b):
        return b.decode('utf-8').encode('utf-8') != b

Correct responses for this function are either False or raising an
exception.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
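
For anyone who wants to poke at the challenge, here is a minimal sketch, assuming Python 3 and its default strict error handling. The `probe` helper and the list of sample inputs are illustrative additions, not part of the original post; `is_utf8_broken` is copied from the challenge. It only shows the two "correct" outcomes (False or an exception) for a handful of inputs and proves nothing about all possible byte strings.

```python
def is_utf8_broken(b):
    # Same function as in the challenge.
    return b.decode('utf-8').encode('utf-8') != b


def probe(b):
    """Hypothetical helper: report 'True', 'False', or the exception name."""
    try:
        return str(is_utf8_broken(b))
    except UnicodeDecodeError:
        return 'UnicodeDecodeError'


# Illustrative inputs: valid UTF-8, then three invalid sequences
# (lone continuation byte, overlong NUL, encoded surrogate).
samples = [
    b'',
    b'abc',
    'h\u00e9llo \u20ac'.encode('utf-8'),
    b'\x80',
    b'\xc0\x80',
    b'\xed\xa0\x80',
]

results = {s: probe(s) for s in samples}
for s, outcome in results.items():
    print(repr(s), '->', outcome)
```

Valid UTF-8 round-trips unchanged, and the strict decoder rejects the malformed inputs before `encode` is ever reached, so none of these samples makes the function return True.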