I'd like to have a fast way to validate large amounts of string data as being UTF-8.
I don't see a fast way to do it in Python, though. unicode(s, 'utf-8').encode('utf-8') seems to catch invalid data at least some of the time (the unicode() part works but the encode() part bombs). I don't consider an RE-based solution to be fast.

GLib provides a routine to do this, and I am using GTK, so it's included in there somewhere, but I don't see a way to call GLib routines. I don't want to write another extension module.

Is there a (fast) Python function to validate UTF-8 data?
Is there some other fast way to validate UTF-8 data?
Is there a general way to call GLib functions?
________________________________________________________________________
TonyN.:' [EMAIL PROTECTED] ' <http://www.georgeanelson.com/>
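
For reference, the decode-based check described above can be sketched as follows. This is only a minimal illustration under assumptions: it is written for Python 3, where the data is a bytes object and bytes.decode() takes the place of the unicode() call, and the function name is_valid_utf8 is made up for the example. It relies on the strict decoder alone rather than the decode/encode round trip, and it makes no claim about speed on large inputs.

```python
def is_valid_utf8(data):
    """Return True if the bytes object `data` is well-formed UTF-8.

    Hypothetical helper for illustration; assumes Python 3 bytes input.
    """
    try:
        # Strict error handling is the default, so any malformed
        # sequence raises UnicodeDecodeError.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(is_valid_utf8(b"plain ascii"))
    print(is_valid_utf8(b"\xff\xfe not utf-8"))
```

The decoded string is thrown away, so this costs a temporary copy of the data; that may or may not matter for large inputs.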