Xah Lee wrote:
> how to represent the unicode "em space" in regex?
You will have to pass a Unicode literal as the regular expression, e.g.

    fracture = re.compile(u'\u2003*\\|\u2003*', re.U).split(myline)

Notice that, in raw Unicode literals, you can still use \u to escape
characters, so the backslash before | need not be doubled, e.g.

    fracture = re.compile(ur'\u2003*\|\u2003*', re.U).split(myline)

The flag goes to re.compile because the third positional argument of
re.split is maxsplit, not flags.

Regards,
Martin
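
For readers on Python 3, where every str literal is already Unicode and the ur'' prefix is no longer accepted, the same split might look like the sketch below. The function names and the sample line are made up for illustration and are not from the original post. The raw-string variant assumes Python 3.3 or later, where the re module itself interprets \u escapes in str patterns.

```python
import re


def split_on_bar(line):
    # Ordinary literal: Python turns \u2003 into the EM SPACE character
    # before re sees it; the backslash before | must be doubled.
    return re.split("\u2003*\\|\u2003*", line)


def split_on_bar_raw(line):
    # Raw literal: re receives the six characters \u2003 and decodes the
    # escape itself (assumes Python 3.3+).
    return re.split(r"\u2003*\|\u2003*", line)


# Hypothetical sample input: fields separated by | with optional em spaces.
myline = "alpha\u2003|\u2003beta|gamma\u2003\u2003|delta"
fracture = split_on_bar(myline)
print(fracture)  # ['alpha', 'beta', 'gamma', 'delta']
```

Both functions should give the same list; no flag is needed here, since the pattern only matches literal characters.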