After researching it some more, it seems that Django uses UTF-8 byte strings internally (as opposed to the actual Unicode strings that Python supports). So the following regular expression actually does work:
    r"^name/(?P<name>[^/]+)/$"

What is passed in the `name` parameter is a UTF-8 byte string, so if you need an actual Unicode string, you can do the following:

    unicode_name = name.decode('utf8')

The only thing I worried about initially was the possibility that one of the bytes making up a Chinese character could be `0x2F`, which is the code for a forward slash. Since the regular expression matches against the UTF-8 byte string, it treats each byte as an independent character and would treat such a byte as a forward slash. But after reading a bit about UTF-8, it turns out that `0x2F` never appears as anything but the forward slash: every byte of a multi-byte sequence has its high bit set, so it is always `0x80` or above.

Now my issue switches to the analogous one for the model layer. It seems that SQLite, at least, works fine with the UTF-8 byte strings Django gives it, and faithfully returns them when asked. But again, one has to remember to decode them into Unicode strings when needed, which is a bit annoying. And of course, if field lengths are counted in bytes, you have to remember to make your database fields three or four times longer than the character count suggests, since each Chinese character takes up three (occasionally four) bytes in UTF-8.
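
To make the byte-level behaviour concrete, here is a small standalone sketch. It does not go through Django at all: it applies the same pattern to a made-up request path with the `re` module. It is written for Python 3, where byte strings are spelled `b"..."`; in the Python 2 of this thread, a plain `str` plays that role. The path and the sample characters are my own assumptions, chosen only for illustration.

```python
import re

# Byte-level version of the URL pattern from the post.
pattern = re.compile(rb"^name/(?P<name>[^/]+)/$")

# Hypothetical request path with two Chinese characters, encoded as UTF-8.
path = "name/中文/".encode("utf-8")

match = pattern.match(path)
name = match.group("name")            # still a UTF-8 byte string
unicode_name = name.decode("utf-8")   # the actual Unicode string

# Every byte of a multi-byte UTF-8 sequence is 0x80 or above,
# so 0x2F can only ever be a real forward slash.
assert all(byte >= 0x80 for byte in name)

# Two characters, but six bytes: each one takes three bytes in UTF-8.
print(len(unicode_name), len(name))
```

The last line is the field-length point in miniature: the same value is 2 units long when counted in characters and 6 when counted in bytes.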