David Abrahams wrote:
> I've been running into a problem that seems very similar to
> http://code.djangoproject.com/ticket/170, although I see that that
> issue was fixed so I am betting the bug is on my end somewhere.
> Unfortunately, I'm a little green w.r.t. unicode issues so I'm hoping
> someone else can correct my misconceptions.
> 
> My app's templatetags/navigation.py file is enclosed.  If you look for
> the string ".title()" you can see where the title() method on my Page
> objects is getting called.  Unless I change that method to encode its
> result as ascii or utf-8, I get the exception.  Can anyone explain
> what's going on?  I suspect problems with mixing utf-8 and ascii
> encoded strings, but I'm really out of my depth here.

Looking at title() that you included:

     def title(self):
         return 'Home'

it already returns an ASCII-encoded byte string. But I suspect your 
actual class returns a unicode object, right?

If yes, then concatenating a unicode string with a byte string forces 
Python to decode the byte string into unicode implicitly, using its 
default encoding (normally ASCII). This automatic conversion is 
error-prone: it works while the byte strings happen to be pure ASCII and 
raises an exception as soon as one of them is not. So the conversion 
should always be done explicitly.
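
To make the failure mode concrete, here is a minimal sketch. It is 
written so that it also runs on current Python, where the implicit step 
no longer exists, so the helper concat_like_python2() spells out the 
decode that Python 2 performed silently for unicode + str. The helper 
name and the sample strings are made up for illustration; they are not 
taken from your navigation.py.

```python
def concat_like_python2(text, raw, default_encoding="ascii"):
    """Mimic Python 2's `unicode + str`: decode the bytes, then join."""
    return text + raw.decode(default_encoding)


# Hypothetical non-ASCII title, standing in for what title() might return.
title = u"\u0413\u043b\u0430\u0432\u043d\u0430\u044f"

ascii_markup = b"<li>"                      # pure ASCII bytes
utf8_markup = u"\u2192 ".encode("utf-8")    # bytes holding non-ASCII data

# Pure-ASCII bytes decode fine, so the problem stays hidden.
ok = concat_like_python2(title, ascii_markup)

# Non-ASCII bytes cannot be decoded as ASCII, so the join blows up.
try:
    concat_like_python2(title, utf8_markup)
    failed = False
except UnicodeDecodeError:
    failed = True

# Decoding explicitly, with the encoding the bytes were written in, works.
explicit = title + utf8_markup.decode("utf-8")
```

The point is that the exception depends on the data, not on the code, 
which is why it tends to show up only once real content arrives.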

Since most of the code in your template tag works with byte strings, it 
would be easier to encode title()'s output into a byte string manually 
(the alternative would be converting all of your tag's code to work on 
unicode strings). The question is which byte encoding to use. The 
obvious choice is settings.DEFAULT_CHARSET, since that is the encoding 
of all your output. However, if you set DEFAULT_CHARSET to some legacy 
encoding (i.e. anything other than 'utf-8'), a unicode string may, at 
least in theory, contain characters that can't be encoded in it (for 
example, you can't have Russian characters in Western European 
windows-1252). So you may want to take a safety measure:

     title().encode(settings.DEFAULT_CHARSET, 'xmlcharrefreplace')

... and all characters that cannot be encoded into DEFAULT_CHARSET will 
appear as numeric character references, for example &#1040; for the 
Cyrillic letter А, which is acceptable for HTML.
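
As a rough illustration of that fallback, the sketch below wraps the 
call in a small helper. encode_for_output() and its charset parameter 
are hypothetical names; charset simply stands in for 
settings.DEFAULT_CHARSET so the snippet runs without a Django project.

```python
def encode_for_output(text, charset):
    """Encode unicode text, replacing unencodable characters with
    numeric character references instead of raising an error."""
    return text.encode(charset, "xmlcharrefreplace")


# Cyrillic A (U+0410) does not exist in windows-1252, so it is replaced;
# the plain ASCII letters pass through untouched.
cyrillic = encode_for_output(u"\u0410bc", "windows-1252")

# e-acute (U+00E9) does exist in windows-1252, so it is encoded normally.
western = encode_for_output(u"caf\u00e9", "windows-1252")

# With utf-8 every character is encodable and nothing is replaced.
utf8 = encode_for_output(u"\u0410", "utf-8")
```

Keep in mind that the references are only meaningful where the output 
is parsed as HTML or XML; in plain-text output they would show up 
literally.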
