Well, thank God you took a look at the code and agreed with my findings. I'll just adjust my urls.py for now.
Should I submit a bug report? (It'd be my first :) )

Thanks again for your time on this. I'm glad it was a bug and not my misunderstanding of Django or the way this all works together. Now I can move on and continue my app.

John

On Jun 4, 12:59 pm, "Karen Tracey" <[EMAIL PROTECTED]> wrote:
> On Wed, Jun 4, 2008 at 2:03 PM, John M <[EMAIL PROTECTED]> wrote:
> > Yes, I understand that, and I think it's a good thing, but when it
> > redirects, it mangles the parameters, would you agree?
>
> Yes, I think that's a bug in Django. The code that is doing the
> APPEND_SLASH handling tries to use request.GET.urlencode() to restore the
> original query parameters to the new url it has generated (specifically
> here: http://code.djangoproject.com/browser/django/trunk/django/middleware/...).
> However this fails to reconstitute the original query parameters when they
> were not in fact valid utf-8 to begin with (as your info_hash is not). Back
> when the GET QueryDict was constructed, this code:
>
> http://code.djangoproject.com/browser/django/trunk/django/http/__init...
>
> took the info_hash bytestring with repr
> '\x10\xc2\xe1\x96\xe0\x8d\x90\x05\xb7\xdf\xc6\xbc\x8e\xc2\x15\xe4=`\xcc\x84'
> and generated the unicode string with repr
> u'\x10\ufffd\ufffd\ufffd\x05\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\u0304' in
> its place. The request was assumed to be encoded in utf-8 for want of any
> better information. All those '\ufffd's are the Unicode replacement
> character, indicating that the input bytestring contained invalid utf-8
> sequences. (For example, while the first byte \x10 is a valid 1-byte utf-8
> sequence, the next two bytes \xc2 \xe1 are not. \xc2 is a valid first byte
> for a 2-byte sequence, but the 2nd byte must then begin with binary 10,
> where \xe1 begins binary 11. So those two bytes are tossed and '\ufffd' put
> in their place.)
> At this point there is no way to go back to the original
> input since generating the replacement char in place of invalid input throws
> away the original information. When the APPEND_SLASH code tries to
> urlencode() this unicode version of the query string, you see a lot of
> %EF%BF%BD because \xef\xbf\xbd is the 3-byte utf-8 encoding of the Unicode
> replacement character \ufffd. Clear as mud?
>
> Anyway I think line 83 of django/middleware/common.py should be:
>
> newurl += '?' + request.META['QUERY_STRING']
>
> instead of:
>
> newurl += '?' + request.GET.urlencode()
>
> That will ensure that the query parameters included in the redirect url are
> identical to what was included in the original url.
>
> Karen
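
The lossy decode described in the quoted message can be reproduced in a few lines. This is only a rough sketch in plain Python 3, not Django's actual code. The helper names lossy_round_trip and build_redirect and the "/announce" path are made up for illustration. Python 3's decoder may also place the replacement characters a little differently from the repr quoted above, since decoders differ in how many bytes they consume per invalid sequence. The point is the same either way: once U+FFFD has been substituted, the original bytes cannot be recovered, while the raw percent-encoded query string survives unchanged.

```python
from urllib.parse import quote, unquote_to_bytes

# The info_hash bytestring quoted in the thread: 20 raw bytes, not valid UTF-8.
INFO_HASH = (
    b"\x10\xc2\xe1\x96\xe0\x8d\x90\x05\xb7\xdf"
    b"\xc6\xbc\x8e\xc2\x15\xe4=`\xcc\x84"
)


def lossy_round_trip(raw: bytes) -> bytes:
    """Decode as UTF-8 with replacement characters, then encode back.

    Invalid sequences become U+FFFD, so the result differs from the input
    whenever the input was not valid UTF-8.
    """
    text = raw.decode("utf-8", errors="replace")
    return text.encode("utf-8")


def build_redirect(path: str, raw_query: str) -> str:
    """Hypothetical helper in the spirit of the proposed fix.

    Appends the slash and reuses the raw query string untouched instead of
    re-encoding already-decoded parameters.
    """
    new_url = path + "/"
    if raw_query:
        new_url += "?" + raw_query
    return new_url


if __name__ == "__main__":
    # The replacement character percent-encodes to the %EF%BF%BD seen in the
    # mangled redirect.
    print(quote("\ufffd"))

    # Decoding and re-encoding does not give the original bytes back.
    print(lossy_round_trip(INFO_HASH) == INFO_HASH)

    # Percent-encoding the raw bytes is reversible, so passing the raw query
    # string through keeps the info_hash intact.
    raw_query = "info_hash=" + quote(INFO_HASH)
    print(unquote_to_bytes(quote(INFO_HASH)) == INFO_HASH)
    print(build_redirect("/announce", raw_query))
```

Passing the raw query string through, as in the suggested one-line change to common.py, avoids the decode step entirely, so it does not matter what encoding the client used.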