Re: Capturing full URL string

John M Wed, 04 Jun 2008 13:36:07 -0700

Well, thank God you took a look at the code and agreed on my
findings.  I'll just adjust my urls.py for now.


Should I submit a bug report?  (it'd be my first :) )

Thanks again for your time on this. I'm glad it was a bug and not my
misunderstanding of Django or the way this all works together.  Now I can
move on and continue my app.

John

On Jun 4, 12:59 pm, "Karen Tracey" <[EMAIL PROTECTED]> wrote:
> On Wed, Jun 4, 2008 at 2:03 PM, John M <[EMAIL PROTECTED]> wrote:
> > Yes, I understand that, and I think it's a good thing, but when it
> > redirects, it mangles the parameters, would you agree?
>
> Yes, I think that's a bug in Django.  The code that is doing the
> APPEND_SLASH handling tries to use request.GET.urlencode() to restore the
> original query parameters to the new url it has generated (specifically
> here: http://code.djangoproject.com/browser/django/trunk/django/middleware/...).
> However this fails to reconstitute the original query parameters when they
> were not in fact valid utf-8 to begin with (as your info_hash is not).  Back
> when the GET QueryDict was constructed, this code:
>
> http://code.djangoproject.com/browser/django/trunk/django/http/__init...
>
> took the info_hash bytestring with repr
> '\x10\xc2\xe1\x96\xe0\x8d\x90\x05\xb7\xdf\xc6\xbc\x8e\xc2\x15\xe4=`\xcc\x84'
> and generated the unicode string with repr
> u'\x10\ufffd\ufffd\ufffd\x05\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\u0304' in
> its place.  The request was assumed to be encoded in utf-8 for want of any
> better information.  All those '\ufffd's are the Unicode replacement
> character, indicating that the input bytestring contained invalid utf-8
> sequences.  (For example, while the first byte \x10 is a valid 1-byte utf-8
> sequence, the next two bytes \xc2 \xe1 are not.  \xc2 is a valid first byte
> for a 2-byte sequence, but the 2nd byte must then begin with binary 10,
> where \xe1 begins binary 11. So those two bytes are tossed and '\ufffd' put
> in their place.)  At this point there is no way to go back to the original
> input since generating the replacement char in place of invalid input throws
> away the original information.  When the APPEND_SLASH code tries to
> urlencode() this unicode version of the query string, you see a lot of
> %EF%BF%BD because \xef\xbf\xbd is the 3-byte utf-8 encoding of the Unicode
> replacement character \ufffd.  Clear as mud?
>
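To make the lossy round trip concrete, here is a minimal sketch using only the standard library. It assumes Python 3 (the thread itself predates it), so the exact number and placement of replacement characters may differ from the repr quoted above. It is not Django's code, just an illustration of why the original bytes cannot be recovered once they have been decoded with replacement.

```python
from urllib.parse import quote

# The 20-byte info_hash quoted above; it is not valid UTF-8.
info_hash = b'\x10\xc2\xe1\x96\xe0\x8d\x90\x05\xb7\xdf\xc6\xbc\x8e\xc2\x15\xe4=`\xcc\x84'

# Decoding with errors="replace" swaps each invalid sequence for U+FFFD.
decoded = info_hash.decode("utf-8", errors="replace")

# Encoding the result again yields different bytes: the originals are gone.
round_tripped = decoded.encode("utf-8")

original_encoded = quote(info_hash)        # percent-encoding of the real bytes
mangled_encoded = quote(round_tripped)     # full of %EF%BF%BD sequences

print(original_encoded)
print(mangled_encoded)
```

Every %EF%BF%BD in the second line printed is one replacement character, which is what shows up in the redirected URL.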
> Anyway I think line 83 of django/middleware/common.py should be:
>
>     newurl += '?' + request.META['QUERY_STRING']
>
> instead of:
>
>     newurl += '?' + request.GET.urlencode()
>
> That will ensure that the query parameters included in the redirect url are
> identical to what was included in the original url.
>
> Karen
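For comparison, here is a rough sketch of the two approaches outside Django, assuming Python 3. The helper append_slash_redirect, the /announce path, and the shortened query string are all made up for illustration; parse_qsl and urlencode stand in for the decode-then-re-encode step and are not what the middleware actually calls.

```python
from urllib.parse import parse_qsl, urlencode


def append_slash_redirect(path, raw_query):
    """Hypothetical helper: add a trailing slash and reuse the raw query string."""
    new_url = path if path.endswith("/") else path + "/"
    if raw_query:
        # Reuse the query string exactly as the client sent it.
        new_url += "?" + raw_query
    return new_url


# Made-up request: a truncated info_hash plus an ordinary parameter.
raw_query = "info_hash=%10%C2%E1%96&port=6881"

# Proposed behaviour: the query string passes through untouched.
verbatim_url = append_slash_redirect("/announce", raw_query)

# Behaviour being replaced, approximated: decode to text, then encode again.
lossy_query = urlencode(parse_qsl(raw_query, encoding="utf-8", errors="replace"))

print(verbatim_url)
print(lossy_query)
```

The first URL keeps %C2%E1%96 intact, while the re-encoded query carries %EF%BF%BD in its place.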
