Re: Capturing full URL string

Leeland (The Code Janitor) Thu, 05 Jun 2008 10:03:02 -0700

I read this with great interest. It actually will save me time on my
project. I will be encountering this behavior shortly. Thank you!


Please, please submit a bug report on this with all the details here.
If people do not submit bug reports, future users of Django will
eventually hit the same problem and lose hours and hours searching
for the cause (like Karen, Matthias, Gregor and yourself just did).
Make the hard work you just did troubleshooting this count!

+ Leeland

On Jun 4, 1:35 pm, John M <[EMAIL PROTECTED]> wrote:
> Well, thank God you took a look at the code and agreed on my
> findings.  I'll just adjust my urls.py for now.
>
> Should I submit a bug report?  (it'd be my first :) )
>
> Thanks again for your time on this, I'm glad it was a bug and not my
> misunderstanding of Django or the way this all works together.  Now I can
> move on and continue my app.
>
> John
>
> On Jun 4, 12:59 pm, "Karen Tracey" <[EMAIL PROTECTED]> wrote:
>
> > On Wed, Jun 4, 2008 at 2:03 PM, John M <[EMAIL PROTECTED]> wrote:
> > > Yes, I understand that, and I think it's a good thing, but when it
> > > redirects, it mangles the parameters, would you agree?
>
> > Yes, I think that's a bug in Django.  The code that is doing the
> > APPEND_SLASH handling tries to use request.GET.urlencode() to restore the
> > original query parameters to the new url it has generated (specifically
> > here: http://code.djangoproject.com/browser/django/trunk/django/middleware/...).
> > However this fails to reconstitute the original query parameters when they
> > were not in fact valid utf-8 to begin with (as your info_hash is not).  Back
> > when the GET QueryDict was constructed, this code:
>
> > http://code.djangoproject.com/browser/django/trunk/django/http/__init...
>
> > took the info_hash bytestring with repr
> > '\x10\xc2\xe1\x96\xe0\x8d\x90\x05\xb7\xdf\xc6\xbc\x8e\xc2\x15\xe4=`\xcc\x84'
> > and generated the unicode string with repr
> > u'\x10\ufffd\ufffd\ufffd\x05\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\u0304' in
> > its place.  The request was assumed to be encoded in utf-8 for want of any
> > better information.  All those '\ufffd's are the Unicode replacement
> > character, indicating that the input bytestring contained invalid utf-8
> > sequences.  (For example, while the first byte \x10 is a valid 1-byte utf-8
> > sequence, the next two bytes \xc2 \xe1 are not.  \xc2 is a valid first byte
> > for a 2-byte sequence, but the 2nd byte must then begin with binary 10,
> > where \xe1 begins binary 11. So those two bytes are tossed and '\ufffd' put
> > in their place.)  At this point there is no way to go back to the original
> > input since generating the replacement char in place of invalid input throws
> > away the original information.  When the APPEND_SLASH code tries to
> > urlencode() this unicode version of the query string, you see a lot of
> > %EF%BF%BD because \xef\xbf\xbd is the 3-byte utf-8 encoding of the Unicode
> > replacement character \ufffd.  Clear as mud?
>
> > Anyway I think line 83 of django/middleware/common.py should be:
>
> >     newurl += '?' + request.META['QUERY_STRING']
>
> > instead of:
>
> >     newurl += '?' + request.GET.urlencode()
>
> > That will ensure that the query parameters included in the redirect url are
> > identical to what was included in the original url.
>
> > Karen
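
For anyone following along, the lossy round trip Karen describes can be
reproduced outside Django with the standard library. This is only a sketch
under stated assumptions: it uses Python 3's urllib.parse rather than
Django's QueryDict, the two function names are made up for illustration,
and the percent-encoded form of info_hash is a guess at what the client
actually sent. It compares rebuilding the query string from decoded
parameters against passing the raw string through unchanged.

```python
from urllib.parse import parse_qsl, quote, urlencode

# The info_hash bytes quoted in the thread. They are not valid UTF-8.
INFO_HASH = (b'\x10\xc2\xe1\x96\xe0\x8d\x90\x05\xb7\xdf'
             b'\xc6\xbc\x8e\xc2\x15\xe4=`\xcc\x84')

# Assumption: the client percent-encoded every byte of the hash.
raw_query = 'info_hash=' + quote(INFO_HASH, safe='')


def rebuild_via_decoded_params(query_string):
    """Hypothetical helper: decode the parameters as UTF-8, replacing
    invalid sequences with U+FFFD, then re-encode them. This mirrors the
    decode-then-urlencode round trip described above."""
    pairs = parse_qsl(query_string, encoding='utf-8', errors='replace')
    return urlencode(pairs)


def rebuild_via_raw_string(query_string):
    """Hypothetical helper: reuse the original query string untouched,
    which is the idea behind using request.META['QUERY_STRING']."""
    return query_string


lossy = rebuild_via_decoded_params(raw_query)
exact = rebuild_via_raw_string(raw_query)

print(raw_query)
print(lossy)
print('%EF%BF%BD' in lossy)   # True: U+FFFD re-encoded as three bytes
print(lossy == raw_query)     # False: the original bytes are gone
print(exact == raw_query)     # True: nothing was decoded, nothing lost
```

The exact number and position of replacement characters depends on the
decoder, so the lossy output may not match the repr quoted above byte for
byte. The point is only that once U+FFFD appears, the original bytes
cannot be recovered.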