I did some research on encodings as a follow-up. It seems that both '+' and '%20' are considered valid encodings for spaces. Several sources say so; here are a few:
http://www.w3schools.com/tags/ref_urlencode.asp
http://stackoverflow.com/questions/1634271/url-encoding-the-space-character-or-20
http://en.wikipedia.org/wiki/Percent-encoding

Given this information, I modified MHD's 'MHD_http_unescape' function to accept the '+' sign as a space, and it worked as expected. It was just an additional 'case' at the top of the switch (see below), 5 lines of code. It could be shortened a bit if the 'default' clause were changed (so that the wpos++ and rpos++ were outside the switch), but I didn't want to be presumptuous.

In 'internal.c':
----------------------------------
size_t
MHD_http_unescape (void *cls,
                   struct MHD_Connection *connection,
                   char *val)
{
  char *rpos = val;
  char *wpos = val;
  char *end;
  unsigned int num;
  char buf3[3];

  while ('\0' != *rpos)
    {
      switch (*rpos)
        {
        case '+':
          *wpos = ' ';
          wpos++;
          rpos++;
          break;
        case '%':
          if ( ('\0' == rpos[1]) ||
               ('\0' == rpos[2]) )
            {
              *wpos = '\0';
              return wpos - val;
            }
          buf3[0] = rpos[1];
          ....
----------------------------------

In URL encoding, a literal '+' is encoded as "%2B", so this solution really should just work all the time (i.e., it's not going to inadvertently remove a '+').

That said, I'm not sure this is the correct solution. Thoughts/comments? Is this a worthwhile addition to MHD, or is it wrong for some reason? I can't think of why this would be a bad thing to include, but I'm certainly open to other ideas and/or just not using MHD's post processor at all.

Ken

On Wed, Sep 17, 2014 at 8:44 AM, Kenneth Mastro <[email protected]> wrote:

> All,
>
> I'm using MHD's post-processor to process form data and several AJAX
> requests. I have noticed that when the encoding is
> 'application/x-www-form-urlencoded', strings with spaces contain a '+' sign
> instead of the spaces.
>
> For form data, if I explicitly set the encoding to 'multipart/form-data',
> the strings are parsed properly and there are no '+'s, which is how I've
> been getting around the problem (I assumed I was doing something wrong and
> haven't had time to dig into it). However, this isn't working for my AJAX
> requests - setting the encoding to 'multipart/form-data' breaks things in
> ways I haven't fully investigated, yet. I consider that a hack anyway, so
> I don't really want to pursue it. I need to figure out why
> 'application/x-www-form-urlencoded' isn't working for me.
>
> In looking at the 'Content-Type' the server is receiving for the AJAX
> requests, it is 'application/x-www-form-urlencoded; charset=UTF-8'. I
> thought the charset might be causing an issue, but I'm having trouble
> getting jQuery to not use UTF-8. From the jQuery ajax page: "The W3C
> XMLHttpRequest specification dictates that the charset is always UTF-8;
> specifying another charset will not force the browser to change the
> encoding." I.e., I'm stuck with UTF-8 because it's the standard, which I'm
> fine with. Regardless, MHD successfully creates the post processor, so
> it's seeing the actual base encoding (this works because it only compares
> the first chunk of chars of the content type - essentially ignoring the
> charset part).
>
> MHD does not seem to provide an option for REPLACING a header (i.e., using
> MHD_set_connection_value only ADDS a header - it won't replace the existing
> Content-Type header), so even if I actually could be sure the data was
> ASCII, I can't fix this in the server without doing my own POST
> processing. I doubt that would work anyway unless I could get the web page
> / browser to not do UTF-8 somehow. (Although I think ASCII is a subset of
> UTF-8, maybe there are differences even in those low-numbered characters
> I'm not aware of?)
>
> Anyway - In short - my question is: Is the MHD post processor just failing
> on 'application/x-www-form-urlencoded' data?
> I.e., it's not parsing out
> the +'s when it should? Or, does MHD not work with UTF-8 encoded data
> (despite all the characters being in the ASCII range) and I need to do
> my own POST processing? Or, does this actually work and I'm just doing
> something wrong?
>
> Thanks much,
> Ken
>
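
For anyone who wants to try the '+' handling outside of MHD, here is a minimal standalone sketch of the idea described at the top of this message. It is not MHD's code: 'form_unescape' and 'hex_value' are hypothetical names made up for illustration, and unlike the 'internal.c' excerpt (which truncates the string at an incomplete '%' sequence), this sketch copies a malformed '%' through unchanged. It also assumes the input really is 'application/x-www-form-urlencoded' data, since that is the context in which '+' stands for a space.

```c
#include <stddef.h>

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
static int
hex_value (char c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  if (c >= 'A' && c <= 'F')
    return c - 'A' + 10;
  return -1;
}

/* Hypothetical helper (not part of MHD): decode a form-urlencoded
   value in place and return the length of the decoded string. */
size_t
form_unescape (char *val)
{
  char *rpos = val;   /* read position */
  char *wpos = val;   /* write position, never ahead of rpos */

  while ('\0' != *rpos)
    {
      if ('+' == *rpos)
        {
          /* '+' encodes a space in form data */
          *wpos++ = ' ';
          rpos++;
        }
      else if ('%' == *rpos)
        {
          /* Only look at rpos[2] if rpos[1] was a valid hex digit,
             so we never read past the terminating '\0'. */
          int hi = hex_value (rpos[1]);
          int lo = (hi >= 0) ? hex_value (rpos[2]) : -1;

          if ( (hi >= 0) && (lo >= 0) )
            {
              *wpos++ = (char) (hi * 16 + lo);
              rpos += 3;
            }
          else
            {
              /* Assumption of this sketch: keep a malformed '%' as-is. */
              *wpos++ = *rpos++;
            }
        }
      else
        {
          *wpos++ = *rpos++;
        }
    }
  *wpos = '\0';
  return (size_t) (wpos - val);
}
```

With this, a buffer holding "a+b%20c%2Bd" decodes to "a b c+d": both space encodings become a space, and the "%2B" comes back as a literal '+'.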
