Hi Ethan, I've implemented this now in Git master. Feedback welcome...
Happy hacking! Christian On 12/25/19 9:45 AM, Ethan Tuttle wrote: > Thanks for the response, Chrstian. I'll take a look at the parsing > code, but I don't have high confidence I can fix it to your liking > (and in a secure / performant way). > > Otherwise, I'll stick with my uriparser patch until you get around to it. > > Happy hacking and happy holidays! > > Ethan > > On Tue, Dec 24, 2019 at 6:30 AM Christian Grothoff <groth...@gnunet.org> > wrote: >> >> Hi Ethan, >> >> I agree that this is a bug, and yes, we should fix it. Thanks for the >> test case, that is very helpful. >> >> I don't know when I can work on it, but worst-case expect a patch by >> the end of January, if you're lucky and my other bugs go fast (or >> someone else sends a fix), might of course happen earlier. >> >> Happy hacking! >> >> Christian >> >> On Tue, 2019-12-24 at 04:11 -0800, Ethan Tuttle wrote: >>> Given post body "a&b=1", how should MHD interpret the data? >>> >>> I'm at the end of a long investigation and it's come down to that >>> question >>> for post_process_urlencoded() in MHD. >>> >>> MHD currently calls back with ("a&b", "1"). The app I'm working >>> breaks >>> when it doesn't receive a callback for "b". The http client in this >>> case >>> (that did the urlencoding) is google-http-java-client 1.23.0. >>> >>> The client behavior may be questionable, but MHD's behavior doesn't >>> seem >>> right either. Isn't there some principle, "clients should strive for >>> conformance, servers should strive for forgiveness". >>> >>> I've attached a patch to MHD to add a failing test, but without a >>> fix. Is >>> MHD open to changing this behavior, given its maturity? >>> >>> Should I post a bug to the bug tracker? >>> >>> As for relevant standards, the W3C[1] is not detailed enough to cover >>> it. >>> The WhatWG[2] is more specific and allows for empty values and even >>> empty >>> keys. I'd like to callout uriparser[3], another C library I've >>> patched in >>> as a work-around. Uriparser documents their handling of these cases >>> well: >>> >>> * NULL in the value member means there was no '=' in the item text as >>> with >>> "?abc&def". >>> * An empty string in the value member means there was '=' in the item >>> as >>> with "?abc=&def". >>> >>> [1] https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1 >>> [2] https://url.spec.whatwg.org/#urlencoded-parsing >>> [3] https://uriparser.github.io/doc/api/latest/#querystrings >>> >>> commit aa0534af56d135e1b261d127af09c22015c1ff87 >>> Author: Ethan Tuttle <et...@ethantuttle.com> >>> Date: Tue Dec 24 03:50:59 2019 -0800 >>> >>> urlencoding post-processor: add failing tests for keys without >>> values >>> >>> diff --git a/src/microhttpd/test_postprocessor.c >>> b/src/microhttpd/test_postprocessor.c >>> index 75b5ba33..6d5be040 100644 >>> --- a/src/microhttpd/test_postprocessor.c >>> +++ b/src/microhttpd/test_postprocessor.c >>> @@ -42,8 +42,18 @@ >>> * five NULL-entries. >>> */ >>> const char *want[] = { >>> +#define URL_NOVALUE1_DATA "abc&x=5" >>> +#define URL_NOVALUE1_START 0 >>> + "abc", NULL, NULL, NULL, NULL, >>> + "x", NULL, NULL, NULL, "5", >>> +#define URL_NOVALUE1_END (URL_NOVALUE1_START + 10) >>> +#define URL_NOVALUE2_DATA "abc=&x=5" >>> +#define URL_NOVALUE2_START URL_NOVALUE1_END >>> + "abc", NULL, NULL, NULL, "", >>> + "x", NULL, NULL, NULL, "5", >>> +#define URL_NOVALUE2_END (URL_NOVALUE2_START + 10) >>> #define URL_DATA "abc=def&x=5" >>> -#define URL_START 0 >>> +#define URL_START URL_NOVALUE2_END >>> "abc", NULL, NULL, NULL, "def", >>> "x", NULL, NULL, NULL, "5", >>> #define URL_END (URL_START + 10) >>> @@ -125,12 +135,14 @@ value_checker (void *cls, >>> >>> >>> static int >>> -test_urlencoding (void) >>> +test_urlencoding_case (unsigned int want_start, >>> + unsigned int want_end, >>> + const char *url_data) >>> { >>> struct MHD_Connection connection; >>> struct MHD_HTTP_Header header; >>> struct MHD_PostProcessor *pp; >>> - unsigned int want_off = URL_START; >>> + unsigned int want_off = want_start; >>> size_t i; >>> size_t delta; >>> size_t size; >>> @@ -147,19 +159,27 @@ test_urlencoding (void) >>> pp = MHD_create_post_processor (&connection, >>> 1024, &value_checker, &want_off); >>> i = 0; >>> - size = strlen (URL_DATA); >>> + size = strlen (url_data); >>> while (i < size) >>> { >>> delta = 1 + MHD_random_ () % (size - i); >>> - MHD_post_process (pp, &URL_DATA[i], delta); >>> + MHD_post_process (pp, &url_data[i], delta); >>> i += delta; >>> } >>> MHD_destroy_post_processor (pp); >>> - if (want_off != URL_END) >>> + if (want_off != want_end) >>> return 1; >>> return 0; >>> } >>> >>> +static int >>> +test_urlencoding (void) { >>> + unsigned int errorCount = 0; >>> + errorCount += test_urlencoding_case(URL_START, URL_END, URL_DATA); >>> + errorCount += test_urlencoding_case(URL_NOVALUE1_START, >>> URL_NOVALUE1_END, URL_NOVALUE1_DATA); >>> + errorCount += test_urlencoding_case(URL_NOVALUE2_START, >>> URL_NOVALUE2_END, URL_NOVALUE2_DATA); >>> + return errorCount; >>> +} >>> >>> static int >>> test_multipart_garbage (void) >> >> >