On Jul 19, 2013, at 12:59, Junio C Hamano wrote:
"Kyle J. McKay" <mack...@gmail.com> writes:

+#define URL_ALPHA "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
+#define URL_DIGIT "0123456789"
+#define URL_HEXDIGIT URL_DIGIT "ABCDEFabcdef"
+#define URL_ALPHADIGIT URL_ALPHA URL_DIGIT
+#define URL_SCHEME_CHARS URL_ALPHADIGIT "+.-"
+#define URL_HOST_CHARS URL_ALPHADIGIT ".-[:]" /* IPv6 literals need [:] */ +#define URL_UNSAFE_CHARS " <>\"%{}|\\^`" /* plus 0x00-0x1F, 0x7F-0xFF */
+#define URL_GEN_RESERVED ":/?#[]@"
+#define URL_SUB_RESERVED "!$&'()*+,;="
+#define URL_RESERVED URL_GEN_RESERVED URL_SUB_RESERVED /* only allowed delims */
+ ...
+       while (from_len) {
+               int ch = *from++;
+               int was_esc = 0;
+
+               from_len--;
+               if (ch == '%') {
+                       if (from_len < 2 ||
+                           !strchr(URL_HEXDIGIT, from[0]) ||
+                           !strchr(URL_HEXDIGIT, from[1]))

I actually do like the readability of the approach in this patch,
but these repeated strchrs() in a loop may want to be optimized,
using a trick similar to what is used in ctype.c::sane_ctype[].

A small build-time-only program or script gen-http-ctype.perl that
defines and uses these URL_* cpp macros and generates a C source
file http-ctype-gen.c that can be #included from http.c, with
something like this in the Makefile:

        http-ctype-gen.c: gen-http-ctype.perl
                rm -f $@ $@+
               $(PERL_PATH) gen-http-ctype.perl >$@+
               mv $@+ $@
        http.o: http.c http-ctype-gen.c

would give us both readability and efficiency, perhaps?

Hmmm.  That's a very fast technique.  However something like:

#define IS_HEX_DIGIT(c) \
  (('0'<=(c)&&(c)<='9')||('a'<=(c)&&(c)<='f')||('A'<=(c)&&(c)<='F'))

I would think would be suitably fast without needing any added build files.

However, looks like there is a ctype.h isxdigit() function and it looks like there's a version of that in git-compat-util.h as well as a convenient hexval_table to use for the conversion, so I will alter the code to use those instead which will also do away with the hex_digit_value() function.

If you mean for all the strchr etc. calls, multiple tables would be required since URL_SCHEME_CHARS and URL_HOST_CHARS partially overlap, but it could be done. Is the speed of strchr that much of a concern? The code will only be invoked for http.<url>.* option settings in any case and I expect the user would have to set an awfully large number of those to even begin to notice.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to