Re: [PATCH v2 2/2] mailinfo: support Unicode scissors

Junio C Hamano Tue, 02 Apr 2019 23:47:58 -0700

Jeff King <p...@peff.net> writes:

> In fact, I think you could then combine this with the previous
> conditional and get:
>
>   if (skip_prefix(c, ">8", &end) ||
>       skip_prefix(c, "8<", &end) ||
>       skip_prefix(c, ">%", &end) ||
>       skip_prefix(c, "%<", &end) ||
>       /* U+2702 in UTF-8 */
>       skip_prefix(c, "\xE2\x9C\x82", &end)) {
>           in_perforation = 1;
>           perforation += end - c;
>           scissors += end - c;
>           c = end - 1; /* minus one to account for loop increment */
>   }
>
> (Though I'm still on the fence regarding the whole idea, so do not take
> this as an endorsement ;) ).


I do not think we want to add more, but use of skip_prefix does
sound sensible.  I was very tempted to suggest

        static const char *scissors[] = {
                ">8", "8<", ">%", "%<",
                NULL,
        };
        const char **s;

        for (s = scissors; *s; s++) {
                if (skip_prefix(c, *s, &end)) {
                        in_perforation = 1;
                        ...
                        break;
                }
        }
        if (!*s)
                ... we are not looking at any of the scissors[] ...

but that would encourage adding more random entries to the array,
which we would want to avoid in order to help reduce the cognitive
load of end-users.
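
For illustration only, the table-driven lookup can be sketched as a
standalone helper outside mailinfo.c.  Everything named here is an
assumption made for the example: skip_prefix_sketch() is a local
stand-in written to behave like skip_prefix(), and scissors_marks[]
and scissors_mark_len() are hypothetical names (the table is not
called scissors[] here because the quoted code above already uses
"scissors" as a counter).

```c
#include <stddef.h>
#include <string.h>

/*
 * Local stand-in for skip_prefix() (an assumption for this sketch):
 * if str starts with prefix, store a pointer just past the prefix
 * in *out and return 1; otherwise leave *out alone and return 0.
 */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/* Hypothetical table of the scissors marks, NULL-terminated. */
static const char *scissors_marks[] = {
	">8", "8<", ">%", "%<",
	NULL,
};

/*
 * Return the byte length of the scissors mark at the start of c,
 * or 0 if c does not begin with any entry of scissors_marks[].
 */
static size_t scissors_mark_len(const char *c)
{
	const char **s;
	const char *end;

	for (s = scissors_marks; *s; s++)
		if (skip_prefix_sketch(c, *s, &end))
			return (size_t)(end - c);
	return 0;
}
```

A caller scanning a line would add the returned length to its
counters and advance c by that many bytes; a return value of 0 plays
the role of the "not looking at any of the scissors" case.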

In hindsight, addition of an undocumented '%' was already a mistake.
I wonder how widely it is in use (yes, I am tempted to deprecate and
remove these two to match the code to the docs).
