Yi EungJun <[email protected]> writes:
> From: Yi EungJun <[email protected]>
>
> Add an Accept-Language header which indicates the user's preferred
> languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.
>
> Examples:
> LANGUAGE= -> ""
> LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
> LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
> LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
>
> This gives git servers a chance to display remote error messages in
> the user's preferred language.
>
> Signed-off-by: Yi EungJun <[email protected]>
> ---
> http.c | 134
> +++++++++++++++++++++++++++++++++++++++++++++
> remote-curl.c | 2 +
> t/t5550-http-fetch-dumb.sh | 31 +++++++++++
> 3 files changed, 167 insertions(+)
>
> diff --git a/http.c b/http.c
> index 3a28b21..ed4e8e1 100644
> --- a/http.c
> +++ b/http.c
> @@ -67,6 +67,8 @@ static struct curl_slist *no_pragma_header;
>
> static struct active_request_slot *active_queue_head;
>
> +static struct strbuf *cached_accept_language = NULL;
Please drop " = NULL" that is unnecessary for BSS.
> @@ -512,6 +514,9 @@ void http_cleanup(void)
> cert_auth.password = NULL;
> }
> ssl_cert_password_required = 0;
> +
> + if (cached_accept_language)
> + strbuf_release(cached_accept_language);
> }
> @@ -983,6 +988,129 @@ static void extract_content_type(struct strbuf *raw,
> struct strbuf *type,
> strbuf_addstr(charset, "ISO-8859-1");
> }
>
> +/*
> + * Guess the user's preferred languages from the value in LANGUAGE
> environment
> + * variable and LC_MESSAGES locale category.
> + *
> + * The result can be a colon-separated list like "ko:ja:en".
> + */
> +static const char *get_preferred_languages(void)
> +{
> + const char *retval;
> +
> + retval = getenv("LANGUAGE");
> + if (retval && *retval)
> + return retval;
> +
> + retval = setlocale(LC_MESSAGES, NULL);
> + if (retval && *retval &&
> + strcmp(retval, "C") &&
> + strcmp(retval, "POSIX"))
> + return retval;
> +
> + return NULL;
> +}
> +
> +/*
> + * Get an Accept-Language header which indicates user's preferred languages.
> + *
> + * Examples:
> + * LANGUAGE= -> ""
> + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
> + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *;
> q=0.1"
> + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
> + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
> + * LANGUAGE= LANG=C -> ""
> + */
> +static struct strbuf *get_accept_language(void)
> +{
> + const char *lang_begin, *pos;
> + int q, max_q;
> + int num_langs;
> + int decimal_places;
> + int is_codeset_or_modifier = 0;
> + static struct strbuf buf = STRBUF_INIT;
> + struct strbuf q_format_buf = STRBUF_INIT;
> + char *q_format;
> +
> + if (cached_accept_language)
> + return cached_accept_language;
> +
> + lang_begin = get_preferred_languages();
> +
> + /* Don't add Accept-Language header if no language is preferred. */
> + if (!(lang_begin && *lang_begin)) {
It is not wrong per-se, but given how hard get_preferred_languages()
tries not to return a pointer to an empty string, this seems a bit
overly defensive to me.
> + cached_accept_language = &buf;
> + return cached_accept_language;
It is somewhat unconventional to have a static pointer outside to
point at a singleton and then have a singleton actually as a static
structure. I would have done without "buf" in this function and
instead started this function like so:
if (cached_accept_language)
return cached_accept_language;
cached_accept_language = xmalloc(sizeof(struct strbuf));
strbuf_init(cached_accept_language, 0);
lang_begin = get_preferred_languages();
if (!lang_begin)
return cached_accept_language;
> + }
> +
> + /* Count number of preferred lang_begin to decide precision of q-factor
> */
> + for (num_langs = 1, pos = lang_begin; *pos; pos++)
> + if (*pos == ':')
> + num_langs++;
> +
> + /* Decide the precision for q-factor on number of preferred lang_begin.
> */
> + num_langs += 1; /* for '*' */
> + decimal_places = 1 + (num_langs > 10) + (num_langs > 100);
What if you got 60000 languages ;-)? I do not think we want to bend
backwards and make the code list all 60000 of them, assigning a
unique and decreasing q value to each of them, forming an overlong
Accept-Language header, but at the same time, I do not think we want
to show nonsense output because we compute the precision incorrectly
here.
> + strbuf_addf(&q_format_buf, "; q=0.%%0%dd", decimal_places);
> + q_format = strbuf_detach(&q_format_buf, NULL);
q_format_buf is an overkill use of strbuf, isn't it? Just
char q_format_buf[32];
sprintf(q_format_buf, ";q=0.%%0%d", decimal_places);
or something should be more than sufficient, no?
> + for (max_q = 1; decimal_places-- > 0;) max_q *= 10;
As you have to do one loop like this that amounts to computing log10
of num_langs, why not compute decimal_places the same way while at
it? It may also make sense to cap the number of languages to avoid
spitting out overly long Accept-Language header with practicaly
useless list of many languages. That is, something along the lines
of ... (note that I may very well have off-by-one or off-by-ten
errors here you may need to tweak to get right):
if (MAX_LANGS < num_langs)
num_langs = MAX_LANGS;
for (max_q = 1, decimal_places = 1;
max_q < num_langs;
decimal_places++, max_q *= 10)
;
If you are to use the MAX_LANGS cap, the main loop would also need
to pay attention to it by breaking out of the loop early before you
reach the end of the string, of course.
> + q = max_q;
> +
> + strbuf_addstr(&buf, "Accept-Language: ");
> +
> + /*
> + * Convert a list of colon-separated locale values [1][2] to a list of
> + * comma-separated language tags [3] which can be used as a value of
> + * Accept-Language header.
> + *
> + * [1]: http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html
> + * [2]:
> http://www.gnu.org/software/libc/manual/html_node/Using-gettextized-software.html
> + * [3]: http://tools.ietf.org/html/rfc7231#section-5.3.5
> + */
> + for (pos = lang_begin; ; pos++) {
> + if (*pos == ':' || !*pos) {
> + /* Ignore if this character is the first one. */
> + if (pos == lang_begin)
> + continue;
By doing this "ignore empty" here, but not doing the same when you
count num_langs, are you potentially miscounting num_langs?
> + is_codeset_or_modifier = 0;
> +
> + /* Put a q-factor only if it is less than 1.0. */
> + if (q < max_q)
... is it the same thing as "do not do this for the first round, but
do so for all the other round"?
> + strbuf_addf(&buf, q_format, q);
> +
> + if (q > 1)
Hmm, I am puzzled. C this ever be an issue (unless you have
off-by-one error or you add "cap num_langs to MAX_LANGS", that is)?
> + q--;
> + /* NULL pos means this is the last language. */
> + if (*pos)
> + strbuf_addstr(&buf, ", ");
> + else
> + break;
> +
> + } else if (is_codeset_or_modifier)
> + continue;
> + else if (*pos == '.' || *pos == '@') /* Remove .codeset and
> @modifier. */
> + is_codeset_or_modifier = 1;
> + else
> + strbuf_addch(&buf, *pos == '_' ? '-' : *pos);
> + }
> +
> + /* Don't add Accept-Language header if no language is preferred. */
> + if (q >= max_q) {
Can q go over max_q, or is it "q may be max_q"? In other words, is
this essentially saying "if we did not find any language in the
preferred languages list"?
> + cached_accept_language = &buf;
> + return cached_accept_language;
> + }
> +
> + /* Add '*' with minimum q-factor greater than 0.0. */
> + strbuf_addstr(&buf, ", *");
> + strbuf_addf(&buf, q_format, 1);
> +
> + cached_accept_language = &buf;
> + return cached_accept_language;
> +}
> +
> /* http_request() targets */
> #define HTTP_REQUEST_STRBUF 0
> #define HTTP_REQUEST_FILE 1
> @@ -995,6 +1123,7 @@ static int http_request(const char *url,
> struct slot_results results;
> struct curl_slist *headers = NULL;
> struct strbuf buf = STRBUF_INIT;
> + struct strbuf* accept_language;
As we write in C, not C++, our asterisks stick to the variable, not
the type.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html