bug#66698: I think hex decoding with basenc -d --base16 should be case-insensitive
Hi,

the docs for basenc --base16 says "hex encoding (RFC4648 section 8)". The
referenced section in that RFC says

  Essentially, Base 16 encoding is the standard case-insensitive hex
  encoding and may be referred to as "base16" or "hex".

I think it would be both more useful, and consistent with docs, if
basenc -d --base16 accepted either upper- or lowercase hex digits.

Current behavior, with basenc (GNU coreutils) 9.1:

  $ echo 666F6F0A |basenc --base16 -d
  foo
  $ echo 666F6f0A |basenc --base16 -d
  fobasenc: invalid input

I think both inputs should give the same output, "foo\n", at least by
default. Possibly configurable with options like --strict, --upper,
--lower, etc (--upper/--lower would be useful also for the --base16
encoding, i.e., no -d).

Regards,
/Niels

--
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
bug#66698: I think hex decoding with basenc -d --base16 should be case-insensitive
On 23/10/2023 10:37, Niels Möller wrote:
> Hi,
>
> the docs for basenc --base16 says "hex encoding (RFC4648 section 8)". The
> referenced section in that RFC says
>
>   Essentially, Base 16 encoding is the standard case-insensitive hex
>   encoding and may be referred to as "base16" or "hex".
>
> I think it would be both more useful, and consistent with docs, if
> basenc -d --base16 accepted either upper- or lowercase hex digits.
>
> Current behavior, with basenc (GNU coreutils) 9.1:
>
>   $ echo 666F6F0A |basenc --base16 -d
>   foo
>   $ echo 666F6f0A |basenc --base16 -d
>   fobasenc: invalid input
>
> I think both inputs should give the same output, "foo\n", at least by
> default. Possibly configurable with options like --strict, --upper,
> --lower, etc (--upper/--lower would be useful also for the --base16
> encoding, i.e., no -d).

Agreed.
Will apply the attached later.
Marking this as done.

thanks,
Pádraig

From 69f8e90185e518d1722ed6a036f4b18779553e49 Mon Sep 17 00:00:00 2001
From: Pádraig Brady
Date: Mon, 23 Oct 2023 12:51:19 +0100
Subject: [PATCH] basenc: --base16: support lower case hex digits

* src/basenc.c (base16_decode_ctx): Convert to uppercase
before converting from hex.
* tests/basenc/basenc.pl: Add a test case.
* NEWS: Mention the change in behavior.
Addresses https://bugs.gnu.org/66698
---
 NEWS                   | 3 +++
 src/basenc.c           | 2 +-
 tests/basenc/basenc.pl | 1 +
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 93f98b99d..56c2a4785 100644
--- a/NEWS
+++ b/NEWS
@@ -19,6 +19,9 @@ GNU coreutils NEWS -*- outline -*-
   base32 and base64 no longer require padding when decoding.
   Previously an error was given for non padded encoded data.
 
+  basenc --base16 -d no supports lower case hexadecimal characters.
+  Previously an error was given for lower case hex digits.
+
   ls --dired now implies long format output without hyperlinks enabled,
   and will take precedence over previously specified formats or hyperlink mode.
 
diff --git a/src/basenc.c b/src/basenc.c
index 12021e900..74cf03a49 100644
--- a/src/basenc.c
+++ b/src/basenc.c
@@ -577,7 +577,7 @@ base16_decode_ctx (struct base_decode_context *ctx,
           continue;
         }
 
-      int nib = *in++;
+      int nib = c_toupper (*in++);
       if ('0' <= nib && nib <= '9')
         nib -= '0';
       else if ('A' <= nib && nib <= 'F')
diff --git a/tests/basenc/basenc.pl b/tests/basenc/basenc.pl
index de20d2dbc..2b0e79e93 100755
--- a/tests/basenc/basenc.pl
+++ b/tests/basenc/basenc.pl
@@ -159,6 +159,7 @@ my @Tests =
  ['b16_7', '--base16 -d', {IN=>'G'}, {EXIT=>1},
   {ERR=>"$prog: invalid input\n"}],
  ['b16_8', '--base16 -d', {IN=>"AB\nCD"}, {OUT=>"\xAB\xCD"}],
+ ['b16_9', '--base16 -d', {IN=>lc ($base16_out)}, {OUT=>$base16_in}],
--
2.41.0
bug#66698: I think hex decoding with basenc -d --base16 should be case-insensitive
Pádraig Brady writes:

> Will apply the attached later.
> Marking this as done.

Thanks! It would make some sense to me to also have options
--upper/--lower; on encoding, they would specify case of the output, on
decoding, they would reject the other case (with default being to accept
either). But less important than fixing the default behavior.

> +  basenc --base16 -d no supports lower case hexadecimal characters.
> +  Previously an error was given for lower case hex digits.

s/ no / now /

Regards,
/Niels

--
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
bug#66698: I think hex decoding with basenc -d --base16 should be case-insensitive
On 23/10/2023 13:50, Niels Möller wrote:
> Pádraig Brady writes:
>
>> Will apply the attached later.
>> Marking this as done.
>
> Thanks! It would make some sense to me to also have options
> --upper/--lower; on encoding, they would specify case of the output, on
> decoding, they would reject the other case (with default being to accept
> either). But less important than fixing the default behavior.

I was thinking `tr '[:lower:]' '[:upper:]'` would suffice for that
when encoding. When decoding I don't see much need for the strictness,
but that could also be enforced easily by prefiltering
with something like `tr 'A-F' x`

The same argument could be made of course for not needing this patch at all,
by prefiltering through tr. However the default operation should be
the most common requirement (and also the RFC documented operation in this case).
A similar case I hit very frequently is pasting hex into bc,
and it's very annoying to have to convert to uppercase before doing this.

>> +  basenc --base16 -d no supports lower case hexadecimal characters.
>> +  Previously an error was given for lower case hex digits.
>
> s/ no / now /

Thanks, pushed.

Pádraig.
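The tr prefiltering discussed in this message can be sketched as a few small shell functions. This is only an illustration, assuming a basenc that provides --base16 and a tr that pads a short second set by repeating its last character (as GNU tr does). The function names hex_decode_any, hex_encode_lower and hex_decode_lower_only are made up for this example and are not part of coreutils.

```shell
# Hypothetical helper names; only tr and basenc themselves are real tools.

# Decode hex of either case, even with a basenc that predates the fix:
# fold a-f to A-F first, then decode.
hex_decode_any () {
  tr 'a-f' 'A-F' | basenc --base16 -d
}

# Encode to lowercase hex: basenc emits uppercase, tr folds it down.
hex_encode_lower () {
  basenc --base16 | tr 'A-F' 'a-f'
}

# Strict lowercase-only decoding: any uppercase hex digit is turned into
# 'x', which is not a hex digit, so basenc reports invalid input.
# The remaining lowercase digits are then folded up for older basenc.
hex_decode_lower_only () {
  tr 'A-F' x | tr 'a-f' 'A-F' | basenc --base16 -d
}
```

For instance, `printf 666F6f0A | hex_decode_any` should print "foo" followed by a newline, while the same input fed to hex_decode_lower_only is expected to fail with "invalid input" because of the uppercase F and A.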
bug#66713: Expr substr on plus symbol
Greetings,

I'm not sure if this is the intended UNIX/POSIX behaviour, but on:

< expr substr a 1 2

, I get:

> a

, which is right, but on:

< expr substr + 1 2

I get:

> expr: syntax error: missing argument after ‘2’

On expr "$line_of_text" 1 2, this error is thrown if the line is a simple '+'.
A real-world scenario is getting the first character of each line in bulk,
crashing if the line is '+'.

< expr --version
> expr (GNU coreutils) 8.32

--
bug#66714: [FIXED] Expr substr on plus symbol
NEVERMIND, IT'S ON 'INFO EXPR'.
Could you folks please add it to 'man expr'?

XXX

Greetings,

I'm not sure if this is the intended UNIX/POSIX behaviour, but on:

< expr substr a 1 2

, I get:

a

, which is right, but on:

< expr substr + 1 2

I get:

expr: syntax error: missing argument after ‘2’

On expr "$line_of_text" 1 2, this error is thrown if the line is a simple '+'.
A real-world scenario is getting the first character of each line in bulk,
crashing if the line is '+'.

< expr --version
expr (GNU coreutils) 8.32

--
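For readers who hit the same error: GNU expr parses a bare + as an operator, and its info manual describes a "+ TOKEN" prefix (a GNU extension) that forces the next token to be read as a plain string. A minimal sketch follows, assuming GNU expr. The function names are hypothetical, and the cut-based variant is offered only as an alternative that avoids expr's operator parsing altogether.

```shell
# Hypothetical helpers, for illustration only.

# GNU expr: the leading "+" makes expr treat "$1" as a plain string,
# even when it is "+" or a keyword such as "match".
# Note that expr exits with status 1 when the result is empty or zero.
first_two_gnu () {
  expr substr + "$1" 1 2
}

# Alternative without expr: cut selects characters 1-2 of the line.
first_two_cut () {
  printf '%s\n' "$1" | cut -c1-2
}
```

With these, both `first_two_gnu '+'` and `first_two_cut '+'` should print a single + instead of a syntax error.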
bug#66714: [FIXED] Expr substr on plus symbol
On 10/23/23 12:58, petabaud51 wrote:
> Could you folks please add it to 'man expr'?

Man pages are supposed to be terse