According to POSIX standard, a backslash in the replacement of sub() should be treated as a literal backslash if it is not preceded by a '&' or another backslash. But busybox awk skips it unconditionally, regardless of the following character. For example,
$ echo "abc" | busybox awk 'sub(/abc/, "\\d")' d where \d is expected here. This is known to break rsync's documentation converter. Let's check the next character before skipping the backslash, following POSIX standard and behavior of GNU awk. Link: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html Link: https://github.com/RsyncProject/rsync/blob/62bb9bba022ce6a29f8c92307d5569c338b2f711/help-from-md.awk#L22 Fixes: 5f84c5633 ("awk: fix backslash handling in sub() builtins") Signed-off-by: Yao Zi <[email protected]> --- editors/awk.c | 7 ++++++- testsuite/awk.tests | 5 +++++ 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/editors/awk.c b/editors/awk.c index 64e752f4b..40f5ba7f7 100644 --- a/editors/awk.c +++ b/editors/awk.c @@ -2636,8 +2636,13 @@ static int awk_sub(node *rn, const char *repl, int nm, var *src, var *dest /*,in resbuf = qrealloc(resbuf, residx + replen + n, &resbufsize); memcpy(resbuf + residx, sp + pmatch[j].rm_so - start_ofs, n); residx += n; - } else + } else { +/* '\\' and '&' following a backslash keep its original meaning, any other + * occurrence of a '\\' should be treated as literal */ + if (bslash && c != '\\' && c != '&') + resbuf[residx++] = '\\'; resbuf[residx++] = c; + } bslash = 0; } } diff --git a/testsuite/awk.tests b/testsuite/awk.tests index be25f6696..61b3bc7d6 100755 --- a/testsuite/awk.tests +++ b/testsuite/awk.tests @@ -617,4 +617,9 @@ testing 'awk gsub erroneous word start match' \ 'abc\n' \ '' '' +testing 'awk sub literal backslash in replacement' \ + 'awk '$sq'sub(/abc/, "\\\d")'$sq \ + '\d\n' \ + '' 'abc\n' + exit $FAILCOUNT -- 2.47.0 _______________________________________________ busybox mailing list [email protected] https://lists.busybox.net/mailman/listinfo/busybox
