On Sat, 28 Oct 2023 10:33:09 -0600, "Todd C. Miller" wrote:

> Looks like an off-by-one introduced in the utf8 conversion.
> The following fixes the bug for me. I will file a PR upstream.

Unfortunately, things are not so simple. Below is a workaround
until I have a better fix.

 - todd

Index: usr.bin/awk/run.c
===================================================================
RCS file: /cvs/src/usr.bin/awk/run.c,v
retrieving revision 1.79
diff -u -p -u -r1.79 run.c
--- usr.bin/awk/run.c	6 Oct 2023 22:29:24 -0000	1.79
+++ usr.bin/awk/run.c	28 Oct 2023 18:24:38 -0000
@@ -1017,6 +1017,8 @@ Cell *substr(Node **a, int nnn)		/* subs
 	y = gettemp();
 	mb = u8_char2byte(s, m-1); /* byte offset of start char in s */
 	nb = u8_char2byte(s, m-1+n);  /* byte offset of end+1 char in s */
+	if (nb >= k)
+		nb = k - 1;
 
 	temp = s[nb];	/* with thanks to John Linderman */
 	s[nb] = '\0';
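
For illustration only, here is a minimal standalone sketch of one way a
character-to-byte conversion can produce an offset past the end of the
string, and how a clamp like the one in the workaround keeps the index
in bounds. char_to_byte() and clamped_end() are hypothetical helpers,
not awk's u8_char2byte(), and the sketch assumes k is the byte length
of s plus one; k's definition is outside the hunk, so treat that as an
assumption. It is not a claim about the actual root cause.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a char-to-byte helper (NOT awk's
 * u8_char2byte()).  It trusts the UTF-8 lead byte, so a truncated
 * multibyte sequence at the end of s yields an offset > len.
 */
static size_t
char_to_byte(const char *s, size_t len, size_t nchars)
{
	size_t off = 0;

	while (nchars-- > 0 && off < len) {
		unsigned char c = (unsigned char)s[off];

		if (c < 0x80)
			off += 1;
		else if ((c & 0xe0) == 0xc0)
			off += 2;
		else if ((c & 0xf0) == 0xe0)
			off += 3;
		else if ((c & 0xf8) == 0xf0)
			off += 4;
		else
			off += 1;	/* invalid lead byte, skip it */
	}
	return off;
}

/*
 * Byte offset of the end+1 char, clamped the same way as the
 * workaround.  Assumption: k == strlen(s) + 1, so k - 1 is the
 * index of the terminating NUL.
 */
static size_t
clamped_end(const char *s, size_t nchars)
{
	size_t len = strlen(s);
	size_t k = len + 1;
	size_t nb = char_to_byte(s, len, nchars);

	if (nb >= k)
		nb = k - 1;
	return nb;
}
```

With the input "a\xc3" (an 'a' followed by a truncated two-byte
sequence), char_to_byte() asked for two characters returns 3, one past
the NUL at index 2, while clamped_end() returns 2. For well-formed
input the clamp never fires.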