Speaking of the devil: if a string contains an invalid UTF-8 character,
substr behaves really weirdly:

    $ echo | ~/program/9base/awk/awk '{s = sprintf("asdf%casdf", 195); printf("\"%s\"\n", substr(s, 6, 4)); print s;}'
    ""
    asdfÃasdf

Try changing the second and third arguments of substr (set the length to 1
and it returns "f").
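
For comparison, here is a rough Python sketch of what one might expect
from substr(s, 6, 4) on that string. It is not how 9base awk works
internally; awk_substr is a hypothetical helper that only handles simple
in-range arguments. Byte 195 (0xC3) starts a two-byte UTF-8 sequence, and
the "a" after it is not a continuation byte, which is why the string is
invalid. Whether you count bytes, or count runes with the bad byte treated
as a single replacement character, the result comes out as "asdf", so the
empty string above does not match either model.

```python
# The same string the awk one-liner builds: "asdf", byte 195, "asdf".
raw = b"asdf" + bytes([195]) + b"asdf"


def awk_substr(s, start, length):
    """Hypothetical helper: 1-based substr for simple in-range arguments."""
    return s[start - 1:start - 1 + length]


# 0xC3 is a lead byte that needs a continuation byte (0x80..0xBF) after it.
try:
    raw.decode("utf-8")
    bad_offset = None
except UnicodeDecodeError as err:
    bad_offset = err.start  # 0-based offset of the offending byte

# Model 1: count bytes.
by_bytes = awk_substr(raw, 6, 4)

# Model 2: count runes, assuming the invalid byte decodes to one U+FFFD.
runes = raw.decode("utf-8", errors="replace")
by_runes = awk_substr(runes, 6, 4)

print(bad_offset, by_bytes, by_runes)
```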
David
On Jan 22 2013, Peter A. Shevtsov wrote:
Hello,
I've found a bug in 9base's awk. It seems that printf works incorrectly
with UTF-8 strings. The way it counts string lengths is weird:
    echo latin кириллица | /usr/local/plan9/bin/awk '{printf("[%20s][%20s]\n", $1, $2)}'
and the output is:
    [               latin][  кириллица]
It seems that it counts every Cyrillic letter as two, i.e. it counts
bytes rather than letters (or runes).
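
To illustrate the difference, here is a small Python sketch of the two
ways of padding to a field width. pad_left_bytes and pad_left_runes are
hypothetical names used only for this example, not anything from awk's
source. Each Cyrillic letter takes two bytes in UTF-8, so padding by byte
count leaves the second field short by nine spaces.

```python
def pad_left_bytes(s, width):
    """Right-align s in width columns, measuring s by its UTF-8 byte count."""
    n = len(s.encode("utf-8"))
    return " " * max(0, width - n) + s


def pad_left_runes(s, width):
    """Right-align s in width columns, measuring s by its code point count."""
    return " " * max(0, width - len(s)) + s


word = "кириллица"
print(len(word), len(word.encode("utf-8")))  # code points vs. bytes

# Byte-based padding, matching the behaviour described above.
print("[%s][%s]" % (pad_left_bytes("latin", 20), pad_left_bytes(word, 20)))

# Rune-based padding, which is presumably what was expected.
print("[%s][%s]" % (pad_left_runes("latin", 20), pad_left_runes(word, 20)))
```

Counting code points is still only an approximation of display width
(combining marks and wide characters break it), but it is enough for this
example.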