Speaking of the devil: if a string contains an invalid UTF-8 character, substr behaves really weirdly:

$ echo | ~/program/9base/awk/awk '{s = sprintf("asdf%casdf", 195); printf("\"%s\"\n", substr(s, 6, 4)); print s;}'
""
asdfÃasdf

Try changing the second and third arg of substr (set length to 1 and it returns "f").
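
To see why that string would count as invalid UTF-8, here is a small Python sketch. It assumes that %c with 195 emits the single raw byte 0xC3 (I have not checked what 9base's awk really emits), and the names raw and valid are only for illustration. 0xC3 is the lead byte of a two-byte sequence, so it must be followed by a continuation byte, not by the ASCII letter "a".

```python
# Assumption: sprintf("asdf%casdf", 195) produces these nine bytes,
# i.e. %c writes the raw byte 195 (0xC3) rather than an encoded rune.
raw = b"asdf" + bytes([195]) + b"asdf"

try:
    raw.decode("utf-8")
    valid = True
except UnicodeDecodeError:
    # 0xC3 opens a two-byte sequence, but the next byte is 'a' (0x61),
    # which is not a continuation byte.
    valid = False

print(valid)
# A lenient decoder substitutes U+FFFD for the stray byte.
print(raw.decode("utf-8", errors="replace"))
# Read as Latin-1, the same byte is the letter shown in the output above.
print(raw.decode("latin-1"))
```

This only shows that the input is malformed; it says nothing about how awk's substr walks over it.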


David

On Jan 22 2013, Peter A. Shevtsov wrote:

Hello,

I've found a bug in 9base's awk. It seems that printf handles UTF-8 strings incorrectly. The way it counts string length is weird:

echo latin кириллица | /usr/local/plan9/bin/awk '{printf("[%20s][%20s]\n", $1, $2)}'

and the output is:

[               latin][  кириллица]

It seems that it counts every Cyrillic letter as two, i.e. it counts bytes rather than letters (or runes).
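
The difference can be sketched in Python. This is only an illustration of the two ways of measuring a field, not awk's actual code; pad_by_bytes and pad_by_runes are made-up names.

```python
# Hypothetical helpers: both right-align s in a field of the given width,
# but they measure the length of s differently.

def pad_by_bytes(s: str, width: int) -> str:
    """Measure s in UTF-8 bytes (what the output above suggests is happening)."""
    n = len(s.encode("utf-8"))
    return " " * max(width - n, 0) + s

def pad_by_runes(s: str, width: int) -> str:
    """Measure s in code points (what one would expect for %20s)."""
    return " " * max(width - len(s), 0) + s

for word in ("latin", "кириллица"):
    print("[%s] bytes" % pad_by_bytes(word, 20))
    print("[%s] runes" % pad_by_runes(word, 20))
```

For "latin" both helpers agree, because every letter is one byte. For "кириллица" (9 letters, 18 bytes in UTF-8) the byte-based version adds only 2 spaces of padding, matching the output above, while the rune-based version adds 11.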


