On 1/12/22 12:19, zendas via GNU coreutils Bug Reports wrote:
> I have considered dealing with this problem directly with three bytes
> instead, but I have two doubts: I can correctly use wc -m to recognize the
> characters in the same environment (but cut can't?), and my script's goal
> is to recognize Chinese, so it is more likely to be run on platforms that
> support a Chinese environment. In addition, the fixed three-byte approach
> cannot handle mixed full-width and half-width content. I would need a lot
> of checks and conversions, which would greatly increase the possibility of
> errors.

As Bob wrote, some downstream distributions have had multi-byte support in
cut(1) for many years, e.g. RHEL/Fedora and SUSE/openSUSE.

E.g. here on my openSUSE system:

  $ echo "你好啊" | LC_ALL=zh_CN.UTF-8 cut -c 1
  你

Have a nice day,
Berny
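
On a system whose cut(1) lacks the downstream multi-byte patches, selecting
by character rather than by byte can be sketched outside of cut. The
following is only a minimal Python illustration, assuming the input is valid
UTF-8; the helper name cut_chars is hypothetical and not part of coreutils.
It counts Unicode code points, so mixed full-width and half-width text needs
no fixed three-byte arithmetic, but it does not account for display width or
combining sequences.

```python
from typing import Optional


def cut_chars(line: str, first: int, last: Optional[int] = None) -> str:
    """Return characters first..last (1-based, inclusive), similar to `cut -c`.

    Hypothetical helper for illustration: it indexes decoded code points,
    not bytes, so a three-byte UTF-8 character counts as one position.
    """
    if first < 1:
        raise ValueError("positions are numbered from 1")
    if last is None:
        last = first
    return line[first - 1:last]


if __name__ == "__main__":
    raw = "你好啊".encode("utf-8")   # bytes as they would arrive on a pipe
    text = raw.decode("utf-8")       # decode once, then index by character
    print(len(raw), len(text))       # byte count versus character count
    print(cut_chars(text, 1))        # first character
    print(cut_chars("a你b好", 2, 3))  # mixed half-width and full-width input
```

Decoding first and slicing afterwards keeps the byte length of each
character out of the script's logic entirely.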