severity 24924 wishlist
tags 24924 wishlist notabug
thanks

Hello Dan,

On 11/11/2016 11:10 AM, 積丹尼 Dan Jacobson wrote:
> The pr documentation (man, info) doesn't mention how it has no concept
> of wide characters.
> $ pr -m --sep-string='^^^' file file

Indeed, most of the current coreutils programs do not support wide or
multi-byte characters correctly. The current official implementation of
'pr' does not support them (which is why I marked this item as
'wishlist' and not a bug).

On Red Hat systems, there is the 'i18n' patch, which adds some support
but also introduces some problematic issues:
  https://github.com/pixelb/coreutils/tree/i18n

However, there is an active effort to make all coreutils programs
multibyte-aware. The latest updates are (in reverse chronological
order; these are somewhat long threads):
  http://lists.gnu.org/archive/html/coreutils/2016-09/msg00026.html
  http://lists.gnu.org/archive/html/coreutils/2016-09/msg00011.html
  http://lists.gnu.org/archive/html/coreutils/2016-07/msg00013.html

'cut' and 'expand' were the first two programs I worked on. 'pr' is
definitely on the list. Once I have a proof-of-concept working, I would
very much appreciate it if you could help me test it, as there are many
edge cases with multibyte support and wide characters.

As a curiosity, are you using UTF-8 locales exclusively, or do you have
experience with Shift-JIS or EUC-JP locales?

I'm leaving this ticket open, and welcome discussion and comments.

regards,
 - assaf

P.S.
The usual disclaimer applies: there is currently no ETA for multibyte
support in coreutils.
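
To illustrate why byte-oriented column arithmetic misaligns output with
wide characters, here is a minimal Python sketch. It is not coreutils
code: 'display_width' and 'measure' are hypothetical helpers, and the
width rule used (two columns for East Asian Wide/Fullwidth characters,
zero for combining marks, one otherwise) is a simplifying assumption
rather than a full wcwidth() implementation.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate the number of terminal columns `text` occupies.

    Assumption: East Asian Wide ("W") and Fullwidth ("F") characters take
    two columns, combining marks take none, everything else takes one.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks are drawn over the previous character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def measure(text: str, encoding: str = "utf-8") -> dict:
    """Return the three different 'lengths' a text tool might rely on."""
    return {
        "bytes": len(text.encode(encoding)),  # what a byte-oriented tool counts
        "chars": len(text),                   # code points
        "columns": display_width(text),       # approximate on-screen width
    }


if __name__ == "__main__":
    for sample in ("Dan", "積丹尼"):
        print(sample, measure(sample))
```

For the ASCII sample all three numbers agree; for the CJK sample in
UTF-8 they all differ, which is the kind of mismatch that breaks column
alignment. Passing encoding="shift_jis" or encoding="euc_jp" to
'measure' shows how the byte count also depends on the locale's
encoding.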
