Hello Coreutils maintainers!
I've recently spent some time adding multibyte support to the coreutils
text processing tools (sort, uniq, join, tr, cut, paste, expand, unexpand,
fmt, fold, and pr) in this repository:
https://github.com/ericfischer/coreutils-utf8
I haven't yet tackled cut -bn or multibyte octal escapes in tr, and I
haven't figured out whether there is an appropriate way to do multibyte
case mappings in dd. tr also probably uses too much memory. Otherwise, I
think all the places where POSIX specifies characters instead of bytes are
covered.
I just learned from
http://lists.gnu.org/archive/html/bug-coreutils/2017-12/msg00017.html that
there is another ongoing multibyte project. I wish I had known about that
before duplicating effort, but at least it looks like I have touched some
areas that the other branch hasn't, so I hope my changes will still be of
some use.
Before I put any more work into cleaning up my branch, I also have a
general development question: Is it OK to make multibyte additions to the
lib directory here, or do those changes need to be made in an upstream
repository or in the applications themselves?
Eric