Hi,

Quoting Adam Borowski (2013-08-12 02:51:52)
> On Mon, May 06, 2013 at 02:49:57PM +0200, Andreas Beckmann wrote:
> > now might be the right time to start a discussion about release goals
> > for jessie.
>
> I would like to propose full UTF-8 support.  I don't mean here full
> support for all of Unicode's finer points, merely complete eradication of
> mojibake.  That is, ensuring that /m.o/ matches "möo", or that "ä" sorts
> as equal to "a""combining ¨" is out of scope of this proposal.
>
> I propose the following sub-goals:
>
> 1. all programs should, in their default configuration, accept UTF-8 input
>    and pass it through uncorrupted.  Having to manually specify encoding
>    is acceptable only in a programmatic interface, GUI/std{in,out,err}/
>    command line/plain files should work with nothing but LC_CTYPE.

as an addendum to this release goal proposal, it may also be worth
mentioning working multibyte character support in coreutils as a possible
goal. From http://bugs.debian.org/139861 :

    $ echo -e "日\n本\nで\nは" | sort -u | wc -l
    3
    $ echo -e "日\n本\nで\nは" | sort | wc -l
    4

Four distinct lines go in, yet sort -u only returns three of them.

Having head/tail work on characters instead of bytes would be sweet as
well.

While upstream doesn't seem to support this, Fedora seems to carry a patch
for coreutils:

http://pkgs.fedoraproject.org/cgit/coreutils.git/tree/coreutils-i18n.patch?id=6e10f376996b64f538259091a524df2249b653fb;id2=HEAD

or also:

http://trac.cross-lfs.org/browser/patches/coreutils-6.12-unicode-1.patch?rev=577dd2d59133e10bd32c58844293e93af0e6f162

cheers, josch
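
To illustrate the head/tail point: the following is only a minimal Python
sketch of the difference between cutting UTF-8 input by bytes and cutting it
by characters. It is not coreutils code and not taken from either patch; the
helper names head_bytes and head_chars are made up for this example, and
"character" is assumed to mean a Unicode code point.

```python
# Illustration only: byte-based vs. character-based truncation of UTF-8 text.
# head_bytes and head_chars are hypothetical helpers, not coreutils functions.

text = "日本では"
data = text.encode("utf-8")  # each of these four characters is 3 bytes long


def head_bytes(data: bytes, n: int) -> bytes:
    """Keep the first n bytes, regardless of character boundaries."""
    return data[:n]


def head_chars(data: bytes, n: int) -> bytes:
    """Keep the first n characters (code points) of UTF-8 encoded input."""
    return data.decode("utf-8")[:n].encode("utf-8")


# Cutting after 4 bytes keeps all of the first character plus one stray
# byte of the second, so the result is no longer valid UTF-8.
print(head_bytes(data, 4))   # b'\xe6\x97\xa5\xe6'

# Cutting after 2 characters keeps two complete 3-byte sequences.
print(head_chars(data, 2))   # b'\xe6\x97\xa5\xe6\x9c\xac'
```

The byte-based cut is what produces mojibake on a terminal: the trailing
0xe6 is the start of a character whose remaining bytes were dropped.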