Jim Gifford wrote:
With the recent thread on the Cross-LFS and LFS-dev lists, I wanted to
pose an idea. Couldn't we add the UTF-8 pages to an appendix and refer
to them via notes in the builds, as we currently do with Cracklib in
Shadow? The only difference is that we would be referring to an
appendix in the book. Is this acceptable to everyone?
Jim,
I have prepared a sample patch (attached, but please don't apply it yet)
that makes the changes you described to the Coreutils page. Does
this sample look OK to you?
If so, I will make similar changes to the other package pages and submit a
combined patch. All multibyte patches will become optional (but you will
then have to deal with the "how do I upgrade my system to a UTF-8-capable
one" support questions yourself).
--
Alexander E. Patrakov
Index: chapter06/coreutils.xml
===================================================================
--- chapter06/coreutils.xml (revision 7278)
+++ chapter06/coreutils.xml (working copy)
@@ -29,6 +29,10 @@
<sect2 role="installation">
<title>Installation of Coreutils</title>
+<caution><para>The Coreutils package has some issues when used in a multibyte
+locale. For a full explanation of the issues and the related build
+procedure changes, see <xref linkend="l-coreutils"/>.</para></caution>
+
<para>A known issue with the <command>uname</command> program from
this package is that the <parameter>-p</parameter> switch always
returns <computeroutput>unknown</computeroutput>. The following patch
@@ -41,21 +45,7 @@
<screen><userinput>patch -Np1 -i ../&coreutils-suppress-patch;</userinput></screen>
-<para>POSIX requires that programs from Coreutils recognize character
-boundaries correctly even in multibyte locales. The following patch
-fixes this non-compliance and other internationalization-related bugs:</para>
-
-<screen><userinput>patch -Np1 -i ../&coreutils-i18n-patch;</userinput></screen>
-
-<para>In order for the tests added by this patch to pass, the permissions for
-the test file have to be changed:</para>
-
-<screen><userinput>chmod +x tests/sort/sort-mb-tests</userinput></screen>
-
-<note><para>In the past, many bugs were found in this patch. When reporting
-new bugs to Coreutils maintainers, please check first if they are reproducible
-without this patch.</para></note>
-
+<!-- this happens even in non-UTF-8 locales, so this sed is not optional -->
<para>It has been found that translated messages sometimes overflow a buffer
in the <command>who -Hu</command> command. Increase the buffer size:</para>
Index: index.xml
===================================================================
--- index.xml (revision 7278)
+++ index.xml (working copy)
@@ -40,6 +40,7 @@
<?dbhtml filename="part4.html"?>
<xi:include xmlns:xi="http://www.w3.org/2003/XInclude" href="appendixa/acronymlist.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2003/XInclude" href="appendixb/acknowledgments.xml"/>
+<xi:include xmlns:xi="http://www.w3.org/2003/XInclude" href="appendixc/locale-issues.xml"/>
</part>
<index/>
Index: appendixc/locale-issues.xml
===================================================================
--- appendixc/locale-issues.xml (revision 0)
+++ appendixc/locale-issues.xml (revision 0)
@@ -0,0 +1,80 @@
+<?xml version="1.0" encoding="ISO-8859-1"?>
+<!DOCTYPE appendix PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+ "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
+ <!ENTITY % general-entities SYSTEM "../general.ent">
+ %general-entities;
+]>
+
+<appendix id="appendixc" xreflabel="Appendix C">
+ <?dbhtml dir="appendixc"?>
+ <?dbhtml filename="locale-issues.html"?>
+
+ <title>Locale-related issues</title>
+
+ <para>This appendix contains modifications of the build procedure that are
+ needed for the resulting system to support multibyte locales (including
+ UTF-8-based ones). All these changes are harmless if only traditional
+ 8-bit locales are used. It is safe to apply all of them
+ unconditionally.
+ </para>
+ <sect1 id="l-coreutils">
+ <title>Coreutils-&coreutils-version;</title>
+
+ <para>By default, the <command>cut</command>, <command>expand</command>,
+ <command>fold</command>, <command>join</command>,
+ <command>pr</command>, <command>sort</command>,
+ <command>unexpand</command> and <command>uniq</command> programs
+ don't behave correctly in locales where a character can occupy
+ more than one cell or can be represented by a multibyte sequence.
+ Errors include:</para>
+ <itemizedlist>
+ <listitem>
+ <para>failures to recognize multibyte characters as field separators,
+ </para>
+ </listitem>
+ <listitem>
+ <para>cases where the newline character or something else gets inserted
+ between the bytes that form a multibyte character (thus damaging
+ it),
+ </para>
+ </listitem>
+ <listitem>
+ <para>cases where the length of a string in bytes is used where its
+ width in cells should be used (thus causing incorrect formatting
+ and alignment of the output text),
+ </para>
+ </listitem>
+ <listitem>
+ <para>failures to treat multibyte whitespace characters (according to the
+ current locale) as whitespace.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>Such erroneous behaviour violates POSIX requirements.
+ In order to remove some of these errors, apply the following patch
+ before running the configure script:
+ </para>
+
+ <screen><userinput>patch -Np1 -i ../&coreutils-i18n-patch;</userinput></screen>
+
+ <para>In order for the tests added by this patch to pass, the permissions
+ for the new test file have to be changed:</para>
+
+<screen><userinput>chmod +x tests/sort/sort-mb-tests</userinput></screen>
+
+ <note>
+ <para>In the past, many bugs were found in this patch. When reporting
+ new bugs to Coreutils maintainers, please check first if they are
+ reproducible without this patch.
+ </para>
+ </note>
+ <para>It should also be noted that the patch does not resolve all
+ issues in Coreutils functionality in multibyte locales. One of the
+ remaining issues is that the
+ <command>tr [:upper:] [:lower:]</command>
+ command doesn't change the case of non-ASCII characters.
+ </para>
+ </sect1>
+
+</appendix>