Package: docbook-utils
Version: 0.6.12-2
Severity: normal

I experience a strange misfeature when generating HTML from DocBook. The parser does not treat a newline as ignorable whitespace, and seems to copy it into the HTML file. I made a small example to demonstrate the problem. I believe these two XML files should generate the same result:

File 1:

<?xml version="1.0" encoding="ASCII"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" []>
<book lang="en">
  <bookinfo>
    <title>T</title>
  </bookinfo>
  <chapter>
    <title>T1</title>
    <sect1>
      <title>T12</title>
      <para></para>
      <para>P</para>
      <para>P1</para>
      <para>P12</para>
      <para>P123</para>
    </sect1>
    <sect1>
      <title>T123</title>
      <para>P1234</para>
    </sect1>
  </chapter>
</book>

File 2:

<?xml version="1.0" encoding="ASCII"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" []>
<book lang="en">
  <bookinfo>
    <title>
      T
    </title>
  </bookinfo>
  <chapter>
    <title>
      T1
    </title>
    <sect1>
      <title>
        T12
      </title>
      <para>
      </para>
      <para>
        P
      </para>
      <para>
        P1
      </para>
      <para>
        P12
      </para>
      <para>
        P123
      </para>
    </sect1>
    <sect1>
      <title>
        T123
      </title>
      <para>
        P1234
      </para>
    </sect1>
  </chapter>
</book>

The only difference is the newline between the tags and the content. When generating HTML from these two sources using 'docbook2html --nochunks', the HTML output has differences like this:

--- test.en.html	2003-09-13 23:41:04.000000000 +0000
+++ test.en2.html	2003-09-13 23:41:23.000000000 +0000
@@ -2,7 +2,8 @@
 <HTML
 ><HEAD
 ><TITLE
->T</TITLE
+> T
+ </TITLE
 ><META
 NAME="GENERATOR"
 CONTENT="Modular DocBook HTML Stylesheet Version 1.7"></HEAD

Notice the extra ' ' inserted in front of the book title. Why is this so? Is it a bug in the parser, or something else?
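
The difference between the two files can also be seen outside the DocBook toolchain. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module (which is not part of docbook-utils and is not what docbook2html uses); the names COMPACT, SPREAD, title_text and normalized are made up for this illustration. It parses a reduced <title> element in both layouts and shows that a generic XML parser reports the newlines as character data, while the two become identical once whitespace is normalized. It says nothing about which component of the docbook2html pipeline is responsible.

```python
import xml.etree.ElementTree as ET

# Reduced versions of the <title> element from File 1 and File 2.
COMPACT = "<title>T</title>"
SPREAD = "<title>\n  T\n</title>"


def title_text(xml_source):
    """Return the character data of the root element exactly as parsed."""
    return ET.fromstring(xml_source).text


def normalized(text):
    """Collapse runs of whitespace into one space and trim both ends."""
    return " ".join(text.split())


compact_text = title_text(COMPACT)
spread_text = title_text(SPREAD)

# The raw character data differs: the second layout keeps its newlines.
print(repr(compact_text))
print(repr(spread_text))

# After whitespace normalization the two titles compare equal.
print(normalized(compact_text) == normalized(spread_text))
```

If the stylesheets applied a normalization like this to titles before writing them out, both files would presumably produce the same <TITLE> content.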

-- System Information
Debian Release: 3.0
Architecture: i386
Kernel: Linux minerva.hungry.com 2.4.19-386 #1 Mon Nov 18 21:50:03 EST 2002 i686
Locale: LANG=no_NO, LC_CTYPE=no_NO

Versions of packages docbook-utils depends on:
ii  docbook-dsssl  1.76-1            Modular DocBook DSSSL stylesheets,
ii  jadetex        3.12-2            LaTeX macros for SGML to DVI/PS/PD
ii  links          0.96.20020409-2   Character mode WWW browser
ii  lynx           2.8.4.1b-3.2      Text-mode WWW Browser
ii  perl           5.6.1-8.3         Larry Wall's Practical Extraction
ii  sgmlspl        1.03ii-20         SGMLS-based example Perl script fo
ii  sp             1.3.4-1.2.1-28    James Clark's SGML parsing tools