On Wed, 24 Jun 2020 Finn wrote:
Charles Curley wrote:
Before you do that, are you sure what you see isn't Firefox's reaction
to buggy HTML? Have you run the code through an HTML validator?
Thanks for the reminder. The HTML validator shows only one error, with
the "width" attribute[1].
[1]:
https://validator.w3.org/check?uri=https%3A%2F%2Fwww.debian.org%2Fdoc%2Fmanuals%2Fdebian-reference%2Fch10.en.html%23_removable_storage_device&charset=%28detect+automatically%29&doctype=Inline&group=0
I see this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
^^^^^
and I see a lot of empty elements written in combined start-end form,
like this,
<tagname attribs />
but without any space preceding the forward slash at the end of the tag.
And by "a lot", I mean one dozen and two hundred:
$ grep -o '<[^<]*[^[:blank:]]/>' ch10.en.html | wc -l # NB[1]
212
$ grep -o '<[^<]*[^[:blank:]]/>' ch10.en.html |
> grep -o "^<[[:alpha:]]*" |
> tr -d "<" | sort | uniq -c
67 a
15 br
54 col
2 hr
67 img
5 link
2 meta
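Note that "a" appears in that tally but is not among the empty tags
listed in Table 16-1 of the excerpt quoted below. As a rough sketch
(same regex caveats as NB[1]), the pipeline can be narrowed to show only
names that are not in that table. The function name and the variable
holding the list are made up for illustration; the names themselves are
copied from Table 16-1.

```shell
# Names from Table 16-1 (the "empty" tags), one per line.
table_16_1='area
base
basefont
br
col
frame
hr
img
input
isindex
link
meta
param'

# Hypothetical helper: tally tags written as <name .../> with no space
# before the slash, keeping only names that are NOT in Table 16-1.
non_empty_tight_tags() {
    grep -o '<[^<]*[^[:blank:]]/>' "$1" |
    grep -o '^<[[:alpha:]]*' |
    tr -d '<' |
    grep -v -x -F -e "$table_16_1" |   # drop exact matches from the list
    sort | uniq -c
}

# Usage (assumes a local copy of the page):
# non_empty_tight_tags ch10.en.html
```

On the tally above this should leave only the "a" row, since every other
name there is in the table.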
In ye olde book about xhtml[2], it is written:
Section 16.3.3 Handling Empty Elements
In XML, and thus XHTML, every tag must have a corresponding end
tag---even those that aren't allowed to contain other tags or
content. Accordingly, XHTML expects the line break to appear as
<br></br> in your document. Ugh.
Fortunately, there is an acceptable alternative: include a slash
before the closing bracket of the tag to indicate its ending (eg,
<br />). If the tag has attributes, the slash comes after all the
attributes so that an image could be defined as:
<img src="kumquat.gif" />
While this notation may seem foreign and annoying to an HTML
author, it actually serves a useful purpose. Any XHTML element that
has no content can be written this way. Thus, an empty paragraph
can be written as <p />, and an empty table cell can be written as
<td />. This is a handy way to mark empty table cells.
Clever as it may seem, writing empty tags in this abbreviated way
may confuse HTML browsers. To avoid compatibility problems, you can
fool the HTML browsers by placing a space before the forward slash
in an empty element using the XHTML version of its end tag. For
example, use <br />, with a space between the "br" and '/', instead
of the XHTML equivalents <br/> and <br></br>. Table 16-1 contains
all of the empty HTML tags, expressed in their acceptable XHTML
(transitional DTD) forms.
Table 16-1. *HTML empty tags in XHTML format*
<area />     <base />    <basefont />
<br />       <col />     <frame />
<hr />       <img />     <input />
<isindex />  <link />    <meta />
<param />
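To try the book's advice without touching the original, one could rewrite
a local copy so that every "/>" is preceded by a space. This is only a
sketch: the function name and the output file name are made up, and the
substitution is a blunt regex (see NB[1]) that will also touch any "/>"
that happens to sit inside text or attribute values.

```shell
# Hypothetical helper: insert a space before "/>" wherever one is missing.
# Reads HTML on stdin and writes the result to stdout.
add_space_before_slash() {
    sed 's#\([^[:blank:]]\)/>#\1 />#g'
}

# Usage (assumes a local copy of the page; writes a new file):
# add_space_before_slash < ch10.en.html > ch10.spaced.html
```

Tags that already have the space, such as <hr />, are left alone, so the
result can be fed back through the grep count above to check that it drops
to zero.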
NOTES
1. "Every time you attempt to parse HTML with regular expressions, the
unholy child weeps the blood of virgins, and Russian hackers pwn
your webapp."
stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454
2. Musciano and Kennedy 2007, "HTML & XHTML: The Definitive Guide" (6th ed.)
--
Firstly, you must always implicitly obey orders, without attempting to
form any opinion of your own respecting their propriety. Secondly, you
must consider every man your enemy who speaks ill of your king; and
thirdly, you must hate a Frenchman, as you do the devil. --H. Nelson