Hi Ken,
>> ... and by the way why not open a little discussion
>> about using CSS in all Apache docs?
> By all means.  What we really want to do, methinks,
> is rip out all of the layout information and let the
> stylesheets apply it when generating the HTML.

Full Ack. This would mean understanding the semantic primitives used most often inside the Apache docs and then making them CSS classes of appropriate tags.

> However, I don't think we're doing the XML thing for
> the 1.3 documentation (though I personally think it
> would be a Good Thing; any volunteers?).

If you mean generating HTML from XML, then I am not the XML freak to do so. But using CSS would be most useful if good and solid HTML is being used - see below for lots of details.

> Let's definitely start working on a CSS stylesheet that
> does what we want for the documentation.

I'm all in favour.

> When we've hammered out something with which we're
> consensually happy, we can retrofit the docco to use it.

This may be one way to do it. I am in favor of another way: decide on the direction and start taking steps that will never need to be revoked and don't do any harm. Thus I would try to migrate all files incrementally and in parallel - and this is just what I did during the last days.

> In the meantime, though, I think having CSS in some files,
> and not in others, and even different CSS in different
> files, is just a confusing morass that we'd do well to avoid.

:-) This was the direct trigger for what you will now read below.

Regards, Michael

- - - - - - - - - - - - - - - - - - - - - - - - - - -

You allowed for a discussion about using CSS in the Apache docs. And you still cared about the Apache 1.3 - so do I. Here we are:

Would you please have a look at

    http://www.schroepl.net/projekte/apache1326xhtml11/

and

    http://www.schroepl.net/projekte/apache1326xhtml11/validate/

What you will find there is a _complete_ version of the Apache 1.3.26 manual which is now valid XHTML 1.0 Strict.
(Oh, sorry - one file is still XHTML 1.0 Transitional - see below why.)

The file

    http://www.schroepl.net/projekte/apache1326xhtml11/apache1326xhtml11.zip (1.1 MB)

contains this document tree (and my perl script to generate the /validate/ page) if you want to download it.

Despite the path name on my server, during the process I decided against using XHTML 1.1, as this would no longer tolerate <a name="..."> and thus prevent Netscape 4 from working correctly; it might be one year too early for that. Moving from XHTML 1.0 Strict to XHTML 1.1 would then only require replacing <a name="..."> by <a id="...">, which isn't too much work.

Using CSS for the Apache documentation IMHO should serve the purpose of unifying the looks of all documents as well as unifying the looks of document parts that want to express a specific common semantics (like "this is code" or "this is an example" or "this is a syntax description of a directive").

So to be able to understand how far such common structures and semantics are already in use, I took a close look at the whole Apache 1.3.26 documentation. I decided to do it for the 1.3.26 docs as I am familiar with them, and also because they are no longer updated frequently, so my version wouldn't be completely outdated before I could finish it. And then, the 1.3.26 docs' source code is still HTML; whatever results from this project would have to be adapted to the XML->HTML generation process to be used for the 2.0 documentation, where there is not yet a complete source available (IIRC).

To make sure I looked closely at the HTML coding style in use, I made it my task to port this documentation to XHTML 1.0 Strict. M$IE and Mozilla now support a "standards compliant mode" if a <DOCTYPE> line is detected, so the document had better be a valid one now. It won't look great in Netscape 4 for sure - but I was just curious about how much would have to be done to reach valid XHTML 1.0 Strict.
I did this by fixing technically broken parts of the documents (as simply as possible) and emulating layout definitions that are deprecated in XHTML by using a little CSS instead. The result ought to look very similar to the original, at least in modern browsers, but provides a higher degree of abstraction about document structure.

But all tags formatted via CSS are now highlighted in some colors. This doesn't mean I consider these colors to be reasonable; I only wanted to give an impression of which parts of the documents have been modified during the XHTML 1.0 Strict validation process and which ones are represented by which HTML tags. This is implemented by just a handful of CSS lines inside the global "httpd.css" file and can easily be removed.

The only explicit deviation from the original looks that I have implemented is the changing background color of hovering links. Again, this is nothing but one line in the central CSS file.

This might be a starting point for further modifications (replacing specific code sequences by the use of abstract class definitions), but it might just serve as a proof of concept that the Apache 1.3 docs might well be published in XHTML 1.0 Strict without severe drawbacks.

I didn't go as far as I would have done with my own files: I didn't replace cellpadding/cellspacing attributes by CSS definitions etc., and I didn't eliminate valign="top". I only did what the W3C validator told me to do for XHTML 1.0 Strict and tried to keep the "diff" as small as possible, so that you can easily understand what I changed. I can well imagine that you won't just take my version but apply only a part of the changes I have made.

And on the way I found a handful of HTML coding errors whose correction might even be reasonable in the current XHTML 1.0 Transitional version (see below). I tried to log as many as possible in this file but may have corrected some more. Please "diff" my files against the 1.3.26 ones to be sure about that.
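
For anyone who wants to review the changes file by file without a command-line "diff" at hand, here is a minimal sketch in Python using the standard difflib module. The directory names in the diff header and the sample strings are only placeholders of mine, not the real tree layout.

```python
import difflib

def unified(old_text, new_text, name):
    """Return a unified diff between two versions of one document."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(difflib.unified_diff(
        old_lines, new_lines,
        fromfile="apache_1.3.26/" + name,  # hypothetical directory name
        tofile="xhtml/" + name))           # hypothetical directory name

# Made-up sample input, mimicking the <h1 align="center"> change.
old = '<h1 align="center">Title</h1>\n<p>Text</p>\n'
new = '<h1>Title</h1>\n<p>Text</p>\n'
print(unified(old, new, "example.html"))
```

Identical inputs give an empty string, so a loop over all files would only print the documents that actually changed.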
Sorry for the lack of structure of all that follows below - these are just my thoughts and ideas while editing and validating all those 250 HTML files.

P.S.: By the way, don't use the string "referer" within a file name (!) whose content you want to validate ...

=========================================================

Added

    <?xml version="1.0" encoding="iso-8859-1" ?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">

to all HTML documents, except those (Japanese only) that already contained more specific tags (which I didn't touch).

Each document that was already XHTML 1.0 Strict (several dozens) contained _two_ DOCTYPE statements, not one; removed the older "Transitional" DOCTYPE from these.

There are three Japanese documents with a different <html> tag than the rest of the Japanese translation:
- mod/mod_cgi.html.ja.jis
- mod/mod_indexbytype.ja.jis
- cgi_path.html.ja.jis
I didn't touch this but it might be noteworthy, and might even be unified by some of the Japanese users.

One document was still HTML 3.2 Final:
- mod/directives.html.de
I ported this one to XHTML as well.

Many French translations contain a comment stating which English revision their translation is based on; should this be a general rule for all translations? Should this be expressed by some <meta> tag instead of a comment?

There are several

    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />

tags inside French documents, which might now be replaced by setting the charset in the <?xml version="1.0" encoding="iso-8859-1"?> definition. I used this <?xml> - as the W3C validator urged me to - but didn't touch the <meta>s.

One document, mpeix.html, contained

    <meta name="GENERATOR" content="Mozilla/4.75 [en] (Win98; U) [Netscape]" />

which I removed. This one is also using <hr width="100%" />, which I have reduced to <hr />, as 100% would be the default value anyway.
This might be formatted using CSS if necessary.

<font> is deprecated in XHTML 1.0 Strict. It was used in the following documents:
- mod/mod_rewrite.html: <font size="-1"> replaced by <small>.
- mod/mod_cookies.html, mod/mod_dtd.html, mod/core.html.html, mod/core.html.en: <font color="red"> -> <span class="important"> and provided a global CSS formatting for this one.
- custom_error.html.fr: <font face="Courier"> -> <code>
<font size="+1"> has been replaced by <big>, but should possibly even be replaced by some CSS formatting as well. Likewise, <font size="-1"> became <small> in some cases.

The misc/howto.html document is the only one that contains <meta description> and <meta keywords>; it might be a good idea to provide these for all documents.

The misc/fin_wait_2.html document is the only one that contains <link rev="made"> to tell about the author, while the aforementioned mpeix.html uses a <meta> tag for this purpose. Again, unifying this concept might increase usability somehow for the future.

Replaced

    <img src="../images/sub.gif" alt="[APACHE DOCUMENTATION]" />

by

    <img src="../images/sub.gif" alt="[APACHE DOCUMENTATION]" height="62" width="500" />

to make rendering of the document faster. I might have done the same for the other <img> tags but XHTML Strict didn't require me to; so I decided to keep the file difference smaller and more understandable for now.

Added

    <link rel="Stylesheet" type="text/css" href="httpd.css" />

to 89 top-level HTML documents and

    <link rel="Stylesheet" type="text/css" href="../httpd.css" />

to 161 documents one directory level deeper. So now all docs include one global CSS file where any style sheet definitions may be placed. (See below for one specific case of another CSS file.)

Removed all

    <meta name="generator" content="HTML Tidy, see www.w3.org" />

tags, as the version I created isn't "generated" any more.

Replaced all <body ...> tags by a simple <body>; body colors and link colors are now formatted globally via CSS.
Then removed the

    <!-- Background white, links blue (unvisited), navy (visited), red (active) -->

comment from all documents, as this is no longer being set inside each document; this comment now resides inside httpd.css where it belongs.

Checked for <h1> tags in all documents. Nearly all of them are <h1 align="center"> and use <h2>, <h3> ... for section structuring, except for the following four documents:
1. mpeix.htm: <center><h1></center> instead of <h1 align="center">, replaced this line.
2. howto/auth.html: <h1> used for several purposes here; downgraded all <h*> tags one level, except for the first one.
3. misc/rewriteguide.html (using <h1> tags for <h2> purposes); downgraded all <h*> tags one level, except for the first one.
4. dso.html is skipping the <h2> level totally; upgraded all <h3> tags to <h2>.
Now each document contained exactly one <h1 align="center">; I replaced this by <h1> (in 242 documents) and made it align centered via CSS.

Furthermore, misc/rewriteguide.html needed three local CSS definitions to format things that are unique to this file (tables and centered <address>).

Located the starting sequence of each document (content of "header.html"), using a <h3> tag and a <div align="center"> around it; replaced all that by

    <div id="theapachelogo">
    <img src="images/sub.gif" alt="[APACHE DOCUMENTATION]" height="62" width="500" /><br />Apache HTTP Server 1.3
    </div>

(the id="logo" was already occupied by misc/FAQ.html), made it align centered via CSS, and set the font face and font weight so that it looks like the former <h3> (so as to no longer misuse a hierarchy tag for layout purposes). Some documents missed the "1.3" version number here; this has been unified during the process.

Replaced

    <h3 align="CENTER">Apache HTTP Server Version 1.3</h3>

by

    <p class="footer">Apache HTTP Server Version 1.3</p>

in all documents; made p.footer align centered via CSS. Some documents missed the "1.3" version number; this has been unified during the process.
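
The <h1> check described above (exactly one <h1> per document, no skipped heading levels as in dso.html) is the kind of thing a small script can do. The following is only a sketch of how it might look in Python, not the procedure I actually followed; the function names are invented for this example and the regular expression is a rough approximation of real HTML parsing.

```python
import re

# Matches opening heading tags such as <h1> or <h3 align="center">.
HEADING = re.compile(r"<h([1-6])\b", re.IGNORECASE)

def heading_levels(html):
    """Return the heading levels (1-6) in document order."""
    return [int(m.group(1)) for m in HEADING.finditer(html)]

def heading_problems(html):
    """List deviations from 'exactly one <h1>, no skipped levels'."""
    levels = heading_levels(html)
    problems = []
    if levels.count(1) != 1:
        problems.append(
            "expected exactly one <h1>, found %d" % levels.count(1))
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            problems.append(
                "<h%d> follows <h%d>, skipping a level" % (cur, prev))
    return problems

# A document shaped like dso.html: <h1> followed directly by <h3>.
print(heading_problems('<h1 align="center">DSO</h1><h3>Background</h3>'))
```

Documents like howto/auth.html, which used <h1> several times, would show up through the first check; the second check catches skipped levels.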
Surrounded

    <a href="./"><img src="images/index.gif" alt="Index" /></a>

by <p>...</p> at the end of about 90 documents, as this content wasn't inside any block element. The same applies to

    <a href="./"><img src="../images/index.gif" alt="Index" /></a>
    <a href="../"><img src="../images/home.gif" alt="Home" /></a>

at the end of about 160 documents inside subdirectories.

Replaced <p align="LEFT"> by simply <p>, as left alignment is the default anyway, and might be specified via CSS for the <p> tag if necessary. This was used inside three "suexec" documents only, which are formatted very differently from the rest of the Apache docs; these files should possibly be reformatted. The way links to the table of contents are formatted in these documents seems to be unique within the whole Apache docs; thus I didn't introduce another global format but specified one (p.tocontent) inside these three documents.

Some documents, like mod/mod_rewrite, want to indent the whole <body> and used <blockquote> for this purpose. Replaced this by setting a CSS margin-left on <body>, but I don't feel this should really be used at all.

Seven documents, mostly named "win_", seem to need underlining. As the <u> tag is deprecated and invalid in XHTML 1.0 Strict, I supplied a global definition of <span class="underline">.

Indentation via <blockquote> has been used at over 400 positions in more than 90 files; but just writing text inside a <blockquote> without some <p> etc. isn't valid XHTML 1.0 Strict. Thus I replaced <blockquote> by <div class="indent"> and created a global CSS definition for this one, and another one for <div class="indentdeep"> (for the one document that used several cascaded <blockquote>s - maybe better eliminate this totally).
This point was most interesting to me, as there seem to be different semantics that are currently expressed by indenting, like "this is an example" or "this is important" or "look at this", all of which might be explicitly defined, mapped to some CSS class (like <div class="example">) and then used consistently throughout the whole Apache documentation. In many cases the indentation just covered a <pre> section, so that in this case it would have been better to classify this <pre> semantically (such as <pre class="directive">, see elsewhere in this document) and omit the <div class="indent"> wrapper at this point.

This case may serve as a rule of thumb for how to reasonably migrate to using CSS: identify common semantic structures in all documents, define some appropriate combination of tag and class for each, and then use it consistently in each document. There should not be any need to invent any document-specific formatting; either use some global format specification or invent some new one but make it usable for other documents.
For example, there are lots of constructions like

    <p>Example:<br />
    Suppose the local server has address
    <samp>http://wibble.org/</samp>; then</p>
    <pre>
       ProxyPass         /mirror/foo/ http://foo.com/
       ProxyPassReverse  /mirror/foo/ http://foo.com/
    </pre>
    <p>will not only cause a local request for the
    &lt;<samp>http://wibble.org/mirror/foo/bar</samp>&gt; to be
    internally converted into a proxy request to
    &lt;<samp>http://foo.com/bar</samp>&gt; (the functionality
    <samp>ProxyPass</samp> provides here).</p>

that might better be coded as

    <div class="example">Example:<br />
    Suppose the local server has address
    <samp>http://wibble.org/</samp>; then
    <pre>
       ProxyPass         /mirror/foo/ http://foo.com/
       ProxyPassReverse  /mirror/foo/ http://foo.com/
    </pre>
    will not only cause a local request for the
    &lt;<samp>http://wibble.org/mirror/foo/bar</samp>&gt; to be
    internally converted into a proxy request to
    &lt;<samp>http://foo.com/bar</samp>&gt; (the functionality
    <samp>ProxyPass</samp> provides here).</div>

and then get some unique "look of an example" via CSS, potentially different from the normal text (such as having its own background color, border etc.). I didn't change things like these, as this would require a decision to do so globally for all documents.

Likewise, I would try to define semantic classes for different types of non-proportional output. Right now I can think of at least these types:
- Apache directives (i.e. content of httpd.conf)
- URIs, file names within normal text (like ".htaccess")
- C source code (like in misc/API.html)
At least directives and C source code would be formatted in <pre> (to provide similar looks to older browsers), so this might then be <pre class="directives"> etc., while inlined non-proportional text might remain <code> but use classes like <code class="filename"> and the like. (I am a fan of tagging a _lot_ of meta information into visual differences; I can well understand if you say "who cares" about these details.)
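
The mechanical part of the <blockquote> replacement mentioned earlier, and a later move from "indent" to a semantic class such as "example", could be scripted roughly as follows. This is a sketch under simplifying assumptions: it treats every <blockquote> alike (so it does not distinguish the "indentdeep" case), it does not add missing <p> wrappers, and the function name is made up for this mail.

```python
import re

# Opening tag with optional attributes, and the matching closing tag.
OPEN = re.compile(r"<blockquote\b[^>]*>", re.IGNORECASE)
CLOSE = re.compile(r"</blockquote\s*>", re.IGNORECASE)

def blockquote_to_div(html, css_class="indent"):
    """Replace every <blockquote>...</blockquote> pair by
    <div class="...">...</div>, keeping the content untouched."""
    html = OPEN.sub('<div class="%s">' % css_class, html)
    return CLOSE.sub("</div>", html)

sample = "<blockquote><pre>ProxyPass /mirror/foo/ http://foo.com/</pre></blockquote>"
print(blockquote_to_div(sample))
print(blockquote_to_div(sample, "example"))
```

Because opening and closing tags are replaced one for one, nested <blockquote>s stay balanced; deciding which of them really mean "example" or "important" still needs a human reading the text.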
By the way, identifying such semantic "primitives" may even have some influence on the XML structure definition from which such HTML code might later be generated, if such a concept were applied to the Apache 2.0 docs.

The misc/FAQ.html is formatting code using &nbsp; - this one should rather use <pre>, like many other documents. And the looks of <pre> should possibly be formatted with CSS (font-size, background-color) so that code sections are easy to detect as such. Compare this to dso.html where the author even used a <table> to give some tiny code section a colored background; I ported this to a local CSS definition for <td> but consider this bad style; better have a global concept for displaying code parts.

The howto/auth.html is using link targets whose names contain nothing but digits; but a target name must start with a letter in XHTML 1.0 Strict. As there is not a single reference to these targets within the complete Apache documentation, I just removed them. The same happened in misc/known_client_problems; again this target wasn't referenced anywhere, so I changed it to a different name (this sounded like a potential target for outside links; I am aware of the fact that this might cause problems in case anyone linked to this paragraph from other sites).

cygwin.html contains a broken HTML tag at <href="http://www.cygwin.com"> (tag name "a" is missing) and two definitions of the identical target "inst" (changed this to "socket" in line ~ 417).

vhosts/name_based.html.html contains a link to "directive-dist.html#Context" which is a typo; replaced "dist" by "dict" (in all three language variants).

<center> tags are no longer valid in XHTML 1.0 Strict. Four files were using them:
1. install-tpf.html
2. readme-tpf.html
   to center a <h2> tag that serves as document title where all other documents would use <h1> instead; upgraded all <h*> tags by one step.
3. misc/descriptors.html for centering a <pre> section that doesn't need this -> removed this (making this <pre> look like all other <pre>s in the Apache docs)
4. mod/mod_rewrite.html for centering <h1> tags, which are centered anyway via CSS now -> removed <center> here

Both TPF documents use their own technique of providing lists of links to all chapters. Furthermore, they don't contain just one document but a collection of articles without any top-level headline; their character differs from most other documents, so it is not easy to decide which HTML tags would serve best for formatting these documents. Replaced the <center>ed link line by <p class="links">.

Also, install-tpf.html uses a special formatting of a "tip" (red and bold), implemented via <font> tags; replaced this by <div class="tip"> and made a CSS formatting inside this document. (Might be migrated to the global CSS file and then used everywhere.) This document contained an internal link to "#configure" which is undefined; replaced this by "#run-configure".

In XHTML Strict <ul> and <li> must be cascaded such that an inner <ul> is _inside_ of an <li>, not between two of them. Tried to fix a couple of documents like install.html.*, but this one is using ordered lists in a way that cannot easily be reproduced at all in valid XHTML Strict without misusing some tags. Thus I removed the automatic numbering in line 326. (All translations of this document had already modified the HTML code at this point somehow.)

core.html contained some other errors as well:
- incorrect <p>...</p> usage
- one "&nbsp" without the terminating ";"
- unknown tag <emph> -> <em>

Two files,
- misc/FAQ.html and
- mod/mod_example.html
use <ol type="A">; replaced this by <ol style="list-style-type:upper-alpha;">, as this one is nearly unique for the Apache docs. May well be replaced by <ol class="uppercase"> and migrated into the global CSS file.

The misc/FAQ.html file seems to have a strange origin.
Maybe it was once a number of separate files that have been copied into one file? It contained "</body></html>" lots of times within its content; removed all these except the last one. Furthermore, it uses <li value="1"> etc. which aren't necessary (and are not valid in XHTML 1.0 Strict); removed all these value definitions. And then, I corrected a handful of other coding errors, like incomplete tags. You may want to read this file's "diff" very carefully.

ebcdic.html is the only file to center _all_ headlines. I mapped this to a document-local CSS style but would rather suggest eliminating this deviation.

ssi.html: using an apostrophe in link targets is illegal; checked that this isn't referenced anywhere and removed the apostrophe.

core.html contains a single <table> with some formatting; replaced this by a local CSS style for <th> inside this file. Also, one <dl compact="compact"> was replaced by <dl class="compact"> in core.html and some (empty) local CSS definition (one might specify anything useful there if necessary). Same thing happened for misc/API.html and mod/mod_auth.html.

mod/mod_auth_anon.html contained "<" and ">" characters other than to delimit tags; replaced these by "&lt;" and "&gt;" entities.

mod/mod_auth_digest.html contained a link that wasn't closed by </a>.

mod/mod_rewrite.html contained lots of <hr noshade="noshade">. Removed this parameter as it is the default anyway; this may be formatted via CSS for all documents if necessary. One single illegal order of <pre> and <small> has been replaced by a <pre style="font-size:80%"> - not a perfect solution, there should rather be some class for this kind of <pre>s. This file also contained some very specific layout parts (tables with lots of parameters like background colors) which I didn't dare to touch; so this single document of all didn't reach XHTML 1.0 Strict, and I set its DOCTYPE back to Transitional.

vhosts/name-based.html illegally uses <td align="top">; this must be valign="top".
Fixed it, as well as two "&gt" without trailing ";" in the Japanese version.

"<br>" must be written "<br />" in XHTML Strict. Fixed this in
- vhosts/name-based.html and
- mod/mod_proxy.html

It might be reasonable to use just _one_ HTML tag for specific code snippets, like file names or URLs or directives or the like. Currently there are in use:
- <code> occurs 5844 times in 186 files,
- <samp> occurs 1172 times in 77 files,
- <tt> occurs 448 times in 31 files.
But then, there are files like mod/mod_headers.html that don't even use any of the above to mark up code, like the parameters of the "Header" directive.

Many files use "<br /><br />" to create some fancy distance between two paragraphs. I was not able to detect any kind of systematic use of this technique. Mostly these <br /> were even outside of any block element, thus I had to wrap them into <p>...</p>, but I don't consider this a reasonable solution. CSS definitions for "margin-top" would be a better way to solve this, if required at all. Maybe just eliminate these, as they differ from the looks of many other documents anyway.

My own local syntax checker reports " characters within normal text as errors and suggests using &quot; instead. As this was not required by XHTML 1.0 Strict and would have meant several thousand replacement positions and huge file "diff"s, I didn't touch this.

Sometimes link target definitions were positioned outside any block elements as well; I moved them into the tag they were obviously meant to locate. Found and fixed lots of these while running the W3C validator over each document - check the "diff"s to find out where this happened.

And then, hundreds of missing <p>...</p> around text sections in more than 100 documents; they all had to be fixed to reach XHTML 1.0 Strict. One type that occurred very frequently were the links for attributes of directives in
- bind.html
- keepalive.html
- location.html
- man-template.html
- multilogs.html and
- nearly every single mod/*.html file.
I wrapped all these into <div class="attributes">; thus they can now be formatted globally via one line inside the httpd.css file (and I already did so). This was rather difficult for the Japanese versions, so I may have made some errors there - please proofread this with most care. (Anyway I can just hope my text editor didn't damage any of the Japanese files.)

At this point I recognized that the directive names are included partially in <h2>, partially in <h3> tags; I don't really consider these definitions to be chapters, thus it might be more reasonable (and better for unified looks of all documents) to use some <div class="directive"> for this purpose. For now, I didn't touch this.

Finally, the index.html.* documents, with lots of tables and hard-coded formatting. They really were a challenge. I had to replace so many things to make them valid XHTML 1.0 Strict that I decided to totally rewrite this document, use CSS as I think it should be used there, and store the CSS definitions in a separate file index.css (as this file is available in four language instances already).

I ran Xenu over the resulting tree; this one checks the default language files only but didn't find any broken link now.
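
Since Xenu only covers the default language files, a small per-file check of link targets may be a useful complement. Below is a minimal sketch in Python (standard library only) that looks for the three target problems mentioned earlier: duplicate targets (as with "inst" in cygwin.html), local "#..." links without a target (as with "#configure" in install-tpf.html), and target names not starting with a letter (as in howto/auth.html). The class and function names are invented for this example, and links between different files are not checked at all.

```python
from collections import Counter
from html.parser import HTMLParser

class AnchorCollector(HTMLParser):
    """Collect link targets and same-document links of one file."""

    def __init__(self):
        super().__init__()
        self.targets = Counter()   # target name -> number of definitions
        self.local_links = []      # fragments used in href="#..."

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        names = set()              # count name and id of one tag only once
        if tag == "a" and attrs.get("name"):
            names.add(attrs["name"])
        if attrs.get("id"):
            names.add(attrs["id"])
        for name in names:
            self.targets[name] += 1
        href = attrs.get("href") or ""
        if tag == "a" and href.startswith("#") and len(href) > 1:
            self.local_links.append(href[1:])

def check_anchors(html):
    """Return (duplicate targets, dangling local links, bad names)."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    targets = collector.targets
    duplicates = sorted(n for n, count in targets.items() if count > 1)
    dangling = sorted({l for l in collector.local_links if l not in targets})
    bad_names = sorted(n for n in targets
                       if not (n[0].isascii() and n[0].isalpha()))
    return duplicates, dangling, bad_names

# Made-up sample combining the three kinds of problems.
doc = ('<p><a name="inst">x</a><a name="inst">y</a><a name="1">z</a>'
       '<a href="#configure">c</a><a href="#inst">i</a></p>')
print(check_anchors(doc))
```

Run over each file of the tree, an empty result for all three lists would mean the file's internal targets are at least consistent with themselves.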
