On Wed, Oct 06, 1999 at 02:24:27PM +0000, Duncan Simpson wrote:
> When I last heard, XML is a standard like SGML. You have to supply a DTD (document
>type description?). You need different tools for different DTDs :-) M$ could produce
>a weird Word DTD that is very hard to import, for example by emphasising physical
>style tags everywhere. Without having read the FS I *think* HTML can be represented
>as a "brand" of XML. (At least one current HTML checker is an SGML checker packaged
>with a load of HTML DTDs).
>
Just for the record, XML is a subset of SGML. It leaves out some of
SGML's more esoteric, rarely used features in order to get a
cleaner format that is easy to parse.
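
A minimal sketch of what that stricter subset means in practice. It uses
Python's standard xml.etree.ElementTree module, which is my choice for
illustration and not something from this thread; the helper name
is_well_formed is hypothetical. SGML-based HTML 4 lets you omit end tags
such as </p>, while an XML parser rejects that outright:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# SGML-style HTML: the end tags for <p> are omitted, which HTML 4 permits.
sgml_style = "<body><p>first paragraph<p>second paragraph</body>"

# The same content with every element explicitly closed, as XML requires.
xml_style = "<body><p>first paragraph</p><p>second paragraph</p></body>"

print(is_well_formed(sgml_style))  # False
print(is_well_formed(xml_style))   # True
```

Because the parser never has to consult a DTD to guess where an element
ends, a generic XML parser can stay small and simple.
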
DTD stands for Document Type Definition.
XML is also a meta-language, and HTML can be defined using XML.
There are DTDs available that do this.
The idea of SGML and XML is that you can have the same tool
for different DTDs. Supposing that M$ follows the standard
(ok, let's start the bets ;) you will be able to use, as an example,
libxml (used with GNOME now) to parse it.
This is the whole point of XML.
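
To illustrate the "same tool, different DTDs" idea, here is a rough
sketch, again with Python's standard xml.etree.ElementTree standing in
for a generic parser such as libxml. The helper name element_names and
both sample documents are made up for illustration. Note that this parser
only checks well-formedness; it does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def element_names(text):
    """Parse any well-formed XML document and list its element names in document order."""
    root = ET.fromstring(text)
    return [elem.tag for elem in root.iter()]


# Two unrelated vocabularies, both invented for this example.
html_like = "<html><head><title>Hi</title></head><body><p>text</p></body></html>"
docbook_like = "<book><title>Hi</title><chapter><para>text</para></chapter></book>"

print(element_names(html_like))     # ['html', 'head', 'title', 'body', 'p']
print(element_names(docbook_like))  # ['book', 'title', 'chapter', 'para']
```

The same function handles both documents without knowing anything about
either vocabulary; what the elements mean is left to the application.
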
> Both HTML and DocBook are SGML but Netscape will not handle DocBook and jade does
>not handle HTML. Welcome to proprietary formats that can masquerade as open standards.
>word2x 2 might get an XML reader and its structure extraction engines, both existing
>and vaporware, might manage the rest. [word2x 2 is
> currently incomplete and non-working develware.]
I think the HTML DTD is XML conformant.
DocBook will be (at least this is the intent now) XML compatible
with version 5 (~2001).
> --
> Duncan (-:
> "software industry, the: unique industry where selling substandard goods is
> legal and you can charge extra for fixing the problems."
--
José