Horst Gutmann wrote:
> I currently have quite a big problem with minidom and special chars
> (for example ü) in HTML.
Yes. Ignoring the issue of the wrong doctype, minidom is a pure XML
parser and knows nothing of XHTML and its doctype's entities 'uuml' and
the like. Only the built-in entities (
Horst Gutmann wrote:
Either don't use minidom, or convert the HTML4 to XHTML and change the
doctype declaration.
This was just a bad example :-) I get the same problem with XHTML in the
doctype. The funny thing here, IMO, is that the special chars are simply
removed. No warning, no nothing :-(
As Fredri
Jarek Zgoda wrote:
Horst Gutmann wrote:
I currently have quite a big problem with minidom and special chars
(for example ü) in HTML.
Let's say I have the following input file:
--
http://www.w3.org/TR/html4/strict.dtd">
HTML4 is not an XML appli
Horst Gutmann wrote:
I currently have quite a big problem with minidom and special chars (for
example ü) in HTML.
Let's say I have the following input file:
--
http://www.w3.org/TR/html4/strict.dtd">
HTML4 is not an XML application. Even if mini
Fredrik Lundh wrote:
umm. doesn't that doctype point to an SGML DTD? even if minidom did fetch
external DTDs (I don't think it does), it would probably choke on that DTD.
running your documents through "tidy -asxml -numeric" before parsing them as
XML might be a good idea...
http://tidy.sour
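
For anyone who cannot run tidy, the effect of the "-numeric" step can be roughly approximated in Python itself. This is only a sketch and not what tidy actually does: it rewrites named HTML entities as numeric character references, which any XML parser accepts without a DTD, and leaves the five built-in XML entities alone. The helper name numeric_entities is made up for this example, and the html.entities module is the Python 3 name (it was htmlentitydefs in Python 2). It does not repair anything else that makes HTML4 invalid as XML, such as unclosed tags.

```python
import re
from html.entities import name2codepoint
from xml.dom import minidom

# Entities every XML parser already knows; these must stay untouched.
XML_BUILTINS = {"amp", "lt", "gt", "quot", "apos"}

def numeric_entities(text):
    """Replace named HTML entities (e.g. &uuml;) with numeric references.

    Hypothetical helper for illustration only. Unknown names and the
    XML built-ins are left exactly as they were.
    """
    def repl(match):
        name = match.group(1)
        if name in XML_BUILTINS or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)

source = "<p>&uuml; &amp; &lt;</p>"
converted = numeric_entities(source)

# minidom can now parse the fragment without knowing any HTML entities.
doc = minidom.parseString(converted)
text = "".join(node.data for node in doc.documentElement.childNodes)
print(converted)
print(text)
```

The regular expression is deliberately naive: it will also rewrite entity-like text inside comments or CDATA sections, so it is best treated as a quick workaround rather than a replacement for tidy.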
Horst Gutmann wrote:
> I currently have quite a big problem with minidom and special chars (for
> example ü) in HTML.
>
> Let's say I have the following input file:
> --
>
> "http://www.w3.org/TR/html4/strict.dtd">
>
>
> ü
>
>
>
Hi :-)
I currently have quite a big problem with minidom and special chars (for
example ü) in HTML.
Let's say I have the following input file:
--
http://www.w3.org/TR/html4/strict.dtd">
ü
--
And fol