Re: Problem with minidom and special chars in HTML

2005-02-24 Thread and-google
Horst Gutmann wrote: > I currently have quite a big problem with minidom and special chars > (for example ü) in HTML. Yes. Ignoring the issue of the wrong doctype, minidom is a pure XML parser and knows nothing of XHTML and its doctype's entities 'uuml' and the like. Only the built-in entities (
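
A minimal sketch of the point made above, assuming Python 3 and only the standard library: minidom resolves the predefined XML entities and numeric character references, but has no definition for an HTML name such as 'uuml'. The helper name text_of is made up for this illustration. The documents here carry no DOCTYPE, so the undefined entity is reported as a parse error instead of being dropped silently as described elsewhere in the thread.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def text_of(xml_source):
    """Parse a small document and return the text inside its root element.

    Hypothetical helper, not part of minidom.
    """
    doc = minidom.parseString(xml_source)
    return "".join(
        node.data
        for node in doc.documentElement.childNodes
        if node.nodeType == node.TEXT_NODE
    )


# Predefined XML entities are always known to the parser.
print(text_of("<p>&amp;&lt;&gt;</p>"))

# A numeric character reference needs no declaration at all.
print(text_of("<p>&#252;</p>"))

# 'uuml' is an (X)HTML entity name; nothing in this document declares it.
try:
    text_of("<p>&uuml;</p>")
except ExpatError as exc:
    print("parse failed:", exc)
```

This is why a numeric reference is a safe way to carry ü through a plain XML parser.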

Re: Problem with minidom and special chars in HTML

2005-02-23 Thread Jarek Zgoda
Horst Gutmann wrote: Don't use minidom or convert HTML4 to XHTML and change declaration of doctype. This was just a bad example :-) I get the same problem with XHTML in the doctype. The funny thing here IMO is, that the special chars are simply removed. No warning, no nothing :-( As Fredri

Re: Problem with minidom and special chars in HTML

2005-02-23 Thread Horst Gutmann
Jarek Zgoda wrote: Horst Gutmann wrote: I currently have quite a big problem with minidom and special chars (for example ü) in HTML. Let's say I have the following input file: -- http://www.w3.org/TR/html4/strict.dtd"> HTML4 is not an XML appli

Re: Problem with minidom and special chars in HTML

2005-02-22 Thread Jarek Zgoda
Horst Gutmann wrote: I currently have quite a big problem with minidom and special chars (for example ü) in HTML. Let's say I have the following input file: -- http://www.w3.org/TR/html4/strict.dtd"> HTML4 is not an XML application. Even if mini

Re: Problem with minidom and special chars in HTML

2005-02-22 Thread Horst Gutmann
Fredrik Lundh wrote: umm. doesn't that doctype point to an SGML DTD? even if minidom did fetch external DTD's (I don't think it does), it would probably choke on that DTD. running your documents through "tidy -asxml -numeric" before parsing them as XML might be a good idea... http://tidy.sour
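
A rough sketch, assuming Python 3, of the part of that suggestion that matters to minidom: turning named HTML entities into numeric character references before parsing. It does not run tidy and does none of tidy's markup cleanup; the function name numeric_entities and the sample string are invented for this illustration, and the entity table comes from the standard html.entities module.

```python
import re
from html.entities import name2codepoint
from xml.dom import minidom

# Entities every XML parser already understands; leave these untouched.
XML_BUILTINS = {"amp", "lt", "gt", "quot", "apos"}


def numeric_entities(markup):
    """Replace named HTML entities with numeric character references.

    Hypothetical helper; unknown names are left as they are.
    """
    def replace(match):
        name = match.group(1)
        if name in XML_BUILTINS or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, markup)


source = "<p>Gr&uuml;&szlig;e &amp; mehr</p>"
converted = numeric_entities(source)
doc = minidom.parseString(converted)
text = "".join(child.data for child in doc.documentElement.childNodes)

print(converted)
print(text)
```

The input still has to be well-formed XML for minidom to accept it, which is the job tidy's -asxml option is meant to do in the suggestion above.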

Re: Problem with minidom and special chars in HTML

2005-02-22 Thread Fredrik Lundh
Horst Gutmann wrote: > I currently have quite a big problem with minidom and special chars (for > example ü) in HTML. > > Let's say I have the following input file: > -- > > "http://www.w3.org/TR/html4/strict.dtd"> > > > ü > > >

Problem with minidom and special chars in HTML

2005-02-22 Thread Horst Gutmann
Hi :-) I currently have quite a big problem with minidom and special chars (for example ü) in HTML. Let's say I have the following input file: -- http://www.w3.org/TR/html4/strict.dtd"> ü -- And fol