Re: [fpc-pascal] XML DOM and HTML

2008-06-20 Thread Sebastian Günther
Johannes Nohl wrote: Dear list, I played around with the units dom and xmlread. I liked them very much. Now I thought I could parse websites with them. But they are slightly different as far as I know. In XML everything is within a node, while in HTML there is more than one value in a node. E.g.:

Re: [fpc-pascal] XML DOM and HTML

2008-06-17 Thread Lee Jenkins
Johannes Nohl wrote:
Dear list, dear Michael!
There are multiple problems with HTML parsing: HTML is not a well-formed XML document, because
- the tags are case insensitive (in XML they are case sensitive)
- Not all tags must be closed.
If the HTML is XHTML, then the DOM unit can be used to par

Re: [fpc-pascal] XML DOM and HTML

2008-06-08 Thread Michael Van Canneyt
On Sun, 8 Jun 2008, Johannes Nohl wrote:
> Dear list, dear Michael!
>
> > There are multiple problems with HTML parsing: HTML is not a well-formed
> > XML document, because
> > - the tags are case insensitive (in XML they are case sensitive)
> > - Not all tags must be closed.
> > If the HTML is

Re: [fpc-pascal] XML DOM and HTML

2008-06-08 Thread Johannes Nohl
Dear list, dear Michael!

> There are multiple problems with HTML parsing: HTML is not a well-formed
> XML document, because
> - the tags are case insensitive (in XML they are case sensitive)
> - Not all tags must be closed.
> If the HTML is XHTML, then the DOM unit can be used to parse it. But ho

Re: [fpc-pascal] XML DOM and HTML

2008-06-07 Thread Michael Van Canneyt
On Sat, 7 Jun 2008, Johannes Nohl wrote:
> Dear list,
>
> I played around with the units dom and xmlread. I liked them very
> much. Now I thought I could parse websites with it. But they are
> slightly different as far as I know. In XML everything is within a node
> while in HTML there are more

[fpc-pascal] XML DOM and HTML

2008-06-07 Thread Johannes Nohl
Dear list, I played around with the units dom and xmlread. I liked them very much. Now I thought I could parse websites with them. But they are slightly different as far as I know. In XML everything is within a node, while in HTML there is more than one value in a node. E.g.: possible XML: asdf1