Re: Extracting xml from html

2007-09-19 Thread Stefan Behnel
[EMAIL PROTECTED] wrote: >>row = tree.find("//Row") >>print row.findtext("primaryowner") >>print row.findtext("customeraddress") > > I tried this your way and Laurent's way and both give me this error: > > AttributeError: 'NoneType' object has no attribute 'findtext' Well, error hand
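The `AttributeError` quoted above arises because `find()` returns `None` when nothing matches, so the following `findtext()` call is made on `None`. Below is a minimal sketch of guarding against that case. It uses the standard library's `xml.etree.ElementTree` rather than lxml so that it runs without extra installs. The sample document and the helper name `first_row_fields` are assumptions made for illustration; only the names `Row`, `primaryowner` and `customeraddress` come from the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real document in the thread is not shown in full.
SAMPLE = "<root><Row><primaryowner>Jane Doe</primaryowner></Row></root>"


def first_row_fields(xml_text, fields=("primaryowner", "customeraddress")):
    """Return the requested child texts of the first <Row>, or None if there is no <Row>."""
    root = ET.fromstring(xml_text)
    # ".//Row" searches all descendants of the root element.
    row = root.find(".//Row")
    if row is None:
        # Nothing matched: report that explicitly instead of calling
        # findtext() on None, which is what raises the AttributeError.
        return None
    # findtext() itself returns None for a child that is missing.
    return {name: row.findtext(name) for name in fields}


print(first_row_fields(SAMPLE))
print(first_row_fields("<root/>"))
```

Whether returning `None` or raising a more descriptive exception is the better choice depends on how the caller wants to handle a page with no matching row.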

Re: Extracting xml from html

2007-09-19 Thread kyosohma
On Sep 19, 3:13 am, Stefan Behnel <[EMAIL PROTECTED]> wrote: > [EMAIL PROTECTED] wrote: > > Does this make sense? It works pretty well, but I don't really > > understand everything that I'm doing. > > > def Parser(filename): > > It's uncommon to give a function a capitalised name, unless it's a fac

Re: Extracting xml from html

2007-09-19 Thread Stefan Behnel
[EMAIL PROTECTED] wrote: > Does this make sense? It works pretty well, but I don't really > understand everything that I'm doing. > > def Parser(filename): It's uncommon to give a function a capitalised name, unless it's a factory function (which this isn't). > parser = etree.HTMLParser() >

Re: Extracting xml from html

2007-09-19 Thread Laurent Pointal
[EMAIL PROTECTED] wrote: > On Sep 18, 1:56 am, Stefan Behnel <[EMAIL PROTECTED]> wrote: >> [EMAIL PROTECTED] wrote: >>> I am attempting to extract some XML from an HTML document that I get >>> returned from a form based web page. For some reason, I cannot figure >>> out how to do this. >>> Here'

Re: Extracting xml from html

2007-09-19 Thread Stefan Behnel
George Sakkis wrote: > Given that you can do in 2 lines what > took you around 15 with lxml, I wouldn't think it twice. Don't judge a tool by beginner's code. Stefan

Re: Extracting xml from html

2007-09-18 Thread George Sakkis
On Sep 18, 3:31 pm, [EMAIL PROTECTED] wrote: > On Sep 17, 4:51 pm, "Gabriel Genellina" <[EMAIL PROTECTED]> > wrote: > > > > > On Mon, 17 Sep 2007 17:31:19 -0300, <[EMAIL PROTECTED]> wrote: > > > > I am attempting to extract some XML from an HTML document that I get > > > returned from a form bas

Re: Extracting xml from html

2007-09-18 Thread kyosohma
On Sep 18, 1:56 am, Stefan Behnel <[EMAIL PROTECTED]> wrote: > [EMAIL PROTECTED] wrote: > > I am attempting to extract some XML from an HTML document that I get > > returned from a form based web page. For some reason, I cannot figure > > out how to do this. > > Here's a sample of the html: > > >

Re: Extracting xml from html

2007-09-18 Thread kyosohma
On Sep 17, 4:51 pm, "Gabriel Genellina" <[EMAIL PROTECTED]> wrote: > On Mon, 17 Sep 2007 17:31:19 -0300, <[EMAIL PROTECTED]> wrote: > > > I am attempting to extract some XML from an HTML document that I get > > returned from a form based web page. For some reason, I cannot figure > > out how to

Re: Extracting xml from html

2007-09-18 Thread Paul Boddie
On 17 Sep, 23:14, [EMAIL PROTECTED] wrote: > > I have lxml installed and I appear to also have libxml2dom installed. > I know lxml has decent docs, but I don't see much for yours. Is this > the only place to go: http://www.boddie.org.uk/python/libxml2dom.html > ? Unfortunately yes, with regard to o

Re: Extracting xml from html

2007-09-18 Thread Stefan Behnel
[EMAIL PROTECTED] wrote: > I am attempting to extract some XML from an HTML document that I get > returned from a form based web page. For some reason, I cannot figure > out how to do this. > Here's a sample of the html: > > > > lots of screwy text including divs and spans > > 1126264 >

Re: Extracting xml from html

2007-09-17 Thread Gabriel Genellina
On Mon, 17 Sep 2007 17:31:19 -0300, <[EMAIL PROTECTED]> wrote: > I am attempting to extract some XML from an HTML document that I get > returned from a form based web page. For some reason, I cannot figure > out how to do this. I thought I could use the minidom module to do it, > but all I get

Re: Extracting xml from html

2007-09-17 Thread kyosohma
On Sep 17, 4:01 pm, Paul Boddie <[EMAIL PROTECTED]> wrote: > On 17 Sep, 22:31, [EMAIL PROTECTED] wrote: > > > > > What's the best way to get at the XML? Do I need to somehow parse it > > using the HTMLParser and then parse that with minidom or what? > > Probably easiest is to use an XML processing

Re: Extracting xml from html

2007-09-17 Thread Paul Boddie
On 17 Sep, 22:31, [EMAIL PROTECTED] wrote: > > What's the best way to get at the XML? Do I need to somehow parse it > using the HTMLParser and then parse that with minidom or what? Probably easiest is to use an XML processing toolkit or library which supports HTML parsing. Since the libxml2 librar

Extracting xml from html

2007-09-17 Thread kyosohma
Hi, I am attempting to extract some XML from an HTML document that I get returned from a form based web page. For some reason, I cannot figure out how to do this. I thought I could use the minidom module to do it, but all I get is a screwy traceback: Traceback (most recent call last): File "\\m
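One possible reading of the problem described above is that minidom expects well-formed XML, so handing it a whole HTML page tends to fail, while the embedded XML fragment on its own may parse fine. The sketch below cuts the fragment out of the page text first and only then passes it to minidom. It is not the approach any poster in the thread is confirmed to have used. The page content, the wrapper element name `Rows` and the helper name `extract_fragment` are all assumptions made for illustration.

```python
import re
from xml.dom import minidom

# Hypothetical page: the unclosed <br> makes the page as a whole ill-formed XML,
# but the embedded <Rows> fragment is well-formed.
HTML = """<html><body><div>lots of screwy text<br>
<Rows><Row><primaryowner>Jane Doe</primaryowner></Row></Rows>
</div></body></html>"""


def extract_fragment(html, tag="Rows"):
    """Return a minidom Document for the first <tag>...</tag> span, or None if absent."""
    name = re.escape(tag)
    # Non-greedy match from the opening tag to the first matching closing tag.
    pattern = re.compile(r"<%s\b.*?</%s>" % (name, name), re.DOTALL)
    match = pattern.search(html)
    if match is None:
        return None
    return minidom.parseString(match.group(0))


doc = extract_fragment(HTML)
owner = doc.getElementsByTagName("primaryowner")[0].firstChild.data
print(owner)
```

A regular expression like this only works when the wrapper element is not nested inside itself and its name is known in advance; an HTML-aware parser, as suggested in the replies, is the more robust route.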