Have fun.
At 18:29 14.1. 2001, James Duncan wrote the following:
--------------------------------------------------------------
>Oh that sounds promising... I will have to go check their website myself now
>;) Thanks for all your help on this matter!
>
>James
>
>
>-----Original Message-----
>From: Cynic [mailto:[EMAIL PROTECTED]]
>Sent: 14 January 2001 17:14
>To: James Duncan
>Cc: [EMAIL PROTECTED]
>Subject: RE: [PHP-WIN] DOM
>
>Well, after writing that mail I checked libxml's homepage, and
>it seems they've managed to build in an HTML mode, so maybe
>it's forgiving enough to parse really anything.
>
>
>At 18:06 14.1. 2001, James Duncan wrote the following:
>--------------------------------------------------------------
>>But I thought you said that the DOM XML wouldn't parse a normal HTML web
>>page because 98% of web pages aren't truly XML compatible and the XML parser
>>would die with an error message(s)?
>>
>>I want to be able to feed the parser any old HTML web page and read the node
>>values from the DOM (created by the parser), just like I do with IE and
>>Javascript.
>>
>>Thanks
>>
>>PS: I am learning slowly so don't get tooooo mad with me ;)
>>
>>
>>-----Original Message-----
>>From: Cynic [mailto:[EMAIL PROTECTED]]
>>Sent: 14 January 2001 17:01
>>To: James Duncan; [EMAIL PROTECTED]
>>Cc: [EMAIL PROTECTED]
>>Subject: RE: [PHP-WIN] DOM
>>
>>What you want has already been done, with two different
>>approaches: DOM XML functions and Sablotron functions (SAX
>>interface). Just use one of these modules in your script.
>>
>>
>>At 16:28 14.1. 2001, James Duncan wrote the following:
>>--------------------------------------------------------------
>>>As I'm asking stupid questions at the moment: Could someone write an
>>>(XML/HTML?) parser for PHP that exposes the DOM in the same way as the
>>>Javascript one does in IE 5? This would allow me to access the node elements
>>>(#text, etc) via PHP on an HTML file stored on the server in the same way as
>>>I can via Javascript in IE 5? Why do I want to do this? It would allow me to
>>>download a web page, parse it into a DOM tree-structure, loop through all
>>>#text nodes and extract all the textual data. This would make capturing
>>>textual data from an HTML file so much easier than attempting to strip all
>>>the HTML tags, etc. The parser would only need to support a "read" mode for
>>>my requirements, which should simplify the parser (it wouldn't need to worry
>>>about updating node values, etc or writing them back to the HTML file). It
>>>sounds like a good idea to me but I might be way off course...
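
The read-only traversal described above (walk the tree, pick out every #text node, keep its value) can be sketched in JavaScript, the language the thread's own snippets use. This is only an illustration under stated assumptions: `collectText` is a hypothetical helper, and the sample tree is a hand-built stand-in that mimics the DOM properties `nodeName`, `nodeValue` and `childNodes`. It is not the output of any PHP or libxml parser.

```javascript
// Hypothetical helper: walk a DOM-like tree depth-first and gather the
// value of every "#text" node. It relies only on the standard DOM
// properties nodeName, nodeValue and childNodes.
function collectText(node, out = []) {
  if (node.nodeName === "#text") {
    const value = node.nodeValue.trim();
    if (value !== "") out.push(value); // skip whitespace-only text nodes
  }
  const children = node.childNodes || [];
  for (let i = 0; i < children.length; i++) {
    collectText(children[i], out);
  }
  return out;
}

// Hand-built stand-in for a parsed page (an assumption for illustration).
const text = (value) => ({ nodeName: "#text", nodeValue: value, childNodes: [] });
const el = (name, ...kids) => ({ nodeName: name, nodeValue: null, childNodes: kids });

const sampleTree = el("BODY",
  el("H1", text("Prices")),
  el("P", text("Widget: "), el("B", text("4.50"))),
  text("\n  ")
);

console.log(collectText(sampleTree));
```

Because the helper touches nothing but those three properties, the same function could presumably be handed a real node (for example `document.body`) in a browser; a server-side parser would need to expose an equivalent tree.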
>>>
>>>This would allow all work to be performed server-side, whereas at the moment
>>>I'm having to send the HTML file to IE, run Javascript DOM code to extract
>>>the #text values, dump those values into a hidden field and post the data
>>>back to the server, where PHP can process it.
>>>
>>>Thanks
>>>
>>>James
>>>
>>>-----Original Message-----
>>>From: Cynic [mailto:[EMAIL PROTECTED]]
>>>Sent: 14 January 2001 01:38
>>>To: James Duncan; [EMAIL PROTECTED]
>>>Cc: [EMAIL PROTECTED]
>>>Subject: RE: [PHP-WIN] DOM
>>>
>>>It's not PHP vs. DOM. It's XML (DOM) vs. (bad) HTML. PHP just
>>>provides you with an interface to an XML parser.
>>>
>>>www.php4win.de
>>>
>>>
>>>At 01:14 14.1. 2001, James Duncan wrote the following:
>>>--------------------------------------------------------------
>>>>Yikes. I'm just reading more about DOM and PHP at the moment on the
>>>>PHPBuilder website.
>>>>
>>>>Does anyone have a version of PHP compiled with DOM support included for
>>>>Windows (I'm developing on a Windows system before moving it over to Linux -
>>>>RedHat)?
>>>>
>>>>So loading any old web page and trying to construct a DOM document from it
>>>>via PHP isn't going to work? How does IE v5 manage to parse the same web
>>>>page correctly (or what seems to be correctly)? I've already read in the DOM
>>>>table node elements #text and their values via Javascript in IE.
>>>>
>>>>Still learning lots ;)
>>>>
>>>>Thanks
>>>>
>>>>James
>>>>
>>>>
>>>>-----Original Message-----
>>>>From: Cynic [mailto:[EMAIL PROTECTED]]
>>>>Sent: 14 January 2001 00:07
>>>>To: James Duncan; [EMAIL PROTECTED]
>>>>Cc: [EMAIL PROTECTED]
>>>>Subject: RE: [PHP-WIN] DOM
>>>>
>>>>I should warn you that XML functions require the document to be
>>>>very 'correct'. Most (I guess 98%... I wish browsers weren't so
>>>>forgiving, all might've been much easier and better) of the HTML
>>>>pages on the internet basically aren't HTML (which is a son of
>>>>SGML, and an older, heavily crippled brother of XML), and even
>>>>strict HTML isn't XML compliant. Only XHTML 1.0, the latest
>>>>version of HTML, is fully XML compliant.
>>>>If you try to load such a document into an XML parser, it'll
>>>>die with an error message, because XML requires the document
>>>>to be well-formed.
>>>>
>>>>At 00:54 14.1. 2001, James Duncan wrote the following:
>>>>--------------------------------------------------------------
>>>>>Ah rite... thanks for the info. As I said I'm very new to all of this and
>>>>>reading lots, whilst trying to make sense of it all ;) So it is possible to
>>>>>use PHP to access DOM elements (via the XML DOM library) created from an
>>>>>HTML source file (a code example would be very handy)? Does anyone know if
>>>>>an XML parser will be built into PHP in the future? I then assume I could
>>>>>access DOM elements from an HTML file in the same easy way as I can via
>>>>>Javascript in IE?
>>>>>
>>>>>Thanks
>>>>>
>>>>>James
>>>>>
>>>>>
>>>>>-----Original Message-----
>>>>>From: Cynic [mailto:[EMAIL PROTECTED]]
>>>>>Sent: 13 January 2001 23:22
>>>>>To: James Duncan; [EMAIL PROTECTED]
>>>>>Cc: [EMAIL PROTECTED]
>>>>>Subject: RE: [PHP-WIN] DOM
>>>>>
>>>>>You don't understand the basic concept.
>>>>>
>>>>>DOM (Document Object Model) is a tree representing the structure
>>>>>of a document, in which the elements (logically separated parts
>>>>>of the content) are enclosed within tags to allow for computerized
>>>>>processing. IE exposes its own version of the DOM through its
>>>>>implementation of JS. If you want to access and manipulate an HTML
>>>>>document in PHP using this tree-like abstraction (DOM), you will
>>>>>have to use the XML DOM library. No XML parser is an integral part
>>>>>of the language.
>>>>>
>>>>>
>>>>>At 18:20 13.1. 2001, James Duncan wrote the following:
>>>>>--------------------------------------------------------------
>>>>>>I don't think this will work in my case because I don't control the layout
>>>>>>of the HTML page and hence can't add the hidden fields. I'm downloading the
>>>>>>HTML pages from a website. It would require as much work to insert the
>>>>>>hidden fields as trying to strip the HTML tags in an attempt to read the
>>>>>>data directly from the HTML page itself. There must be a way to access the
>>>>>>DOM directly from PHP? I notice in the manual there is a section regarding
>>>>>>XML DOM but not the DOM itself.
>>>>>>
>>>>>>Are the DOM values only available on the client? If that's the case then PHP
>>>>>>can't be used to read them because it's limited to the server side?
>>>>>>
>>>>>>Thanks
>>>>>>
>>>>>>James
>>>>>>
>>>>>>-----Original Message-----
>>>>>>From: Michael Stearne [mailto:[EMAIL PROTECTED]]
>>>>>>Sent: 13 January 2001 17:06
>>>>>>To: James Duncan
>>>>>>Cc: [EMAIL PROTECTED]
>>>>>>Subject: Re: [PHP-WIN] DOM
>>>>>>
>>>>>>Could you do something like:
>>>>>>
>>>>>>myForm.myField.value =
>>>>>>  tablejames.firstChild.childNodes[1].childNodes[4].firstChild.firstChild.nodeValue;
>>>>>>
>>>>>>Set up a form of hidden fields. Extract the values from the DOM and then
>>>>>>have the user hit a Submit button to get to the next page. At that point
>>>>>>the values that were collected and put into the hidden form fields will be
>>>>>>submitted and your next page (the PHP page) could INSERT the values into the
>>>>>>database.
>>>>>>
>>>>>>Michael
>>>>>>
>>>>>>
>>>>>>On Friday, January 12, 2001, at 07:30 PM, James Duncan wrote:
>>>>>>
>>>>>>> Hi folks,
>>>>>>>
>>>>>>> I'm still new to HTML, Javascript and PHP but learning (fast hopefully).
>>>>>>> I've just started accessing DOM elements. I have worked out how to update
>>>>>>> the contents of table cells directly using this method, etc. In Javascript I
>>>>>>> would use code like:
>>>>>>>
>>>>>>> alert("Value is: " +
>>>>>>>   tablejames.firstChild.childNodes[1].childNodes[4].firstChild.firstChild.nodeName);
>>>>>>> alert("Value is: " +
>>>>>>>   tablejames.firstChild.childNodes[1].childNodes[5].firstChild.firstChild.nodeValue);
>>>>>>>
>>>>>>> This Javascript shows the name and value of the child element.
>>>>>>>
>>>>>>> Now I want to use PHP to extract data (values) from HTML pages like I do
>>>>>>> with the above Javascript. Is this possible? Obviously with the Javascript
>>>>>>> the HTML page has already been rendered in the browser (i.e. all tree
>>>>>>> elements have been created). This makes extracting data a simple case of
>>>>>>> finding the "#text" elements and reading in the values. Can I do the same
>>>>>>> thing with PHP and an HTML file I've downloaded from the Internet? Obviously
>>>>>>> this file is sitting on my server and hasn't been rendered in a browser...
>>>>>>>
>>>>>>> The whole point of this exercise is so that I can extract values from an
>>>>>>> HTML table and populate them into a database. Maybe it's easier to process
>>>>>>> the HTML file line by line and strip the unwanted HTML tags? However, with
>>>>>>> this approach I've got to hardcode each webpage...
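
To make the table-to-database idea a little more concrete, here is a rough JavaScript sketch in the style of the snippets above. `cellText` and `tableToRows` are hypothetical helpers, the sample table is hand-built rather than parsed from a real page, and the database INSERT itself is deliberately left out. Searching for row nodes by name, instead of using fixed positions such as `childNodes[1].childNodes[4]`, is one possible way to avoid hardcoding each page.

```javascript
// Hypothetical helper: concatenate the text of every "#text" node below a node.
function cellText(node) {
  if (node.nodeName === "#text") return node.nodeValue;
  let result = "";
  const children = node.childNodes || [];
  for (let i = 0; i < children.length; i++) {
    result += cellText(children[i]);
  }
  return result;
}

// Hypothetical helper: turn a TABLE node into an array of rows, each row an
// array of trimmed cell strings. Tables nested inside a cell are flattened
// into that cell's text rather than reported as separate rows.
function tableToRows(node, rows = []) {
  const children = node.childNodes || [];
  if (node.nodeName === "TR") {
    const cells = [];
    for (let i = 0; i < children.length; i++) {
      const name = children[i].nodeName;
      if (name === "TD" || name === "TH") {
        cells.push(cellText(children[i]).trim());
      }
    }
    rows.push(cells);
    return rows;
  }
  for (let i = 0; i < children.length; i++) {
    tableToRows(children[i], rows);
  }
  return rows;
}

// Hand-built stand-in for a parsed table (an assumption for illustration).
const txt = (value) => ({ nodeName: "#text", nodeValue: value, childNodes: [] });
const elem = (name, ...kids) => ({ nodeName: name, nodeValue: null, childNodes: kids });

const sampleTable = elem("TABLE", elem("TBODY",
  elem("TR", elem("TH", txt("Item")), elem("TH", txt("Price"))),
  elem("TR", elem("TD", txt(" Widget ")), elem("TD", elem("B", txt("4.50"))))
));

console.log(tableToRows(sampleTable));
```

Each inner array could then be mapped onto the columns of one database row by whatever server-side code does the insert.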
>>>>>>>
>>>>>>> If this is a silly question then sorry but you only learn if you ask ;)
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> James
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> PHP Windows Mailing List (http://www.php.net/)
>>>>>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>>>>>>> For additional commands, e-mail: [EMAIL PROTECTED]
>>>>>>> To contact the list administrators, e-mail: [EMAIL PROTECTED]
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>------end of quote------
>>>>>
>>>>>
>>>>>
>>>>>____________________________________________________________
>>>>>Cynic:
>>>>>
>>>>>A member of a group of ancient Greek philosophers who taught
>>>>>that virtue constitutes happiness and that self control is
>>>>>the essential part of virtue.
>>>>>
>>>>>[EMAIL PROTECTED]
>>>>------end of quote------
>>>------end of quote------
>>------end of quote------
>------end of quote------
------end of quote------
____________________________________________________________
Cynic:
A member of a group of ancient Greek philosophers who taught
that virtue constitutes happiness and that self control is
the essential part of virtue.
[EMAIL PROTECTED]
--
PHP Windows Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]