As I'm asking stupid questions at the moment: could someone write an
(XML/HTML?) parser for PHP that exposes the DOM in the same way as the
Javascript one does in IE 5? This would allow me to access the nodes
(#text, etc.) of an HTML file stored on the server via PHP, in the same
way as I can via Javascript in IE 5.
Why do I want to do this? It would allow me to download a web page, parse
it into a DOM tree structure, loop through all the #text nodes and extract
all the textual data. This would make capturing textual data from an HTML
file so much easier than attempting to strip all the HTML tags, etc. The
parser would only need to support a "read" mode for my requirements, which
should simplify it (it wouldn't need to worry about updating node values
or writing them back to the HTML file). It sounds like a good idea to me,
but I might be way off course...
This would allow all work to be performed server-side, whereas at the moment
I'm having to send the HTML file to IE, run Javascript DOM code to extract
the #text values, dump those values into a hidden field and post the data
back to the server, where PHP can process it.
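To make the idea concrete, here is a rough sketch of the read-only pass I have in mind. It is written in Python rather than PHP, purely as an illustration, using that language's standard html.parser module, which tolerates sloppy markup. It reports runs of text as events rather than building a full DOM tree, so it is a simplification of what I described above. The class name, the helper function and the sample page are all made up for this example.

```python
from html.parser import HTMLParser


class TextNodeCollector(HTMLParser):
    """Read-only pass over an HTML document that keeps only its text nodes."""

    # Elements whose contents are code or styling, not visible page text.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.texts = []        # non-empty text nodes, in document order
        self._skip_depth = 0   # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Called for each run of character data, i.e. the "#text" content.
        text = data.strip()
        if text and self._skip_depth == 0:
            self.texts.append(text)


def extract_text_nodes(html):
    """Return the stripped, non-empty text nodes of an HTML string."""
    collector = TextNodeCollector()
    collector.feed(html)
    collector.close()
    return collector.texts


# Deliberately sloppy sample: unquoted attribute, unclosed <td>, <tr> and <p>.
SAMPLE_PAGE = """
<html><body>
<table id=tablejames>
  <tr><td>Widget<td>9.99
  <tr><td>Gadget<td>24.50
</table>
<p>Prices include VAT
</body></html>
"""

if __name__ == "__main__":
    for value in extract_text_nodes(SAMPLE_PAGE):
        print(value)
```

Nothing here is written back to the file, which is all my "read" mode would need. Each value in the returned list could then be inserted into the database on the server, with no round trip through IE.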
Thanks
James
-----Original Message-----
From: Cynic [mailto:[EMAIL PROTECTED]]
Sent: 14 January 2001 01:38
To: James Duncan; [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: RE: [PHP-WIN] DOM
It's not PHP vs. DOM. It's XML (DOM) vs. (bad) HTML. PHP just
provides you with an interface to an XML parser.
www.php4win.de
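A rough illustration of the difference, using Python's standard xml.dom.minidom as a stand-in for an XML DOM library (this is not PHP's interface, and the function names and sample strings are invented for the example). Given well-formed markup, the strict XML parser yields a DOM whose #text nodes can be walked; given typical tag soup, it gives up with an error.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def text_nodes(node):
    """Recursively collect the values of all non-blank #text nodes under node."""
    found = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            if child.nodeValue.strip():
                found.append(child.nodeValue.strip())
        else:
            found.extend(text_nodes(child))
    return found


def try_xml_dom(markup):
    """Return the text nodes of markup, or None if it is not well-formed XML."""
    try:
        document = minidom.parseString(markup)
    except ExpatError:
        # The XML parser refuses anything that is not well-formed.
        return None
    return text_nodes(document)


# Hypothetical inputs: the same table row, once well-formed, once as tag soup.
WELL_FORMED = "<table><tr><td>Widget</td><td>9.99</td></tr></table>"
TAG_SOUP = "<table><tr><td>Widget<td>9.99</table>"

if __name__ == "__main__":
    print(try_xml_dom(WELL_FORMED))
    print(try_xml_dom(TAG_SOUP))
```

The second input is the sort of HTML a browser renders without complaint, yet the XML parser rejects it because the td and tr elements are never closed.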
At 01:14 14.1. 2001, James Duncan wrote the following:
--------------------------------------------------------------
>Yikes. I'm just reading more about DOM and PHP at the moment on the
>PHPBuilder website.
>
>Does anyone have a version of PHP compiled with DOM support included for
>Windows (I'm developing on a Windows system before moving it over to
>Linux - RedHat)?
>
>So loading any old web page and trying to construct a DOM document from it
>via PHP isn't going to work? How does IE v5 manage to parse the same web
>page correctly (or what seems to be correctly)? I've already read in the
>DOM table node elements #text and their values via Javascript in IE.
>
>Still learning lots ;)
>
>Thanks
>
>James
>
>
>-----Original Message-----
>From: Cynic [mailto:[EMAIL PROTECTED]]
>Sent: 14 January 2001 00:07
>To: James Duncan; [EMAIL PROTECTED]
>Cc: [EMAIL PROTECTED]
>Subject: RE: [PHP-WIN] DOM
>
>I should warn you that XML functions require the document to be
>very 'correct'. Most HTML pages on the internet (I guess 98%... I
>wish browsers weren't so forgiving, it all might've been much
>easier and better) basically aren't valid HTML (which is a son of
>SGML, and an older, heavily crippled brother of XML), and even
>strict HTML isn't XML-compliant before XHTML 1.0, which is the
>latest version of HTML and fully XML-compliant.
>If you try to load such a document into an XML parser, it'll
>die with an error message, because XML requires the document
>to be well-formed.
>
>At 00:54 14.1. 2001, James Duncan wrote the following:
>--------------------------------------------------------------
>>Ah right... thanks for the info. As I said I'm very new to all of this and
>>reading lots, whilst trying to make sense of it all ;) So it is possible to
>>use PHP to access DOM elements (via the XML DOM library) created from an
>>HTML source file (a code example would be very handy)? Does anyone know if
>>an XML parser will be built into PHP in the future? I then assume I could
>>access DOM elements from an HTML file in the same easy way as I can via
>>Javascript in IE?
>>
>>Thanks
>>
>>James
>>
>>
>>-----Original Message-----
>>From: Cynic [mailto:[EMAIL PROTECTED]]
>>Sent: 13 January 2001 23:22
>>To: James Duncan; [EMAIL PROTECTED]
>>Cc: [EMAIL PROTECTED]
>>Subject: RE: [PHP-WIN] DOM
>>
>>You don't understand the basic concept.
>>
>>DOM (Document Object Model) is a tree representing the structure
>>of a document, where the elements (logically separate parts of the
>>content) are enclosed within tags to allow for computerized
>>processing. IE exposes its own version of the DOM through its
>>implementation of JS. If you want to access and manipulate an HTML
>>document in PHP using this tree-like abstraction (DOM), you will
>>have to use an XML DOM library. No XML parser is an integral part
>>of the language.
>>
>>
>>At 18:20 13.1. 2001, James Duncan wrote the following:
>>--------------------------------------------------------------
>>>I don't think this will work in my case because I don't control the layout
>>>of the HTML page and hence can't add the hidden fields. I'm downloading the
>>>HTML pages from a website. It would require as much work to insert the
>>>hidden fields as trying to strip the HTML tags in an attempt to read the
>>>data directly from the HTML page itself. There must be a way to access the
>>>DOM directly from PHP? I notice in the manual there is a section regarding
>>>XML DOM but not the DOM itself.
>>>
>>>Are the DOM values only available on the client? If that's the case then
>>>PHP can't be used to read them because it's limited to the server side?
>>>
>>>Thanks
>>>
>>>James
>>>
>>>-----Original Message-----
>>>From: Michael Stearne [mailto:[EMAIL PROTECTED]]
>>>Sent: 13 January 2001 17:06
>>>To: James Duncan
>>>Cc: [EMAIL PROTECTED]
>>>Subject: Re: [PHP-WIN] DOM
>>>
>>>Could you do something like:
>>>
>>>myForm.myField.value = tablejames.firstChild.childNodes[1].childNodes[4].firstChild.firstChild.nodeValue;
>>>
>>>Set up a form of hidden fields. Extract the values from the DOM and then
>>>have the user hit a Submit button to get to the next page. At that point
>>>the values that were collected and put into the hidden form fields will be
>>>submitted and your next page (the PHP page) could INSERT the values into
>>>the database.
>>>
>>>Michael
>>>
>>>
>>>On Friday, January 12, 2001, at 07:30 PM, James Duncan wrote:
>>>
>>>> Hi folks,
>>>>
>>>> I'm still new to HTML, Javascript and PHP but learning (fast hopefully).
>>>> I've just started accessing DOM elements. I have worked out how to update
>>>> the contents of table cells directly using this method, etc. In Javascript
>>>> I would use code like:
>>>>
>>>> alert("Value is: " + tablejames.firstChild.childNodes[1].childNodes[4].firstChild.firstChild.nodeName);
>>>> alert("Value is: " + tablejames.firstChild.childNodes[1].childNodes[5].firstChild.firstChild.nodeValue);
>>>>
>>>> This Javascript shows the name and value of the child element.
>>>>
>>>> Now I want to use PHP to extract data (values) from HTML pages like I do
>>>> with the above Javascript. Is this possible? Obviously with the Javascript
>>>> the HTML page has already been rendered in the browser (i.e. all tree
>>>> elements have been created). This makes extracting data a simple case of
>>>> finding the "#text" elements and reading in the values. Can I do the same
>>>> thing with PHP and an HTML file I've downloaded from the Internet?
>>>> Obviously this file is sitting on my server and hasn't been rendered in a
>>>> browser...
>>>>
>>>> The whole point of this exercise is so that I can extract values from an
>>>> HTML table and populate them into a database. Maybe it's easier to process
>>>> the HTML file line by line and strip the unwanted HTML tags? However, with
>>>> this approach I've got to hardcode each webpage...
>>>>
>>>> If this is a silly question then sorry but you only learn if you ask ;)
>>>>
>>>> Thanks
>>>>
>>>> James
>>>>
>>>>
>>>>
>>>> --
>>>> PHP Windows Mailing List (http://www.php.net/)
>>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>>>> For additional commands, e-mail: [EMAIL PROTECTED]
>>>> To contact the list administrators, e-mail: [EMAIL PROTECTED]
>>>>
>>>>
>>>>
>>>
>>>
>>------end of quote------
>>
>>
>>
>>____________________________________________________________
>>Cynic:
>>
>>A member of a group of ancient Greek philosophers who taught
>>that virtue constitutes happiness and that self control is
>>the essential part of virtue.
>>
>>[EMAIL PROTECTED]
>------end of quote------
>
>
>
------end of quote------