James,

If you are the owner of the downloaded HTML page you could use WDDX. Check it
out at WDDX.org.

Voll

-----Original Message-----
From: James Duncan [mailto:[EMAIL PROTECTED]]
Sent: Saturday, January 13, 2001 8:11 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: RE: [PHP-WIN] DOM


Hi again,

Thanks for your help so far. I will explain what I'm trying to achieve. I
want to pull down a web page that contains share prices, extract those share
prices, and update a database with the new prices. What I want is a nice and
neat solution that runs like a service (i.e. I can stop and start it from a
web browser, change the update interval (how long it waits before it repeats
the process)), etc.

I know a little PHP and Javascript (what I've taught myself over the last
few weeks). The process I have so far (not implemented at all yet):

1) PHP script that pulls down the relevant web page to my server
2) Data extraction from HTML web page
3) Updating of database with data from step 2
4) Running step 1 again after a certain period of time

Step 2 is the most complex by far. I was hoping to use PHP to access the
#text values via the DOM, but obviously this isn't possible because the DOM
doesn't exist server-side; it only exists after the HTML has been rendered
client-side. Like you say, I could create a form and hidden fields in the
HTML file and use Javascript to read the #text node values into the hidden
fields, then trigger a POST operation so that I can read in all the values.
But I don't like the sound of this, because then I would have to have a
browser interacting with my PHP scripts!?! I'm trying to create a
self-contained "service" that doesn't have any external dependencies.

Is there any other way of accomplishing this without involving a browser?
The only other way I can see is to use PHP to strip all HTML tags, leaving
just the text? I could then write PHP code to read in the remaining text,
etc.

What is the best way to accomplish this? Is there a PHP command that strips
all HTML tags (and Javascript, etc) from an HTML file? Example code would be
great ;)
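
On the PHP side, strip_tags() is the built-in function that removes HTML
tags from a string. To show what that kind of stripping involves without a
browser or a DOM, here is a minimal sketch in Javascript, the language of the
snippets elsewhere in this thread. The function name stripHtml and the sample
markup are made up for illustration, and the regular expressions are only a
rough approximation, not a real HTML parser.

```javascript
// Hypothetical helper: reduce an HTML string to its visible text.
// Regex-based and approximate; it is not a full HTML parser.
function stripHtml(html) {
  return html
    // Drop <script> and <style> blocks together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ")
    // Drop HTML comments.
    .replace(/<!--[\s\S]*?-->/g, " ")
    // Drop every remaining tag, leaving the text between tags.
    .replace(/<[^>]+>/g, " ")
    // Decode the few entities a price table is likely to contain.
    // &amp; goes last so that "&amp;lt;" is not decoded twice.
    .replace(/&nbsp;/g, " ")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&")
    // Collapse runs of whitespace.
    .replace(/\s+/g, " ")
    .trim();
}

// Made-up sample page, standing in for the downloaded file.
const samplePage =
  "<html><head><script>var x = '<b>';</script></head>" +
  "<body><table><tr><td>ACME</td><td>12.50</td></tr></table></body></html>";

console.log(stripHtml(samplePage)); // ACME 12.50
```

Stripping everything this way loses the table structure, so the remaining
text still has to be split up by position or by known labels.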

Thanks

James


-----Original Message-----
From: Michael Stearne [mailto:[EMAIL PROTECTED]]
Sent: 13 January 2001 18:40
To: James Duncan
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: [PHP-WIN] DOM

No, actually I think that using PHP to insert the FORM and then using
Javascript to get the #text node values would be easier.  Parsing table
cells using regexes is no fun.
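
For what it's worth, the regex route for table cells might look roughly like
the sketch below, written in Javascript like the other code in this thread.
It assumes simple tables with no nesting; extractRows and the sample markup
are hypothetical names and data, not anything taken from the page being
downloaded.

```javascript
// Hypothetical helper: pull the text of every table cell out of an HTML
// string, row by row. Assumes simple tables with no nested <table> elements.
function extractRows(html) {
  const rows = [];
  const rowPattern = /<tr\b[^>]*>([\s\S]*?)<\/tr\s*>/gi;
  let rowMatch;
  while ((rowMatch = rowPattern.exec(html)) !== null) {
    const cells = [];
    // A fresh pattern per row keeps lastIndex state from leaking between rows.
    const cellPattern = /<t[dh]\b[^>]*>([\s\S]*?)<\/t[dh]\s*>/gi;
    let cellMatch;
    while ((cellMatch = cellPattern.exec(rowMatch[1])) !== null) {
      // Remove inner markup such as <b> or <font>, then tidy whitespace.
      const text = cellMatch[1]
        .replace(/<[^>]+>/g, "")
        .replace(/\s+/g, " ")
        .trim();
      cells.push(text);
    }
    rows.push(cells);
  }
  return rows;
}

// Made-up sample table standing in for the downloaded share-price page.
const sampleTable =
  "<table>" +
  "<tr><th>Symbol</th><th>Price</th></tr>" +
  "<tr><td><b>ACME</b></td><td> 12.50 </td></tr>" +
  "<tr><td>INIT</td><td>3.75</td></tr>" +
  "</table>";

const priceRows = extractRows(sampleTable);
console.log(priceRows);
```

Each inner array is one table row, so priceRows[1][1] is the string "12.50",
which could be converted with parseFloat before the database update. Any
change to the page layout would still mean adjusting the patterns.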

Michael

On Saturday, January 13, 2001, at 01:06 PM, James Duncan wrote:

> But surely if I'm using fopen to insert a hidden form and fields I might as
> well use fopen to extract the data from the HTML page in the first place? It
> just seems so much easier to capture the #text node values from the DOM,
> rather than using fopen to locate the same information!?!
>
> Another idea would be to use my Javascript to capture the text node values
> from the DOM and write them to a cookie file. The contents of the cookie file
> could then be read by PHP to populate the database?
>
> Thanks
>
> James
>
>
> -----Original Message-----
> From: Michael Stearne [mailto:[EMAIL PROTECTED]]
> Sent: 13 January 2001 17:32
> To: James Duncan
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject: RE: [PHP-WIN] DOM
>
>
> On Saturday, January 13, 2001, at 12:20 PM, James Duncan wrote:
>
> > I don't think this will work in my case because I don't control the layout
> > of the HTML page and hence can't add the hidden fields. I'm downloading the
> > HTML pages from a website. It would require as much work to insert the
> > hidden fields as trying to strip the HTML tags in an attempt to read the
> > data directly from the HTML page itself. There must be a way to access the
> > DOM directly from PHP? I notice in the manual there is a section regarding
> > XML DOM but not the DOM itself.
> >
> > Are the DOM values only available on the client? If that's the case then PHP
> > can't be used to read them because it's limited to the server side?
>
> Well, by the point you are talking about, PHP is out of the picture.  PHP can
> be used to generate a DOM, but once it's generated PHP (or any server-side
> language) is out of the picture; it then goes to the client-side stuff, like
> you said. You can use PHP's fopen() to grab the page and then add the form
> and hidden fields I was talking about.  By doing this, you are setting up
> the page to be handled correctly by the Javascript code you inserted through
> PHP.
>
> Michael
>
> >
> > Thanks
> >
> > James
> >
> > -----Original Message-----
> > From: Michael Stearne [mailto:[EMAIL PROTECTED]]
> > Sent: 13 January 2001 17:06
> > To: James Duncan
> > Cc: [EMAIL PROTECTED]
> > Subject: Re: [PHP-WIN] DOM
> >
> > Could you do something like:
> >
> >
> > myForm.myField.value = tablejames.firstChild.childNodes[1].childNodes[4].firstChild.firstChild.nodeValue;
> >
> > Set up a form of hidden fields.  Extract the values from the DOM and then
> > have the user hit a Submit button to get to the next page.  At that point
> > the values that were collected and put into the hidden form fields will be
> > submitted and your next page (the PHP page) could INSERT the values into the
> > database.
> >
> > Michael
> >
> >
> > On Friday, January 12, 2001, at 07:30 PM, James Duncan wrote:
> >
> > > Hi folks,
> > >
> > > I'm still new to HTML, Javascript and PHP but learning (fast hopefully).
> > > I've just started accessing DOM elements. I have worked out how to update
> > > the contents of table cells directly using this method, etc. In Javascript
> > > I would use code like:
> > >
> > >   alert("Value is: " +
> > >     tablejames.firstChild.childNodes[1].childNodes[4].firstChild.firstChild.nodeName);
> > >   alert("Value is: " +
> > >     tablejames.firstChild.childNodes[1].childNodes[5].firstChild.firstChild.nodeValue);
> > >
> > > This Javascript shows the name and value of the child element.
> > >
> > > Now I want to use PHP to extract data (values) from HTML pages like I do
> > > with the above Javascript. Is this possible? Obviously with the Javascript
> > > the HTML page has already been rendered in the browser (i.e. all tree
> > > elements have been created). This makes extracting data a simple case of
> > > finding the "#text" elements and reading in the values. Can I do the same
> > > thing with PHP and an HTML file I've downloaded from the Internet?
> > > Obviously this file is sitting on my server and hasn't been rendered in a
> > > browser...
> > >
> > > The whole point of this exercise is so that I can extract values from an
> > > HTML table and populate them into a database. Maybe it's easier to process
> > > the HTML file line by line and strip the unwanted HTML tags? However, with
> > > this approach I've got to hardcode each webpage...
> > >
> > > If this is a silly question then sorry but you only learn if you ask ;)
> > >
> > > Thanks
> > >
> > > James
> > >
> > >
> > >
> > > --
> > > PHP Windows Mailing List (http://www.php.net/)
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> > > To contact the list administrators, e-mail: [EMAIL PROTECTED]
> > >
> > >
> > >
> >
> >
> >
>
>
> --
> PHP Windows Mailing List (http://www.php.net/)
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> To contact the list administrators, e-mail: [EMAIL PROTECTED]
>
>
>


-- 
PHP Windows Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]
