Dave -
Check out the URL extractor example that ships with pyparsing. It
handles many kinds of URL formats.
Download pyparsing at pyparsing.sourceforge.net.
-- Paul
--
http://mail.python.org/mailman/listinfo/python-list
Cheers for all your replies,
Peter Hansen: Couldnt agree more with you about not effectivly spamming
companies with my resume, thats why im crawling the local newspapers
website to - to find scrapes of information about companies :-)
I Wasnt planning to automate everything, more the boring addres
[EMAIL PROTECTED] writes:
> Anyway to the orginally replier - I wish it was homework ;-), that
> would mean I wouldnt be trying to find myself a job as a recent
> graduate... I decided to crawl something similar to the yellow pages
> (do you have them in the US?) for my select area and then find
[EMAIL PROTECTED] wrote:
> Anyway to the orginally replier - I wish it was homework ;-), that
> would mean I wouldnt be trying to find myself a job as a recent
> graduate... I decided to crawl something similar to the yellow pages
Sounds like a useful task, and a good way to learn more about Pytho
Thanks for the rapid replys, I cracked the problem 15 seconds after
posting here, doh!
Anyway to the orginally replier - I wish it was homework ;-), that
would mean I wouldnt be trying to find myself a job as a recent
graduate... I decided to crawl something similar to the yellow pages
(do you hav
[EMAIL PROTECTED] wrote:
> Hiya,
>
> Im trying to find a method of searching a html file (ive grabbed it
> with FancyURLopener), basically in the html file there is a series of
> links in the following format -
>
> A HREF="../../company/11/13/820.htm">some name
> so I want to search the file for
Sounds somewhat like homework. So I won't just give you a code
solution. Use the regular expression(re) module to match the urls.
--
http://mail.python.org/mailman/listinfo/python-list
Hiya,
Im trying to find a method of searching a html file (ive grabbed it
with FancyURLopener), basically in the html file there is a series of
links in the following format -
A HREF="../../company/11/13/820.htm">some namehttp://mail.python.org/mailman/listinfo/python-list