Re: Web Crawling/Threading and Things That Go Bump in the Night

2006-08-04 Thread [EMAIL PROTECTED]
Rem, what OS are you trying this on? Windows XP SP2 has a limit of around 40 TCP connections per second... Remarkable wrote: > Hello all > > I am trying to write a reliable web-crawler. I tried to write my own > using recursion and found I quickly hit the "too many sockets" open > problem. So I lo

Web Crawling/Threading and Things That Go Bump in the Night

2006-08-04 Thread Remarkable
Hello all I am trying to write a reliable web-crawler. I tried to write my own using recursion and found I quickly hit the "too many sockets" open problem. So I looked for a threaded version that I could easily extend. The simplest/most reliable I found was called Spider.py (see attached). At
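The "too many sockets" problem described above usually comes from starting fetches without any upper bound. As a minimal sketch (not the attached Spider.py, whose contents are not shown here), a fixed-size thread pool caps how many fetches run at once. The names `crawl`, `fetch_links`, `max_workers`, and `max_pages` are hypothetical; `fetch_links` is assumed to be a caller-supplied function that downloads one page, closes its connection, and returns the URLs found on it.

```python
from concurrent.futures import ThreadPoolExecutor


def crawl(start_url, fetch_links, max_workers=8, max_pages=100):
    """Breadth-first crawl with at most max_workers fetches in flight.

    fetch_links is a hypothetical callable: url -> iterable of linked URLs.
    """
    seen = {start_url}
    frontier = [start_url]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            # pool.map processes the whole level, but never runs more
            # than max_workers calls to fetch_links at the same time.
            results = list(pool.map(fetch_links, frontier))
            frontier = []
            for links in results:
                for link in links:
                    # Skip pages already queued and stop growing the
                    # set once max_pages URLs have been collected.
                    if link not in seen and len(seen) < max_pages:
                        seen.add(link)
                        frontier.append(link)
    return seen
```

Because the loop is iterative and the pool is bounded, the number of simultaneously open connections stays at or below `max_workers`, provided `fetch_links` closes each connection before returning.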

Re: web crawling.

2006-01-19 Thread John M. Gabriele
Alex Martelli wrote: > S Borg <[EMAIL PROTECTED]> wrote: > > >> Hello, >> >> I have been writing very simple Python programs that parse HTML and >>such, mainly just to get >>a better feel for the language. Here is my question: If I parsed an >>HTML page into all of the image >>files listed on tha

Re: web crawling.

2006-01-19 Thread Fuzzyman
Use BeautifulSoup to get all the image tags out of the HTML. You'll need to join the URLs of the images to the URL of the page (urlparse.urljoin off the top of my head). If you look at BeautifulSoup you will see how to get the 'src' reference of each image tag. All the best, Fuzzyman http://www.
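A rough sketch of that approach follows. To stay within the standard library it swaps BeautifulSoup for `html.parser.HTMLParser`, and it uses `urllib.parse.urljoin`, the Python 3 location of `urlparse.urljoin`. The names `ImageSrcCollector` and `extract_image_urls` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageSrcCollector(HTMLParser):
    """Collect the src of every <img> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # urljoin turns relative references into absolute URLs and
            # leaves references that are already absolute unchanged.
            self.image_urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    """Return absolute image URLs in document order."""
    parser = ImageSrcCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.image_urls
```

Each returned URL can then be requested and saved to disk one at a time. Image tags without a `src` attribute are skipped.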

Re: web crawling.

2006-01-19 Thread gene tani
S Borg wrote: > Hello, > > I have been writing very simple Python programs that parse HTML and > such, mainly just to get > a better feel for the language. Here is my question: If I parsed an > HTML page into all of the image > files listed on that page, how could I request all of those images an

Re: web crawling.

2006-01-18 Thread Alex Martelli
S Borg <[EMAIL PROTECTED]> wrote: > Hello, > > I have been writing very simple Python programs that parse HTML and > such, mainly just to get > a better feel for the language. Here is my question: If I parsed an > HTML page into all of the image > files listed on that page, how could I request

web crawling.

2006-01-18 Thread S Borg
Hello, I have been writing very simple Python programs that parse HTML and such, mainly just to get a better feel for the language. Here is my question: If I parsed an HTML page into all of the image files listed on that page, how could I request all of those images and download them into some
