I am involved in the development of a crawler. I need my script to
detect the mirror URLs of a page so that I can ignore those URLs.
Please suggest any ideas for detecting them.
--
http://mail.python.org/mailman/listinfo/python-list
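
One possible approach to the mirror-detection question above, offered only as a minimal sketch and not a complete solution: canonicalize each URL so trivial variants compare equal, then fingerprint the downloaded content so the same page served from a different host is recognized. The names `normalize_url`, `content_fingerprint`, and `MirrorFilter` are hypothetical, and exact hashing only catches byte-identical mirrors (after whitespace collapsing); near-duplicate pages would need a fuzzier technique such as shingling.

```python
import hashlib
import re
from urllib.parse import urlsplit, urlunsplit

# Default ports that can be dropped from a URL without changing its meaning.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url):
    """Canonicalize a URL so trivial variants compare equal."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep the port only when it is not the default for the scheme.
    if parts.port and parts.port != DEFAULT_PORTS.get(scheme):
        host = "%s:%d" % (host, parts.port)
    path = parts.path or "/"
    # Drop the fragment; it is never sent to the server.
    return urlunsplit((scheme, host, path, parts.query, ""))


def content_fingerprint(html):
    """Hash the page text after collapsing runs of whitespace."""
    collapsed = re.sub(r"\s+", " ", html).strip()
    return hashlib.sha1(collapsed.encode("utf-8")).hexdigest()


class MirrorFilter:
    """Hypothetical helper: remembers which content was seen at which URL."""

    def __init__(self):
        self.seen_urls = set()
        self.seen_fingerprints = {}  # fingerprint -> first canonical URL

    def is_known_url(self, url):
        return normalize_url(url) in self.seen_urls

    def is_mirror(self, url, html):
        """Record the page; return True if the same content was first
        seen under a different canonical URL."""
        canonical = normalize_url(url)
        self.seen_urls.add(canonical)
        fingerprint = content_fingerprint(html)
        first = self.seen_fingerprints.setdefault(fingerprint, canonical)
        return first != canonical
```

In a crawler, `is_known_url` could be checked before downloading and `is_mirror` after, so that links extracted from a mirrored page are not queued a second time.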
How can I create a database file using rnopen?
How do I set the key for this file?
I tried, but I got an error. Can anyone please help?
--
On Jun 18, 11:06 am, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> On Jun 17, 8:51 pm, Squzer Crawler <[EMAIL PROTECTED]> wrote:
>
> > I am developing a distributed environment at my college using Python. I
> > am using threads in the client for downloading w
I am developing a distributed environment at my college using Python. I
am using threads in the client for downloading web pages. Even though I am
reusing the threads, memory usage keeps increasing. I don't know why. I am
using Berkeley DB for the URLQueue and BeautifulSoup for parsing the web pages.
Any idea of redus