Rob Hudson wrote:
> wget -E --load-cookies /path/to/firefox/profiles/cookies.txt -r -k -l ...
> (-r = recurse ...)
I missed this with pycurl and have yet to find an example that supports it :( So I scanned the curl FAQ and found 3.15 [0]:
> 3.15 Can I do recursive fetches with curl?
> http://curl.mirrors.cyberservers.net/docs/faq.html#3.15
This means that to use pycurl you need a list of URLs up front, which is awkward because you have to parse each returned page yourself to find them. I would have thought curl could support this. Obviously not.

One obvious approach is to point pycurl at a base URL, then for each returned page (assuming it is HTML) parse it with HTMLParser [1], build a list of the URLs it contains, and fetch those pages in turn.

Not what you asked, though. If you want a quick hack, do what Rob Hudson suggests and use wget (which you probably did anyway), or try an old favourite of mine and use websucker [2].

The reason I looked at pycurl was the obvious Django/pycurl integration (i.e. a Django app that mirrors sites).

References
----------
[0] curl FAQ, 'Can I do recursive fetches with curl? No.'
    http://curl.mirrors.cyberservers.net/docs/faq.html#3.15
    [Accessed Saturday, 6 January, 2007]
[1] Python HTMLParser module, 'Parses text files formatted in HTML and XHTML'
    http://docs.python.org/lib/module-HTMLParser.html
    [Accessed Saturday, 6 January, 2007]
[2] Python websucker, 'creates a "mirror copy of a remote site"'
    http://svn.python.org/view/python/trunk/Tools/webchecker/
    [Accessed Saturday, 6 January, 2007]
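For what it's worth, here is a minimal sketch of that pycurl + HTMLParser approach: fetch a base URL, collect the links from the returned HTML, then fetch each linked page (one level deep, no recursion or same-host filtering). It's written in modern Python 3 rather than the Python of the original thread, and the names LinkParser, fetch and mirror_one_level are my own inventions, not part of pycurl or the stdlib.

```python
from html.parser import HTMLParser
from io import BytesIO
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect absolute URLs from <a href=...> tags in one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def fetch(url):
    """Fetch one page with pycurl and return its body as text."""
    import pycurl  # imported here so LinkParser works without pycurl installed

    buf = BytesIO()
    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.WRITEDATA, buf)
    c.setopt(pycurl.FOLLOWLOCATION, True)
    c.perform()
    c.close()
    return buf.getvalue().decode("utf-8", errors="replace")


def mirror_one_level(base_url):
    """Fetch base_url, then every page it links to, one level deep."""
    parser = LinkParser(base_url)
    parser.feed(fetch(base_url))
    return {url: fetch(url) for url in parser.links}
```

A real mirror would also need to skip non-HTML responses, avoid re-fetching seen URLs, and stay on the original host, but the loop above is the shape of it.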