Hi James,

I was pondering the same thing just the other day - trying to get flat
pages out of a Django site. At first I thought using *wget* would
suffice, but I also needed to do other things with the files
(archiving, uploading to FTP), so I needed some way of interacting with
the static pages after downloading them.

The obvious thing might have been to find a Python wget module. No such
luck. Looking further into wget
(http://en.wikipedia.org/wiki/Wget#Criticisms_of_Wget), I found several
limitations that could get in the way of extracting complete pages -
for example, it only speaks HTTP 1.0, so data referenced from JS and
CSS might not be extracted.
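For reference, the shell-out approach I started with looks something like
this. The flag choices and the `site_snapshot` directory are just one
reasonable setup for mirroring a site, not the only way to call wget:

```python
import subprocess

def wget_command(url, dest_dir="site_snapshot"):
    """Build a wget invocation that mirrors a site for local browsing."""
    return [
        "wget",
        "--mirror",           # recursive download with timestamping
        "--page-requisites",  # also fetch the CSS/JS/images a page needs
        "--convert-links",    # rewrite links so the copy browses locally
        "-P", dest_dir,       # save everything under dest_dir
        url,
    ]

def mirror_site(url, dest_dir="site_snapshot"):
    """Shell out to wget; raises CalledProcessError if wget fails."""
    return subprocess.run(wget_command(url, dest_dir), check=True)
```

This works, but everything after the download (archiving, FTP upload)
still has to be stitched on from outside the wget process.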

So I looked for an alternative. cURL came to mind
(http://en.wikipedia.org/wiki/CURL), and libcurl is exposed to Python
through PycURL (http://pycurl.sourceforge.net/). Using this approach I
can not only download pages with greater flexibility but also script
the downloads in Python instead of relying on a shell command call to
wget.
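A minimal sketch of what that looks like with PycURL - the fetch helper
and its option choices are my own illustration, not a complete mirroring
script:

```python
import pycurl
from io import BytesIO

def fetch(url):
    """Download one URL into memory; return (HTTP status, body bytes)."""
    buf = BytesIO()
    c = pycurl.Curl()
    c.setopt(c.URL, url)
    c.setopt(c.WRITEDATA, buf)        # collect the response body in buf
    c.setopt(c.FOLLOWLOCATION, True)  # follow redirects
    c.perform()
    status = c.getinfo(c.RESPONSE_CODE)
    c.close()
    return status, buf.getvalue()

# e.g. status, html = fetch("http://localhost:8000/some-flatpage/")
# after which the bytes are right there in Python for archiving or FTP
```

The nice part is that the downloaded content never leaves the Python
process, so the archiving and upload steps can run on it directly.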


--~--~---------~--~----~------------~-------~--~----~
 You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---
