On 8 March 2010 07:45, mjlissner <mjliss...@gmail.com> wrote:
> I'm trying to write a program that scrapes all the pdfs off of this
> page:
> http://www.ca3.uscourts.gov/recentop/week/recprec.htm
>
> And then puts them onto my hard drive, with the reference to their
> location in a FileField.
>
> I've defined a FileField in my model, and I believe I can scrape the
> site OK (though my code needs improvement), but I can't for the life
> of me figure out how to go from "here's my file" to "I've placed my
> file in the right directory, and there's a reference in my database to
> it - awesome."
>
> I could give more details, but I'm not entirely sure they'd be
> helpful. Essentially, I just want to know how to download a PDF, and
> store it locally, while updating the django db to point to it
> appropriately.
>
> So far, I've been very much unable to pull this off...
>
> Thanks,
>
> Mike

I'd write a management command that wraps around a couple of functions:


* One that gets the file and stores it locally using urllib2
* One that then takes the local file and saves it to the FileField on
your model. (Have a Google for using the Django ORM without the full
Django stack.)
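
As a rough sketch of the first function (not tested against the court's
site): the version below is written for Python 3, where urllib2 was split
into urllib.request and urllib.error. The name download_pdf and the
dest_dir argument are just placeholders I made up for illustration.

```python
import os
import shutil
import urllib.parse
import urllib.request


def download_pdf(url, dest_dir):
    """Fetch url and write it into dest_dir; return the local path."""
    os.makedirs(dest_dir, exist_ok=True)
    # Use the last path segment of the URL as the local filename.
    filename = os.path.basename(urllib.parse.urlparse(url).path)
    local_path = os.path.join(dest_dir, filename)
    # Stream the response to disk rather than reading it all into memory.
    with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return local_path
```

This assumes every URL ends in a usable filename; a URL ending in "/"
would need a fallback name.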

The thing to remember is to be nice when scraping (don't hammer the
server with rapid requests) and to check that your use of the data is
legal.

You could break out the two parts and have one script that downloads
all the files to a folder, then another that imports them all.
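
A minimal sketch of the import half, assuming the PDFs are already
sitting in a folder. find_pdfs, attach_pdf and the field name "pdf" are
placeholders of mine, not anything from your code; swap in your own model
and FileField name. The Django part relies on FieldFile.save(name,
content, save=True), which stores the file through the field's storage
and then saves the model instance.

```python
import os


def find_pdfs(folder):
    """Return sorted paths of the .pdf files directly inside folder."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(".pdf")
    )


def attach_pdf(instance, path, field_name="pdf"):
    """Copy the file at path into instance's FileField and save the row."""
    # Imported here so the folder scan above works without Django installed.
    from django.core.files import File

    with open(path, "rb") as fh:
        # FieldFile.save() writes the content via the field's storage
        # (under MEDIA_ROOT by default) and, with save=True, also saves
        # the model instance so the database points at the stored file.
        getattr(instance, field_name).save(
            os.path.basename(path), File(fh), save=True
        )
```

Inside a management command's handle() you would loop over
find_pdfs(folder) and call attach_pdf(YourModel(), path) for each one,
where YourModel stands for whatever model holds the FileField.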

HTH
Dan

> --
> You received this message because you are subscribed to the Google Groups 
> "Django users" group.
> To post to this group, send email to django-us...@googlegroups.com.
> To unsubscribe from this group, send email to 
> django-users+unsubscr...@googlegroups.com.
> For more options, visit this group at 
> http://groups.google.com/group/django-users?hl=en.
>
>



-- 
Dan Hilton
============================
www.twitter.com/danhilton
www.DanHilton.co.uk
============================
