On Fri, Jul 11, 2008 at 9:08 AM, Jon Brisbin <[EMAIL PROTECTED]> wrote:
> I was a little bummed to discover that I can't use the Postgres blob support
> we depend on at work through Django in a project of my own. I'm trying to
> keep multiple versions of an original document (including the embedded
> content that's inside the file) and had planned on using blobs to store the
> binary data. At some point in the past, it may have been less efficient to
> store binary data inside the database, but we have an application at work
> that stores thousands of reports per day and serves them to a report viewer
> application. It allows for an environment-agnostic way to retrieve and store
> those files and I don't have to worry about path prefixes and the various
> problems of managing a messy directory structure of a large number of files
> (I guess I'm supposed to just tar up the files using a cron job to back them
> up? Sounds like a pain).

As Karen mentioned, you could write a custom field for this. That
would be a bit easier for you than usual, since you don't have to
worry about compatibility with other databases. If you go this route,
I'd recommend subclassing the existing FileField, so that it presents
the same API as any other file field. Keep in mind that FileField also
adds methods (like get_FIELD_filename and get_FIELD_url) that might
not serve any useful purpose for BLOB-backed data, so choose wisely
how you want to approach it. Also, the way FileField itself works will
be changing soon, so you might want to hold off a bit anyway. More
details below.

> I'm having to change the way I was planning on handling these files and I'm
> not sure how to go about doing that. I have an abstract representation of
> the document as a model and I need to add the uploaded document as a version
> attached to that document model. It's not date-based, but hash-based. I need
> to store the file in a directory outside the document root (I don't want the
> original directly accessible), I guess using some scheme like:
>
> "/my/docs/dir/%id1/%id2/%hash/mydoc.doc"
>
> How do I get from where Django will put the uploaded file to where I really
> want the file stored (which is based on values I won't know until everything
> is saved) and make sure the file field gets updated to reflect the new path?
> Can I even move the file after it's been uploaded? The whole concept of
> "upload_to" seems pretty limiting to me because the uploaded file is just
> the first part of a processing chain that's more interested in what's inside
> the file than it is with the file itself.

I'm in the final stages of a significant patch[1] that improves how
Django names, stores and manages files, and I think those improvements
will really help you out. It's not in trunk yet, but it should land
there in the next few weeks, in time for the 1.0 release this fall.
A lot is changing with it, but I'll give you a quick rundown of the
parts I think will help you, if you decide to go with regular files
instead of a BLOB.

* "upload_to" gets a lot smarter, by accepting a function in place of
the current format string. Strings will still be allowed as well, but
using a function lets you have much more control. That function will
accept the uploaded filename as well as the object it's being attached
to, so you can write a function to retrieve IDs and calculate the
document hash and whatnot as part of the file naming process.
* The actual saving and loading of files moves out into a new
"Storage" class, which defaults to a FileSystemStorage that behaves
exactly as Django does now. This can be subclassed, though, and you
can tell Django to use your custom storage class either through a new
setting or by passing it to your FileField. By overriding a method or
two on that class, you'll have even more control over how and where
the file gets saved, whether under MEDIA_ROOT or somewhere else
entirely, possibly even based on the contents of the file itself.
* Rather than only being available as string content, files will be
made available as a new File object, which works much like the
built-in Python file object, but with a few differences. For one, it
integrates with whatever storage system you're using (see the second
bullet), so you can swap from one to another without having to change
your code. Perhaps more importantly, though, you can subclass
FileField and set an attribute on the new class to define what class
to use for these objects. That way, you can subclass the provided File
and override methods like "save" so that new versions are added
without overwriting old ones. You can also add attributes like
"version" or "hash", and methods for things like retrieving old
versions, whatever you like.
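
To make the first bullet concrete, here is a minimal sketch of a naming
function for the "%id1/%id2/%hash" layout from your message. It assumes
the function receives the object and the uploaded filename, in that
order (the patch isn't final, so treat the signature as an assumption),
and that the returned path is relative to wherever the storage keeps
its files. The attribute names author_id, document_id and content are
made up for illustration, as is the "docs" prefix.

```python
import hashlib
import posixpath
from types import SimpleNamespace


def version_path(instance, filename):
    """Return 'docs/<id1>/<id2>/<hash>/<name>' for one uploaded version."""
    # Hypothetical attributes: both ids must already be set on the object
    # when this runs, and `content` holds the uploaded bytes.
    digest = hashlib.sha1(instance.content).hexdigest()
    # Keep only the final path component of whatever name the client sent.
    name = posixpath.basename(filename.replace("\\", "/"))
    return posixpath.join(
        "docs", str(instance.author_id), str(instance.document_id), digest, name
    )


# Stand-in for a model instance, only to show the call.
example = SimpleNamespace(author_id=3, document_id=17, content=b"hello")
print(version_path(example, "mydoc.doc"))
```

If the patch lands as described, you would pass this to the field as
upload_to=version_path in place of the format string. Note that it only
works if the IDs are known before the file is saved, so the document
itself would have to be saved first.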

> I guess I'm just a little fuzzy on how manipulating files is supposed to
> work doing it the "django way". Any help would be appreciated.

I know I just dumped a lot of information on you, and there's even
more where that came from, but hopefully it helps explain the
direction Django's headed. You can take a look at the ticket to see
what other problems it's trying to solve. The latest patch there
hasn't been updated with recent changes, though, so you'll have to
wait a little longer for working code. The documentation in that patch
will also be updated, but it's already mostly accurate and should give
you a better idea of how it all works.

I hope this helps!

-Gul

[1] http://code.djangoproject.com/ticket/5361
