On Fri, Jul 11, 2008 at 9:08 AM, Jon Brisbin <[EMAIL PROTECTED]> wrote:
> I was a little bummed to discover that the Postgres blob support we depend
> on at work I can't use through Django in a project for myself. I'm trying to
> keep multiple versions of an original document (including the embedded
> content that's inside the file) and had planned on using blobs to store the
> binary data. At some point in the past, it may have been more inefficient to
> store binary data inside the database, but we have an application at work
> that stores thousands of reports per day and serves them to a report viewer
> application. It allows for an environment-agnostic way to retrieve and store
> those files and I don't have to worry about path prefixes and the various
> problems of managing a messy directory structure of a large number of files
> (I guess I'm supposed to just tar up the files using a cron job to back them
> up? Sounds like a pain).
As Karen mentioned, you could write a custom field for this, which would be a bit easier for you than it would be otherwise, since you don't have to worry about compatibility with other databases. If you go this route, I'd recommend subclassing the existing FileField, so that it presents the same API as any other file, but there are other things on there (like get_FIELD_filename and get_FIELD_url) that might not serve any useful purpose for you, so choose wisely how you want to approach it. Also, the way FileField itself works will be changing soon, so you might want to hold off a bit anyway. More details below.

> I'm having to change the way I was planning on handling these files and I'm
> not sure how to go about doing that. I have an abstract representation of
> the document as a model and I need to add the uploaded document as a version
> attached to that document model. It's not date-based, but hash-based. I need
> to store the file in a directory outside the document root (I don't want the
> original directly accessible) I guess using some scheme like:
>
> "/my/docs/dir/%id1/%id2/%hash/mydoc.doc"
>
> How do I get from where Django will put the uploaded file to where I really
> want the file stored (which is based on values I won't know until everything
> is saved) and make sure the file field gets updated to reflect the new path?
> Can I even move the file after it's been uploaded? The whole concept of
> "upload_to" seems pretty limiting to me because the uploaded file is just
> the first part of a processing chain that's more interested in what's inside
> the file than it is with the file itself.

I'm currently in some of the final stages of a significant patch[1] to improve how Django names, stores and manages files, and I think those improvements will really help you out. It's not in trunk yet, but it should make its way there in the next few weeks, so it can make it into the 1.0 release this fall.
There's a lot changing with it, but I'll give you a quick rundown of what I think will help you, if you decide to go with regular files instead of a BLOB.

* "upload_to" gets a lot smarter, by accepting a function in place of the current format string. Strings will still be allowed as well, but using a function lets you have much more control. That function will accept the uploaded filename as well as the object it's being attached to, so you can write a function to retrieve IDs and calculate the document hash and whatnot as part of the file naming process.

* The actual saving and loading of files is moved out into a new "Storage" class, which defaults to a FileSystemStorage that behaves exactly as Django does now. This can be subclassed, though, and you can tell Django to use your custom storage class by way of a new setting or passing into your FileField instance. By simply overriding a method or two on that class, you'll have even more control over how and where the file gets saved, whether under MEDIA_ROOT or somewhere else entirely, possibly even based on the contents of the file itself.

* Rather than only being available as string content, files will be made available as a new File object, which works much like the built-in Python file object, but with a few differences. For one, it integrates with whatever storage system you're using (see the second bullet), so you can swap from one to another without having to change your code. Perhaps more importantly, though, you can subclass FileField and set an attribute on the new class to define what class to use for these objects. That way, you can subclass the provided File and override methods like "save" so you can add new versions without overwriting old ones, add attributes like "version" or "hash" and methods for doing things like retrieving old versions, whatever you like.

> I guess I'm just a little fuzzy on how manipulating files is supposed to
> work doing it the "django way". Any help would be appreciated.
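To make the first point a bit more concrete, here's a rough sketch of the kind of function you could hand to "upload_to" once it accepts callables. Treat it as an illustration only: the patch isn't final, so the (instance, filename) argument order is an assumption on my part, and the attribute names (document_id, version_id, content) and the "docs" prefix are made up for the example. It uses a plain stand-in object instead of a real model so it runs without Django.

```python
import hashlib
import posixpath
from types import SimpleNamespace


def document_upload_path(instance, filename):
    """Return 'docs/<id1>/<id2>/<hash>/<filename>' for an uploaded version.

    Assumption: `instance` exposes `document_id`, `version_id` and
    `content` (bytes). These attribute names are hypothetical.
    """
    # Hash the file contents so each distinct version gets its own directory.
    digest = hashlib.sha1(instance.content).hexdigest()
    # Keep only the base name so a client-supplied path cannot add directories.
    safe_name = posixpath.basename(filename.replace("\\", "/"))
    return posixpath.join(
        "docs",
        str(instance.document_id),
        str(instance.version_id),
        digest,
        safe_name,
    )


# Stand-in for a model instance (hypothetical values, no Django required).
version = SimpleNamespace(document_id=12, version_id=3, content=b"hello world")
print(document_upload_path(version, "mydoc.doc"))
```

One thing to watch for: if the IDs come from auto-generated primary keys, they may not be set yet on an object that hasn't been saved, so you may need to save the parent document before attaching the file.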
I know I just dumped a lot of information on you, and there's even more where that came from, but hopefully it helps explain the direction Django's headed. You can take a look at the ticket to see what other problems it's trying to solve, but the latest patch on there hasn't been updated with recent changes, so you'll still have to wait a little bit for working code to use. The documentation in that patch will also be updated, but it's mostly accurate already and should give you a better idea of how it all works.

I hope this helps!

-Gul

[1] http://code.djangoproject.com/ticket/5361