Re: [PERFORM] Storing Digital Video
On Thu, Feb 09, 2006 at 04:14:09PM -0700, Jan Peterson wrote:
> In my experience, you don't want to store this stuff in the database.
> In general, it will work fine, until you have to VACUUM the
> pg_largeobject table. Unless you have a very powerful I/O subsystem,
> this VACUUM will kill your performance.

Good point about the vacuum issue; I haven't had to deal with vacuuming
very large objects.

> > You're forgetting about cleanup and transactions. If you store outside
> > the database you either have to write some kind of garbage collector,
> > or you add a trigger to delete the file on disk when the row in the
> > database pointing at it is deleted and hope that the transaction
> > doesn't roll back.
>
> Our solution to this problem was to have a separate table of "external
> files to delete". When you want to delete a file, you just stuff an
> entry into this table. If your transaction rolls back, so does your
> insert into this table. You have a separate thread that periodically
> walks this table and zaps the files from the filesystem.

Sure, there are lots of ways around it. My point was that there *is* a
tradeoff.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf         cell: 512-569-9461
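The deletion-queue approach described above can be sketched in plain SQL. This
is only a minimal illustration of the idea, not code from the original posters;
the table and column names (file_delete_queue, path, queued_at, videos) and the
example paths are hypothetical.

    -- Hypothetical queue of external files waiting to be removed.
    CREATE TABLE file_delete_queue (
        id        serial PRIMARY KEY,
        path      text NOT NULL,                      -- filesystem path of the file
        queued_at timestamptz NOT NULL DEFAULT now()
    );

    -- Deleting the row and queueing its file happen in one transaction,
    -- so a rollback also discards the queued deletion.
    BEGIN;
    DELETE FROM videos WHERE id = 42;
    INSERT INTO file_delete_queue (path) VALUES ('/data/videos/42.mpg');
    COMMIT;

    -- A separate worker periodically claims entries, unlinks the files
    -- from the filesystem, and then clears the rows it handled:
    SELECT id, path FROM file_delete_queue ORDER BY queued_at LIMIT 100;
    -- ... unlink each returned path outside the database, then ...
    DELETE FROM file_delete_queue WHERE id = 42;   -- repeated for each unlinked id

The key property is that the filesystem is only ever touched by the worker, after
the database deletion has durably committed, so a rolled-back transaction never
orphans or prematurely removes a file.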
Re: [PERFORM] 10+hrs vs 15min because of just one index
On Fri, Feb 10, 2006 at 09:24:39AM -0800, Aaron Turner wrote:
> On 2/10/06, Matthew T. O'Connor wrote:
> > Aaron Turner wrote:
> > > So I'm trying to figure out how to optimize my PG install (8.0.3) to
> > > get better performance without dropping one of my indexes.
> >
> > What about something like this:
> >
> > begin;
> > drop slow_index_name;
> > update;
> > create index slow_index_name;
> > commit;
> > vacuum;
>
> Right. That's exactly what I'm doing to get the update to occur in 15
> minutes. Unfortunately though, I'm basically at the point where every
> time I insert/update into that table I have to drop the index, which is
> making my life very painful (having to de-dupe records in RAM in my
> application is a lot faster but also more complicated/error-prone).
>
> Basically, I need some way to optimize PG so that I don't have to drop
> that index every time.
>
> Suggestions?

I think you'll have a tough time making this faster; or I'm just not
understanding the problem well enough. It's probably time to start
thinking about re-architecting some things in the application so that
you don't have to do this.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf         cell: 512-569-9461
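Spelled out with actual commands, the drop/update/recreate pattern quoted above
looks roughly like the following. The table, column, and index names are
placeholders, and the UPDATE stands in for whatever bulk statement is really
being run:

    BEGIN;
    DROP INDEX slow_index_name;                   -- skip index maintenance during the bulk update
    UPDATE big_table SET some_column = 'new';     -- placeholder for the real bulk update
    CREATE UNIQUE INDEX slow_index_name
        ON big_table (wide_key);                  -- rebuild the index once, at the end
    COMMIT;
    VACUUM ANALYZE big_table;                     -- reclaim dead tuples and refresh statistics

One caveat: DROP INDEX takes an ACCESS EXCLUSIVE lock on the table for the rest
of the transaction, so other readers and writers block until the COMMIT, which
is part of what makes running this on every load cycle so painful.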
Re: [PERFORM] 10+hrs vs 15min because of just one index
On 2/11/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
> On Fri, Feb 10, 2006 at 09:24:39AM -0800, Aaron Turner wrote:
> > On 2/10/06, Matthew T. O'Connor wrote:
> > > Aaron Turner wrote:
> >
> > Basically, I need some way to optimize PG so that I don't have to drop
> > that index every time.
> >
> > Suggestions?
>
> I think you'll have a tough time making this faster; or I'm just not
> understanding the problem well enough. It's probably time to start
> thinking about re-architecting some things in the application so that
> you don't have to do this.

Well, before I go about re-architecting things, it would be good to have
a strong understanding of just what is going on. Obviously, the unique
index on the char(48) column is the killer. What I don't know is:

1) Is this because the column is so long?
2) Is this because PG is not optimized for char(48) (maybe it wants
powers of 2? or doesn't like even numbers... I don't know, just
throwing it out there)?
3) Is there some algorithm I can use to estimate relative UPDATE speed?
I.e., if I cut the column length in half, does that make it 50% faster?
4) Does decoding the data (currently base64) and storing the binary
data improve the distribution of the index, thereby making it more
efficient?

Obviously, one solution would be to store the column to be UPDATEd in a
separate joined table. That would cost more disk space and be more
complex, but it would be more efficient for updates (inserts would, of
course, be more expensive, since now I have to do two).

--
Aaron Turner
http://synfin.net/
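A minimal sketch of that split, with hypothetical table and column names
(records, record_state, sig, status), might look like the following. The point
is that on 8.0 every UPDATE writes a new row version and new entries into all
of the updated table's indexes, so moving the frequently changed column into
its own narrow table keeps those updates away from the char(48) unique index:

    -- Keep the char(48) unique key in the main table...
    CREATE TABLE records (
        id  serial PRIMARY KEY,
        sig char(48) NOT NULL UNIQUE          -- the column behind the slow index
    );

    -- ...and move the frequently UPDATEd column into a side table.
    CREATE TABLE record_state (
        record_id integer PRIMARY KEY REFERENCES records (id),
        status    integer NOT NULL            -- stand-in for the column that actually changes
    );

    -- Updates only touch record_state, so they never rewrite entries
    -- in the expensive unique index on records.sig:
    UPDATE record_state SET status = 2 WHERE record_id = 1;

    -- Inserts now cost two statements instead of one:
    INSERT INTO records (sig) VALUES ('...base64 digest...');
    INSERT INTO record_state (record_id, status)
        VALUES (currval('records_id_seq'), 0);

Reads that need both pieces pay for a join, and inserts pay for the extra
statement, which is the disk-space/complexity tradeoff mentioned above.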