Re: [PERFORM] Storing Digital Video
On Thu, Feb 09, 2006 at 04:14:09PM -0700, Jan Peterson wrote:
> In my experience, you don't want to store this stuff in the database.
> In general, it will work fine, until you have to VACUUM the
> pg_largeobject table. Unless you have a very powerful I/O subsystem,
> this VACUUM will kill your performance.

Good point about the vacuum issue; I haven't had to deal with vacuuming
very large objects.

> > You're forgetting about cleanup and transactions. If you store outside
> > the database you either have to write some kind of garbage collector,
> > or you add a trigger to delete the file on disk when the row in the
> > database pointing at it is deleted and hope that the transaction
> > doesn't roll back.
>
> Our solution to this problem was to have a separate table of "external
> files to delete". When you want to delete a file, you just stuff an
> entry into this table. If your transaction rolls back, so does your
> insert into this table. You have a separate thread that periodically
> walks this table and zaps the files from the filesystem.

Sure, there are lots of ways around it. My point was that there *is* a
tradeoff.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf         cell: 512-569-9461
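The deletion-queue approach described above can be sketched in plain SQL. This
is only a minimal illustration of the idea, not code from the original posters;
the table and column names (file_delete_queue, path, queued_at, videos) and the
example paths are hypothetical.

    -- Hypothetical queue of external files waiting to be removed.
    CREATE TABLE file_delete_queue (
        id        serial PRIMARY KEY,
        path      text NOT NULL,                      -- filesystem path of the file
        queued_at timestamptz NOT NULL DEFAULT now()
    );

    -- Deleting the row and queueing its file happen in one transaction,
    -- so a rollback also discards the queued deletion.
    BEGIN;
    DELETE FROM videos WHERE id = 42;
    INSERT INTO file_delete_queue (path) VALUES ('/data/videos/42.mpg');
    COMMIT;

    -- A separate worker periodically claims entries, unlinks the files
    -- from the filesystem, and then clears the rows it handled:
    SELECT id, path FROM file_delete_queue ORDER BY queued_at LIMIT 100;
    -- ... unlink each returned path outside the database, then ...
    DELETE FROM file_delete_queue WHERE id = 42;   -- repeated for each unlinked id

The key property is that the filesystem is only ever touched by the worker, after
the database deletion has durably committed, so a rolled-back transaction never
orphans or prematurely removes a file.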
Re: [PERFORM] 10+hrs vs 15min because of just one index
On Fri, Feb 10, 2006 at 09:24:39AM -0800, Aaron Turner wrote:
> On 2/10/06, Matthew T. O'Connor wrote:
> > Aaron Turner wrote:
> > > So I'm trying to figure out how to optimize my PG install (8.0.3) to
> > > get better performance without dropping one of my indexes.
> >
> > What about something like this:
> >
> > begin;
> > drop slow_index_name;
> > update;
> > create index slow_index_name;
> > commit;
> > vacuum;
>
> Right. That's exactly what I'm doing to get the update to occur in 15
> minutes. Unfortunately though, I'm basically at the point where every
> time I insert/update into that table I have to drop the index, which is
> making my life very painful (having to de-dupe records in RAM in my
> application is a lot faster but also more complicated/error-prone).
>
> Basically, I need some way to optimize PG so that I don't have to drop
> that index every time.
>
> Suggestions?

I think you'll have a tough time making this faster; or I'm just not
understanding the problem well enough. It's probably time to start
thinking about re-architecting some things in the application so that
you don't have to do this.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf         cell: 512-569-9461
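Spelled out with actual commands, the drop/update/recreate pattern quoted above
looks roughly like the following. The table, column, and index names are
placeholders, and the UPDATE stands in for whatever bulk statement is really
being run:

    BEGIN;
    DROP INDEX slow_index_name;                   -- skip index maintenance during the bulk update
    UPDATE big_table SET some_column = 'new';     -- placeholder for the real bulk update
    CREATE UNIQUE INDEX slow_index_name
        ON big_table (wide_key);                  -- rebuild the index once, at the end
    COMMIT;
    VACUUM ANALYZE big_table;                     -- reclaim dead tuples and refresh statistics

One caveat: DROP INDEX takes an ACCESS EXCLUSIVE lock on the table for the rest
of the transaction, so other readers and writers block until the COMMIT, which
is part of what makes running this on every load cycle so painful.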
Re: [PERFORM] 10+hrs vs 15min because of just one index
On 2/11/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
> On Fri, Feb 10, 2006 at 09:24:39AM -0800, Aaron Turner wrote:
> > On 2/10/06, Matthew T. O'Connor wrote:
> > > Aaron Turner wrote:
> >
> > Basically, I need some way to optimize PG so that I don't have to drop
> > that index every time.
> >
> > Suggestions?
>
> I think you'll have a tough time making this faster; or I'm just not
> understanding the problem well enough. It's probably time to start
> thinking about re-architecting some things in the application so that
> you don't have to do this.

Well, before I go about re-architecting things, it would be good to have
a strong understanding of just what is going on. Obviously, the unique
index on the char(48) column is the killer. What I don't know is:

1) Is this because the column is so long?
2) Is this because PG is not optimized for char(48) (maybe it wants
powers of 2? or doesn't like even numbers... I don't know, just
throwing it out there)?
3) Is there some algorithm I can use to estimate relative UPDATE speed?
I.e., if I cut the column length in half, does that make it 50% faster?
4) Does decoding the data (currently base64) and storing the binary
data improve the distribution of the index, thereby making it more
efficient?

Obviously, one solution would be to store the column to be UPDATEd in a
separate joined table. That would cost more disk space and be more
complex, but it would be more efficient for updates (inserts would, of
course, be more expensive, since now I have to do two).

--
Aaron Turner
http://synfin.net/
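A minimal sketch of that split, with hypothetical table and column names
(records, record_state, sig, status), might look like the following. The point
is that on 8.0 every UPDATE writes a new row version and new entries into all
of the updated table's indexes, so moving the frequently changed column into
its own narrow table keeps those updates away from the char(48) unique index:

    -- Keep the char(48) unique key in the main table...
    CREATE TABLE records (
        id  serial PRIMARY KEY,
        sig char(48) NOT NULL UNIQUE          -- the column behind the slow index
    );

    -- ...and move the frequently UPDATEd column into a side table.
    CREATE TABLE record_state (
        record_id integer PRIMARY KEY REFERENCES records (id),
        status    integer NOT NULL            -- stand-in for the column that actually changes
    );

    -- Updates only touch record_state, so they never rewrite entries
    -- in the expensive unique index on records.sig:
    UPDATE record_state SET status = 2 WHERE record_id = 1;

    -- Inserts now cost two statements instead of one:
    INSERT INTO records (sig) VALUES ('...base64 digest...');
    INSERT INTO record_state (record_id, status)
        VALUES (currval('records_id_seq'), 0);

Reads that need both pieces pay for a join, and inserts pay for the extra
statement, which is the disk-space/complexity tradeoff mentioned above.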