Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-19 Thread Tom Lane
Robert Haas writes: > What if we change the hash table to have RelFileNode as the key and an > array of MAX_FORKNUM bitmapsets as the value? Then when you get a > "forget" request, you can just zap all the sets to empty. Hm ... the only argument I can really make against that is that there'll be

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-19 Thread Robert Haas
On Thu, Jul 19, 2012 at 2:57 PM, Tom Lane wrote: > Robert Haas writes: >> On Thu, Jul 19, 2012 at 10:09 AM, Tom Lane wrote: >>> What about checking just the immediately previous entry? This would >>> at least fix the problem for bulk-load situations, and the cost ought >>> to be about negligibl

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-19 Thread Tom Lane
Robert Haas writes: > On Thu, Jul 19, 2012 at 10:09 AM, Tom Lane wrote: >> What about checking just the immediately previous entry? This would >> at least fix the problem for bulk-load situations, and the cost ought >> to be about negligible compared to acquiring the LWLock. > 2. You say "fix t

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-19 Thread Robert Haas
On Thu, Jul 19, 2012 at 10:09 AM, Tom Lane wrote: > Robert Haas writes: >> Seems a bit complex, but it might be worth it. Keep in mind that I >> eventually want to be able to make an unlogged table logged or a visca >> versa, which will probably entail unlinking just the init fork (for >> the lo

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-19 Thread Tom Lane
Robert Haas writes: > Seems a bit complex, but it might be worth it. Keep in mind that I > eventually want to be able to make an unlogged table logged or a visca > versa, which will probably entail unlinking just the init fork (for > the logged -> unlogged direction). Well, as far as that goes,

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-19 Thread Robert Haas
On Wed, Jul 18, 2012 at 5:17 PM, Tom Lane wrote: > I've been chewing on this issue some more, and no longer like my > previous proposal, which was > ... What I'm thinking about is reducing the hash key to just RelFileNodeBackend + ForkNumber, so that there's one hashtable entry per

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-18 Thread Tom Lane
Robert Haas writes: > On Mon, Jul 16, 2012 at 12:36 PM, Tom Lane wrote: >> Well, that argument is exactly why the code is designed the way it is... >> but we are now finding out that sending useless fsync requests isn't as >> cheap as all that. > I agree, but I think the problem can be solved fo

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-17 Thread Greg Smith
On 07/16/2012 02:39 PM, Robert Haas wrote: Unfortunately, there are lots of important operations (like bulk loading, SELECT * FROM bigtable, and VACUUM notverybigtable) that inevitably end up writing out their own dirty buffers. And even when the background writer does write something, it's not

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-16 Thread Tom Lane
Robert Haas writes: > Yes, it seems to have done just that. The comment for > ForwardFsyncRequest is a few bricks short of a load too: > ... > Line 2 seems to have been mechanically changed from "background > writer" to "checkpointer", but of course it should still say > "background writer" in th

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-16 Thread Robert Haas
On Mon, Jul 16, 2012 at 3:18 PM, Tom Lane wrote: > Robert Haas writes: >> At any rate, I'm somewhat less convinced that the split was a good >> idea than I was when we did it, mostly because we haven't really gone >> anywhere with it subsequently. > > BTW, while we are on the subject: hasn't this

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-16 Thread Tom Lane
Robert Haas writes: > At any rate, I'm somewhat less convinced that the split was a good > idea than I was when we did it, mostly because we haven't really gone > anywhere with it subsequently. BTW, while we are on the subject: hasn't this split completely broken the statistics about backend-init

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-16 Thread Tom Lane
Robert Haas writes: > Unfortunately, there are lots of important operations (like bulk > loading, SELECT * FROM bigtable, and VACUUM notverybigtable) that > inevitably end up writing out their own dirty buffers. And even when > the background writer does write something, it's not always clear tha

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-16 Thread Robert Haas
On Mon, Jul 16, 2012 at 12:57 PM, Tom Lane wrote: > Robert Haas writes: >> In my view, the elephant in the room here is that it's dramatically >> inefficient for every backend to send an fsync request on every block >> write. > > Yeah. This was better before the decision was taken to separate >

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-16 Thread Tom Lane
Robert Haas writes: > In my view, the elephant in the room here is that it's dramatically > inefficient for every backend to send an fsync request on every block > write. Yeah. This was better before the decision was taken to separate bgwriter from checkpointer; before that, only local communica

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-16 Thread Robert Haas
On Mon, Jul 16, 2012 at 12:36 PM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Jul 16, 2012 at 12:08 PM, Tom Lane wrote: >>> Uh, that's exactly what's under discussion. Not sending useless fsync >>> requests when fsync is off is just one part of it; a part that happens >>> to be quite usefu

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-16 Thread Tom Lane
Robert Haas writes: > On Mon, Jul 16, 2012 at 12:08 PM, Tom Lane wrote: >> Uh, that's exactly what's under discussion. Not sending useless fsync >> requests when fsync is off is just one part of it; a part that happens >> to be quite useful for some test scenarios, even if not so much for >> pro

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-16 Thread Robert Haas
On Mon, Jul 16, 2012 at 12:08 PM, Tom Lane wrote: > Robert Haas writes: >> On Sun, Jul 15, 2012 at 2:29 PM, Tom Lane wrote: >>> Yeah, you have a point there. It's not real clear that switching fsync >>> from off to on is an operation that we can make any guarantees about, >>> short of executing

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-16 Thread Tom Lane
Robert Haas writes: > On Sun, Jul 15, 2012 at 2:29 PM, Tom Lane wrote: >> Yeah, you have a point there. It's not real clear that switching fsync >> from off to on is an operation that we can make any guarantees about, >> short of executing something like the code recently added to initdb >> to f

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-16 Thread Robert Haas
On Sun, Jul 15, 2012 at 2:29 PM, Tom Lane wrote: > I think what we ought to do is bite the bullet and refactor the > representation of the pendingOps table. What I'm thinking about > is reducing the hash key to just RelFileNodeBackend + ForkNumber, > so that there's one hashtable entry per fork,

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-15 Thread Craig Ringer
On 07/16/2012 09:37 AM, Tom Lane wrote: There's one way that doesn't have any housekeeping cost to Pg. It's pretty bad manners if there's anybody other than Pg on the system though: sync() Yeah, I thought about that: if we could document that issuing a manual sync after turning fsync on leav

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-15 Thread Tom Lane
Craig Ringer writes: > On 07/16/2012 02:29 AM, Tom Lane wrote: >> Yeah, you have a point there. It's not real clear that switching fsync >> from off to on is an operation that we can make any guarantees about, >> short of executing something like the code recently added to initdb >> to force-sync

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-15 Thread Craig Ringer
On 07/16/2012 02:29 AM, Tom Lane wrote: Yeah, you have a point there. It's not real clear that switching fsync from off to on is an operation that we can make any guarantees about, short of executing something like the code recently added to initdb to force-sync the entire PGDATA tree. There's

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-15 Thread Tom Lane
... btw, in the penny wise and pound foolish department, I observe that smgrdounlink calls mdunlink separately for each possibly existing fork of a relation to be dropped. That means we are queuing a separate fsync queue entry for each fork, and could immediately save a factor of four in FORGET_RE

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-15 Thread Tom Lane
Jeff Janes writes: > On Thu, Jul 12, 2012 at 9:55 PM, Jeff Janes wrote: >> The topic was poor performance when truncating lots of small tables >> repeatedly on test environments with fsync=off. >> >> On Thu, Jul 12, 2012 at 6:00 PM, Jeff Janes wrote: >>> I think the problem is in the Fsync Abso

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-14 Thread Jeff Janes
On Thu, Jul 12, 2012 at 9:55 PM, Jeff Janes wrote: > I've moved this thread from performance to hackers. > > The topic was poor performance when truncating lots of small tables > repeatedly on test environments with fsync=off. > > On Thu, Jul 12, 2012 at 6:00 PM, Jeff Janes wrote: > >> I think th

Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-12 Thread Jeff Janes
I've moved this thread from performance to hackers. The topic was poor performance when truncating lots of small tables repeatedly on test environments with fsync=off. On Thu, Jul 12, 2012 at 6:00 PM, Jeff Janes wrote: > I think the problem is in the Fsync Absorption queue. Every truncate > ad