Re: [HACKERS] We need to log aborted autovacuums

2011-02-07 Thread Robert Haas
On Sat, Feb 5, 2011 at 3:34 PM, Cédric Villemain wrote: > Anyway, without GUC is fine too as it won't fill the /var/log itself ! > I am just not opposed to have new GUC in those areas (log && debug). OK. Committed without a new GUC, at least for now. -- Robert Haas EnterpriseDB: http://www.ent

Re: [HACKERS] We need to log aborted autovacuums

2011-02-05 Thread Cédric Villemain
2011/2/5 Robert Haas : > On Sat, Feb 5, 2011 at 3:20 PM, Cédric Villemain > wrote: >>> In the case where a table is skipped for this reason, we log a message >>> at log level LOG.  The version of the patch I posted does that >>> unconditionally, but my intention was to change it before commit so >

Re: [HACKERS] We need to log aborted autovacuums

2011-02-05 Thread Robert Haas
On Sat, Feb 5, 2011 at 3:20 PM, Cédric Villemain wrote: >> In the case where a table is skipped for this reason, we log a message >> at log level LOG.  The version of the patch I posted does that >> unconditionally, but my intention was to change it before commit so >> that it only logs the messag

Re: [HACKERS] We need to log aborted autovacuums

2011-02-05 Thread Cédric Villemain
2011/2/5 Robert Haas : > On Sat, Feb 5, 2011 at 12:54 PM, Cédric Villemain > wrote: >> what do you implement exactly ? >> * The original request from Josh to get LOG when autovac can not run >> because of locks >> * VACOPT_NOWAIT, what is it ? > > What the patch implements is: > > If autovacuum ca

Re: [HACKERS] We need to log aborted autovacuums

2011-02-05 Thread Robert Haas
On Sat, Feb 5, 2011 at 12:54 PM, Cédric Villemain wrote: > what do you implement exactly ? > * The original request from Josh to get LOG when autovac can not run > because of locks > * VACOPT_NOWAIT, what is it ? What the patch implements is: If autovacuum can't get the table lock immediately, i

Re: [HACKERS] We need to log aborted autovacuums

2011-02-05 Thread Cédric Villemain
2011/2/4 Robert Haas : > On Sun, Jan 30, 2011 at 10:26 PM, Robert Haas wrote: >> On Sun, Jan 30, 2011 at 10:03 PM, Alvaro Herrera >> wrote: >>> Excerpts from Robert Haas's message of dom ene 30 23:37:51 -0300 2011: >>> Unless I'm missing something, making autovacuum.c call ConditionalLo

Re: [HACKERS] We need to log aborted autovacuums

2011-02-04 Thread Robert Haas
On Fri, Feb 4, 2011 at 12:59 AM, Josh Berkus wrote: > Robert, > >> Seeing as how there seem to be neither objections nor endorsements, >> I'm inclined to commit what I proposed more or less as-is.  There >> remains the issue of what do about the log spam.  Josh Berkus >> suggested logging it when

Re: [HACKERS] We need to log aborted autovacuums

2011-02-03 Thread Robert Haas
On Sun, Jan 30, 2011 at 10:26 PM, Robert Haas wrote: > On Sun, Jan 30, 2011 at 10:03 PM, Alvaro Herrera > wrote: >> Excerpts from Robert Haas's message of dom ene 30 23:37:51 -0300 2011: >> >>> Unless I'm missing something, making autovacuum.c call >>> ConditionalLockRelationOid() is not going to

Re: [HACKERS] We need to log aborted autovacuums

2011-01-30 Thread Robert Haas
On Sun, Jan 30, 2011 at 10:03 PM, Alvaro Herrera wrote: > Excerpts from Robert Haas's message of dom ene 30 23:37:51 -0300 2011: > >> Unless I'm missing something, making autovacuum.c call >> ConditionalLockRelationOid() is not going to work, because the vacuum >> transaction isn't started until w

Re: [HACKERS] We need to log aborted autovacuums

2011-01-30 Thread Alvaro Herrera
Excerpts from Robert Haas's message of dom ene 30 23:37:51 -0300 2011: > Unless I'm missing something, making autovacuum.c call > ConditionalLockRelationOid() is not going to work, because the vacuum > transaction isn't started until we get all the way down to > vacuum_rel(). Maybe we need Condit

Re: [HACKERS] We need to log aborted autovacuums

2011-01-30 Thread Robert Haas
On Mon, Jan 17, 2011 at 4:08 PM, Tom Lane wrote: > Simon Riggs writes: >> On Mon, 2011-01-17 at 14:46 -0500, Tom Lane wrote: >>> Do we actually need a lock timeout either?  The patch that was being >>> discussed just involved failing if you couldn't get it immediately. >>> I suspect that's suffic

Re: [HACKERS] We need to log aborted autovacuums

2011-01-17 Thread Peter Eisentraut
On mån, 2011-01-17 at 17:26 -0800, Josh Berkus wrote: > However, it's hard for me to imagine a real-world situation where a > table would be under repeated full-table-locks from multiple > connections. Can anyone else? If you want to do assertion-type checks at the end of transactions in the curr

Re: [HACKERS] We need to log aborted autovacuums

2011-01-17 Thread Tom Lane
Josh Berkus writes: > On 1/17/11 11:46 AM, Tom Lane wrote: >> Do we actually need a lock timeout either? The patch that was being >> discussed just involved failing if you couldn't get it immediately. >> I suspect that's sufficient for AV. At least, nobody's made a >> compelling argument why we

Re: [HACKERS] We need to log aborted autovacuums

2011-01-17 Thread Robert Haas
On Mon, Jan 17, 2011 at 8:26 PM, Josh Berkus wrote: > On 1/17/11 11:46 AM, Tom Lane wrote: >> Do we actually need a lock timeout either?  The patch that was being >> discussed just involved failing if you couldn't get it immediately. >> I suspect that's sufficient for AV.  At least, nobody's made

Re: [HACKERS] We need to log aborted autovacuums

2011-01-17 Thread Josh Berkus
On 1/17/11 11:46 AM, Tom Lane wrote: > Do we actually need a lock timeout either? The patch that was being > discussed just involved failing if you couldn't get it immediately. > I suspect that's sufficient for AV. At least, nobody's made a > compelling argument why we need to expend a very subst

Re: [HACKERS] We need to log aborted autovacuums

2011-01-17 Thread Tom Lane
Simon Riggs writes: > On Mon, 2011-01-17 at 14:46 -0500, Tom Lane wrote: >> Do we actually need a lock timeout either? The patch that was being >> discussed just involved failing if you couldn't get it immediately. >> I suspect that's sufficient for AV. At least, nobody's made a >> compelling ar

Re: [HACKERS] We need to log aborted autovacuums

2011-01-17 Thread Simon Riggs
On Mon, 2011-01-17 at 14:46 -0500, Tom Lane wrote: > Josh Berkus writes: > > However, we'd want a separate lock timeout for autovac, of course. I'm > > not at all keen on a *statement* timeout on autovacuum; as long as > > autovacuum is doing work, I don't want to cancel it. Also, WTF would we >

Re: [HACKERS] We need to log aborted autovacuums

2011-01-17 Thread Tom Lane
Josh Berkus writes: > However, we'd want a separate lock timeout for autovac, of course. I'm > not at all keen on a *statement* timeout on autovacuum; as long as > autovacuum is doing work, I don't want to cancel it. Also, WTF would we > set it to? Yeah --- in the presence of vacuum cost delay,

Re: [HACKERS] We need to log aborted autovacuums

2011-01-16 Thread Robert Haas
On Sun, Jan 16, 2011 at 8:36 PM, Simon Riggs wrote: > I agree with you, but if we want it *this* release, on top of all the > other features we have queued, then I suggest we compromise. If you hold > out for more feature, you may get less. > > Statement timeout = 2 * (100ms + autovacuum_vacuum_co

Re: [HACKERS] We need to log aborted autovacuums

2011-01-16 Thread Simon Riggs
On Sun, 2011-01-16 at 12:50 -0800, Josh Berkus wrote: > On 1/16/11 11:19 AM, Simon Riggs wrote: > > I would prefer it if we had a settable lock timeout, as suggested many > > moons ago. When that was discussed before it was said there was no > > difference between a statement timeout and a lock tim

Re: [HACKERS] We need to log aborted autovacuums

2011-01-16 Thread Josh Berkus
On 1/16/11 11:19 AM, Simon Riggs wrote: > I would prefer it if we had a settable lock timeout, as suggested many > moons ago. When that was discussed before it was said there was no > difference between a statement timeout and a lock timeout, but I think > there clearly is, this case being just one

Re: [HACKERS] We need to log aborted autovacuums

2011-01-16 Thread Simon Riggs
On Sun, 2011-01-16 at 13:08 -0500, Greg Smith wrote: > Simon Riggs wrote: > > I'm fairly confused by this thread. > > > > That's because you think it has something to do with cancellation, which > it doesn't. The original report here noted a real problem but got the > theorized cause wrong.

Re: [HACKERS] We need to log aborted autovacuums

2011-01-16 Thread Tom Lane
Greg Smith writes: > Tom Lane wrote: >> No, I don't believe we should be messing with the semantics of >> try_relation_open. It is what it is. > With only four pretty simple callers to the thing, and two of them > needing the alternate behavior, it seemed a reasonable place to modify > to me.

Re: [HACKERS] We need to log aborted autovacuums

2011-01-16 Thread Greg Smith
Tom Lane wrote: No, I don't believe we should be messing with the semantics of try_relation_open. It is what it is. With only four pretty simple callers to the thing, and two of them needing the alternate behavior, it seemed a reasonable place to modify to me. I thought the "nowait" bool

Re: [HACKERS] We need to log aborted autovacuums

2011-01-16 Thread Tom Lane
Greg Smith writes: > Simon Riggs wrote: >> I'm fairly confused by this thread. > That's because you think it has something to do with cancellation, which > it doesn't. The original report here noted a real problem but got the > theorized cause wrong. I think that cancellations are also a pote

Re: [HACKERS] We need to log aborted autovacuums

2011-01-16 Thread Greg Smith
Simon Riggs wrote: I'm fairly confused by this thread. That's because you think it has something to do with cancellation, which it doesn't. The original report here noted a real problem but got the theorized cause wrong. It turns out the code that acquires a lock when autovacuum decides

Re: [HACKERS] We need to log aborted autovacuums

2011-01-16 Thread Tom Lane
Simon Riggs writes: > I'm fairly confused by this thread. > We *do* emit a message when we cancel an autovacuum task. We went to a > lot of trouble to do that. The message is DEBUG2, and says > "sending cancel to blocking autovacuum pid =". That doesn't necessarily match one-to-one with actual c

Re: [HACKERS] We need to log aborted autovacuums

2011-01-16 Thread Simon Riggs
On Sun, 2011-01-16 at 11:47 -0500, Tom Lane wrote: > Greg Smith writes: > > try_relation_open calls LockRelationOid, which blocks. There is also a > > ConditionalLockRelationOid, which does the same basic thing except it > > exits immediately, with a false return code, if it can't acquire the

Re: [HACKERS] We need to log aborted autovacuums

2011-01-16 Thread Tom Lane
Greg Smith writes: > try_relation_open calls LockRelationOid, which blocks. There is also a > ConditionalLockRelationOid, which does the same basic thing except it > exits immediately, with a false return code, if it can't acquire the > lock. I think we just need to nail down in what existing
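The difference between the blocking and the conditional lock call described above can be illustrated with a minimal Python sketch. It is an analogy only: `threading.Lock` stands in for the relation lock, and `try_vacuum`, the table name, and the log wording are hypothetical, not PostgreSQL code or output.

```python
import threading


def try_vacuum(table_lock: threading.Lock, table: str, log: list) -> bool:
    """Hypothetical stand-in for an autovacuum worker visiting one table."""
    # Conditional acquisition: returns False at once instead of waiting,
    # which is the behaviour attributed to ConditionalLockRelationOid.
    if not table_lock.acquire(blocking=False):
        log.append(f"skipping vacuum of {table}: lock not available")
        return False
    try:
        # The actual vacuum work would happen here.
        return True
    finally:
        table_lock.release()


if __name__ == "__main__":
    lock = threading.Lock()
    messages = []
    lock.acquire()  # another session holds a conflicting lock
    print(try_vacuum(lock, "orders", messages), messages)
    lock.release()
    print(try_vacuum(lock, "orders", messages))
```

With a plain blocking `acquire()` the worker would simply wait and nothing would be recorded; the conditional form gives a natural place to emit a message when the table is skipped.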

Re: [HACKERS] We need to log aborted autovacuums

2011-01-16 Thread Greg Smith
Robert Haas wrote: On Sat, Jan 15, 2011 at 11:14 AM, Tom Lane wrote: Greg Smith writes: Does try_relation_open need to have a lock acquisition timeout when AV is calling it? Hmm. I think when looking at the AV code, I've always subconsciously assumed that try_relation_open wo

Re: [HACKERS] We need to log aborted autovacuums

2011-01-15 Thread Robert Haas
On Sat, Jan 15, 2011 at 11:14 AM, Tom Lane wrote: > Greg Smith writes: >> Does try_relation_open need to have a lock acquisition timeout when AV >> is calling it? > > Hmm.  I think when looking at the AV code, I've always subconsciously > assumed that try_relation_open would fail immediately if i

Re: [HACKERS] We need to log aborted autovacuums

2011-01-15 Thread Tom Lane
Greg Smith writes: > Does try_relation_open need to have a lock acquisition timeout when AV > is calling it? Hmm. I think when looking at the AV code, I've always subconsciously assumed that try_relation_open would fail immediately if it couldn't get the lock. That certainly seems like it woul

Re: [HACKERS] We need to log aborted autovacuums

2011-01-15 Thread Greg Smith
Josh Berkus wrote: The lack of vacuum could be occurring for any of 4 reasons: 1) Locking 2) You have a lot of tables and not enough autovac_workers / too much sleep time 3) You need to autovac this particular table more frequently, since it gets dirtied really fast 4) The table has been set wit

Re: [HACKERS] We need to log aborted autovacuums

2011-01-08 Thread Dimitri Fontaine
David Fetter writes: > On Fri, Jan 07, 2011 at 08:15:12PM -0500, Greg Smith wrote: >> [1] Silly aside: I was thinking today that I should draw a chart of >> all the common objections to code that show up here, looking like >> those maps you see when walking into a mall. Then we can give a >> cop

Re: [HACKERS] We need to log aborted autovacuums

2011-01-07 Thread David Fetter
On Fri, Jan 07, 2011 at 08:15:12PM -0500, Greg Smith wrote: > [1] Silly aside: I was thinking today that I should draw a chart of > all the common objections to code that show up here, looking like > those maps you see when walking into a mall. Then we can give a > copy to new submitters with a b

Re: [HACKERS] We need to log aborted autovacuums

2011-01-07 Thread Greg Smith
Josh Berkus wrote: It occurs to me that another way of diagnosis would simply be a way to cause the autovac daemon to spit out output we could camp on, *without* requiring the huge volumes of output also required for DEBUG3. This brings us back to the logging idea again. Right. I don't kno

Re: [HACKERS] We need to log aborted autovacuums

2011-01-07 Thread Josh Berkus
Greg, > It's already possible to detect the main symptom--dead row percentage is > much higher than the autovacuum threshold, but there's been no recent > autovacuum. That makes me less enthusiastic that there's such a genuine > need to justify the overhead of storing more table stats just to det
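The symptom described here, a dead-row count well above the autovacuum threshold with no recent autovacuum, can be checked with a small calculation. This is a minimal sketch, assuming the documented trigger rule (base threshold plus scale factor times the table's estimated row count) and the stock defaults of 50 and 0.2. The function names and the 24-hour cutoff are illustrative assumptions, not anything proposed in the thread.

```python
from typing import Optional


def vacuum_threshold(reltuples: float,
                     base_threshold: int = 50,
                     scale_factor: float = 0.2) -> float:
    """Dead-tuple count above which autovacuum would want to vacuum a table."""
    return base_threshold + scale_factor * reltuples


def looks_starved(n_dead_tup: int,
                  reltuples: float,
                  hours_since_last_autovacuum: Optional[float],
                  max_quiet_hours: float = 24.0) -> bool:
    """True when the table is over threshold yet has not been autovacuumed lately.

    hours_since_last_autovacuum is None when the table was never autovacuumed.
    max_quiet_hours is an arbitrary cutoff chosen for illustration.
    """
    over = n_dead_tup > vacuum_threshold(reltuples)
    quiet = (hours_since_last_autovacuum is None
             or hours_since_last_autovacuum > max_quiet_hours)
    return over and quiet


if __name__ == "__main__":
    # 1000-row table: threshold is 50 + 0.2 * 1000 = 250 dead tuples.
    print(vacuum_threshold(1000))
    print(looks_starved(300, 1000, None))
```

The inputs correspond to values a monitoring script could read from the statistics views, which fits the preference stated elsewhere in the thread for SQL-level monitoring over log scraping.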

Re: [HACKERS] We need to log aborted autovacuums

2011-01-06 Thread Greg Smith
Josh Berkus wrote: Or should it perhaps be a per-table counter in pg_stat_user_tables, given your statement above? Or even a timestamp: last_autovacuum_attempt, which would record the last time autovacuum was tried. If that's fairly recent and you have a large number of dead rows, you kno

Re: [HACKERS] We need to log aborted autovacuums

2011-01-06 Thread Greg Smith
Robert Treat wrote: This is a great use case for user level tracing support. Add a probe around these bits, and you can capture the information when you need it. Sure. I would also like a pony. -- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Serv

Re: [HACKERS] We need to log aborted autovacuums

2011-01-05 Thread Josh Berkus
> This is a great use case for user level tracing support. Add a probe > around these bits, and you can capture the information when you need > it. Yeah, would be lovely if user-level tracing existed on all platforms. -- -- Josh Berkus

Re: [HACKERS] We need to log aborted autovacuums

2011-01-05 Thread Robert Treat
On Wed, Jan 5, 2011 at 2:27 PM, Josh Berkus wrote: > >> If you could gather more info on whether this logging catches the >> problem cases you're seeing, that would really be the right test for the >> patch's usefulness.  I'd give you solid 50/50 odds that you've correctly >> diagnosed the issue,

Re: [HACKERS] We need to log aborted autovacuums

2011-01-05 Thread Josh Berkus
> If you could gather more info on whether this logging catches the > problem cases you're seeing, that would really be the right test for the > patch's usefulness. I'd give you solid 50/50 odds that you've correctly > diagnosed the issue, and knowing for sure would make advocating for this > log

Re: [HACKERS] We need to log aborted autovacuums

2011-01-05 Thread Magnus Hagander
On Wed, Jan 5, 2011 at 07:55, Greg Smith wrote: > a bit of work in userland, I don't see this even being justified as an INFO > or LOG level message.  Anytime I can script a SQL-level monitor for > something that's easy to tie into Nagios or something, I greatly prefer that > to log file scrapi

Re: [HACKERS] We need to log aborted autovacuums

2011-01-04 Thread Greg Smith
Josh Berkus wrote: I've been trying to diagnose in a production database why certain tables never get autovacuumed despite having a substantial % of updates. The obvious reason is locks blocking autovacuum from vacuuming the table ... Missed this discussion when it popped up but have plenty

Re: [HACKERS] We need to log aborted autovacuums

2010-11-17 Thread Tom Lane
Itagaki Takahiro writes: > On Thu, Nov 18, 2010 at 08:35, Tom Lane wrote: >> Well, the way to deal with that would be to add a GUC that enables >> reporting of those messages at LOG level.  But it's a bit hard to argue >> that we need such a thing without more evidence.  Maybe you could just >>

Re: [HACKERS] We need to log aborted autovacuums

2010-11-17 Thread Itagaki Takahiro
On Thu, Nov 18, 2010 at 08:35, Tom Lane wrote: >> Yeah, it would be really good to be able to log that without bumping the >> log levels of the server in general to DEBUG3. > > Well, the way to deal with that would be to add a GUC that enables > reporting of those messages at LOG level.  But it's

Re: [HACKERS] We need to log aborted autovacuums

2010-11-17 Thread Tom Lane
Josh Berkus writes: >> There *is* an elog(DEBUG3) in autovacuum.c >> that reports whether autovac thinks a table needs vacuumed/analyzed ... >> maybe that needs to be a tad more user-accessible. > Yeah, it would be really good to be able to log that without bumping the > log levels of the server

Re: [HACKERS] We need to log aborted autovacuums

2010-11-17 Thread Josh Berkus
> It's hard to tell, because you're just handwaving about what it is you > think isn't being logged; nor is it clear whether you have any evidence > that locks are the problem. Offhand I'd think it at least as likely > that autovacuum thinks it doesn't need to do anything, perhaps because > of a

Re: [HACKERS] We need to log aborted autovacuums

2010-11-17 Thread Tom Lane
Josh Berkus writes: > I've been trying to diagnose in a production database why certain tables > never get autovacuumed despite having a substantial % of updates. The > obvious reason is locks blocking autovacuum from vacuuming the table ... > but the trick is we don't log such blocking behavior,

Re: [HACKERS] We need to log aborted autovacuums

2010-11-17 Thread Joshua D. Drake
On Wed, 2010-11-17 at 13:46 -0800, Josh Berkus wrote: > Hackers, > > I've been trying to diagnose in a production database why certain tables > never get autovacuumed despite having a substantial % of updates. The > obvious reason is locks blocking autovacuum from vacuuming the table ... > but th

[HACKERS] We need to log aborted autovacuums

2010-11-17 Thread Josh Berkus
Hackers, I've been trying to diagnose in a production database why certain tables never get autovacuumed despite having a substantial % of updates. The obvious reason is locks blocking autovacuum from vacuuming the table ... but the trick is we don't log such blocking behavior, at all. This mean