Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-17 Thread Fujii Masao
Hi, On Wed, Jun 17, 2009 at 12:22 AM, Czichy, Thoralf (NSN - FI/Helsinki) wrote: > [STONITH is not always best strategy if failures can be declared as > user-space software problem only, limit STONITH to HW/OS failures] > > The isolation of the failing Postgres instance does not require a > STONIT

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-16 Thread Tom Lane
"Czichy, Thoralf (NSN - FI/Helsinki)" writes: > I am working together with Harald on this issue. Below some thoughts on > why we think it should be possible to disable the postmaster-internal > recovery attempt and instead have faults in the processes started > by postmaster escalated to postma

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-16 Thread Czichy, Thoralf (NSN - FI/Helsinki)
hi, I am working together with Harald on this issue. Below some thoughts on why we think it should be possible to disable the postmaster-internal recovery attempt and instead have faults in the processes started by postmaster escalated to postmaster-exit. [Our typical "embedded" situation]

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-15 Thread Alvaro Herrera
Kolb, Harald (NSN - DE/Munich) wrote: > The recovery and restart feature is an excellent solution if the db is > running in a standalone environment and I understand that this should > not be weakened. But in a configuration where the db is only one > resource among others and where you have a

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-15 Thread Kolb, Harald (NSN - DE/Munich)
> "Kolb, Harald (NSN - DE/Munich)" writes: > > If you don't want to see this option as a GUC parameter, would it be > > acceptable to have it as a new postmaster cm

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Fujii Masao
Hi, On Wed, Jun 10, 2009 at 4:21 AM, Simon Riggs wrote: > > On Tue, 2009-06-09 at 20:59 +0200, Kolb, Harald (NSN - DE/Munich) wrote: > >> There are some good reasons why a switchover could be an appropriate >> means in case the DB is facing troubles. It may be that the root cause >> is not the DB

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Simon Riggs
On Tue, 2009-06-09 at 15:48 -0500, Kevin Grittner wrote: > My first reaction on hearing the request was that it might have *some* > use; but in trying to recall any restart where it is what I would have > wanted, I come up dry. I haven't even really come up with a good > hypothetical use case.

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Kevin Grittner
Tom Lane wrote: > "Kevin Grittner" writes: >> "Kolb, Harald (NSN - DE/Munich)" wrote: >>> There are some good reasons why a switchover could be an >>> appropriate means in case the DB is facing troubles. It may be >>> that the root cause is not the DB itself, but used resources or >>> other thi

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Tom Lane
"Kevin Grittner" writes: > "Kolb, Harald (NSN - DE/Munich)" wrote: >> There are some good reasons why a switchover could be an appropriate >> means in case the DB is facing troubles. It may be that the root >> cause is not the DB itsself, but used resources or other things >> which are going craz

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Greg Stark
Not really since once you fail over you may as well stop the rebuild since you'll have to restore the whole database. Moreover wouldn't that have to be a manual decision? The closest thing I can come to a use case would be if you run a very large cluster with hundreds of read-only replicas.

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Kevin Grittner
"Kolb, Harald (NSN - DE/Munich)" wrote: >> From: ext Tom Lane [mailto:t...@sss.pgh.pa.us] >> Mechanism should exist to support useful policy. I don't believe >> that the proposed switch has any real-world usefulness. > There are some good reasons why a switchover could be an appropriate > me

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Tom Lane
"Kolb, Harald (NSN - DE/Munich)" writes: > If you don't want to see this option as a GUC parameter, would it be > acceptable to have it as a new postmaster cmd line option ? That would make two kluges, not one (we don't do options that are settable in only one way). And it does nothing whatever

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Simon Riggs
On Tue, 2009-06-09 at 20:59 +0200, Kolb, Harald (NSN - DE/Munich) wrote: > There are some good reasons why a switchover could be an appropriate > means in case the DB is facing troubles. It may be that the root cause > is not the DB itself, but used resources or other things which are > going cr

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Kolb, Harald (NSN - DE/Munich)
> Robert Haas writes: > > I see that you've carefully not quoted Greg's remark about > > "mechanism not policy" with which I completely agree.

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Robert Haas
On Mon, Jun 8, 2009 at 7:34 PM, Tom Lane wrote: > Robert Haas writes: >> I see that you've carefully not quoted Greg's remark about "mechanism >> not policy" with which I completely agree. > > Mechanism should exist to support useful policy.  I don't believe that > the proposed switch has any real

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Tom Lane
Robert Haas writes: > I see that you've carefully not quoted Greg's remark about "mechanism > not policy" with which I completely agree. Mechanism should exist to support useful policy. I don't believe that the proposed switch has any real-world usefulness. regards, tom

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Robert Haas
On Mon, Jun 8, 2009 at 4:30 PM, Tom Lane wrote: > Greg Stark writes: >>> On Mon, 2009-06-08 at 09:47 -0400, Tom Lane wrote: >>>> I think the proposed don't-restart flag is exceedingly ugly and will not >>>> solve any real-world problem. > >> Hm. I'm not sure I see a solid use case for it -- in my

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Tom Lane
Greg Stark writes: >> On Mon, 2009-06-08 at 09:47 -0400, Tom Lane wrote: >>> I think the proposed don't-restart flag is exceedingly ugly and will not >>> solve any real-world problem. > Hm. I'm not sure I see a solid use case for it -- in my experience you > want to be pretty sure you have a pers

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Greg Stark
On Mon, Jun 8, 2009 at 6:58 PM, Simon Riggs wrote: > > On Mon, 2009-06-08 at 09:47 -0400, Tom Lane wrote: > >> I think the proposed don't-restart flag is exceedingly ugly and will not >> solve any real-world problem. > > Agreed. Hm. I'm not sure I see a solid use case for it -- in my experience yo

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Simon Riggs
On Mon, 2009-06-08 at 09:47 -0400, Tom Lane wrote: > I think the proposed don't-restart flag is exceedingly ugly and will not > solve any real-world problem. Agreed.

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Tom Lane
Gregory Stark writes: > I think the accepted way to handle this kind of situation is called STONITH -- > "Shoot The Other Node In The Head". Yeah, and the reason people go to the trouble of having special hardware for that is that pure-software solutions are unreliable. I think the proposed don'

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Fujii Masao
Hi, On Mon, Jun 8, 2009 at 6:45 PM, Gregory Stark wrote: > Fujii Masao writes: > >> On the other hand, the primary postgres might *not* restart automatically. >> So, it's difficult for clusterware to choose whether to do failover when it >> detects the death of the primary postgres, I think. > >

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Gregory Stark
Fujii Masao writes: > On the other hand, the primary postgres might *not* restart automatically. > So, it's difficult for clusterware to choose whether to do failover when it > detects the death of the primary postgres, I think. I think the accepted way to handle this kind of situation is calle

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Fujii Masao
Hi, On Fri, Jun 5, 2009 at 9:24 PM, Kolb, Harald (NSN - DE/Munich) wrote: >> Good point. I also think that this makes a handling of failover >> more complicated. In other words, clusterware cannot determine >> whether to do failover when it detects the death of the primary >> postgres. A wrong dec

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-05 Thread Kolb, Harald (NSN - DE/Munich)
Hi, > -Original Message- > From: ext Fujii Masao [mailto:masao.fu...@gmail.com] > Sent: Friday, June 05, 2009 8:14 AM > To: Kolb, Harald (NSN - DE/Munich) > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] postmaster recovery and automatic > restar

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-04 Thread Fujii Masao
Hi, On Fri, Jun 5, 2009 at 1:02 AM, Kolb, Harald (NSN - DE/Munich) wrote: > Hi, > > in case of a serious failure of a backend or an auxiliary process the > postmaster performs a crash recovery and restarts the db automatically. > > Is there a possibility to deactivate the restart and to force the