Samuel Sieb <[EMAIL PROTECTED]> writes:
> On Tue, Mar 06, 2001 at 12:46:24PM -0800, Nathan Myers wrote:
> >
> > On Linux, /usr/src/linux/include is meaningless for anything in userland;
> > it's meant only for building the kernel and kernel modules. That Red Hat
> > tends to expose it to user
Vadim Mikheev wrote:
>
> Nevertheless, subj is raised. BTW, does anybody know the results of kill -9
> in Oracle/Informix/etc? Just curious -:)
Progress has no problem with it that I have ever seen.
Regards,
Andrew.
--
> Nevertheless, subj is raised. BTW, does anybody know the results of kill -9
> in Oracle/Informix/etc? Just curious -:)
Informix has no problem with it. Oracle DBAs fear it, to say the least.
Andreas
---(end of broadcast)---
TIP 6: Have you search
> I have spent several days now puzzling over the corrupted WAL logfile
> that Scott Parish was kind enough to send me from a 7.1beta4 crash.
> It looks a lot like two different series of transactions were getting
> written into the same logfile. I'd been digging like mad in the WAL
> code to try
> -Original Message-
> From: Tom Lane [mailto:[EMAIL PROTECTED]]
>
> The interlock has to be tightly tied to the PGDATA directory, because
> what we're trying to protect is the files in and under that directory.
> It seems that something based on file(s) in that directory is the way
> to g
[EMAIL PROTECTED] (Nathan Myers) writes:
> On Linux, /usr/src/linux/include is meaningless for anything in userland;
> it's meant only for building the kernel and kernel modules. That Red Hat
> tends to expose it to user-level builds is a long-standing bug in Red
> Hat's distribution
1) it i
Alfred Perlstein wrote:
> What they really need to do is hire some grey beards (old school
> Unix folks) to QA the releases and keep stuff like this from
> happening/shipping.
Like the 250-strong RedHat Beta Team, of which I am a member? :-) I
can't disclose the discussions on that list, but, suf
> >Alfred Perlstein <[EMAIL PROTECTED]> writes:
> >>> Are there any portability problems with relying on shm_nattch to be
> >>> available? If not, I like this a lot...
> >
> >> Well it's available on FreeBSD and Solaris, I'm sure Redhat has
> >> some daemon that resets the value to 0 periodically
BeOS doesn't have this stat (I have a bunch of others, but not this one).
If I understand correctly, you want to check whether some backend is
still attached to the shared memory segment of a given key? In this case, I have an
easy solution to fake the stat, because all segments have an encoded nam
* Lamar Owen <[EMAIL PROTECTED]> [010306 13:27] wrote:
> Nathan Myers wrote:
> > That is why there is no problem with version skew in the syscall
> > argument structures on a correctly-configured Linux system. (On a
> > Red Hat system it is very easy to get them out of sync, but RH fans
> > are u
On Tue, Mar 06, 2001 at 12:46:24PM -0800, Nathan Myers wrote:
>
> On Linux, /usr/src/linux/include is meaningless for anything in userland;
> it's meant only for building the kernel and kernel modules. That Red Hat
> tends to expose it to user-level builds is a long-standing bug in Red
> Hat'
Nathan Myers wrote:
> That is why there is no problem with version skew in the syscall
> argument structures on a correctly-configured Linux system. (On a
> Red Hat system it is very easy to get them out of sync, but RH fans
> are used to problems.)
Is RedHat bashing really necessary here? At l
Bruce Momjian writes:
> This will try a pg_ctl shutdown for 60 seconds, then kill pg_ctl. You
> would then need a kill of your own.
pg_ctl automatically times out after 60 seconds.
--
Peter Eisentraut [EMAIL PROTECTED] http://yi.org/peter-e/
On Tue, Mar 06, 2001 at 08:19:12PM +0100, Peter Eisentraut wrote:
> Alfred Perlstein writes:
>
> > Seriously, there's some dispute on the type that 'shm_nattch' is,
> > under Solaris it's "shmatt_t" (unsigned long afaik), under FreeBSD
> > it's 'short' (i should fix this. :)).
>
> What I don't l
Alfred Perlstein <[EMAIL PROTECTED]> writes:
> Of course not, the size of the struct changed (short->unsigned
> long, basically int16_t -> uint32_t), because the kernel and userland
> in Linux are hardly in sync you have the fun of guessing if you
> get:
> old struct -> old syscall (ok)
> new str
* Tom Lane <[EMAIL PROTECTED]> [010306 11:49] wrote:
> Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > What I don't like is that my /usr/include/sys/shm.h (through other
> > headers) has [foo]
> > whereas /usr/src/linux/include/shm.h has [bar]
>
> Are those declarations perhaps bit-compatible?
* Lamar Owen <[EMAIL PROTECTED]> [010306 11:39] wrote:
> Peter Eisentraut wrote:
> > Not only note the shm_nattch type, but also shm_segsz, and the "unused"
> > fields in between. I don't know a thing about the Linux kernel sources,
> > but this doesn't seem right.
>
> Red Hat 7, right? My RedH
* Tom Lane <[EMAIL PROTECTED]> [010306 11:30] wrote:
> Alfred Perlstein <[EMAIL PROTECTED]> writes:
> > * Tom Lane <[EMAIL PROTECTED]> [010306 11:03] wrote:
> >> I notice that our BeOS and QNX emulations of shmctl() don't support
> >> IPC_STAT, but that could be dealt with, at least to the extent
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> What I don't like is that my /usr/include/sys/shm.h (through other
> headers) has [foo]
> whereas /usr/src/linux/include/shm.h has [bar]
Are those declarations perhaps bit-compatible? Looks a tad endian-
dependent, though ...
Peter Eisentraut wrote:
> Not only note the shm_nattch type, but also shm_segsz, and the "unused"
> fields in between. I don't know a thing about the Linux kernel sources,
> but this doesn't seem right.
Red Hat 7, right? My RedHat 7 system isn't running RH 7 right now (it's
this notebook that I
Alfred Perlstein <[EMAIL PROTECTED]> writes:
> * Tom Lane <[EMAIL PROTECTED]> [010306 11:03] wrote:
>> I notice that our BeOS and QNX emulations of shmctl() don't support
>> IPC_STAT, but that could be dealt with, at least to the extent of
>> stubbing it out.
> Well since we already have spinlock
* Tom Lane <[EMAIL PROTECTED]> [010306 11:03] wrote:
> Alfred Perlstein <[EMAIL PROTECTED]> writes:
> >> Are there any portability problems with relying on shm_nattch to be
> >> available? If not, I like this a lot...
>
> > Well it's available on FreeBSD and Solaris, I'm sure Redhat has
> > some
Alfred Perlstein writes:
> Seriously, there's some dispute on the type that 'shm_nattch' is,
> under Solaris it's "shmatt_t" (unsigned long afaik), under FreeBSD
> it's 'short' (i should fix this. :)).
What I don't like is that my /usr/include/sys/shm.h (through other
headers) has:
typedef unsi
Peter Eisentraut wrote:
> Well, if you have something clever you want to do if the postmaster
> doesn't come down after an orderly shutdown then please share it. The
> current alternatives are 'leave running' or 'kill -9'. I know I'd prefer
> the former.
Well, my preferences aren't really relev
Alfred Perlstein <[EMAIL PROTECTED]> writes:
>> Are there any portability problems with relying on shm_nattch to be
>> available? If not, I like this a lot...
> Well it's available on FreeBSD and Solaris, I'm sure Redhat has
> some daemon that resets the value to 0 periodically just for kicks
>
* Tom Lane <[EMAIL PROTECTED]> [010306 10:35] wrote:
> Alfred Perlstein <[EMAIL PROTECTED]> writes:
>
> > What about encoding the shm id in the pidfile? Then one can just ask
> > how many processes are attached to that segment? (if it doesn't
> > exist, one can assume all backends have exited)
Alfred Perlstein <[EMAIL PROTECTED]> writes:
> * Tom Lane <[EMAIL PROTECTED]> [010306 10:10] wrote:
>> The shmem key is driven primarily by port number
>> not data directory ...)
> This seems like a mistake.
> I'm surprised you guys aren't just using some form of the FreeBSD
> ftok() algorithm
* Tom Lane <[EMAIL PROTECTED]> [010306 10:10] wrote:
> Alfred Perlstein <[EMAIL PROTECTED]> writes:
> > I'm sure some sort of encoding of the PGDATA directory along with
> > the pids stored in the shm segment...
>
> I thought about this too, but it strikes me as not very trustworthy.
> The proble
Lamar Owen writes:
> > case when the postmaster does not come down after 60 seconds. But this is
> > really no problem for the issue at hand because if you do a normal
> > runlevel switch then the postmaster will simply keep running, and during a
> > system shutdown all the backends are going to
Alfred Perlstein <[EMAIL PROTECTED]> writes:
> I'm sure some sort of encoding of the PGDATA directory along with
> the pids stored in the shm segment...
I thought about this too, but it strikes me as not very trustworthy.
The problem is that there's no guarantee that the new postmaster will
even
Peter Eisentraut wrote:
>
> Lamar Owen writes:
>
> > I missed something somewhere: wasn't the consensus a few weeks ago that
> > pg_ctl shouldn't be used for a system initscript?
>
> The consensus(?) was that there was some work to do in pg_ctl before it
> was robust enough to be used (for anyt
> Bruce Momjian writes:
>
> > This will try a pg_ctl shutdown for 60 seconds, then kill pg_ctl. You
> > would then need a kill of your own.
>
> pg_ctl automatically times out after 60 seconds.
Oh, yeah, that's right, I saw that in the documentation. Forget my
script. Just run pg_ctl first, then
Lamar Owen writes:
> I missed something somewhere: wasn't the consensus a few weeks ago that
> pg_ctl shouldn't be used for a system initscript?
The consensus(?) was that there was some work to do in pg_ctl before it
was robust enough to be used (for anything). That work has been done.
An examp
> I especially don't think that we should second-guess what the admin
> wants us to do by auto-killing backends that are still serving
> clients.
Sure. But it would be nice anyway if pg_ctl could do this with a
specific command line switch.
--
<< Tout n'y est pas parfait, mais on y honore ce
* Tom Lane <[EMAIL PROTECTED]> [010305 19:13] wrote:
> Lamar Owen <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> Postmaster down, backends alive is not a scenario we're currently
> >> prepared for. We need a way to plug that gap.
>
> > Postmaster can easily enough find out if zombie backen
Tom Lane wrote:
> Well, there's always the possibility of a bug leading to postmaster
> coredump. Historically those have been pretty rare though.
I have never personally seen one, since 6.1.1.
> In any case, I'm not sure that the init script is the place to be
> solving these problems.
Well,
Lamar Owen <[EMAIL PROTECTED]> writes:
> I don't want to reap the postmaster off -- I want to reap off the
> backends associated with that particular postmaster, allowing that
> postmaster to die on its own. Duh. Doing this in a safe manner is not
> going to be easy, given that the PGDATA is not
Lamar Owen <[EMAIL PROTECTED]> writes:
> Is it a correct assumption that this is the only time postmaster might
> drop out?
Well, there's always the possibility of a bug leading to postmaster
coredump. Historically those have been pretty rare though.
In any case, I'm not sure that the init scri
Tom Lane wrote:
> Lamar Owen <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> If you think it's easy enough, enlighten the rest of us ;-).
> > If postgres reported PGDATA on the command line it would be easy enough.
> In ps status you mean? I don't think we are prepared to require ps
> stat
Tom Lane wrote:
> of course, that's the situation you're left with ... but your reasoning
> seems circular to me. "I should kill -9 the postmaster to prevent the
> situation where I've kill -9'd the postmaster."
Ok, while the script can certainly be used from the command line, its
primary purpos
Lamar Owen <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> If you think it's easy enough, enlighten the rest of us ;-).
> If postgres reported PGDATA on the command line it would be easy enough.
In ps status you mean? I don't think we are prepared to require ps
status functionality to let the
Tom Lane wrote:
> Lamar Owen wrote:
> > Postmaster can easily enough find out if zombie backends are 'out there'
> > during startup, right?
> If you think it's easy enough, enlighten the rest of us ;-).
If postgres reported PGDATA on the command line it would be easy enough.
> > What can postm
Lamar Owen <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Please note that the reason we're having this discussion at
>> all is that the init script may be used for purposes other than system
>> shutdown. So the argument that "it's going to happen anyway" is wrong.
> Believe it or not, you jus
Lamar Owen <[EMAIL PROTECTED]> writes:
> I missed something somewhere: wasn't the consensus a few weeks ago that
> pg_ctl shouldn't be used for a system initscript?
I thought there was some concern about whether pg_ctl is really "ready
for prime time". But I don't recall the details either.
> Bruce Momjian wrote:
> > This will try a pg_ctl shutdown for 60 seconds, then kill pg_ctl. You
> > would then need a kill of your own.
>
> I missed something somewhere: wasn't the consensus a few weeks ago that
> pg_ctl shouldn't be used for a system initscript? Or did I black out
> that day?
Bruce Momjian wrote:
> This will try a pg_ctl shutdown for 60 seconds, then kill pg_ctl. You
> would then need a kill of your own.
I missed something somewhere: wasn't the consensus a few weeks ago that
pg_ctl shouldn't be used for a system initscript? Or did I black out
that day? :-) I certain
Lamar Owen <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Postmaster down, backends alive is not a scenario we're currently
>> prepared for. We need a way to plug that gap.
> Postmaster can easily enough find out if zombie backends are 'out there'
> during startup, right?
If you think it's ea
Tom Lane wrote:
> Please note that the reason we're having this discussion at
> all is that the init script may be used for purposes other than system
> shutdown. So the argument that "it's going to happen anyway" is wrong.
Believe it or not, you just disproved your own statement that the
initsc
Tom Lane wrote:
> Yeah, but only a partial crash. If the admin finishes the job by
> killing the backends too, we're fine. Postmaster down, backends alive
> is not a scenario we're currently prepared for. We need a way to plug
> that gap.
Postmaster can easily enough find out if zombie backend
> Ok, since I can't seem to count on killproc's exact behavior, istm that
> I can:
>     killproc postmaster -INT
>     wait some number of seconds
>     if postmaster still up
>         killproc postmaster -TERM
>     wait some number of seconds
>     if postmaster STILL up
>         killproc postmaster #and let the grim rea
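The escalation outlined in the quoted message above could be sketched in plain shell roughly as follows. This is only an illustration under stated assumptions: `wait_for_exit` and `stop_with_escalation` are hypothetical helper names, the wait length is a placeholder, and the sketch signals a PID directly with `kill` instead of going through Red Hat's `killproc`. Unlike the quoted outline, it stops after SIGTERM and reports failure instead of falling through to a default kill level.

```shell
#!/bin/sh
# Hypothetical helper: wait up to $2 seconds for process $1 to exit.
# Returns 0 if the process is gone, 1 if it is still running.
wait_for_exit() {
    pid=$1
    remaining=$2
    while kill -0 "$pid" 2>/dev/null; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Hypothetical helper: send SIGINT first, then SIGTERM, never SIGKILL.
# $1 = PID, $2 = seconds to wait after each signal (placeholder default).
stop_with_escalation() {
    pid=$1
    wait_secs=${2:-30}
    for sig in INT TERM; do
        # Simplification: a failed kill is treated as "already gone",
        # although it can also mean the process is not ours to signal.
        kill -s "$sig" "$pid" 2>/dev/null || return 0
        wait_for_exit "$pid" "$wait_secs" && return 0
    done
    return 1    # still running: report it instead of escalating to kill -9
}
```

For example, `stop_with_escalation "$pid" 30` returns non-zero if the process survives both signals, which leaves any harsher step to the admin.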
Lamar Owen <[EMAIL PROTECTED]> writes:
> The last thing I want to do is
> wait too long on some platforms and not long enough on others.
The difficulty is to know how long the final checkpoint will take.
This depends on (at least) your hard disk speed and the number of
dirty buffers, so I think y
Tom Lane wrote:
> The tricky part of this is not to give up the ability to restart when
> there *has* been a crash.
But kill -9 effectively _is_ an admin-initiated crash.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Tom Lane wrote:
> However, with an explicit kill level that doesn't happen: you get one
> signal of the specified value, no more. Possibly it would be better for
> the init script to send SIGINT (forcibly disconnect clients) instead of
> SIGTERM, however. So I'm now leaning to "killproc postmast
Lamar Owen <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> The tricky part of this is not to give up the ability to restart when
>> there *has* been a crash.
> But kill -9 effectively _is_ an admin-initiated crash.
Yeah, but only a partial crash. If the admin finishes the job by
killing the ba
Lamar Owen <[EMAIL PROTECTED]> writes:
> Is 6.1 this different from 6.2?
Scott sent me a copy of /etc/init.d/functions from his box, and it has
largely the same behavior (I hadn't read the whole code to notice that
it doesn't use the default killlevel...). What's actually happening
here is that
Nathan Myers wrote:
> Not to be a zealot, but this isn't _Linux_ boot-script code, it's
> _Red Hat_ boot-script code. Red Hat would like for us all to confuse
> the two, but they jes' ain't the same. (As a rule of thumb, where it
> works right, credit Linux; where it doesn't, blame Red Hat. :-)
Hiroshi Inoue <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> I think we need a stronger interlock to prevent this scenario, but I'm
>> unsure what it should be. Ideas?
> Seems the simplest way is to inhibit starting postmaster
> if the pid file exists.
Then we're unable to recover from a cras
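One way to picture the trade-off in this exchange is to treat an existing pid file as a lock only while the process it names is still alive. The sketch below illustrates that idea in shell; it is not taken from the thread. `pidfile_is_live` is a hypothetical helper, the file is assumed to hold a PID as the first word of its first line, and the check is not airtight: a recycled PID can belong to an unrelated process, and it says nothing about backends that outlive their postmaster.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the pid file names a live process,
# non-zero if the file is missing, empty, malformed, or stale.
pidfile_is_live() {
    pidfile=$1
    [ -f "$pidfile" ] || return 1
    pid=
    # read fails on a file without a trailing newline but still sets pid
    read -r pid rest < "$pidfile" || [ -n "$pid" ] || return 1
    case $pid in
        ''|*[!0-9]*) return 1 ;;    # not a plain decimal number
    esac
    [ "$pid" -gt 0 ] || return 1    # PID 0 would address our own process group
    # Simplification: a process owned by another user also fails this
    # check and is therefore reported as stale.
    kill -0 "$pid" 2>/dev/null
}
```

A startup script could then refuse to start when `pidfile_is_live` succeeds and treat the file as a leftover from a crash otherwise.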
Bruce Momjian wrote:
> > # TERM first, then KILL if not dead
> Yes, this seems like the proper way to do it.
Now to verify that 6.1 is the same or different. The
mirrors of ftp.redhat.com (and, in fact, RedHat.com itself) no longer
have the updates or the original for 6.1
On Mon, Mar 05, 2001 at 08:55:41PM -0500, Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > killproc should send a kill -15 to the process, wait a few seconds for
> > it to exit. If it does not, try kill -1, and if that doesn't kill it,
> > then kill -9.
>
> Tell it to the Linux pe
Tom Lane wrote:
>
> Now, killing the postmaster -9 and not cleaning up the backends has
> always been a good way to shoot yourself in the foot, but up to now the
> worst thing that was likely to happen to you was isolated corruption in
> specific tables. In the brave new world of WAL the stakes a
> if [ "$notset" = "1" ] ; then
>     if ps h $pid >/dev/null 2>&1; then
>         # TERM first, then KILL if not dead
>         kill -TERM $pid
>         usleep 10
>         if ps h $pid >/dev/null 2>&1 ; then
>             sleep 1
>             if ps h $pid >/dev/null 2>&1 ; then
>
Tom Lane wrote:
>
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > killproc should send a kill -15 to the process, wait a few seconds for
> > it to exit. If it does not, try kill -1, and if that doesn't kill it,
> > then kill -9.
>
> Tell it to the Linux people ... this is their boot-script code
Bruce Momjian <[EMAIL PROTECTED]> writes:
> killproc should send a kill -15 to the process, wait a few seconds for
> it to exit. If it does not, try kill -1, and if that doesn't kill it,
> then kill -9.
Tell it to the Linux people ... this is their boot-script code we're
talking about.
> Lamar Owen <[EMAIL PROTECTED]> writes:
> > Thanks for the headsup, Tom. Time to nix killproc and do something
> > cleaner -- compatible, but cleaner.
>
> As far as I could tell from the 6.1 scripts, it would work to do
>
> killproc postmaster -TERM
>
Yes, amazing it has a -9 default.
killproc should send a kill -15 to the process, wait a few seconds for
it to exit. If it does not, try kill -1, and if that doesn't kill it,
then kill -9.
> Tom Lane wrote:
> > checkpoint record. Clueless admins who resort to kill -9 as a routine
> > admin tool *will* lose their databases. Mor
Lamar Owen <[EMAIL PROTECTED]> writes:
> Thanks for the headsup, Tom. Time to nix killproc and do something
> cleaner -- compatible, but cleaner.
As far as I could tell from the 6.1 scripts, it would work to do
killproc postmaster -TERM
The problem is just that killproc has an overenth
Tom Lane wrote:
> checkpoint record. Clueless admins who resort to kill -9 as a routine
> admin tool *will* lose their databases. Moreover, the init scripts
> that are running around now are dangerous weapons if used with 7.1.
Thanks for the headsup, Tom. Time to nix killproc and do something
* Tom Lane <[EMAIL PROTECTED]> [010305 14:51] wrote:
>
> I think we need a stronger interlock to prevent this scenario, but I'm
> unsure what it should be. Ideas?
Re having multiple postmasters active by accident.
The sysV IPC stuff has some hooks in it that may help you.
One idea is to check
I have spent several days now puzzling over the corrupted WAL logfile
that Scott Parish was kind enough to send me from a 7.1beta4 crash.
It looks a lot like two different series of transactions were getting
written into the same logfile. I'd been digging like mad in the WAL
code to try to explai