I wrote:
> This patch is against HEAD --- I've not looked at how much adjustment
> it'll need for the back branches, but I'm sure there's some.
I've back-patched this now. If you want to test the patch, the v10-branch
version is at
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff
neeraj kumar writes:
> --> But the right answer is to fix it on the writing side.
> Yes, I agree with this part. Even though the probability is very low, a
> process can still be killed in the middle of writing. So what is your
> suggestion on how to recover from this automatically?
Here's a draft patch
--> But the right answer is to fix it on the writing side.
Yes, I agree with this part. Even though the probability is very low, a
process can still be killed in the middle of writing. So what is your
suggestion on how to recover from this automatically?
On Fri, May 10, 2019 at 9:14 AM Tom Lane wrote:
neeraj kumar writes:
> Also curious why the query on pg_stat_activity is considering a terminated
> process?
The short answer to that is that this bug leaves shared memory in
a corrupt state. It's not really useful to worry about whether
readers should react differently to that --- you could spend en
Also curious why the query on pg_stat_activity is considering a terminated
process? Irrespective of corrupted state or not, ideally a query on
pg_stat_activity should ignore an already terminated process.
My 2 cents.
On Fri, May 10, 2019 at 8:01 AM neeraj kumar wrote:
> There are multiple ways to see this problem.
There are multiple ways to see this problem. One way I am seeing it is: how
will the system auto-recover from this particular state?
So ideally, if st_procpid is set to zero it means this process has already
terminated; however, it might have left some corrupted information in
memory. So when other compone
We got more information about this issue. There is one backend process
still present in beentry which has changecount as an odd value. However, this
process is long gone/terminated. It means that when this process was
killed/terminated, its entry was not cleaned from the beentry list. There seems
to be some sh
Tom, maybe I didn't make my point clear.
There are two issues:
1) Why this value was left odd
2) Why the backend entry is still present in beentry for a backend process even
after it was killed/terminated.
I am talking about the 2nd issue. My understanding is that a query on pg_stat_activity
goes via all back
neeraj kumar writes:
> Tom, maybe I didn't make my point clear.
> There are two issues:
> 1) Why this value was left odd
Because a function called by pgstat_bestart threw an error, is what
I'm guessing.
> 2) Why the backend entry is still present in beentry for a backend process even
> after it was killed/terminated.
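
For readers following the thread, here is a minimal sketch of the writer-side
changecount protocol Tom is referring to, and of how an error thrown partway
through pgstat_bestart can leave the counter odd. The struct and function
names below are simplified illustrations, not the actual definitions in
pgstat.h or pgstat.c:

/* Simplified illustration only -- not the real PostgreSQL definitions. */
typedef struct PgBackendStatusSketch
{
    int st_changecount;     /* odd while the owning backend is mid-update */
    int st_procpid;         /* 0 means the slot is unused/terminated */
    /* ... many other per-backend status fields ... */
} PgBackendStatusSketch;

static void
begin_status_update(volatile PgBackendStatusSketch *be)
{
    be->st_changecount++;   /* count becomes odd: readers must retry */
    /* (the real code also issues a write memory barrier here) */
}

static void
end_status_update(volatile PgBackendStatusSketch *be)
{
    /* (write memory barrier in the real code) */
    be->st_changecount++;   /* count becomes even again: entry is consistent */
}

/*
 * If the code that runs between begin_status_update() and
 * end_status_update() throws an ERROR (for example, an encoding-conversion
 * failure while collecting SSL certificate details), end_status_update()
 * is never reached and st_changecount stays odd for as long as the slot
 * lives in shared memory.
 */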
neeraj kumar writes:
> We got more information about this issue. There is one backend process
> still present in beentry which has changecount as an odd value. However, this
> process is long gone/terminated. It means that when this process was
> killed/terminated, its entry was not cleaned from the beentry list.
Jeremy Schneider writes:
> Seems to me that at a minimum, this loop shouldn't go on forever. Even
> having an arbitrary, crazy high, hard-coded number of attempts before
> failure (like a million) would be better than spinning on the CPU
> forever - which is what we are seeing.
I don't think it's
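
Purely to illustrate the shape of Jeremy's suggestion (this is not code from
the thread or from the PostgreSQL sources, and it reuses the simplified
PgBackendStatusSketch struct from the sketch above), a reader loop with a
hard retry cap might look like:

#include <stdbool.h>
#include <string.h>

#define MAX_CHANGECOUNT_RETRIES 1000000    /* arbitrary, "crazy high" cap */

static bool
copy_beentry_with_retry_cap(volatile const PgBackendStatusSketch *be,
                            PgBackendStatusSketch *out)
{
    for (int attempts = 0; attempts < MAX_CHANGECOUNT_RETRIES; attempts++)
    {
        int before = be->st_changecount;

        memcpy(out, (const void *) be, sizeof(*out));

        int after = be->st_changecount;

        /* even and unchanged means no writer touched the slot meanwhile */
        if (before == after && (before & 1) == 0)
            return true;
    }
    return false;   /* give up instead of spinning on the CPU forever */
}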
Yes, we use SSL to connect to the DB.
Looked into the code related to st_changecount:
https://github.com/postgres/postgres/blob/659e53498c3c04e4f400323c02bef98fe8d13ec8/src/include/pgstat.h#L1015-L1044
From the comment it seems like each backend should have its own copy
of PgBackendStatus, it means st_changecount
neeraj kumar writes:
> Yes, we use SSL to connect to the DB.
Hm. I'm suspicious that one of the functions that fetch data for
an SSL connection threw an error. In particular, it doesn't look
to be hard at all to make X509_NAME_to_cstring fall over --- an
encoding conversion failure would do it, even
Took some time to get a stack trace as we didn't have root permission.
Attaching stack traces of two processes (out of many) stuck on the same query
below [1][2].
Seems like the call is unable to come out of this loop:
https://github.com/postgres/postgres/blob/master/src/backend/postmaster/pgstat.c#L3361-L3400
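
That loop is the consistency check pg_stat_activity performs over each
shared-memory entry. A simplified sketch of its shape (again using the
illustrative struct from the earlier sketch, not the actual
pgstat_read_current_status code) shows why an entry whose st_changecount was
left odd keeps every reader spinning, and why the st_procpid filter for
terminated backends never gets a chance to run:

#include <stdbool.h>
#include <string.h>

static void
copy_one_beentry(volatile const PgBackendStatusSketch *be,
                 PgBackendStatusSketch *out, bool *valid)
{
    for (;;)
    {
        int before = be->st_changecount;

        memcpy(out, (const void *) be, sizeof(*out));

        int after = be->st_changecount;

        if (before == after && (before & 1) == 0)
            break;              /* consistent snapshot obtained */

        /*
         * A writer is (or appears to be) mid-update.  If the writer died
         * after its first increment, the count stays odd and this loop
         * never exits.
         */
    }

    /* terminated backends are skipped only after a consistent copy */
    *valid = (out->st_procpid > 0);
}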
neeraj kumar writes:
> Took some time to get a stack trace as we didn't have root permission.
> Attaching stack traces of two processes (out of many) stuck on the same query
> below [1][2].
Hmm, the line numbers in your stack traces don't agree with either v10
or HEAD branches for me. But assuming that you
neeraj kumar writes:
> We are using PG 10.6. We have one cron job that queries the pg_stat_activity
> table to find out how many queries are running longer than X minutes and
> generate metrics.
> The query looks like this:
> SELECT * FROM pg_stat_activity WHERE state='active'
> After some days, this query