Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests

2016-08-17 Thread Michael Paquier
On Thu, Aug 4, 2016 at 6:56 AM, Michael Paquier wrote:
> On Thu, Aug 4, 2016 at 2:34 AM, Alvaro Herrera wrote:
>> Michael Paquier wrote:
>>> On Wed, Aug 3, 2016 at 7:21 AM, Alvaro Herrera wrote:
>>
>>> > Why not capture both items in a single select, such as in the attached
>>> > patch?

Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests

2016-08-03 Thread Michael Paquier
On Thu, Aug 4, 2016 at 2:34 AM, Alvaro Herrera wrote:
> Michael Paquier wrote:
>> On Wed, Aug 3, 2016 at 7:21 AM, Alvaro Herrera wrote:
>
>> > Why not capture both items in a single select, such as in the attached
>> > patch?
>>
>> Let me test this
>> [... A while after ...]
>> This looks

Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests

2016-08-03 Thread Alvaro Herrera
Michael Paquier wrote:
> On Wed, Aug 3, 2016 at 7:21 AM, Alvaro Herrera wrote:
> > Why not capture both items in a single select, such as in the attached
> > patch?
>
> Let me test this
> [... A while after ...]
> This looks to work properly. 12 runs in a row have passed.
Okay, applied t

Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests

2016-08-02 Thread Michael Paquier
On Wed, Aug 3, 2016 at 7:21 AM, Alvaro Herrera wrote:
> Michael Paquier wrote:
>
>> Here using pg_xlog_replay_resume() is not the correct solution because
>> this would cause the node to finish recovery before we want it to, and
>> so is recovery_target_action = 'promote'. If we look at the test,

Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests

2016-08-02 Thread Alvaro Herrera
Michael Paquier wrote:
> Here using pg_xlog_replay_resume() is not the correct solution because
> this would cause the node to finish recovery before we want it to, and
> so is recovery_target_action = 'promote'. If we look at the test, it
> is doing the following when getting the TXID that is use

Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests

2016-08-02 Thread Alvaro Herrera
Michael Paquier wrote:
> On Tue, Aug 2, 2016 at 10:28 AM, Michael Paquier wrote:
> > There is still an issue with pg_basebackup when testing stream mode
> > and replication slots. I am digging into this one now..
>
> After 5 hours running this test in a row and 30 attempts torturing
> hamster w

Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests

2016-08-01 Thread Michael Paquier
On Tue, Aug 2, 2016 at 10:28 AM, Michael Paquier wrote:
> There is still an issue with pg_basebackup when testing stream mode
> and replication slots. I am digging into this one now..
After 5 hours running this test in a row and 30 attempts torturing
hamster with a script running make check in an

Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests

2016-08-01 Thread Michael Paquier
On Wed, Jul 27, 2016 at 10:00 AM, Michael Paquier wrote:
> On Mon, Jul 25, 2016 at 10:05 PM, Michael Paquier wrote:
>> On Mon, Jul 25, 2016 at 2:52 PM, Michael Paquier wrote:
>>> Ah, yes, and that's a stupid mistake. We had better use
>>> replay_location instead of write_location. There is

Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests

2016-07-26 Thread Michael Paquier
On Mon, Jul 25, 2016 at 10:05 PM, Michael Paquier wrote:
> On Mon, Jul 25, 2016 at 2:52 PM, Michael Paquier wrote:
>> Ah, yes, and that's a stupid mistake. We had better use
>> replay_location instead of write_location. There is a risk that
>> records have not been replayed yet even if they hav

Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests

2016-07-25 Thread Michael Paquier
On Mon, Jul 25, 2016 at 2:52 PM, Michael Paquier wrote:
> Ah, yes, and that's a stupid mistake. We had better use
> replay_location instead of write_location. There is a risk that
> records have not been replayed yet even if they have been written on
> the standby, so it is possible that the query
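
To illustrate the distinction quoted above, here is a rough sketch of why checking the written position can report "caught up" too early. It is in Python rather than the Perl used by the TAP tests, and the function names and sample LSN values are invented for illustration; only the textual LSN format (two hexadecimal numbers separated by a slash) is taken from PostgreSQL itself.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000060' to a 64-bit integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def standby_caught_up(target_lsn: str, standby_lsn: str) -> bool:
    """Return True once the standby position has reached the target."""
    return parse_lsn(standby_lsn) >= parse_lsn(target_lsn)


# Hypothetical sample values: WAL has been written up to the target,
# but replay is still a little behind.
target = "0/3000100"
write_location = "0/3000100"
replay_location = "0/3000060"

caught_up_by_write = standby_caught_up(target, write_location)
caught_up_by_replay = standby_caught_up(target, replay_location)
```

With these made-up values the write-based check already succeeds while the replay-based check does not, so a query run on the standby at that moment might not yet see the expected data.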

Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests

2016-07-24 Thread Michael Paquier
On Mon, Jul 25, 2016 at 2:38 PM, Alvaro Herrera wrote:
> Michael Paquier wrote:
> Yeah, thanks, pushed. However this doesn't explain all the failures we see:
I missed those ones, thanks for the reminder.
> 1) In
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2016-07-14%2016

Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests

2016-07-24 Thread Alvaro Herrera
Michael Paquier wrote:
> Lately hamster is failing every 4/5 days on the recovery regression
> tests in 003 covering the recovery targets, with that:
> # Postmaster PID for node "standby_2" is 20510
> #
> Timed out while waiting for standby to catch up at
> t/003_recovery_targets.pl line 36.
>
>

[HACKERS] Increasing timeout of poll_query_until for TAP tests

2016-07-24 Thread Michael Paquier
Hi all,
Lately hamster is failing every 4/5 days on the recovery regression
tests in 003 covering the recovery targets, with that:
# Postmaster PID for node "standby_2" is 20510
#
Timed out while waiting for standby to catch up at
t/003_recovery_targets.pl line 36.
Which means that poll_for_query
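
For readers unfamiliar with this kind of failure, the general retry-until-timeout pattern behind such a "Timed out while waiting" message can be sketched as follows. This is a generic Python illustration, not the actual Perl helper from the TAP framework; `poll_until`, its parameters, and the default values are hypothetical placeholders.

```python
import time
from typing import Callable


def poll_until(check: Callable[[], bool],
               max_attempts: int = 10,
               delay_s: float = 1.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call check() until it returns True or the attempts run out.

    Returns True on success and False on timeout. The defaults are
    arbitrary placeholders, not values taken from PostgreSQL.
    """
    for attempt in range(max_attempts):
        if check():
            return True
        # No point in sleeping after the final failed attempt.
        if attempt < max_attempts - 1:
            sleep(delay_s)
    return False
```

The total wait is bounded by roughly max_attempts * delay_s, so on a slow machine a condition that needs longer than that budget is reported as a timeout even though it would eventually have become true. Raising the attempt count trades a longer worst-case test run for fewer spurious failures.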