Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-04-19 Thread Michael Paquier
On Mon, Apr 18, 2016 at 4:48 PM, Michael Paquier wrote: > Another, perhaps, better idea is to add some more extra logic to > pg_conn as follows: > boolis_emergency; > PGresult *emergency_result; > boolis_emergency_consumed; > So as when an OOM shows up, is_emergency is set to true. Then a

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-04-18 Thread Michael Paquier
On Wed, Apr 6, 2016 at 3:09 PM, Michael Paquier wrote: > On Sat, Apr 2, 2016 at 12:30 AM, Tom Lane wrote: >> I wrote: >>> So the core of my complaint is that we need to fix things so that, whether >>> or not we are able to create the PGRES_FATAL_ERROR PGresult (and we'd >>> better consider the be

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-04-05 Thread Michael Paquier
On Sat, Apr 2, 2016 at 12:30 AM, Tom Lane wrote: > I wrote: >> So the core of my complaint is that we need to fix things so that, whether >> or not we are able to create the PGRES_FATAL_ERROR PGresult (and we'd >> better consider the behavior when we cannot), ... > > BTW, the real Achilles' heel o

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-04-01 Thread Tom Lane
I wrote: > So the core of my complaint is that we need to fix things so that, whether > or not we are able to create the PGRES_FATAL_ERROR PGresult (and we'd > better consider the behavior when we cannot), ... BTW, the real Achilles' heel of any attempt to ensure sane behavior at the OOM limit is

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-04-01 Thread Tom Lane
Amit Kapila writes: > Very valid point. So, if we see with patch, I think libpq will be > in PGASYNC_COPY_XXX state after such a failure and next time when it tries > to again execute statement, it will end copy mode and then allow to proceed > from there. This design is required for other purp

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-04-01 Thread Tom Lane
Michael Paquier writes: > On Fri, Apr 1, 2016 at 12:48 PM, Tom Lane wrote: >> Anyway, the short of my review is that we need more clarity of thought >> about what state libpq is in after a failure like this, and what that >> state looks like to the application, and how that should affect how >> l

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-04-01 Thread Amit Kapila
On Fri, Apr 1, 2016 at 10:34 AM, Michael Paquier wrote: > On Fri, Apr 1, 2016 at 12:48 PM, Tom Lane wrote: > > I thought about this patch a bit more... > > > > When I first looked at the patch, I didn't believe that it worked at > > all: it failed to return a PGRES_COPY_XXX result to the applicat

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-31 Thread Michael Paquier
On Fri, Apr 1, 2016 at 12:48 PM, Tom Lane wrote: > I thought about this patch a bit more... > > When I first looked at the patch, I didn't believe that it worked at > all: it failed to return a PGRES_COPY_XXX result to the application, > and yet nonetheless switched libpq's asyncStatus to PGASYNC_

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-31 Thread Tom Lane
I thought about this patch a bit more... When I first looked at the patch, I didn't believe that it worked at all: it failed to return a PGRES_COPY_XXX result to the application, and yet nonetheless switched libpq's asyncStatus to PGASYNC_COPY_XXX? Wouldn't things be hopelessly confused? I tried

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-31 Thread Tom Lane
Aleksander Alekseev writes: > +1, same here. Lets see whats committer's opinion. I fooled around with this patch quite a bit but could not bring myself to pull the trigger, because I think it fundamentally breaks applications that follow the "repeat PQgetResult until NULL" rule. The only reason

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-30 Thread Aleksander Alekseev
> I think this patch needs to be looked upon by committer now. I have > done review and added some code in this patch as well long back, just > see the e-mail [1], patch is just same as it was in October 2015. I > think myself and Michael are in agreement that this patch solves the > reported pro

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-29 Thread Amit Kapila
On Tue, Mar 29, 2016 at 8:31 PM, David Steele wrote: > Hi Amit, > > On 3/22/16 3:37 AM, Michael Paquier wrote: > > Hope this brings some light in. >> > > Do you know when you'll have time to respond to Michael's last email? I've > marked this patch "waiting on author" in the meantime. > > I think

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-29 Thread David Steele
On 3/29/16 8:21 PM, Michael Paquier wrote: > On Wed, Mar 30, 2016 at 12:01 AM, David Steele wrote: >> Hi Amit, >> >> On 3/22/16 3:37 AM, Michael Paquier wrote: >> >>> Hope this brings some light in. >> >> >> Do you know when you'll have time to respond to Michael's last email? I've >> marked this

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-29 Thread Michael Paquier
On Wed, Mar 30, 2016 at 12:01 AM, David Steele wrote: > Hi Amit, > > On 3/22/16 3:37 AM, Michael Paquier wrote: > >> Hope this brings some light in. > > > Do you know when you'll have time to respond to Michael's last email? I've > marked this patch "waiting on author" in the meantime. Shouldn't

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-29 Thread David Steele
Hi Amit, On 3/22/16 3:37 AM, Michael Paquier wrote: Hope this brings some light in. Do you know when you'll have time to respond to Michael's last email? I've marked this patch "waiting on author" in the meantime. Thanks, -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing li

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-22 Thread Michael Paquier
On Tue, Mar 22, 2016 at 2:19 PM, Amit Kapila wrote: > On Tue, Mar 22, 2016 at 9:46 AM, Tom Lane wrote: >> >> Amit Kapila writes: >> > On Mon, Mar 21, 2016 at 10:13 PM, Robert Haas >> > wrote: >> >> It is very difficult to believe that this is a good idea: >> >> >> >> --- a/src/backend/replicati

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-21 Thread Amit Kapila
On Tue, Mar 22, 2016 at 9:46 AM, Tom Lane wrote: > > Amit Kapila writes: > > On Mon, Mar 21, 2016 at 10:13 PM, Robert Haas wrote: > >> It is very difficult to believe that this is a good idea: > >> > >> --- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c > >> +++ b/src/backend/repl

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-21 Thread Tom Lane
Amit Kapila writes: > On Mon, Mar 21, 2016 at 10:13 PM, Robert Haas wrote: >> It is very difficult to believe that this is a good idea: >> >> --- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c >> +++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c >> @@ -445,6 +445,7

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-21 Thread Amit Kapila
On Mon, Mar 21, 2016 at 10:13 PM, Robert Haas wrote: > > On Tue, Mar 1, 2016 at 12:38 AM, Michael Paquier > wrote: > > Thoughts? I have registered that in the CF app, and a patch is attached. > > It is very difficult to believe that this is a good idea: > > --- a/src/backend/replication/libpqwalr

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-21 Thread Robert Haas
On Tue, Mar 1, 2016 at 12:38 AM, Michael Paquier wrote: > Thoughts? I have registered that in the CF app, and a patch is attached. It is very difficult to believe that this is a good idea: --- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c +++ b/src/backend/replication/libpqwalrec

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-09 Thread Michael Paquier
On Thu, Mar 10, 2016 at 12:12 AM, Alvaro Herrera wrote: > Aleksander Alekseev wrote: >> pg_receivexlog: could not send replication command "START_REPLICATION": >> out of memory pg_receivexlog: disconnected; waiting 5 seconds to try >> again pg_receivexlog: starting log streaming at 0/100 (time

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-09 Thread Alvaro Herrera
Aleksander Alekseev wrote: > pg_receivexlog: could not send replication command "START_REPLICATION": > out of memory pg_receivexlog: disconnected; waiting 5 seconds to try > again pg_receivexlog: starting log streaming at 0/100 (timeline 1) > > Breakpoint 1, getCopyStart (conn=0x610180, copyt

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-09 Thread Aleksander Alekseev
Hello, Michael Thanks a lot for steps to reproduce you provided. I tested your path on Ubuntu Linux 14.04 LTS (GCC 4.8.4) and FreeBSD 10.2 RELEASE (Clang 3.4.1). In both cases patch applies cleanly, there are no warnings during compilation and all regression tests pass. A few files are not proper

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-03 Thread Michael Paquier
On Fri, Mar 4, 2016 at 12:59 AM, Aleksander Alekseev wrote: >> The easiest way to perform tests with this patch is to take a debugger >> and enforce the malloc'd pointers to NULL in the code paths. > > I see. Still I don't think it's an excuse to not provide clear steps to > reproduce an issue. As

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-03 Thread Aleksander Alekseev
Hello, Michael > The easiest way to perform tests with this patch is to take a debugger > and enforce the malloc'd pointers to NULL in the code paths. I see. Still I don't think it's an excuse to not provide clear steps to reproduce an issue. As I see it anyone should be able to easily check your

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-03 Thread Michael Paquier
On Thu, Mar 3, 2016 at 10:18 PM, Aleksander Alekseev wrote: > I didn't checked your patch in detail yet but here is a thought I would > like to share. > > In my experience usually it takes number of rewrites before patch will > be accepted. To make sure that after every rewrite your patch still >

Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-03 Thread Aleksander Alekseev
Hello, Michael I didn't checked your patch in detail yet but here is a thought I would like to share. In my experience usually it takes number of rewrites before patch will be accepted. To make sure that after every rewrite your patch still solves an issue you described you should probably provid