On 14.07.2019 8:03, Thomas Munro wrote:
On Tue, Jul 9, 2019 at 8:30 AM Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:
Rebased version of the patch is attached.
Thanks for including nice documentation in the patch, which gives a
good overview of what's going on.  I haven't read any code yet, but I
took it for a quick drive to understand the user experience.  These
are just some first impressions.

I started my server with -c connection_proxies=1 and tried to connect
to port 6543 and the proxy segfaulted on null ptr accessing
port->gss->enc.  I rebuilt without --with-gssapi to get past that.

First of all a lot of thanks for your review.

Sorry, I failed to reproduce the problem with GSSAPI.
I have configured postgres with --with-openssl --with-gssapi
and then try to connect to the serverwith psql using the following connection string:

psql "sslmode=require dbname=postgres port=6543 krbsrvname=POSTGRES"

There are no SIGFAULS and port->gss structure is normally initialized.

Can you please explain me more precisely how to reproduce the problem (how to configure postgres and how to connect to it)?


Using SELECT pg_backend_pid() from many different connections, I could
see that they were often being served by the same process (although
sometimes it created an extra one when there didn't seem to be a good
reason for it to do that).  I could see the proxy managing these
connections with SELECT * FROM pg_pooler_state() (I suppose this would
be wrapped in a view with a name like pg_stat_proxies).  I could see
that once I did something like SET foo.bar = 42, a backend became
dedicated to my connection and no other connection could use it.  As
described.  Neat.

Obviously your concept of tainted backends (= backends that can't be
reused by other connections because they contain non-default session
state) is quite simplistic and would help only the very simplest use
cases.  Obviously the problems that need to be solved first to do
better than that are quite large.  Personally I think we should move
all GUCs into the Session struct, put the Session struct into shared
memory, and then figure out how to put things like prepared plans into
something like Ideriha-san's experimental shared memory context so
that they also can be accessed by any process, and then we'll mostly
be tackling problems that we'll have to tackle for threads too.  But I
think you made the right choice to experiment with just reusing the
backends that have no state like that.
That was not actually my idea: it was proposed by Dimitri Fontaine.
In PgPRO-EE we have another version of builtin connection pooler
which maintains session context and allows to use GUCs, prepared statements
and temporary tables in pooled sessions.
But the main idea of this patch was to make connection pooler less invasive and
minimize changes in Postgres core. This is why I have implemented proxy.


On my FreeBSD box (which doesn't have epoll(), so it's latch.c's old
school poll() for now), I see the connection proxy process eating a
lot of CPU and the temperature rising.  I see with truss that it's
doing this as fast as it can:

poll({ 13/POLLIN 17/POLLIN|POLLOUT },2,1000)     = 1 (0x1)

Ouch.  I admit that I had the idea to test on FreeBSD because I
noticed the patch introduces EPOLLET and I figured this might have
been tested only on Linux.  FWIW the same happens on a Mac.

Yehh.
This is really the problem. In addition to FreeBSD and MacOS, it also takes place at Win32. I have to think more how to solve it. We should somehow emulate "edge-triggered" more for this system. The source of the problem is that proxy is reading data from one socket and writing it in another socket. If write socket is busy, we have to wait until is is available. But at the same time there can be data available for input, so poll will return immediately unless we remove read event for this socket. Not here how to better do it without changing
WaitEvenSet API.



   C:\projects\postgresql\src\include\../interfaces/libpq/libpq-int.h(33):
fatal error C1083: Cannot open include file: 'pthread-win32.h': No
such file or directory (src/backend/postmaster/proxy.c)
[C:\projects\postgresql\postgres.vcxproj]

I will investigate the problem with Win32 build once I get image of Win32 for VirtualBox.

These relative includes in proxy.c are part of the problem:

#include "../interfaces/libpq/libpq-fe.h"
#include "../interfaces/libpq/libpq-int.h"

I didn't dig into this much but some first reactions:
I have added
override CPPFLAGS :=  $(CPPFLAGS) -I$(top_builddir)/src/port -I$(top_srcdir)/src/port

in Makefile for postmaster in order to fix this problem (as in interface/libpq/Makefile). But looks like it is not enough. As I wrote above I will try to solve this problem once I get access to Win32 environment.

1.  I see that proxy.c uses libpq, and correctly loads it as a dynamic
library just like postgres_fdw.  Unfortunately it's part of core, so
it can't use the same technique as postgres_fdw to add the libpq
headers to the include path.

2.  libpq-int.h isn't supposed to be included by code outside libpq,
and in this case it fails to find pthead-win32.h which is apparently
expects to find in either the same directory or the include path.  I
didn't look into what exactly is going on (I don't have Windows
either) but I think we can say the root problem is that you shouldn't
be including that from backend code.

Looks like proxy.c has to be moved somewhere outside core?
May be make it an extension? But it may be not so easy to implement because
proxy has to be tightly integrated with postmaster.


Reply via email to