Hi all,

A quick make check with Postgres 11 and 12 for src/test/ssl/ shows a large difference in run time, using the same set of options with SSL and the same compilation flags (OpenSSL 1.1.1f, with debugging and assertions enabled among other things, FWIW): 12 takes up to five minutes to complete, while 11 finishes in a matter of seconds for me.
I have spent a couple of hours on that, and found that libpq tries to initialize a GSS context, where the client remains stuck:

#9  0x00007fcd839bf72c in krb5_expand_hostname () from /usr/lib/x86_64-linux-gnu/libkrb5.so.3
#10 0x00007fcd839bf9e0 in krb5_sname_to_principal () from /usr/lib/x86_64-linux-gnu/libkrb5.so.3
#11 0x00007fcd83ad55b4 in ?? () from /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2
#12 0x00007fcd83ac0a98 in ?? () from /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2
#13 0x00007fcd83ac200f in gss_init_sec_context () from /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2
#14 0x00007fcd8423b24d in pqsecure_open_gss (conn=0x5582fa8cad90) at fe-secure-gssapi.c:626
#15 0x00007fcd8421cd2b in PQconnectPoll (conn=0x5582fa8cad90) at fe-connect.c:3165
#16 0x00007fcd8421b311 in connectDBComplete (conn=0x5582fa8cad90) at fe-connect.c:2182
#17 0x00007fcd84218c1f in PQconnectdbParams (keywords=0x5582fa8cacf0, values=0x5582fa8cad40, expand_dbname=1) at fe-connect.c:647
#18 0x00005582f8a81c87 in main (argc=8, argv=0x7ffe5ddb9df8) at startup.c:266

However, this makes little sense: why would libpq do that in the context of an OpenSSL connection? Well, makeEmptyPGconn() does that, which means that libpq tries to use GSS by default whenever libpq is *built* with GSS:

#ifdef ENABLE_GSS
    conn->try_gss = true;
#endif

It is possible to force this flag to false by using gssencmode=disable, but that's not really user-friendly in my opinion, because nobody is going to remember that for connection strings with SSL settings, so a lot of applications are taking a performance hit at connection time because of that. I think it is also a bad idea from the start to assume that we have to try GSS by default, as any new code path opening a secured connection may fall into the trap of attempting to use GSS if this flag is not reset. Shouldn't we set this flag to false by default, and set it to true only if necessary depending on gssencmode?
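To make the question concrete, here is a minimal sketch of the idea, not actual libpq code. ToyConn, toy_make_empty_conn() and toy_setup_try_gss() are hypothetical names invented for illustration, and built_with_gss stands in for the ENABLE_GSS compile-time check. The only things taken from libpq are the try_gss flag and the gssencmode values (disable, prefer, require).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the relevant fields of PGconn. */
typedef struct ToyConn
{
    const char *gssencmode;     /* "disable", "prefer", "require", or NULL */
    bool        try_gss;
} ToyConn;

/* Start with the GSS attempt switched off, instead of on. */
void
toy_make_empty_conn(ToyConn *conn)
{
    assert(conn != NULL);
    conn->gssencmode = NULL;
    conn->try_gss = false;
}

/*
 * Once the connection options are known, enable the GSS attempt only if
 * the build supports GSS and gssencmode is not "disable".  An unset mode
 * is treated as "prefer", which is the documented default.
 */
void
toy_setup_try_gss(ToyConn *conn, bool built_with_gss)
{
    const char *mode;

    assert(conn != NULL);
    mode = conn->gssencmode ? conn->gssencmode : "prefer";
    conn->try_gss = built_with_gss && strcmp(mode, "disable") != 0;
}
```

With this shape, a code path that never calls the setup step cannot attempt GSS by accident. Note that an unset gssencmode is still treated as prefer here, so the attempt still happens in a GSS-enabled build unless the mode is disable.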
A quick hack switching this flag to false in makeEmptyPGconn() gives back the past performance to src/test/ssl/, FWIW.

Looking around, it seems to me that there is a second issue in PQconnectPoll(), where we don't reset the state machine correctly for try_gss within reset_connection_state_machine, and instead HEAD does it in connectDBStart().

Also, I have noted a hack in pqsecure_open_gss() which does this:

    /*
     * We're done - hooray!  Kind of gross, but we need to disable SSL
     * here so that we don't accidentally tunnel one over the other.
     */
#ifdef USE_SSL
    conn->allow_ssl_try = false;
#endif

And that looks like a rather bad idea to me.

Thanks,
--
Michael