Sorry, my client environment is Linux. My current theory is that my clients are running out of available ephemeral ports, as in this thread: http://dba.stackexchange.com/questions/59650/pgbouncer-works-great-but-occasionally-becomes-unavailable (although I'm not currently using pgbouncer). I tried pgbouncer before and got the same errors, which in retrospect makes a client-side issue seem more likely. Are there any configuration variables I can set in the PostgreSQL client libraries to reduce the number of ephemeral ports required? Otherwise, I will attempt to reconfigure the OS of the client machines tomorrow morning.
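
One rough way to test the ephemeral-port theory on a Linux client is sketched below in Python. It is not taken from the thread. It assumes the standard Linux files /proc/sys/net/ipv4/ip_local_port_range and /proc/net/tcp, and it assumes closed client sockets linger in TIME_WAIT for about 60 seconds. The function names are made up for illustration.

```python
from pathlib import Path

# Assumption: standard Linux locations for the port range and the IPv4 socket table.
PORT_RANGE_FILE = Path("/proc/sys/net/ipv4/ip_local_port_range")
TCP_TABLE_FILE = Path("/proc/net/tcp")  # IPv4 only; IPv6 sockets are in /proc/net/tcp6

# Assumption: a closed client socket stays in TIME_WAIT for roughly 60 s on Linux.
TIME_WAIT_SECONDS = 60


def parse_port_range(text):
    """Parse the 'low high' pair as written in ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    return low, high


def sustainable_connects_per_second(low, high, time_wait=TIME_WAIT_SECONDS):
    """Rough ceiling on new connections per second to one server address and port."""
    return (high - low + 1) / time_wait


def count_time_wait(proc_net_tcp_text):
    """Count sockets in TIME_WAIT (state code 06) in /proc/net/tcp-style text."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 3 and fields[3] == "06":
            count += 1
    return count


def main():
    if not PORT_RANGE_FILE.exists():
        print(f"{PORT_RANGE_FILE} not found; this sketch only applies to Linux")
        return
    low, high = parse_port_range(PORT_RANGE_FILE.read_text())
    rate = sustainable_connects_per_second(low, high)
    print(f"ephemeral ports {low}-{high}: roughly {rate:.0f} new connections/s")
    if TCP_TABLE_FILE.exists():
        waiting = count_time_wait(TCP_TABLE_FILE.read_text())
        print(f"IPv4 sockets currently in TIME_WAIT: {waiting}")


if __name__ == "__main__":
    main()
```

Run on a client node while the 32 jobs are active. If the TIME_WAIT count sits close to the size of the port range while connections are failing, that would support the theory. A count far below the range would point elsewhere.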

Thanks,
Steve

On Tue, Mar 29, 2016 at 4:44 PM Adrian Klaver <adrian.kla...@aklaver.com> wrote:

> On 03/29/2016 01:28 PM, Stephen Constable wrote:
> > My apologies, I'm not sure what part of the networking stack the
> > messages are coming from. It also states:
> > """
> > could not connect to server: Cannot assign requested address
> > Is the server running on host "<hostname>" and accepting
> > TCP/IP connections on port <port>?
> > """
>
> Alright I lied, the above is a Postgres error message. I am just not
> used to seeing 'Cannot assign requested address'. Turns out it is in
> interfaces/libpq/win32.c.
>
> So your client is running on Windows?
>
> >
> > This error is only printed under a 32-job load, never a single job load.
> >
> > The processes are indeed connecting over a local network.
> >
> > I have only enabled the logging of connections and disconnections since
> > I figured that would be the most telling :) perhaps that was not the
> > best idea. but, FYI, I see over 5000 such notices in a single minute.
> > I will reconfigure the logging to be more verbose.
> >
> > Thanks,
> > Steve
> >
> > On Tue, Mar 29, 2016 at 4:21 PM Adrian Klaver <adrian.kla...@aklaver.com
> > <mailto:adrian.kla...@aklaver.com>> wrote:
> >
> >     On 03/29/2016 01:10 PM, Stephen Constable wrote:
> >     > Hi All,
> >     >
> >     > I'm a new-ish sysadmin working on porting legacy scientific code from a
> >     > local server/client to new supercomputer environment. My work is mostly
> >     > done, except that my postgres database doesn't seem to be able to keep
> >     > up with the new environment. The application is written in-house in a
> >     > mixture of FORTRAN 77 and C, and uses postgres BLOBS as its main data
> >     > store. This application in particular only reads from the database, it
> >     > never writes, which *should* make it easy to scale.
> >     >
> >     > My main problem is that this client application is unable to connect to
> >     > the database under a modest load (32 simultaneous jobs). The client
> >     > error logs print out messages like "could not connect to server: Cannot
> >     > assign requested address" and "Cannot connect to database [runlog]!!!"
> >     > (an important database of ours). The "cannot assign requested address"
> >
> >     Well those do not look like Postgres error messages to me, so the first
> >     thing would be to determine what part of the stack is generating them.
> >
> >     Is the client software connecting to the database over a network?
> >
> >     Are you using connection pooling?
> >
> >     > message makes me think it's a configuration issue. The logs are flooded
> >     > with hundreds of connection and disconnection notices per second. This
> >
> >     Might want to turn off logging connections/disconnections:
> >
> >     http://www.postgresql.org/docs/9.4/interactive/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT
> >
> >     log_connections (boolean)
> >
> >     log_disconnections (boolean)
> >
> >     > same code and configuration runs fine on our mid-2000's Solaris 10 box
> >     > with postgres 8.4 (albeit very slowly) but totally fails with these
> >     > connection errors on a modern Dell system running CentOS 7 or FreeBSD 10
> >     > (I tested both) with postgres 9.4.
> >     >
> >     > While the database is under load (and jobs are actively failing), select
> >     > count(*) from pg_stat_activity returns 30-34 ish connections, show
> >     > max_connections returns 100, and show superuser_reserved_connections
> >     > shows 3. My only other hint is that right after a fresh install of
> >     > CentOS 7 my job success rate was around 50%, and now it has approached
> >     > approximately 5%, so something is changing over time.
> >     >
> >     > Does anyone have any advice or experience with similar issues?
> >
> >     What else does the Postgres log show besides the
> >     connections/disconnections, that might be of interest?
> >
> >     What does the system log show?
> >
> >     >
> >     > Thanks,
> >     > Steve
> >
> >
> >     --
> >     Adrian Klaver
> >     adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com