dup(0) fails on Ubuntu 20.04 and macOS 10.15 with 13.0
Dear All, I've used PostgreSQL since version 9.x successfully on Linux, macOS and Windows. Today I've upgraded from 12.3 to 13.0 and suddenly I can not start the server any more on Ubuntu 20.04 (inside Docker on Ubuntu 18.04) and on macOS 10.15. I get reproducibly the error: 2020-10-05 11:48:19.720 CEST [84731] WARNING: dup(0) failed after 0 successes: Bad file descriptor 2020-10-05 11:48:19.720 CEST [84731] FATAL: insufficient file descriptors available to start server process 2020-10-05 11:48:19.720 CEST [84731] DETAIL: System allows 0, we need at least 58. 2020-10-05 11:48:19.720 CEST [84731] LOG: database system is shut down What makes this quite curious is that everything continues to work on Ubuntu 18.04, and Windows with Visual Studio 2019. I compile postgreSQL myself from source, but there are no patches or tweaks involved (that I could think relevant for this problem). I've searched for the particular error and understand that it is usually caused by system limits on new files(?) But in my case, the failing setups are relatively modern, powerful CI test machines, with virtually no load at the time of test. On Linux Ubuntu 20.04, `ulimit -n` shows `1048576` which seems also relatively high (but must be the default, not changed by me). Does this problem mean anything to anyone? I'm completely lost where to go from here, or what to try next :-( Help would be greatly appreciated! All the best, Mario Emmenlauer
Re: dup(0) fails on Ubuntu 20.04 and macOS 10.15 with 13.0
On 05.10.20 13:22, Mario Emmenlauer wrote: > I've used PostgreSQL since version 9.x successfully on Linux, macOS > and Windows. Today I've upgraded from 12.3 to 13.0 and suddenly I can > not start the server any more on Ubuntu 20.04 (inside Docker on Ubuntu > 18.04) and on macOS 10.15. > > I get reproducibly the error: > 2020-10-05 11:48:19.720 CEST [84731] WARNING: dup(0) failed after 0 > successes: Bad file descriptor > 2020-10-05 11:48:19.720 CEST [84731] FATAL: insufficient file descriptors > available to start server process > 2020-10-05 11:48:19.720 CEST [84731] DETAIL: System allows 0, we need at > least 58. > 2020-10-05 11:48:19.720 CEST [84731] LOG: database system is shut down > > What makes this quite curious is that everything continues to work on > Ubuntu 18.04, and Windows with Visual Studio 2019. > > I compile postgreSQL myself from source, but there are no patches or > tweaks involved (that I could think relevant for this problem). > > I've searched for the particular error and understand that it is usually > caused by system limits on new files(?) But in my case, the failing > setups are relatively modern, powerful CI test machines, with virtually > no load at the time of test. > > On Linux Ubuntu 20.04, `ulimit -n` shows `1048576` which seems also > relatively high (but must be the default, not changed by me). > > Does this problem mean anything to anyone? I'm completely lost where > to go from here, or what to try next :-( Help would be greatly > appreciated! I've just regression-tested 12.4 for the above problem, and I can now isolate that 12.3 and 12.4 do _not_ give me above problem whereas 13.0 _does_ give me above problem. I tried also to isolate commits that may be related, but could not find anything useful :-( Help would be greatly appreciated! All the best, Mario Emmenlauer
Is it possible to write a generic UPSERT?
Dear all, I hope this request is not too naiive. Please ignore if its not matching this lists focus, or if it just shows my lack of SQL knowledge. Any pointer to documentation would be appreciated! My question is about the "ON CONFLICT UPDATE" clause. We often have one dedicated condition that we want to achieve: Insert an item if it does not exist, or update all given fields if it does exist. Your excellent documentation makes it easy to implement this in SQL. But the statement is slightly complex to type, and I find me and my colleagues often spend more time on this than I would hope. Our two main challenges are: (1) we have to look up the uniqueness constraints on the table, and (2) we have to duplicate the insert statement in the UPDATE section again, because virtually all fields should get overwritten (except for the conflicting ones). On long inserts this can be quite annoying and error-prone. I can see how "ON CONFLICT" is very powerful. But that power seems often a burden for us. We would prefer something that is less manual effort for the specific use case. Basically, we would like: INSERT if not exist, and UPDATE _all_ non-conflicting fields in case of _any_ conflict In my (naiive) thinking, such a construct would cover 99% of our use cases. Or did other people make very different experiences? Now the question: I fail to generically implement this. I do not necessarily mean that this has to be supported by PostgreSQL. Any solution would work for us, be it in SQL, PostgreSQL or C/C++. For example an acceptable solution would be a C/C++ wrapper method that generates the statement, given the corresponding INSERT statement and the types and values of all arguments. The wrapper would also have access to the table definitions because we create all tables in code. We currently do not have concurrency concerns so a more complex solution using multiple statements would be fine. Has anybody ever done something like this? Is there an SQL way to achieve this? Or another programmatic way? All the best, Mario Emmenlauer -- BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 Balanstr. 43 mailto: memmenlauer * biodataanalysis.de D-81669 München http://www.biodataanalysis.de/
Re: Is it possible to write a generic UPSERT?
On 12.11.20 17:55, Alban Hertroys wrote: >> On 12 Nov 2020, at 14:58, Mario Emmenlauer wrote: > > (…) > >> But the statement is slightly complex to type, and I find me and my >> colleagues often spend more time on this than I would hope. Our two >> main challenges are: >> (1) we have to look up the uniqueness constraints on the table, and >> (2) we have to duplicate the insert statement in the UPDATE section >>again, because virtually all fields should get overwritten >>(except for the conflicting ones). On long inserts this can be >>quite annoying and error-prone. >> >> I can see how "ON CONFLICT" is very powerful. But that power seems >> often a burden for us. We would prefer something that is less manual >> effort for the specific use case. Basically, we would like: >>INSERT if not exist, and >>UPDATE _all_ non-conflicting fields in case of _any_ conflict >> >> In my (naiive) thinking, such a construct would cover 99% of our >> use cases. Or did other people make very different experiences? > > (…) > >> Has anybody ever done something like this? Is there an SQL way to >> achieve this? Or another programmatic way? > > We generate the SQL @work based on the definitions in, IIRC, the > information_schema. It has tables for both the column lists per table and the > primary key definitions. > > With that, an SQL statement that returns the required SQL statement is easy > to generate, after which you can execute it either from a plpgsql execute > statement in a function or in a do-block. This is actually a very very interesting idea! I did not consider that we could completely generate the statements based on the actual table information from the information_schema. I need to give this a bit more thought but I very much like the idea, thanks for pushing me in a new direction! All the best, Mario Emmenlauer -- BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 Balanstr. 43 mailto: memmenlauer * biodataanalysis.de D-81669 München http://www.biodataanalysis.de/
Re: Is it possible to write a generic UPSERT?
On 12.11.20 18:34, Michael Lewis wrote: > On Thu, Nov 12, 2020 at 6:58 AM Mario Emmenlauer <mailto:ma...@emmenlauer.de>> wrote: > > I can see how "ON CONFLICT" is very powerful. But that power seems > often a burden for us. We would prefer something that is less manual > effort for the specific use case. Basically, we would like: > INSERT if not exist, and > UPDATE _all_ non-conflicting fields in case of _any_ conflict > > > If you do not have significant ratio of HOT (heap only tuple) updates that > you want to preserve and you don't have sequences that are GENERATED AS ALWAYS > (rather than BY DEFAULT), you could consider just doing a DELETE where the > keys exist, then insert all the rows. It should be trivial to figure out the > primary > key or other unique index to match on. > > MERGE command is implemented for this use case in some DBMS, but not Postgres > (yet?). Actually I'm quite happy you suggest this, because its the one thing that also came to my mind :) I was not really sure if this is a sane idea. But it seems this could be the "easiest" way out, because its short (like "easy to write") and works for all our use cases. The main potential problem for my use case may be cascading deletes, that I think we may want to use in the long run. All the best, Mario Emmenlauer -- BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 Balanstr. 43 mailto: memmenlauer * biodataanalysis.de D-81669 München http://www.biodataanalysis.de/
lib and share are installed differently, but why?
Dear All, I've been building and using PostgreSQL from source for a few years now, thanks for the awesome work! Since upgrading from 12.5 to 12.8, I've found a problem that I can not resolve myself. We build PostgreSQL twice, once to folder A, and once to folder B. This is because A is more easily deployable, while B holds all our development build targets. The only difference in the two builds is the change in `--prefix` for configure. Everything else is identical, and both builds start from a clean directory. But I see a difference in the installations: - The contents of `share` go to A/share/ vs B/share/postgresql/ - The contents of `lib` go to B/lib vs B/lib/postgresql/ I've played with all sorts of --libdir, --datarootdir and the like, but the behaviour stays the same. I fail to understand why in the case of --prefix=B (the target where we host a lot of other stuff), the install will use subdirectory 'postgresql', whereas for the --prefix=A (which is an empty directory) the install will not use a subdirectory. Whats more, when I try to run the server in A, initdb will complain that the file in A/share/postgresql/postgresql.bki is missing, which indeed is true (because the file is at A/share/postgresql.bki). Any help would be appreciated! All the best, Mario Emmenlauer -- BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203 Balanstr. 43 mailto: memmenlauer * biodataanalysis.de D-81669 München http://www.biodataanalysis.de/
Re: lib and share are installed differently, but why?
On 30.08.21 16:24, Tom Lane wrote: > Mario Emmenlauer writes: >> I've been building and using PostgreSQL from source for a few years >> now, thanks for the awesome work! > >> Since upgrading from 12.5 to 12.8, I've found a problem that I can >> not resolve myself. We build PostgreSQL twice, once to folder A, and >> once to folder B. This is because A is more easily deployable, while >> B holds all our development build targets. The only difference in >> the two builds is the change in `--prefix` for configure. Everything >> else is identical, and both builds start from a clean directory. > >> But I see a difference in the installations: >> - The contents of `share` go to A/share/ vs B/share/postgresql/ >> - The contents of `lib` go to B/lib vs B/lib/postgresql/ > > You didn't say exactly what "A" and "B" stand for, but I'll bet > that the string "A" includes "pgsql" or "postgres". Makefile.global > makes some installation path changes when that's not true, e.g. > > pkglibdir = $(libdir) > ifeq "$(findstring pgsql, $(pkglibdir))" "" > ifeq "$(findstring postgres, $(pkglibdir))" "" > override pkglibdir := $(pkglibdir)/postgresql > endif > endif Ahhh! This is very likely what I am tripped over! Thanks a million! Indeed my path A does *not* contain "postgres" while B is set to "A/postgresql". This explains the difference in the install. > This behavior is pretty ancient; I'm quite sure it didn't change > between 12.5 and 12.8. So I speculate that you made some change > in your build process that you didn't mention. I'm pretty sure there is no change in my build. But my provided information may still be incorrect. What I can see is that since postgres 12.8 I get the aforementioned error that the file "A/share/postgresql/postgresql.bki" would be missing. I did not get this error before, and therefore never investigated if the folder layout was the same between A and B. So, as you say, it may have been different since basically forever. But a new thing seems to be this error. After this find, I will change my build not to compile twice any more. Therefore the issue is resolved for me! But it may be good to keep this email for future reference, if someone else trips over the problem of a missing "postgresql.bki". Hint to this future person: does your install path contain the strings "pgsql" or "postgres", then search in this area for the error... All the best, and thanks a lot, Mario Emmenlauer
Re: dup(0) fails on Ubuntu 20.04 and macOS 10.15 with 13.0
On 05.10.20 14:35, Tom Lane wrote: > Mario Emmenlauer writes: >> I get reproducibly the error: >> 2020-10-05 11:48:19.720 CEST [84731] WARNING: dup(0) failed after 0 >> successes: Bad file descriptor > > Hmph. That code loop assumes that stdin exists to be duplicated, > but maybe if it had been closed, you'd get this error. > > However, that logic hasn't changed in decades, and we've not heard > complaints about it before. Are you starting the server in some > unusual way? Replying to a very old thread here: I could indeed trace the problem of the failing `dup(0)` back to how we start the server! We start the server from an executable that closes stdin very early on, and this seems to lead to the problem. We solved it by not closing stdin. But for future reference, if other people may be affected by this, the code could probably be modified to revert to stdout or stderr or a temporary file in case stdin is not available (guessing here...). All the best, Mario Emmenlauer