Re: Debugging leaking memory in Postgresql 13.2/Postgis 3.1
Hello Tom,

On 31.03.2021 20:24, Tom Lane wrote:
> Based on nearby threads, it occurs to me to ask whether you have JIT
> enabled, and if so whether turning it off helps. There seems to be
> a known leak of the code fragments generated by that in some cases.

That's it!

I am quite surprised that a feature which is on by default can produce such a massive leak and go more or less undetected. A single backend was leaking 250MB/hour; with my multiple connections it was 2GB. But that is exactly what happened. Doing a set jit=off immediately stopped the leak.

You mentioned that this seems to be known. Do you have pointers to the relevant bug-tracker/thread? I would like to follow up on this.

I have not measured the impact of JIT, but in theory it could bring larger performance benefits. So having it enabled sounds like a good idea, once it stops leaking.

I tried running Valgrind on postgres but did not have much success with it. Processes seemed to terminate quite frequently. My last use of Valgrind was a while ago and my use case back then was probably much simpler.

Is it known which queries lead to a leak? I still have the recording of mine, including explain. Would it help to narrow it down further to single queries which leak? Or is the JIT re-creating optimized code for each slightly modified query without freeing the old ones, so that re-running the same query would not leak?

https://downloads.osm-tools.org/postgresql-2021-04-03_183913.csv.gz

Stephan
Re: How to deny access to Postgres when connected from host/non-local
Thanks, works.

Sent from my iPhone

> On Apr 3, 2021, at 11:02, Joe Conway wrote:
>
> On 4/2/21 7:06 PM, A. Reichstadt wrote:
>> Hello,
>> I try to deny access to all databases on my server if the user "postgres"
>> tries to connect from a non-local host. Here is what I did in pg_hba.conf:
>> # TYPE  DATABASE        USER            ADDRESS                 METHOD
>> # "local" is for Unix domain socket connections only
>> local   all             all                                     md5
>> # IPv4 local connections:
>> host    all             all             127.0.0.1/32            md5
>> # IPv6 local connections:
>> host    all             all             ::1/128                 md5
>> # Allow replication connections from localhost, by a user with the
>> # replication privilege.
>> local   replication     all                                     md5
>> host    replication     all             127.0.0.1/32            md5
>> host    replication     all             ::1/128                 md5
>> host    all             all             0.0.0.0/0               md5
>> local   all             postgres                                trust
>> host    all             postgres        0.0.0.0/0               reject
>> But it continues to allow for Postgres to connect from anywhere through
>> PGAdmin but also as a direct connection to port 5432. I also relaunched the
>> server. This is version 12.
>> What else do I have to do?
>> Thanks for any help.
>
> See:
> https://www.postgresql.org/docs/13/auth-pg-hba-conf.html
>
> In particular:
>
> "Each record specifies a connection type, a client IP
> address range (if relevant for the connection type),
> a database name, a user name, and the authentication
> method to be used for connections matching these
> parameters. The first record with a matching
> connection type, client address, requested database,
> and user name is used to perform authentication."
>
> So your reject line is never being reached.
>
> HTH,
>
> Joe
>
> --
> Crunchy Data - http://crunchydata.com
> PostgreSQL Support for Secure Enterprises
> Consulting, Training, & Open Source Development
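
The first-match behaviour Joe quotes can be illustrated with a small simulation. This is only a sketch of the documented rule, not PostgreSQL's actual matching code: the function name first_match, the tuple layout, and the reordered rule list are assumptions made for illustration, the replication lines are left out, and only the keyword "all" is modelled for database and user.

```python
import ipaddress

# Each rule: (type, database, user, address, method), in file order.
# Simplified copy of the posted pg_hba.conf (replication lines omitted).
ORIGINAL_RULES = [
    ("local", "all", "all", None, "md5"),
    ("host", "all", "all", "127.0.0.1/32", "md5"),
    ("host", "all", "all", "::1/128", "md5"),
    ("host", "all", "all", "0.0.0.0/0", "md5"),
    ("local", "all", "postgres", None, "trust"),
    ("host", "all", "postgres", "0.0.0.0/0", "reject"),
]

# One possible reordering (an assumption, not taken from the thread):
# the postgres-specific lines now precede the catch-all lines they compete with.
REORDERED_RULES = [
    ("local", "all", "postgres", None, "trust"),
    ("local", "all", "all", None, "md5"),
    ("host", "all", "all", "127.0.0.1/32", "md5"),
    ("host", "all", "all", "::1/128", "md5"),
    ("host", "all", "postgres", "0.0.0.0/0", "reject"),
    ("host", "all", "all", "0.0.0.0/0", "md5"),
]


def first_match(rules, conn_type, database, user, client_ip=None):
    """Return the method of the first rule matching the connection, else None."""
    for rule_type, rule_db, rule_user, rule_addr, method in rules:
        if rule_type != conn_type:
            continue
        if rule_db not in ("all", database):
            continue
        if rule_user not in ("all", user):
            continue
        if rule_type == "host":
            network = ipaddress.ip_network(rule_addr)
            address = ipaddress.ip_address(client_ip)
            # An IPv4 rule never matches an IPv6 client and vice versa.
            if address.version != network.version or address not in network:
                continue
        return method  # first matching record wins; later records are ignored
    return None  # no record matched


remote = ("host", "mydb", "postgres", "203.0.113.7")
print("original :", first_match(ORIGINAL_RULES, *remote))
print("reordered:", first_match(REORDERED_RULES, *remote))
```

With the original order, the remote postgres connection is caught by the earlier "host all all 0.0.0.0/0 md5" line, so the reject line is never consulted. In the reordered list the reject line comes first for remote clients, while the 127.0.0.1 and ::1 lines above it still let postgres in locally.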
Re: Is replacing transactions with CTE a good idea?
On Sun, Apr 4, 2021 at 10:02:20AM -0400, Dave Cramer wrote:
> On Sun, 4 Apr 2021 at 09:12, Bruce Momjian wrote:
> > OK, that makes sense, but I think it is wrong minded to think that this
> > absolves one of taking isolation into account.
> >
> > When you make the first read you will still have to deal with all of the
> > isolation issues
>
> I have no idea what you are saying above. Why is a SELECT-only CTE not
> the same as a repeatable-read SELECT-only multi-statement transaction?
> Are you saying that a SELECT in a CTE doesn't do SELECT FOR UPDATE?
>
>
> No, but where is this documented ?

Well, every query runs with a single snapshot, even WITH queries. We do document how non-SELECT WITH visibility is handled:

https://www.postgresql.org/docs/13/sql-select.html

    The primary query and the WITH queries are all (notionally) executed at
    the same time. This implies that the effects of a data-modifying
    statement in WITH cannot be seen from other parts of the query, other
    than by reading its RETURNING output. If two such data-modifying
    statements attempt to modify the same row, the results are unspecified.

    A key property of WITH queries is that they are normally evaluated only
    once per execution of the primary query, even if the primary query
    refers to them more than once. In particular, data-modifying statements
    are guaranteed to be executed once and only once, regardless of whether
    the primary query reads all or any of their output.

--
Bruce Momjian https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.
Re: Is replacing transactions with CTE a good idea?
On Mon, 5 Apr 2021 at 14:18, Bruce Momjian wrote:
> On Sun, Apr 4, 2021 at 10:02:20AM -0400, Dave Cramer wrote:
> > On Sun, 4 Apr 2021 at 09:12, Bruce Momjian wrote:
> > > OK, that makes sense, but I think it is wrong minded to think that this
> > > absolves one of taking isolation into account.
> > >
> > > When you make the first read you will still have to deal with all of the
> > > isolation issues
> >
> > I have no idea what you are saying above. Why is a SELECT-only CTE not
> > the same as a repeatable-read SELECT-only multi-statement transaction?
> > Are you saying that a SELECT in a CTE doesn't do SELECT FOR UPDATE?
> >
> >
> > No, but where is this documented ?
>
> Well, every query runs with a single snapshot, even WITH queries. We do
> document how non-SELECT WITH visibility is handled:
>
> https://www.postgresql.org/docs/13/sql-select.html
>
>     The primary query and the WITH queries are all (notionally) executed at
>     the same time. This implies that the effects of a data-modifying
>     statement in WITH cannot be seen from other parts of the query, other
>     than by reading its RETURNING output. If two such data-modifying
>     statements attempt to modify the same row, the results are unspecified.
>
>     A key property of WITH queries is that they are normally evaluated only
>     once per execution of the primary query, even if the primary query
>     refers to them more than once. In particular, data-modifying statements
>     are guaranteed to be executed once and only once, regardless of whether
>     the primary query reads all or any of their output.
>
>

I think we are in agreement. My point was that WITH queries don't change the isolation semantics. I was pretty sure we didn't do a SELECT FOR UPDATE which would imply a lock.

Dave Cramer
www.postgres.rocks
Re: Is replacing transactions with CTE a good idea?
On Mon, Apr 5, 2021 at 02:32:36PM -0400, Dave Cramer wrote:
> On Mon, 5 Apr 2021 at 14:18, Bruce Momjian wrote:
> I think we are in agreement. My point was that WITH queries don't change the
> isolation semantics.

My point is that when you combine individual queries in a single WITH query, those queries run together with snapshot behavior as if they were in a repeatable-read multi-statement transaction.

--
Bruce Momjian https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.
Re: Primary keys and composite unique keys(basic question)
On Thu, Apr 1, 2021 at 10:26 PM Rob Sargent wrote:
>
> On 4/1/21 8:28 PM, Merlin Moncure wrote:
> >
> > This is one of the great debates in computer science and it is not
> > settled. There are various tradeoffs around using a composite key
> > derived from the data (aka natural key) vs generated identifiers. It's
> > a complex topic with many facets: performance, organization,
> > validation, and correctness are all relevant considerations. I would
> > never use UUIDS for keys though.
> >
> > merlin
> >
> >
> And, pray tell, for what exactly would you use universally unique
> identifiers.

I don't disagree that UUIDs are an ok choice in that scenario. I'll tell you what though, that scenario comes up fairly rarely. However, there are a couple of alternatives if you're curious.

*) Generate ids from a generator service. This pattern is fairly common. It has some downsides (slower, more complicated inserts mainly) but works well in other ways. You can mitigate the performance downsides by allocating identifiers in blocks.

*) Use sequences, but with a sequence id added as a composite or masked into the integer. This works pretty well in practice.

merlin
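
The two alternatives Merlin describes can be sketched roughly as follows. Everything named here (IdService, BlockClient, compose_id, split_id, the 10-bit NODE_BITS width) is hypothetical and chosen only for illustration. A real generator service would sit behind a network call and persist its counter, which this in-process sketch does not attempt.

```python
import threading

NODE_BITS = 10  # assumption: low 10 bits of each id identify the generating sequence


class IdService:
    """Stand-in for a central id generator that hands out whole blocks."""

    def __init__(self, block_size=1000):
        self._block_size = block_size
        self._next_start = 1
        self._lock = threading.Lock()
        self.round_trips = 0  # how often a client had to ask the service

    def allocate_block(self):
        with self._lock:
            start = self._next_start
            self._next_start += self._block_size
            self.round_trips += 1
        return range(start, start + self._block_size)


class BlockClient:
    """Takes ids from a locally held block and refills it when exhausted."""

    def __init__(self, service):
        self._service = service
        self._block = iter(())

    def next_id(self):
        for value in self._block:
            return value
        # Local block is used up: one trip to the service buys a new block.
        self._block = iter(self._service.allocate_block())
        return next(self._block)


def compose_id(node_id, local_seq):
    """Mask a sequence/node id into the low bits of a locally generated number."""
    if not 0 <= node_id < (1 << NODE_BITS):
        raise ValueError("node_id does not fit in NODE_BITS")
    return (local_seq << NODE_BITS) | node_id


def split_id(value):
    """Recover (node_id, local_seq) from a composed id."""
    return value & ((1 << NODE_BITS) - 1), value >> NODE_BITS


service = IdService(block_size=3)
client = BlockClient(service)
print([client.next_id() for _ in range(7)], "round trips:", service.round_trips)
print(compose_id(5, 42), split_id(compose_id(5, 42)))
```

Block allocation trades gaps in the id space (unused ids in a block are lost if a client stops) for fewer trips to the service. The masked variant needs no coordination at insert time, provided each generator is assigned a distinct node id up front.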
Re: Debugging leaking memory in Postgresql 13.2/Postgis 3.1
Stephan Knauss writes:
> On 31.03.2021 20:24, Tom Lane wrote:
>> Based on nearby threads, it occurs to me to ask whether you have JIT
>> enabled, and if so whether turning it off helps. There seems to be
>> a known leak of the code fragments generated by that in some cases.

> That's it!

> You mentioned that this seems to be known. Do you have pointers to the
> relevant bug-tracker/thread? I would like to follow up on this.

According to the v14 open issues page [1], there are a couple of older threads. I just added this one.

regards, tom lane

[1] https://wiki.postgresql.org/wiki/PostgreSQL_14_Open_Items
Re: Primary keys and composite unique keys(basic question)
On Fri, Apr 2, 2021 at 3:40 AM Laurenz Albe wrote:
>
> On Thu, 2021-04-01 at 21:28 -0500, Merlin Moncure wrote:
> > I would never use UUIDS for keys though.
>
> That makes me curious for your reasons.
>
> I see the following disadvantages:
>
> - A UUID requires twice as much storage space as a bigint.
>
> - B-tree indexes are space optimized for inserting at the
>   rightmost leaf page, but UUIDs are random.
>
> - UUIDs are more expensive to generate.
>
> On the other hand, many processes trying to insert into
> the same index page might lead to contention.
>
> Is there anything I have missed?

It's a small thing, but UUIDs are absolutely not memorizable by humans; they have zero semantic value. Sequential numeric identifiers are generally easier to transpose and the value gives some clues to its age (of course, in security contexts this can be a downside).

Performance-wise, UUIDS are absolutely horrible for data at scale as Tom rightly points out. Everything is randomized, just awful. There are some alternate implementations of UUID that mitigate this but I've never seen them used in the wild in actual code.

merlin
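
Two of the points above, the storage size and the random insert position, can be checked with a toy model. A sorted Python list stands in for the ordered key space of an index here; it is not a B-tree and says nothing about PostgreSQL's page layout. The names insert_positions and right_edge_share are made up for this sketch, and the random keys are seeded so the run is repeatable.

```python
import bisect
import random
import struct
import uuid


def insert_positions(keys):
    """For each key, return where it lands in the sorted order (1.0 = right edge)."""
    ordered = []
    positions = []
    for key in keys:
        index = bisect.bisect_left(ordered, key)
        positions.append(index / len(ordered) if ordered else 1.0)
        ordered.insert(index, key)
    return positions


def right_edge_share(keys):
    """Fraction of inserts that went to the rightmost position."""
    positions = insert_positions(keys)
    return sum(1 for p in positions if p == 1.0) / len(positions)


rng = random.Random(42)  # seeded so the illustration is repeatable
serial_keys = list(range(1, 1001))
random_uuids = [uuid.UUID(int=rng.getrandbits(128), version=4) for _ in range(1000)]

UUID_BYTES = len(random_uuids[0].bytes)  # raw size of a UUID value
BIGINT_BYTES = struct.calcsize("<q")     # 64-bit signed integer

print("bytes per key: uuid", UUID_BYTES, "bigint", BIGINT_BYTES)
print("share of right-edge inserts, serial:", right_edge_share(serial_keys))
print("share of right-edge inserts, uuid  :", right_edge_share(random_uuids))
```

Every serial key lands at the right edge, which is the case the rightmost-leaf optimization targets, while the random UUIDs land almost everywhere else.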
Re: Primary keys and composite unique keys(basic question)
>
> It's a small thing, but UUIDs are absolutely not memorizable by
> humans; they have zero semantic value. Sequential numeric identifiers
> are generally easier to transpose and the value gives some clues to
> its age (of course, in security contexts this can be a downside).
>

I take the above as a definite plus. Spent too much of my life correcting others’ use of “remembered” id’s that just happened to perfectly match the wrong thing.

> Performance-wise, UUIDS are absolutely horrible for data at scale as
> Tom rightly points out. Everything is randomized, just awful. There
> are some alternate implementations of UUID that mitigate this but I've
> never seen them used in the wild in actual code.
>

That b-trees have been optimized to handle serial ints might be considered a reaction to that popular (and distasteful) choice. Perhaps there should be a ‘non-optimized’ option.
Re: Primary keys and composite unique keys(basic question)
On Mon, Apr 5, 2021 at 9:37 PM Rob Sargent wrote:
>
> It's a small thing, but UUIDs are absolutely not memorizable by
> humans; they have zero semantic value. Sequential numeric identifiers
> are generally easier to transpose and the value gives some clues to
> its age (of course, in security contexts this can be a downside).
>
> I take the above as a definite plus. Spent too much of my life correcting
> others’ use of “remembered” id’s that just happened to perfectly match the
> wrong thing.
>
> Performance-wise, UUIDS are absolutely horrible for data at scale as
> Tom rightly points out. Everything is randomized, just awful. There
> are some alternate implementations of UUID that mitigate this but I've
> never seen them used in the wild in actual code.
>
>
> That b-trees have been optimized to handle serial ints might be considered
> a reaction to that popular (and distasteful) choice. Perhaps there should be
> a ‘non-optimized’ option.

It's not just the BTree, but the heap as well. For large tables, you are pretty much guaranteed to read a page for each record you want to load via the key regardless of the pattern of access. It's incredibly wasteful regardless of the speed of the underlying storage fabric. Very few developers actually understand this. If computers were infinitely fast this wouldn't matter, but they aren't :-).

merlin
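
The heap effect Merlin mentions can be illustrated with a deliberately crude model. It assumes rows are placed in the heap in insertion order, ROWS_PER_PAGE rows per page, and that the access pattern reads a run of neighbouring keys in key order. It ignores index pages, caching, updates, and vacuum, so the numbers only show the direction of the effect. The function name and all constants are hypothetical.

```python
import random

ROWS_PER_PAGE = 100  # assumption: rows that fit on one heap page
TABLE_ROWS = 100_000
RANGE_SIZE = 200     # how many neighbouring keys one lookup batch reads


def heap_pages_for_key_range(keys_in_insert_order, count):
    """Count distinct heap pages holding the `count` smallest keys."""
    # Heap position is taken to be the insertion order of the row.
    heap_page = {
        key: position // ROWS_PER_PAGE
        for position, key in enumerate(keys_in_insert_order)
    }
    wanted = sorted(keys_in_insert_order)[:count]
    return len({heap_page[key] for key in wanted})


rng = random.Random(7)  # seeded so the illustration is repeatable
serial_keys = list(range(TABLE_ROWS))
random_keys = [rng.getrandbits(128) for _ in range(TABLE_ROWS)]  # UUID-like values

serial_pages = heap_pages_for_key_range(serial_keys, RANGE_SIZE)
random_pages = heap_pages_for_key_range(random_keys, RANGE_SIZE)
print("heap pages read, serial keys:", serial_pages)
print("heap pages read, random keys:", random_pages)
```

Under these assumptions neighbouring serial keys share heap pages, while neighbouring random keys are spread across the table, so nearly every fetched row costs a separate page read.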