Re: CTE, lateral or jsonb_object_agg ?
On Sat, May 20, 2023 at 4:43 PM Marcos Pegoraro wrote:
> I have a table like pg_settings, so records have name and value.

Hi. Maybe I'm missing something, but why aren't you simply doing:

  select name, varvalue from sys_var where name = any($1)

and binding your 4 (in your examples) or 10 or 1 name(s) as a text array (i.e. text[])? With a prepared statement you plan only once.

You can also avoid selecting names by unnest'ing the text array with ordinality and joining + sorting on it, to get the values in name order (i.e. the same order as the $1 you control). See my recent thread where I learned about that technique.

Generating SQL text dynamically when you can avoid it with proper binding and appropriate SQL is leaving performance on the table IMHO.

--DD
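For illustration, a minimal sketch of the technique described above, assuming a sys_var table with text columns name and varvalue (in real code $1 would be bound by the driver; PREPARE/EXECUTE are used here only to make the sketch runnable in psql):

  PREPARE get_vars(text[]) AS
  SELECT v.varvalue
  FROM unnest($1) WITH ORDINALITY AS k(name, ord)  -- keeps the caller's ordering
  JOIN sys_var v ON v.name = k.name
  ORDER BY k.ord;

  EXECUTE get_vars('{/Geral/BancoPadrao,/Geral/BancoMatricula,/Geral/BancoParcela,/Geral/BancoMaterial}'::text[]);

The inner join silently drops names that are missing from sys_var; a LEFT JOIN would return NULLs for them instead.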
Re: DBeaver postgres localhost access
On 5/20/23 09:09, Pedro Gonçalves wrote:
> Hi. Good afternoon.
> I'm having difficulties with the localhost DBeaver postgres training account. I had access to it and changed the password, which I presently don't remember. What can I do to get access again? DBeaver told me to address this request to PostgreSQL.

Get your DBA to change the password for you.

--
Born in Arizona, moved to Babylonia.
Re: Unexpected cross-database vacuum impact with hot_standby_feedback=on
On Fri, May 19, 2023 at 9:50 PM Laurenz Albe wrote:
> Yes, that's what I would expect. There is only one "backend_xmin" in "pg_stat_replication", which corresponds to the snapshot held by the oldest query in any database on the standby server.

Thanks for the pointer to pg_stat_replication. In my situation there aren't any queries against the standby server; both queries are running against the primary, yet backend_xmin does contain the open transaction's txid_current() value. Does the standby somehow report the txid of the open transaction on the primary back to the primary, as if it were running on the standby? Is it an unavoidable limitation of the standby feedback mechanism that xmin is not tracked per database? It was certainly a surprise to me to find that hot_standby_feedback can trigger cross-database dependencies like this.

Thanks, Owen.
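For reference, a quick way to see the xmin that standby feedback is holding back (run on the primary; pg_stat_replication has one row per connected standby):

  SELECT application_name, client_addr, backend_xmin
  FROM pg_stat_replication;

Comparing that value against backend_xid / backend_xmin in pg_stat_activity for sessions on the primary shows whether it is a primary-side transaction being reflected back through the feedback mechanism.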
Fwd: PGCon remote attendance
Hi pgsql-general,

Will there be any option to attend the upcoming PG conference remotely? If not, will the talks and papers be posted online following the conference?

Thank you,
Joe Hammerman
Re: Database size will be Huge difference while Restoring
On 5/22/23 04:38, Rajmohan Masa wrote:
> Hi Adrian,
> I found one thing in my base directory. Generally there is one file per OID in the base directory, but on my machine I found some object ids with a sequence like 121193051, 121193051.1 ... 121193051.1200, each file having the same size (1GB) and the same timestamp. Using the pg_class catalog I tried to match those OIDs to object names, but I'm unable to find any objects with those ids. I applied VACUUM and VACUUM FULL on that particular database but the size of these objects is still the same.
>
> My questions are:
> 1. Why are there object ids (OID) like this (121193051, 121193051.1, 121193051.2, 121193051.3, 121193051.4 ... 121193051.1200), all with the same size?
> 2. Why do these oids not have any corresponding objects?
> 3. I think maybe someone deleted/dropped these objects, but why do they still occupy the same space even after VACUUM / VACUUM FULL?

It is all explained here:

https://www.postgresql.org/docs/current/storage-file-layout.html

In particular:

"Caution: Note that while a table's filenode often matches its OID, this is not necessarily the case; some operations, like TRUNCATE, REINDEX, CLUSTER and some forms of ALTER TABLE, can change the filenode while preserving the OID. Avoid assuming that filenode and table OID are the same. Also, for certain system catalogs including pg_class itself, pg_class.relfilenode contains zero. The actual filenode number of these catalogs is stored in a lower-level data structure, and can be obtained using the pg_relation_filenode() function."

> Here I'm attaching a document, please check it for your reference.
> Thanks & Regards,
> Rajamohan Masa

--
Adrian Klaver
adrian.kla...@aklaver.com
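For what it's worth, those numeric file names under base/<database-oid>/ are relation filenodes, and any relation larger than 1GB is split into 1GB segments named filenode.1, filenode.2, and so on, which is why the sizes are identical. A sketch of how to map such a file name back to a relation, run while connected to the database that owns that directory (121193051 is the filenode from the example above; some_table is a hypothetical table name):

  -- which relation, if any, does filenode 121193051 currently belong to?
  -- 0 means "the database's default tablespace"
  SELECT pg_filenode_relation(0, 121193051);

  -- the reverse: the current filenode of a given table
  SELECT pg_relation_filenode('some_table'::regclass);

If pg_filenode_relation() returns NULL, the file is no longer attached to any live relation, and VACUUM will not touch it.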
Re: DBeaver postgres localhost access
On 5/20/23 07:09, Pedro Gonçalves wrote:
> Hi. Good afternoon.
> I'm having difficulties with the localhost DBeaver postgres training account. I had access to it and changed the password, which I presently don't remember. What can I do to get access again? DBeaver told me to address this request to PostgreSQL.

1) Find the pg_hba.conf file for the server and change the auth method to trust for local connections, where local means socket connection. This is shown in the example here:

https://www.postgresql.org/docs/current/auth-pg-hba-conf.html
Example 21.1. Example pg_hba.conf Entries.

2) Reload/restart the server.

3) Using psql, connect with:

psql -d <database> -U <user> -p <port>

Once you are connected you can use:

ALTER USER <user> WITH PASSWORD 'some_pwd';

This assumes the <user> you connect as has sufficient privileges to do the ALTER. One way to ensure that is to use the postgres user.

4) Once you have a new password set and shown to work, decide whether you want local to remain set at trust or change it to something more secure. If you change it then you will need to repeat 2).

> Kind Regards,
> Pedro Gonçalves

--
Adrian Klaver
adrian.kla...@aklaver.com
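For reference, a minimal sketch of what step 1) might look like, assuming a stock single-instance install (the exact lines in an existing pg_hba.conf will differ and should be restored or tightened again in step 4):

  # TYPE  DATABASE  USER  ADDRESS        METHOD
  local   all       all                  trust
  host    all       all   127.0.0.1/32   trust

Then reload (pg_ctl reload, or restart the service), connect with psql as the postgres user, and run the ALTER USER shown above.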
Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?
Thanks all for the discussions. It sounds like there are different questions to clear up before we can reach a conclusion on whether a per-database KEK is possible or not.

First question - do we, as a community, see the value of the proposal, and do we believe that value is big enough for us to make any necessary changes to PostgreSQL? Another way to ask this: do we want/need to keep parity with other database solutions that already have such an ability?

If the answer to the first question is no, then I will stop here. However, if the answer is yes or "it depends", then the second question is how - it seems that a per-cluster KEK can be a good alternative to achieve the same as a per-db KEK; however, there are still some shared areas between clusters. Is it possible to further split this shared area? Is there more shared memory/storage between clusters?

> In the proposed TDE work, yes, each cluster (which is an entire PostgreSQL system) would be able to have its own KEK.
>
> There's a bit of overhead from each cluster and each would have their own shared buffers pool of memory and such.

Thanks
Tony
Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?
> On May 22, 2023, at 11:02, Tony Xu wrote:
> there are still some shared area between clusters.

That's not quite right. A PostgreSQL cluster (in the traditional sense, which means one PostgreSQL server handling a particular endpoint) is isolated from any other clusters on the same machine. If the individual clusters have their own underlying volumes, those can be encrypted independently of any other cluster. This is not quite as elegant as each database in a cluster having its own encryption key, but it does have the advantage that it can be deployed right now.

A single cluster does have shared areas between databases, and those are not trivial: the global system catalogs, the write-ahead log, and the commit log, just for a start. The global system catalogs include users and roles. The effort required to split those up is very significant.

And, to be somewhat blunt, is Rubrik prepared to pay for it? This engineering effort needs to be funded by someone, since engineers have to eat. This is not to say that any feature someone is willing to pay for will make it into PostgreSQL, but an effort of this size, to have any hope of reaching the point of being accepted, will need someone to pay the salaries of the people working on it.

Best,
--
Christophe
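As a concrete illustration of those cluster-wide pieces, this query lists the catalogs that are shared by every database in a cluster (pg_class.relisshared marks them; it can be run from any database):

  SELECT relname
  FROM pg_class
  WHERE relisshared AND relkind = 'r'   -- shared regular tables only
  ORDER BY relname;

Among other things it returns pg_authid (users/roles), pg_database and pg_tablespace, which is exactly the material that would have to be untangled for a true per-database key.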
Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?
Thanks Christophe for the clarification.

> That's not quite right. A PostgreSQL cluster (in the traditional sense, which means one PostgreSQL server handling a particular endpoint) is isolated from any other clusters on the same machine.

Thanks. I think I had a misunderstanding over the term "cluster" earlier. So a PostgreSQL cluster is the PostgreSQL server that we're talking about in general. This means a per-DB KEK within a cluster/server is impossible without splitting the whole shared storage among databases, which is indeed a significant amount of work.

> And, to be somewhat blunt, is Rubrik prepared to pay for it? This engineering effort needs to be funded by someone, since engineers have to eat. This is not to say that any feature someone is willing to pay for will make it into PostgreSQL, but an effort of this size, to have any hope of reaching the point of being accepted, will need someone to pay the salaries of the people working on it.

I cannot speak on behalf of Rubrik - I'm just exploring PostgreSQL as a solution for one of my team's projects, thus the questions. Nothing is finalized yet, but we prefer a managed solution provided by a cloud infra provider, so the context I'm getting here will allow me to better understand our provider's solution and their road map.

Thanks
Tony
Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?
On 5/22/23 14:22, Tony Xu wrote:
> Thanks Christophe for the clarification.
>
>> That's not quite right. A PostgreSQL cluster (in the traditional sense, which means one PostgreSQL server handling a particular endpoint) is isolated from any other clusters on the same machine.
>
> Thanks. I think I had a misunderstanding over the "cluster" earlier. So a PostgreSQL cluster is a PostgreSQL server that we're talking about in general. This means a per-DB KEK within a cluster/server is impossible without splitting the whole shared storage among databases, which is indeed a significant amount of work.

"Cluster" is to Postgres as "Instance" is to SQL Server.

>> And, to be somewhat blunt, is Rubrik prepared to pay for it? This engineering effort needs to be funded by someone, since engineers have to eat. This is not to say that any feature someone is willing to pay for will make it into PostgreSQL, but an effort of this size, to have any hope of reaching the point of being accepted, will need someone to pay the salaries of the people working on it.
>
> I cannot speak on behalf of Rubrik - I'm just exploring PostgreSQL as a solution for one of my team's projects, thus the questions. Nothing is finalized yet, but we prefer a managed solution provided by some cloud infra providers, so the context I'm getting here will allow me to better understand our provider's solution and their road map.

RDS Postgresql would do the job just fine. And since you can't get to the files (you only access it via port 5432 and the aws cli/web), there's no need for TDE. Azure is probably the same.

--
Born in Arizona, moved to Babylonia.
Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?
(please read http://deb.li/quote and don't top-post)

On Mon, 22 May 2023, Tony Xu wrote:
> First question - do we, as a community, see the value of the proposal and do we believe that value is big enough for us to make any necessary changes

I'd rather like to see the energy, if there's some invested in the first place, pointed towards creating something like a "super postmaster" that helps multiple clusters running on the same machine (even different versions, starting at the first to support such) manage the entire-machine resources (CPUs, RAM, ...) better. That has use cases beyond encryption (incidentally, multi-tenant, but perhaps also keeping an older cluster around for stuff that cannot yet be dangerlessly migrated to a newer PostgreSQL version).

Just my 2¢, uninformed wondering.

bye,
//mirabilos
--
15:41⎜ Somebody write a testsuite for helloworld :-)
Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?
On 5/22/23 12:38, Ron wrote:
> On 5/22/23 14:22, Tony Xu wrote:
>
> RDS Postgresql would do the job just fine. And since you can't get to the files (you only access it via port 5432 and the aws cli/web), there's no need for TDE.

As I understand TDE, whether you can get to the files is not really the point. It is that someone/something can, and if they do, the files are encrypted. Pretty sure RDS is not magical enough to have no access from any source to the file system.

> Azure is probably the same.
>
> --
> Born in Arizona, moved to Babylonia.

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?
> On May 22, 2023, at 13:06, Adrian Klaver wrote:
>
> As I understand TDE whether you can get to the files is not really the point. It is that someone/thing can and if they do the files are encrypted. Pretty sure RDS is not magical enough to have no access from any source to the file system.

That is true. (One of the reasons that TDE in cloud hosting is generally a regulatory issue, not really a technical one.) That being said, RDS does allow the underlying EBS volume to be encrypted, and you can do "bring your own key" using their keystores.
Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?
On 5/22/23 15:06, Adrian Klaver wrote:
> On 5/22/23 12:38, Ron wrote:
>> On 5/22/23 14:22, Tony Xu wrote:
>>
>> RDS Postgresql would do the job just fine. And since you can't get to the files (only access it via port 5432 and aws cli/web), there's no need for TDE.
>
> As I understand TDE whether you can get to the files is not really the point. It is that someone/thing can and if they do the files are encrypted. Pretty sure RDS is not magical enough to have no access from any source to the file system.

They sure hide the raw files well, only allowing access from very restricted paths (port 5432 and the aws cli/web). Or maybe we just locked it down tight.

--
Born in Arizona, moved to Babylonia.
Re: CTE, lateral or jsonb_object_agg ?
On Sat, May 20, 2023 at 9:43 AM Marcos Pegoraro wrote:
> I have a table like pg_settings, so records have name and value.
> This select is really fast, just 0.1 or 0.2 ms, but it runs millions of times a day, so ...
>
> Then all the time I have to select up to 10 of these records but the result has to be a single record. So it can be done with ...
>
> --Using CTE
> with
>   BancoPadrao as (select varvalue from sys_var where name = $$/Geral/BancoPadrao$$),
>   BancoMatricula as (select varvalue from sys_var where name = $$/Geral/BancoMatricula$$),
>   BancoParcela as (select varvalue from sys_var where name = $$/Geral/BancoParcela$$),
>   BancoMaterial as (select varvalue from sys_var where name = $$/Geral/BancoMaterial$$)
> select (select * from BancoPadrao) BancoPadrao,
>        (select * from BancoMatricula) BancoMatricula,
>        (select * from BancoParcela) BancoParcela,
>        (select * from BancoMaterial) BancoMaterial;

Try the classic method:

select
  max(varvalue) filter (where name = $$/Geral/BancoPadrao$$) as BancoPadrao,
  max(varvalue) filter (where name = $$/Geral/BancoMatricula$$) as BancoMatricula,
  max(varvalue) filter (where name = $$/Geral/BancoParcela$$) as BancoParcela,
  max(varvalue) filter (where name = $$/Geral/BancoMaterial$$) as BancoMaterial
from sys_var
where name = any('{/Geral/BancoPadrao,/Geral/BancoMatricula,/Geral/BancoParcela,/Geral/BancoMaterial}'::text[]);

Regardless, these timings are mostly below what I would consider to be the noise threshold; the actual query runtime is not much compared to the work the server has to spend setting up the query. If you want real benefits for this kind of case, consider things like prepared statements (PREPARE/EXECUTE) and/or application caching. You can also consider organizing "sys_var" into a record instead of a key value store.

merlin
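For illustration, a minimal sketch that combines the two suggestions (a prepared statement plus the filter aggregates), assuming sys_var has text columns name and varvalue; the statement is planned once per session and then executed with just the names:

  PREPARE banco_vars(text, text, text, text) AS
  SELECT max(varvalue) FILTER (WHERE name = $1) AS BancoPadrao,
         max(varvalue) FILTER (WHERE name = $2) AS BancoMatricula,
         max(varvalue) FILTER (WHERE name = $3) AS BancoParcela,
         max(varvalue) FILTER (WHERE name = $4) AS BancoMaterial
  FROM sys_var
  WHERE name IN ($1, $2, $3, $4);

  EXECUTE banco_vars('/Geral/BancoPadrao', '/Geral/BancoMatricula', '/Geral/BancoParcela', '/Geral/BancoMaterial');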
15 pg_upgrade with -j
Hello!

We are moving from 10 to 15 and are in testing now. Our development database is about 1400G and takes 12 minutes to complete a pg_upgrade with the -k (hard-links) option. This is on a CentOS 7 server with 80 cores.

Adding -j 40 to use half of those cores also finishes in 12 minutes, and ps / top / htop never show more than a single process at a time in use. Bumping that to -j 80 to use them all also finishes in 12 minutes and still only a single process.

Running the suggested vacuum analyze after pg_upgrade completes takes about 19 minutes. Adding -j 40 takes that time down to around 5 minutes, jumps the server load up over 30, and htop shows 40 processes. If -j 40 helps there--why not with pg_upgrade?

The full commands we are using for pg_upgrade are pretty stock:

time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 40
time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 80

Our production database is closer to 1900G. If we're looking at a 30 minute pg_upgrade window we'll be okay, but if there is anything we can do to knock that time down we will, and any suggestions to do so would be greatly appreciated.

Jeff Ross
Re: 15 pg_upgrade with -j
On 5/22/23 16:20, Jeff Ross wrote:
> Hello!
>
> We are moving from 10 to 15 and are in testing now. Our development database is about 1400G and takes 12 minutes to complete a pg_upgrade with the -k (hard-links) option. This is on a CentOS 7 server with 80 cores.
>
> Adding -j 40 to use half of those cores also finishes in 12 minutes and ps / top / htop never show more than a single process at a time in use. Bumping that to -j 80 to use them all also finishes in 12 minutes and still only a single process.
>
> Running the suggested vacuum analyze after pg_upgrade completes takes about 19 minutes. Adding -j 40 takes that time down to around 5 minutes, jumps the server load up over 30 and htop shows 40 processes. If -j 40 helps there--why not with pg_upgrade?

From the docs: https://www.postgresql.org/docs/current/pgupgrade.html

"The --jobs option allows multiple CPU cores to be used for copying/linking of files and to dump and restore database schemas in parallel; a good place to start is the maximum of the number of CPU cores and tablespaces. This option can dramatically reduce the time to upgrade a multi-database server running on a multiprocessor machine."

So is the 1400G mostly in one database in the cluster?

> The full commands we are using for pg_upgrade are pretty stock:
>
> time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k
> time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 40
> time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 80
>
> Our production database is closer to 1900G. If we're looking at a 30 minute pg_upgrade window we'll be okay but if there is anything we can do to knock that time down we will and any suggestions to do so would be greatly appreciated.
>
> Jeff Ross

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: 15 pg_upgrade with -j
On 5/22/23 5:24 PM, Adrian Klaver wrote:
> On 5/22/23 16:20, Jeff Ross wrote:
>> Hello!
>>
>> We are moving from 10 to 15 and are in testing now. Our development database is about 1400G and takes 12 minutes to complete a pg_upgrade with the -k (hard-links) option. This is on a CentOS 7 server with 80 cores.
>>
>> Adding -j 40 to use half of those cores also finishes in 12 minutes and ps / top / htop never show more than a single process at a time in use. Bumping that to -j 80 to use them all also finishes in 12 minutes and still only a single process.
>>
>> Running the suggested vacuum analyze after pg_upgrade completes takes about 19 minutes. Adding -j 40 takes that time down to around 5 minutes, jumps the server load up over 30 and htop shows 40 processes. If -j 40 helps there--why not with pg_upgrade?
>
> From the docs: https://www.postgresql.org/docs/current/pgupgrade.html
>
> "The --jobs option allows multiple CPU cores to be used for copying/linking of files and to dump and restore database schemas in parallel; a good place to start is the maximum of the number of CPU cores and tablespaces. This option can dramatically reduce the time to upgrade a multi-database server running on a multiprocessor machine."
>
> So is the 1400G mostly in one database in the cluster?
>
>> The full commands we are using for pg_upgrade are pretty stock:
>>
>> time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k
>> time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 40
>> time /usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-10/bin/ -B /usr/pgsql-15/bin/ -d /var/lib/pgsql/10/data -D /var/lib/pgsql/15up -k -j 80
>>
>> Our production database is closer to 1900G. If we're looking at a 30 minute pg_upgrade window we'll be okay but if there is anything we can do to knock that time down we will and any suggestions to do so would be greatly appreciated.
>>
>> Jeff Ross

Yes, one big database with about 80 schemas and several other smaller databases, so -j should help, right?

Jeff
Re: 15 pg_upgrade with -j
Jeff Ross writes:
> On 5/22/23 5:24 PM, Adrian Klaver wrote:
>> So is the 1400G mostly in one database in the cluster?

> Yes, one big database with about 80 schemas and several other smaller databases so -j should help, right?

AFAICT from a quick look at the code, you won't get any meaningful parallelism unless you have several large DBs and/or several large tablespaces. It looks like the assumption was that issuing link() requests in parallel wouldn't help much but just swamp your disk if they're all on the same filesystem. Maybe that could use rethinking, not sure.

regards, tom lane
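For anyone checking whether their data is spread across enough databases or tablespaces for --jobs to matter, a quick sketch (run from any database in the cluster):

  -- size of each database
  SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size
  FROM pg_database
  ORDER BY pg_database_size(datname) DESC;

  -- size of each tablespace
  SELECT spcname, pg_size_pretty(pg_tablespace_size(spcname)) AS size
  FROM pg_tablespace;

If nearly everything lives in one database and one tablespace, the linking phase runs essentially single-threaded, which matches the behaviour described above.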
Re: 15 pg_upgrade with -j
On 5/22/23 16:29, Jeff Ross wrote:
> On 5/22/23 5:24 PM, Adrian Klaver wrote:
>> On 5/22/23 16:20, Jeff Ross wrote:
>>> Hello!
>>
>> From the docs: https://www.postgresql.org/docs/current/pgupgrade.html
>>
>> "The --jobs option allows multiple CPU cores to be used for copying/linking of files and to dump and restore database schemas in parallel; a good place to start is the maximum of the number of CPU cores and tablespaces. This option can dramatically reduce the time to upgrade a multi-database server running on a multiprocessor machine."
>>
>> So is the 1400G mostly in one database in the cluster?
>>
>>> The full commands we are using for pg_upgrade are pretty stock:
>
> Yes, one big database with about 80 schemas and several other smaller databases so -j should help, right?

As I understand it, no. The parallelism is between databases, not within a database. Further, 'database schemas' there refers to the schema as the overall database object definitions, not the namespaces known as schemas within a database.

> Jeff

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: PGCon remote attendance
> Hi pgsql-general,
>
> Will there be any option to attend the upcoming PG conference remotely?
>
> Will the talks and papers be posted online following the conference, if not?

You can contact the conference organizer via email.

https://www.pgcon.org/2023/contact.php

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
Re: 15 pg_upgrade with -j
On 5/22/23 18:42, Tom Lane wrote:
> Jeff Ross writes:
>> On 5/22/23 5:24 PM, Adrian Klaver wrote:
>>> So is the 1400G mostly in one database in the cluster?
>> Yes, one big database with about 80 schemas and several other smaller databases so -j should help, right?
>
> AFAICT from a quick look at the code, you won't get any meaningful parallelism unless you have several large DBs and/or several large tablespaces.

Hmm. I'm glad I'm reading this now.

> It looks like the assumption was that issuing link() requests in parallel wouldn't help much but just swamp your disk if they're all on the same filesystem. Maybe that could use rethinking, not sure.

It does need rethinking in the era of VMs and SANs. /var/lib/pgsql/15 is going to be on a different LUN from /var/lib/pgsql/9.6, just like /var/lib/pgsql/backups is on a different LUN.

--
Born in Arizona, moved to Babylonia.
How to speed up product code and subcode match
Hi!

The price list of main products, vordlusajuhinnak, contains 3 prices per product (column toode) and has 39433 products:

create table vordlusajuhinnak(
  toode varchar(60),
  n2 numeric(8,2),
  n3 numeric(8,2),
  n4 numeric(8,2)
);

The toode column is unique, may be a primary key in the table, and contains upper-case letters, digits and - characters.

The product table (toode) contains 733021 products:

CREATE TABLE toode (
  grupp character(1),
  toode character(60) primary key,
  ... lot of other columns
);

Both tables have pattern indexes to speed up queries:

CREATE INDEX toode_toode_pattern_idx ON toode (toode bpchar_pattern_ops ASC NULLS LAST);

-- This index is probably not used, should be removed:
CREATE INDEX vordlusajuhinnak_toode_pattern_idx ON vordlusajuhinnak(toode bpchar_pattern_ops);

The product table has both main products and subproducts with sizes. The size is always separated from the main product code by a / character:

SHOE1-BLACK
SHOE1-BLACK/38
SHOE1-BLACK/41
SHOE1-BLACK/42
SHOE1-BLACK/43
SHOE2/XXL
SHOE2/L

Product codes contain upper-case characters only in this table also.

Trying to get prices for all products using

create table peatoode as
select toode.toode, n2, n3, n4
from toode, vordlusajuhinnak
where toode.toode between vordlusajuhinnak.toode and vordlusajuhinnak.toode||'/z'

takes 4.65 hours. How to speed this query up?

Output from explain:

Nested Loop  (cost=0.55..272273178.12 rows=3202240012 width=78)
  ->  Seq Scan on vordlusajuhinnak  (cost=0.00..721.33 rows=39433 width=32)
  ->  Index Only Scan using toode_pkey on toode  (cost=0.55..6092.62 rows=81207 width=60)
        Index Cond: (toode >= (vordlusajuhinnak.toode)::bpchar)
        Filter: ((toode)::text <= ((vordlusajuhinnak.toode)::text || '/z'::text))

Using PostgreSQL 13.2, compiled by Visual C++ build 1900, 64-bit on a Windows server, with the psqlODBC driver. Upgrading Postgres is possible, if this helps.

Tried also using like:

WHERE toode.toode=vordlusajuhinnak.toode OR toode.toode LIKE vordlusajuhinnak.toode||'/%'

Posted also in https://stackoverflow.com/questions/76311957/how-to-match-character-columns-with-separated-subcolumns

Andrus.
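For what it's worth, one possible reformulation, sketched under the assumption that the text before the first '/' is always exactly the main product code (as the examples above suggest), is a plain equality join on the extracted base code instead of the range/LIKE conditions:

  create table peatoode as
  select t.toode, v.n2, v.n3, v.n4
  from toode t
  join vordlusajuhinnak v
    -- ::text strips the blank padding of character(60);
    -- split_part returns the whole string when there is no '/'
    on split_part(t.toode::text, '/', 1) = v.toode;

Because the join condition is a simple equality, the planner can use a hash join over a single pass of each table rather than a nested loop with a range condition per price row.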