Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?
Greetings,

There seems to be a bit of confusion here, so I'll try to clear it up.

* Tony Xu (tony...@rubrik.com) wrote:
> Thanks all for the discussions. It sounds like there are different
> questions to clear up before we can get to a conclusion on whether a
> per-database KEK is possible or not.

It's not possible in PG's design today to ensure that all user data is
segregated by database, due to the fact that there's a shared WAL which
includes data from all databases running under a given postmaster
(typically considered a 'cluster' or 'instance'). There are also shared
catalogs which aren't per-database to begin with (whether those count as
user data depends on your definition of user data... are users/roles
considered user data, or not?).

> First question: do we, as a community, see the value of the proposal,
> and do we believe that value is big enough for us to make any necessary
> changes to PostgreSQL? Another way to ask is: do we want/need to keep
> parity with other database solutions that already have such an ability?

TDE is clearly valuable to this community. Multiple organizations have
implemented it already, and there's an ongoing effort to bring TDE into
core, with patches already posted and many of these design decisions made.
Given the point made above, it seems very unlikely that the basics of the
design are going to change, at least not unless PG changes in some
fundamental way in how the WAL and the shared catalogs work (note: I'm not
aware of anyone currently working on that).

> If the answer to the first question is no, then I will stop here.
> However, if the answer is yes or "it depends", then the second question
> is how: it seems that a per-cluster KEK can be a good alternative to
> achieve the same as a per-db KEK; however, there are still some shared
> areas between clusters. Is it possible to further split this shared
> area? Is there more shared memory/storage between clusters?
There isn't anything shared between PG clusters, to be clear. A "cluster"
in PG is all of the data and processes running under one postmaster;
nothing is shared between two PG clusters.

Splitting up the WAL to be per-database has been contemplated before, but
there's no active work trying to achieve that, and there are a lot of
complications associated with trying to do it (such as what to do about
the shared catalogs...).

Thanks,

Stephen
Re: How to speed up product code and subcode match
Hi!

I ran:

analyze toode;
create index vordlusajuhinnak_toode_pattern_idx on vordlusajuhinnak(toode bpchar_pattern_ops);
create index vordlusajuhinnak_toode_idx on vordlusajuhinnak(toode);
analyze vordlusajuhinnak;

The select now runs for more than one hour. Output from explain for

explain create table peatoode as
select toode.toode, n2, n3, n4
from toode, vordlusajuhinnak
WHERE toode.toode = vordlusajuhinnak.toode
   OR toode.toode LIKE vordlusajuhinnak.toode || '/%'

is:

"Gather  (cost=1000.55..443361894.55 rows=143828910 width=78)"
"  Workers Planned: 2"
"  ->  Nested Loop  (cost=0.55..428978003.55 rows=59928712 width=78)"
"        Join Filter: ((toode.toode = (vordlusajuhinnak.toode)::bpchar) OR (toode.toode ~~ ((vordlusajuhinnak.toode)::text || '/%'::text)))"
"        ->  Parallel Index Only Scan using toode_pkey on toode  (cost=0.55..95017.93 rows=303869 width=60)"
"        ->  Seq Scan on vordlusajuhinnak  (cost=0.00..721.33 rows=39433 width=32)"

With

set enable_nestloop to off;

the explain output is:

"Gather  (cost=1001000.55..10443361906.55 rows=143828910 width=78)"
"  Workers Planned: 2"
"  ->  Nested Loop  (cost=100.55..10428978015.55 rows=59928712 width=78)"
"        Join Filter: ((toode.toode = (vordlusajuhinnak.toode)::bpchar) OR (toode.toode ~~ ((vordlusajuhinnak.toode)::text || '/%'::text)))"
"        ->  Parallel Index Only Scan using toode_pkey on toode  (cost=0.55..95029.93 rows=303869 width=60)"
"        ->  Seq Scan on vordlusajuhinnak  (cost=0.00..721.33 rows=39433 width=32)"

How to speed it up?

Andrus.

On 23.05.2023 14:32, Bzm@g wrote:
> Great. However, I think it is still way too slow. The next step is to
> run analyze for the other table, vordlusajuhinnak, as well. And make
> sure you have an index on vordlusajuhinnak.toode similar to the index
> on toode.toode.
>
> --
> Boris
>
> On 23.05.2023 at 12:56, Andrus wrote:
>> Hi!
Hi!

I ran analyze on firma2.toode and changed the where clause to use LIKE:

create table peatoode as
select toode.toode, n2, n3, n4
from toode, vordlusajuhinnak
WHERE toode.toode = vordlusajuhinnak.toode
   OR toode.toode LIKE vordlusajuhinnak.toode || '/%'

In this case it took 37 minutes and returned 277966 rows.

Thank you for the help.

Andrus.

On 23.05.2023 11:24, Bzm@g wrote:
> Also, your row count is way off, I guess. Did you ever run analyze on
> the big table?
>
> --
> Boris
>
> On 23.05.2023 at 10:22, ...@2bz.de wrote:
>> Hi there, I guess the main problem is the nested loop. As a quick
>> recheck, what happens if you run your query without the nested loop?
>> This is not a solution but a quick test. In a session:
>>
>> set enable_nestloop = off;
>> explain select <your query>;
>>
>> --
>> Boris
>>
>> On 23.05.2023 at 08:53, Andrus wrote:
>>> Hi!
>>>
>>> The price list of main products, vordlusajuhinnak, contains 3 prices
>>> per product (column toode) and has 39433 products:
>>>
>>> create table vordlusajuhinnak(
>>>     toode varchar(60),
>>>     n2 numeric(8,2),
>>>     n3 numeric(8,2),
>>>     n4 numeric(8,2)
>>> );
>>>
>>> The toode column is unique (it could be the primary key of the table)
>>> and contains upper case letters, digits and - characters.
>>>
>>> The product table (toode) contains 733021 products:
>>>
>>> CREATE TABLE toode (
>>>     grupp character(1),
>>>     toode character(60) primary key,
>>>     ... lot of other columns
>>> );
>>>
>>> Both tables have pattern indexes to speed up queries:
>>>
>>> CREATE INDEX toode_toode_pattern_idx ON toode (toode bpchar_pattern_ops ASC NULLS LAST);
>>>
>>> -- This index is probably not used and should be removed:
>>> CREATE INDEX vordlusajuhinnak_toode_pattern_idx ON vordlusajuhinnak(toode bpchar_pattern_ops);
>>>
>>> The product table contains both main products and subproducts with
>>> sizes. The size is always separated from the main product code by a
>>> / character:
>>>
>>> SHOE1-BLACK
>>> SHOE1-BLACK/38
>>> SHOE1-BLACK/41
>>> SHOE1-BLACK/42
>>> SHOE1-BLACK/43
>>> SHOE2/XXL
>>> SHOE2/L
>>>
>>> Product codes contain upper-case characters only in this table as
>>> well.
Trying to get prices for all products using

create table peatoode as
select toode.toode, n2, n3, n4
from toode, vordlusajuhinnak
where toode.toode between vordlusajuhinnak.toode and vordlusajuhinnak.toode || '/z'

takes 4.65 hours. How can this query be sped up?

Output from explain:

"Nested Loop  (cost=0.55..272273178.12 rows=3202240012 width=78)"
"  ->  Seq Scan on vordlusajuhinnak  (cost=0.00..721.33 rows=39433 width=32)"
"  ->  Index Only Scan using toode_pkey on toode  (cost=0.55..6092.62 rows=81207 width=60)"
"        Index Cond: (toode = (vordlusajuhinnak.toode)::bpchar)"
"        Filter: ((toode)::text <= ((vordlusajuhinnak.toode)::text || '/z'::text))"

Using PostgreSQL 13.2, compiled by Visual C++ build 1900, 64-bit, on a
Windows server, with the psqlODBC driver. Upgrading Postgres is possible
if that would help.

Also tried using LIKE:

WHERE toode.toode = vordlusajuhinnak.toode
   OR toode.toode LIKE vordlusajuhinnak.toode || '/%'

Also posted at
https://stackoverflow.com/questions/76311957/how-to-match-character-columns-with-separated-subcolumns

Andrus.
Re: How to speed up product code and subcode match
Andrus writes:
> The product table contains both main products and subproducts with
> sizes. The size is always separated from the main product code by a /
> character:
> SHOE1-BLACK
> SHOE1-BLACK/38
> SHOE1-BLACK/41
> SHOE1-BLACK/42
> SHOE1-BLACK/43
> SHOE2/XXL
> SHOE2/L

You could probably have devised a worse data representation if you really
tried, but it would have taken some effort. Separate the product code and
size into two columns --- if there's somebody who really wants to see them
in the above format, give them a view or generated column. Then, instead
of the impossible-to-optimize queries you showed, you could do something
like

select toode.toode, n2, n3, n4
from toode, vordlusajuhinnak
where toode.toode = vordlusajuhinnak.toode;

Anytime you're trying to join two tables on something that isn't a plain
equality condition (or ANDed equality conditions), you're in for a world
of hurt.

regards, tom lane
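[Editor's note: a minimal sketch of the restructuring Tom describes. The
split column names (baaskood, suurus) are invented for illustration, not
from the thread; stored generated columns require PostgreSQL 12 or later.]

```sql
-- Hypothetical restructured product table: base code and size stored
-- separately, with the old combined format exposed as a generated column.
CREATE TABLE toode (
    grupp    character(1),
    baaskood varchar(60) NOT NULL,            -- e.g. 'SHOE1-BLACK'
    suurus   varchar(20) NOT NULL DEFAULT '', -- e.g. '38'; '' for the main product
    toode    varchar(81) GENERATED ALWAYS AS
               (baaskood || CASE WHEN suurus = '' THEN ''
                                 ELSE '/' || suurus END) STORED,
    PRIMARY KEY (baaskood, suurus)
);

-- The price lookup then becomes a plain equality join that can use the
-- primary-key index:
SELECT t.toode, v.n2, v.n3, v.n4
FROM toode t
JOIN vordlusajuhinnak v ON t.baaskood = v.toode;
```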
Re: How to speed up product code and subcode match
Hi!

> Separate the product code and size into two columns --- if there's
> somebody who really wants to see them in the above format, give them a
> view or generated column. Then, instead of the impossible-to-optimize
> queries you showed, you could do something like
>
> select toode.toode, n2, n3, n4
> from toode, vordlusajuhinnak
> where toode.toode = vordlusajuhinnak.toode;

Can a function index,

create index on toode ( split_part(toode, '/', 1) );

and the query

select toode.toode, n2, n3, n4
from toode, vordlusajuhinnak
where split_part(toode.toode, '/', 1) = vordlusajuhinnak.toode;

be used while keeping the existing table structure? Should a functional
index produce the same speed improvement as using a separate column?

Andrus.
Re: 15 pg_upgrade with -j
On 5/22/23 5:42 PM, Tom Lane wrote:
> Jeff Ross writes:
>> On 5/22/23 5:24 PM, Adrian Klaver wrote:
>>> So is the 1400G mostly in one database in the cluster?
>> Yes, one big database with about 80 schemas and several other smaller
>> databases, so -j should help, right?
>
> AFAICT from a quick look at the code, you won't get any meaningful
> parallelism unless you have several large DBs and/or several large
> tablespaces. It looks like the assumption was that issuing link()
> requests in parallel wouldn't help much but just swamp your disk if
> they're all on the same filesystem. Maybe that could use rethinking,
> not sure.
>
> regards, tom lane

Thanks, Tom. These are all smokingly fast SSDs, so it would be interesting
to see how well they'd hold up under that load.

Jeff
Re: 15 pg_upgrade with -j
On 5/22/23 5:43 PM, Adrian Klaver wrote:
> From the docs: https://www.postgresql.org/docs/current/pgupgrade.html
>
>     The --jobs option allows multiple CPU cores to be used for
>     copying/linking of files and to dump and restore database schemas
>     in parallel; a good place to start is the maximum of the number of
>     CPU cores and tablespaces. This option can dramatically reduce the
>     time to upgrade a multi-database server running on a multiprocessor
>     machine.
>
> So is the 1400G mostly in one database in the cluster?
>
>>> The full commands we are using for pg_upgrade are pretty stock:
>> Yes, one big database with about 80 schemas and several other smaller
>> databases, so -j should help, right?
>
> As I understand it, no. The parallelism is between databases, not
> within a database. Further, 'database schemas' there refers to the
> schema as the overall database object definitions, not to the
> namespaces known as schemas within a database.

Thanks, Adrian. That "restore database schemas in parallel" phrase seems
really easy to read the way we did, expecting it to work with one database
and multiple schemas. Maybe it should be changed to "restore multiple
databases in parallel" instead?

Jeff
Re: How to speed up product code and subcode match
Hi!

Using the index

create index on toode ( split_part(toode, '/', 1) );

and the query

select toode.toode, n2, n3, n4
from toode, vordlusajuhinnak
where split_part(toode.toode, '/', 1) = vordlusajuhinnak.toode;

reduces the run time to 5 minutes.

Andrus.

On 23.05.2023 17:26, Andrus wrote:
> Hi!
>
> Can a function index,
>
> create index on toode ( split_part(toode, '/', 1) );
>
> and the query
>
> select toode.toode, n2, n3, n4
> from toode, vordlusajuhinnak
> where split_part(toode.toode, '/', 1) = vordlusajuhinnak.toode;
>
> be used while keeping the existing table structure? Should a
> functional index produce the same speed improvement as using a
> separate column?
>
> Andrus.
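[Editor's note: two details are easy to miss when reproducing this: the
planner only considers an expression index when the query uses the same
expression, and statistics on the indexed expression are only collected
by a subsequent ANALYZE. A sketch of the full sequence, with an invented
index name:]

```sql
-- Expression index on the main product code extracted from toode.
CREATE INDEX toode_baaskood_idx ON toode (split_part(toode, '/', 1));

-- ANALYZE gathers statistics on the indexed expression itself, which
-- helps the planner estimate the join.
ANALYZE toode;

-- The WHERE clause must use the identical expression for the index to
-- be considered:
SELECT toode.toode, n2, n3, n4
FROM toode, vordlusajuhinnak
WHERE split_part(toode.toode, '/', 1) = vordlusajuhinnak.toode;
```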
Re: Trying to understand a failed upgrade in AWS RDS
On Sun, 2023-05-21 at 07:56 -0700, Mike Lissner wrote:
> > As far as I know it's impossible to reliably pg_upgrade a node
> > that has subscriptions and eventually resume logical
> > replication.
>
> Should this go in the documentation somewhere? Maybe in the pg_upgrade
> notes? I still don't understand the mechanism. You also say that:
>
> > It's possible to make it work with some effort in some basic
> > configurations and/or if no changes happen on the publications.
>
> But that kind of surprises me too, actually, because it seemed like
> pg_upgrade wiped out the LSN positions of the subscriber, making it
> start all over.
>
> Upgrading a subscriber seems like something that could/should work, so
> it should be documented if pg_upgrade is incompatible with maintaining
> a subscription, shouldn't it?

The docs are strangely silent on this. AFAIK pg_upgrade on either the
publisher or the subscriber breaks logical replication, which does make
sense, since pg_upgrade basically makes a new database cluster as it runs.
There is a way to manually set the LSN position of an enabled=false
subscription, but I've failed to make that work right in tests so far.
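[Editor's note: the manual mechanism alluded to above is roughly the
following. This is a sketch only, not verified to be safe; the
subscription name, origin name, and LSN are placeholders you would have
to look up on your own systems.]

```sql
-- On the subscriber: find the replication origin that belongs to the
-- subscription (origin names look like 'pg_<subscription oid>').
SELECT roname FROM pg_replication_origin;

-- With the subscription disabled, move its origin forward to the LSN
-- from which the publisher should resume sending changes:
ALTER SUBSCRIPTION mysub DISABLE;
SELECT pg_replication_origin_advance('pg_16398', '0/1A2B3C4D'::pg_lsn);
ALTER SUBSCRIPTION mysub ENABLE;
```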
Questions on PostgreSQL 13.4 Installer for Windows
Hi,

Our developers do their work using a local PostgreSQL database server and
local schemas. This is typically done by installing the Windows version of
the PostgreSQL installer from EnterpriseDB. Currently, version 13.4
(64-bit) of the installer is being used.

Recently, our organization installed BeyondTrust Privilege Management
software on our developers' PCs. The software forces the developer to
authenticate when doing administrative tasks such as installing software
and adjusting environment variables.

When a developer attempts to install the PostgreSQL 13.4 database server,
the developer is authenticated and the installer runs as an administrator.
The issue occurs when the installer initializes the database cluster: that
step must be done as a non-administrator, which causes the installation
process to fail.

We have a manual procedure to follow when this occurs, which involves
manually initializing the database cluster as a non-administrator,
installing the Windows service as an administrator, and reinstalling some
of the PostgreSQL components. The process works, but it is quite
cumbersome.

Has anyone had a similar issue when using the PostgreSQL installer for
Windows with BeyondTrust Privilege Management (or similar software)
installed on their PCs? If so, what have you done to resolve it? We have
been in contact with the company that develops the BeyondTrust software,
but they have not been able to help us correct the problem as of yet.

Thanks,

Joel
Re: 15 pg_upgrade with -j
On 2023-05-22 21:10:48 -0500, Ron wrote:
> On 5/22/23 18:42, Tom Lane wrote:
> > It looks like the assumption was that issuing link() requests in
> > parallel wouldn't help much but just swamp your disk if they're all
> > on the same filesystem. Maybe that could use rethinking, not sure.
>
> It does need rethinking in the era of VMs and SANs. /var/lib/pgsql/15
> is going to be on a different LUN from /var/lib/pgsql/9.6.

You can't hardlink between different file systems. Even if you could
assign single directories to specific LUNs (does any file system allow
this?), this would at best spread the updates across two LUNs (the inodes
would presumably stay on the source LUN and the target directory would be
on the target LUN).

hp

--
Peter J. Holzer | h...@hjp.at | http://www.hjp.at/
"Story must make more sense than reality."
    -- Charles Stross, "Creative writing challenge!"
Re: 15 pg_upgrade with -j
On 5/23/23 12:19, Peter J. Holzer wrote:
> On 2023-05-22 21:10:48 -0500, Ron wrote:
>> On 5/22/23 18:42, Tom Lane wrote:
>>> It looks like the assumption was that issuing link() requests in
>>> parallel wouldn't help much but just swamp your disk if they're all
>>> on the same filesystem. Maybe that could use rethinking, not sure.
>> It does need rethinking in the era of VMs and SANs. /var/lib/pgsql/15
>> is going to be on a different LUN from /var/lib/pgsql/9.6.
> You can't hardlink between different file systems.

We'd never hardlink. It eliminates the ability to return to the old system
if something goes wrong.

--
Born in Arizona, moved to Babylonia.
Re: 15 pg_upgrade with -j
## Ron (ronljohnso...@gmail.com):
> We'd never hardlink. Eliminates the ability to return to the old
> system if something goes wrong.

That's why you get yourself a recent XFS and use clone mode (it still
sticks you to the same filesystem, but gets you up and running much
faster).

Regards,
Christoph

--
Spare Space
Re: 15 pg_upgrade with -j
On 2023-05-23 13:17:24 -0500, Ron wrote:
> On 5/23/23 12:19, Peter J. Holzer wrote:
> > On 2023-05-22 21:10:48 -0500, Ron wrote:
> > > On 5/22/23 18:42, Tom Lane wrote:
> > > > It looks like the assumption was that issuing link()
                                                      ^^
> > > > requests in parallel wouldn't help much but just swamp your disk
> > > > if they're all on the same filesystem.
> > > > Maybe that could use rethinking, not sure.
> > > It does need rethinking in the era of VMs and SANs.
> > > /var/lib/pgsql/15 is going to be on a different LUN from
> > > /var/lib/pgsql/9.6
> > You can't hardlink between different file systems.
>
> We'd never hardlink.

But that was what Jeff and Tom were talking about. If you are changing the
subject, you should at least make that explicit.

hp

--
Peter J. Holzer | h...@hjp.at | http://www.hjp.at/
Re: 15 pg_upgrade with -j
On 5/23/23 13:58, Christoph Moench-Tegeder wrote:
> ## Ron (ronljohnso...@gmail.com):
>> We'd never hardlink. Eliminates the ability to return to the old
>> system if something goes wrong.
> That's why you get yourself a recent XFS and use clone mode (still
> sticks you to the same filesystem, but gets you up and running much
> faster).

Sadly, our 9.6 systems (the ones we /really/ need to get off of) are
RHEL 6.10 with ext4.

--
Born in Arizona, moved to Babylonia.