Re: ZFS filesystem - supported ?
On Sat, 2021-10-23 at 11:29 +, Laura Smith wrote:
> Given an upcoming server upgrade, I'm contemplating moving away from XFS to ZFS
> (specifically the ZoL flavour via Debian 11).
> BTRFS seems to be falling away (e.g. with Redhat deprecating it etc.), hence
> my preference for ZFS.
>
> However, somewhere in the back of my mind I seem to have a recollection of reading
> about what could be described as a "strong encouragement" to stick with more
> traditional options such as ext4 or xfs.

ZFS is probably reliable, so you can use it with PostgreSQL.
However, I have seen reports of performance tests that were not favorable for ZFS.
So you should test if the performance is good enough for your use case.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
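To make "test if the performance is good enough" concrete, a minimal sketch of such a test with pgbench might look like the following. The database name, scale factor, client counts and durations are arbitrary placeholders, not recommendations; the point is only to run the same workload on the candidate filesystem and on your current one for a like-for-like comparison.

    # Sketch only: rough read/write and read-only load tests on a test database.
    createdb zfstest
    pgbench -i -s 1000 zfstest             # initialize pgbench tables (roughly 15 GB at scale 1000)
    pgbench -c 16 -j 4 -T 600 zfstest      # mixed read/write: 16 clients, 4 threads, 10 minutes
    pgbench -c 16 -j 4 -T 600 -S zfstest   # same, but SELECT-only workload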
Re: Issue with pg_basebackup v.11
Thanks Tom.

Regards,
Ninad Shah

On Sat, 23 Oct 2021 at 20:12, Tom Lane wrote:
> Ninad Shah writes:
> > Would keepalive setting address and mitigate the issue?
>
> [ shrug... ]  Maybe; nobody else has more information about this
> situation than you do.  I suggested something to experiment with.
>
>                         regards, tom lane
Re: ZFS filesystem - supported ?
Greetings,

* Mladen Gogala (gogala.mla...@gmail.com) wrote:
> On 10/23/21 23:12, Lucas wrote:
> > This has proven to work very well for me. I had to restore a few backups
> > already and it always worked. The bad part is that I need to stop the
> > database before performing the Snapshot, for data integrity, so that means
> > that I have a hot-standby server only for these snapshots.
> > Lucas
>
> Actually, you don't need to stop the database. You need to execute
> pg_start_backup() before taking a snapshot and then pg_stop_backup() when
> the snapshot is done. You will need to recover the database when you finish
> the restore but you will not lose any data. I know that pg_start_backup()
> and pg_stop_backup() are deprecated but since PostgreSQL doesn't have any
> API for storage or file system snapshots, that's the only thing that can
> help you use storage snapshots as backups. To my knowledge, the only database
> that does have an API for storage snapshots is DB2. The API is called "Advanced
> Copy Services" or ACS. It's documented here:
>
> https://www.ibm.com/docs/en/db2/11.1?topic=recovery-db2-advanced-copy-services-acs
>
> For Postgres, the old start/stop backup functions should be sufficient.

No, it's not - you must also be sure to archive any WAL that's generated
between the pg_start_backup and pg_stop_backup, and then be sure to add into
the snapshot the appropriate signal files or recovery.conf, depending on PG
version, to indicate that you're restoring from a backup, and make sure that
the WAL is made available via restore_command.

Just doing start/stop backup is *not* enough, and you run the risk of having
an invalid backup or corruption when you restore.

If the entire system is on a single volume then you could possibly just take
a snapshot of it (without any start/stop backup stuff), but it's very risky
to do that and then try to do PITR with it, because we don't know where
consistency is reached in such a case (we *must* play all the way through to
the end of the WAL which existed at the time of the snapshot in order to
reach consistency).

In the end though, really, it's much, much, much better to use a proper
backup and archiving tool that's written specifically for PG than to try and
roll your own, using snapshots or not.

Thanks,

Stephen
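As a rough illustration of the sequence being discussed (not a complete tool): this sketch assumes a pre-PG-15 exclusive low-level backup, a hypothetical ZFS dataset named tank/pgdata, and an archive_command that is already shipping WAL somewhere safe.

    # Sketch only: exclusive low-level backup wrapped around a filesystem snapshot.
    # tank/pgdata is a placeholder dataset; WAL archiving must already be working.
    psql -c "SELECT pg_start_backup('zfs-snap', true);"   # deprecated exclusive mode, survives across sessions
    zfs snapshot tank/pgdata@backup-$(date +%Y%m%d%H%M%S)
    psql -c "SELECT pg_stop_backup();"
    # To restore: clone or roll back the snapshot, add recovery.signal (PG 12+)
    # or recovery.conf (PG 11 and older), set restore_command so the WAL archived
    # between start and stop backup can be replayed, then start the server.

The exclusive form is used here only because it keeps working across separate psql invocations; a non-exclusive backup would have to hold one session open from start to stop.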
Re: ZFS filesystem - supported ?
On Mon, Oct 25, 2021 at 10:18 AM Laurenz Albe wrote:
> On Sat, 2021-10-23 at 11:29 +, Laura Smith wrote:
> > Given an upcoming server upgrade, I'm contemplating moving away from XFS to ZFS
> > (specifically the ZoL flavour via Debian 11).
> > BTRFS seems to be falling away (e.g. with Redhat deprecating it etc.), hence my preference for ZFS.
> >
> > However, somewhere in the back of my mind I seem to have a recollection of reading
> > about what could be described as a "strong encouragement" to stick with more traditional options such as ext4 or xfs.
>
> ZFS is probably reliable, so you can use it with PostgreSQL.
>
> However, I have seen reports of performance tests that were not favorable for ZFS.
> So you should test if the performance is good enough for your use case.

It very much depends on lots of factors.

On the whole, ZFS on spinning disks is going to have some performance... rough corners. And it is a lot harder to reason about many things, including capacity and performance, when you are doing copy-on-write at both the db and FS level and have compression in the picture. And there are other areas of complexity, such as how you handle partial page writes.

On the whole, I think for small dbs it might perform well enough. On large or high-velocity dbs I think you will have more problems than expected. Having worked with PostgreSQL on ZFS, I wouldn't generally recommend it.

Best Wishes,
Chris Travers

> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com

--
Best Wishes,
Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in.
http://www.efficito.com/learn_more
Re: ZFS filesystem - supported ?
On 10/25/21 13:13, Stephen Frost wrote:
> No, it's not - you must also be sure to archive any WAL that's generated
> between the pg_start_backup and pg_stop_backup, and then be sure to add into
> the snapshot the appropriate signal files or recovery.conf, depending on PG
> version, to indicate that you're restoring from a backup, and make sure that
> the WAL is made available via restore_command.
>
> Just doing start/stop backup is *not* enough, and you run the risk of having
> an invalid backup or corruption when you restore.
>
> If the entire system is on a single volume then you could possibly just take
> a snapshot of it (without any start/stop backup stuff), but it's very risky
> to do that and then try to do PITR with it, because we don't know where
> consistency is reached in such a case (we *must* play all the way through to
> the end of the WAL which existed at the time of the snapshot in order to
> reach consistency).
>
> In the end though, really, it's much, much, much better to use a proper
> backup and archiving tool that's written specifically for PG than to try and
> roll your own, using snapshots or not.
>
> Thanks,
>
> Stephen

Stephen, thank you for correcting me. You, of course, are right. I had erroneously
assumed that backing up the WAL logs was implied, because I always back them up.
And yes, that needs to be made clear.

Regards

--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 10/25/2021 10:13 AM, Stephen Frost wrote:
> Greetings,
>
> * Mladen Gogala (gogala.mla...@gmail.com) wrote:
> > On 10/23/21 23:12, Lucas wrote:
> > > This has proven to work very well for me. I had to restore a few backups
> > > already and it always worked. The bad part is that I need to stop the
> > > database before performing the Snapshot, for data integrity, so that means
> > > that I have a hot-standby server only for these snapshots.
> > > Lucas
> >
> > Actually, you don't need to stop the database. You need to execute
> > pg_start_backup() before taking a snapshot and then pg_stop_backup() when
> > the snapshot is done. You will need to recover the database when you finish
> > the restore but you will not lose any data. I know that pg_start_backup()
> > and pg_stop_backup() are deprecated but since PostgreSQL doesn't have any
> > API for storage or file system snapshots, that's the only thing that can
> > help you use storage snapshots as backups. To my knowledge, the only database
> > that does have an API for storage snapshots is DB2. The API is called "Advanced
> > Copy Services" or ACS. It's documented here:
> >
> > https://www.ibm.com/docs/en/db2/11.1?topic=recovery-db2-advanced-copy-services-acs
> >
> > For Postgres, the old start/stop backup functions should be sufficient.
>
> No, it's not - you must also be sure to archive any WAL that's generated
> between the pg_start_backup and pg_stop_backup, and then be sure to add into
> the snapshot the appropriate signal files or recovery.conf, depending on PG
> version, to indicate that you're restoring from a backup, and make sure that
> the WAL is made available via restore_command.
>
> Just doing start/stop backup is *not* enough, and you run the risk of having
> an invalid backup or corruption when you restore.
>
> If the entire system is on a single volume then you could possibly just take
> a snapshot of it (without any start/stop backup stuff), but it's very risky
> to do that and then try to do PITR with it, because we don't know where
> consistency is reached in such a case (we *must* play all the way through to
> the end of the WAL which existed at the time of the snapshot in order to
> reach consistency).
>
> In the end though, really, it's much, much, much better to use a proper
> backup and archiving tool that's written specifically for PG than to try and
> roll your own, using snapshots or not.
>
> Thanks,
>
> Stephen

what about BTRFS since it's the successor of ZFS?

--
E-BLOKOS
Re: ZFS filesystem - supported ?
On 26/10/2021, at 6:13 AM, Stephen Frost wrote:
>
> Greetings,
>
> * Mladen Gogala (gogala.mla...@gmail.com) wrote:
>> On 10/23/21 23:12, Lucas wrote:
>>> This has proven to work very well for me. I had to restore a few backups
>>> already and it always worked. The bad part is that I need to stop the
>>> database before performing the Snapshot, for data integrity, so that means
>>> that I have a hot-standby server only for these snapshots.
>>> Lucas
>>
>> Actually, you don't need to stop the database. You need to execute
>> pg_start_backup() before taking a snapshot and then pg_stop_backup() when
>> the snapshot is done. You will need to recover the database when you finish
>> the restore but you will not lose any data. I know that pg_start_backup()
>> and pg_stop_backup() are deprecated but since PostgreSQL doesn't have any
>> API for storage or file system snapshots, that's the only thing that can
>> help you use storage snapshots as backups. To my knowledge, the only database
>> that does have an API for storage snapshots is DB2. The API is called "Advanced
>> Copy Services" or ACS. It's documented here:
>>
>> https://www.ibm.com/docs/en/db2/11.1?topic=recovery-db2-advanced-copy-services-acs
>>
>> For Postgres, the old start/stop backup functions should be sufficient.
>
> No, it's not - you must also be sure to archive any WAL that's generated
> between the pg_start_backup and pg_stop_backup, and then be sure to add into
> the snapshot the appropriate signal files or recovery.conf, depending on PG
> version, to indicate that you're restoring from a backup, and make sure that
> the WAL is made available via restore_command.
>
> Just doing start/stop backup is *not* enough, and you run the risk of having
> an invalid backup or corruption when you restore.
>
> If the entire system is on a single volume then you could possibly just take
> a snapshot of it (without any start/stop backup stuff), but it's very risky
> to do that and then try to do PITR with it, because we don't know where
> consistency is reached in such a case (we *must* play all the way through to
> the end of the WAL which existed at the time of the snapshot in order to
> reach consistency).
>
> In the end though, really, it's much, much, much better to use a proper
> backup and archiving tool that's written specifically for PG than to try and
> roll your own, using snapshots or not.
>
> Thanks,
>
> Stephen

When I create a snapshot, the script gets the latest WAL file applied from [1] and
adds that information to the Snapshot Tags in AWS. I then use that information when
restoring the snapshot: the script reads the tag, downloads the 50 WAL files before
that point and all the WAL files required after it. The WAL files are being backed
up to S3.

I had to restore the database to a PITR state many times, and it always worked very
well. I also create slaves using the snapshot method. So, I don't mind having to
stop/start the database for the snapshot process, as it has proven to work fine for
the last 5 years.

Lucas
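A very rough sketch of that kind of snapshot tagging follows. This is not Lucas's actual script: the volume ID, tag key and S3 handling are made-up placeholders, and the replay position is recorded as an LSN (via pg_last_wal_replay_lsn()) rather than a WAL file name.

    # Sketch only: tag an EBS snapshot with the standby's last-replayed WAL position.
    # Query the replay position before the database is stopped for the snapshot.
    LSN=$(psql -At -c "SELECT pg_last_wal_replay_lsn();")
    aws ec2 create-snapshot \
      --volume-id vol-0abc1234 \
      --description "pg data volume" \
      --tag-specifications "ResourceType=snapshot,Tags=[{Key=last_wal_lsn,Value=${LSN}}]"
    # At restore time, read the tag back, pull the archived WAL segments around
    # that position from S3, and let recovery replay past it.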
Re: ZFS filesystem - supported ?
On 10/25/21 15:43, E-BLOKOS wrote:
> what about BTRFS since it's the successor of ZFS?

BTRFS is NOT the successor to ZFS. It never was. It was a completely new file
system developed by Oracle Corp. For some reason, Oracle seems to have lost
interest in it. Red Hat has deprecated it and, in all likelihood, BTRFS will go
the way of Solaris and SPARC chips: ride into the glorious history of computer
science. However, BTRFS has never been widely used, not even among Fedora users
like me. BTRFS was suffering from problems with corruption and performance.

This is probably not the place to discuss the inner workings of snapshots, but
it is worth knowing that snapshots drastically increase the IO rate on the file
system - for every snapshot. That's where the slowness comes from.

--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com
Re: ZFS filesystem - supported ?
On 10/25/21 1:40 PM, Mladen Gogala wrote:
> This is probably not the place to discuss the inner workings of snapshots, but
> it is worth knowing that snapshots drastically increase the IO rate on the file
> system - for every snapshot. That's where the slowness comes from.

I have recent anecdotal experience of this. I experimented with using Btrfs for a
32 TB backup system that has five 8 TB spinning disks. There's an average of
8 MBps of writes scattered around the disks, which isn't super high, obviously.

The results were vaguely acceptable until I created a snapshot of it, at which
point it became completely unusable. Even having one snapshot present caused
hundreds of btrfs-related kernel threads to thrash in the "D" state almost
constantly, and it never stopped doing that even when left for many hours.

I then experimented with adding a bcache layer on top of Btrfs to see if it would
help. I added a 2 TB SSD using bcache, partitioned as 1900 GB read cache and
100 GB write cache. It made very little difference and was still unusable as soon
as a snapshot was taken.

I did play with the various btrfs and bcache tuning knobs quite a bit and couldn't
improve it.

Since that test was a failure, I then decided to try the same setup with OpenZFS
on a lark, with the same set of disks in a "raidz" array, with the 2 TB SSD as an
l2arc read cache (no write cache). It easily handles the same load, even with 72
hourly snapshots present, with the default settings. I'm actually quite impressed
with it.

I'm sure that the RAID, snapshots and copy-on-write reduce the maximum performance
considerably, compared to ext4. But on the other hand, it did provide the
performance I expected to be possible given the setup. Btrfs *definitely* didn't;
I was surprised at how badly it performed.

--
Robert L Mathews, Tiger Technologies, http://www.tigertech.net/
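For reference, a layout along the lines Robert describes (five-disk raidz plus an SSD as L2ARC, with hourly snapshots) could be set up roughly like this. The pool name and device names are placeholders, and retention/scheduling is left to cron or a snapshot tool.

    # Sketch only: five-disk raidz pool with an SSD read cache (L2ARC).
    zpool create tank raidz sdb sdc sdd sde sdf
    zpool add tank cache nvme0n1
    zfs create tank/backup
    # e.g. from an hourly cron job, pruning to the last 72 snapshots separately:
    zfs snapshot tank/backup@hourly-$(date +%Y%m%d%H)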
Re: ZFS filesystem - supported ?
In my opinion, ext4 will solve any and all problems without requiring a very deep
understanding of file system architecture. In short, I would stick with ext4 unless
you have a good reason not to. Maybe there is one. I have done this a long time and
never thought twice about which file system should support my servers.

On Mon, Oct 25, 2021, 6:01 PM Robert L Mathews wrote:
> On 10/25/21 1:40 PM, Mladen Gogala wrote:
> > This is probably not the place to discuss the inner workings of snapshots,
> > but it is worth knowing that snapshots drastically increase the IO rate on
> > the file system - for every snapshot. That's where the slowness comes from.
>
> I have recent anecdotal experience of this. I experimented with using Btrfs
> for a 32 TB backup system that has five 8 TB spinning disks. There's an
> average of 8 MBps of writes scattered around the disks, which isn't super
> high, obviously.
>
> The results were vaguely acceptable until I created a snapshot of it, at
> which point it became completely unusable. Even having one snapshot present
> caused hundreds of btrfs-related kernel threads to thrash in the "D" state
> almost constantly, and it never stopped doing that even when left for many
> hours.
>
> I then experimented with adding a bcache layer on top of Btrfs to see if it
> would help. I added a 2 TB SSD using bcache, partitioned as 1900 GB read
> cache and 100 GB write cache. It made very little difference and was still
> unusable as soon as a snapshot was taken.
>
> I did play with the various btrfs and bcache tuning knobs quite a bit and
> couldn't improve it.
>
> Since that test was a failure, I then decided to try the same setup with
> OpenZFS on a lark, with the same set of disks in a "raidz" array, with the
> 2 TB SSD as an l2arc read cache (no write cache). It easily handles the same
> load, even with 72 hourly snapshots present, with the default settings. I'm
> actually quite impressed with it.
>
> I'm sure that the RAID, snapshots and copy-on-write reduce the maximum
> performance considerably, compared to ext4. But on the other hand, it did
> provide the performance I expected to be possible given the setup. Btrfs
> *definitely* didn't; I was surprised at how badly it performed.
>
> --
> Robert L Mathews, Tiger Technologies, http://www.tigertech.net/
plpython3 package installation problem
Hello,

As we need to use the plpython3u extension, we tried to install the plpython3
package, but it showed that we needed to install python3-libs. However,
python36-libs is already installed for patroni usage.

1. Will installing python3-libs affect the current python36-libs usage?
2. Can we do some configuration to let python36-libs work as python3-libs, so
   that there is no need to install python3-libs?

Thanks

# yum localinstall /tmp/postgresql11-plpython3-11.11-1PGDG.rhel7.x86_64.rpm
...
Requires: python3-libs
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest

# yum search python3 | grep libs
python34-libs.x86_64 : Python 3 runtime libraries
python36-libs.x86_64 : Python runtime libraries
shiboken-python36-libs.x86_64 : CPython bindings generator for C++ libraries

# yum list python36-libs.x86_64
Installed Packages
python36-libs.x86_64    3.6.8-1.el7

Thank you
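One way to check whether the already-installed python36-libs can satisfy that dependency is to compare what each package provides, using standard rpm/yum commands (nothing PGDG-specific is assumed here):

    # List the capabilities the installed package provides; if "python3-libs"
    # appears in this output, the plpython3 dependency is nominally satisfiable
    # without installing a separate python3-libs package.
    rpm -q --provides python36-libs

    # Ask yum which available packages provide the missing capability.
    yum provides python3-libs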
Need recommendation on PARALLEL INDEX SCAN and PARALLEL APPEND
Hi Team,

Please have a look at the problem statement below and suggest whether there are any
ways to make the planner pick PARALLEL INDEX SCAN and PARALLEL APPEND.

*Problem Statement:*
We have a partitioned table with a partition key column (crdt --> timestamp). A
SELECT query on this table that does not involve the partition key column undergoes
an INDEX SCAN on all the partitions, and the results are combined in an APPEND node.

Our requirement is to make the planner pick:
-- PARALLEL INDEX SCAN instead of INDEX SCAN
-- PARALLEL APPEND instead of APPEND

PostgreSQL version --> 13.4

*Table Structure:*

                        Partitioned table "public.pay"
 Column |           Type           | Collation | Nullable |   Default
--------+--------------------------+-----------+----------+-------------
 id     | bigint                   |           | not null |
 pri    | character varying(256)   |           |          |
 prf    | character varying(128)   |           |          |
 pi     | character varying(256)   |           |          |
 pas    | character varying(128)   |           |          |
 s      | payment_state            |           | not null |
 st     | jsonb                    |           | not null |
 rct    | character varying(32)    |           |          |
 prf    | jsonb                    |           |          |
 pa     | jsonb                    |           |          |
 always | jsonb                    |           |          | '{}'::jsonb
 pr     | jsonb                    |           |          |
 pe     | jsonb                    |           |          |
 cda    | jsonb                    |           |          |
 tr     | jsonb                    |           |          |
 ar     | jsonb                    |           |          |
 crq    | jsonb                    |           |          |
 cr     | jsonb                    |           |          |
 prt    | jsonb                    |           |          |
 rrq    | jsonb                    |           |          |
 rtrt   | jsonb                    |           |          |
 rrt    | jsonb                    |           |          |
 tc     | character varying(32)    |           |          |
 crdt   | timestamp with time zone |           | not null | now()
 isfin  | boolean                  |           |          |
 ifi    | bigint                   |           |          |
 rid    | character varying(256)   |           |          |
 pce    | text                     |           |          |
 cce    | text                     |           |          |
 pp     | jsonb                    |           |          |
Partition key: RANGE (crdt)
Indexes:
    "pay_pkey" PRIMARY KEY, btree (id, crdt)
    "pay_businessid_storeid_crdt" btree ((pe ->> 'businessID'::text), (pe ->> 'storeID'::text), crdt) WHERE (pe ->> 'businessID'::text) IS NOT NULL AND (pe ->> 'storeID'::text) IS NOT NULL
    "pay_crdt_index" btree (crdt)
    "pay_ifi_idx" btree (ifi)
    "pay_index_isfin" btree (isfin)
    "pay_pi_pas_key" UNIQUE CONSTRAINT, btree (pi, pas, crdt)
    "pay_pri_prf_key" UNIQUE CONSTRAINT, btree (pri, prf, crdt)
    "pay_rid_crdt" btree (rid, crdt) WHERE rid IS NOT NULL AND crdt >= '2020-04-01 00:00:00+00'::timestamp with time zone
    "pay_tc_index" btree (tc, crdt)
    "pay_user_id_pe_index" btree ((pe ->> 'userID'::text)) WHERE (pe ->> 'userID'::text) IS NOT NULL
    "pay_user_id_pr_index" btree ((pr ->> 'userID'::text)) WHERE (pr ->> 'userID'::text) IS NOT NULL
Triggers:
    pay_bucardo_delta AFTER INSERT OR DELETE OR UPDATE ON pay FOR EACH ROW EXECUTE FUNCTION bucardo.delta_public_pay()
    pay_bucardo_note_trunc_sync_payment_pay AFTER TRUNCATE ON pay FOR EACH STATEMENT EXECUTE FUNCTION bucardo.bucardo_note_truncation('sync_payment_pay')
    pay_payment_completion_trigger_after_upsert AFTER INSERT OR UPDATE ON pay FOR EACH ROW EXECUTE FUNCTION handle_payment_completion()
Triggers firing always:
    pay_trg_change_capture_pay AFTER INSERT OR UPDATE ON pay FOR EACH ROW EXECUTE FUNCTION fun_change_capture_pay()
Number of partitions: 4 (Use \d+ to list them.)
*Partitions:*
    p_p2021_09 FOR VALUES FROM ('2021-09-01 00:00:00+00') TO ('2021-10-01 00:00:00+00'),
    p_p2021_10 FOR VALUES FROM ('2021-10-01 00:00:00+00') TO ('2021-11-01 00:00:00+00'),
    p_p2021_11 FOR VALUES FROM ('2021-11-01 00:00:00+00') TO ('2021-12-01 00:00:00+00'),
    p_default DEFAULT

*Table_size:*
    Name       | Type              | Size
    -----------+-------------------+-----------
    pay        | partitioned table | 0 bytes
    p_default  | table             | 8192 bytes
    p_p2021_09 | table             | 358 MB
    p_p2021_10 | table             | 370 MB
    p_p2021_11 | table             | 358 MB

*Note:* The table size will be in TB's in the actual scenario.

*Query:*
SELECT id, pri, prf, pi, pas, s, st, a, rct, pr, pa, pr, pe, cda, crdt, tr, ar, crq,
cr, prt, tc, ifi, isfin, rrt, rrq, rtrt, rid, pce, cce, pp
FROM public.pay
WHERE id = 3011852315482470422;

*Query Plan:*
pay=# EXPLAIN (ANALYZE,BUFFERS) SELECT id, pri, prf, pi, pas, s, st, a, rct, pr, pa,
pr, pe, cda, crdt, tr, ar, crq, cr, prt, tc, ifi, isfin, rrt, rrq, rtrt, rid, pce,
cce, pp FROM public.pay WHERE id = 3011852315482470422;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Append  (cost=0.29..33.09 rows=4 width=1931) (actual time=0.148..0.211 rows=0 loops=1)
   Buffers: shared hit=8
   ->  Index Scan using pay_p2021_09_pkey on pay_p2021_09 pay_1  (cost=0.29..8.30 rows=1 width=1931) (actual time=0.015..0.022 rows=0 loops=1)
         Index Cond: (id = '3011852315482470422'::bigint)
         Buffers: shared hit=2
   ->  Index Scan using pay_p2021_10_pkey on pay_p2021_10 pay_2  (cost=0.29..8.30 rows=1 width=1931) (actual time=0.012..0.019 rows=0 loops=1)
         Index Cond: (id = '3011852315482470422'::bigint)
         Buffers: shared hit=2
   ->  Index Scan using pay_p2021_11_pkey on pay_p2021_11 pay_3  (cost=0.29..8.30 rows=1 width=1931) (actual time=0.012..0.019 rows=0 loops=1)
         Index Cond: (id = '3011852315482470422