Setting up continuous archiving
Hi,

I'm trying to compile a basic set of instructions needed to set up
continuous archiving and to recover from a backup. I'm running
PostgreSQL 9.3 on a Debian Stretch system. I've got a database server
and a backup server. The idea is to set up WAL archiving and
occasionally do full (base) backups. A base backup plus the WAL
segment files from that point onward must provide enough information
to restore the database to the latest state, except for a window for
data loss caused by the fact that WAL segment files aren't transferred
immediately, and more importantly that WAL segment files are only
transferred upon being filled.

---

Setting up continuous archiving

* Set up WAL archiving

    * on backup server under postgres user

        * create /var/lib/postgresql/wal_archive dir

            $ mkdir /var/lib/postgresql/wal_archive

    * on database server under postgres user

        * generate ssh key

            $ ssh-keygen -f /var/lib/postgresql/.ssh/id_rsa   # providing the path to the key file
                                                               # keeps it from asking questions

        * add the corresponding record to the known_hosts file

            $ ssh-keyscan -t rsa BACKUP_SRV >> ~/.ssh/known_hosts

    * locally

        * authorize login from database to backup server

            $ ssh DATABASE_SRV cat ~/.ssh/id_rsa.pub | ssh BACKUP_SRV 'mkdir --mode 0700 .ssh; cat >> ~/.ssh/authorized_keys; chmod 0600 .ssh/authorized_keys'

    * on database server under root

        * change postgresql.conf

            wal_level = archive
            archive_mode = on
            archive_command = 'rsync -a %p BACKUP_SRV:/var/lib/postgresql/wal_archive/%f'

        * restart PostgreSQL

            # systemctl restart postgresql

* Make a base backup

    * on database server under root

        * add a line to postgresql.conf

            max_wal_senders = 1

        * add a line to pg_hba.conf

            host replication replication BACKUP_SRV_IP/BACKUP_SRV_NETMASK trust

        * restart PostgreSQL

            # systemctl restart postgresql

    * on database server under postgres user

        * create replication user

            CREATE USER replication WITH REPLICATION;

          or

            $ createuser --replication replication

    * on backup server under postgres user

        * make base backup

            $ pg_basebackup -h DATABASE_SRV -U replication -D /var/lib/postgresql/base_backups/$(date +%Y%m%d-%H%M%S)

Restoring from a backup

* under root

    * stop PostgreSQL if running

        # systemctl stop postgresql

* under postgres user

    * move data dir

        $ mv 9.3{,-$(date +%Y%m%d-%H%M%S)}

    * copy backup

        $ mkdir 9.3
        $ cp -r base_backups/TIMESTAMP 9.3/main

    * copy unarchived segment files

        $ find 9.3-TIMESTAMP/main/pg_xlog -maxdepth 1 -type f -exec cp -t 9.3/main/pg_xlog {} +

    * create recovery.conf in 9.3/main

        restore_command = 'cp /var/lib/postgresql/wal_archive/%f %p'

* under root

    * start PostgreSQL

        # systemctl start postgresql

A few notes.

Running out of space on the backup server can in turn lead to the
database server running out of space, since WAL segment files stop
being archived and keep piling up. The same might happen when
archiving falls behind, which also widens the window for data loss.

WAL archiving doesn't track changes to configuration files.
pg_basebackup will back up configuration files only if they are inside
the data dir.

If the database doesn't generate much WAL traffic, there could be a
long delay between completing a transaction and archiving its results
(archive_timeout).

You might want to prevent users from accessing the database until
you're sure the recovery was successful (pg_hba.conf).

I'm not considering here possible issues with tablespaces and other caveats:
https://www.postgresql.org/docs/9.3/static/continuous-archiving.html#CONTINUOUS-ARCHIVING-CAVEATS

---

Most importantly, does it make sense to keep more than one base backup?

Also, does it look any good? Does it make sense to make the
~/wal_archive and ~/base_backups dirs not readable by group and world?
From what I can see, files in ~/wal_archive are 0600, and
~/base_backups/TIMESTAMP is 0700.

How can I confirm that it's working properly? Are WAL segment files
appearing in ~/wal_archive enough?

Thanks in advance.

Regards,
Yuri Kanivetsky
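[On the verification question above: a minimal check, as a sketch
assuming the paths from the post, is to force a segment switch on the
database server and watch the file arrive in the archive.
pg_switch_xlog() is the 9.3 name; PostgreSQL 10 renamed it
pg_switch_wal().]

    -- on the database server, as a superuser; forces the current
    -- (non-empty) WAL segment to be completed and handed to the archiver
    SELECT pg_switch_xlog();

    # on the backup server, the completed segment should appear shortly
    $ ls -lt /var/lib/postgresql/wal_archive | head

[Note that pg_stat_archiver, the view that reports the last archived
and last failed segment, only exists from 9.4 onward.]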
Re: Setting up continuous archiving
On 9/26/18 8:20 AM, Yuri Kanivetsky wrote:
> I'm trying to compile a basic set of instructions needed to set up
> continuous archiving and to recover from a backup. I'm running
> PostgreSQL 9.3 on a Debian Stretch system.

This is an incredibly complex topic and it is very difficult to
implement correctly. My advice is that you use a mature backup
solution like pgBackRest or barman rather than rolling your own.

-- 
-David
da...@pgmasters.net
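[For a sense of what the pgBackRest route looks like, a minimal sketch
assuming pgBackRest 2.x with the repository kept locally and a stanza
arbitrarily named "main". A repository on a separate backup host is
configured with repo1-host/pg1-host instead; check the option names
against the version you install.]

    # /etc/pgbackrest.conf
    [global]
    repo1-path=/var/lib/pgbackrest

    [main]
    pg1-path=/var/lib/postgresql/9.3/main

    # postgresql.conf
    archive_command = 'pgbackrest --stanza=main archive-push %p'

    # one-time initialization, sanity check, then a full backup
    $ pgbackrest --stanza=main stanza-create
    $ pgbackrest --stanza=main check
    $ pgbackrest --stanza=main --type=full backup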
using the nextval('sequence_name') in sql, the result maybe is not right
Hi, all:

PostgreSQL version: 10.3. I use "nextval" in the SQL, but I think the
result is not right; maybe it is a bug. The test case is as below:

create sequence seq1;
select nextval('seq1');
create table tx1(id1 int, id2 int);
insert into tx1 select generate_series(1,100), random()*102;
explain verbose select * from tx1 where id2 = nextval('seq1');
select * from tx1 where id2 = nextval('seq1');

postgres=# explain verbose select * from tx1 where id2 = nextval('seq1');
                          QUERY PLAN
--------------------------------------------------------------
 Seq Scan on public.tx1  (cost=0.00..43.90 rows=11 width=8)
   Output: id1, id2
   Filter: (tx1.id2 = nextval('seq1'::regclass))
(3 rows)

postgres=# select * from tx1 where id2 = nextval('seq1');  -- here, maybe the result is not right
 id1 | id2
-----+-----
  56 |  57
(1 row)

I think "nextval('seq1')" equals 2, so "select * from tx1 where id2 =
nextval('seq1')" equals "select * from tx1 where id2 = 2", is it?

Thanks,
Wanglin
Re: using the nextval('sequence_name') in sql, the result maybe is not right
On Wed, 26 Sep 2018 at 14:08, Wanglin wrote:
>
> Hi, all:
>    PostgreSQL version: 10.3. I use "nextval" in the SQL, but I think
> the result is not right; maybe it is a bug.
> [...]
> I think "nextval('seq1')" equals 2, so "select * from tx1 where id2 =
> nextval('seq1')" equals "select * from tx1 where id2 = 2", is it?

No. nextval('seq1') increments the sequence and returns the new value.
It never[*] returns the same value in subsequent calls; that is the
purpose of the function (and of sequences in general).

Normally, you would assign a sequence to a surrogate key field in your
table, so that you automatically get unique values in that field
(unless you mess around). That's not how you're using it, so I wonder
what your purpose is for the sequence?

[*] Never is never without exceptions, just like always always has.

-- 
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
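[The per-call increment is easy to see directly; a minimal
illustration against the seq1 from the test case, restarting it first
so the numbers are predictable:]

    postgres=# alter sequence seq1 restart;
    postgres=# select nextval('seq1') from generate_series(1, 3);
     nextval
    ---------
           1
           2
           3
    (3 rows)

[Each output row evaluates nextval() once, which is exactly what the
Filter in the EXPLAIN above does for every row of the scan.]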
Re: using the nextval('sequence_name') in sql, the result maybe is not right
On 9/26/18 5:05 AM, Wanglin wrote:
> Hi, all:
>    PostgreSQL version: 10.3. I use "nextval" in the SQL, but I think
> the result is not right; maybe it is a bug.
> [...]
> I think "nextval('seq1')" equals 2, so "select * from tx1 where id2 =
> nextval('seq1')" equals "select * from tx1 where id2 = 2", is it?

As Alban pointed out, calling nextval() increments the sequence. And
as your EXPLAIN shows, your SELECT is doing a sequential scan. Using
your test code here I get:

select * from tx1;
 id1 | id2
-----+-----
   1 |  27
   2 |  42
   3 |  93
   4 |   2
   5 |  85

So going in sequence, Postgres is going to compare 27 to nextval()
(which is 2), not find it, move to 42 = nextval() (=3), not find it,
and so on.

If I do:

select * from tx1 order by id2;

I get:

 id1 | id2
-----+-----
  20 |   0
   2 |  42
  17 |  43
  63 |  45
  88 |  45
  27 |  46
  52 |  47
  47 |  47

alter sequence seq1 restart;
select * from tx1 where id2 = nextval('seq1') order by id2;
 id id1 | id2
-----+-----
  47 |  47

The sequence catches up with the values because there are duplicate 47
values in id2.

-- 
Adrian Klaver
adrian.kla...@aklaver.com
Re: Out of Memory
There is also this:

-bash-4.2$ prlimit -p 6590
RESOURCE   DESCRIPTION                             SOFT      HARD UNITS
AS         address space limit                unlimited unlimited bytes
CORE       max core file size                         0 unlimited blocks
CPU        CPU time                           unlimited unlimited seconds
DATA       max data size                      unlimited unlimited bytes
FSIZE      max file size                      unlimited unlimited blocks
LOCKS      max number of file locks held      unlimited unlimited
MEMLOCK    max locked-in-memory address space     65536     65536 bytes
MSGQUEUE   max bytes in POSIX mqueues            819200    819200 bytes
NICE       max nice prio allowed to raise             0         0
NOFILE     max number of open files                1024      4096
NPROC      max number of processes                 4096     59341
RSS        max resident set size              unlimited unlimited pages
RTPRIO     max real-time priority                     0         0
RTTIME     timeout for real-time tasks        unlimited unlimited microsecs
SIGPENDING max number of pending signals          59341     59341
STACK      max stack size                       8388608 unlimited bytes

Is it possible that the fact that my stack size is limited is what is
causing my issue?

Thanks,
Greig

-- 
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
Re: Out of Memory
greigwise writes:
> Is it possible that the fact that my stack size is limited is what is
> causing my issue?

No. If you were hitting that limit you'd get a message specifically
talking about stack.

			regards, tom lane
Re: Setting up continuous archiving
Greetings,

* Yuri Kanivetsky (yuri.kanivet...@gmail.com) wrote:
> I'm trying to compile a basic set of instructions needed to set up
> continuous archiving and to recover from a backup. I'm running
> PostgreSQL 9.3 on a Debian Stretch system.

9.3 is about to be end-of-life in just another month or so, see:

https://www.postgresql.org/support/versioning/

As mentioned, this is an extremely complicated subject and you should
really use one of the tools that's been written to do exactly this.
Here's a few comments as to why-

> Setting up continuous archiving
>
> * Set up WAL archiving
>
>     * on backup server under postgres user
>
>         * create /var/lib/postgresql/wal_archive dir
>
>             $ mkdir /var/lib/postgresql/wal_archive
>
>     * on database server under postgres user
>
>         * generate ssh key
>
>             $ ssh-keygen -f /var/lib/postgresql/.ssh/id_rsa
>
>         * add the corresponding record to the known_hosts file
>
>             $ ssh-keyscan -t rsa BACKUP_SRV >> ~/.ssh/known_hosts
>
>     * locally
>
>         * authorize login from database to backup server
>
>             $ ssh DATABASE_SRV cat ~/.ssh/id_rsa.pub | ssh BACKUP_SRV 'mkdir --mode 0700 .ssh; cat >> ~/.ssh/authorized_keys; chmod 0600 .ssh/authorized_keys'
>
>     * on database server under root
>
>         * change postgresql.conf
>
>             wal_level = archive
>             archive_mode = on
>             archive_command = 'rsync -a %p BACKUP_SRV:/var/lib/postgresql/wal_archive/%f'

This rsync command does nothing to verify that the WAL file has been
persisted to disk on the backup server, which is a problem if the
backup server crashes or there's some kind of issue with it after the
rsync finishes (you'll end up with gaps in your WAL stream which could
prevent you from being able to restore a backup or from being able to
do PITR).

A good backup tool would also calculate a checksum of the WAL file and
store that independently, verify that the WAL file is for the cluster
configured (and not for some other cluster because someone mistakenly
tried to start archiving two primaries into the same location), verify
that the size of the WAL file is what's expected, and probably do a
few other checks that I'm not remembering right now, but which tools
like pgBackRest do.

>         * restart PostgreSQL
>
>             # systemctl restart postgresql
>
> * Make a base backup
>
>     * on database server under root
>
>         * add a line to postgresql.conf
>
>             max_wal_senders = 1
>
>         * add a line to pg_hba.conf
>
>             host replication replication BACKUP_SRV_IP/BACKUP_SRV_NETMASK trust
>
>         * restart PostgreSQL
>
>             # systemctl restart postgresql
>
>     * on database server under postgres user
>
>         * create replication user
>
>             CREATE USER replication WITH REPLICATION;
>
>     * on backup server under postgres user
>
>         * make base backup
>
>             $ pg_basebackup -h DATABASE_SRV -U replication -D /var/lib/postgresql/base_backups/$(date +%Y%m%d-%H%M%S)

pg_basebackup is pretty good and it'll soon be able to perform
page-level checksum validation of the database while doing a backup,
assuming checksums have been enabled, but sadly it certainly didn't do
that in 9.3. pg_basebackup should ensure that everything is persisted
to disk, but it doesn't do anything to protect against latent
corruption happening. To do that, an independent manifest of the
backup needs to be built which tracks the checksum of every file
backed up, and then that needs to be checked when performing a
restore.
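[To make the rsync persistence concern above concrete, a hedged
sketch, not a substitute for a real backup tool: copy the segment to a
temporary name, flush it to disk on the backup server, then rename.
Assumes the paths from the original post; per-file "sync FILE" needs
coreutils 8.24+, which Stretch has, while older systems would need a
bare "sync".]

    archive_command = 'rsync -a %p BACKUP_SRV:/var/lib/postgresql/wal_archive/%f.tmp && ssh BACKUP_SRV "sync /var/lib/postgresql/wal_archive/%f.tmp && mv /var/lib/postgresql/wal_archive/%f.tmp /var/lib/postgresql/wal_archive/%f"'

[A non-zero exit from any step makes PostgreSQL keep the segment and
retry later, which is the behavior archive_command is required to
have.]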
> Restoring from a backup
>
> * under root
>
>     * stop PostgreSQL if running
>
>         # systemctl stop postgresql
>
> * under postgres user
>
>     * move data dir
>
>         $ mv 9.3{,-$(date +%Y%m%d-%H%M%S)}
>
>     * copy backup
>
>         $ mkdir 9.3
>         $ cp -r base_backups/TIMESTAMP 9.3/main
>
>     * copy unarchived segment files
>
>         $ find 9.3-TIMESTAMP/main/pg_xlog -maxdepth 1 -type f -exec cp -t 9.3/main/pg_xlog {} +

This is not something which I'd generally encourage doing..

>     * create recovery.conf in 9.3/main
>
>         restore_command = 'cp /var/lib/postgresql/wal_archive/%f %p'

This restore command doesn't perform any validation of the WAL file
which is being pulled back from the archive.

> * under root
>
>     * start PostgreSQL
>
>         # systemctl start postgresql
>
> A few notes.
>
> Running out of space on the backup server can in turn lead to the
> database server running out of space, since WAL segment files stop
> being archived and keep piling up. The same might happen when
> archiving falls behind, which also widens the window for data loss.

Yes, that's a concern. pgBackRest has an option to allow you to choose
if you want to let the sy
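[As a sketch of the kind of restore-time validation described above,
assuming the layout from the original post and that the restore runs
on the backup server, where /var/lib/postgresql/wal_archive is local:
store a checksum at archive time and verify it before handing the
segment back.]

    # archive side (postgresql.conf on the database server)
    archive_command = 'rsync -a %p BACKUP_SRV:/var/lib/postgresql/wal_archive/%f && ssh BACKUP_SRV "sha256sum /var/lib/postgresql/wal_archive/%f > /var/lib/postgresql/wal_archive/%f.sha256"'

    # restore side (recovery.conf); the absolute path recorded in the
    # .sha256 file lets "sha256sum -c" run from the data directory,
    # where %p is a relative path
    restore_command = 'sha256sum --status -c /var/lib/postgresql/wal_archive/%f.sha256 && cp /var/lib/postgresql/wal_archive/%f %p'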
Re: Out of Memory
Tom Lane-2 wrote
> greigwise <greigwise@> writes:
>> Is it possible that the fact that my stack size is limited is what is
>> causing my issue?
>
> No. If you were hitting that limit you'd get a message specifically
> talking about stack.

Well, darn. I'm at a loss... any suggestions of what I should do next
to troubleshoot this?

Thanks.

Greig

-- 
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
Re: how to clean the cache from databases and operating system
Hi,

On 26 September 2018 at 08:25, jimmy wrote:
> 1、When I execute the first sql query, like below:
>     select * from tablename;
> some data will be loaded into the database cache. How to clean the
> data from cache?
> 2、When I execute a second sql query, like below:
>     SELECT pg_prewarm('tablename', 'buffer');
> data will be loaded into the database cache. How to clean the data
> from cache?
> 3、When I execute a third sql query, like below:
>     SELECT pg_prewarm('tablename', 'main');
> data will be loaded into the os cache. How to clean the data from
> cache?

To drop the system cache, as per [0]:

echo 3 > /proc/sys/vm/drop_caches

To drop postgres' cache, afaik the easiest is to restart postgres.

If you're like me and too lazy to do all that (and don't care about
potentially losing data), you can also install an extension that rjuju
and I wrote that allows you to do that simply with `SELECT
pg_drop_caches`. See [1] for the extension and [2] for how to use it.

[0] https://www.kernel.org/doc/Documentation/sysctl/vm.txt (search for
"drop_caches" in that page)
[1] https://github.com/rjuju/pg_dropbuffers
[2] https://maahl.net/pg_dropbuffers
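[Putting those pieces together, a minimal cold-cache sequence for
benchmarking on Linux, as a sketch assuming systemd and root access;
the sync matters so dirty pages aren't lost when the cache is
dropped:]

    # systemctl stop postgresql           # restart empties shared_buffers
    # sync                                # flush dirty pages first
    # echo 3 > /proc/sys/vm/drop_caches   # drop page cache, dentries, inodes
    # systemctl start postgresql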
Re: Out of Memory
I think I figured it out:

vm.overcommit_memory = 2
vm.overcommit_ratio = 50

Only allows me to use 50% of my RAM... ugh! I have 16 GB, so when only
8 is left, I start seeing OOM. Will increase this setting and see if
it helps.

Thanks everyone for the help.

Greig

-- 
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
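[A quick way to see what the kernel actually enforces under these
settings: CommitLimit is the ceiling applied when overcommit_memory=2,
and Committed_AS is what's currently promised to processes.]

    $ sysctl vm.overcommit_memory vm.overcommit_ratio
    $ grep -E 'CommitLimit|Committed_AS' /proc/meminfo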
Re: Out of Memory
All right.. one more thing here. Any suggestions for how to set
overcommit on a postgres db server with 16 GB of RAM and no swap? I
think I want vm.overcommit_memory = 2, as I understand that prevents
the OOM killer from zapping me. Is 100% the right way to go for
overcommit_ratio? Is there a drawback to this?

Thanks again.

Greig

-- 
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
Re: Why the sql is not executed in parallel mode
Which version are you running?

-- 
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
Re: Setting up continuous archiving
Hello

What you are doing is called "log shipping": when a WAL (write-ahead
log) segment is filled in on the database server, you ship it to a
backup server via rsync. It is fine, but as you said, the disadvantage
is that a file is shipped only when it is full, so you could have data
loss (the last WAL not yet shipped).

A more modern and easy way is to use streaming replication: in this
case the logs are streamed continuously to the standby or to the
backup server (one can use streaming replication without implementing
a standby database). Look at the doc on the page after the one you
referred to:
https://www.postgresql.org/docs/9.3/static/high-availability.html

There is a nice tool that does what you plan to do (rsync of archived
files) but also supports the more modern way (streaming replication):
it is called barman: https://www.pgbarman.org/. You should probably
use their tool, but you can also read their doc to get the concepts
and some ideas.

Rgds,
Pierre

On Wednesday, September 26, 2018, 9:21:29 AM GMT+2, Yuri Kanivetsky wrote:

> Hi,
>
> I'm trying to compile a basic set of instructions needed to set up
> continuous archiving and to recover from a backup. [...]
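[To make the streaming option concrete: a minimal sketch of a 9.3
recovery.conf for a streaming standby, reusing the replication user
and archive path from the original post. The restore_command stays as
a fallback for segments that were already archived, and note that the
walreceiver connection counts against max_wal_senders on the primary.]

    standby_mode = 'on'
    primary_conninfo = 'host=DATABASE_SRV user=replication'
    restore_command = 'cp /var/lib/postgresql/wal_archive/%f %p'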
Re: Out of Memory
greigwise wrote:
> All right.. one more thing here. Any suggestions for how to set
> overcommit on a postgres db server with 16 GB of RAM and no swap? I
> think I want vm.overcommit_memory = 2, as I understand that prevents
> the OOM killer from zapping me. Is 100% the right way to go for
> overcommit_ratio? Is there a drawback to this?

vm.overcommit_memory = 2
vm.overcommit_ratio = 100

Linux commits (swap * overcommit_ratio * RAM / 100), so without any
swap the correct value would be 100.

I don't know how safe it is to go entirely without swap. I would add a
little bit just to be sure.

Yours,
Laurenz Albe
Re:Re: how to clean the cache from databases and operating system
I use windows server 2012 R2. How to drop postgresql's data in the
system cache? In windows server 2012 R2, I restart postgresql by
restarting the postgresql service; does that drop postgres' cache?

At 2018-09-26 22:52:08, "Maxence Ahlouche" wrote:
> Hi,
>
> On 26 September 2018 at 08:25, jimmy wrote:
> [...]
>
> To drop the system cache, as per [0]:
> echo 3 > /proc/sys/vm/drop_caches
> To drop postgres' cache, afaik the easiest is to restart postgres.
> [...]
how to know that one query use the data in the os cache
I use postgresql for windows server 2012 R2. I use

select pg_prewarm('tablename', 'read', 'main');

to load data into the os cache. How can I know whether the database
used the data in the os cache when I run the query

select * from tablename;

explain (analyze true, buffers true) select * from tablename;

only shows that the data in the database cache was used by the query;
it cannot show the data in the os cache. I am optimizing query speed
by loading data into the os cache, but the query is just as slow as it
was before using pg_prewarm. So I suspect the database does not use
the data in the os cache when querying. I want to speed up queries by
using the os cache and the database cache.
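[One indirect way to see where reads are served from, as a sketch
using the poster's placeholder table name: in EXPLAIN (ANALYZE,
BUFFERS) output, "shared hit" blocks came from the database cache and
"read" blocks from outside shared_buffers; with track_io_timing
enabled (a superuser-settable parameter), a very low read time per
block suggests those reads were OS-cache hits rather than disk.]

    SET track_io_timing = on;
    EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM tablename;
    -- look for:  Buffers: shared hit=... read=...
    --            I/O Timings: read=...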
Re: Out of Memory
## Laurenz Albe (laurenz.a...@cybertec.at):
> vm.overcommit_memory = 2
> vm.overcommit_ratio = 100
>
> Linux commits (swap * overcommit_ratio * RAM / 100),
                      ^ That should be a "+".

See Documentation/sysctl/vm.txt and
Documentation/vm/overcommit-accounting.rst in your kernel tree.

> so without any swap the correct value would be 100.
>
> I don't know how safe it is to go entirely without swap.

That's completely safe. Even more: if your production machine ever
needs swap, it's already mostly dead in the water. So there's also no
benefit in having the kernel copy stuff to swap "just in case" (read:
set sysctl vm.swappiness to 0) and use the swap disk space for
something more productive.

If your memory usage ever exceeds available memory, something already
went horribly wrong, and you want to know about it right now. A good
way to know about this kind of mistake is having stuff fail hard,
obviously and loudly (i.e. Out Of Memory). Do not try to save the day
by swapping: stuff gets "sometimes slow", and if you really need the
swap, everything crawls to a halt anyway.

Of course, this does not hold for personal workstations and the like,
where you might want to use the swap space for suspend-to-disk, or
have some buffer in case of runaway processes.

Regards,
Christoph

-- 
Spare Space.
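[Applying the corrected formula to the 16 GB, no-swap machine from
earlier in the thread: CommitLimit = swap + RAM * overcommit_ratio /
100 = 0 + 16 GB * 100 / 100 = 16 GB, where a ratio of 50 had capped it
at 8 GB. A sketch of making the setting persistent across reboots:]

    # echo 'vm.overcommit_memory = 2' >> /etc/sysctl.conf
    # echo 'vm.overcommit_ratio = 100' >> /etc/sysctl.conf
    # sysctl -p                        # apply now
    # grep CommitLimit /proc/meminfo   # should now be roughly the RAM size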